6 Forecast error variance maximization

6.1 The main (unconditional) approach

The approach presented in this section exploits the derivations of Uhlig (2004). Barsky and Sims (2011) exploit this approach to identify a TFP news shock, that they define as the shock (a) that is orthogonal to the innovation in current utilization-adjusted TFP and (b) that best explains variation in future TFP.

Consider a process \(\{y_t\}\) that admits the infinite MA representation of Eq. (1.4). Let \(Q\) be an orthogonal matrix, an alternative decomposition of \(y_t\) is: \[\begin{eqnarray} y_t&=&\sum_{h=0}^{+\infty}\Psi_h\underbrace{\eta_{t-h}}_{Q\tilde \eta_{t-h}} = \sum_{h=0}^{+\infty}\underbrace{\Psi_hQ}_{\tilde\Psi_h}\tilde \eta_{t-h} = \sum_{h=0}^{+\infty}\tilde\Psi_h\tilde \eta_{t-h}, \end{eqnarray}\] where \(\tilde \eta_{t-h}=Q'\eta_{t-h}\) are the white-noise shocks associated with the new MA representation, \(Q\) being an orthgonal matrix. (They also satisfy \(\mathbb{V}ar(\tilde\eta_t)=Id\).)

The \(h\)-step ahead prediction error of \(y_{t+h}\), given all the data up to, and including, \(t-1\) is given by \[ e_{t+h}(h)=y_{t+h}-\mathbb{E}_{t-1}(y_{t+h})=\sum_{j=0}^h\tilde \Psi_h\tilde \eta_{t+h-j}. \]

The variance-covariance matrix of \(e_{t+h}(h)\) is \[ \Omega^{(h)}=\sum_{j=0}^h\tilde \Psi_j\tilde \Psi_j'=\sum_{j=0}^h \Psi_j \Psi_j'. \]

We can decompose \(\Omega^{(h)}\) into the contribution of each shock \(l\) (\(l^{th}\) component of \(\tilde{\eta}_t\)): \[ \Omega^{(h)}=\sum_{l=1}^n\Omega_l^{(h)}(Q) \] with \[ \Omega_l^{(h)}(Q) =\sum_{j=0}^h(\Psi_jq_l)(\Psi_jq_l)', \] where \(q_l\) is the \(l^{th}\) column of \(Q\).

This decomposition can be used with the objective of finding the impulse vector \(b\) that is s.t. that it explains as much as possible of the sum of the \(h\)-step ahead prediction error variance of some variable \(i\), say, for prediction horizons \(h \in [\underline{h} , \overline{h}]\).

Formally, the task is to explain as much as possible of the variance \[ \sigma^2(\underline{h},\overline{h},q_1)=\sum_{h=\underline{h}}^{\overline{h}} \sum_{j=0}^h\left[(\Psi_jq_1)(\Psi_jq_1)'\right]_{i,i} \] with a single impulse vector \(q_1\).

Denote by \(E_{ii}\) the matrix that is filled with zeros, except for its (\(i,i\)) entry, set to 1. We have: \[\begin{eqnarray*} \sigma^2(\underline{h},\overline{h},q_1)&=&\sum_{h=\underline{h}}^{\overline{h}} \sum_{j=0}^h\left[(\Psi_jq_1)(\Psi_jq_1)'\right]_{i,i}=\sum_{h=\underline{h}}^{\overline{h}} \sum_{j=0}^h Tr\left[E_{ii}(\Psi_jq_1)(\Psi_jq_1)'\right]\\ &=&\sum_{h=\underline{h}}^{\overline{h}} \sum_{j=0}^h Tr\left[q_1'\Psi_j'E_{ii}\Psi_j q_1\right]\\ &=& q_1'Sq_1, \end{eqnarray*}\] where \[\begin{eqnarray*} \begin{array}{lll}S&=&\sum_{h=\underline{h}}^{\overline{h}}\sum_{j=0}^{h}\Psi_j'E_{ii}\Psi_j\\ &=&\sum_{j=0}^{\overline{h}}(\overline{h}+1-max(\underline{h},j))\Psi_j'E_{ii}\Psi_j\\ &=&\sum_{j=0}^{\overline{h}}(\overline{h}+1-max(\underline{h},j))\Psi_{j,i}'\Psi_{j,i}\\ \end{array} \end{eqnarray*}\] where \(\Psi_{j,i}\) denotes row \(i\) of \(\Psi_{j}\), i.e., the response of variable \(i\) at horizon \(j\) (when \(Q=Id\)).

The maximization problem subject to the side constraint \(q_1'q_1=1\) can be written as a Lagrangian: \[ L=q_1'Sq_1-\lambda(q_1'q_1-1), \] with the first-order condition \(Sq_1=\lambda q_1\) (the side constraint is \(q_1'q_1=1\)). From this equation, we see that the solution \(q_1\) is an eigenvector of \(S\), the one associated with eigenvalue \(\lambda\). We also see that \(\sigma^2(\underline{h},\overline{h},q_1)=\lambda\). Thus, to maximize this variance, we need to find the eigenvector of \(S\) that is associated with the maximal eigenvalue \(\lambda\). That defines the first principal component (see Section 10.1). That is, if \(S\) admits the following spectral decomposition: \[ S = \mathcal{P}D\mathcal{P}', \] where \(D\) is diagonal matrix whose entries are the (ordered) eigenvalues: \(\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n \ge 0\), then \(\sigma^2(\underline{h},\overline{h},q_1)\) is maximized for \(q_1 = p_1\), where \(p_1\) is the first column of \(\mathcal{P}\).

The following code identifies a ``main GDP shock’’ using Uhlig’s method.

library(IdSS)
library(readxl)
library(vars)
library(Matrix)
# Declare data:
TFP   <- levpan$tfp_lev
GDP   <- levpan$lngdpcap
E12   <- levpan$e12m
CONS  <- levpan$lnconcap
HOURS <- levpan$lnhrscap
y <- cbind(TFP,GDP,E12,CONS,HOURS)
names.of.variables <- c("TFP","GDP","E12","Consumption","Hours")
colnames(y)  <- names.of.variables
T <- dim(y)[1]
n <- dim(y)[2]
p <- 2
nb.periods.IRF <- 40
bootstrap.replications <- 1000
confid.interv <- 0.75
indic.plot <- 1
H <- 20
# ===============================
# FEVM
# ===============================
y0.star <- rep(0,n*p)
En <- array(0,c(n,n))
En[2,2] <- 1
stdv.IRFs <- list()
# do Choleski (we need the FULL IRFs)
cholesky.res <- svar.ordering.all(y,p,nb.periods.IRF,n)
# Store results
IRFs <- cholesky.res$IRFs
est.VAR <- cholesky.res$est.VAR
Phi     <- Acoef(est.VAR)
B.hat   <- cholesky.res$B.hat
cst     <- Bcoef(est.VAR)[,p*n+1]
resids  <- residuals(est.VAR)
Omega   <- var(resids)
# if no bootstrap then simply use point estimate
simulated.IRFs  <- replicate(1,IRFs, simplify="array")
simulated.B.hat <- replicate(1,B.hat, simplify="array")
simulated.Phi   <- replicate(1,Phi, simplify="array")
# if bootstrap then generate and store simulated IRFs, B.hat and Phi
if(bootstrap.replications>1){
  bootstrap.res <- param.bootstrap(y,p,nb.periods.IRF,n,bootstrap.replications,
                                  posit.of.shock = 0)
  simulated.IRFs  <- bootstrap.res$simulated.IRFs
  simulated.B.hat <- bootstrap.res$simulated.B.hat
  simulated.Phi   <- bootstrap.res$simulated.Phi}
# Initialize Q as identity matrix
Q <- replicate(bootstrap.replications,diag(n), simplify="array")
# This is where we will store the IRFs
IRFs.final <- array(NaN,c(n,n,nb.periods.IRF,bootstrap.replications))
# This loop identifies the relevant Q for each bootstrap replication
  for (l in 1:bootstrap.replications){
    ##############################
    # Identification of main GDP shock
    ##############################
    # Compute S
    WWW <- array(0,c(n,n))
    for (h in 1:H){
      V99 <- simulated.IRFs[,,h,l]
      JJ <- (H+1-h)*t(V99)%*%En%*%V99
      WWW <- WWW+JJ}
    r <- eigen(WWW) 
    # Take the eigenvector with the highest eigenvalue
    eigvec <- matrix(r$vectors[,1],n,1)
    # We might need to adjust the sign
    if (simulated.IRFs[1,,20,l]%*%eigvec>0){
      Q[,1,l] <- Q[,,l]%*%eigvec
      }else{Q[,1,l] <- - Q[,,l]%*%eigvec}
    Q[,2:n,l] <- Null(Q[,1,l]) # we ensure that columns 2 to n are 
    #   orthogonal to the first one.
    # New IRFs
    for (t in 1:nb.periods.IRF){
      IRFs.final[,,t,l] <- simulated.IRFs[,,t,l]%*%Q[,,l]}
  }
# compute some key moments of the simulated IRFs:
stdv.IRFs <- apply(IRFs.final,c(1,2,3),sd)
CI.lower.bounds <-
  apply(IRFs.final,c(1,2,3),function(x){quantile(x,(1-confid.interv)/2)})
CI.upper.bounds <-
  apply(IRFs.final,c(1,2,3),function(x){quantile(x,1-(1-confid.interv)/2)})
CI.median <-
  apply(IRFs.final,c(1,2,3),function(x){quantile(x,0.5)})
# Plot graphs
  par(mfrow=c(2,ifelse(round(n/2)==n/2,n/2,(n+1)/2)))
  for(i in 1:n){
    plot(CI.median[i,1,],type="l",lwd=2,xlab="",ylab="",
         ylim=c(min(CI.lower.bounds[i,1,]),
                max(CI.upper.bounds[i,1,])),
         main=paste("Effect of Main GDP shock on ",
                    names.of.variables[i],sep=""))
    abline(h=0,col="grey")
      lines(CI.lower.bounds[i,1,],col="red",lty=2,lwd=2)
      lines(CI.upper.bounds[i,1,],col="red",lty=2,lwd=2)}
variance.decomp <- variance.decomp(IRFs.final)
vardecomp <- variance.decomp$vardecomp
mean(vardecomp[2,2,40,,1]) 
## [1] 0.7360619
Main GDP shock.

Figure 6.1: Main GDP shock.

The ``main GDP shock’’ explains 74% of the variance of GDP.

The following code replicates Levchenko and Pandalai-Nayar (2018). They use a mix of zero and FEVD to identify TFP surprises, TFP news, and sentiment shocks.

nb.periods.IRF <- 40
p <- 2
bootstrap.replications <- 1000
confid.interv <- 0.75
indic.plot <- 1
nb.shocks <- 3
names.of.shocks <- c("TFP surprise","TFP news","Sentiment")
Hn <- 40 #horizon for news shock
Hs <- 2 #horizon for sentiment shock
# ===============================
# FEVM + zeros
# ===============================
En <- array(0,c(n,n))
En[1,1] <- 1
Es <- array(0,c(n,n))
Es[3,3] <- 1
# Initialize Q as identity matrix
Q <- replicate(bootstrap.replications,diag(n), simplify="array")
# This loop identifies the relevant Q for each bootstrap replication
  for (l in 1:bootstrap.replications){
    ##############################
    # Identification of news shocks
    ##############################
    # Compute S
    WWW <- array(0,c(n-1,n-1))
    for (h in 1:Hn){
      V99 <- simulated.IRFs[,,h,l]%*%Q[,2:n,l] 
      # Notice that we use only columns 2 to n of Q:
      #  the first column selects the TFP surprise shock,
      #  which is the first shock in the Cholesky
      #  decomposition where TFP is ordered first.
      JJ <- (Hn+1-h)*t(V99)%*%En%*%V99
      WWW <- WWW+JJ}
    r <- eigen(WWW) 
    # Take the eigenvector with the highest eigenvalue
    eigvec <- matrix(r$vectors[,1],n-1,1)
    # We might need to adjust the sign
    if (simulated.IRFs[1,,20,l]%*%Q[,2:n,l]%*%eigvec>0){
      Q[,2,l] <- Q[,2:n,l]%*%eigvec
      }else{
        Q[,2,l] <- - Q[,2:n,l]%*%eigvec}
    Q[,3:n,l] <- Null(Q[,1:2,l]) # we ensure that columns 3 to n
    #   are orthogonal to the first 2
    #####################################
    # Identification of sentiment shocks
    #####################################
    # Compute S
    WWW <- array(0,c(n-2,n-2))
    for (h in 1:Hs){
      V99 <- simulated.IRFs[,,h,l]%*%Q[,3:n,l]
      # Notice that we use only columns 3 to n of Q:
      # the first column selects the TFP surprise shock, 
      # which is the first shock in the Cholesky decomposition
      # where TFP is ordered first, the second column generates 
      # the linear combination of "Cholesky shocks" that is orthogonal
      # to TFP surprise and best explains TFP up to 
      # horizon 40 (see last step)
      JJ <- (Hs+1-h)*t(V99)%*%Es%*%V99
      WWW <- WWW+JJ}
    r <- eigen(WWW) 
    # Take the eigenvector with the highest eigenvalue
    eigvec <- matrix(r$vectors[,1],n-2,1)
    # We might need to adjust the sign
    if (simulated.IRFs[3,,2,l]%*%Q[,3:n,l]%*%eigvec>0){
      Q[,3,l] <- Q[,3:n,l]%*%eigvec
    }else{Q[,3,l] <- -Q[,3:n,l]%*%eigvec}
    Q[,4:n,l] <- Null(Q[,1:3,l]) # we ensure that columns 4 to n are
    # orthogonal to the first 3
    for (t in 1:nb.periods.IRF){
      IRFs.final[,,t,l] <- simulated.IRFs[,,t,l]%*%Q[,,l]}
  }
# compute some key moments of the simulated IRFs
stdv.IRFs <- apply(IRFs.final,c(1,2,3),sd)
CI.lower.bounds <-
  apply(IRFs.final,c(1,2,3),function(x){quantile(x,(1-confid.interv)/2)})
CI.upper.bounds <-
  apply(IRFs.final,c(1,2,3),function(x){quantile(x,1-(1-confid.interv)/2)})
CI.median <-
  apply(IRFs.final,c(1,2,3),function(x){quantile(x,0.5)})
# Plot graphs
for (j in 1:nb.shocks){
  par(mfrow=c(2,ifelse(round(n/2)==n/2,n/2,(n+1)/2)))
  for(i in 1:n){
    plot(CI.median[i,j,],type="l",lwd=2,xlab="",ylab="",
         ylim=c(min(CI.lower.bounds[i,j,]),
                max(CI.upper.bounds[i,j,])),
         main=paste("Effect of ",names.of.shocks[j]," on ",
                    names.of.variables[i],sep=""))
    abline(h=0,col="grey")
      lines(CI.lower.bounds[i,j,],col="red",lty=2,lwd=2)
      lines(CI.upper.bounds[i,j,],col="red",lty=2,lwd=2)}
}
Replication of Levchenko and Pandalai-Nayar (2020). FEVD and zero restrictions.

Figure 6.2: Replication of Levchenko and Pandalai-Nayar (2020). FEVD and zero restrictions.

Replication of Levchenko and Pandalai-Nayar (2020). FEVD and zero restrictions.

Figure 6.3: Replication of Levchenko and Pandalai-Nayar (2020). FEVD and zero restrictions.

variance.decomp <- variance.decomp(IRFs.final)
vardecomp <- variance.decomp$vardecomp
mean(vardecomp[2,2,40,,1])
## [1] 0.1064586
mean(vardecomp[2,2,40,,2])
## [1] 0.6242869
mean(vardecomp[2,2,40,,3])
## [1] 0.1196272
Replication of Levchenko and Pandalai-Nayar (2020). FEVD and zero restrictions.

Figure 6.4: Replication of Levchenko and Pandalai-Nayar (2020). FEVD and zero restrictions.

Sentiment shocks explain 12% of the variance of GDP, against 72% for TFP shocks (including 62% for TFP news shocks).

6.2 Restrictions based on narrative historical decomposition

A related approach, introduced by Antolín-Díaz and Rubio-Ramírez (2018), consists in imposing that, on some specific dates (based on narrative information), a particular shock was the most important contributor to the unexpected movement of some variable during a particular period.6 This can be formalized in two different ways (respectively called Type A and Type B by Antolín-Díaz and Rubio-Ramírez (2018)):

  • Type A: A given shock is the most important (least important) driver of the unexpected change in a variable during some periods. For these periods, the absolute value of its contribution to the unexpected change in a variable is larger (smaller) than the absolute value of the contribution of any other structural shock.
  • Type B: A given shock is the overwhelming (negligible) driver of the unexpected change in a given variable during the period. For these periods, the absolute value of its contribution to the unexpected change in a variable is larger (smaller) than the sum of the absolute value of the contributions of all other structural shocks.