2 Identification problem and standard identification techniques

2.1 The identification problem

In Section 1.4, we have seen how to estimate \(\mathbb{V}ar(\varepsilon_t) =\Omega\) and the \(\Phi_k\) matrices in the context of a VAR model. But the IRFs are functions of \(B\) and of the \(\Phi_k\)’s, not of \(\Omega\) the \(\Phi_k\)’s (see Section 1.2). We have \(\Omega = BB'\), which provides some restrictions on the components of \(B\), but this is not sufficient to fully identify \(B\). Indeed, seen a system of equations whose unknowns are the \(b_{i,j}\)’s (components of \(B\)), the system \(\Omega = BB'\) contains only \(n(n+1)/2\) linearly independent equations. For instance, for \(n=2\): \[\begin{eqnarray*} &&\left[ \begin{array}{cc} \omega_{11} & \omega_{12} \\ \omega_{12} & \omega_{22} \end{array} \right] = \left[ \begin{array}{cc} b_{11} & b_{12} \\ b_{21} & b_{22} \end{array} \right]\left[ \begin{array}{cc} b_{11} & b_{21} \\ b_{12} & b_{22} \end{array} \right]\\ &\Leftrightarrow&\left[ \begin{array}{cc} \omega_{11} & \omega_{12} \\ \omega_{12} & \omega_{22} \end{array} \right] = \left[ \begin{array}{cc} b_{11}^2+b_{12}^2 & \color{red}{b_{11}b_{21}+b_{12}b_{22}} \\ \color{red}{b_{11}b_{21}+b_{12}b_{22}} & b_{22}^2 + b_{21}^2 \end{array} \right]. \end{eqnarray*}\]

We then have 3 linearly independent equations but 4 unknowns. Therefore, \(B\) is not identified based on second-order moments. Additional restrictions are required to identify \(B\). This section covers two standard identification schemes: short-run and long-run restrictions.

  1. A short-run restriction (SRR) prevents a structural shock from affecting an endogenous variable contemporaneously.
  • It is easy to implement: the appropriate entries of \(B\) are set to 0.
  • A particular (popular) case is that of the Cholesky, or recursive approach.
  • Examples include Bernanke (1986), Sims (1986), Galí (1992), Ruibio-Ramírez, Waggoner, and Zha (2010).
  1. A long-run restriction (LRR) prevents a structural shock from having a cumulative impact on one of the endogenous variables.
  • Additional computations are required to implement this. One needs to compute the cumulative effect of one of the structural shocks \(u_{t}\) on one of the endogenous variable.
  • Examples include Blanchard and Quah (1989), Faust and Leeper (1997), Galí (1999), Erceg, Guerrieri, and Gust (2005), Christiano, Eichenbaum, and Vigfusson (2007).

The two approaches can be combined (see, e.g., Gerlach and Smets (1995)).

2.2 A stylized example motivating short-run restrictions

Let us consider a simple example that could motivate short-run restrictions. Consider the following stylized macro model: \[\begin{equation} \begin{array}{clll} g_{t}&=& \bar{g}-\lambda(i_{t-1}-\mathbb{E}_{t-1}\pi_{t})+ \underbrace{{\color{blue}\sigma_d \eta_{d,t}}}_{\mbox{demand shock}}& (\mbox{IS curve})\\ \Delta \pi_{t} & = & \beta (g_{t} - \bar{g})+ \underbrace{{\color{blue}\sigma_{\pi} \eta_{\pi,t}}}_{\mbox{cost push shock}} & (\mbox{Phillips curve})\\ i_{t} & = & \rho i_{t-1} + \left[ \gamma_\pi \mathbb{E}_{t}\pi_{t+1} + \gamma_g (g_{t} - \bar{g}) \right]\\ && \qquad \qquad+\underbrace{{\color{blue}\sigma_{mp} \eta_{mp,t}}}_{\mbox{Mon. Pol. shock}} & (\mbox{Taylor rule}), \end{array}\tag{2.1} \end{equation}\] where: \[\begin{equation} \eta_t = \left[ \begin{array}{c} \eta_{\pi,t}\\ \eta_{d,t}\\ \eta_{mp,t} \end{array} \right] \sim i.i.d.\,\mathcal{N}(0,I).\tag{2.2} \end{equation}\]

Vector \(\eta_t\) is assumed to be a vector of structural shocks, mutually and serially independent. On date \(t\):

  • \(g_t\) is contemporaneously affected by \(\eta_{d,t}\) only;
  • \(\pi_t\) is contemporaneously affected by \(\eta_{\pi,t}\) and \(\eta_{d,t}\);
  • \(i_t\) is contemporaneously affected by \(\eta_{mp,t}\), \(\eta_{\pi,t}\) and \(\eta_{d,t}\).

System (2.1) could be rewritten as follows: \[\begin{equation} \left[\begin{array}{c} d_t\\ \pi_t\\ i_t \end{array}\right] = \Phi(L) \left[\begin{array}{c} d_{t-1}\\ \pi_{t-1}\\ i_{t-1} + \end{array}\right] +\underbrace{\underbrace{ \left[ \begin{array}{ccc} 0 & \bullet & 0 \\ \bullet & \bullet & 0 \\ \bullet & \bullet & \bullet \end{array} \right]}_{=B} \eta_t.}_{=\varepsilon_t}\tag{2.3} \end{equation}\]

This is the reduced-form of the model. This representation suggests three additional restrictions on the entries of \(B\); the latter matrix is therefore identified as soon as \(\Omega = BB'\) is known (up to the signs of its columns).

2.3 Cholesky: a specific short-run-restriction situation

There are particular cases in which some well-known matrix decomposition of \(\Omega=\mathbb{V}ar(\varepsilon_t)\) can be used to easily estimate some specific SVAR. This is the case for the so-called Cholesky decomposition. Consider the following context:

  • A first shock (say, \(\eta_{n_1,t}\)) can affect instantaneously (i.e., on date \(t\)) only one of the endogenous variable (say, \(y_{n_1,t}\));
  • A second shock (say, \(\eta_{n_2,t}\)) can affect instantaneously (i.e., on date \(t\)) two endogenous variables, \(y_{n_1,t}\) (the same as before) and \(y_{n_2,t}\);
  • \(\dots\)

This implies

  1. that column \(n_1\) of \(B\) has only 1 non-zero entry (this is the \(n_1^{th}\) entry),
  2. that column \(n_2\) of \(B\) has 2 non-zero entries (the \(n_1^{th}\) and the \(n_2^{th}\) ones), etc.

Without loss of generality, we can set \(n_1=n\), \(n_2=n-1\), etc. In this context, matrix \(B\) is lower triangular. The Cholesky decomposition of \(\Omega_{\varepsilon}\) then provides an appropriate estimate of \(B\), since this matrix decomposition yields to a lower triangular matrix satisfying: \[ \Omega_\varepsilon = BB'. \]

For instance, Dedola and Lippi (2005) estimate 5 structural VAR models for the US, the UK, Germany, France and Italy to analyse the monetary-policy transmission mechanisms. They estimate SVAR(5) models over the period 1975-1997. The shock-identification scheme is based on Cholesky decompositions, the ordering of the endogenous variables being: the industrial production, the consumer price index, a commodity price index, the short-term rate, monetary aggregate and the effective exchange rate (except for the US). This ordering implies that monetary policy reacts to the shocks affecting the first three variables but that the latter react to monetary policy shocks with a one-period lag only.

The Cholesky approach can be employed when one is interested in one specific structural shock. This is the case, e.g., of Christiano, Eichenbaum, and Evans (1996). Their identification is based on the following relationship between \(\varepsilon_t\) and \(\eta_t\): \[ \left[\begin{array}{c} \boldsymbol\varepsilon_{S,t}\\ \varepsilon_{r,t}\\ \boldsymbol\varepsilon_{F,t} \end{array}\right] = \left[\begin{array}{ccc} B_{SS} & 0 & 0 \\ B_{rS} & B_{rr} & 0 \\ B_{FS} & B_{Fr} & B_{FF} \end{array}\right] \left[\begin{array}{c} \boldsymbol\eta_{S,t}\\ \eta_{r,t}\\ \boldsymbol\eta_{F,t} \end{array}\right], \] where \(S\), \(r\) and \(F\) respectively correspond to slow-moving variables, the policy variable (short-term rate) and fast-moving variables. While \(\eta_{r,t}\) is scalar, \(\boldsymbol\eta_{S,t}\) and \(\boldsymbol\eta_{F,t}\) may be vectors. The space spanned by \(\boldsymbol\varepsilon_{S,t}\) is the same as that spanned by \(\boldsymbol\eta_{S,t}\). As a result, because \(\varepsilon_{r,t}\) is a linear combination of \(\eta_{r,t}\) and \(\boldsymbol\eta_{S,t}\) (which are \(\perp\)), it comes that the \(B_{rr}\eta_{r,t}\)’s are the (population) residuals in the regression of \(\varepsilon_{r,t}\) on \(\boldsymbol\varepsilon_{S,t}\). Because \(\mathbb{V}ar(\eta_{r,t})=1\), \(B_{rr}\) is given by the square root of the variance of \(B_{rr}\eta_{r,t}\). \(B_{F,r}\) is finally obtained by regressing the components of \(\boldsymbol\varepsilon_{F,t}\) on the estimates of \(\eta_{r,t}\).

An equivalent approach consists in computing the Cholesky decomposition of \(BB'\) and the contemporaneous impacts of the monetary policy shock (on the \(n\) endogenous variables) are the components of the column of \(B\) corresponding to the policy variable.

library(IdSS)
library(vars)
data("USmonthly")
# Select sample period:
First.date <- "1965-01-01";Last.date <- "1995-06-01"
indic.first <- which(USmonthly$DATES==First.date)
indic.last  <- which(USmonthly$DATES==Last.date)
USmonthly   <- USmonthly[indic.first:indic.last,]
considered.variables <- c("LIP","UNEMP","LCPI","LPCOM","FFR","NBR","TTR","M1")
y <- as.matrix(USmonthly[considered.variables])
res.svar.ordering <- svar.ordering(y,p=3,
                                   posit.of.shock = 5,
                                   nb.periods.IRF = 20,
                                   nb.bootstrap.replications = 100,
                                   confidence.interval = 0.90, # expressed in pp.
                                   indic.plot = 1 # Plots are displayed if = 1.
                                   )
Response to a monetary-policy shock. Identification approach of Christiano, Eichenbaum and Evans (1996). Confidence intervals are obtained by boostrapping the estimated VAR model (see inference section).

Figure 2.1: Response to a monetary-policy shock. Identification approach of Christiano, Eichenbaum and Evans (1996). Confidence intervals are obtained by boostrapping the estimated VAR model (see inference section).

2.4 Long-run restrictions

Let us now turn to Long-run restrictions. Such a restriction concerns the long-run influence of a shock on an endogenous variable. Let us consider for instance a structural shock that is assumed to have no “long-run influence” on GDP. How to express this? The long-run change in GDP can be expressed as \(GDP_{t+h} - GDP_t\), with \(h\) large. Note further that: \[ GDP_{t+h} - GDP_t = \Delta GDP_{t+h} +\Delta GDP_{t+h-1} + \dots + \Delta GDP_{t+1}. \] Hence, the fact that a given structural shock (\(\eta_{i,t}\), say) has no long-run influence on GDP means that \[ \lim_{h\rightarrow\infty}\frac{\partial GDP_{t+h}}{\partial \eta_{i,t}} = \lim_{h\rightarrow\infty} \frac{\partial}{\partial \eta_{i,t}}\left(\sum_{k=1}^h \Delta GDP_{t+k}\right)= 0. \]

This long-run effect can be formulated as a function of \(B\) and of the matrices \(\Phi_i\) when \(y_t\) (including \(\Delta GDP_t\)) follows a VAR process.

Without loss of generality, we will only consider the VAR(1) case. Indeed, one can always write a VAR(\(p\)) as a VAR(1): for that, stack the last \(p\) values of vector \(y_t\) in vector \(y_{t}^{*}=[y_t',\dots,y_{t-p+1}']'\); Eq. (1.1) can then be rewritten in its companion form: \[\begin{equation} y_{t}^{*} = \underbrace{\left[\begin{array}{c} c\\ 0\\ \vdots\\ 0\end{array}\right]}_{=c^*}+ \underbrace{\left[\begin{array}{cccc} \Phi_{1} & \Phi_{2} & \cdots & \Phi_{p}\\ I & 0 & \cdots & 0\\ 0 & \ddots & 0 & 0\\ 0 & 0 & I & 0\end{array}\right]}_{=\Phi} y_{t-1}^{*}+ \underbrace{\left[\begin{array}{c} \varepsilon_{t}\\ 0\\ \vdots\\ 0\end{array}\right]}_{\varepsilon_t^*},\tag{2.4} \end{equation}\] where matrices \(\Phi\) and \(\Omega^* = \mathbb{V}ar(\varepsilon_t^*)\) are of dimension \(np \times np\); \(\Omega^*\) is filled with zeros, except the \(n\times n\) upper-left block that is equal to \(\Omega = \mathbb{V}ar(\varepsilon_t)\). (Matrix \(\Phi\) had been introduced in Eq. (1.8).)

Let us then focus on the VAR(1) case: \[\begin{eqnarray*} y_{t} &=& c+\Phi y_{t-1}+\varepsilon_{t}\\ & = & c+\varepsilon_{t}+\Phi(c+\varepsilon_{t-1})+\ldots+\Phi^{k}(c+\varepsilon_{t-k})+\ldots \\ & = & \mu +\varepsilon_{t}+\Phi\varepsilon_{t-1}+\ldots+\Phi^{k}\varepsilon_{t-k}+\ldots \\ & = & \mu +B\eta_{t}+\Phi B\eta_{t-1}+\ldots+\Phi^{k}B\eta_{t-k}+\ldots, \end{eqnarray*}\] which is the Wold representation of \(y_t\).

The sequence of shocks \(\{\eta_t\}\) determines the sequence \(\{y_t\}\). What if \(\{\eta_t\}\) is replaced with \(\{\tilde{\eta}_t\}\), where \(\tilde{\eta}_t=\eta_t\) if \(t \ne s\) and \(\tilde{\eta}_s=\eta_s + \gamma\)? Assume \(\{\tilde{y}_t\}\) is the associated “perturbated” sequence. We have \(\tilde{y}_t = y_t\) if \(t<s\). For \(t \ge s\), the Wold decomposition of \(\{\tilde{y}_t\}\) implies: \[ \tilde{y}_t = y_t + \Phi^{t-s} B \gamma. \] Therefore, the cumulative impact of \(\gamma\) on \(\tilde{y}_t\) will be (for \(t \ge s\)): \[\begin{eqnarray} (\tilde{y}_t - y_t) + (\tilde{y}_{t-1} - y_{t-1}) + \dots + (\tilde{y}_s - y_s) &=& \nonumber \\ (Id + \Phi + \Phi^2 + \dots + \Phi^{t-s}) B \gamma.&& \tag{2.5} \end{eqnarray}\]

Consider a shock on \(\eta_{1,t}\), with a magnitude of \(1\). This shock corresponds to \(\gamma = [1,0,\dots,0]'\). Given Eq. (2.5), the long-run cumulative effect of this shock on the endogenous variables is given by: \[ \underbrace{(Id+\Phi+\ldots+\Phi^{k}+\ldots)}_{=(Id - \Phi)^{-1}}B\left[\begin{array}{c} 1\\ 0\\ \vdots\\ 0\end{array}\right], \] that is the first column of the \(n \times n\) matrix \(\Theta \equiv (Id - \Phi)^{-1}B\).

In this context, consider the following long-run restriction: “the \(j^{th}\) structural shock has no cumulative impact on the \(i^{th}\) endogenous variable”. It is equivalent to \[ \Theta_{ij}=0, \] where \(\Theta_{ij}\) is the element \((i,j)\) of \(\Theta\).

Blanchard and Quah (1989) have implemented such long-run restrictions in a small-scale VAR. Two variables are considered: GDP and unemployment. Consequently, the VAR is affected by two types of shocks. Specifically, authors want to identify supply shocks (that can have a permanent effect on output) and demand shocks (that cannot have a permanent effect on output).2

Blanchard and Quah (1989)’s dataset is quarterly, spanning the period from 1950:2 to 1987:4. Their VAR features 8 lags. Here are the data they use:

library(IdSS)
data(BQ)
par(mfrow=c(1,2))
plot(BQ$Date,BQ$Dgdp,type="l",main="GDP quarterly growth rate",
     xlab="",ylab="",lwd=2)
plot(BQ$Date,BQ$unemp,type="l",ylim=c(-3,6),main="Unemployment rate (gap)",
     xlab="",ylab="",lwd=2)

Estimate a reduced-form VAR(8) model:

library(vars)
y <- BQ[,2:3]
est.VAR <- VAR(y,p=8)
Omega <- var(residuals(est.VAR))

Now, let us define a loss function (loss) that is equal to zero if (a) \(BB'=\Omega\) and (b) the element (1,1) of \(\Theta = (Id - \Phi)^{-1} B\) is equal to zero:

# Compute (Id - Phi)^{-1}:
Phi <- Acoef(est.VAR)
PHI <- make.PHI(Phi)
sum.PHI.k <- solve(diag(dim(PHI)[1]) - PHI)[1:2,1:2]
loss <- function(param){
  B <- matrix(param,2,2)
  X <- Omega - B %*% t(B)
  Theta <- sum.PHI.k[1:2,1:2] %*% B
  loss <- 10000 * ( X[1,1]^2 + X[2,1]^2 + X[2,2]^2 + Theta[1,1]^2 )
  return(loss)
}
res.opt <- optim(c(1,0,0,1),loss,method="BFGS",hessian=FALSE)
print(res.opt$par)
## [1]  0.8570358 -0.2396345  0.1541395  0.1921221

(Note: one can use that type of approach, based on a loss function, to mix short- and long-run restrictions.)

Figure 2.2 displays the resulting IRFs. Note that, for GDP, we cumulate the GDP growth IRF, so as to have the response of the GDP in level.

B.hat <- matrix(res.opt$par,2,2)
print(cbind(Omega,B.hat %*% t(B.hat)))
##             Dgdp       unemp                       
## Dgdp   0.7582704 -0.17576173  0.7582694 -0.17576173
## unemp -0.1757617  0.09433658 -0.1757617  0.09433558
nb.sim <- 40
par(mfrow=c(2,2));par(plt=c(.15,.95,.15,.8))
Y <- simul.VAR(c=matrix(0,2,1),Phi,B.hat,nb.sim,y0.star=rep(0,2*8),
               indic.IRF = 1,u.shock = c(1,0))
plot(cumsum(Y[,1]),type="l",lwd=2,xlab="",ylab="",main="Demand shock on GDP")
plot(Y[,2],type="l",lwd=2,xlab="",ylab="",main="Demand shock on UNEMP")
Y <- simul.VAR(c=matrix(0,2,1),Phi,B.hat,nb.sim,y0.star=rep(0,2*8),
               indic.IRF = 1,u.shock = c(0,1))
plot(cumsum(Y[,1]),type="l",lwd=2,xlab="",ylab="",main="Supply shock on GDP")
plot(Y[,2],type="l",lwd=2,xlab="",ylab="",main="Supply shock on UNEMP")
IRF of GDP and unemployment to demand and supply shocks.

Figure 2.2: IRF of GDP and unemployment to demand and supply shocks.