Documentation Center

  • Trial Software
  • Product Updates

Vector Autoregressive Models

Introduction to Vector Autoregressive (VAR) Models

Types of VAR Models

The multivariate time series models used in Econometrics Toolbox™ functions are based on linear, autoregressive models. The basic models are:

Model NameAbbreviationEquation
Vector Autoregressive VAR(p)

Vector Moving Average VMA(q)

Vector Autoregressive Moving Average VARMA(p, q)

Vector Autoregressive Moving Average with eXogenous inputs VARMAX(p, q, r)

Structural Vector Autoregressive Moving Average with eXogenous inputs SVARMAX(p, q, r)

The following variables appear in the equations:

  • yt is the vector of response time series variables at time t. yt has n elements.

  • a is a constant vector of offsets, with n elements.

  • Ai are n-by-n matrices for each i. The Ai are autoregressive matrices. There are p autoregressive matrices.

  • εt is a vector of serially uncorrelated innovations, vectors of length n. The εt are multivariate normal random vectors with a covariance matrix Q, where Q is an identity matrix, unless otherwise specified.

  • Bj are n-by-n matrices for each j. The Bj are moving average matrices. There are q moving average matrices.

  • Xt is an n-by-r matrix representing exogenous terms at each time t. r is the number of exogenous series. Exogenous terms are data (or other unmodeled inputs) in addition to the response time series yt.

  • b is a constant vector of regression coefficients of size r. So the product Xt·b is a vector of size n.

Generally, the time series yt and Xt are observable. In other words, if you have data, it represents one or both of these series. You do not always know the offset a, coefficient b, autoregressive matrices Ai, and moving average matrices Bj. You typically want to fit these parameters to your data. See the vgxvarx function reference page for ways to estimate unknown parameters. The innovations εt are not observable, at least in data, though they can be observable in simulations.

Lag Operator Representation

There is an equivalent representation of the linear autoregressive equations in terms of lag operators. The lag operator L moves the time index back by one: Lyt = yt–1. The operator Lm moves the time index back by m: Lmyt = ytm.

In lag operator form, the equation for a SVARMAX(p, q, r) model becomes

This equation can be written as

where

Stable and Invertible Models

A VAR model is stable if

This condition implies that, with all innovations equal to zero, the VAR process converges to a as time goes on. See Lütkepohl [69] Chapter 2 for a discussion.

A VMA model is invertible if

This condition implies that the pure VAR representation of the process is stable. For an explanation of how to convert between VAR and VMA models, see Changing Model Representations. See Lütkepohl [69] Chapter 11 for a discussion of invertible VMA models.

A VARMA model is stable if its VAR part is stable. Similarly, a VARMA model is invertible if its VMA part is invertible.

There is no well-defined notion of stability or invertibility for models with exogenous inputs (e.g., VARMAX models). An exogenous input can destabilize a model.

Building VAR Models

To understand a multiple time series model, or multiple time series data, you generally perform the following steps:

  1. Import and preprocess data.

  2. Specify a model.

    1. Specifying Models to set up a model using vgxset:

    2. Determining an Appropriate Number of Lags to determine an appropriate number of lags for your model

  3. Fit the model to data.

    1. Fitting Models to Data to use vgxvarx to estimate the unknown parameters in your models. This can involve:

  4. Analyze and forecast using the fitted model. This can involve:

    1. Checking Stability to determine whether your model is stable and invertible

    2. Forecasting with vgxpred to forecast directly from models

    3. Forecasting with vgxsim to simulate a model

    4. Generate Impulse Responses for a VAR model to calculate impulse responses, which give forecasts based on an assumed change in an input to a time series

    5. Comparing Forecasts with Forecast Period Data to compare the results of your model's forecasts to your data

Your application need not involve all of the steps in this workflow. For example, you might not have any data, but want to simulate a parameterized model. In that case, you would perform only steps 2 and 4 of the generic workflow.

You might iterate through some of these steps.

Data Structures

Multivariate Time Series Data

Often, the first step in creating a multiple time series model is to obtain data. There are two types of multiple time series data:

  • Response data. Response data corresponds to yt in the multiple time series models defined in Types of VAR Models.

  • Exogenous data. Exogenous data corresponds to Xt in the multiple time series models defined in Types of VAR Models.

Before using Econometrics Toolbox functions with the data, put the data into the required form. Use standard MATLAB commands, or preprocess the data with a spreadsheet program, database program, PERL, or other utilities.

There are several freely available sources of data sets, such as the St. Louis Federal Reserve Economics Database (known as FRED): http://research.stlouisfed.org/fred2/. If you have a license, you can use Datafeed Toolbox™ functions to access data from various sources.

Response Data Structure

Response data for multiple time series models must be in the form of a matrix. Each row of the matrix represents one time, and each column of the matrix represents one time series. The earliest data is the first row, the latest data is the last row. The data represents yt in the notation of Types of VAR Models. If there are T times and n time series, put the data in the form of a T-by-n matrix:

Y1t represents time series 1,..., Ynt represents time series n, 1 ≤ t ≤ T.

Multiple Paths.  Response time series data can have an extra dimension corresponding to separate, independent paths. For this type of data, use a three-dimensional array Y(t,j,p), where:

  • t is the time index of an observation, 1 ≤ t ≤ T.

  • j is the index of a time series, 1 ≤ j ≤ n.

  • p is the path index, 1 ≤ p ≤ P.

For any path p, Y(t,j,p) is a time series.

Example: Response Data Structure

The file Data_USEconModel ships with Econometrics Toolbox software. It contains time series from the St. Louis Federal Reserve Economics Database (known as FRED).

Enter

load Data_USEconModel

to load the data into your MATLAB workspace. The following items load into the workspace:

  • Data, a 249-by-14 matrix containing the time series of data,

  • Dataset, a 249-by-14 matrix containing the dataset array,

  • Description, a character array containing a description of the data series and the key to the labels for each series,

  • dates, a 249-element vector containing the dates for Data,

  • series, a 1-by-14 cell array of labels for the time series.

Examine the data structures:

firstPeriod = dates(1)
lastPeriod = dates(end)
firstPeriod =

      711217


lastPeriod =

      733863

  • dates are MATLAB serial date numbers, the number of days since the putative date January 1, 0000. (This "date" is not a real date, but is convenient for making date calculations; for more information, see Date Formats in the Financial Toolbox™ User's Guide.)

  • The Data matrix contains 14 columns. These represent the time series labeled by Series.

FRED SeriesDescription
COEPaid compensation of employees in $ billions
CPIAUCSL Consumer Price Index
FEDFUNDSEffective federal funds rate
GCEGovernment consumption expenditures and investment in $ billions
GDPGross Domestic Product
GDPDEFGross domestic product in $ billions
GDPIGross private domestic investment in $ billions
GS10Ten-year treasury bond yield
HOANBSNon-farm business sector index of hours worked
M1SL M1 money supply (narrow money)
M2SLM2 money supply (broad money)
PCECPersonal consumption expenditures in $ billions
TB3MS Three-month treasury bill yield
UNRATEUnemployment rate

Exogenous Data Structure

The data for exogenous inputs, Xt, has a different form. Put the data in a T-by-1 matrix of cell arrays (a column vector of cell arrays). Each cell array is an n-by-r matrix, where n is the number of time series in the response data structure, and r is the number of exogenous time series. The n-by-r matrix consists of the usual MATLAB floating-point numbers.

At each time t, the n-by-r matrix Xt multiplies the r-by-1 vector b, yielding an n-vector.

This structure is augmented in the case of multiple paths. If there are p paths, put the data in a T-by-p matrix of cell arrays. Each entry in the cell array again is an n-by-r matrix.

Exogenous Data Structure.  

This example uses linear exogenous data. Suppose you have a T-by-3 data series. The exogenous series is a T-element cell array of 3-by-3 matrices, where cell i is i *eye(3):

T = 10;
xdata = cell(T,1);
for indx = 1:T
    xdata{indx} = indx * eye(3);
end

firstPeriodExogenousMatrix = xdata{1}
lastPeriodExogenousMatrix = xdata{T}
firstPeriodExogenousMatrix =

     1     0     0
     0     1     0
     0     0     1


lastPeriodExogenousMatrix =

    10     0     0
     0    10     0
     0     0    10

Data Preprocessing

Your data might have characteristics that violate assumptions for linear multiple time series models. For example, you can have data with exponential growth, or data from multiple sources at different periodicities. You must preprocess your data to convert it into an acceptable form for analysis.

  • For time series with exponential growth, you can preprocess the data by taking the logarithm of the growing series. In some cases you then difference the result. For an example, see Transforming Data for Stationarity.

  • For data from multiple sources, you must decide how best to fill in missing values. Commonly, you take the missing values as unchanged from the previous value, or as interpolated from neighboring values.

    Note:   If you take a difference of a series, the series becomes 1 shorter. If you take a difference of only some time series in a data set, truncate the other series so that all have the same length, or pad the differenced series with initial values.

Testing Data for Stationarity.  You can test each time series data column for stability using unit root tests. For details, see Unit Root Nonstationarity.

Partitioning Response Data

To fit a lagged model to data, partition the response data in up to three sections:

  • Presample data

  • Estimation data

  • Forecast data

The purpose of presample data is to provide initial values for lagged variables. When trying to fit a model to the estimation data, you need to access earlier times. For example, if the maximum lag in a model is 4, and if the earliest time in the estimation data is 50, then the model needs to access data at time 46 when fitting the observations at time 50. vgxvarx assumes the value 0 for any data that is not part of the presample data.

The estimation data contains the observations yt. Use the estimation data to fit the model.

Use the forecast data for comparing fitted model predictions against data. You do not have to have a forecast period. Use one to validate the predictive power of a fitted model.

The following figure shows how to arrange the data in the data matrix, with j presample rows and k forecast rows.

Seemingly Unrelated Regression for Exogenous Data

The following discussion refers to one path. The ideas apply to multiple paths as well.

Exogenous variables affect the response time series through multiplication by the regression coefficient b. For more information, see Types of VAR Models. At each time index, the exogenous data is an n-by-r matrix, where n is the number of time series in the multiple time series model, and r is the number of exogenous time series.

To give the vgxvarx fitting function more flexibility in fitting the effect of the exogenous series on each response time series, expand or replicate your exogenous time series. This is the technique of seemingly unrelated regression (SUR).

Suppose, for example, a tariff is imposed over some time period. You suspect that this tariff affects GNP and other economic quantities. Suppose you are examining data that contains four time series. Here are two ways of including the exogenous tariffs:

  • Each cell in the exogenous time series consists of either

    ones(4,1) or zeros(4,1):

    a 4–by-1 matrix indicating the presence (1) or absence (0) of the tariff during the time in question.

  • Each cell in the exogenous time series consists of either

    eye(4) or zeros(4):

    a 4–by-4 matrix indicating the presence (1) or absence (0) of the tariff during the time in question.

The advantage of the larger (replicated) formulation is that it allows for vgxvarx to estimate the influence of the tariff on each time series separately. The resulting regression coefficient vector b can have differing values for each component. The different values reflect the different direct influences of the tariff on each time series.

Similarly, if you have an exogenous series X(t), an n-by-1 vector for each time t, you can include it in your formulation as diag(X), an n-by-n matrix.

You can model linear trends in your data by including the exogenous matrix eye(n)*t at time index t.

Model Specification Structures

Models for Multiple Time Series

Econometrics Toolbox functions require a model specification structure as an input before they simulate, calibrate, forecast, or perform other calculations. You can specify a model with or without time series data. If you have data, you can fit models to the data as described in VAR Model Estimation. If you do not have data, you can specify a model with parameters you provide, as described in Specification Structures with Known Parameters.

Specifying Models

Create an Econometrics Toolbox multiple time series model specification structure using the vgxset function. Use this structure for calibrating, simulating, predicting, analyzing, and displaying multiple time series.

There are three ways to create model structures using the vgxset function:

  • Specification Structures with Known Parameters. Use this method when you know the values of all relevant parameters of your model.

  • Specification Structures with No Parameter Values. Use this method when you know the size, type, and number of lags in your model, but do not know the values of any of the AR or MA coefficients, or the value of your Q matrix.

  • Specification Structures with Selected Parameter Values. Use this method when you know the size, type, and number of lags in your model, and also know some, but not all, of the values of AR or MA coefficients. This method includes the case when you want certain parameters to be zero. You can specify as many parameters as you like. For example, you can specify certain parameters, request that vgxvarx estimate others, and specify other parameters with [] to indicate default values.

Specification Structures with Known Parameters

If you know the values of model parameters, create a model structure with the parameters. The following are the name-value pairs you can pass to vgxset for known parameter values:

Model Parameters

NameValue
a

An n-vector of offset constants. The default is empty.

AR0

An n-by-n invertible matrix representing the zero-lag structural AR coefficients. The default is empty, which means an identity matrix.

AR

An nAR-element cell array of n-by-n matrices of AR coefficients. The default is empty.

MA0

An n-by-n invertible matrix representing the zero-lag structural MA coefficients. The default is empty, which means an identity matrix.

MA

An nMA-element cell array of n-by-n matrices of MA coefficients. The default is empty.

b

An nX-vector of regression coefficients. The default is empty.

Q

An n-by-n symmetric innovations covariance matrix. The default is empty, which means an identity matrix.

ARlag

A monotonically increasing nAR-vector of AR lags. The default is empty, which means all the lags from 1 to p, the maximum AR lag.

MAlag

A monotonically increasing nMA-vector of MA lags. The default is empty, which means all the lags from 1 to q, the maximum MA lag.

vgxset infers the model dimensionality, given by n, p, and q in Types of VAR Models, from the input parameters. These parameters are n, nAR, and nMA respectively in vgxset syntax. For more information, see Specification Structures with No Parameter Values.

The ARlag and MAlag vectors allow you to specify which lags you want to include. For example, to specify AR lags 1 and 3 without lag 2, set ARlag to [1 3]. This setting corresponds to nAR = 2 for two specified lags, even though this is a third order model, since the maximum lag is 3.

The following example shows how to create a model structure when you have known parameters. Consider a VAR(1) model:

Specifically, a = [0.05, 0, –.05]' and wt are distributed as standard three-dimensional normal random variables.

Create a model specification structure with vgxset:

A1 = [.5 0 0;.1 .1 .3;0 .2 .3];
Q = eye(3);
Mdl = vgxset('a',[.05;0;-.05],'AR',{A1},'Q',Q)
Mdl = 

      Model: 3-D VAR(1) with Additive Constant
          n: 3
        nAR: 1
        nMA: 0
         nX: 0
          a: [0.05 0 -0.05] additive constants
         AR: {1x1 cell} stable autoregressive process
          Q: [3x3] covariance matrix

vgxset identifies this model as a stable VAR(1) model with three dimensions and additive constants.

Specification Structures with No Parameter Values

By default, vgxvarx fits all unspecified additive (a), AR, regression coefficients (b), and Q parameters. You must specify the number of time series and the type of model you want vgxvarx to fit. The following are the name-value pairs you can pass to vgxset for unknown parameter values:

Model Orders

NameValue
n

A positive integer specifying the number of time series. The default is 1.

nAR

A nonnegative integer specifying the number of AR lags (corresponds to p in Types of VAR Models). The default is 0.

nMA

A nonnegative integer specifying the number of MA lags (corresponds to q in Types of VAR Models). The default is 0.

Currently, vgxvarx cannot fit MA matrices. Therefore, specifying an nMA greater than 0 does not yield estimated MA matrices.

nX

A nonnegative integer specifying the number regression parameters (corresponds to r in Types of VAR Models). The default is 0.

Constant

Additive offset logical indicator. The default is false.

The following example shows how to specify the model in Specification Structures with Known Parameters, but without explicit parameters.

Mdl = vgxset('n',3,'nAR',1,'Constant',true)
Mdl = 

      Model: 3-D VAR(1) with Additive Constant
          n: 3
        nAR: 1
        nMA: 0
         nX: 0
          a: []
         AR: {}
          Q: []

Specification Structures with Selected Parameter Values

You can create a model structure with some known parameters, and have vgxvarx fit the unknown parameters to data. Here are the name-value pairs you can pass to vgxset for requested parameter values:

Model Parameter Estimation

NameValue
asolve

An n-vector of additive offset logical indicators. The default is empty, which means true(n,1).

ARsolve

An nAR-element cell array of n-by-n matrices of AR logical indicators. The default is empty, which means an nAR-element cell array of true(n).

AR0solve

An n-by-n matrix of AR0 logical indicators. The default is empty, which means false(n).

MAsolve

An nMA-element cell array of n-by-n matrices of MA logical indicators. The default is empty, which means false(n).

MA0solve

An n-by-n matrix of MA0 logical indicators. The default is empty, which means false(n).

bsolve

An nX-vector of regression logical indicators. The default is empty, which means true(n,1).

Qsolve

An n-by-n symmetric covariance matrix logical indicator. The default is empty, which means true(n), unless the 'CovarType' option of vgxvarx overrides it.

Specify a logical 1 (true) for every parameter that you want vgxvarx to estimate.

Currently, vgxvarx cannot fit the AR0, MA0, or MA matrices. Therefore, vgxvarx ignores the AR0solve, MA0solve, and MAsolve indicators. However, you can examine the Example_StructuralParams.m file for an approach to estimating the AR0 and MA0 matrices. Enter help Example_StructuralParams at the MATLAB command line for information. See Lütkepohl [69] Chapter 9 for algorithms for estimating structural models.

Currently, vgxvarx also ignores the Qsolve matrix. vgxvarx can fit either a diagonal or a full Q matrix; see vgxvarx.

This example shows how to specify the model in Specification Structures with Known Parameters, but with requested AR parameters with a diagonal autoregressive structure. The dimensionality of the model is known, as is the parameter vector a, but the autoregressive matrix A1 and covariance matrix Q are not known.

Mdl = vgxset('ARsolve',{logical(eye(3))},'a',...
    [.05;0;-.05])
Mdl = 

      Model: 3-D VAR(1) with Additive Constant
          n: 3
        nAR: 1
        nMA: 0
         nX: 0
          a: [0.05 0 -0.05] additive constants
         AR: {}
    ARsolve: {1x1 cell of logicals} autoregressive lag indicators
          Q: []

Displaying and Changing a Specification Structure

After you set up a model structure, you can examine it in several ways:

  • Call the vgxdisp function.

  • Double-click the structure in the MATLAB Workspace browser.

  • Call the vgxget function.

  • Enter Spec at the MATLAB command line, where Spec is the name of the model structure.

  • Enter Spec.ParamName at the MATLAB command line, where Spec is the name of the model structure, and ParamName is the name of the parameter you want to examine.

You can change any part of a model structure named, for example, Spec, using the vgxset as follows:

Spec = vgxset(Spec,'ParamName',value,...)

This syntax changes only the 'ParamName' parts of the Spec structure.

Determining an Appropriate Number of Lags

There are two Econometrics Toolbox functions that can help you determine an appropriate number of lags for your models:

  • The lratiotest function performs likelihood ratio tests to help identify the appropriate number of lags.

  • The aicbic function calculates the Akaike and Bayesian information criteria to determine the minimal appropriate number of required lags.

Example: Using the Likelihood Ratio Test to Calculate the Minimal Requisite Lag.  lratiotest requires inputs of the loglikelihood of an unrestricted model, the loglikelihood of a restricted model, and the number of degrees of freedom (DoF). DoF is the difference in the number of active parameters between the unrestricted and restricted models. lratiotest returns a Boolean: 1 means reject the restricted model in favor of the unrestricted model, 0 means there is insufficient evidence to reject the restricted model.

In the context of determining an appropriate number of lags, the restricted model has fewer lags, and the unrestricted model has more lags. Otherwise, test models with the same type of fitted parameters (for example, both with full Q matrices, or both with diagonal Q matrices).

  • Obtain the loglikelihood (LLF) of a model as the third output of vgxvarx:

    [EstSpec,EstStdErrors,LLF,W] = vgxvarx(...)
  • Obtain the number of active parameters in a model as the second output of vgxcount:

    [NumParam,NumActive] = vgxcount(Spec)

For example, suppose you have four fitted models with varying lag structures. The models have loglikelihoods LLF1, LLF2, LLF3, and LLF4, and active parameter counts n1p, n2p, n3p, and n4p. Suppose model 4 has the largest number of lags. Perform likelihood ratio tests of models 1, 2, and 3 against model 4, as follows:

reject1 = lratiotest(LLF4,LLF1,n4p - n1p)
reject2 = lratiotest(LLF4,LLF2,n4p - n2p)
reject3 = lratiotest(LLF4,LLF3,n4p - n3p)

If reject1 = 1, you reject model 1 in favor of model 4, and similarly for models 2 and 3. If any of the models have rejectI = 0, you have an indication that you can use fewer lags than in model 4.

Example: Using Akaike Information Criterion to Calculate the Minimal Requisite Lag.  aicbic requires inputs of the loglikelihood of a model and of the number of active parameters in the model. It returns the value of the Akaike information criterion. Lower values are better than higher values. aicbic accepts vectors of loglikelihoods and vectors of active parameters, and returns a vector of Akaike information criteria, which makes it easy to find the minimum.

  • Obtain the loglikelihood (LLF) of a model as the third output of vgxvarx:

    [EstSpec,EstStdErrors,LLF,W] = vgxvarx(...)
  • Obtain the number of active parameters in a model as the second output of vgxcount:

    [NumParam,NumActive] = vgxcount(Spec)

For example, suppose you have four fitted models with varying lag structures. The models have loglikelihoods LLF1, LLF2, LLF3, and LLF4, and active parameter counts n1p, n2p, n3p, and n4p. Calculate the Akaike information criteria as follows:

AIC = aicbic([LLF1 LLF2 LLF3 LLF4],[n1p n2p n3p n4p])

The most suitable model has the lowest value of the Akaike information criterion.

VAR Model Estimation

Preparing Models for Fitting

To create a model of multiple time series data, decide on a parametric form of the model, and fit parameters to the data. When you have a calibrated (fitted) model, check if the model fits the data adequately.

To fit a model to data, you must have:

There are several Econometrics Toolbox functions that aid these tasks, including:

Structural Matrices.  The structural matrices in SVARMAX models are the A0 and B0 matrices. See Types of VAR Models for definitions of these terms. Currently, vgxvarx cannot fit these matrices to data. However, you can examine the Example_StructuralParams.m file for an approach to estimating the AR0 and MA0 matrices. Enter help Example_StructuralParams at the MATLAB command line for information. See Lütkepohl [69] Chapter 9 for algorithms for estimating structural models.

Including Exogenous Data

Include exogenous data in a VARMAX model Xt by setting the nX field to the number of exogenous time series. See Types of VAR Models for definitions of these terms. To fit the coefficient vector b (which multiplies the exogenous matrices), create your model structure with vgxset to have either bsolve = true(r,1), where r is the number of elements in b, or have nX defined as the number of elements in b. By default, vgxvarx fits all elements of the b vector.

Changing Model Representations

You can convert a VARMA model to an equivalent VAR model using the vgxar function. (See Types of VAR Models for definitions of these terms.) Similarly, you can convert a VARMA model to an equivalent VMA model using the vgxma function. These functions are useful in the following situations:

  • Calibration of models

    The vgxvarx function can calibrate only VAR and VARX models. So to calibrate a VARMA model, you could first convert it to a VAR model. However, you can ask vgxvarx to ignore VMA terms and fit just the VAR structure. See Fit a VARMA Model for a comparison of conversion versus no conversion.

  • Forecasting

    It is straightforward to generate forecasts for VMA models. In fact, vgxpred internally converts models to VMA models to calculate forecast statistics.

  • Analyzing models

    Sometimes it is easier to define your model using one structure, but you want to analyze it using a different structure.

The algorithm for conversion between models involves series that are, in principle, infinite. The vgxar and vgxma functions truncate these series to the maximum of nMA and nAR, introducing an inaccuracy. You can specify that the conversion give more terms, or give terms to a specified accuracy. See [69] for more information on these transformations.

Convert a VARMA Model to a VAR Model.  

This example creates a VARMA model, then converts it to a pure VAR model.

Create a VARMA model specification structure.

A1 = [.2 -.1 0;.1 .2 .05;0 .1 .3];
A2 = [.3 0 0;.1 .4 .1;0 0 .2];
A3 = [.4 .1 -.1;.2 -.5 0;.05 .05 .2];
MA1 = .2*eye(3);
MA2 = [.3 .2 .1;.2 .4 0;.1 0 .5];
Spec = vgxset('AR',{A1,A2,A3},'MA',{MA1,MA2})
Spec = 

      Model: 3-D VARMA(3,2) with No Additive Constant
          n: 3
        nAR: 3
        nMA: 2
         nX: 0
         AR: {3x1 cell} stable autoregressive process
         MA: {2x1 cell} invertible moving average process
          Q: []

Convert the structure to a pure VAR structure:

SpecAR = vgxar(Spec)
SpecAR = 

      Model: 3-D VAR(3) with No Additive Constant
          n: 3
        nAR: 3
        nMA: 0
         nX: 0
         AR: {3x1 cell} unstable autoregressive process
          Q: []

The converted process is unstable; see the AR row. An unstable model could yield inaccurate predictions. Taking more terms in the series gives a stable model:

specAR4 = vgxar(Spec,4)
specAR4 = 

      Model: 3-D VAR(4) with No Additive Constant
          n: 3
        nAR: 4
        nMA: 0
         nX: 0
         AR: {4x1 cell} stable autoregressive process
          Q: []

Convert a VARMA Model to a VMA Model.  

This example uses a VARMA model and converts it to a pure VMA model.

Create a VARMA model specification structure.

A1 = [.2 -.1 0;.1 .2 .05;0 .1 .3];
A2 = [.3 0 0;.1 .4 .1;0 0 .2];
A3 = [.4 .1 -.1;.2 -.5 0;.05 .05 .2];
MA1 = .2*eye(3);
MA2 = [.3 .2 .1;.2 .4 0;.1 0 .5];
Spec = vgxset('AR',{A1,A2,A3},'MA',{MA1,MA2})
Spec = 

      Model: 3-D VARMA(3,2) with No Additive Constant
          n: 3
        nAR: 3
        nMA: 2
         nX: 0
         AR: {3x1 cell} stable autoregressive process
         MA: {2x1 cell} invertible moving average process
          Q: []

Convert the structure to a pure VAR structure:

SpecAR = vgxar(Spec)
SpecAR = 

      Model: 3-D VAR(3) with No Additive Constant
          n: 3
        nAR: 3
        nMA: 0
         nX: 0
         AR: {3x1 cell} unstable autoregressive process
          Q: []

Convert the model specification structure Spec to a pure MA structure:

SpecMA = vgxma(Spec)
SpecMA = 

      Model: 3-D VMA(3) with No Additive Constant
          n: 3
        nAR: 0
        nMA: 3
         nX: 0
         MA: {3x1 cell} non-invertible moving average process
          Q: []

Notice that the pure VMA process has more MA terms than the original process. The number is the maximum of nMA and nAR, and nAR = 3.

The converted VMA model is not invertible; see the MA row. A noninvertible model can yield inaccurate predictions. Taking more terms in the series results in an invertible model.

specMA4 = vgxma(Spec,4)
specMA4 = 

      Model: 3-D VMA(4) with No Additive Constant
          n: 3
        nAR: 0
        nMA: 4
         nX: 0
         MA: {4x1 cell} invertible moving average process
          Q: []

Converting the resulting VMA model to a pure VAR model results in the same VAR(3) model as SpecAR.

SpecAR2 = vgxar(SpecMA);
vgxdisp(SpecAR,SpecAR2)
  Model 1: 3-D VAR(3) with No Additive Constant
           Conditional mean is not AR-stable and is MA-invertible
  Model 2: 3-D VAR(3) with No Additive Constant
           Conditional mean is not AR-stable and is MA-invertible
       Parameter        Model 1        Model 2
  -------------- -------------- --------------
      AR(1)(1,1)            0.4            0.4 
           (1,2)           -0.1           -0.1 
           (1,3)             -0             -0 
           (2,1)            0.1            0.1 
           (2,2)            0.4            0.4 
           (2,3)           0.05           0.05 
           (3,1)             -0             -0 
           (3,2)            0.1            0.1 
           (3,3)            0.5            0.5 
      AR(2)(1,1)           0.52           0.52 
           (1,2)           0.22           0.22 
           (1,3)            0.1            0.1 
           (2,1)           0.28           0.28 
           (2,2)           0.72           0.72 
           (2,3)           0.09           0.09 
           (3,1)            0.1            0.1 
           (3,2)          -0.02          -0.02 
           (3,3)            0.6            0.6 
      AR(3)(1,1)          0.156          0.156 
           (1,2)         -0.004         -0.004 
           (1,3)          -0.18          -0.18 
           (2,1)          0.024          0.024 
           (2,2)         -0.784         -0.784 
           (2,3)         -0.038         -0.038 
           (3,1)          -0.01          -0.01 
           (3,2)          0.014          0.014 
           (3,3)          -0.17          -0.17 
          Q(:,:)             []             [] 

Conversion Types and Accuracy.  Some conversions occur when explicitly requested, such as those initiated by calls to vgxar and vgxma. Other conversions occur automatically as needed for calculations. For example, vgxpred internally converts models to VMA models to calculate forecast statistics.

By default, conversions give terms up to the largest lag present in the model. However, for more accuracy in conversion, you can specify that the conversion use more terms. You can also specify that it continue until a residual term is below a threshold you set. The syntax is

SpecAR = vgxar(Spec,nAR,ARlag,Cutoff)
SpecMA = vgxma(Spec,nMA,MAlag,Cutoff)
  • nMA and nAR represent the number of terms in the series.

  • ARlag and MAlag are vectors of particular lags that you want in the converted model.

  • Cutoff is a positive parameter that truncates the series if the norm of a converted term is smaller than Cutoff. Cutoff is 0 by default.

For details, see the function reference pages for vgxar and vgxma.

Fitting Models to Data

The vgxvarx function performs parameter estimation. vgxvarx only estimates parameters for VAR and VARX models. In other words, vgxvarx does not estimate moving average matrices, which appear, for example, in VMA and VARMA models. For definitions of these terms, see Types of VAR Models.

The vgxar function converts a VARMA model to a VAR model. Currently, it does not handle VARMAX models.

You have two choices in fitting parameters to a VARMA model or VARMAX model:

  • Set the vgxvarx 'IgnoreMA' parameter to 'yes' (the default is 'no'). In this case vgxvarx ignores VMA parameters, and fits the VARX parameters.

  • Convert a VARMA model to a VAR model using vgxar. Then fit the resulting VAR model using vgxvarx.

Each of these options is effective on some data. Try both if you have VMA terms in your model. See Fit a VARMA Model for an example showing both options.

Fit a VAR Model.  

This example uses two series: the consumer price index (CPI) and the unemployment rate (UNRATE) from the data set Data_USEconmodel.

Obtain the two time series, and convert them for stationarity:

load Data_USEconModel
CPI = Dataset.CPIAUCSL;
CPI = log(CPI);
CPID = diff(CPI);
DatesD = dates(2:end);
UNEM = Dataset.UNRATE;
Y = [CPID,UNEM(2:end)];

Create a VAR model:

Spec = vgxset('n',2,'nAR',4,'Constant',true)
Spec = 

      Model: 2-D VAR(4) with Additive Constant
          n: 2
        nAR: 4
        nMA: 0
         nX: 0
          a: []
         AR: {}
          Q: []

Fit the model to the data using vgxvarx:

[EstSpec,EstStdErrors,LLF,W] = vgxvarx(Spec,Y);
vgxdisp(EstSpec)
  Model  : 2-D VAR(4) with Additive Constant
           Conditional mean is AR-stable and is MA-invertible
  a Constant:
      0.00184568
        0.315567
  AR(1) Autoregression Matrix:
        0.308635     -0.0032011 
        -4.48152        1.34343 
  AR(2) Autoregression Matrix:
        0.224224     0.00124669 
         7.19015       -0.26822 
  AR(3) Autoregression Matrix:
        0.353274     0.00287036 
         1.48726      -0.227145 
  AR(4) Autoregression Matrix:
      -0.0473456   -0.000983119 
         8.63672      0.0768312 
  Q Innovations Covariance:
     3.51443e-05   -0.000186967 
    -0.000186967       0.116697 

Fit a VARMA Model.  

This example uses artificial data to generate a time series, then shows two ways of fitting a VARMA model to the series.

Specify the model:

AR1 = [.3 -.1 .05;.1 .2 .1;-.1 .2 .4];
AR2 = [.1 .05 .001;.001 .1 .01;-.01 -.01 .2];
Q = [.2 .05 .02;.05 .3 .1;.02 .1 .25];
MA1 = [.5 .2 .1;.1 .6 .2;0 .1 .4];
MA2 = [.2 .1 .1; .05 .1 .05;.02 .04 .2];
Spec = vgxset('AR',{AR1,AR2},'Q',Q,'MA',{MA1,MA2})
Spec = 

      Model: 3-D VARMA(2,2) with No Additive Constant
          n: 3
        nAR: 2
        nMA: 2
         nX: 0
         AR: {2x1 cell} stable autoregressive process
         MA: {2x1 cell} invertible moving average process
          Q: [3x3] covariance matrix

Generate a time series using vgxsim:

YF = [100 50 20;110 52 22;119 54 23]; % starting values
rng(1);                               % For reproducibility
Y = vgxsim(Spec,190,[],YF);

Fit the data to a VAR model by calling vgxvarx with the 'IgnoreMA' option:

estSpec = vgxvarx(Spec,Y(3:end,:),[],Y(1:2,:),'IgnoreMA','yes');

Compare the estimated model with the original:

vgxdisp(Spec,estSpec)
  Model 1: 3-D VARMA(2,2) with No Additive Constant
           Conditional mean is AR-stable and is MA-invertible
  Model 2: 3-D VAR(2) with No Additive Constant
           Conditional mean is AR-stable and is MA-invertible
       Parameter        Model 1        Model 2
  -------------- -------------- --------------
      AR(1)(1,1)            0.3       0.723964 
           (1,2)           -0.1       0.119695 
           (1,3)           0.05        0.10452 
           (2,1)            0.1      0.0828041 
           (2,2)            0.2       0.788177 
           (2,3)            0.1       0.299648 
           (3,1)           -0.1      -0.138715 
           (3,2)            0.2       0.397231 
           (3,3)            0.4       0.748157 
      AR(2)(1,1)            0.1      -0.126833 
           (1,2)           0.05     -0.0690256 
           (1,3)          0.001      -0.118524 
           (2,1)          0.001      0.0431623 
           (2,2)            0.1      -0.265387 
           (2,3)           0.01      -0.149646 
           (3,1)          -0.01       0.107702 
           (3,2)          -0.01      -0.304243 
           (3,3)            0.2      0.0165912 
      MA(1)(1,1)            0.5                
           (1,2)            0.2                
           (1,3)            0.1                
           (2,1)            0.1                
           (2,2)            0.6                
           (2,3)            0.2                
           (3,1)              0                
           (3,2)            0.1                
           (3,3)            0.4                
      MA(2)(1,1)            0.2                
           (1,2)            0.1                
           (1,3)            0.1                
           (2,1)           0.05                
           (2,2)            0.1                
           (2,3)           0.05                
           (3,1)           0.02                
           (3,2)           0.04                
           (3,3)            0.2                
          Q(1,1)            0.2       0.193553 
          Q(2,1)           0.05      0.0408221 
          Q(2,2)            0.3       0.252461 
          Q(3,1)           0.02     0.00690626 
          Q(3,2)            0.1      0.0922074 
          Q(3,3)           0.25       0.306271 

The estimated Q matrix is close to the original Q matrix. However, the estimated AR terms are not close to the original AR terms. Specifically, nearly all the AR(2) coefficients are the opposite sign, and most AR(1) coefficients are off by about a factor of 2.

Alternatively, before fitting the model, convert it to a pure AR model. To do this, specify the model and generate a time series as above. Then, convert the model to a pure AR model:

specAR = vgxar(Spec);

Fit the converted model to the data:

estSpecAR = vgxvarx(specAR,Y(3:end,:),[],Y(1:2,:));

Compare the fitted model to the original model:

vgxdisp(specAR,estSpecAR)
  Model 1: 3-D VAR(2) with No Additive Constant
           Conditional mean is AR-stable and is MA-invertible
  Model 2: 3-D VAR(2) with No Additive Constant
           Conditional mean is AR-stable and is MA-invertible
       Parameter        Model 1        Model 2
  -------------- -------------- --------------
      AR(1)(1,1)            0.8       0.723964 
           (1,2)            0.1       0.119695 
           (1,3)           0.15        0.10452 
           (2,1)            0.2      0.0828041 
           (2,2)            0.8       0.788177 
           (2,3)            0.3       0.299648 
           (3,1)           -0.1      -0.138715 
           (3,2)            0.3       0.397231 
           (3,3)            0.8       0.748157 
      AR(2)(1,1)          -0.13      -0.126833 
           (1,2)          -0.09     -0.0690256 
           (1,3)         -0.114      -0.118524 
           (2,1)         -0.129      0.0431623 
           (2,2)          -0.35      -0.265387 
           (2,3)         -0.295      -0.149646 
           (3,1)           0.03       0.107702 
           (3,2)          -0.17      -0.304243 
           (3,3)           0.05      0.0165912 
          Q(1,1)            0.2       0.193553 
          Q(2,1)           0.05      0.0408221 
          Q(2,2)            0.3       0.252461 
          Q(3,1)           0.02     0.00690626 
          Q(3,2)            0.1      0.0922074 
          Q(3,3)           0.25       0.306271 

The model coefficients between the pure AR models are closer than between the original VARMA model and the fitted AR model. Most model coefficients are within 20% or the original. Notice, too, that estSpec and estSpecAR are identical. This is because both are AR(2) models fitted to the same data series.

How vgxvarx Works.  vgxvarx finds maximum likelihood estimators of AR and Q matrices and the a and b vectors if present. For VAR models, vgxvarx uses a direct solution algorithm that requires no iterations. For VARX models, the algorithm iterates. The iterations usually converge quickly, unless two or more exogenous data streams are proportional to each other. In that case, there is no unique maximum likelihood estimator, and the iterations might not converge. You can set the maximum number of iterations with the MaxIter parameter, which has a default value of 1000.

vgxvarx calculates the loglikelihood of the data, giving it as an output of the fitted model. Use this output in testing the quality of the model. For example, see Determining an Appropriate Number of Lags and Examining the Stability of a Fitted Model.

Examining the Stability of a Fitted Model

When you enter the name of a fitted model at the command line, you obtain a summary. This summary contains a report on the stability of the VAR part of the model, and the invertibility of the VMA part. You can also find whether a model is stable or invertible by entering:

[isStable,isInvertible] = vgxqual(Spec)

This gives a Boolean value of 1 for isStable if the model is stable, and a Boolean value of 1 for isInvertible if the model is invertible. This stability or invertibility does not take into account exogenous terms, which can disrupt the stability of a model.

Stable, invertible models give reliable results, while unstable or noninvertible ones might not.

Stability and invertibility are equivalent to all eigenvalues of the associated lag operators having modulus less than 1. In fact vgxqual evaluates these quantities by calculating eigenvalues. For more information, see the vgxqual function reference page or Hamilton [48]

VAR Model Forecasting, Simulation, and Analysis

VAR Forecasting

When you have models with parameters (known or estimated), you can examine the predictions of the models. For information on specifying models, see Model Specification Structures. For information on calibrating models, see VAR Model Estimation.

The main methods of forecasting are:

  • Generating forecasts with error bounds using vgxpred

  • Generating simulations with vgxsim

  • Generating sample paths with vgxproc

These functions base their forecasts on a model specification and initial data. The functions differ in their innovations processes:

  • vgxpred assumes zero innovations. Therefore, vgxpred yields a deterministic forecast.

  • vgxsim assumes the innovations are jointly normal with covariance matrix Q. vgxsim yields pseudorandom (Monte Carlo) sample paths.

  • vgxproc uses a separate input for the innovations process. vgxproc yields a sample path that is deterministically based on the innovations process.

vgxpred is faster and takes less memory than generating many sample paths using vgxsim. However, vgxpred is not as flexible as vgxsim. For example, suppose you transform some time series before making a model, and want to undo the transformation when examining forecasts. The error bounds given by transforms of vgxpred error bounds are not valid bounds. In contrast, the error bounds given by the statistics of transformed simulations are valid.

Forecast a VAR Model.  

This example shows how to use vgxpred to forecast a VAR model.

vgxpred enables you to generate forecasts with error estimates. vgxpred requires:

  • A fully-specified model (e.g., impSpec in what follows)

  • The number of periods for the forecast (e.g., FT in what follows)

vgxpred optionally takes:

  • An exogenous data series

  • A presample time series (e.g., Y(end-3:end,:) in what follows)

  • Extra paths

Load the Data_USEconModel data set. This example uses two time series: the logarithm of real GDP, and the real 3-month T-bill rate, both differenced to be approximately stationary. Suppose that a VAR(4) model is appropriate to describe the time series.

load Data_USEconModel
DEF = log(Dataset.CPIAUCSL);
GDP = log(Dataset.GDP);
rGDP = diff(GDP - DEF); % Real GDP is GDP - deflation
TB3 = 0.01*Dataset.TB3MS;
dDEF = 4*diff(DEF); % Scaling
rTB3 = TB3(2:end) - dDEF; % Real interest is deflated
Y = [rGDP,rTB3];

Fit a VAR(4) model specification:

Spec = vgxset('n',2,'nAR',4,'Constant',true);
impSpec = vgxvarx(Spec,Y(5:end,:),[],Y(1:4,:));
impSpec = vgxset(impSpec,'Series',...
  {'Transformed real GDP','Transformed real 3-mo T-bill rate'});

Predict the evolution of the time series:

FDates = datenum({'30-Jun-2009'; '30-Sep-2009'; '31-Dec-2009';
'31-Mar-2010'; '30-Jun-2010'; '30-Sep-2010'; '31-Dec-2010';
'31-Mar-2011'; '30-Jun-2011'; '30-Sep-2011'; '31-Dec-2011';
'31-Mar-2012'; '30-Jun-2012'; '30-Sep-2012'; '31-Dec-2012';
'31-Mar-2013'; '30-Jun-2013'; '30-Sep-2013'; '31-Dec-2013';
'31-Mar-2014'; '30-Jun-2014' });
FT = numel(FDates);
[Forecast,ForecastCov] = vgxpred(impSpec,FT,[],...
    Y(end-3:end,:));

View the forecast using vgxplot:

vgxplot(impSpec,Y(end-10:end,:),Forecast,ForecastCov);

Forecast a VAR Model Using Monte Carlo Simulation.  

This example shows how to use Monte Carlo simulation via vgxsim to forecast a VAR model.

vgxsim enables you to generate simulations of time series based on your model. If you have a trustworthy model structure, you can use these simulations as sample forecasts.

vgxsim requires:

  • A model (impSpec in what follows)

  • The number of periods for the forecast (FT in what follows)

vgxsim optionally takes:

  • An exogenous data series

  • A presample time series (Y(end-3:end,:) in what follows)

  • Presample innovations

  • The number of realizations to simulate (2000 in what follows)

Load the Data_USEconModel data set. This example uses two time series: the logarithm of real GDP, and the real 3-month T-bill rate, both differenced to be approximately stationary. For illustration, a VAR(4) model describes the time series.

load Data_USEconModel
DEF = log(Dataset.CPIAUCSL);
GDP = log(Dataset.GDP);
rGDP = diff(GDP - DEF); % Real GDP is GDP - deflation
TB3 = 0.01*Dataset.TB3MS;
dDEF = 4*diff(DEF); % Scaling
rTB3 = TB3(2:end) - dDEF; % Real interest is deflated
Y = [rGDP,rTB3];

Fit a VAR(4) model specification:

Spec = vgxset('n',2,'nAR',4,'Constant',true);
impSpec = vgxvarx(Spec,Y(5:end,:),[],Y(1:4,:));
impSpec = vgxset(impSpec,'Series',...
  {'Transformed real GDP','Transformed real 3-mo T-bill rate'});

Define the forecast horizon.

FDates = datenum({'30-Jun-2009'; '30-Sep-2009'; '31-Dec-2009';
'31-Mar-2010'; '30-Jun-2010'; '30-Sep-2010'; '31-Dec-2010';
'31-Mar-2011'; '30-Jun-2011'; '30-Sep-2011'; '31-Dec-2011';
'31-Mar-2012'; '30-Jun-2012'; '30-Sep-2012'; '31-Dec-2012';
'31-Mar-2013'; '30-Jun-2013'; '30-Sep-2013'; '31-Dec-2013';
'31-Mar-2014'; '30-Jun-2014' });
FT = numel(FDates);

Simulate the model for 10 steps, replicated 2000 times:

rng(1); %For reproducibility
Ysim = vgxsim(impSpec,FT,[],Y(end-3:end,:),[],2000);

Calculate the mean and standard deviation of the simulated series:

Ymean = mean(Ysim,3); % Calculate means
Ystd = std(Ysim,0,3); % Calculate std deviations

Plot the means +/- 1 standard deviation for the simulated series:

subplot(2,1,1)
plot(dates(end-10:end),Y(end-10:end,1),'k')
hold('on')
plot([dates(end);FDates],[Y(end,1);Ymean(:,1)],'r')
plot([dates(end);FDates],[Y(end,1);Ymean(:,1)]+[0;Ystd(:,1)],'b')
plot([dates(end);FDates],[Y(end,1);Ymean(:,1)]-[0;Ystd(:,1)],'b')
datetick('x')
title('Transformed real GDP')
subplot(2,1,2)
plot(dates(end-10:end),Y(end-10:end,2),'k')
hold('on')
axis([dates(end-10),FDates(end),-.1,.1]);
plot([dates(end);FDates],[Y(end,2);Ymean(:,2)],'r')
plot([dates(end);FDates],[Y(end,2);Ymean(:,2)]+[0;Ystd(:,2)],'b')
plot([dates(end);FDates],[Y(end,2);Ymean(:,2)]-[0;Ystd(:,2)],'b')
datetick('x')
title('Transformed real 3-mo T-bill rate')

How vgxpred and vgxsim Work.  vgxpred generates two quantities:

  • A deterministic forecast time series based on 0 innovations

  • Time series of forecast covariances based on the Q matrix

The simulations for models with VMA terms uses presample innovation terms. Presample innovation terms are values of εt for times before the forecast period that affect the MA terms. For definitions of the terms MA, Q, and εt, see Types of VAR Models. If you do not provide all requisite presample innovation terms, vgxpred assumes the value 0 for missing terms.

vgxsim generates random time series based on the model using normal random innovations distributed with Q covariances. The simulations of models with MA terms uses presample innovation terms. If you do not provide all requisite presample innovation terms, vgxsim assumes the value 0 for missing terms.

Data Scaling

If you scaled any time series before fitting a model, you can unscale the resulting time series to understand its predictions more easily.

  • If you scaled a series with log, transform predictions of the corresponding model with exp.

  • If you scaled a series with diff(log), transform predictions of the corresponding model with cumsum(exp). cumsum is the inverse of diff; it calculates cumulative sums. As in integration, you must choose an appropriate additive constant for the cumulative sum. For example, take the log of the final entry in the corresponding data series, and use it as the first term in the series before applying cumsum.

Calculating Impulse Responses

You can examine the effect of impulse responses to models with the vgxproc function. An impulse response is the deterministic response of a time series model to an innovations process that has the value of one standard deviation in one component at the initial time, and zeros in all other components and times. vgxproc simulates the evolution of a time series model from a given innovations process. Therefore, vgxproc is appropriate for examining impulse responses.

The only difficulty in using vgxproc is determining exactly what is "the value of one standard deviation in one component at the initial time." This value can mean different things depending on your model.

  • For a structural model, B0 is usually a known diagonal matrix, and Q is an identity matrix. In this case, the impulse response to component i is the square root of B(i,i).

  • For a nonstructural model, there are several choices. The simplest choice, though not necessarily the most accurate, is to take component i as the square root of Q(i,i). Other possibilities include taking the Cholesky decomposition of Q, or diagonalizing Q and taking the square root of the diagonal matrix.

Generate Impulse Responses for a VAR model.  

This example shows how to generate impulse responses of an interest rate shock on real GDP using vgxproc.

Load the Data_USEconModel data set. This example uses two time series: the logarithm of real GDP, and the real 3-month T-bill rate, both differenced to be approximately stationary. Suppose that a VAR(4) model is appropriate to describe the time series.

load Data_USEconModel
DEF = log(Dataset.CPIAUCSL);
GDP = log(Dataset.GDP);
rGDP = diff(GDP - DEF); % Real GDP is GDP - deflation
TB3 = 0.01*Dataset.TB3MS;
dDEF = 4*diff(DEF); % Scaling
rTB3 = TB3(2:end) - dDEF; % Real interest is deflated
Y = [rGDP,rTB3];

Define the forecast horizon.

FDates = datenum({'30-Jun-2009'; '30-Sep-2009'; '31-Dec-2009';
'31-Mar-2010'; '30-Jun-2010'; '30-Sep-2010'; '31-Dec-2010';
'31-Mar-2011'; '30-Jun-2011'; '30-Sep-2011'; '31-Dec-2011';
'31-Mar-2012'; '30-Jun-2012'; '30-Sep-2012'; '31-Dec-2012';
'31-Mar-2013'; '30-Jun-2013'; '30-Sep-2013'; '31-Dec-2013';
'31-Mar-2014'; '30-Jun-2014' });
FT = numel(FDates);

Fit a VAR(4) model specification:

Spec = vgxset('n',2,'nAR',4,'Constant',true);
impSpec = vgxvarx(Spec,Y(5:end,:),[],Y(1:4,:));
impSpec = vgxset(impSpec,'Series',...
  {'Transformed real GDP','Transformed real 3-mo T-bill rate'});

Generate the innovations processes both with and without an impulse (shock):

W0 = zeros(FT, 2); % Innovations without a shock
W1 = W0;
W1(1,2) = sqrt(impSpec.Q(2,2)); % Innovations with a shock

Generate the processes with and without the shock:

Yimpulse = vgxproc(impSpec,W1,[],Y); % Process with shock
Ynoimp = vgxproc(impSpec,W0,[],Y); % Process with no shock

Undo the scaling for the GDP processes:

Yimp1 = exp(cumsum(Yimpulse(:,1))); % Undo scaling
Ynoimp1 = exp(cumsum(Ynoimp(:,1)));

Compute and plot the relative difference between the calculated GDPs:

RelDiff = (Yimp1 - Ynoimp1) ./ Yimp1;
plot(FDates,100*RelDiff);dateaxis('x',12)
title(...
'Impact of Interest Rate Shock on Real Gross Domestic Product')
ylabel('% Change')

The graph shows that an increased interest rate causes a dip in the real GDP for a short time. Afterwards the real GDP begins to climb again, reaching its former level in about 1 year.

VAR Model Case Study

Overview of the Case Study

This section contains an example of the workflow described in Building VAR Models. The example uses three time series: GDP, M1 money supply, and the 3-month T-bill rate. The example shows:

  1. Loading and transforming the data for stationarity

  2. Partitioning the transformed data into presample, estimation, and forecast intervals to support a backtesting experiment

  3. Making several models

  4. Fitting the models to the data

  5. Deciding which of the models is best

  6. Making forecasts based on the best model

Loading and Transforming Data

Loading Data.  The file Data_USEconModel ships with Econometrics Toolbox software. The file contains time series from the St. Louis Federal Reserve Economics Database. This example uses three of the time series: GDP, M1 money supply (M1SL), and 3-month T-bill rate (TB3MS). Load the data into a time series matrix Y as follows:

load Data_USEconModel
GDP = Dataset.GDP;
M1 = Dataset.M1SL;
TB3 = Dataset.TB3MS;
Y = [GDP,M1,TB3];

Transforming Data for Stationarity.  Plot the data to look for trends:

figure
subplot(3,1,1)
plot(dates,Y(:,1),'r');
title('GDP')
datetick('x')
grid on
subplot(3,1,2);
plot(dates,Y(:,2),'b');
title('M1')
datetick('x')
grid on
subplot(3,1,3);
plot(dates, Y(:,3), 'k')
title('3-mo T-bill')
datetick('x')
grid on
hold off

Not surprisingly, both the GDP and M1 data appear to grow, while the T-bill returns show no long-term growth. To counter the trends in GDP and M1, take a difference of the logarithms of the data. Taking a difference shortens the time series, as described in Transforming Data for Stationarity. Therefore, truncate the T-bill series and date series X so that the Y data matrix has the same number of rows for each column:

Y = [diff(log(Y(:,1:2))), Y(2:end,3)]; % Transformed data
X = dates(2:end);

figure
subplot(3,1,1)
plot(X,Y(:,1),'r');
title('GDP')
datetick('x')
grid on
subplot(3,1,2);
plot(X,Y(:,2),'b');
title('M1')
datetick('x')
grid on
subplot(3,1,3);
plot(X, Y(:,3),'k'),
title('3-mo T-bill')
datetick('x')
grid on

You see that the scale of the first two columns is about 100 times smaller than the third. Multiply the first two columns by 100 so that the time series are all roughly on the same scale. This scaling makes it easy to plot all the series on the same plot. More importantly, this type of scaling makes optimizations more numerically stable (for example, maximizing loglikelihoods).

Y(:,1:2) = 100*Y(:,1:2);
figure
plot(X,Y(:,1),'r');
hold on
plot(X,Y(:,2),'b');
datetick('x')
grid on
plot(X,Y(:,3),'k');
legend('GDP','M1','3-mo T-bill');
hold off

Selecting and Fitting Models

Selecting Models.  You can choose many different models for the data. This example rather arbitrarily chooses four models:

  • VAR(2) with diagonal autoregressive and covariance matrices

  • VAR(2) with full autoregressive and covariance matrices

  • VAR(4) with diagonal autoregressive and covariance matrices

  • VAR(4) with full autoregressive and covariance matrices

Make the series the same length, and transform them to be stationary and on a similar scale.

dGDP = 100*diff(log(GDP(49:end)));
dM1 = 100*diff(log(M1(49:end)));
dT3 = diff(TB3(49:end));
Y = [dGDP dM1 dT3];

Create the four models as follows:

dt = logical(eye(3));
VAR2diag = vgxset('ARsolve',repmat({dt},2,1),...
    'asolve',true(3,1),'Series',{'GDP','M1','3-mo T-bill'});
VAR2full = vgxset(VAR2diag,'ARsolve',[]);
VAR4diag = vgxset(VAR2diag,'nAR',4,'ARsolve',repmat({dt},4,1));
VAR4full = vgxset(VAR2full,'nAR',4);

The matrix dt is a diagonal logical matrix. dt specifies that both the autoregressive matrices for VAR2diag and VAR4diag are diagonal. In contrast, the specifications for VAR2full and VAR4full have empty matrices instead of dt. Therefore, vgxvarx fits the defaults, which are full matrices for autoregressive and correlation matrices.

Choosing Presample, Estimation, and Forecast Periods.  To assess the quality of the models, divide the response data Y into three periods: presample, estimation, and forecast. Fit the models to the estimation data, using the presample period to provide lagged data. Compare the predictions of the fitted models to the forecast data. The estimation period is in-sample, and the forecast period is out-of-sample (also known as backtesting).

For the two VAR(4) models, the presample period is the first four rows of Y. Use the same presample period for the VAR(2) models so that all the models are fit to the same data. This is necessary for model fit comparisons. For both models, the forecast period is the final 10% of the rows of Y. The estimation period for the models goes from row 5 to the 90% row. The code for defining these data periods follows:

Ypre = Y(1:4,:);
T = ceil(.9*size(Y,1));
Yest = Y(5:T,:);
YF = Y((T+1):end,:);
TF = size(YF,1);

Fitting with vgxvarx.  Now that the models and time series exist, you can easily fit the models to the data:

[EstSpec1,EstStdErrors1,LLF1,W1] = ...
    vgxvarx(VAR2diag,Yest,[],Ypre,'CovarType','Diagonal');
[EstSpec2,EstStdErrors2,LLF2,W2] = ...
    vgxvarx(VAR2full,Yest,[],Ypre);
[EstSpec3,EstStdErrors3,LLF3,W3] = ...
    vgxvarx(VAR4diag,Yest,[],Ypre,'CovarType','Diagonal');
[EstSpec4,EstStdErrors4,LLF4,W4] = ...
    vgxvarx(VAR4full,Yest,[],Ypre);
  • The EstSpec structures are the fitted models.

  • The EstStdErrors structures contain the standard errors of the fitted models.

  • The LLF are the loglikelihoods of the fitted models. Use these to help select the best model, as described in Checking Model Adequacy.

  • The W are the estimated innovations (residuals) processes, the same size as Yest.

  • The specification for EstSpec1 and EstSpec3 includes diagonal covariance matrices.

Checking Model Adequacy

Checking Stability.  You can check whether the estimated models are stable and invertible with the vgxqual function. (There are no MA terms in these models, so the models are necessarily invertible.) The test shows that all the estimated models are stable:

[isStable1,isInvertible1] = vgxqual(EstSpec1);
[isStable2,isInvertible2] = vgxqual(EstSpec2);
[isStable3,isInvertible3] = vgxqual(EstSpec3);
[isStable4,isInvertible4] = vgxqual(EstSpec4);
[isStable1,isStable2,isStable3,isStable4]
ans =

     1     1     1     1

You can also look at the estimated specification structures. Each contains a line stating whether the model is stable:

EstSpec4
EstSpec4 = 

      Model: 3-D VAR(4) with Additive Constant
     Series: {'GDP' 'M1' '3-mo T-bill'}
          n: 3
        nAR: 4
        nMA: 0
         nX: 0
          a: [0.524224 0.106746 -0.671714] additive constants
     asolve: [1 1 1 logical] additive constant indicators
         AR: {4x1 cell} stable autoregressive process
    ARsolve: {4x1 cell of logicals} autoregressive lag indicators
          Q: [3x3] covariance matrix
     Qsolve: [3x3 logical] covariance matrix indicators

Likelihood Ratio Tests.  You can compare the restricted (diagonal) AR models to their unrestricted (full) counterparts using the lratiotest function. The test rejects or fails to reject the hypothesis that the restricted models are adequate, with a default 5% tolerance. This is an in-sample test. To perform the test:

  1. Count the parameters in the models using the vgxcount function:

    [n1,n1p] = vgxcount(EstSpec1);
    [n2,n2p] = vgxcount(EstSpec2);
    [n3,n3p] = vgxcount(EstSpec3);
    [n4,n4p] = vgxcount(EstSpec4);
    
  2. Compute the likelihood ratio tests:

    reject1 = lratiotest(LLF2,LLF1,n2p - n1p)
    reject3 = lratiotest(LLF4,LLF3,n4p - n3p)
    reject4 = lratiotest(LLF4,LLF2,n4p - n2p)
    
    reject1 =
    
         1
    
    
    reject3 =
    
         1
    
    
    reject4 =
    
         0
    
    

The 1 results indicate that the likelihood ratio test rejected both the restricted models, AR(1) and AR(3), in favor of the corresponding unrestricted models. Therefore, based on this test, the unrestricted AR(2) and AR(4) models are preferable. However, the test does not reject the unrestricted AR(2) model in favor of the unrestricted AR(4) model. (This test regards the AR(2) model as an AR(4) model with restrictions that the autoregression matrices AR(3) and AR(4) are 0.) Therefore, it seems that the unrestricted AR(2) model is best among the models fit.

Akaike Information Criterion.  To find the best model among a set, minimize the Akaike information criterion. This is an in-sample calculation. Here is how to calculate the criterion for the four models:

AIC = aicbic([LLF1 LLF2 LLF3 LLF4],[n1p n2p n3p n4p])
AIC =

   1.0e+03 *

    1.5129    1.4462    1.5122    1.4628

The best model according to this criterion is the unrestricted AR(2) model. Notice, too, that the unrestricted AR(4) model has lower Akaike information than either of the restricted models. Based on this criterion, the unrestricted AR(2) model is best, with the unrestricted AR(4) model coming next in preference.

Comparing Forecasts with Forecast Period Data.  To compare the predictions of the four models against the forecast data YF, use the vgxpred function. This function returns both a prediction of the mean time series, and an error covariance matrix that gives confidence intervals about the means. This is an out-of-sample calculation.

[FY1,FYCov1] = vgxpred(EstSpec1,TF,[],Yest);
[FY2,FYCov2] = vgxpred(EstSpec2,TF,[],Yest);
[FY3,FYCov3] = vgxpred(EstSpec3,TF,[],Yest);
[FY4,FYCov4] = vgxpred(EstSpec4,TF,[],Yest);

A plot shows the predictions in the shaded region to the right:

figure
vgxplot(EstSpec2,Yest,FY2,FYCov2)

It is now straightforward to calculate the sum of squares error between the predictions and the data, YF:

error1 = YF - FY1;
error2 = YF - FY2;
error3 = YF - FY3;
error4 = YF - FY4;

SSerror1 = error1(:)' * error1(:);
SSerror2 = error2(:)' * error2(:);
SSerror3 = error3(:)' * error3(:);
SSerror4 = error4(:)' * error4(:);
figure
bar([SSerror1 SSerror2 SSerror3 SSerror4],.5)
ylabel('Sum of squared errors')
set(gca,'XTickLabel',...
    {'AR2 diag' 'AR2 full' 'AR4 diag' 'AR4 full'})
title('Sum of Squared Forecast Errors')

The predictive performance of the four models is similar.

The full AR(2) model seems to be the best and most parsimonious fit. Its model parameters are:

vgxdisp(EstSpec2)
  Model  : 3-D VAR(2) with Additive Constant
           Conditional mean is AR-stable and is MA-invertible
  Series : GDP
  Series : M1
  Series : 3-mo T-bill
  a Constant:
        0.687401
          0.3856
       -0.915879
  AR(1) Autoregression Matrix:
        0.272374     -0.0162214      0.0928186 
       0.0563884       0.240527      -0.389905 
        0.280759     -0.0712716       -0.32747 
  AR(2) Autoregression Matrix:
        0.242554       0.140464      -0.177957 
      0.00130726       0.380042     -0.0484981 
        0.260414       0.024308       -0.43541 
  Q Innovations Covariance:
        0.632182       0.105925       0.216806 
        0.105925       0.991607      -0.155881 
        0.216806      -0.155881        1.00082 

Forecasting

This example shows two ways to make predictions or forecasts based on the EstSpec4 fitted model:

  • Running vgxpred based on the last few rows of YF.

  • Simulating several time series with the vgxsim function.

In both cases, transform the forecasts so they are directly comparable to the original time series.

Forecasting with vgxpred.  This example shows predictions 10 steps into the future.

  1. Generate the prediction time series from the fitted model beginning at the latest times:

    [ypred,ycov] = vgxpred(EstSpec2,10,[],YF);
    
  2. Transform the predictions by undoing the scaling and differencing applied to the original data. Make sure to insert the last observation at the beginning of the time series before using cumsum to undo the differencing. And, since differencing occurred after taking logarithms, insert the logarithm before using cumsum:

    yfirst = [GDP,M1,TB3];
    yfirst = yfirst(49:end,:);           % Remove NaNs
    dates = dates(49:end);
    endpt = yfirst(end,:);
    endpt(1:2) = log(endpt(1:2));
    ypred(:,1:2) = ypred(:,1:2)/100;     % Rescale percentage
    ypred = [endpt; ypred];              % Prepare for cumsum
    ypred(:,1:3) = cumsum(ypred(:,1:3));
    ypred(:,1:2) = exp(ypred(:,1:2));
    lastime = dates(end);
    timess = lastime:91:lastime+910;     % Insert forecast horizon
    
    figure
    subplot(3,1,1)
    plot(timess,ypred(:,1),':r')
    hold on
    plot(dates,yfirst(:,1),'k')
    datetick('x')
    grid on
    title('GDP')
    subplot(3,1,2);
    plot(timess,ypred(:,2),':r')
    hold on
    plot(dates,yfirst(:,2),'k')
    datetick('x')
    grid on
    title('M1')
    subplot(3,1,3);
    plot(timess,ypred(:,3),':r')
    hold on
    plot(dates,yfirst(:,3),'k')
    datetick('x')
    grid on
    title('3-mo T-bill')
    hold off
    

    The plot shows the extrapolations in dotted red, and the original data series in solid black.

Look at the last few years in this plot to get a sense of how the predictions relate to the latest data points:

ylast = yfirst(170:end,:);
timeslast = dates(170:end);

figure
subplot(3,1,1)
plot(timess,ypred(:,1),'--r')
hold on
plot(timeslast,ylast(:,1),'k')
datetick('x')
grid on
title('GDP')
subplot(3,1,2);
plot(timess,ypred(:,2),'--r')
hold on
plot(timeslast,ylast(:,2),'k')
datetick('x')
grid on
title('M1')
subplot(3,1,3);
plot(timess,ypred(:,3),'--r')
hold on
plot(timeslast,ylast(:,3),'k')
datetick('x')
grid on
title('3-mo T-bill')
hold off

The forecast shows increasing GDP, little growth in M1, and a slight decline in the interest rate. However, the forecast has no error bars. For a forecast with errors, see Forecasting with vgxsim.

Forecasting with vgxsim.  This example shows forecasting 10 steps into the future, with a simulation replicated 2000 times, and generates the means and standard deviations.

  1. Simulate a time series from the fitted model beginning at the latest times:

    rng(1); % For reproducibility
    ysim = vgxsim(EstSpec2,10,[],YF,[],2000);
    
  2. Transform the predictions by undoing the scaling and differencing applied to the original data. Make sure to insert the last observation at the beginning of the time series before using cumsum to undo the differencing. And, since differencing occurred after taking logarithms, insert the logarithm before using cumsum:

    yfirst = [GDP,M1,TB3];
    endpt = yfirst(end,:);
    endpt(1:2) = log(endpt(1:2));
    ysim(:,1:2,:) = ysim(:,1:2,:)/100;
    ysim = [repmat(endpt,[1,1,2000]);ysim];
    ysim(:,1:3,:) = cumsum(ysim(:,1:3,:));
    ysim(:,1:2,:) = exp(ysim(:,1:2,:));
    
  3. Compute the mean and standard deviation of each series, and plot the results. The plot has the mean in black, with ±1 standard deviation in red.

    ymean = mean(ysim,3);
    ystd = std(ysim,0,3);
    
    figure
    subplot(3,1,1)
    plot(timess,ymean(:,1),'k')
    datetick('x')
    grid on
    hold on
    plot(timess,ymean(:,1)+ystd(:,1),'--r')
    plot(timess,ymean(:,1)-ystd(:,1),'--r')
    title('GDP')
    subplot(3,1,2);
    plot(timess,ymean(:,2),'k')
    hold on
    datetick('x')
    grid on
    plot(timess,ymean(:,2)+ystd(:,2),'--r')
    plot(timess,ymean(:,2)-ystd(:,2),'--r')
    title('M1')
    subplot(3,1,3);
    plot(timess,ymean(:,3),'k')
    hold on
    datetick('x')
    grid on
    plot(timess,ymean(:,3)+ystd(:,3),'--r')
    plot(timess,ymean(:,3)-ystd(:,3),'--r')
    title('3-mo T-bill')
    hold off
    

The series show increasing growth in GDP, moderate to little growth in M1, and uncertainty about the direction of T-bill rates.

Was this topic helpful?