G13 Class

This chapter provides facilities for investigating and modelling the statistical structure of series of observations collected at equally spaced points in time. The models may then be used to forecast the series.
The chapter covers the following models and approaches.
  1. Univariate time series analysis, including autocorrelation functions and autoregressive moving average (ARMA) models.
  2. Univariate spectral analysis.
  3. Transfer function (multi-input) modelling, in which one time series is dependent on other time series.
  4. Bivariate spectral methods including coherency, gain and input response functions.
  5. Vector ARMA models for multivariate time series.
  6. Kalman filter models.
  7. GARCH models for volatility.

Syntax

C#
public static class G13
Visual Basic (Declaration)
Public NotInheritable Class G13
Visual C++
public ref class G13 abstract sealed
F#
[<AbstractClassAttribute>]
[<SealedAttribute>]
type G13 =  class end

Background to the Problems

Univariate Analysis

Let the given time series be x_1, x_2, …, x_n, where n is its length. The structure which is intended to be investigated, and which may be most evident to the eye in a graph of the series, can be broadly described as:
(a) trends, linear or possibly higher-order polynomial;
(b) seasonal patterns, associated with fixed integer seasonal periods. The presence of such seasonality and the period will normally be known a priori. The pattern may be fixed, or slowly varying from one season to another;
(c) cycles or waves of stable amplitude and period p (from peak to peak). The period is not necessarily integer, the corresponding absolute frequency (cycles/time unit) being f = 1/p and angular frequency ω = 2πf. The cycle may be of pure sinusoidal form like sin(ωt), or the presence of higher harmonic terms may be indicated, e.g., by asymmetry in the wave form;
(d) quasi-cycles, i.e., waves of fluctuating period and amplitude; and
(e) irregular statistical fluctuations and swings about the overall mean or trend.

Transformations

Differencing operations

Sample autocorrelations

Partial autocorrelations

Finite lag predictor coefficients and error variances

ARIMA models

Autoregressive parameters are appropriate when the autocorrelation function (ACF) pattern decays geometrically, or with a damped sinusoidal pattern which is associated with quasi-periodic behaviour in the series. If the sample partial autocorrelation function (PACF) ϕ̂_{k,k} is significant only up to some low lag p, then a pure autoregressive model AR(p) is appropriate, with q = 0. Otherwise moving-average terms will need to be introduced, as well as autoregressive terms.
The seasonal ARIMA(p,d,q)(P,D,Q)_s model allows for correlation at lags which are multiples of the seasonal period s. Taking w_t = ∇^d ∇_s^D x_t, the series is represented in a two-stage manner via an intermediate series e_t:
w_t = Φ_1 w_{t-s} + Φ_2 w_{t-2s} + ⋯ + Φ_P w_{t-Ps} + e_t - Θ_1 e_{t-s} - ⋯ - Θ_Q e_{t-Qs}   (3)
e_t = ϕ_1 e_{t-1} + ϕ_2 e_{t-2} + ⋯ + ϕ_p e_{t-p} + a_t - θ_1 a_{t-1} - ⋯ - θ_q a_{t-q}   (4)
where Φ_i, Θ_i are the seasonal parameters and P and Q are the corresponding orders. Again, w_t may be replaced by w_t - c.

ARIMA model estimation

In theory, the parameters of an ARIMA model are determined by a sufficient number of autocorrelations ρ_1, ρ_2, …. Using the sample values r_1, r_2, … in their place it is usually (but not always) possible to solve for the corresponding ARIMA parameters.
These are rapidly computed but are not fully efficient estimates, particularly if moving-average parameters are present. They do provide useful preliminary values for an efficient but relatively slow iterative method of estimation. This is based on the least-squares principle by which parameters are chosen to minimize the sum of squares of the innovations at, which are regenerated from the data using (2), or the reverse of (3) and (4) in the case of seasonal models.
Lack of knowledge of terms on the right-hand side of (2), when t = 1, 2, …, max(p,q), is overcome by introducing q unknown series values w_0, w_1, …, w_{q-1} which are estimated as nuisance parameters, and by using a correction for transient errors due to the autoregressive terms. If the data w_1, w_2, …, w_N = w is viewed as a single sample from a multivariate Normal density whose covariance matrix V is a function of the ARIMA model parameters, then the exact log-likelihood of the parameters is
-(1/2) log|V| - (1/2) w^T V^{-1} w.
The least-squares criterion as outlined above is equivalent to using the quadratic form
QF = w^T V^{-1} w
as an objective function to be minimized. Neglecting the term -(1/2) log|V| yields estimates which differ very little from the exact likelihood except in small samples, or in seasonal models with a small number of whole seasons contained in the data. In these cases bias in moving-average parameters may cause them to stick at the boundary of their constraint region, resulting in failure of the estimation method.
Approximate standard errors of the parameter estimates and the correlations between them are available after estimation.
The model residuals, â_t, are the innovations resulting from the estimation and are usually examined for the presence of autocorrelation as a check on the adequacy of the model.
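The least-squares criterion above rests on regenerating the innovations a_t from the data for each trial set of parameter values. The following sketch illustrates this for a nonseasonal ARMA(p,q) model applied to a differenced, mean-corrected series; it is purely illustrative and is not the library's estimation routine, and the method name and the simple zero start-up for pre-sample terms are assumptions made for the example (the library treats start-up values more carefully, as described above).
C#
// Illustrative sketch only: regenerate ARMA(p,q) innovations a_t from a differenced,
// mean-corrected series w_t and accumulate the least-squares objective S = sum of a_t^2.
// Pre-sample values of w and a are simply taken as zero here.
static double ArmaSumOfSquares(double[] w, double[] phi, double[] theta)
{
    int n = w.Length, p = phi.Length, q = theta.Length;
    var a = new double[n];
    double s = 0.0;
    for (int t = 0; t < n; t++)
    {
        double at = w[t];
        for (int i = 1; i <= p; i++)
            if (t - i >= 0) at -= phi[i - 1] * w[t - i];   // autoregressive terms
        for (int j = 1; j <= q; j++)
            if (t - j >= 0) at += theta[j - 1] * a[t - j]; // moving-average terms
        a[t] = at;
        s += at * at;
    }
    return s;
}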

ARIMA model forecasting

Exponential Smoothing

Exponential smoothing is a relatively simple method of short term forecasting for a time series. A variety of different smoothing methods are possible, including: single exponential, Brown's double exponential, linear Holt (also called double exponential smoothing in some references), additive Holt–Winters and multiplicative Holt–Winters. The choice of smoothing method used depends on the characteristics of the time series. If the mean of the series is only slowly changing then single exponential smoothing may be suitable. If there is a trend in the time series, which itself may be slowly changing, then linear Holt smoothing may be suitable. If there is a seasonal component to the time series, e.g., daily or monthly data, then one of the two Holt–Winters methods may be suitable.
For a time series y_t, for t = 1, 2, …, n, the five smoothing functions are defined by the following:
  • Single Exponential Smoothing
    m_t = α y_t + (1-α) m_{t-1}
    ŷ_{t+f} = m_t
    var(ŷ_{t+f}) = var(ε_t) (1 + (f-1) α^2)
  • Brown Double Exponential Smoothing
    m_t = α y_t + (1-α) m_{t-1}
    r_t = α (m_t - m_{t-1}) + (1-α) r_{t-1}
    ŷ_{t+f} = m_t + ((f-1) + 1/α) r_t
    var(ŷ_{t+f}) = var(ε_t) (1 + Σ_{i=1}^{f-1} (2α + (i-1) α^2)^2)
  • Linear Holt Smoothing
    m_t = α y_t + (1-α) (m_{t-1} + ϕ r_{t-1})
    r_t = γ (m_t - m_{t-1}) + (1-γ) ϕ r_{t-1}
    ŷ_{t+f} = m_t + Σ_{i=1}^{f} ϕ^i r_t
    var(ŷ_{t+f}) = var(ε_t) (1 + Σ_{i=1}^{f-1} (α + α γ ϕ (ϕ^i - 1)/(ϕ - 1))^2)
  • Additive Holt–Winters Smoothing
    m_t = α (y_t - s_{t-p}) + (1-α) (m_{t-1} + ϕ r_{t-1})
    r_t = γ (m_t - m_{t-1}) + (1-γ) ϕ r_{t-1}
    s_t = β (y_t - m_t) + (1-β) s_{t-p}
    ŷ_{t+f} = m_t + Σ_{i=1}^{f} ϕ^i r_t + s_{t-p}
    var(ŷ_{t+f}) = var(ε_t) (1 + Σ_{i=1}^{f-1} ψ_i^2)
    where
    ψ_i = α + α γ ϕ (ϕ^i - 1)/(ϕ - 1)              if i mod p ≠ 0
    ψ_i = α + α γ ϕ (ϕ^i - 1)/(ϕ - 1) + β (1-α)    otherwise
  • Multiplicative Holt–Winters Smoothing
    m_t = α y_t / s_{t-p} + (1-α) (m_{t-1} + ϕ r_{t-1})
    r_t = γ (m_t - m_{t-1}) + (1-γ) ϕ r_{t-1}
    s_t = β y_t / m_t + (1-β) s_{t-p}
    ŷ_{t+f} = (m_t + Σ_{i=1}^{f} ϕ^i r_t) × s_{t-p}
    var(ŷ_{t+f}) = var(ε_t) Σ_{i=0}^{∞} Σ_{j=0}^{p-1} (ψ_{j+ip} s_{t+f} / s_{t+f-j})^2
    and ψ is defined as in the additive Holt–Winters smoothing,
where m_t is the mean, r_t is the trend and s_t is the seasonal component at time t, with p being the seasonal order. The f-step ahead forecasts are given by ŷ_{t+f} and their variances by var(ŷ_{t+f}). The term var(ε_t) is estimated as the mean deviation.
The parameters α, β and γ control the amount of smoothing. The nearer these parameters are to one, the greater the emphasis on the current data point. Generally these parameters take values in the range 0.1 to 0.3. The linear Holt and two Holt–Winters smoothers include an additional parameter, ϕ, which acts as a trend dampener. For 0.0 < ϕ < 1.0 the trend is dampened and for ϕ > 1.0 the forecast function has an exponential trend; ϕ = 0.0 removes the trend term from the forecast function and ϕ = 1.0 does not dampen the trend.
For all methods, values for α, β, γ and ϕ can be chosen by trying different values and then visually comparing the results by plotting the fitted values alongside the original data. Alternatively, for single exponential smoothing a suitable value for α can be obtained by fitting an ARIMA(0,1,1) model. For Brown's double exponential smoothing and linear Holt smoothing with no dampening (i.e., ϕ = 1.0), suitable values for α and, in the case of linear Holt smoothing, γ can be obtained by fitting an ARIMA(0,2,2) model. Similarly, the linear Holt method, with ϕ ≠ 1.0, can be expressed as an ARIMA(1,2,2) model and the additive Holt–Winters, with no dampening (ϕ = 1.0), can be expressed as a seasonal ARIMA model with order p of the form ARIMA(0,1,p+1)(0,1,0). There is no similar procedure for obtaining parameter values for the multiplicative Holt–Winters method, or the additive Holt–Winters method with ϕ ≠ 1.0. In these cases parameters could be selected by minimizing a measure of fit using nonlinear optimization.
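As a concrete illustration of the simplest of these recursions (and not of the library's own smoothing routine), the following sketch applies single exponential smoothing, m_t = α y_t + (1-α) m_{t-1}, and returns the flat f-step ahead forecasts ŷ_{t+f} = m_t; the method name and the choice of the first observation as the starting level are assumptions made for the example.
C#
// Illustrative sketch: single exponential smoothing with f-step ahead forecasts.
static double[] SingleExponentialForecast(double[] y, double alpha, int horizon)
{
    double m = y[0];                            // start the level at the first observation
    for (int t = 1; t < y.Length; t++)
        m = alpha * y[t] + (1.0 - alpha) * m;   // m_t = alpha*y_t + (1-alpha)*m_{t-1}
    var forecast = new double[horizon];
    for (int f = 0; f < horizon; f++)
        forecast[f] = m;                        // yhat_{t+f} = m_t for all f
    return forecast;
}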

Univariate Spectral Analysis

In describing a time series using spectral analysis the fundamental components are taken to be sinusoidal waves of the form R cos(ωt + ϕ), which for a given angular frequency ω, 0 ≤ ω ≤ π, is specified by its amplitude R > 0 and phase ϕ, 0 ≤ ϕ < 2π. Thus in a time series of n observations it is not possible to distinguish more than n/2 independent sinusoidal components. The frequency range 0 ≤ ω ≤ π is limited to the shortest wavelength of two sampling units because any wave of higher frequency is indistinguishable upon sampling from (or is aliased with) a wave within this range. Spectral analysis follows the idea that for a series made up of a finite number of sine waves the amplitude of any component at frequency ω is given to order 1/n by
R^2 = (1/n^2) |Σ_{t=1}^{n} x_t e^{iωt}|^2.

The sample spectrum

For a series x_1, x_2, …, x_n this is defined as
f*(ω) = (1/(2nπ)) |Σ_{t=1}^{n} x_t e^{iωt}|^2,
the scaling factor now being chosen in order that
2 ∫_0^π f*(ω) dω = σ_x^2,
i.e., the spectrum indicates how the sample variance (σ_x^2) of the series is distributed over components in the frequency range 0 ≤ ω ≤ π.
It may be demonstrated that f*(ω) is equivalently defined in terms of the sample ACF r_k of the series as
f*(ω) = (1/(2π)) (c_0 + 2 Σ_{k=1}^{n-1} c_k cos kω),
where c_k = σ_x^2 r_k are the sample autocovariance coefficients.
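As an illustration of the first definition (and not of the library's spectral routines), the following sketch evaluates the sample spectrum at a single angular frequency ω in [0, π]; the method name is an assumption, and a mean-corrected (and, where appropriate, trend-corrected and tapered) series should be supplied, as discussed below.
C#
// Illustrative sketch: the sample spectrum f*(w) = (1/(2*n*pi)) |sum_{t=1}^{n} x_t e^{i w t}|^2
// evaluated directly from its definition at one angular frequency omega.
static double SampleSpectrum(double[] x, double omega)
{
    int n = x.Length;
    double re = 0.0, im = 0.0;
    for (int t = 1; t <= n; t++)               // t runs from 1 to n as in the formula
    {
        re += x[t - 1] * System.Math.Cos(omega * t);
        im += x[t - 1] * System.Math.Sin(omega * t);
    }
    return (re * re + im * im) / (2.0 * n * System.Math.PI);
}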
If the series x_t does contain a deterministic sinusoidal component of amplitude R, this will be revealed in the sample spectrum as a sharp peak of approximate width π/n and height (n/2π) R^2. This is called the discrete part of the spectrum, the variance R^2 associated with this component being in effect concentrated at a single frequency.
If the series x_t has no deterministic components, i.e., is purely stochastic, being stationary with autocorrelation function (ACF) r_k, then with increasing sample size the expected value of f*(ω) converges to the theoretical spectrum – the continuous part
f(ω) = (1/(2π)) (γ_0 + 2 Σ_{k=1}^{∞} γ_k cos ωk),
where γ_k are the theoretical autocovariances.
The sample spectrum does not however converge to this value but at each frequency point fluctuates about the theoretical spectrum with an exponential distribution, being independent at frequencies separated by an interval of 2π/n or more. Various devices are therefore employed to smooth the sample spectrum and reduce its variability. Much of the strength of spectral analysis derives from the fact that the error limits are multiplicative so that features may still show up as significant in a part of the spectrum which has a generally low level, whereas they are completely masked by other components in the original series. The spectrum can help to distinguish deterministic cyclical components from the stochastic quasi-cycle components which produce a broader peak in the spectrum. (The deterministic components can be removed by regression and the remaining part represented by an ARIMA model.)
A large discrete component in a spectrum can distort the continuous part over a large frequency range surrounding the corresponding peak. This may be alleviated at the cost of slightly broadening the peak by tapering a portion of the data at each end of the series with weights which decay smoothly to zero. It is usual to correct for the mean of the series and for any linear trend by simple regression, since they would similarly distort the spectrum.

Spectral smoothing by lag window

Direct spectral smoothing

Linear Lagged Relationships Between Time Series

We now consider the context in which one time series, called the dependent or output series, y_1, y_2, …, y_n, is believed to depend on one or more explanatory or input series, e.g., x_1, x_2, …, x_n. This dependency may follow a simple linear regression, e.g.,
y_t = v x_t + n_t
or more generally may involve lagged values of the input
y_t = v_0 x_t + v_1 x_{t-1} + v_2 x_{t-2} + ⋯ + n_t.
The sequence v_0, v_1, v_2, … is called the impulse response function (IRF) of the relationship. The term n_t represents that part of y_t which cannot be explained by the input, and it is assumed to follow a univariate ARIMA model. We call n_t the (output) noise component of y_t, and it includes any constant term in the relationship. It is assumed that the input series, x_t, and the noise component, n_t, are independent.
The part of y_t which is explained by the input is called the input component z_t:
z_t = v_0 x_t + v_1 x_{t-1} + v_2 x_{t-2} + ⋯
so y_t = z_t + n_t.
The eventual aim is to model both these components of y_t on the basis of observations of y_1, y_2, …, y_n and x_1, x_2, …, x_n. In applications to forecasting or control both components are important. In general there may be more than one input series, e.g., x_{1,t} and x_{2,t}, which are assumed to be independent, with corresponding components z_{1,t} and z_{2,t}, so
y_t = z_{1,t} + z_{2,t} + n_t.
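The input component is simply a one-sided convolution of the input with the impulse response weights. A minimal sketch of this calculation is given below, assuming a finite impulse response held in an array v; terms that would need values of the input before the start of the series are omitted, and the method name is illustrative only.
C#
// Illustrative sketch: z_t = v_0 x_t + v_1 x_{t-1} + v_2 x_{t-2} + ... for a finite IRF v.
static double[] InputComponent(double[] x, double[] v)
{
    var z = new double[x.Length];
    for (int t = 0; t < x.Length; t++)
        for (int k = 0; k < v.Length && k <= t; k++)
            z[t] += v[k] * x[t - k];       // lagged contribution v_k * x_{t-k}
    return z;
}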

Transfer function models

Cross-correlations

Prewhitening or filtering by an ARIMA model

Multi-input model estimation

The term multi-input model is used for the situation when one output series yt is related to one or more input series xj,t, as described in [Linear Lagged Relationships Between Time Series]. If for a given input the relationship is a simple linear regression, it is called a simple input; otherwise it is a transfer function input. The error or noise term follows an ARIMA model.
Given that the orders of all the transfer function models and the ARIMA model of a multi-input model have been specified, the various parameters in those models may be (simultaneously) estimated.
The procedure used is closely related to the least-squares principle applied to the innovations in the ARIMA noise model.
The innovations are derived for any proposed set of parameter values by calculating the response of each input to the transfer functions and then evaluating the noise nt as the difference between this response (combined for all the inputs) and the output. The innovations are derived from the noise using the ARIMA model in the same manner as for a univariate series, and as described in [ARIMA models].
In estimating the parameters, consideration has to be given to the lagged terms in the various model equations which are associated with times prior to the observation period, and are therefore unknown. The method descriptions provide the necessary detail as to how this problem is treated.
Also, as described in [ARIMA model estimation] the sum of squares criterion
S = Σ a_t^2
is related to the quadratic form in the exact log-likelihood of the parameters:
-(1/2) log|V| - (1/2) w^T V^{-1} w.
Here w is the vector of appropriately differenced noise terms, and
w^T V^{-1} w = S / σ_a^2,
where σ_a^2 is the innovation variance parameter.
The least-squares criterion is therefore identical to minimization of the quadratic form, but is not identical to exact likelihood. Because V may be expressed as M σ_a^2, where M is a function of the ARIMA model parameters, substitution of σ_a^2 by its maximum likelihood (ML) estimator yields a concentrated (or profile) likelihood which is a function of
|M|^{1/N} S.
Here N is the length of the differenced noise series w, and |M| = det M.
Use of the above quantity, called the deviance, D, as an objective function is preferable to the use of S alone, on the grounds that it is equivalent to exact likelihood, and yields estimates with better properties. However, there is an appreciable computational penalty in calculating D, and in large samples it differs very little from S, except in the important case of seasonal ARIMA models where the number of whole seasons within the data length must also be large.
You are given the option of taking the objective function to be either S or D, or a third possibility, the marginal likelihood. This is similar to exact likelihood but can counteract bias in the ARIMA model due to the fitting of a large number of simple inputs.
Approximate standard errors of the parameter estimates and the correlations between them are available after estimation.
The model residuals a^t are the innovations resulting from the estimation, and they are usually examined for the presence of either autocorrelation or cross-correlation with the inputs. Absence of such correlation provides some confirmation of the adequacy of the model.

Multi-input model forecasting

Transfer function model filtering

Multivariate Time Series

Multi-input modelling represents one output time series in terms of one or more input series. Although there are circumstances in which it may be more appropriate to analyse a set of time series by modelling each one in turn as the output series with the remainder as inputs, there is a more symmetric approach in such a context. These models are known as vector autoregressive moving-average (VARMA) models.

Differencing and transforming a multivariate time series

Model identification for a multivariate time series

Multivariate analogues of the autocorrelation and partial autocorrelation functions are available for analysing a set of k time series, x_{i,1}, x_{i,2}, …, x_{i,n}, for i = 1, 2, …, k, thereby making it possible to obtain some understanding of a suitable VARMA model for the observed series.
It is assumed that the time series have been differenced if necessary, and that they are jointly stationary. The lagged correlations between all possible pairs of series, i.e.,
ρ_{ij}(l) = corr(x_{i,t}, x_{j,t+l})
are then taken to provide an adequate description of the statistical relationships between the series. These quantities are estimated by sample auto- and cross-correlations r_{ij}(l). For each l these may be viewed as elements of a (lagged) autocorrelation matrix.
Thus consider the vector process x_t (with elements x_{i,t}) and lagged autocovariance matrices Γ_l with elements σ_i σ_j ρ_{ij}(l), where σ_i^2 = var(x_{i,t}). Correspondingly, Γ_l is estimated by the matrix C_l with elements s_i s_j r_{ij}(l), where s_i^2 is the sample variance of x_{i,t}.
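A minimal sketch of the sample quantities just described is given below; it computes the lagged correlation matrix with elements r_{ij}(l) for one lag l from a k-dimensional series stored as x[series, time]. It is illustrative only (the array layout, method name and simple divisor n are assumptions), not the library's identification routine.
C#
// Illustrative sketch: sample lagged cross-correlations r_ij(l) = corr(x_{i,t}, x_{j,t+l}).
static double[,] LaggedCorrelationMatrix(double[,] x, int lag)
{
    int k = x.GetLength(0), n = x.GetLength(1);
    var mean = new double[k];
    var sd = new double[k];
    for (int i = 0; i < k; i++)
    {
        for (int t = 0; t < n; t++) mean[i] += x[i, t];
        mean[i] /= n;
        double ss = 0.0;
        for (int t = 0; t < n; t++) ss += (x[i, t] - mean[i]) * (x[i, t] - mean[i]);
        sd[i] = System.Math.Sqrt(ss / n);          // sample standard deviation s_i
    }
    var r = new double[k, k];
    for (int i = 0; i < k; i++)
        for (int j = 0; j < k; j++)
        {
            double c = 0.0;
            for (int t = 0; t + lag < n; t++)
                c += (x[i, t] - mean[i]) * (x[j, t + lag] - mean[j]);
            r[i, j] = c / (n * sd[i] * sd[j]);     // lagged covariance scaled by s_i s_j
        }
    return r;
}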
For a series with short-term cross-correlation only, i.e., with r_{ij}(l) not significant beyond some low lag q, the pure vector MA(q) model, with no autoregressive parameters, i.e., p = 0, is appropriate.
In the univariate case the partial autocorrelation function (PACF) between x_t and x_{t+l} is the correlation coefficient between the two after removing the linear dependence on each of the intervening variables x_{t+1}, x_{t+2}, …, x_{t+l-1}. This partial autocorrelation may also be obtained as the last regression coefficient associated with x_t when regressing x_{t+l} on its l lagged variables x_{t+l-1}, x_{t+l-2}, …, x_t. Tiao and Box (1981) extended this method to the multivariate case to define the partial autoregression matrix. Heyse and Wei (1985) also extended the univariate definition of the PACF to derive the correlation matrix between the vectors x_t and x_{t+l} after removing the linear dependence on each of the intervening vectors x_{t+1}, x_{t+2}, …, x_{t+l-1}, the partial lag correlation matrix.
Note that the partial lag correlation matrix is a correlation coefficient matrix since each of its elements is a properly normalized correlation coefficient. This is not true of the partial autoregression matrices (except in the univariate case for which the two types of matrix are the same). The partial lag correlation matrix at lag 1 also reduces to the regular correlation matrix at lag 1; this is not true of the partial autoregression matrices (again except in the univariate case).
Both the above share the same cut-off property for autoregressive processes; that is, for an autoregressive process of order p, the terms of the matrix at lags p+1 and greater are zero. Thus if the sample partial cross-correlations are significant only up to some low lag p then a pure vector AR(p) model is appropriate with q = 0. Otherwise moving-average terms will need to be introduced as well as autoregressive terms.
Under the hypothesis that x_t is an autoregressive process of order l-1, n times the sum of the squared elements of the partial lag correlation matrix at lag l is asymptotically distributed as a χ^2 variable with k^2 degrees of freedom, where k is the dimension of the multivariate time series. This provides a diagnostic aid for determining the order of an autoregressive model.
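This diagnostic is straightforward to evaluate once a sample partial lag correlation matrix is available; a hedged sketch follows (the matrix itself would come from the identification calculations described here, so its construction is not shown, and the method name is illustrative).
C#
// Illustrative sketch: n * (sum of squared elements of the lag-l partial lag correlation
// matrix), to be compared with a chi-squared distribution on k*k degrees of freedom.
static double PartialLagChiSquared(double[,] partialLagCorr, int n)
{
    double sumSq = 0.0;
    foreach (double element in partialLagCorr)
        sumSq += element * element;
    return n * sumSq;
}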
The partial autoregression matrices may be found by solving a multivariate version of the Yule–Walker equations to find the autoregression matrices, using the final regression matrix coefficient as the partial autoregression matrix at that particular lag.
The basis of these calculations is a multivariate autoregressive model:
x_t = ϕ_{l,1} x_{t-1} + ⋯ + ϕ_{l,l} x_{t-l} + e_{l,t}
where ϕ_{l,1}, ϕ_{l,2}, …, ϕ_{l,l} are matrix coefficients, and e_{l,t} is the vector of errors in the prediction. These coefficients may be rapidly computed using a recursive technique which requires, and simultaneously furnishes, a backward prediction equation:
x_{t-l-1} = ψ_{l,1} x_{t-l} + ψ_{l,2} x_{t-l+1} + ⋯ + ψ_{l,l} x_{t-1} + f_{l,t}
(in the univariate case ψ_{l,i} = ϕ_{l,i}).
The forward prediction equation coefficients, ϕ_{l,i}, are of direct interest, together with the covariance matrix D_l of the prediction errors e_{l,t}. The calculation of these quantities for a particular maximum equation lag l = L involves calculation of the same quantities for increasing values of l = 1, 2, …, L.
The quantities v_l = det(D_l) / det(Γ_0) may be viewed as generalized variance ratios, and provide a measure of the efficiency of prediction (the smaller the better). The reduction from v_{l-1} to v_l which occurs on extending the order of the predictor to l may be represented as
v_l = v_{l-1} (1 - ρ_l^2)
where ρ_l^2 is a multiple squared partial autocorrelation coefficient associated with k^2 degrees of freedom.
Sample estimates of all the above quantities may be derived by using the series covariance matrices C_l, for l = 1, 2, …, L, in place of Γ_l. The best lag for prediction purposes may be chosen as that which yields the minimum final prediction error (FPE) criterion:
FPE(l) = v_l × (1 + l k^2 / n) / (1 - l k^2 / n).
An alternative method of estimating the sample partial autoregression matrices is by using multivariate least-squares to fit a series of multivariate autoregressive models of increasing order.

VARMA model estimation

The cross-correlation structure of a stationary multivariate time series may often be represented by a model with a small number of parameters belonging to the VARMA class. If the stationary series w_t has been derived by transforming and/or differencing the original series x_t, then w_t is said to follow the VARMA model:
w_t = ϕ_1 w_{t-1} + ⋯ + ϕ_p w_{t-p} + ε_t - θ_1 ε_{t-1} - ⋯ - θ_q ε_{t-q},
where ε_t is a vector of uncorrelated residual series (white noise) with zero mean and constant covariance matrix Σ, ϕ_1, ϕ_2, …, ϕ_p are the p autoregressive (AR) parameter matrices and θ_1, θ_2, …, θ_q are the q moving-average (MA) parameter matrices. If w_t has a nonzero mean μ, then this can be allowed for by replacing w_t, w_{t-1}, … by w_t - μ, w_{t-1} - μ, … in the model.
A series generated by this model will only be stationary provided restrictions are placed on ϕ_1, ϕ_2, …, ϕ_p to avoid unstable growth of w_t. These are stationarity constraints. The series ε_t may also be usefully interpreted as the linear innovations in w_t, i.e., the error if w_t were to be predicted using the information in all past values w_{t-1}, w_{t-2}, …, provided also that θ_1, θ_2, …, θ_q satisfy what are known as invertibility constraints. This allows the series ε_t to be generated by rewriting the model equation as
ε_t = w_t - ϕ_1 w_{t-1} - ⋯ - ϕ_p w_{t-p} + θ_1 ε_{t-1} + ⋯ + θ_q ε_{t-q}.
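Given trial values of the parameter matrices, the rewritten model equation above generates the innovations recursively. A minimal sketch of that recursion follows; the array layout, the method name and the simple zero start-up for pre-sample terms are assumptions made for the example, and this is not the library's estimation routine.
C#
// Illustrative sketch: eps_t = w_t - phi_1 w_{t-1} - ... - phi_p w_{t-p}
//                              + theta_1 eps_{t-1} + ... + theta_q eps_{t-q},
// with pre-sample values of w and eps taken as zero. w is stored as w[time][component];
// phi and theta are arrays of k x k matrices.
static double[][] VarmaInnovations(double[][] w, double[][,] phi, double[][,] theta)
{
    int n = w.Length, k = w[0].Length, p = phi.Length, q = theta.Length;
    var eps = new double[n][];
    for (int t = 0; t < n; t++)
    {
        var e = (double[])w[t].Clone();                              // start from w_t
        for (int i = 1; i <= p; i++)
            if (t - i >= 0)
                for (int r = 0; r < k; r++)
                    for (int c = 0; c < k; c++)
                        e[r] -= phi[i - 1][r, c] * w[t - i][c];      // - phi_i w_{t-i}
        for (int j = 1; j <= q; j++)
            if (t - j >= 0)
                for (int r = 0; r < k; r++)
                    for (int c = 0; c < k; c++)
                        e[r] += theta[j - 1][r, c] * eps[t - j][c];  // + theta_j eps_{t-j}
        eps[t] = e;
    }
    return eps;
}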
The method of maximum likelihood (ML) may be used to estimate the parameters of a specified VARMA model from the observed multivariate time series together with their standard errors and correlations.
The residuals from the model may be examined for the presence of autocorrelations as a check on the adequacy of the fitted model.

VARMA model forecasting

Cross-spectral Analysis

The relationship between two time series may be investigated in terms of their sinusoidal components at different frequencies. At frequency ω a component of y_t of the form
R_y(ω) cos(ωt - ϕ_y(ω))
has its amplitude R_y(ω) and phase lag ϕ_y(ω) estimated by
R_y(ω) e^{i ϕ_y(ω)} = (1/n) Σ_{t=1}^{n} y_t e^{iωt}
and similarly for x_t. In the univariate analysis only the amplitude was important – in the cross analysis the phase is important.

The sample cross-spectrum

The amplitude and phase spectrum

The coherency spectrum

The gain and noise spectrum

If y_t is believed to be related to x_t by a linear lagged relationship as in [Linear Lagged Relationships Between Time Series], i.e.,
y_t = v_0 x_t + v_1 x_{t-1} + v_2 x_{t-2} + ⋯ + n_t,
then the theoretical cross-spectrum is
f_{xy}(ω) = V(ω) f_{xx}(ω)
where
V(ω) = G(ω) e^{iϕ(ω)} = Σ_{k=0}^{∞} v_k e^{ikω}
is called the frequency response of the relationship.
Thus if x_t were a sinusoidal wave at frequency ω (and n_t were absent), y_t would be similar but multiplied in amplitude by G(ω) and shifted in phase by ϕ(ω). Furthermore, the theoretical univariate spectrum of y_t is
f_{yy}(ω) = G(ω)^2 f_{xx}(ω) + f_n(ω)
where n_t, with spectrum f_n(ω), is assumed independent of the input x_t.
Cross-spectral analysis thus furnishes estimates of the gain
Ĝ(ω) = |f̂_{xy}(ω)| / f̂_{xx}(ω)
and the phase
ϕ̂(ω) = arg f̂_{xy}(ω).
From these representations of the estimated frequency response V̂(ω), parametric transfer function (TF) models may be recognized and selected. The noise spectrum may also be estimated as
f̂_{y|x}(ω) = f̂_{yy}(ω) (1 - Ŵ(ω))
a formula which reflects the fact that in essence a regression is being performed of the sinusoidal components of yt on those of xt over each frequency band.
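At a single frequency the gain and phase estimates above are just the modulus and argument of the (smoothed) complex cross-spectrum estimate, the gain being scaled by the input spectrum. A minimal sketch using System.Numerics.Complex is shown below; it is illustrative only and the method name is an assumption.
C#
// Illustrative sketch: gain G(w) = |fxy(w)| / fxx(w) and phase phi(w) = arg(fxy(w))
// from a complex cross-spectrum estimate fxy and the input spectrum estimate fxx.
static (double gain, double phase) GainAndPhase(System.Numerics.Complex fxy, double fxx)
{
    double gain = fxy.Magnitude / fxx;
    double phase = fxy.Phase;
    return (gain, phase);
}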
Interpretation of the frequency response may be aided by extracting from V̂(ω) estimates of the impulse response function (IRF) v̂_k. It is assumed that there is no anticipatory response between y_t and x_t, i.e., no coefficients v_k with k = -1 or -2 are needed (their presence might indicate feedback between the series).

Cross-spectrum smoothing by lag window

Direct smoothing of the cross-spectrum

Kalman Filters

The estimate of X_t given observations Y_1 to Y_{t-1} is denoted by X̂_{t|t-1}, with state covariance matrix E(X̂_{t|t-1} X̂_{t|t-1}^T) = P_{t|t-1}, while the estimate of X_t given observations Y_1 to Y_t is denoted by X̂_{t|t}, with covariance matrix E(X̂_{t|t} X̂_{t|t}^T) = P_{t|t}.
The update of the estimate, X̂_{t+1|t}, from time t to time t+1, is computed in two stages.
First, the update equations are
X̂_{t|t} = X̂_{t|t-1} + K_t r_t,   P_{t|t} = (I - K_t C_t) P_{t|t-1}
where the residual r_t = Y_t - C_t X̂_{t|t-1} has an associated covariance matrix H_t = C_t P_{t|t-1} C_t^T + R_t, and K_t is the Kalman gain matrix with
K_t = P_{t|t-1} C_t^T H_t^{-1}.
The second stage is the one-step-ahead prediction equations given by
X̂_{t+1|t} = A_t X̂_{t|t},   P_{t+1|t} = A_t P_{t|t} A_t^T + B_t Q_t B_t^T.
These two stages can be combined to give the one-step-ahead update-prediction equations
X̂_{t+1|t} = A_t X̂_{t|t-1} + A_t K_t r_t.
The above equations thus provide a method for recursively calculating the estimates of the state vectors X̂_{t|t} and X̂_{t+1|t} and their covariance matrices P_{t|t} and P_{t+1|t} from their previous values. This recursive procedure can be viewed in a Bayesian framework as being the updating of the prior by the data Y_t.
The initial values X̂_{1|0} and P_{1|0} are required to start the recursion. For stationary systems, P_{1|0} can be computed from the following equation:
P_{1|0} = A_1 P_{1|0} A_1^T + B_1 Q_1 B_1^T,
which can be solved by iterating on the equation. For X̂_{1|0} the value E(X) can be used if it is available.
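To make the two stages concrete, the sketch below combines one update and one prediction step for the special case of a univariate observation, so that H_t is a scalar and no matrix inversion is needed. It is purely illustrative (plain arrays, a precomputed B_t Q_t B_t^T, and assumed parameter names), not the library's Kalman filter interface.
C#
// Illustrative sketch of one Kalman filter step (univariate observation y).
// State x has length m; A, P and BQBt (the product B_t Q_t B_t^T) are m x m; C is a
// row vector of length m; R is the scalar observation noise variance.
// x and P hold X_{t|t-1}, P_{t|t-1} on entry and X_{t+1|t}, P_{t+1|t} on exit.
static void KalmanStep(double[] x, double[,] P, double[,] A, double[] C,
                       double R, double[,] BQBt, double y)
{
    int m = x.Length;

    // Residual r_t = y_t - C X_{t|t-1} and its variance H_t = C P C^T + R.
    double r = y;
    double H = R;
    var PCt = new double[m];              // P C^T (P is symmetric, so C P equals (P C^T)^T)
    for (int i = 0; i < m; i++)
    {
        r -= C[i] * x[i];
        for (int j = 0; j < m; j++) PCt[i] += P[i, j] * C[j];
        H += C[i] * PCt[i];
    }

    // Kalman gain K_t = P C^T / H_t; update X_{t|t} and P_{t|t} = (I - K_t C_t) P_{t|t-1}.
    var K = new double[m];
    var xUpd = new double[m];
    var PUpd = new double[m, m];
    for (int i = 0; i < m; i++)
    {
        K[i] = PCt[i] / H;
        xUpd[i] = x[i] + K[i] * r;
    }
    for (int i = 0; i < m; i++)
        for (int j = 0; j < m; j++)
            PUpd[i, j] = P[i, j] - K[i] * PCt[j];

    // Prediction: X_{t+1|t} = A X_{t|t},  P_{t+1|t} = A P_{t|t} A^T + B Q B^T.
    for (int i = 0; i < m; i++)
    {
        x[i] = 0.0;
        for (int j = 0; j < m; j++) x[i] += A[i, j] * xUpd[j];
    }
    for (int i = 0; i < m; i++)
        for (int j = 0; j < m; j++)
        {
            double apat = 0.0;
            for (int a = 0; a < m; a++)
                for (int b = 0; b < m; b++)
                    apat += A[i, a] * PUpd[a, b] * A[j, b];
            P[i, j] = apat + BQBt[i, j];
        }
}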

Computational methods

Model fitting and forecasting

Kalman filter and time series models

GARCH Models

ARCH models and their generalizations

Rather than modelling the mean (for example using regression models) or the autocorrelation (by using ARMA models) there are circumstances in which the variance of a time series needs to be modelled. This is common in financial data modelling where the variance (or standard deviation) is known as volatility. The ability to forecast volatility is a vital part in deciding the risk attached to financial decisions like portfolio selection. The basic model for relating the variance at time t to the variance at previous times is the autoregressive conditional heteroskedastic (ARCH) model. The standard ARCH model is defined as
y_t | ψ_{t-1} ∼ N(0, h_t),   h_t = α_0 + Σ_{i=1}^{q} α_i ε_{t-i}^2,
where ψ_t is the information up to time t and h_t is the conditional variance.
This can be combined with a regression model:
y_t = b_0 + Σ_{i=1}^{k} b_i x_{it} + ε_t,
where ε_t | ψ_{t-1} ∼ N(0, h_t) and where x_{it}, for i = 1, 2, …, k, are the exogenous variables.
The above models assume that the change in variance, h_t, is symmetric with respect to the shocks, that is, that a large negative value of ε_{t-1} has the same effect as a large positive value of ε_{t-1}. A frequently observed effect is that a large negative value of ε_{t-1} often leads to a greater variance than a large positive value. The following three asymmetric models represent this effect in different ways, using the parameter γ as a measure of the asymmetry.
Type I AGARCH(p,q)
h_t = α_0 + Σ_{i=1}^{q} α_i (ε_{t-i} + γ)^2 + Σ_{i=1}^{p} β_i h_{t-i}.
Type II AGARCH(p,q)
h_t = α_0 + Σ_{i=1}^{q} α_i (|ε_{t-i}| + γ ε_{t-i})^2 + Σ_{i=1}^{p} β_i h_{t-i}.
GJR-GARCH(p,q), or Glosten, Jagannathan and Runkle GARCH (see Glosten et al. (1993))
h_t = α_0 + Σ_{i=1}^{q} (α_i + γ I_{t-i}) ε_{t-i}^2 + Σ_{i=1}^{p} β_i h_{t-i},
where I_t = 1 if ε_t < 0 and I_t = 0 if ε_t ≥ 0.
The first assumes that the effects of the shocks are symmetric about γ rather than zero, so that for γ < 0 the effect of negative shocks is increased and the effect of positive shocks is decreased. Both the Type II AGARCH and the GJR GARCH (see Glosten et al. (1993)) models introduce asymmetry by increasing the value of the coefficient of ε_{t-1}^2 for negative values of ε_{t-1}. In the case of the Type II AGARCH the effect is multiplicative while for the GJR GARCH the effect is additive.
Coefficient        ε_{t-1} < 0       ε_{t-1} > 0
Type II AGARCH     α_i (1-γ)^2       α_i (1+γ)^2
GJR GARCH          α_i + γ           α_i
(Note that in the case of GJR GARCH, γ needs to be positive to inflate variance after negative shocks while for Type I and Type II AGARCH, γ needs to be negative.)
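As an illustration of how such a conditional variance recursion is evaluated for given parameter values (this is only a sketch, not the library's GARCH fitting routine), the following fragment computes h_t for the GJR-GARCH(p,q) formula above; the method name and the use of the sample variance of ε_t as a start-up value for pre-sample h terms are assumptions made for the example.
C#
// Illustrative sketch: GJR-GARCH(p,q) conditional variances
//   h_t = alpha0 + sum_i (alpha_i + gamma*I(eps_{t-i} < 0)) eps_{t-i}^2 + sum_j beta_j h_{t-j}.
static double[] GjrGarchVariance(double[] eps, double alpha0, double[] alpha,
                                 double gamma, double[] beta)
{
    int n = eps.Length, q = alpha.Length, p = beta.Length;
    double h0 = 0.0;
    foreach (double e in eps) h0 += e * e;
    h0 /= n;                                            // crude start-up value for h
    var h = new double[n];
    for (int t = 0; t < n; t++)
    {
        double ht = alpha0;
        for (int i = 1; i <= q; i++)
        {
            double e = t - i >= 0 ? eps[t - i] : 0.0;   // pre-sample shocks taken as zero
            double coeff = alpha[i - 1] + (e < 0.0 ? gamma : 0.0);
            ht += coeff * e * e;
        }
        for (int j = 1; j <= p; j++)
            ht += beta[j - 1] * (t - j >= 0 ? h[t - j] : h0);
        h[t] = ht;
    }
    return h;
}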
A third type of GARCH model is the exponential GARCH (EGARCH). In this model the variance relationship is on the log scale and hence asymmetric.
ln h_t = α_0 + Σ_{i=1}^{q} α_i z_{t-i} + Σ_{i=1}^{q} ϕ_i (|z_{t-i}| - E|z_{t-i}|) + Σ_{i=1}^{p} β_i ln h_{t-i},
where z_t = ε_t / √h_t and E|z_{t-i}| denotes the expected value of |z_{t-i}|.
Note that the ϕ_i terms represent a symmetric contribution to the variance while the α_i terms give an asymmetric contribution.
Another common characteristic of financial data is that it is heavier in the tails (leptokurtic) than the Normal distribution. To model this the Normal distribution is replaced by a scaled Student's t-distribution (that is, a Student's t-distribution standardized to have variance h_t). The Student's t-distribution is such that, for degrees of freedom greater than 4, the smaller the degrees of freedom the higher the kurtosis.

Fitting GARCH models

References

Inheritance Hierarchy

See Also