Introduction (TS 3.1, 3.3, 3.6, S6) #
Overview of main models #
We are now ready to introduce the main models with full generality, and to discuss an overall estimation, fitting, and forecasting approach. The main models we will review are:
- Autoregressive models of order \(p\), denoted \(\alert{AR(p)}\):
$$x_t=\alpha+\phi_1x_{t-1}+\phi_2 x_{t-2}+\cdots+\phi_p x_{t-p}+w_t,$$
where \(\alpha=\mu(1-\phi_1-\phi_2-\cdots-\phi_p)\).
- Moving averages of order \(q\), denoted \(\alert{MA(q)}\):
$$x_t=w_t+\theta_1w_{t-1}+\theta_2 w_{t-2}+\cdots+\theta_q w_{t-q}.$$
- \(\alert{ARMA(p,q)}\) models, which are a combination of the above:
$$x_t=\phi_1x_{t-1}+\cdots+\phi_p x_{t-p}+w_t+\theta_1w_{t-1}+\cdots+\theta_q w_{t-q}$$
Variations #
- \(\alert{ARIMA(p,d,q)}\) models, which reduce to \(ARMA(p,q)\) when differenced \(d\) times.
- Multiplicative seasonal ARIMA models:
  - Pure seasonal autoregressive moving average \(\alert{ARMA(P,Q)_s}\) models
  - Multiplicative seasonal autoregressive moving average models
  - Seasonal autoregressive integrated moving average models
- Multivariate time series—Vector Auto-Regressive \(\alert{VAR(p)}\) models
- Extra models mentioned by the CS2 syllabus:
  - bilinear models
  - threshold autoregressive models
  - \(ARCH(p)\) models
Partial Autocorrelation Function (PACF) #
Motivation #
- There is one more property we need to introduce, the Partial Autocorrelation Function (PACF).
- The ACF can help identify \(q\) for an \(MA(q)\) model, because it should be 0 for lags \(>q\).
- However, the order of an \(AR(p)\) or \(ARMA(p,q)\) model cannot be recognised from the ACF.
- What we need for autoregressive models is a measure of dependence at two different points in time, with the effect of the other points in time removed.
- We will remove the linear effect of all variables in-between, and so by increasing the lag iteratively, this partial correlation should be 0 after some lag (\(p\) for \(AR(p)\)).
Reminder: partial correlation #
- Let \(X\), \(Y\), and \(Z\) be random variables.
- The partial correlation between \(X\) and \(Y\), given \(Z\), is obtained by
  - regressing \(X\) on \(Z\) to obtain \(\hat{X}\);
  - regressing \(Y\) on \(Z\) to obtain \(\hat{Y}\).
- Then the partial correlation is
$$\rho_{XY|Z}=\text{corr}\{X-\hat{X},Y-\hat{Y}\}.$$
- In the special case where the random variables are multivariate normal (and hence dependence is linear),
$$\rho_{XY|Z}=\text{corr}\{X,Y|Z\}.$$
(Otherwise, \(\hat{X}\) and \(\hat{Y}\) are only linear projections, not the full story.)
The partial autocorrelation function (PACF) #
The partial autocorrelation function (PACF) of a stationary process, \(x_t\), denoted \(\phi_{hh}\), for \(h = 1, 2,\ldots\), is
$$\phi_{11}=\text{corr}(x_{t+1},x_t)=\rho(1),$$
and
$$\phi_{hh}=\text{corr}(x_{t+h}-\hat{x}_{t+h},x_t-\hat{x}_t),\quad h\ge 2. $$
Note:
- The PACF, \(\phi_{hh}\), is the correlation between \(x_{t+h}\) and \(x_t\) with the linear dependence of \(\{x_{t+1},\ldots, x_{t+h-1}\}\) on each, removed.
- The PACF of an \(AR(p)\) model will be 0 for \(h>p\).
- The R function pacf will display a sample \(\phi_{hh}\) for \(h>0\).
Examples of AR PACFs #
set.seed(2)
AR1 <- arima.sim(list(order = c(1, 0, 0), ar = 0.8), n = 10000)
AR2 <- arima.sim(list(order = c(2, 0, 0), ar = c(0.8, -0.1)),
n = 10000)
AR5 <- arima.sim(list(order = c(5, 0, 0), ar = c(0.8, -0.8, 0.5,
-0.5, 0.3)), n = 10000)
par(mfrow = c(2, 3))
acf(AR1)
acf(AR2)
acf(AR5)
pacf(AR1)
pacf(AR2)
pacf(AR5)

Examples of MA PACFs #
set.seed(2)
MA1 <- arima.sim(list(order = c(0, 0, 1), ma = 0.8), n = 10000)
MA2 <- arima.sim(list(order = c(0, 0, 2), ma = c(0.8, -0.1)),
n = 10000)
MA5 <- arima.sim(list(order = c(0, 0, 5), ma = c(0.8, -0.8, 0.5,
-0.5, 0.3)), n = 10000)
par(mfrow = c(2, 3))
acf(MA1)
acf(MA2)
acf(MA5)
pacf(MA1)
pacf(MA2)
pacf(MA5)

\(AR(p)\) models (TS 3.1, 3.3, 3.6, S6)
#
Definition #
\(AR(p)\) models
#
An autoregressive model of order \(p\) \(AR(p)\) is of the form
$$x_t=\alpha+\phi_1x_{t-1}+\phi_2 x_{t-2}+\cdots+\phi_p x_{t-p}+w_t,$$
where \(x_t\) is stationary, \(w_t\sim \text{wn}(0,\sigma_w^2)\), \(\phi_1,\phi_2,\ldots,\phi_p\) are constants \((\phi_p\neq0)\), and where \(\alpha=\mu(1-\phi_1-\phi_2-\cdots-\phi_p)\).
Assume \(\alpha=0\) and rewrite
$$(1-\phi_1B-\phi_2 B^2-\cdots-\phi_pB^p)x_t=w_t \quad\text{or}\quad \alert{\phi(B)x_t=w_t}.$$
Note:
- Unless stated otherwise we assume \(\alpha=0\). The mean of \(x_t\) is then zero.
- Recall that here the regressors \(x_{t-1},\ldots,x_{t-p}\) are random components (as opposed to the ‘usual’ regression).
- The autoregressive operator is defined as
$$\phi(B)=(1-\phi_1B-\phi_2 B^2-\cdots-\phi_pB^p),$$
the roots of which will be crucial for the analysis.
Example: AR(1) model #
An \(AR(1)\) is
$$\begin{array}{rcl} x_t&=&\phi x_{t-1}+w_t\\ &=&\phi(\phi x_{t-2}+w_{t-1})+w_t \\ &\vdots& \\ &=& \phi^k x_{t-k}+\sum_{j=0}^{k-1}\phi^j w_{t-j} \end{array}$$
If one continues to iterate backwards, one gets (provided \(|\phi|<1\) and \(\sup_t Var(x_t)<\infty\))
$$x_t = \sum_{j=0}^\infty \phi^j w_{t-j},$$
a linear process! This is called the stationary solution of the model.
The stationary solution of the model has mean
$$E[x_t]=\sum_{j=0}^\infty \phi^j E[w_{t-j}] = 0,$$
and autocovariance function (see (3.7) in TS)
$$\gamma(h) = \frac{\sigma_w^2 \phi^h}{1-\phi^2}, \quad h\ge 0.$$
(remember \(\gamma(h)=\gamma(-h)\)) The ACF of an \(AR(1)\) is
$$\rho(h)=\frac{\gamma(h)}{\gamma(0)}=\phi^h, \quad h\ge0,$$
and \(\rho(h)\) satisfies the recursion
$$\rho(h)=\phi \rho(h-1), \quad h=1,2,\ldots.$$
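A quick numerical check of this, as a sketch with an assumed illustrative value \(\phi=0.9\) and the base-R function ARMAacf:
phi <- 0.9
ARMAacf(ar = phi, lag.max = 5)  # theoretical ACF at lags 0 to 5
phi^(0:5)                       # matches the closed form rho(h) = phi^h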
Sample paths of \(AR(1)\) processes
#
par(mfrow = c(2, 1))
plot(arima.sim(list(order = c(1, 0, 0), ar = 0.9), n = 100),
ylab = "x", main = (expression(AR(1) ~ ~~phi == +0.9 ~ ~~sigma^2 ==
1 ~ ~~rho == 0.9^h)))
plot(arima.sim(list(order = c(1, 0, 0), ar = -0.9), n = 100),
ylab = "x", main = (expression(AR(1) ~ ~~phi == -0.9 ~ ~~sigma^2 ==
1 ~ ~~rho == (-0.9)^h)))

Stationarity (and Causality) #
Stationarity of \(AR(p)\) processes
#
The time series process \(x_t\) is stationary and causal if, and only if, the roots of the characteristic polynomial
$$1-\phi_1z-\phi_2 z^2 - \cdots -\phi_p z^p = \phi(z) = 0$$
(which can be complex numbers) are all greater than 1 in absolute value (outside the unit circle).
We will get a sense of why as we make our way through this module.
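As a minimal numerical sketch of the condition (using the AR(2) coefficients \(\phi_1=0.8\), \(\phi_2=-0.1\) simulated earlier, and the base-R function polyroot):
phi <- c(0.8, -0.1)
roots <- polyroot(c(1, -phi))   # roots of phi(z) = 1 - 0.8 z + 0.1 z^2
Mod(roots)                      # both moduli exceed 1: stationary and causal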
ACF and PACF #
Autocovariances #
Remember that for stationary \(x_t=\sum_{j=1}^p \phi_j x_{t-j}+w_t\) we have
$$\gamma(k)=Cov\left(\sum_{j=1}^p \phi_j x_{t-j}+w_t,x_{t-k}\right) = \sum_{j=1}^p \phi_j \gamma(k-j)\quad\text{for }k\ge 1.$$
This is a \(p\)-th order difference equation with constant coefficients with solution (see TS 3.2)
$$\gamma(k)=\sum_{j=1}^p A_j z_j^{-k}\text{ for all }k\ge 0,$$
where \(z_1, \ldots, z_p\) are the \(p\) roots of the characteristic polynomial \(\phi(z)=0\), and \(A_1, \ldots, A_p\) are constants depending on the initial values.
R can calculate this for you (code is provided later).
Note:
- we expect \(\gamma(k)\rightarrow 0\) as \(k\rightarrow \infty\) for stationary \(x_t\)
- this is equivalent to \(|z_j|>1\) for all \(j\)
- this is equivalent to our iff condition for \(x_t\) to be stationary and causal
Yule-Walker equations #
- We have a solution form for \(\gamma(k)\), but no explicit solution yet. These need to be solved for.
- Sometimes, it is quicker to use the so-called “Yule-Walker equations”. Consider the equation for an \(AR(p)\) model with \(\mu=0\), multiply by \(x_{t-k}\), take expectations to get
$$\begin{array}{rcl} Cov(x_t,x_{t-k})&=&\phi_1 Cov(x_{t-1},x_{t-k}) + \phi_2 Cov(x_{t-2},x_{t-k}) + \ldots \\ && + \phi_p Cov(x_{t-p},x_{t-k})+Cov(w_t,x_{t-k}),\; 0\le k\le p, \end{array}$$
that is,
$$\gamma(k)=\phi_1 \gamma(k-1)+\phi_2 \gamma(k-2)+\ldots+\phi_p \gamma(k-p)+\sigma^2_w 1_{\{k=0\}}, \; 0\le k\le p.$$
This is solvable thanks to the fact that \(\gamma(k)=\gamma(-k)\).
- A matrix representation will be introduced later.
Example #
- For an \(AR(3)\) model we have
$$\begin{array}{rcl} \gamma(3) &=& \phi_1 \gamma(2)+\phi_2\gamma(1)+\phi_3\gamma(0), \\ \gamma(2) &=& \phi_1 \gamma(1)+\phi_2\gamma(0)+\phi_3\gamma(1), \\ \gamma(1) &=& \phi_1 \gamma(0)+\phi_2\gamma(1)+\phi_3\gamma(2), \\ \gamma(0) &=& \phi_1 \gamma(1)+\phi_2\gamma(2)+\phi_3\gamma(3) +\sigma_w^2. \end{array}$$
- The second and third equations can be solved linearly to obtain expressions for \(\gamma(1)\) and \(\gamma(2)\) as a constant times \(\gamma(0)\), which yields explicitly \(\rho(1)\) and \(\rho(2)\). (Remember \(\rho(k)=\gamma(k)/\gamma(0)\), \(\rho(0)=1\).)
$$\begin{array}{rcl} \rho(2) &=& \phi_1 \rho(1)+\phi_2+\phi_3\rho(1), \\ \rho(1) &=& \phi_1+ \phi_2\rho(1)+\phi_3\rho(2). \end{array}$$
- For numerical values of \(\gamma(1)\) and \(\gamma(2)\), and indeed the others too, the system needs to be solved for.
The matrix representation will help; R can also compute the theoretical ACF directly (see the sketch below).
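A minimal sketch of that computation, for an illustrative stationary \(AR(3)\) (the coefficients below are assumed values, not taken from the notes):
phi <- c(0.5, -0.2, 0.1)                  # illustrative AR(3) coefficients
round(ARMAacf(ar = phi, lag.max = 5), 4)  # theoretical rho(0), ..., rho(5)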
ACF #
As illustrated above
$$\rho(h)=\sum_{j=1}^p \phi_j \rho(h-j),\quad h\ge 1,$$
because \(\rho(h)=\gamma(h)/\gamma(0).\)
Note:
- For stationary and causal AR, \(|z_i|>1\), \(i=1,\ldots,r\) (\(r\le p\) distinct roots).
- If all the roots are real, then \(\rho(h)\) dampens exponentially fast to zero as \(h\rightarrow \infty\).
- If some of the roots are complex, then they will be in conjugate pairs and \(\rho(h)\) will dampen, in a sinusoidal fashion, exponentially fast to zero as \(h\rightarrow \infty\). In this case, the time series may appear to be cyclic in nature.
- This property flows on to ARMA models.
PACF #
When \(h>p\) the regression of \(x_{t+h}\) on \(\{x_{t+1},\ldots,x_{t+h-1}\}\) is
$$\hat{x}_{t+h}=\sum_{j=1}^{\alert{p}} \phi_j x_{t+h-j},\quad \alert{h>p}.$$
This is not a typo, it is a really nice result! (see TS for a proof).
[why? because this means that \(\hat{x}_{t+h}=x_{t+h}-w_{t+h}\)]
This means that \(x_t-\hat{x}_t\), which will depend only on \(\{w_{t+h-1},w_{t+h-2},\ldots\}\), has no overlap with \(x_{t+h}-\hat{x}_{t+h}=w_{t+h}\)
Hence the PACF
$$\phi_{hh}=\text{corr}(x_{t+h}-\hat{x}_{t+h},x_t-\hat{x}_t)=\text{corr}(w_{t+h},x_t-\hat{x}_t) =0, \quad h>p.$$
In summary, we know that
$$\phi_{hh} = 0, \text{ for all }h>p$$
“by design” (we wanted a measure that would die down beyond \(p\) for diagnostic reasons).
“In-between”, we have that
$$\phi_{11},\ldots,\phi_{pp}$$
are not necessarily 0.
Furthermore, it can be shown that
$$\phi_{pp} =\phi_p,$$
a really nice feature.
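A minimal numerical illustration of both facts, for the AR(2) with \((\phi_1,\phi_2)=(0.8,-0.1)\) used earlier (ARMAacf with pacf = TRUE returns the theoretical PACF):
round(ARMAacf(ar = c(0.8, -0.1), lag.max = 5, pacf = TRUE), 4)  # phi_22 = -0.1 = phi_2; zero beyond lag 2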

Explosive AR models and causality #
Remember the random walk (an \(AR(1)\) with \(\phi=1\))
$$x_t=x_{t-1}+ w_t$$
is not stationary. This was because \(x_{t-1}\) includes all past \(w_t\)’s, leading to an exploding variance.
Examination of the \(AR(1)\) representation
$$x_t = \sum_{j=0}^\infty \phi^j w_{t-j}$$
suggests that the key to containing the explosion lies with \(\phi\): if \(|\phi^j|\) increases without bound as \(j\rightarrow \infty\), the sum will not converge. Explosive processes quickly become large in magnitude, leading to non-stationarity.
Unfortunately, not all AR models are stationary. For the random walk, \(\phi=1\) and it is not stationary. For what values of \(\phi\) does this not happen?
Assume \(|\phi|>1\). Write (by iterating forward \(k\) steps)
$$\begin{array}{rcl} x_{t}&=&\phi^{-1} x_{t+1}-\phi^{-1}w_{t+1} \\ &=&\phi^{-1}\left( \phi^{-1} x_{t+2}-\phi^{-1}w_{t+2}\right) -\phi^{-1}w_{t+1} \\ &\vdots& \\ &=& \phi^{-k}x_{t+k}-\sum_{j=1}^{k} \phi^{-j} w_{t+j} \end{array}$$
Because \(|\phi|^{-1} <1\), this result suggests the future dependent \(AR(1)\) model
$$x_t=-\sum_{j=1}^{\infty} \phi^{-j} w_{t+j}.$$
This is of the \(AR(1)\) form \(x_t=\phi x_{t-1} + w_t\), but it is useless because it requires us to know the future to be able to predict the future!
When a process does not depend on the future— such as \(AR(1)\) when \(|\phi|<1\)— we will say the process is causal.
The model above with \(|\phi|>1\) is stationary, but it is also future dependent, and hence is not causal.
Here is the lesson for \(p>1\):
- stationary and causal are not equivalent conditions
- depending on the parameters of your \(AR(p)\) model, you may have a future dependent model without knowing it: when \(p>1\) it is not obvious by just looking at the parameters \(\phi_j\)
- that’s why the condition above stated “stationary and causal”
- this is further discussed in the ARMA section, where the argument above is generalised to \(p>1\)
\(MA(q)\) models (TS 3.1, 3.3, 3.6, S6)
#
Definition #
\(MA(q)\) models
#
The moving average model of order \(q\) \(MA(q)\) is of the form
$$x_t=w_t+\theta_1 w_{t-1}+\theta_2 w_{t-2}+\cdots+\theta_q w_{t-q}=\theta(B) w_t,$$
where \(w_t\sim \text{wn}(0,\sigma_w^2)\) and \(\theta_1, \theta_2, \ldots, \theta_q\) \((\theta_q\neq 0)\) are parameters.
Note:
- The AR combines linearly the \(x_t\)’s, whereas the MA combines linearly the \(w_t\)’s.
- The moving average operator is defined as
$$\theta(B)=1+\theta_1 B+\theta_2 B^2+\cdots +\theta_q B^q.$$
- Unlike the AR, the MA is stationary for any value of the parameters \(\theta\)’s.
Example: \(MA(1)\) model
#
An \(MA(1)\) is
$$x_t=w_t+\theta w_{t-1}.$$
Hence,
$$E[x_t]=0,$$
and
$$\gamma(h)=\left\{\begin{array}{lc} (1+\theta^2)\sigma_w^2 & h=0, \\ \theta \sigma_w^2 & h=1, \\ 0 & h>1, \end{array}\right.$$
and the ACF is
$$\rho(h)=\left\{\begin{array}{lc} 1 & h=0, \\ \frac{\theta}{(1+\theta^2)} & h=1, \\ 0 & h>1. \end{array}\right.$$
Furthermore \(|\rho(1)| \le 1/2\) for all values of \(\theta\).
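A quick numerical sanity check of this bound, as a sketch over an arbitrary grid of \(\theta\) values:
theta <- seq(-5, 5, by = 0.01)
max(abs(theta/(1 + theta^2)))   # equals 0.5, attained at theta = +/- 1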
Sample paths of \(MA(1)\) processes
#
par(mfrow = c(2, 1))
plot(arima.sim(list(order = c(0, 0, 1), ma = 0.5), n = 100),
ylab = "x", main = (expression(MA(1) ~ ~~theta == +0.5)))
plot(arima.sim(list(order = c(0, 0, 1), ma = -0.5), n = 100),
ylab = "x", main = (expression(MA(1) ~ ~~theta == -0.5)))

Non-uniqueness and invertibility #
- Note that (multiplying the numerator and denominator of the RHS by \(\theta^2\) recovers the original formula):
$$\rho(1)=\frac{\theta}{(1+\theta^2)}=\frac{1/\theta}{(1+1/\theta^2)}.$$
- Furthermore, the pair \((\theta=5,\sigma_w^2=1)\) yields the same autocovariance function as the pair \((\theta=1/5,\sigma_w^2=25)\), namely,
$$\gamma(h)=\left\{\begin{array}{lc} 26 & h=0, \\ 5 & h=1, \\ 0 & h>1. \end{array}\right.$$
- Thus, the \(MA(1)\) processes
$$x_t=w_t+\frac{1}{5} w_{t-1}, \quad w_t \sim \text{N}(0,25)$$
and
$$y_t=v_t+5 v_{t-1}, \quad v_t \sim \text{N}(0,1)$$
are the same because of normality (same first two moments, and the \(\gamma\)’s fully determine their dependence structure).
- We can only observe \(x_t\) or \(y_t\), but not \(w_t\) or \(v_t\), so we cannot distinguish between them.
- We encountered this phenomenon already in Tutorial Exercise TS5.
For convenience, we will systematically choose the version that is invertible, which means a process that has an infinite AR representation, as explained below.
Example:
- Consider the following inversion of the roles of \(x_t\) and \(w_t\) in the specific case of an MA(1) model:
$$\begin{array}{rcl} w_t&=&-\theta w_{t-1} + x_t \\ &=& \sum_{j=0}^\infty (-\theta)^j x_{t-j}\text{ if }|\theta|<1, \end{array}$$
which is an infinite AR representation of the model.
- Since we need \(|\theta|<1\) for this to work, we will choose the version with \((\theta=1/5,\sigma_w^2=25)\).
How can we generalise this to \(q>1\)?
- As in the AR case, the polynomial \(\theta(z)\) is key. The inversion in general is
$$x_t=\theta(B)w_t \quad \Longleftrightarrow \quad \pi(B)x_t=w_t \quad\text{where }\pi(B)=\theta^{-1}(B).$$
- Just as we required \(|\theta|<1\) in the \(MA(1)\) case, we will require the roots of \(\theta(z)\) to lie outside the unit circle for the inversion to be valid in general.
Example: in the \(MA(1)\) case,
$$\theta(z)=1+\theta z \Longleftrightarrow \pi(z)=\theta^{-1}(z)=\frac{1}{1+\theta z} = \sum_{j=0}^\infty (-\theta)^j z^j \alert{\text{ if }|\theta z|< 1}.$$
Consequently,
$$\pi(B)=\sum_{j=0}^\infty (-\theta)^j B^j.$$
ACF and PACF #
Autocovariance #
We have
$$\begin{array}{rcl} \gamma(h)&=&Cov(x_{t+h},x_t) \\ &=& Cov\left( \sum_{j=0}^q \theta_j w_{t+h-j},\sum_{k=0}^q \theta_k w_{t-k}\right) \\ &=& \left\{ \begin{array}{ll} \sigma_w^2 \sum_{j=0}^{q-h}\theta_j \theta_{j+h}, & 0\le h \le q, \\ 0 & h>q. \end{array}\right. \\ &=& \gamma(-h) \end{array}$$
Note
- \(\gamma(q)\neq 0\) because \(\theta_q\neq 0\)
- the cutting off of \(\gamma(h)\) after \(q\) lags is the signature of the \(MA(q)\) model.
ACF #
The ACF is then
$$\rho(h) = \left\{ \begin{array}{ll} \frac{\sum_{j=0}^{q-h}\theta_j \theta_{j+h}}{1+\theta_1^2+\cdots+\theta_q^2}, & 1\le h \le q, \\ 0 & h>q. \end{array}\right.$$
PACF #
An invertible \(MA(q)\) can be written as
$$x_t=-\sum_{j=1}^\infty \pi_j x_{t-j} + w_t.$$
No finite representation exists, and hence the PACF will never cut off (as opposed to the case of \(AR(p)\) ).
Example: For an \(MA(1)\),
$$\begin{array}{rcl} \phi_{11} &=& \rho(1), \\ \phi_{22} &=& -\frac{\theta^2}{1+\theta^2+\theta^4}, \\ \phi_{hh} &=& -\frac{(-\theta)^h(1-\theta^2)}{1-\theta^{2(h+1)}},\quad h\ge 1. \end{array}$$
This is derived in the textbook.
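A minimal numerical check of the formula above against R’s theoretical PACF, for an assumed illustrative value \(\theta=0.8\):
theta <- 0.8
h <- 1:5
-(-theta)^h * (1 - theta^2)/(1 - theta^(2 * (h + 1)))  # closed form
ARMAacf(ma = theta, lag.max = 5, pacf = TRUE)           # matches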

\(ARMA(p,q)\) models (TS 3.1, 3.3, 3.6, S6)
#
Definition #
A time series \(\{x_t; t=0, \pm1,\pm2,\ldots\}\) is \(\alert{ARMA(p,q)}\) if it is stationary and
$$x_t=\phi_1x_{t-1}+\cdots+\phi_p x_{t-p}+w_t+\theta_1 w_{t-1}+\cdots+\theta_q w_{t-q},$$
with \(\phi_p\neq0\), \(\theta_q\neq0\), and \(\sigma_w^2>0\). The parameters \(p\) and \(q\) are called the autoregressive and the moving average orders, respectively. If \(x_t\) has a nonzero \(\mu\), we set \(\alpha=\mu(1-\phi_1-\cdots-\phi_p)\) and write the model as
$$x_t=\alpha + \phi_1 x_{t-1} + \cdots + \phi_p x_{t-p} + w_t + \theta_1 w_{t-1}+\cdots +\theta_q w_{t-q},$$
where \(w_t\sim \text{wn}(0,\sigma_w^2)\).
This can be rewritten very concisely as
$$\alert{\phi(B)x_t=\theta(B)w_t}.$$
Parameter redundancy #
Multiplying both sides by a third operator \(\eta(B)\) leads to
$$\eta(B)\phi(B)x_t=\eta(B)\theta(B)w_t,$$
which doesn’t change the dynamics. One could then have redundant parameters. For example:
- Consider the white noise process \(x_t=w_t\), which is “\(ARMA(0,0)\)”.
- If we multiply both sides by \(\eta(B)=1-0.5B\) then the model becomes
$$x_t=0.5x_{t-1}-0.5 w_{t-1} + w_t,$$
which looks like \(ARMA(1,1)\), even though it is still white noise.
- We have hidden that fact through over-parametrisation, leading to parameter redundancy.
- Fitting results are likely to suggest that all parameters (including the unnecessary ones) are significant.
The AR and MA polynomials #
The AR and MA polynomials are defined as
$$\phi(z) = 1-\phi_1 z-\cdots-\phi_p z^p,\quad \phi_p \neq 0,$$
and
$$\theta(z)=1+\theta_1z+\cdots+\theta_qz^q, \quad \theta_q \neq 0,$$
respectively, where \(z\) may be a complex number.
These will be used extensively to study the properties of ARMA processes.
Properties #
Problems to keep in mind #
In the previous sections, we identified the following potential “problems”, or potential “issues to keep in mind”:
- parameter redundant models,
- stationary AR models that depend on the future, and
- MA models that are not unique.
We now discuss how to avoid those issues.
Parameter redundant models #
We will require:
- \(\phi(z)\) and \(\theta(z)\) cannot have common factors.
This will ensure that when referring to an \(ARMA(p,q)\) model, there can’t be a reduced form of it.
Future dependence and causality #
[What does causal mean?] An \(ARMA(p,q)\) model is said to be causal, if the time series \(\{x_t; t=0,\pm 1, \pm 2, \ldots\}\) can be written as a one-sided linear process:
$$x_t=\sum_{\alert{j=0}}^\infty \psi_j w_{t-j}=\psi(B)w_t,$$
where \(\psi(B)=\sum_{j=0}^\infty \psi_j B^j\), and \(\sum_{j=0}^\infty |\psi_j| <\infty\); we set \(\psi_0=1\).
Note:
- key here is that the sum starts at \(\alert{j=0}\) (so that it depends only on the past);
- the \(\psi\) parameters are new ones, which are defined in the equation above (“such that” \(x_t\) can be written as such a sum).
[When is it causal?] An \(ARMA(p,q)\) model is causal if and only if \(\phi(z)\neq 0\) for \(|z|\le 1\).
To see this, note that the coefficients of the linear process given above can be determined by solving
$$\psi(z)= \sum_{j=0}^\infty \psi_j z^j = \frac{\theta(z)}{\phi(z)}, \quad |z|\le 1.$$
Equivalently, an ARMA process is causal only when the roots of \(\phi(z)\)
lie outside the unit circle, that is, \(\phi(z)=0\) only when \(|z|>1\).
Invertibility #
[What does invertible mean?] An \(ARMA(p,q)\) model is said to be invertible, if the time series \(\{x_t; t=0,\pm 1, \pm 2, \ldots \}\) can be written as
$$\pi(B) x_t = \sum_{j=0}^\infty \pi_j x_{t-j} = w_t,$$
where \(\pi(B) = \sum_{j=0}^\infty \pi_j B^j\), and \(\sum_{j=0}^\infty |\pi_j|<\infty\); we set \(\pi_0 =1\).
Note:
- key here is that we express the model as “\(w_t\) is a function of \(x_t\)’s” even though our focus (what we model and observe) is \(x_t\). That’s the “invertible” idea;
- the \(\pi\) parameters are new ones, which are defined in the equation above (“such that” \(w_t\) can be written as such a sum of \(x_t\)’s).
[When is it invertible?] An \(ARMA(p,q)\) model is invertible if and only if \(\theta(z)\neq 0\) for \(|z|\le 1\). The coefficients \(\pi_j\) of \(\pi(B)\) given above can be determined by solving
$$\pi(z)= \sum_{j=0}^\infty \pi_j z^j = \frac{\phi(z)}{\theta(z)}, \quad |z|\le 1.$$
Equivalently, an ARMA process is invertible only when the roots of
\(\theta(z)\) lie outside the unit circle; that is, \(\theta(z)=0\) only when \(|z|>1\).
Example: parameter redundancy, causality, invertibility #
Consider the process:
$$x_t=0.4x_{t-1}+0.45 x_{t-2} + w_t + w_{t-1} + 0.25 w_{t-2}.$$
The first step here is to express that with the help of the backshift operator:
$$(1-0.4B-0.45B^2)x_t=(1+B+0.25B^2) w_t$$
This looks like an \(ARMA(2,2)\) process, but there’s a trick! Write the AR and MA polynomials, and try to factorise them:
$$\begin{array}{rcl} \phi(B) &=& 1-0.4B-0.45B^2=(1+0.5B)(1-0.9B) \\ \theta(B) &=& 1+B+0.25 B^2 = (1+0.5B)^2 \end{array}$$
There is a common factor, leading to parameter redundancy!
Factorise the common factor \((1+0.5B)\) out, and get
$$\begin{array}{rcl} \phi(B) &=& 1-0.9B \\ \theta(B) &=& 1+0.5B \end{array}$$
so that our model is in fact
$$x_t=0.9 x_{t-1}+0.5 w_{t-1} + w_t,$$
an \(ARMA(1,1)\) model.
[Parameter redundancy: checked!]
We next check whether the process is causal. We need the root of
$$\phi(z) = 1-0.9z = 0$$
to be outside the unit circle, which it is as the solution is \(z=10/9>1\).
[Causality: checked!]
We next check whether the model is invertible. We need the root of
$$\theta(z)=1+0.5z=0$$
to be outside the unit circle, which it is as the solution is \(z=-2\).
[Invertibility: checked!]
Now, we would like to find the linear representation of the process, that is, get the \(\psi\) weights. Because
$$\begin{array}{rcl} \phi(z)\psi(z) &=& \theta(z) \\ (1-0.9z)(1+\psi_1 z + \psi_2 z^2 +\cdots + \psi_j z^j+\cdots) &=& 1+0.5z \\ (\text{regrouping coefficients of powers of }z) && \\ 1+(\psi_1-0.9)z+(\psi_2-0.9\psi_1) z^2 + && \\ \cdots + (\psi_j -0.9 \psi_{j-1})z^j + \cdots &=& 1+ 0.5z \end{array}$$
We compare coefficients of the powers of \(z\), and note that all coefficients of \(z^j\) are 0 for \(j>1\) on the RHS.
We obtain then
$$\begin{array}{rcl} \psi_1-0.9=0.5 & \Longrightarrow & \psi_1=1.4, \\ \psi_j-0.9\psi_{j-1}=0 & \Longrightarrow & \frac{\psi_j}{\psi_{j-1}}=0.9, \quad j>1. \end{array}$$
and thus
$$\psi_j=1.4(0.9)^{j-1} \text{ for }j\ge 1,$$
and hence
$$x_t = w_t+1.4\sum_{j=1}^\infty 0.9^{j-1}w_{t-j}.$$
In R, this is much quicker! just use ( \(x_t=0.9x_{t-1}+0.5w_{t-1}+w_t\) )
format(ARMAtoMA(ar = 0.9, ma = 0.5, 10), digits = 2) # first 10 psi-weights
## [1] "1.40" "1.26" "1.13" "1.02" "0.92" "0.83" "0.74" "0.67" "0.60"
## [10] "0.54"
[Linear representation: checked!]
(no, it’s not over yet!)
Next, we want to determine the invertible representation of the model. Because
$$\begin{array}{rcl} \theta(z)\pi(z) &=& \phi(z) \\ (1+0.5z)(1+\pi_1 z + \pi_2 z^2 +\pi_3 z^3+\cdots) &=& 1-0.9z \\ (\text{regrouping coefficients of powers of }z) && \\ 1+(\pi_1+0.5)z+(\pi_2+0.5\pi_1) z^2 + && \\ \cdots + (\pi_j +0.5 \pi_{j-1})z^j + \cdots &=& 1- 0.9z \end{array}$$
We compare coefficients of the powers of \(z\).
We get
$$\pi_j=(-1)^j 1.4 (0.5)^{j-1}\text{ for }j\ge 1$$
and then
$$w_t = \sum_{j=0}^\infty \pi_j x_{t-j} = x_t + \sum_{j=1}^\infty \pi_j x_{t-j}$$
so that
$$x_t= - \sum_{j=1}^\infty \pi_j x_{t-j}+w_t.$$
Again, this is much quicker in R: ( \(w_t=-0.5w_{t-1}-0.9 x_{t-1}+x_t\) )
format(ARMAtoMA(ar = -0.5, ma = -0.9, 10), digits = 1) # first 10 pi-weights
## [1] "-1.400" " 0.700" "-0.350" " 0.175" "-0.087" " 0.044" "-0.022"
## [8] " 0.011" "-0.005" " 0.003"
[Invertible representation: checked!]
Stationarity, Causality and Invertibility #
Wrapping it up #
First, it is helpful to rewrite the ARMA representation as
$$x_t=\frac{\theta(B)}{\phi(B)}w_t=\psi(B)w_t$$
To summarise:
- First, we require \(\theta(B)\) and \(\phi(B)\) to not have common factors. If they do, these will obviously cancel out in the ratio \(\theta(B)/\phi(B)\) and we, really, are only dealing with the “reduced” model (without the redundant parameters).
- Pure \(AR(p)\) models will be stationary as long as \(\psi(B)\) is well behaved (say, finite), which will happen as long as the roots of \(\phi(B)\) are all greater than one in modulus \((|z_j|>1)\); this is because \(\phi(B)\) appears in the denominator. Since this is not impacted by \(\theta(B)\), which is in the numerator, this result also holds in the case of \(ARMA(p,q)\) models.
- Pure \(MA(q)\) models—where \(\phi(B)=1\)—are always stationary (under some mild conditions on the coefficients which we ignore here), because by definition they include a finite number of \(w\)’s (all covariances are finite). They are also causal by definition.
- Now, even though \(MA(q)\) models are always causal, establishing causality for \(AR(p)\) and \(ARMA(p,q)\) models—where \(p>0\)—is a little trickier. We not only need \(\psi(B)\) to be well behaved, but we also need it to depend on past values only (an example of a stationary but non-causal model is provided on slide 26 of Module 9 above; see also the next subsection). It turns out that you only need the roots of \(\phi(z)\) not to be on the unit circle for the process to be stationary. For it to be causal, the additional requirement is that they need to be outside the unit circle. In other words, while you could have roots inside the unit circle and still achieve stationarity, the process would not be causal. This means that causality implies stationarity, but not the other way around.
- Again, stationarity does not require the roots of \(\theta(B)\) to be greater than one in modulus. This is required for invertibility, which aims to flip things around (“invert” the process):
$$w_t=\frac{\phi(B)}{\theta(B)}x_t.$$
It becomes clear why, now, it is the roots of \(\theta(B)\) that need to be well behaved (as it is now \(\theta(B)\) that is in the denominator).
Revisiting the future-dependent example #
Let us revisit that example: We have
$$x_{t}=\phi x_{t-1}+w_t.$$
In this case the characteristic equation is
$$\phi(z)=1-\phi z.$$
This has root
$$z_0=1/\phi.$$
We distinguish three cases:
- \(\phi<1\) (say, \(\phi=0.5\) so that \(z_0=1/\phi=2\)): the root is not on the unit circle, and also outside the unit circle, so that the process—which is \(AR(1)\)—is stationary and causal.
- \(\phi=1\) \((z_0=1)\): the root is on the unit circle, and hence the process is not stationary. This is the random walk case. Note it is not causal either, because a causal process is a linear process that depends on the past only, and while the random walk depends on the present and past only, it does not satisfy the requirement (for it to be a linear process) that the sum of the absolute values of the weights is finite (the sum of an infinite number of 1’s is infinite) - see Module 7.
- \(\phi>1\) (say, \(\phi=3\) so that \(z_0=1/3<1\)): the root is not on the unit circle, and hence the process is stationary. However, the root is inside the unit circle, which implies it is not causal. The process will depend on the future as demonstrated earlier.
ACF and PACF #
Autocovariance and ACF #
For a causal \(ARMA(p,q)\) model
$$\phi(B)x_t=\theta(B) w_t$$
we use the linear representation
$$x_t= \sum_{j=0}^\infty \psi_j w_{t-j},$$
from which it follows immediately that \(E[x_t]=0\), and the autocovariance function of \(x_t\) is
$$\gamma(h)=Cov(x_{t+h},x_t) = \sigma_w^2 \sum_{j=0}^\infty \psi_j \psi_{j+h}, \quad h\ge 0.$$
This approach requires solving for the \(\psi\)’s.
Alternatively, it is possible to write a general homogeneous equation for the ACF of a causal ARMA process to solve for the \(\gamma\)’s directly
(The proof is outside scope but available in TS):
$$\gamma(h)-\phi_1\gamma(h-1)-\cdots-\phi_p \gamma(h-p)=0,\quad h\ge \max (p,q+1),$$
with initial conditions
$$\gamma(h)-\sum_{j=1}^p \phi_j \gamma(h-j)=\sigma_w^2 \sum_{j=h}^q \theta_j \psi_{j-h},\quad 0\le h< \max (p,q+1).$$
Finally, the ACF is
$$\rho(h)=\frac{\gamma(h)}{\gamma(0)}.$$
In general, the ACF cannot distinguish between AR and ARMA, which is why PACF is useful (in presence of pure AR, it will cut off).
Example: ACF of \(ARMA(1,1)\)
#
Consider the \(ARMA(1,1)\) process
$$x_t=\phi x_{t-1}+\theta w_{t-1}+w_t, \text{ where }|\phi|<1.$$
The autocovariance then satisfies
$$\gamma(h)-\phi \gamma(h-1)=0,\quad h=2,3,\ldots,$$
which has general solution
$$\gamma(h)=c \phi^h,\quad h=1,2,\ldots.$$
Initial conditions are
$$\begin{array}{rcl} \gamma(0) &=& \phi \gamma(1) + \sigma_w^2(\theta_0\psi_0+\theta_1\psi_1)=\phi \gamma(1)+\sigma_w^2[1+\theta \phi + \theta^2] \\ \gamma(1) &=& \phi \gamma(0)+\sigma_w^2 \theta. \end{array}$$
Note that \(\psi_1=\theta+\phi\) for an \(ARMA(1,1)\) model
(see Example 3.12 in the textbook).
Solving for \(\gamma(0)\) and \(\gamma(1)\) yields
$$\begin{array}{rcl} \gamma(0) &=& \sigma_w^2 \frac{1+2\theta\phi+\theta^2}{1-\phi^2} \\ \gamma(1) &=& \sigma_w^2\frac{(1+\theta \phi)(\phi+\theta)}{1-\phi^2}. \end{array}$$
Since \(\gamma(1)=c\phi\), we get \(c=\gamma(1)/\phi\) and
$$\gamma(h)=\frac{\gamma(1)}{\phi}\phi^h=\sigma_w^2\frac{(1+\theta \phi)(\phi+\theta)}{1-\phi^2}\phi^{h-1},\quad h\ge1,$$
from which we obtain the ACF
$$\rho(h)=\frac{\gamma(h)}{\gamma(0)}=\frac{\gamma(1)}{\gamma(0)}\phi^{h-1}=\frac{(1+\theta \phi)(\phi+\theta)}{1+2\theta \phi +\theta^2}\phi^{h-1},\quad h\ge 1.$$
This has the same pattern as an \(AR(1)\).
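A minimal numerical check of this ACF, with assumed illustrative values \(\phi=0.9\) and \(\theta=0.5\) (the parameters of the worked example above):
phi <- 0.9; theta <- 0.5
h <- 1:5
(1 + theta * phi) * (phi + theta)/(1 + 2 * theta * phi + theta^2) * phi^(h - 1)  # closed form
ARMAacf(ar = phi, ma = theta, lag.max = 5)[-1]                                   # drop lag 0; matches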
\(ARIMA(p,d,q)\) models (TS 3.1, 3.3, 3.6, S6)
#
Definition #
A process \(x_t\) is said to be \(\alert{ARIMA(p,d,q)}\) if
$$\nabla^d x_t = (1-B)^d x_t$$
is \(ARMA(p,q)\). In general, we will write the model as
$$\phi(B)(1-B)^dx_t=\theta(B)w_t.$$
If \(E[\nabla^d x_t]=\mu\), we write the model as
$$\phi(B)(1-B)^dx_t=\delta+\theta(B)w_t,$$
where
$$\delta=\mu(1-\phi_1-\cdots-\phi_p).$$
The integrated ARMA, or ARIMA, is a broadening of the class of
ARMA models to include differencing.
Remarks #
- Because of nonstationarity, care must be taken when deriving forecasts. It is best to use so-called state-space models for handling nonstationary models (but these are outside the scope of this course).
- Since
$$y_t = \nabla^d x_t$$
is ARMA, it suffices to discuss how to fit and forecast ARMA models. For instance, if \(d=1\), given forecasts \(y_{n+m}^n\) for \(m=1,2,\ldots\), we have
$$y_{n+m}^n=x_{n+m}^n-x_{n+m-1}^n \text{ so that }x_{n+m}^n=y_{n+m}^n+x^n_{n+m-1},$$
with initial condition \(x_{n+1}^n=y_{n+1}^n+x_n\) (noting \(x_n^n=x_n\)); a small numerical sketch follows.
- Derivation of prediction errors is slightly more involved (but also outside the scope of this course).
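A minimal sketch of the \(d=1\) recursion above, with hypothetical numbers (both the last observed level and the forecasts of the differences below are made up for illustration):
x_n  <- 105.2              # hypothetical last observed level x_n
yhat <- c(0.8, 0.5, 0.3)   # hypothetical ARMA forecasts of the differences at horizons 1, 2, 3
x_n + cumsum(yhat)         # x^n_{n+m} = y^n_{n+m} + x^n_{n+m-1}, starting from x_n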
The IMA(1,1) model and exponential smoothing #
IMA(1,1) and EWMA #
- The \(ARIMA(0,1,1)\), or \(\alert{IMA(1,1)}\), model is of interest because many economic time series can be successfully modelled this way.
- The model leads to a frequently used— and abused! —forecasting method called exponentially weighted moving averages (EWMA) (“exponential smoothing” in S6).
- One should not use this method unless there is strong statistical evidence that this is the right model to use!
(as per the following sections)
EWMA example #
set.seed(666)
x <- arima.sim(list(order = c(0, 1, 1), ma = -0.8), n = 100) #simulate IMA(1,1)
x.ima <- HoltWinters(x, beta = FALSE, gamma = FALSE) # fit EWMA
plot(x.ima)

Multiplicative seasonal ARIMA models (TS 3.9) #
Introduction #
- We observed seasonality in many time series so far. This is manifested when the dependence on the past tends to occur most strongly at multiples of some underlying seasonal lag \(s\). For example:
  - monthly economic data: strong yearly component occurring at lags that are multiples of \(s = 12\);
  - data taken quarterly: will exhibit the yearly repetitive period at \(s = 4\) quarters.
- In this section, we introduce several modifications made to the ARIMA model to account for seasonal and nonstationary behavior.
- The reference for this section is TS 3.9.
- Note that seasonality can be due to:
  - Deterministic reasons: include in trend
  - Stochastic dependence at seasonal lags \(s\): use SARIMA models
Pure seasonal \(ARMA(P,Q)_s\) models
#
The pure seasonal autoregressive moving average model \(\alert{ARMA(P,Q)_s}\) takes the form
$$\varPhi_P(B^s)x_t=\varTheta_Q(B^s)w_t,$$
where
$$ \varPhi_P(B^s) = 1-\varPhi_1 B^s-\varPhi_2 B^{2s}-\cdots-\varPhi_P B^{Ps}$$
is the seasonal autoregressive operator of order \(\alert{P}\) with seasonal period \(\alert{s}\), and where
$$ \varTheta_Q(B^s) = 1+\varTheta_1 B^s+\varTheta_2 B^{2s}+\cdots+\varTheta_Q B^{Qs}$$
is the seasonal moving average operator of order \(\alert{Q}\) with seasonal period \(\alert{s}\).
Here, inter-temporal correlations are exclusively seasonal (there are no non-seasonal correlations).
Note: (analogous to nonseasonal ARMA)
- It will be causal only when the roots of \(\varPhi_P(z^s)\) lie outside the unit circle
- It will be invertible only when the roots of \(\varTheta_Q(z^s)\) lie outside the unit circle
Example: first order seasonal \((s=12)\) AR model
#
A first-order seasonal autoregressive series that might run over months could be written as
$$ \left(1-\varPhi B^{12}\right)x_t = w_t \quad\quad \text{or} \quad\quad x_t = \varPhi x_{t-12}+w_t.$$
Note:
- \(x_t\) is expressed in terms of past lags at multiples of the (yearly) seasonal period \(s=12\) months
- very similar to the unit lag model \(AR(1)\) that we know
- causal if \(|\varPhi|<1\)

Simulated example (with \(\varPhi=0.9\)):
set.seed(666)
phi <- c(rep(0, 11), 0.9)
sAR <- arima.sim(list(order = c(12, 0, 0), ar = phi), n = 37)
sAR <- ts(sAR, freq = 12)
par(mar = c(3, 3, 2, 1), mgp = c(1.6, 0.6, 0))
plot(sAR, axes = FALSE, main = "Seasonal AR(1)", xlab = "year",
type = "c")
Months <- c("J", "F", "M", "A", "M", "J", "J", "A", "S", "O",
"N", "D")
points(sAR, pch = Months, cex = 1.25, font = 4, col = 1:4)
axis(1, 1:4)
abline(v = 1:4, lty = 2, col = gray(0.7))
axis(2)
box()

Theoretical ACF and PACF of the model:
ACF <- ARMAacf(ar = phi, ma = 0, 100)
PACF <- ARMAacf(ar = phi, ma = 0, 100, pacf = TRUE)
par(mfrow = c(2, 1), mar = c(3, 3, 2, 1), mgp = c(1.6, 0.6, 0))
plot(ACF, type = "h", xlab = "LAG", ylim = c(-0.1, 1))
abline(h = 0)
plot(PACF, type = "h", xlab = "LAG", ylim = c(-0.1, 1))
abline(h = 0)

Autocovariance and ACF of first-order seasonal models #
- For the first-order seasonal \((s=12)\) MA model
$$x_t=w_t+\varTheta w_{t-12}$$
we have
$$\begin{array}{rcl} \gamma(0) &=& (1+\varTheta^2)\sigma^2 \\ \gamma(\pm 12) &=& \varTheta \sigma^2 \\ \gamma(h) &=& 0, \text{ otherwise.} \end{array}$$
Thus, the only nonzero correlation (aside from lag zero) is
$$\rho(\pm 12) = \frac{\varTheta}{1+\varTheta^2}.$$
The PACF will tail off at multiples of \(s=12\).
- For the first-order seasonal \((s=12)\) AR model
$$x_t = \varPhi x_{t-12}+w_t$$
we have
$$\begin{array}{rcl} \gamma(0) &=& \frac{\sigma^2}{1-\varPhi^2} \\ \gamma(\pm 12k) &=& \frac{\sigma^2\varPhi^k}{1-\varPhi^2},\quad k=1,2,\ldots \\ \gamma(h) &=& 0, \text{ otherwise.} \end{array}$$
Thus, the only nonzero correlations are
$$\rho(\pm 12k) = \varPhi^k,\quad k=1,2,\ldots.$$
The PACF will have one nonzero correlation at \(s=12\) and then cut off.
Multiplicative seasonal \(ARMA(p,q)\times(P,Q)_s\) models
#
In general, we can combine the seasonal and nonseasonal operators into a multiplicative seasonal autoregressive moving average model \(\alert{ARMA(p,q)\times(P,Q)_s}\), and write
$$\varPhi_P(B^s)\phi(B)x_t = \varTheta_Q(B^s)\theta(B)w_t$$
as the overall model.
Note:
- When selecting a model, we will need to carefully examine the ACF and PACF of the data.
- Choosing the seasonal autoregressive and moving average components first will generally lead to better results.
- This will be discussed more in Module 10.
Example: A mixed seasonal model \(ARMA(0,1)\times(1,0)_{12}\)
#
Consider an \(ARMA(0,1)\times(1,0)_{12}\) model
$$x_t = \varPhi x_{t-12}+w_t+\theta w_{t-1},$$
where \(|\varPhi|<1\) and \(|\theta|<1\).
Because
- \(x_{t-12}\), \(w_t\) and \(w_{t-1}\) are uncorrelated; and
- \(x_t\) is stationary,
then
$$\gamma(0)=\varPhi^2 \gamma(0)+\sigma_w^2+\theta^2 \sigma_w^2 \Longleftrightarrow \gamma(0)=\frac{1+\theta^2}{1-\varPhi^2} \sigma_w^2.$$
Furthermore, multiplying the model by \(x_{t-h}\), \(h>0\)
$$x_t x_{t-h} = \varPhi x_{t-12}x_{t-h} +w_t x_{t-h}+ \theta w_{t-1}x_{t-h}$$
and taking expectations leads to
$$\begin{array}{rcl} \gamma(1) &=& \varPhi \gamma(11)+\theta \sigma_w^2 \\ \gamma(h) &=& \varPhi \gamma(h-12)\text{ for }h\ge 2. \end{array}$$
The first result stems from
$$\begin{array}{rcl} \gamma(1)=E[x_t x_{t-1}] &=& E[\varPhi x_{t-12}x_{t-1}] +E[w_t x_{t-1}]+ E[\theta w_{t-1}x_{t-1}]\\ &=&\varPhi \gamma(11)+0+\theta \sigma_w^2 \end{array}$$
because
$$x_{t-1}=\varPhi x_{t-13}+w_{t-1}+\theta w_{t-2}.$$
Proof of the second result is similar.
Thus, the ACF for this model (requires some algebra) is
$$\begin{array}{rcl} \rho(12h) &=& \varPhi^h, \quad h=1,2,\ldots \\ \rho(12h-1) &=& \rho(12h+1)=\frac{\theta}{1+\theta^2}\varPhi^h, \quad h=0,1,2,\ldots \\ \rho(h)&=&0 \quad\text{otherwise}. \end{array}$$
Example: if \(\varPhi=0.8\) and \(\theta=-.5\), then theoretical ACF and PACF become
phi <- c(rep(0, 11), 0.8)
ACF <- ARMAacf(ar = phi, ma = -0.5, 50)[-1] # [-1] removes 0 lag
PACF <- ARMAacf(ar = phi, ma = -0.5, 50, pacf = TRUE)
par(mfrow = c(1, 2))
plot(ACF, type = "h", xlab = "LAG", ylim = c(-0.4, 0.8))
abline(h = 0)
plot(PACF, type = "h", xlab = "LAG", ylim = c(-0.4, 0.8))
abline(h = 0)

Seasonal differencing #
Motivating example #
- Consider average temperatures over the years.
- Each January would be approximately the same (as would February, etc…).
- In this case we might think of the average monthly temperature as
$$x_t = S_t+w_t,$$
where \(S_t\) is a seasonal component that varies a little from one year to the next, say (random walk)
$$S_t=S_{t-12}+v_t.$$
- Here, \(v_t\) and \(w_t\) are uncorrelated white noise processes.
- Note
$$ x_{t}=S_{t-12}+v_t+w_t \text{ and } x_{t-12}= S_{t-12}+w_{t-12}.$$
- If we subtract the effect of successive years from each other (“seasonal differencing”), we find that
$$(1-B^{12})x_t=x_t-x_{t-12}=v_t+w_t-w_{t-12}.$$
- This model is a stationary purely seasonal \(MA(1)_{12}\), and its ACF will have a peak only at lag 12.
- The original (undifferenced) series, on the other hand, would exhibit an ACF that is large and decays very slowly at lags \(h=12k\), for \(k=1,2,\ldots\), which is the signal that seasonal differencing is needed.
Seasonal differencing #
- In general, seasonal differencing can be indicated when the ACF decays slowly at multiples of some season \(s\), but is negligible between the periods.
- The seasonal difference of order \(D\) is defined as
$$\nabla_s^D x_t = (1-B^s)^D x_t,$$
where \(D=1,2,\ldots\) takes positive integer values.
- Typically, \(D=1\) is sufficient to obtain seasonal stationarity.
How do we combine this idea with an ARIMA model?
SARIMA model \(ARIMA(p,d,q)\times(P,D,Q)_s\)
#
The multiplicative seasonal autoregressive integrated moving average model, or SARIMA model is given by
$$\textcolor{green}{\varPhi_P(B^s)}\textcolor{blue}{\phi(B)}\textcolor{green}{\nabla_s^D}\textcolor{blue}{\nabla^d} x_t = \delta + \textcolor{green}{\Theta_Q(B^s)}\textcolor{blue}{\theta(B)}w_t,$$
where \(w_t\) is the (usual) Gaussian white noise process.
Note:
- This is denoted as \(ARIMA\textcolor{blue}{(p,d,q)}\times\textcolor{green}{(P,D,Q)_s}\).
- The ordinary autoregressive and moving average components are represented by the polynomials \(\textcolor{blue}{\phi(B)}\) and \(\textcolor{blue}{\theta(B)}\) of orders \(\textcolor{blue}{p}\) and \(\textcolor{blue}{q}\), respectively.
- The seasonal autoregressive and moving average components are represented by \(\textcolor{green}{\varPhi_P(B^s)}\) and \(\textcolor{green}{\varTheta_Q(B^s)}\) of orders \(\textcolor{green}{P}\) and \(\textcolor{green}{Q}\), respectively.
- The ordinary and seasonal difference components are represented by \(\textcolor{blue}{\nabla^d =(1-B)^d}\) and \(\textcolor{green}{\nabla_s^D =(1-B^s)^D}\).
A typical SARIMA model #
Consider the following \(ARIMA(0, 1, 1)\times(0, 1, 1)_{12}\) model with seasonal fluctuations that occur every 12 months,
$$\nabla_{12}\nabla x_t = \varTheta(B^{12})\theta(B)w_t,$$
where we have set \(\delta=0\).
Note:
- This model often provides a reasonable representation for seasonal, nonstationary, economic time series.
- This model can be represented equivalently as
$$\begin{array}{rcl} (1-B^{12})(1-B)x_t &=&(1+\varTheta B^{12})(1+\theta B)w_t \\ (1-B-B^{12}+B^{13})x_t &=& (1+\theta B + \varTheta B^{12}+\varTheta \theta B^{13})w_t. \end{array}$$
Yet another representation is
$$x_t = x_{t-1}+x_{t-12}-x_{t-13}+w_t+\theta w_{t-1}+\varTheta w_{t-12}+\varTheta \theta w_{t-13}.$$
- The multiplicative nature implies that the coefficient of \(w_{t-13}\) is \(\varTheta \theta\) rather than yet another free parameter.
- However, this often works well, and reduces the number of parameters needed.
Example: Air Passengers #
Consider the R data set AirPassengers, which are the monthly totals of international airline passengers, 1949 to 1960, taken from Box & Jenkins (1970):
x <- AirPassengers
We’ll explore this dataset and see how we can “difference out” some seasonality.
Let’s have a look at the series:
plot.ts(x, main = "Air Passengers series, unmodified")

This shows trend plus increasing variance \(\rightarrow\) try a log transformation.
lx <- log(x)
plot.ts(lx, main = "Log of Air Passengers series")

The log transformation has stabilised the variance.
We now need to remove the trend, and try differencing:
dlx <- diff(lx)
plot.ts(dlx, main = "Differenced Log of Air Passengers series")

It is clear that there is still persistence in the seasons, that is, \(\text{dlx}_t \approx \text{dlx}_{t-12}\).
We apply a twelfth-order difference:
ddlx <- diff(dlx, 12)
plot.ts(ddlx, main = "[s=12]-differenced Differenced Log of Air Passengers series")

This seems to have removed the seasonality:
par(mfrow = c(2, 1), mar = c(3, 3, 2, 1), mgp = c(1.6, 0.6, 0))
monthplot(dlx)
monthplot(ddlx)

Note the monthplot function.
To summarise:
plot.ts(cbind(x, lx, dlx, ddlx), main = "")

The transformed data appears to be stationary and
we are now ready to fit a model (see Module 10).
Multivariate time series (TS 5.6) #
Introduction #
- Many data sets involve more than one time series, and we are often interested in the possible dynamics relating all series.
- We are thus interested in modelling and forecasting \(k\times 1\) vector-valued time series
$$x_t=(x_{t1},\ldots,x_{tk})',\quad t=0,\pm1,\pm2,\ldots.$$
- Unfortunately, extending univariate ARMA models to the multivariate case is not so simple.
- The multivariate autoregressive model, however, is a straight-forward extension of the univariate AR model.
- The resulting models are called Vector Auto-Regressive (VAR) models.
- The reference for this section is TS 5.6.
VAR(1) model #
For the first-order vector autoregressive model, VAR(1), we take
$$x_t =\alpha+\Phi x_{t-1}+w_t,$$
where \(\Phi\) is a \(k\times k\) transition matrix that expresses the dependence of \(x_t\) on \(x_{t-1}\) (note these are vectors). The vector white noise process \(w_t\) is assumed to be multivariate normal with mean-zero and covariance matrix
$$E\left[ w_t w_t'\right] = \Sigma_w.$$
The vector \(\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_k)'\) is similar to a constant in a regression setting. If \(E[x_t] = \mu\), then \(\alpha = (I-\Phi)\mu\) as before.
Example: Mortality, Temperature, Pollution #
- Define
$$x_t=(x_{t1},x_{t2},x_{t3})'$$
as a vector of dimension \(k=3\) for cardiovascular mortality \(x_{t1}\), temperature \(x_{t2}\), and particulate levels \(x_{t3}\).
- We might envision dynamic relations with first order relation
$$\begin{array}{rcl} x_{t1}&=& \textcolor{blue}{\alpha_1+\beta_1 t} +\textcolor{green}{\phi_{11}}x_{t-1,1}+\textcolor{green}{\phi_{12}}x_{t-1,2}+\textcolor{green}{\phi_{13}}x_{t-1,3}+\textcolor{magenta}{w_{t1}} \\ x_{t2}&=& \textcolor{blue}{\alpha_2+\beta_2 t} +\textcolor{green}{\phi_{21}}x_{t-1,1}+\textcolor{green}{\phi_{22}}x_{t-1,2}+\textcolor{green}{\phi_{23}}x_{t-1,3}+\textcolor{magenta}{w_{t2}} \\ x_{t3}&=& \textcolor{blue}{\alpha_3+\beta_3 t} +\textcolor{green}{\phi_{31}}x_{t-1,1}+\textcolor{green}{\phi_{32}}x_{t-1,2}+\textcolor{green}{\phi_{33}}x_{t-1,3}+\textcolor{magenta}{w_{t3}} \end{array}$$
- This can be rewritten in matrix form as
$$x_t=\textcolor{blue}{\Gamma u_t} +\textcolor{green}{\Phi} x_{t-1}+\textcolor{magenta}{w_t},$$
where \(\Gamma=[\alpha|\beta]\) is \(3\times 2\) and \(u_t=(1,t)'\) is \(2\times 1\).
We use the R package vars to fit VAR models via least squares.
library(astsa)  # provides the cmort, tempr and part series
library(vars)
x <- cbind(cmort, tempr, part)
summary(VAR(x, p = 1, type = "both")) # 'both' fits constant + trend
...
## VAR Estimation Results:
## =========================
## Endogenous variables: cmort, tempr, part
## Deterministic variables: both
## Sample size: 507
## Log Likelihood: -5116.02
## Roots of the characteristic polynomial:
## 0.8931 0.4953 0.1444
## Call:
## VAR(y = x, p = 1, type = "both")
...
Note that roots less than one here ensure stability.
library(vars)
x <- cbind(cmort, tempr, part)
summary(VAR(x, p = 1, type = "both")) # 'both' fits constant + trend
...
## Estimation results for equation cmort:
## ======================================
## cmort = cmort.l1 + tempr.l1 + part.l1 + const + trend
##
## Estimate Std. Error t value Pr(>|t|)
## cmort.l1 0.464824 0.036729 12.656 < 2e-16 ***
## tempr.l1 -0.360888 0.032188 -11.212 < 2e-16 ***
## part.l1 0.099415 0.019178 5.184 3.16e-07 ***
## const 73.227292 4.834004 15.148 < 2e-16 ***
## trend -0.014459 0.001978 -7.308 1.07e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 5.583 on 502 degrees of freedom
## Multiple R-Squared: 0.6908, Adjusted R-squared: 0.6883
## F-statistic: 280.3 on 4 and 502 DF, p-value: < 2.2e-16
...
library(vars)
x <- cbind(cmort, tempr, part)
summary(VAR(x, p = 1, type = "both")) # 'both' fits constant + trend
...
## Estimation results for equation tempr:
## ======================================
## tempr = cmort.l1 + tempr.l1 + part.l1 + const + trend
##
## Estimate Std. Error t value Pr(>|t|)
## cmort.l1 -0.244046 0.042105 -5.796 1.20e-08 ***
## tempr.l1 0.486596 0.036899 13.187 < 2e-16 ***
## part.l1 -0.127661 0.021985 -5.807 1.13e-08 ***
## const 67.585598 5.541550 12.196 < 2e-16 ***
## trend -0.006912 0.002268 -3.048 0.00243 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 6.4 on 502 degrees of freedom
## Multiple R-Squared: 0.5007, Adjusted R-squared: 0.4967
## F-statistic: 125.9 on 4 and 502 DF, p-value: < 2.2e-16
...
library(vars)
x <- cbind(cmort, tempr, part)
summary(VAR(x, p = 1, type = "both")) # 'both' fits constant + trend
...
## Estimation results for equation part:
## =====================================
## part = cmort.l1 + tempr.l1 + part.l1 + const + trend
##
## Estimate Std. Error t value Pr(>|t|)
## cmort.l1 -0.124775 0.079013 -1.579 0.115
## tempr.l1 -0.476526 0.069245 -6.882 1.77e-11 ***
## part.l1 0.581308 0.041257 14.090 < 2e-16 ***
## const 67.463501 10.399163 6.487 2.10e-10 ***
## trend -0.004650 0.004256 -1.093 0.275
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 12.01 on 502 degrees of freedom
## Multiple R-Squared: 0.3732, Adjusted R-squared: 0.3683
## F-statistic: 74.74 on 4 and 502 DF, p-value: < 2.2e-16
...
library(vars)
x <- cbind(cmort, tempr, part)
summary(VAR(x, p = 1, type = "both")) # 'both' fits constant + trend
...
## Covariance matrix of residuals:
## cmort tempr part
## cmort 31.172 5.975 16.65
## tempr 5.975 40.965 42.32
## part 16.654 42.323 144.26
##
## Correlation matrix of residuals:
## cmort tempr part
## cmort 1.0000 0.1672 0.2484
## tempr 0.1672 1.0000 0.5506
## part 0.2484 0.5506 1.0000
...
\(VAR(p)\) models
#
- It is easy to extend the VAR(1) process to higher orders (that means with correlations going farther than one step into the past), leading to \(\alert{VAR(p)}\).
- The regressors are
$$(1,x_{t-1}',x_{t-2}',\ldots,x_{t-p}')'.$$
- The regression model is then
$$x_t=\alpha + \sum_{j=1}^p \Phi_j x_{t-j}+w_t.$$
- The function VARselect suggests the optimal order \(p\) according to different criteria: AIC, Hannan-Quinn (similar to BIC), BIC, and Final Prediction Error (which minimises the approximate mean squared one-step-ahead prediction error).
Example: VAR(1) on Mortality, Temperature, Pollution #
VARselect(x, lag.max = 10, type = "both")
## $selection
## AIC(n) HQ(n) SC(n) FPE(n)
## 9 5 2 9
##
## $criteria
## 1 2 3 4 5
## AIC(n) 11.73780 11.30185 11.26788 11.23030 11.17634
## HQ(n) 11.78758 11.38149 11.37738 11.36967 11.34557
## SC(n) 11.86463 11.50477 11.54689 11.58541 11.60755
## FPE(n) 125216.91717 80972.28678 78268.19568 75383.73647 71426.10041
## 6 7 8 9 10
## AIC(n) 11.15266 11.15247 11.12878 11.11915 11.12019
## HQ(n) 11.35176 11.38144 11.38760 11.40784 11.43874
## SC(n) 11.65996 11.73587 11.78827 11.85473 11.93187
## FPE(n) 69758.25113 69749.89175 68122.40518 67476.96374 67556.45243
We will proceed with order \(p=2\) according to BIC.
summary(fit <- VAR(x, p = 2, type = "both"))
...
## VAR Estimation Results:
## =========================
## Endogenous variables: cmort, tempr, part
## Deterministic variables: both
## Sample size: 506
## Log Likelihood: -4987.186
## Roots of the characteristic polynomial:
## 0.8807 0.8807 0.5466 0.4746 0.4746 0.4498
## Call:
## VAR(y = x, p = 2, type = "both")
...
summary(fit <- VAR(x, p = 2, type = "both"))
...
## Estimation results for equation cmort:
## ======================================
## cmort = cmort.l1 + tempr.l1 + part.l1 + cmort.l2 + tempr.l2 + part.l2 + const + trend
##
## Estimate Std. Error t value Pr(>|t|)
## cmort.l1 0.297059 0.043734 6.792 3.15e-11 ***
## tempr.l1 -0.199510 0.044274 -4.506 8.23e-06 ***
## part.l1 0.042523 0.024034 1.769 0.07745 .
## cmort.l2 0.276194 0.041938 6.586 1.15e-10 ***
## tempr.l2 -0.079337 0.044679 -1.776 0.07639 .
## part.l2 0.068082 0.025286 2.692 0.00733 **
## const 56.098652 5.916618 9.482 < 2e-16 ***
## trend -0.011042 0.001992 -5.543 4.84e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 5.295 on 498 degrees of freedom
## Multiple R-Squared: 0.7227, Adjusted R-squared: 0.7188
## F-statistic: 185.4 on 7 and 498 DF, p-value: < 2.2e-16
...
summary(fit <- VAR(x, p = 2, type = "both"))
...
## Estimation results for equation tempr:
## ======================================
## tempr = cmort.l1 + tempr.l1 + part.l1 + cmort.l2 + tempr.l2 + part.l2 + const + trend
##
## Estimate Std. Error t value Pr(>|t|)
## cmort.l1 -0.108889 0.050667 -2.149 0.03211 *
## tempr.l1 0.260963 0.051292 5.088 5.14e-07 ***
## part.l1 -0.050542 0.027844 -1.815 0.07010 .
## cmort.l2 -0.040870 0.048587 -0.841 0.40065
## tempr.l2 0.355592 0.051762 6.870 1.93e-11 ***
## part.l2 -0.095114 0.029295 -3.247 0.00125 **
## const 49.880485 6.854540 7.277 1.34e-12 ***
## trend -0.004754 0.002308 -2.060 0.03993 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 6.134 on 498 degrees of freedom
## Multiple R-Squared: 0.5445, Adjusted R-squared: 0.5381
## F-statistic: 85.04 on 7 and 498 DF, p-value: < 2.2e-16
...
summary(fit <- VAR(x, p = 2, type = "both"))
...
## Estimation results for equation part:
## =====================================
## part = cmort.l1 + tempr.l1 + part.l1 + cmort.l2 + tempr.l2 + part.l2 + const + trend
##
## Estimate Std. Error t value Pr(>|t|)
## cmort.l1 0.078934 0.091773 0.860 0.390153
## tempr.l1 -0.388808 0.092906 -4.185 3.37e-05 ***
## part.l1 0.388814 0.050433 7.709 6.92e-14 ***
## cmort.l2 -0.325112 0.088005 -3.694 0.000245 ***
## tempr.l2 0.052780 0.093756 0.563 0.573724
## part.l2 0.382193 0.053062 7.203 2.19e-12 ***
## const 59.586169 12.415669 4.799 2.11e-06 ***
## trend -0.007582 0.004180 -1.814 0.070328 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 11.11 on 498 degrees of freedom
## Multiple R-Squared: 0.4679, Adjusted R-squared: 0.4604
## F-statistic: 62.57 on 7 and 498 DF, p-value: < 2.2e-16
...
summary(fit <- VAR(x, p = 2, type = "both"))
...
## Covariance matrix of residuals:
## cmort tempr part
## cmort 28.034 7.076 16.33
## tempr 7.076 37.627 40.88
## part 16.325 40.880 123.45
##
## Correlation matrix of residuals:
## cmort tempr part
## cmort 1.0000 0.2179 0.2775
## tempr 0.2179 1.0000 0.5998
## part 0.2775 0.5998 1.0000
...
Furthermore, the Portmanteau test rejects the null hypothesis of white noise, suggesting poor fit:
serial.test(fit, lags.pt = 12, type = "PT.adjusted")
##
## Portmanteau Test (adjusted)
##
## data: Residuals of VAR object fit
## Chi-squared = 162.35, df = 90, p-value = 4.602e-06
This is confirmed by visual inspection:
acf(resid(fit), 52)

Predictions are produced with
fit.pr <- predict(fit, n.ahead = 24, ci = 0.95) # 24 weeks ahead
fanchart(fit.pr) # plot prediction + error

Special cases briefly mentioned in CS2 syllabus (S6) #
Bilinear models #
- Consider the bilinear model
$$x_t-\alpha(x_{t-1}-\mu)=\mu+w_t+\beta w_{t-1}+b(x_t-\mu)w_{t-1}.$$
- This equation is linear in \(x_t\), and also linear in \(w_t\), hence the name bilinear.
- This can be rewritten as
\begin{eqnarray*} x_t&=&\mu+w_t \\ &&+(\beta +b(x_t-\mu)) w_{t-1}\\ &&+\alpha(x_{t-1}-\mu). \end{eqnarray*}
- As opposed to ARMA models, bilinear models exhibit non-linear behaviour: when the process is far from its mean it tends to exhibit larger fluctuations.
- When the process is “at” its mean, dynamics are similar to that of an \(MA(1)\) process.
Threshold autoregressive models #
- Consider the threshold autoregressive model
$$x_t=\mu+\left\{\begin{array}{lc} \alpha_1(x_{t-1}-\mu)+w_t & x_{t-1}\le d, \\ \alpha_2(x_{t-1}-\mu)+w_t & x_{t-1}> d. \end{array}\right.$$
- A distinctive feature of some models from the threshold autoregressive class is limit cycle behaviour. This makes threshold autoregressive models potential candidates for modelling ‘cyclic’ phenomena.
- The idea is to allow the parameters to change in a “regime-switching” fashion, sometimes depending on the past values of \(x_t\) (a minimal simulation sketch is given after this list).
- Some additional details can be found, for instance, on Wikipedia here (not assessable).
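A minimal simulation sketch of a two-regime threshold AR(1), with assumed values \(d=0\), \(\alpha_1=0.9\) and \(\alpha_2=-0.8\) (chosen purely for illustration):
set.seed(1)
n <- 500; mu <- 0; d <- 0; a1 <- 0.9; a2 <- -0.8
w <- rnorm(n)
x <- numeric(n)
for (t in 2:n) {
  a <- if (x[t - 1] <= d) a1 else a2       # regime chosen by the previous value
  x[t] <- mu + a * (x[t - 1] - mu) + w[t]
}
plot.ts(x, main = "Simulated threshold AR(1)")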
Random coefficient autoregressive models #
- Here
$$x_t=\mu+\alpha_t(x_{t-1}-\mu)+w_t,$$
where \(\{\alpha_1,\alpha_2,\ldots,\alpha_n\}\) is a sequence of independent random variables.
- Such a model could be used to represent the behaviour of an investment fund, with \(\mu=0\) and \(\alpha_t=1+i_t\), where \(i_t\) is a random rate of return.
- The extra randomness makes such processes more irregular than the corresponding \(AR(1)\) (see the sketch below).
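A minimal simulation sketch of the investment-fund interpretation (the return distribution below is an assumption made purely for illustration):
set.seed(1)
n <- 200
i_t <- rnorm(n, mean = 0.005, sd = 0.04)   # hypothetical random rates of return
w <- rnorm(n)
x <- numeric(n)
for (t in 2:n) x[t] <- (1 + i_t[t]) * x[t - 1] + w[t]   # mu = 0, alpha_t = 1 + i_t
plot.ts(x, main = "Simulated random coefficient AR(1)")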
Autoregressive models with conditional heteroscedasticity #
- A feature that is frequently observed in asset price data is that a significant change in the price of an asset is often followed by a period of high volatility.
- The class of autoregressive models with conditional heteroscedasticity of order \(p\)—the \(ARCH(p)\) models—is defined by the relation
$$x_t=\mu+w_t\sqrt{\alpha_0+\sum_{k=1}^p \alpha_k(x_{t-k}-\mu)^2},$$
where \(w_t \sim \text{iid N}(0,\sigma_w^2)\).
- Because of the scaling of \(w_t\) with a term that is bigger as previous values of \(x_t\) are farther from their mean \(\mu\), ARCH models go towards capturing this feature.
- If \(p=1\), the \(ARCH(1)\) model is
$$x_t=\mu+w_t\sqrt{\alpha_0+\alpha(x_{t-1}-\mu)^2}$$
(a simulation sketch is given after this list).
- ARCH models have been used for modelling financial time series. If \(z_t\) is the price of an asset at the end of the \(t\)-th trading day, set \(x_t=\log(z_t/z_{t-1})\) (the daily return on day \(t\)).
- Note:
  - Setting all \(\alpha_k=0\), \(k=1,\ldots,p\), will “switch off” the “ARCH” behaviour, and we will be back with a white noise process of variance \(\alpha_0\) (plus constant mean \(\mu\)).
  - So what we are modelling here is purely the variance of the process \(x_t\), given its past (\(p\)) values.
  - If we extend the idea of conditional heteroskedasticity to an ARMA structure (rather than just AR), we get the so-called “GARCH” models.
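A minimal \(ARCH(1)\) simulation sketch (the parameter values are assumed for illustration only, with \(\mu=0\) and \(\sigma_w^2=1\)):
set.seed(1)
n <- 1000; mu <- 0; a0 <- 0.5; a1 <- 0.6
w <- rnorm(n)
x <- numeric(n)
x[1] <- mu + w[1] * sqrt(a0)
for (t in 2:n) x[t] <- mu + w[t] * sqrt(a0 + a1 * (x[t - 1] - mu)^2)
par(mfrow = c(2, 1))
plot.ts(x, main = "Simulated ARCH(1)")   # bursts of volatility after large moves
acf((x - mu)^2, main = "")               # dependence shows up in the squared series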
References #
Shumway, Robert H., and David S. Stoffer. 2017. Time Series Analysis and Its Applications: With R Examples. Springer.