Forecasting an MA(1) process

Suppose \(x_{t} = w_{t} + \theta w_{t-1}\), where \(w_t\) is white noise with variance \(\sigma_{w}^2\).

Derive the minimum mean square error one-step forecast based on the infinite past and determine the mean square error of this forecast.

Let \(\tilde{x}_n^{n+1}\) be the truncated one-step-ahead forecast based on the \(n\) previous observations. Show that \(E[(x_{n+1}-\tilde{x}_n^{n+1})^2]=\sigma_w^2(1+\theta^{2+2n})\).

The solution for the first part states that
\(x_{n+1} = \sum_{j=1}^{\infty}-(-\theta)^{j}x_{n+1-j} + w_{n+1}\) and \(\tilde{x}_{n+1}=\sum_{j=1}^{\infty}-(-\theta)^{j}x_{n+1-j}\), so the \(MSE = E[(x_{n+1}-\tilde{x}_{n+1})^2] = \sigma_{w}^2\). I am not sure how the expression for \(x_{n+1}\) or \(\tilde{x}_{n+1}\) was arrived at. Can someone explain how this was derived?

For the second part, the solution states \(\tilde{x}_{n}^{n+1}= \sum_{j=1}^{n}-(-\theta)^{j}x_{n+1-j}\) and \(MSE = E[(x_{n+1} - \tilde{x}_{n}^{n+1})^2] =E[(\sum_{j=n+1}^{\infty}-(-\theta)^{j}x_{n+1-j} + w_{n+1})^2]\). I am not sure how this was arrived at.


TS Contributor
Introduce the back-shift operator \( B \) such that \( Bw_t = w_{t-1} \)

Now the MA(1) model can be rewritten as \( x_t = (1 + \theta B)w_t \)

For the time being, let's believe that we may "invert" the operator like this:

\( w_t = (1 + \theta B)^{-1} x_t \)

And also believe that we can apply the geometric series result:

\( w_t = \sum_{j=0}^{+\infty} (-\theta)^j B^j x_t
= \sum_{j=0}^{+\infty} (-\theta)^j x_{t-j} \)
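
If you would rather check the geometric series step numerically than take it on faith, here is a minimal Python sketch. It assumes NumPy, Gaussian white noise, and \( |\theta| < 1 \) (a condition not stated above, but the series does not converge without it). The function name `invert_ma1` and the parameter values are made up for illustration.

```python
import numpy as np

def invert_ma1(x, theta):
    """Approximate w_t by sum_{j>=0} (-theta)^j x_{t-j}, using only the observed past."""
    n = len(x)
    w_hat = np.empty(n)
    for t in range(n):
        j = np.arange(t + 1)                      # lags 0, 1, ..., t
        w_hat[t] = np.sum((-theta) ** j * x[t - j])
    return w_hat

rng = np.random.default_rng(0)
theta = 0.6                                       # illustrative value with |theta| < 1
w = rng.normal(size=201)                          # w[0] is the unobserved pre-sample shock
x = w[1:] + theta * w[:-1]                        # x_t = w_t + theta * w_{t-1}

w_hat = invert_ma1(x, theta)
err = np.abs(w_hat - w[1:])                       # gap between recovered and true shocks
print(err[0], err[10], err[-1])
```

The finite sum only misses the contribution of the pre-sample shock, so the gap shrinks geometrically in \( t \), which is what makes the infinite-sum representation plausible.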

Now combining these facts on \( x_{n+1} \):

\( x_{n+1} = (1 + \theta B)w_{n+1}
= w_{n+1} + \theta B\sum_{j=0}^{+\infty} (-\theta)^j x_{n+1-j} \)

\( = \sum_{j=0}^{+\infty} -(-\theta)^{j+1} x_{n-j} + w_{n+1} \)

\( = \sum_{j=1}^{+\infty} -(-\theta)^j x_{n+1-j} + w_{n+1} \)

The one-step ahead forecast should be the conditional expectation:

\( \tilde{x}_{n+1} = E[x_{n+1}|\mathcal{F}_n]
= \sum_{j=1}^{+\infty} -(-\theta)^j x_{n+1-j} + E[w_{n+1}|\mathcal{F}_n] \)

as the sum is \( \mathcal{F}_n \)-measurable. Note that the latter expectation vanishes because \( w_{n+1} \) is zero-mean white noise, independent of \( \mathcal{F}_n \).
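
To spell out the mean square error that the question asks for, here is a short sketch that uses only the representation of \( x_{n+1} \) and the forecast above. Subtracting one from the other leaves only the new shock:

\( x_{n+1} - \tilde{x}_{n+1} = w_{n+1} \)

\( E[(x_{n+1} - \tilde{x}_{n+1})^2] = E[w_{n+1}^2] = \sigma_w^2 \)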

For the truncated forecast, it is really by definition: truncate the series to the most recent \( n \) terms instead of using the infinite past.

For the last part you may try to write \( x_t \) in terms of those white noises \( w_t \) again and calculate the MSE. By the zero expectation and independence of white noise, the variance/MSE is not hard to calculate.
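
As a sanity check on the target formula \( \sigma_w^2(1+\theta^{2+2n}) \), a small Monte Carlo sketch in Python may help. It assumes NumPy and Gaussian white noise; the function name `truncated_forecast_mse` and its default arguments are hypothetical choices for illustration, not part of the exercise.

```python
import numpy as np

def truncated_forecast_mse(theta, n, sigma_w=1.0, reps=200_000, seed=0):
    """Simulated MSE of the truncated one-step forecast built from x_1, ..., x_n."""
    rng = np.random.default_rng(seed)
    # Each row is one replication; columns are w_0, w_1, ..., w_{n+1}.
    w = rng.normal(scale=sigma_w, size=(reps, n + 2))
    # Columns of x are x_1, ..., x_{n+1}, so x_k sits in column k - 1.
    x = w[:, 1:] + theta * w[:, :-1]
    j = np.arange(1, n + 1)
    coef = -((-theta) ** j)            # weight on x_{n+1-j}
    past = x[:, n - j]                 # x_{n+1-j} is column n - j
    forecast = past @ coef
    return np.mean((x[:, n] - forecast) ** 2)

theta, n = 0.8, 3
print(truncated_forecast_mse(theta, n))    # simulated MSE
print(1 + theta ** (2 * n + 2))            # theoretical value for sigma_w = 1
```

The two printed numbers should agree up to simulation noise. The reason is that the forecast error collapses to \( w_{n+1} - (-\theta)^{n+1} w_0 \) once each \( x_t \) is written in terms of the white noise.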

I am still not sure how you got \( w_{t} = \sum_{j=0}^{\infty}(-\theta)^{j}B^{j}x_{t}\). What do you mean when you say we should "believe" that we can apply the geometric series result? I am lost at this step.


Ambassador to the humans
One way to arrive at that result is to guess that the series acts as an inverse of the operator and then verify that it does (this is how we did it in my time series course; it works, but it isn't very satisfying). Another way is to look for a pattern, starting from the model equation.

\(x_t = w_t + \theta w_{t-1}\) which implies \(w_t = x_t - \theta w_{t-1}\). (1)

But we also know \(x_{t-1} = w_{t-1} + \theta w_{t-2}\) so we can rearrange this to solve for \(w_{t-1}\) and plug that back into (1). If you keep doing these substitutions you might notice a pattern.
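
Here is a sketch of how the first few substitutions go, under the assumption \( |\theta| < 1 \) (not stated in the problem, but needed for the last step):

\( w_t = x_t - \theta w_{t-1} \)

\( = x_t - \theta x_{t-1} + \theta^2 w_{t-2} \)

\( = x_t - \theta x_{t-1} + \theta^2 x_{t-2} - \theta^3 w_{t-3} \)

\( = \sum_{j=0}^{k} (-\theta)^j x_{t-j} + (-\theta)^{k+1} w_{t-k-1} \)

The leftover term has variance \( \theta^{2k+2}\sigma_w^2 \), which goes to zero as \( k \to \infty \) when \( |\theta| < 1 \). In that mean-square sense, \( w_t = \sum_{j=0}^{\infty} (-\theta)^j x_{t-j} \), which is the geometric series result.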