Correlation between two non-stationary time series

#1
Hey everyone,

I'm trying to figure out whether there is a significant correlation between two non-stationary time series (e.g. temperature on a given day and ice cream sales). Could someone please propose a method? There are so many different methods I've read about online that I'm not sure which to go with. Could someone also explain whether a Monte Carlo method would work, and how?

I would also like to do regression analysis on the time series, i.e. figure out how much one variable changes in response to a change in the other.

Thanks so much in advance!
 

noetsi

Fortran must die
#2
I can propose many; none are easy. In fact, to me this is one of the most difficult questions you can ask, and I spent much of the last two years looking for a good approach. Multivariate ARIMA (some call this ARIMAX) is one approach. It is an extremely painful process because of the need to prewhiten the input series. I believe you can achieve the same thing with transfer function models, but I know only the vaguest amount about that approach. Autoregressive distributed lag (ARDL) models can also address this; if the series are non-stationary you have to difference each one, I believe. At least in some accounts, if the series have different orders of integration and are not cointegrated, this will not work.

There are many others, including dynamic regression, VAR models and error correction models, but I have limited experience with those. I suggest looking at either multivariate ARIMA or autoregressive distributed lag models and seeing what you think. Good luck.
 
#3
Thanks so much for the quick reply!!

Do you think I could break up the two series into different seasons (to get multiple stationary time series) and then find the correlation between them (e.g. look at ice cream sales vs. temperature only in summer)? Would this decrease my chances of getting accurate results? I'm assuming finding the correlation between stationary time series is easier?

Also, could you please recommend good texts that discuss analyzing two non-stationary time series?
 

staassis

Active Member
#4
Whatever model you use to capture the relationship, the bottom line is the following: to be on the safe side, transform the two series into their stationary counterparts (by differencing, log-differencing, removing deterministic trends / seasonalities, etc). Then model the relationship between the two stationary counterparts and map it back into the original time series.

If you estimate correlation between two non-stationary series, the estimate will be consistent only if the two time series are cointegrated.
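
The original question also asked about Monte Carlo. One way it can illustrate the point above is a small simulation, sketched here under stated assumptions: both series are pure, independent random walks (so there is no true relationship), and the sample size, number of simulations, threshold and function name `spurious_rate` are arbitrary choices made for illustration. It counts how often the sample correlation looks "large" in levels versus in first differences.

```python
import numpy as np

def spurious_rate(n_obs=200, n_sims=2000, threshold=0.5, seed=0):
    """Share of simulations with |corr| > threshold, in levels and in differences."""
    rng = np.random.default_rng(seed)
    hits_levels = 0
    hits_diffs = 0
    for _ in range(n_sims):
        # Two independent random walks: by construction they are unrelated.
        x = np.cumsum(rng.standard_normal(n_obs))
        y = np.cumsum(rng.standard_normal(n_obs))
        if abs(np.corrcoef(x, y)[0, 1]) > threshold:
            hits_levels += 1
        # First differences of a random walk are stationary white noise.
        if abs(np.corrcoef(np.diff(x), np.diff(y))[0, 1]) > threshold:
            hits_diffs += 1
    return hits_levels / n_sims, hits_diffs / n_sims

levels_rate, diffs_rate = spurious_rate()
print(f"|r| > 0.5 in levels: {levels_rate:.1%}, in differences: {diffs_rate:.1%}")
```

If the levels show large correlations far more often than the differences do, that is the spurious-correlation problem in action: the usual significance test for a correlation cannot be trusted on the raw non-stationary series.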
 
#5
Is there any way to test the correlation without modelling the two time series? For example, after detrending and confirming that they are cointegrated, is there a test analogous to cor.test that I could use to test the correlation between the two variables?

Also, can I take the first difference of a time series describing daily weather and then use adf.test on the differenced series to test the unit root hypothesis? Is this the correct way to test stationarity?
 

noetsi

Fortran must die
#6
You can run cross-correlations in ARIMA; this is essentially the correlation between two time series at different lags. But you have to do ARIMA analysis on the series first (such as differencing), so I am not sure whether this is what you mean by not modeling the time series.
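
As a rough illustration of a cross-correlation on differenced series, here is a minimal sketch. It only differences the data and does not prewhiten, so it is not the full ARIMA procedure. The data are synthetic, and the helper `cross_corr`, the one-day lag and the 0.8 coefficient are all made up for the example.

```python
import numpy as np

def cross_corr(x, y, max_lag=5):
    """Correlation between x[t - k] and y[t] for k = 0..max_lag (x leading y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    out = {}
    for k in range(max_lag + 1):
        xs = x[: len(x) - k] if k else x   # x shifted back by k periods
        ys = y[k:]
        out[k] = float(np.corrcoef(xs, ys)[0, 1])
    return out

rng = np.random.default_rng(1)
n = 500
temp = np.cumsum(rng.standard_normal(n))   # synthetic non-stationary "temperature"
d_temp = np.diff(temp)                     # first difference is stationary here

# Hypothetical sales changes that respond to yesterday's temperature change.
lagged_d_temp = np.concatenate(([0.0], d_temp[:-1]))
d_sales = 0.8 * lagged_d_temp + rng.standard_normal(n - 1)

ccf = cross_corr(d_temp, d_sales, max_lag=3)
best_lag = max(ccf, key=lambda k: abs(ccf[k]))
print(ccf, "strongest at lag", best_lag)
```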

There are many ways to test stationarity. Augmented Dickey-Fuller is one (I assume this is what you mean by ADF), but like all unit root tests it is limited by low power, especially when near-unit roots occur. One recommendation that makes sense to me is to run ADF, which has a null of non-stationarity I think, together with another test that has the exact opposite null. Then if both suggest non-stationarity (or stationarity) you can be more confident of the results.

Note that ADF is testing for stochastic non-stationarity. A separate type is deterministic non-stationarity, which is commonly handled through regression, not ARIMA (that is, you do not address this type of non-stationarity through differencing). If you assume one form of non-stationarity and it is actually the other, you will often get the wrong results. There is a procedure that uses ADF to determine whether the non-stationarity is deterministic or stochastic, but it is not simple. I suggest looking up deterministic non-stationarity to explore this issue.
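
To make the "before and after differencing" check concrete, here is a bare-bones sketch of the plain (non-augmented) Dickey-Fuller regression with a constant, written with numpy only. It is a simplification: a real ADF test adds lagged differences and uses proper p-values, and a test with the opposite null (KPSS is one example) is not implemented here. The function name `dickey_fuller_t` and the simulated random walk are assumptions for illustration. The statistic is compared with -2.86, the asymptotic 5% critical value for the constant-only case.

```python
import numpy as np

def dickey_fuller_t(y):
    """t-statistic on rho in: diff(y)[t] = alpha + rho * y[t-1] + e[t]."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    sigma2 = resid @ resid / (len(dy) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return float(beta[1] / np.sqrt(cov[1, 1]))

DF_CRIT_5PCT = -2.86  # asymptotic 5% critical value (constant, no trend)

rng = np.random.default_rng(2)
walk = np.cumsum(rng.standard_normal(1000))   # simulated unit-root series

t_level = dickey_fuller_t(walk)               # levels: null of a unit root is true
t_diff = dickey_fuller_t(np.diff(walk))       # differences: should reject strongly
print(f"levels: {t_level:.2f}, differences: {t_diff:.2f}, reject if below {DF_CRIT_5PCT}")
```

The unit root null is rejected only when the statistic falls below the critical value, so a stationary differenced series should give a strongly negative value.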
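
A small sketch of why the distinction matters, assuming a synthetic series that is a straight-line trend plus white noise (the slope, intercept and helper `lag1_autocorr` are invented for the example). Removing the trend by regression leaves roughly uncorrelated residuals, whereas differencing the same series introduces negative lag-1 autocorrelation of about -0.5 that was not in the data (over-differencing).

```python
import numpy as np

def lag1_autocorr(x):
    """Sample autocorrelation at lag 1."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float((x[1:] @ x[:-1]) / (x @ x))

rng = np.random.default_rng(3)
n = 5000
t = np.arange(n)
# Deterministic non-stationarity: linear trend plus white noise.
trend_stationary = 2.0 + 0.05 * t + rng.standard_normal(n)

# Handle it by regression on time and keep the residuals.
coeffs = np.polyfit(t, trend_stationary, deg=1)
detrended = trend_stationary - np.polyval(coeffs, t)

# Handle it (wrongly, for this series) by differencing.
differenced = np.diff(trend_stationary)

r_detrended = lag1_autocorr(detrended)
r_differenced = lag1_autocorr(differenced)
print(f"lag-1 autocorrelation: detrended {r_detrended:.2f}, differenced {r_differenced:.2f}")
```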
 

JesperHP

TS Contributor
#7
If you are going for the effect on a day-by-day basis, I would first-difference temp. and sales, do ADF on both (before and after), and if the series are stationary after first differencing, try a model where first-differenced sales are regressed on first-differenced temp. plus a lag of first-differenced temp. You can always criticize the model, but this one is very simple and worth a shot.
 

noetsi

Fortran must die
#8
The problem with that approach, JesperHP, is that even when stationary the two variables are still time series. So if autoregressive patterns exist in the errors, the standard errors won't be correct. At the least you should conduct a test like Durbin's T for AR error (not Durbin-Watson, which only deals with first-order AR). If you find this problem you have to try something like ARDL or ARIMA with transfer functions.
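
Here is a sketch of a residual check of this kind. I am assuming the test meant is the regression-based one in which the OLS residuals are regressed on the original regressors plus the lagged residual, and the t-statistic on the lagged residual is examined; that reading is a guess, and only one lag is checked here. The data are synthetic, with AR(1) errors built in on purpose, and all coefficients and names are made up.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients, residuals and conventional t-statistics."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, resid, beta / se

rng = np.random.default_rng(4)
n = 400
d_temp = rng.standard_normal(n)              # stand-in for differenced temperature

# Errors with first-order autocorrelation (rho = 0.6), built in deliberately.
shocks = rng.standard_normal(n)
u = np.zeros(n)
for i in range(1, n):
    u[i] = 0.6 * u[i - 1] + shocks[i]

# Model from post #7: differenced sales on differenced temp and its first lag.
d_sales = 1.0 + 2.0 * d_temp[1:] + 0.5 * d_temp[:-1] + u[1:]
X = np.column_stack([np.ones(n - 1), d_temp[1:], d_temp[:-1]])
_, resid, _ = ols(X, d_sales)

# Auxiliary regression: residual on the regressors plus the lagged residual.
X_aux = np.column_stack([X[1:], resid[:-1]])
_, _, t_aux = ols(X_aux, resid[1:])
t_rho = float(t_aux[-1])
print(f"t-statistic on lagged residual: {t_rho:.2f}")
```

A large t-statistic on the lagged residual points to serially correlated errors, in which case the usual OLS standard errors from the first regression should not be relied on.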
 

JesperHP

TS Contributor
#9
"So if autoregressive patterns exist the standard errors won't be correct."
I must admit my time series analysis is a bit rusty; nevertheless, I would argue this is not necessarily correct. The model I suggested was:

\(y_t := \Delta sales_t = sales_t - sales_{t-1}\)
\(z_t := \Delta temp_t = temp_t - temp_{t-1}\)

\( y_t =\beta_0 + \beta_1 y_{t-1} + \beta_2 z_t + \beta_3 z_{t-1} + u_t\)

Which allows for autoregressive patterns....

I was implicitly (and, admittedly, probably a little too implicitly) suggesting the use of assumptions TS1'-TS5' in chapter 11, pages 386-388, particularly Theorem 11.2, of Wooldridge, which justifies using the usual OLS standard errors, t-tests, F-tests and LM tests.

Using the assumptions of course means you have to test for homoscedasticity and lack of serial correlation. More generally, the assumption is one of dynamic completeness, which is definitely NOT true of the model, since the sales of ice cream will depend on many other factors, both sector-specific and macroeconomic. But that is an entirely different story.

Anyway, chapters 11 and 12 of the linked book should be of interest to you, s_chrodinger.
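
For anyone who wants to try the model in this post, \( y_t =\beta_0 + \beta_1 y_{t-1} + \beta_2 z_t + \beta_3 z_{t-1} + u_t\), here is a minimal OLS sketch with numpy. Everything about the data is an assumption: the series are simulated so that the model is true by construction, and the "true" coefficients in `true_beta` are arbitrary. With real sales and temperature data only the differencing and the design matrix would carry over.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 2000
true_beta = np.array([0.5, 0.3, 2.0, 1.0])   # beta_0..beta_3, made up

# Simulate differenced series that obey the model, then build the levels.
z = rng.standard_normal(n)                   # differenced temperature
u = rng.standard_normal(n)                   # error term
y = np.zeros(n)                              # differenced sales
for t in range(1, n):
    y[t] = (true_beta[0] + true_beta[1] * y[t - 1]
            + true_beta[2] * z[t] + true_beta[3] * z[t - 1] + u[t])
temp = np.cumsum(z)
sales = np.cumsum(y)

# Estimation starts from the levels, as it would with real data.
d_sales = np.diff(sales)
d_temp = np.diff(temp)
target = d_sales[1:]
X = np.column_stack([
    np.ones(len(target)),   # beta_0
    d_sales[:-1],           # y_{t-1}
    d_temp[1:],             # z_t
    d_temp[:-1],            # z_{t-1}
])
beta_hat, *_ = np.linalg.lstsq(X, target, rcond=None)
print("estimated coefficients:", np.round(beta_hat, 2))
```

In this setup beta_2 is the same-day response of the change in sales to a change in temperature, which is the "how much does one variable move with the other" quantity asked about in #1.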
 

JesperHP

TS Contributor
#10
OK, rereading my post #7 I see that the model in #9 is not what I suggested, which makes noetsi's point correct. Nevertheless, the point remains that the model could easily be corrected as in #9 to allow for lags of the dependent variable, and that it probably remains invalid for entirely different reasons.