Comparison of time series

#1
Hi there,

First, I'm new here, so hi everyone, and please be patient if I do something wrong. Apologies for my English; I hope everything is clear :)

I have a problem with the "comparison" of two time series (TS): measured observations and a generated TS. I would like to decide whether the second TS is statistically similar enough to the first one that it can replace it. Can anyone help me?

I can already say that most of the descriptive characteristics are similar (means, standard deviations, higher moments such as skewness and kurtosis, quantiles, etc.), but I don't think that is enough. For one thing, I am dealing with bimodal distributions, which may cause difficulties, though maybe not.

So what kind of analysis should I perform? A friend advised me to look at autocorrelation functions, cross-correlation, coherence, PSD (power spectral density) and similar techniques, but I am not familiar with them.

I'm using MATLAB, and if any figures are needed, I can post them right away.

Any help is appreciated.

Thanks, Jan.
 
schwartzaw

Guest
#3
You may find that some simple measures of accuracy are a good place to start. Summary statistics like the mean and standard deviation won't be very helpful with time series data, where a value typically depends on prior values (e.g., today's stock market price is influenced by yesterday's). I could easily construct two time series with exactly opposite trends that have the same mean and standard deviation.
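
To make that last point concrete, here is a minimal Python sketch (the numbers are made up for illustration, and the helper names `summary` and `lag0_correlation` are my own): a rising series and the same values in reverse order share their mean and standard deviation, yet they move in opposite directions.

```python
import statistics

# Hypothetical data: an upward trend and the same values in reverse order.
rising = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
falling = rising[::-1]

def summary(series):
    """Return (mean, population standard deviation) of a series."""
    return statistics.mean(series), statistics.pstdev(series)

def lag0_correlation(x, y):
    """Pearson correlation between two equal-length series (no lag)."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

print("rising :", summary(rising))
print("falling:", summary(falling))
print("correlation:", lag0_correlation(rising, falling))
```

The two summaries agree, while the correlation comes out at (or extremely close to) -1, so any check that ignores the ordering of the values cannot tell these two series apart.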

From Minitab 15's help:

Measures of accuracy (time series analysis)
Use these statistics to compare the fits of different forecasting and smoothing methods. Minitab computes three measures of accuracy of the fitted model: MAPE, MAD, and MSD. The three measures are not very informative by themselves, but you can use them to compare the fits obtained by using different methods. For all three measures, smaller values generally indicate a better fitting model.

Mean absolute percentage error (MAPE) – Expresses accuracy as a percentage of the error. Because this number is a percentage, it may be easier to understand than the other statistics. For example, if the MAPE is 5, on average the forecast is off by 5%.

Mean absolute deviation (MAD) – Expresses accuracy in the same units as the data, which helps conceptualize the amount of error. Outliers have less of an effect on MAD than on MSD.

Mean squared deviation (MSD) – A commonly used measure of accuracy of fitted time series values. Outliers have more influence on MSD than on MAD.

For example, you have sales data for 36 months and you would like to find a prediction model. You try two models: single exponential smoothing (SES) and linear trend, and get the following results:

          SES    Linear Trend
MAPE   8.1976          6.9551
MAD    3.6215          2.7506
MSD   22.3936         11.2702

All three numbers are lower for the linear trend model compared to the single exponential smoothing method; therefore, the linear trend model seems to provide the better fit.
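
If you want to compute these three measures yourself, a minimal Python sketch could look like the one below. It assumes the usual definitions (MAPE as the mean of |error / actual| times 100, MAD as the mean of |error|, MSD as the mean of squared errors); the function name `accuracy_measures` and the sample values are hypothetical, not taken from Minitab.

```python
def accuracy_measures(actual, fitted):
    """Return (MAPE, MAD, MSD) for two equal-length sequences.

    MAPE is expressed in percent and is undefined (division by zero)
    if any actual value is zero.
    """
    if len(actual) != len(fitted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    n = len(actual)
    errors = [a - f for a, f in zip(actual, fitted)]
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    mad = sum(abs(e) for e in errors) / n
    msd = sum(e * e for e in errors) / n
    return mape, mad, msd

# Hypothetical values: measured observations and a generated series.
measured = [10.0, 20.0, 40.0, 50.0]
generated = [11.0, 18.0, 44.0, 50.0]

mape, mad, msd = accuracy_measures(measured, generated)
print(f"MAPE = {mape:.2f}%  MAD = {mad:.2f}  MSD = {msd:.2f}")
```

For the original question, the measured observations would play the role of `actual` and the generated series the role of `fitted`. This only makes sense if the two series are aligned point by point in time.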
You might also be able to use a measure like R-squared to understand how good the fit is, since you can run a linear regression on time series data.
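
As a rough illustration of the R-squared idea, the Python sketch below regresses one series on the other with ordinary least squares and reports R-squared. Regressing the generated series on the measured one is my assumption about how to apply it here, and the function name `r_squared` and the data are hypothetical.

```python
def r_squared(x, y):
    """R-squared of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares around the fitted line and the mean.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Hypothetical data: the generated series is exactly twice the measured one.
measured = [1.0, 2.0, 3.0, 4.0, 5.0]
generated = [2.0, 4.0, 6.0, 8.0, 10.0]
print("R-squared:", r_squared(measured, generated))
```

One caveat: in this example the fit is perfect even though every generated value is double the measured one. A high R-squared only says the two series are linearly related, so it is worth checking it alongside an error measure such as MAD or MSD.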