# When is regression cross sectional and when is it time series?

#### noetsi

##### Fortran must die
This is not an issue of autocorrelation but of the accuracy of the results, that is, bias. I am concerned exclusively with my results not being biased, since I work with essentially entire populations. Simply put, I would always rather work with cross-sectional data than with any time series method I know for multivariate approaches. This is because every time series method I know is invariably far more difficult to run or interpret than cross-sectional approaches, and violations of assumptions are commonly far harder to detect as well. Moreover, non-stationarity is nearly certain in our dependent variable over time (although no one knows if this is true of the relationship of X and Y).

How does one know when it is appropriate to use a time series method rather than a cross-sectional one? I used to think this was simple: if X occurs before Y, then you do time series. But of course X always occurs before Y if it influences Y.

Analysis of time series usually deals with lags, so X influences Y at one lag, for example. With our data, spending influences (we assume) results, so types and quantity of spending are the predictors and, say, income gain is the result. But spending can take place at many, many points in time; it is rare for it to occur at just one point. And the result takes place years after the spending, with a customer receiving many types of spending over time. Further, one case can have the spending at, say, a year before the result is measured, another at six months, and so on. It is not even clear when the lags are happening here, another reason I would prefer cross-sectional approaches.

So when is it valid to use cross-sectional approaches with data gathered over time? (The organization we report to uses OLS, not a time series approach, although I am doubtful they have considered this issue.)

#### noetsi

##### Fortran must die
While I am asking this, after countless hours studying such: when you are testing for stationarity with, say, ADF or Phillips-Perron, do you test the individual variables for stationarity, or do you test the residuals?

#### hlsmith

##### Less is more. Stay pure. Stay poor.
I believe the issue with OLS is the dependence in the residuals and not addressing a related measure of error. So people get away with using OLS, but they need to realize the error measures will be overly optimistic, and at least robust errors are needed. But you know this.

I haven't gotten into formal tests of stationarity yet; I imagine they look at the autocorrelation, etc. From what I have heard, one of the best checks for stationarity is just getting a large series and visualizing it. If you have a large series, you should have greater confidence.

#### noetsi

##### Fortran must die
You can use Newey-West errors to deal with autocorrelation. The much more concerning issue to me is bias, which occurs, for example, when you leave out a lag of Y or X, or alternately when you have non-stationarity in your variables and they are not integrated of the same order. Cointegration can work, but apparently if some predictors are integrated as I(2) and some as I(1), then nothing, including cointegration, works. Note that even with cointegration you have to use error correction models [which I will have to learn] rather than OLS. And of course you have to know the various cointegration tests.
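For concreteness, here is a minimal pure-Python sketch of a Newey-West (HAC) standard error for the slope in a one-regressor OLS, using Bartlett-kernel weights. All names and the simulated data are my own illustration, not anyone's production code:

```python
import random

def ols_slope(x, y):
    """OLS intercept and slope, plus pieces reused by the HAC formula."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b, sxx, mx

def newey_west_se(x, y, lags):
    """HAC standard error of the OLS slope with Bartlett weights."""
    n = len(x)
    a, b, sxx, mx = ols_slope(x, y)
    e = [yi - a - b * xi for xi, yi in zip(x, y)]
    u = [(xi - mx) * ei for xi, ei in zip(x, e)]  # score contributions
    s = sum(ui ** 2 for ui in u)                  # lag-0 term
    for l in range(1, lags + 1):
        w = 1 - l / (lags + 1)                    # Bartlett weight
        s += 2 * w * sum(u[t] * u[t - l] for t in range(l, n))
    return (s / sxx ** 2) ** 0.5

# simulate y = 1 + 2x with AR(1) errors (phi = 0.8)
random.seed(2)
n = 500
x = [random.gauss(0, 1) for _ in range(n)]
e = [0.0]
for _ in range(n - 1):
    e.append(0.8 * e[-1] + random.gauss(0, 1))
y = [1.0 + 2.0 * xi + ei for xi, ei in zip(x, e)]

_, b, _, _ = ols_slope(x, y)
se = newey_west_se(x, y, lags=5)
```

Note this only repairs the standard errors; it does nothing about the bias from an omitted lag or mixed orders of integration, which is the point above.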

The best approach for non-stationarity is to run tests like ADF and KPSS [which have opposite nulls] and see if they agree. You hope they do... this is because low power is a major issue with stationarity tests, particularly if the process almost has a unit root but does not (e.g., an AR coefficient of .9994).
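A quick way to see why .9994 is so hard to tell from a true unit root: compute the half-life of a shock in an AR(1) process, i.e. the h solving phi**h = 0.5 (a small illustrative function, not from any library):

```python
import math

def half_life(phi):
    """Periods until an AR(1) shock decays to half its size: phi**h = 0.5."""
    return math.log(0.5) / math.log(phi)

# phi = 0.9994 is technically stationary, but a shock takes on the order of
# a thousand periods to halve, so in any realistic sample it behaves almost
# exactly like a random walk, which is why the tests have low power
hl = half_life(0.9994)
```

At phi = 0.5 the half-life is a single period; at phi = 0.9994 it is over a thousand.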

Another author I found made this comment:

I think a wise way to approach making use of such tests is to combine inference from an appropriate version of a test for stationarity, coupled with an appropriate version of a test for unit root:

- Reject test for stationarity, and fail to reject test for unit root: conclude data are stationary.
- Fail to reject test for stationarity, and reject test for unit root: conclude data have unit root (are non-stationary).
- Fail to reject test for stationarity, and fail to reject test for unit root: conclude data are under-powered to make any inference one way or the other.
- Reject test for stationarity, and reject test for unit root: think hard about your data! For example, with tests like those I mentioned above it may be the case that some time series have unit root, and some time series are stationary. It may also be the case that your data are autoregressive (values have memory of their prior states), but are not fully unit root (i.e. the memory decays eventually, instead of carrying forward infinitely).

For a single time series, the augmented Dickey-Fuller test for stationarity, and the complementary Kwiatkowski-Phillips-Schmidt-Shin test for unit root may be appropriate tools, and these tests are commonly implemented in statistical software (e.g., R, Stata, etc.).
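The four cases in that quote can be captured in a tiny helper (a hypothetical sketch; I am reading "reject unit root" as rejecting an ADF-style null of a unit root, and "reject stationarity" as rejecting a KPSS-style null of stationarity, which are the usual conventions for those tests):

```python
def stationarity_verdict(reject_unit_root, reject_stationarity):
    """Combine an ADF-style test (null: unit root) with a KPSS-style
    test (null: stationarity) into the four verdicts quoted above."""
    if reject_unit_root and not reject_stationarity:
        return "stationary"
    if not reject_unit_root and reject_stationarity:
        return "unit root (non-stationary)"
    if not reject_unit_root and not reject_stationarity:
        return "under-powered: no inference either way"
    return "conflicting: think hard about your data"
```

The point of the pairing is that neither test alone distinguishes "stationary" from "under-powered".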

#### hlsmith

##### Less is more. Stay pure. Stay poor.
Yeah, I have my time series exam tomorrow, and we are stuck in the weeds and don't look at the big picture. But perhaps things will eventually come into focus.

I have been looking at unit-root examinations in the complex plane, but not really using any real examples, just getting spoon-fed some characteristic polynomials and asked whether the AR model is stationary. Can you tell me more about what roots bigger than unity mean or represent? If the phi's are < 1 and > -1, the roots seem to be bigger than unity, and if the phi's are beyond those bounds, the model becomes explosive, but is there anything else I need to know?
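The mechanics can be sketched in a few lines of pure Python (my own illustrative names; it handles only AR(1) and AR(2), and assumes phi2 is nonzero in the AR(2) case). The model is stationary exactly when every root of the characteristic polynomial 1 - phi1*z - phi2*z**2 lies strictly outside the unit circle; a root on the circle is the unit-root case, and a root inside it is the explosive case:

```python
import cmath

def ar_char_roots(phi):
    """Roots of 1 - phi1*z (AR(1)) or 1 - phi1*z - phi2*z**2 (AR(2))."""
    if len(phi) == 1:
        return [complex(1 / phi[0])]
    p1, p2 = phi  # rearranged: p2*z**2 + p1*z - 1 = 0, solved by the quadratic formula
    d = cmath.sqrt(p1 * p1 + 4 * p2)
    return [(-p1 + d) / (2 * p2), (-p1 - d) / (2 * p2)]

def is_stationary(phi):
    # stationary iff every characteristic root is strictly OUTSIDE the
    # unit circle (|z| > 1); |z| == 1 is exactly a unit root
    return all(abs(z) > 1 for z in ar_char_roots(phi))
```

For AR(1) the root is just 1/phi, so |phi| < 1 puts it outside the circle, which matches the bounds you describe; for AR(2) the bounds on the individual phi's are no longer enough, and you have to check the roots (or the equivalent triangle conditions).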

#### noetsi

##### Fortran must die
I am not sure a coefficient greater than 1 in absolute value happens much in practice (that would be explosive, beyond a unit root), although as always I know little about the actual math. What a unit root means is that the process has a stochastic trend in it, which is the same as being non-stationary. This has a range of bad impacts: for example, it is impossible to run time series regression (except in the special case of cointegration), and AR and MA terms will not work. You cannot fit ARMA at all to a non-stationary series (the I in ARIMA reflects differencing to deal with non-stationarity, after which you can do ARMA).
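To make the differencing point concrete, here is a small pure-Python sketch (simulated data, my own names): first-differencing a random walk, which has one unit root, recovers the white-noise shocks, and that is what the I in ARIMA(p, 1, q) does before the ARMA(p, q) part is fit:

```python
import random

random.seed(3)
shocks = [random.gauss(0, 1) for _ in range(1000)]

# a random walk (one unit root): y[t] = y[t-1] + e[t], starting from 0
walk = []
level = 0.0
for e in shocks:
    level += e
    walk.append(level)

# first differences: d[t] = y[t] - y[t-1] (with d[0] = y[0], since y starts at 0)
# this strips out the stochastic trend and leaves the stationary shocks
diffs = [walk[0]] + [walk[t] - walk[t - 1] for t in range(1, len(walk))]
```

A series needing two differences would be I(2), which is where the mixed-order problems mentioned earlier come in.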