# What is this time series code doing?

#### noetsi

##### Fortran must die
(fit<-auto.arima(uschange[,"Consumption],xreg=uschange[,"Income]))

It is what is in the brackets that I am asking about; I don't know what it is doing. It is figure 9.1 in the link below.
https://otexts.com/fpp2/regarima.html

#### Dason

I find it odd that you typed that out instead of copy/pasting. You missed the closing quotation mark on each of the column names.

But to answer the question: by wrapping everything in parentheses, R will automatically print the result. Typically, if you store a result in a variable, it isn't printed automatically.

Code:
> a <- 3
> # it didn't print the value of a
> print(a)
[1] 3
> #or just type the variable
> a
[1] 3
> # but we can also get it to print right away by wrapping in ()
> (b <- 102)
[1] 102

#### noetsi

##### Fortran must die
My computer will not let me copy and paste formulas from that link or code. I am not sure why this occurs.

Thanks for explaining what the [] does. One problem with OTexts is that the authors assume you already know a fair amount of R.

#### Dason

Let's be clear. I explained what the () surrounding the entire code chunk was doing. I didn't mention the square brackets at all. Which were you asking about?

#### hlsmith

##### Less is more. Stay pure. Stay poor.
I love it. You can't make this stuff up.

Yeah, I thought this was what you were after. [] can be used to grab rows, columns, and individual values. In this situation it tells the function which variables to use from the dataset. Columns are listed second when using [], which is why it is [ , target variable]. As in matrix algebra, I think: [row, column].

So [1,] grabs first row;
So [,1] grabs first column;
So [1,1] grabs datum in first row and first column.
Your columns have names, so instead of listing the column location you list its name.
So dataset is uschange.
Columns are Income and Consumption.
You are saying in dataset uschange, use column Income.
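The indexing rules above can be tried on a small matrix. This is only a sketch: the matrix `m` and its values are made up for illustration and merely borrow the column names from uschange.

```r
# Hypothetical 3 x 2 matrix with named columns (values are invented).
m <- matrix(c(1.5, 0.9, 0.7, 0.6, 1.2, 1.0), nrow = 3,
            dimnames = list(NULL, c("Consumption", "Income")))

m[1, ]         # first row (both columns)
m[, 1]         # first column (all rows)
m[1, 1]        # the single value in row 1, column 1
m[, "Income"]  # a column selected by name instead of position

# Selecting by name and by position returns the same values.
identical(m[, 2], m[, "Income"])
```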

#### noetsi

##### Fortran must die
I have a question about auto.arima with regression which I am hoping someone knows.

For regression with ARIMA errors, everything is integrated with the same order (ignoring seasonality). In multivariate ARIMA, since you prewhiten each predictor, they could in theory be integrated of different orders. I am not sure what auto.arima does when you specify a regressor, although I suspect the former.

Does anyone know? And if there is one order of integration, what exactly is being integrated (that is, what determines the order of differencing)? The regression residuals?

A related question: when you are predicting future Y, say when X impacts Y a year from now, how does this work with auto.arima? (I spent a lot of time on this today and have not found an answer.) Do you have to predict the future X first and provide it to auto.arima, or does auto.arima predict the future X itself and then use those values to predict future Y? If you believe that X influences Y one period later (X at time t affects Y at t+1), would you just provide the X from one period before?

I think you use a training set of X in the auto.arima function to determine the order, then use projected X to forecast future Y.

Honestly, I am not sure ARIMA handles lagged regressors.

#### noetsi

##### Fortran must die
I think what you do is use the X and Y training set to build the model, then use the whole data set to produce future results. For future predictions you first have to predict all future X (assuming you don't choose to test lags of X influencing Y).
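The workflow described above (fit on a training window, then supply future X to get future Y) can be sketched as follows. This is an illustration under loud assumptions: the data are simulated, and base R's `arima()` with a fixed order stands in for auto.arima so the example runs without extra packages. With the forecast package, the analogous step is passing the future regressor values through the `xreg` argument of `forecast()`.

```r
set.seed(42)
n <- 84
x <- rnorm(n, mean = 5, sd = 0.5)                  # hypothetical regressor
e <- as.numeric(arima.sim(list(ar = 0.5), n = n))  # AR(1) errors
y <- 10 + 2 * x + e                                # hypothetical response

train_idx <- 1:72
test_idx  <- 73:84

# Regression with AR(1) errors fitted on the training window only.
# (Stand-in for auto.arima: the order is fixed here, not selected.)
fit <- arima(y[train_idx], order = c(1, 0, 0), xreg = x[train_idx])

# The model does not forecast x itself: future x must be supplied.
# Here the held-out x values play the role of "predicted" x.
fc <- predict(fit, n.ahead = length(test_idx), newxreg = x[test_idx])
fc$pred

# If x is believed to act with a one-period delay, one option is to
# align y at time t with x at time t - 1 before fitting.
y_now  <- y[-1]  # y_2, ..., y_n
x_prev <- x[-n]  # x_1, ..., x_(n-1)
fit_lag <- arima(y_now, order = c(1, 0, 0), xreg = x_prev)
```

With a one-period lag, the one-step-ahead forecast only needs the last observed x, so no separate prediction of x is required for that single step.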

#### noetsi

##### Fortran must die
One author suggested this.

"Note that you should set the parameters stepwise and approximation parameters to FALSE if you are dealing with a single data set to ensure that you obtain a well-fitting model."

How would you do this? Like this: ,stepwise=FALSE, approximation=FALSE)
Do those have to be given in a specific order? auto.arima has many, many parameters.
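On the ordering question: R matches arguments that are written as `name = value` by name, so their order in the call does not matter. The sketch below shows this with a hypothetical stand-in function, `fit_demo`, which is not part of any package. The auto.arima call at the end only runs if the forecast and fpp2 packages happen to be installed; it assumes the uschange data from the linked book.

```r
# Hypothetical stand-in that simply reports the settings it received.
fit_demo <- function(y, stepwise = TRUE, approximation = TRUE) {
  list(stepwise = stepwise, approximation = approximation)
}

y <- c(1, 2, 3)
a <- fit_demo(y, stepwise = FALSE, approximation = FALSE)
b <- fit_demo(y, approximation = FALSE, stepwise = FALSE)
identical(a, b)  # named arguments give the same result in either order

# The same call style with auto.arima (skipped if packages are missing).
if (requireNamespace("forecast", quietly = TRUE) &&
    requireNamespace("fpp2", quietly = TRUE)) {
  data("uschange", package = "fpp2")
  fit <- forecast::auto.arima(uschange[, "Consumption"],
                              xreg = uschange[, "Income"],
                              approximation = FALSE,
                              stepwise = FALSE)
  print(fit)
}
```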

#### noetsi

##### Fortran must die
A totally different question.

If it helps here is the documentation
auto.arima

I have a data set called mydata. It has three columns in it. Month is a date; the other two are numbers.

When I do str(mydata) it says there are 84 observations. There actually are 85 including the variable label row. Does it just ignore the label row?
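How the label row is treated depends on how the file was read, which the post does not say. As a sketch, assuming the data came in through `read.csv` (where `header = TRUE` is the default), the first line becomes the column names and is not counted as an observation. The three data rows below are invented.

```r
# Hypothetical CSV content: one header line plus three data rows.
csv_text <- "month,rehab.rate,unemployment.rate
2015-01-01,50.1,5.2
2015-02-01,49.8,5.1
2015-03-01,51.0,5.0"

mydata_demo <- read.csv(text = csv_text)  # header = TRUE by default

nrow(mydata_demo)   # counts data rows only; the header line is excluded
names(mydata_demo)  # the header line became the column names
str(mydata_demo)
```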

I assumed so. Since I wanted all data except the last 12 points to be a training data set, as recommended for auto.arima, I did
train=mydata[1:71,] #training data set
mydata$month=NULL #not sure why you do this
valid=mydata[1:83,] #all data points to be used later
fit=auto.arima(train[,"rehab.rate"],approximation = FALSE, stepwise = FALSE,xreg=train$unemployment.rate)

to run the regression with ARIMA error. I expected based on other examples to see an arima model and coefficients generated. I get no error, but no results either. Does anyone know why? I looked at the auto-arima documentation but I did not see anything.

I am expecting to see something like this. I might be missing a function I have to run, but I have not found any.

ARIMA(1,0,1)(1,1,0)[52]

Coefficients:
         ar1      ma1     sar1
      0.7520  -0.1921  -0.5759
s.e.  0.0696   0.0958   0.0603
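One possible explanation, in line with the parentheses discussion earlier in the thread, is that the model was fitted and stored in fit but never printed, because an assignment on its own produces no output. The sketch below illustrates this with simulated data. The column names are borrowed from the post, the values are invented, and base R's `arima()` with a fixed order stands in for auto.arima so that it runs without extra packages. An auto.arima result is displayed in the same ways.

```r
set.seed(1)
# Hypothetical stand-in for mydata: 84 rows, values invented.
unemployment.rate <- rnorm(84, mean = 5, sd = 0.5)
mydata <- data.frame(
  rehab.rate = 50 + 2 * unemployment.rate + rnorm(84),
  unemployment.rate = unemployment.rate
)
train <- mydata[1:71, ]

# Assignment alone is silent: the fit is stored, nothing is shown.
fit <- arima(train[, "rehab.rate"], order = c(1, 0, 0),
             xreg = train$unemployment.rate)

# Either of these displays the stored model and its coefficients.
print(fit)
fit

# Wrapping the assignment in parentheses stores and prints in one step.
(fit <- arima(train[, "rehab.rate"], order = c(1, 0, 0),
              xreg = train$unemployment.rate))
```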