Yeah, I thought this was what you were after. Square brackets [ ] can be used to grab rows, columns, and individual data points. In this situation they tell the function which variables to use from the dataset. Columns are listed second when using [ ], which is why it is [ , then the target variable]. It is like matrix algebra, I think: [row, column].
So [1,] grabs first row;
So [,1] grabs first column;
So [1,1] grabs datum in first row and first column.
Your columns have names, so instead of listing the column location you list its name.
So dataset is uschange.
Columns are Income and Consumption.
You are saying in dataset uschange, use column Income.
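To make the indexing rules above concrete, here is a small sketch. The matrix `toy` is a made-up stand-in for uschange (which ships with the fpp2 package), so the numbers are hypothetical; only the bracket syntax matters.

```r
# Hypothetical stand-in for uschange: a small matrix with named columns.
# matrix() fills column by column, so Income is 0.5, 0.7, 0.9.
toy <- matrix(c(0.5, 0.7, 0.9,
                0.6, 0.4, 0.8),
              ncol = 2,
              dimnames = list(NULL, c("Income", "Consumption")))

first_row    <- toy[1, ]         # row 1, all columns
first_column <- toy[, 1]         # all rows, column 1
first_datum  <- toy[1, 1]        # the single value in row 1, column 1
income       <- toy[, "Income"]  # same column as toy[, 1], picked by name

print(first_row)
print(income)
```

So uschange[, "Income"] reads as "all rows of uschange, column Income".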
I have a question about auto.arima with regression that I am hoping someone can answer.
For regression with ARIMA errors, everything is integrated with the same order [ignoring seasonality]. In multivariate ARIMA, since you prewhiten each predictor, they could in theory be integrated of different orders. I am not sure what auto.arima does when you specify a regressor, although I suspect the former.
Does anyone know? And if there is one order of integration, what exactly is being tested to determine the order of differencing? The regression residuals?
A related question. When you are predicting future Y, that is, when X impacts Y say a year from now, how does this work with auto.arima? (I spent a lot of time on this today and have not found an answer.) Do you have to predict the future X first and provide it to auto.arima, or does auto.arima predict the future X itself and then use those values to predict future Y? If you believe that X at time t influences Y at time t+1, would you just provide the X from one period before?
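On the order-of-integration question, here is a rough check using base R's arima() rather than auto.arima itself. My understanding is that when xreg is supplied and d > 0, the same differencing is applied to the response and to every regressor. I believe the forecast package follows the same convention, but that is an assumption worth checking against its documentation. The data below are simulated, and every name is made up for the example.

```r
set.seed(42)
n   <- 200
x   <- cumsum(rnorm(n))                                      # integrated regressor
eta <- cumsum(as.numeric(arima.sim(list(ar = 0.5), n = n)))  # ARIMA(1,1,0) errors
y   <- 2 * x + eta                                           # true slope is 2

# Regression with ARIMA(1,1,0) errors, fitted on the levels.
fit_levels <- arima(y, order = c(1, 1, 0), xreg = cbind(x = x))

# The same model fitted by hand on the differenced series.
fit_diffs <- arima(diff(y), order = c(1, 0, 0),
                   xreg = cbind(x = diff(x)), include.mean = FALSE)

print(coef(fit_levels))
print(coef(fit_diffs))
```

If the two sets of coefficients come out very close, that is consistent with one order of differencing being applied to Y and X alike.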
I think you use a training set of X in the auto.arima function to determine the order, then use projected X to forecast future Y.
I am not sure ARIMA handles lagged regressors, in all honesty.
I think what you do is use the X and Y training set to build the model, then use the whole data set to produce future results. For future predictions you first have to predict all X in the future (assuming you don't choose to test lags of X influencing Y).
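One workaround, sketched here with base R's arima() and simulated data, is to build the lag by hand and pass it in as an ordinary regressor column. The names x_lag1 and keep are made up for the example, and the true effect of 1.5 is just a value I picked.

```r
set.seed(1)
n <- 60
x <- rnorm(n)

# Row t of x_lag1 holds x[t - 1]; the first row has no predecessor.
x_lag1 <- c(NA, x[-n])

# Simulated response: depends on last period's x, plus AR(1) noise.
noise <- as.numeric(arima.sim(list(ar = 0.4), n = n))
y     <- 1.5 * c(0, x[-n]) + noise

# Drop the first row, where the lag is undefined, then fit.
keep <- 2:n
fit  <- arima(y[keep], order = c(1, 0, 0),
              xreg = cbind(x_lag1 = x_lag1[keep]))
print(coef(fit))

# One step ahead, the lagged regressor is simply the last observed x.
fc <- predict(fit, n.ahead = 1, newxreg = cbind(x_lag1 = x[n]))
print(fc$pred)
```

With a one-period lag, the one-step-ahead forecast needs no forecast of X at all, because the required value has already been observed.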
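Here is a minimal sketch of that workflow using base R's arima() and predict(). The data are simulated, and the held-out X values stand in for "projected X"; in a real forecast they would come from a separate forecast or a scenario. With the forecast package, auto.arima(y, xreg = ...) and forecast(fit, xreg = ...) play the same two roles.

```r
set.seed(7)
n <- 84   # total observations
h <- 12   # hold-out length

x <- rnorm(n, mean = 5, sd = 1)
y <- 10 + 0.8 * x + as.numeric(arima.sim(list(ar = 0.5), n = n))

train_idx <- 1:(n - h)
test_idx  <- (n - h + 1):n

# Fit on the training window only.
fit <- arima(y[train_idx], order = c(1, 0, 0),
             xreg = cbind(x = x[train_idx]))

# Forecast the hold-out period; future X has to be supplied by the user.
fc <- predict(fit, n.ahead = h, newxreg = cbind(x = x[test_idx]))

print(fc$pred)  # point forecasts
print(fc$se)    # standard errors
```

The model never forecasts X by itself: if newxreg is left out for a model fitted with xreg, predict() stops with an error.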
I have a data set called mydata. It has three columns in it. Month is a date; the other two are numbers.
When I do str(mydata) it says there are 84 observations. There are actually 85 rows including the variable label row. Does it just ignore the label row?
I assumed so. Since I wanted all data except the last 12 points to be a training data set, as recommended for auto.arima, I did
train=mydata[1:71,] #training data set
mydata$month=NULL #not sure why you do this
valid=mydata[1:83,] #all data points to be used later
fit=auto.arima(train[,"rehab.rate"],approximation = FALSE,stepwise = FALSE,xreg=train$unemployment.rate)
to run the regression with ARIMA errors. Based on other examples, I expected to see an ARIMA model and coefficients generated. I get no error, but no results either. Does anyone know why? I looked at the auto.arima documentation but did not see anything.
I am expecting to see something like this
ARIMA(1,0,1)(1,1,0). I might be missing a function I have to run, but I have not found one.
I don't think anyone here does time series, but for those interested in auto.arima this might be of interest.
The first one is answered by a quick experiment: auto.arima(rnorm(100)) gives you an answer, because auto.arima() will convert your data into a ts() object. It's still good practice to convert manually, to set seasonality and start dates, and so that plots come out right.
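A minimal sketch of the manual conversion, assuming monthly data. The values and the 2010 start date are made up, since the real ones are not given in this thread. The auto.arima call is guarded so the snippet still runs if the forecast package is not installed.

```r
set.seed(123)
rehab <- rnorm(84, mean = 50, sd = 5)  # hypothetical: 84 monthly values

# frequency = 12 marks the series as monthly; start = c(year, month).
rehab_ts <- ts(rehab, start = c(2010, 1), frequency = 12)

print(start(rehab_ts))
print(end(rehab_ts))
print(frequency(rehab_ts))

if (requireNamespace("forecast", quietly = TRUE)) {
  fit <- forecast::auto.arima(rehab_ts)
  print(fit)  # assigning to fit prints nothing; print it to see the model
}
```

In R an assignment such as fit = auto.arima(...) shows nothing on its own; typing fit or calling print(fit) or summary(fit) afterwards displays the selected model and its coefficients.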