tests

noetsi

Fortran must die
#1
I am relearning, in R, all the tests I know from SAS. :p

This is the Breusch-Godfrey test for serial correlation.

Code:
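library(lmtest)  # provides bgtest()
data(mtcars)
model <- lm(mpg ~ disp + hp, data = mtcars)  # fitted here, but the call below uses the formula directly
bgtest(mpg ~ disp + hp, order = 1, data = mtcars)  # Breusch-Godfrey test for serial correlation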

Breusch-Godfrey test for serial correlation of order up to 1 # results

data:  mpg ~ disp + hp
LM test = 3.6211, df = 1, p-value = 0.05705
Below is the R documentation for bgtest. I think I got it right, but I am not sure, because I never ask for the residuals in the call above (this comes from page 2).

bgtest(formula, order = 1, order.by = NULL, type = c("Chisq", "F"),
data = list(), fill = 0)

Arguments
formula: a symbolic description for the model to be tested (or a fitted "lm" object).
order: integer. maximal order of serial correlation to be tested.
order.by: Either a vector z or a formula with a single explanatory variable like ~ z. The observations in the model are ordered by the size of z. If set to NULL (the default) the observations are assumed to be ordered (e.g., a time series).
type: the type of test statistic to be returned. Either "Chisq" for the Chi-squared test statistic or "F" for the F test statistic.
data: an optional data frame containing the variables in the model. By default the variables are taken from the environment which bgtest is called from.

https://cran.r-project.org/web/packages/lmtest/lmtest.pdf

The Ramsey RESET test. It starts on page 36 of https://cran.r-project.org/web/packages/lmtest/lmtest.pdf

resettest(formula, power = 2:3, type = c("fitted", "regressor", "princomp"), data = list(), vcov = NULL, ...)

Code:
library(lmtest)  # provides resettest()
data(mtcars)
model <- lm(mpg ~ disp + hp, data = mtcars)  # fitted here, but the call below uses the formula directly
resettest(mpg ~ disp + hp, power = 2:3, data = mtcars)  # adds squares and cubes of the fitted values (default type = "fitted")

RESET test # results

data:  mpg ~ disp + hp
RESET = 15.45, df1 = 2, df2 = 27, p-value = 3.367e-05
It concerns me that the p-value is so low, although this is only a made-up example. It makes me wonder whether I ran the test right.
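One way to check a result like this is to build the RESET test by hand. The sketch below assumes the default type = "fitted" amounts to adding the squared and cubed fitted values to the regression and running an F test against the original model. The names base_fit, aug_fit and d are made up for the illustration, and this is not taken from the lmtest source.

```r
# Hand-rolled RESET check in base R (a sketch under the assumption above).
data(mtcars)

base_fit <- lm(mpg ~ disp + hp, data = mtcars)

# Copy the data and add powers of the fitted values as extra regressors.
d <- mtcars
d$yhat2 <- fitted(base_fit)^2
d$yhat3 <- fitted(base_fit)^3

aug_fit <- lm(mpg ~ disp + hp + yhat2 + yhat3, data = d)

# F test of the two added terms; the Df and Res.Df columns give df1 and df2.
reset_tab <- anova(base_fit, aug_fit)
print(reset_tab)
```

If the degrees of freedom here match the df1 and df2 in the resettest output, that at least suggests the call was set up as intended.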
 

noetsi

Fortran must die
#2
I don't understand the Jarque-Bera test (but this question is also in part an R question). This test of normality is supposed to be run on the residuals of a time series model, including ARIMA. I don't understand what residuals even mean for a univariate distribution. Is this test actually only used when you are doing time series regression?

The link I found for this suggested running jarque.bera.test(dataset), but their data set did not look like residuals. It looked like a series of univariate distributions.
https://www.statology.org/how-to-conduct-a-jarque-bera-test-in-r/

The more general question I have, which comes from the above, is whether tests in R that rely on residuals can work when all you give them is the regression formula. For example, the Breusch-Godfrey test for serial correlation is being run as
bgtest(mpg~disp+hp, order = 1,data = mtcars )

Will R run the regression, extract the residuals, and then run the test? I am getting answers, but I have not found this addressed anywhere.
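To make the question concrete, here is a base R sketch of what "run the regression, extract the residuals, then test them" would look like for order 1. It assumes the textbook auxiliary regression (residuals on the original regressors plus the lagged residuals, with the missing first lag filled with 0 as in fill = 0) and the Chi-squared form n * R^2. The names fit, lag1 and aux are invented for the example, and this is not copied from bgtest itself.

```r
# Manual Breusch-Godfrey LM test of order 1 (a sketch, not the lmtest code).
data(mtcars)

fit <- lm(mpg ~ disp + hp, data = mtcars)   # step 1: run the regression
res <- unname(residuals(fit))               # step 2: extract the residuals
n   <- length(res)

# Residuals lagged by one observation; the first value has no predecessor,
# so it is filled with 0.
lag1 <- c(0, res[-n])

# step 3: auxiliary regression of the residuals on regressors + lagged residuals
aux <- lm(res ~ disp + hp + lag1, data = mtcars)

lm_stat <- n * summary(aux)$r.squared
p_value <- pchisq(lm_stat, df = 1, lower.tail = FALSE)

cat("LM test =", lm_stat, " df = 1  p-value =", p_value, "\n")
```

If the formula interface really does these steps internally, the statistic printed here should land close to the LM test value in the bgtest output from post #1.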
 

noetsi

Fortran must die
#3
I found out that these tests extract the residuals and then test them, but the Jarque-Bera test (jarque.bera.test) appears to work in R on a univariate time series that has no residuals, which, honestly, I don't understand given what I have read about how this test is supposed to work.
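One possible reading, as a sketch: the Jarque-Bera statistic only needs the skewness and kurtosis of a vector's deviations from its mean, so a raw univariate series can be treated as the residuals of an intercept-only model. The function jb_stat below is a hand-written illustration using the standard formula n/6 * (S^2 + (K - 3)^2 / 4). It is not the tseries implementation.

```r
# Jarque-Bera statistic by hand (illustrative helper, not from any package).
jb_stat <- function(x) {
  n  <- length(x)
  d  <- x - mean(x)            # deviations from the mean
  m2 <- mean(d^2)              # second central moment
  m3 <- mean(d^3)              # third central moment
  m4 <- mean(d^4)              # fourth central moment
  skew <- m3 / m2^1.5
  kurt <- m4 / m2^2
  n / 6 * (skew^2 + (kurt - 3)^2 / 4)
}

data(mtcars)
x <- mtcars$mpg

jb_raw <- jb_stat(x)                       # statistic on the raw series
jb_res <- jb_stat(residuals(lm(x ~ 1)))    # statistic on intercept-only residuals

# Under normality the statistic is compared to a Chi-squared with 2 df.
p_value <- pchisq(jb_raw, df = 2, lower.tail = FALSE)

cat("raw:", jb_raw, " residuals of x ~ 1:", jb_res, " p-value:", p_value, "\n")
```

The two statistics agree because the residuals of x ~ 1 are just x minus its mean, which is all the formula uses.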

I don't understand the underlying code R is written in, and knowing it would help me see what R is actually doing. Does anyone know where I can learn the actual programming language R is built on? (I am guessing it is S or S+, but I don't know.)
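A partial pointer, offered as a sketch: many R functions are themselves written in R (with the lower layers in compiled C and Fortran), and typing a function's name without parentheses prints its source. The example uses sd from base R so it runs without extra packages. With lmtest installed, the same approach should work for lmtest::bgtest.

```r
# Print the R source of a function by evaluating its name without parentheses.
print(sd)

# The same source as a character vector, handy for searching.
src <- deparse(sd)
uses_var <- any(grepl("var", src, fixed = TRUE))
cat("sd() is built on var():", uses_var, "\n")

# With lmtest installed, the test from post #1 can be inspected the same way:
# print(lmtest::bgtest)
```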