Coefficient of determination and statistical significance

#1
Hi,
I tried two different models for a regression (both with one predictor).

The first was a simple linear regression of the form Y = a + bX. The coefficients appear to be statistically significant (at the 95% confidence level), and the coefficient of determination is relatively high: R^2 = 0.85

The second model was a second-degree polynomial (quadratic) regression. I found that the coefficients were NOT statistically significant. However, I found R^2 = 0.89

How should I interpret this?

Btw: I used the same data set in both regressions, obviously.

Thank you :)
 

Dason

Ambassador to the humans
#3
Well, it should be noted that adding a predictor never decreases R^2, and in practice it almost always increases it, even if the predictor is completely unrelated. You could add a predictor that is the temperatures in Korea during 1947 and it would still nudge your R^2 value up.
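A rough sketch of that point, assuming made-up data (11 points, an arbitrary seed, and a hypothetical helper `r_squared`; none of this is the poster's actual data): fit Y on X, then fit again with a pure-noise column added, and compare the two R^2 values.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on the columns of X.

    X is expected to already contain a column of ones for the intercept.
    """
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ss_res = float(resid @ resid)                  # residual sum of squares
    ss_tot = float(((y - y.mean()) ** 2).sum())    # total sum of squares
    return 1.0 - ss_res / ss_tot

# Synthetic data (assumption): a truly linear relationship plus noise.
rng = np.random.default_rng(0)
n = 11
x = np.linspace(0.0, 10.0, n)
y = 2.0 + 1.5 * x + rng.normal(scale=2.0, size=n)

# A predictor that has nothing to do with y.
unrelated = rng.normal(size=n)

X_base = np.column_stack([np.ones(n), x])
X_more = np.column_stack([np.ones(n), x, unrelated])

r2_base = r_squared(X_base, y)
r2_more = r_squared(X_more, y)
print(f"R^2 with X only:           {r2_base:.4f}")
print(f"R^2 with X + unrelated:    {r2_more:.4f}")
```

Because the smaller model is a special case of the larger one (set the extra coefficient to 0), least squares can never do worse on the residual sum of squares when a column is added, so the second R^2 cannot come out lower than the first.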

Sample size is a concern.

Have you tried graphing the data? What do the residuals look like for the simple linear regression?
 
#4
Well, the sample size was quite small. I have 11 observations. I used one predictor in both models:

Basically

1) Y = a+bX
2) Y = a + bX + cX^2

The residual plot of the linear regression looks relatively random in the sense that it doesn't suggest that the underlying regression function should be quadratic.

I just don't understand how in the second model, all coefficients are statistically insignificant, but at the same time I get a higher coefficient of determination...
 

Dason

Ambassador to the humans
#5
Quoting #4: "I just don't understand how in the second model, all coefficients are statistically insignificant, but at the same time I get a higher coefficient of determination..."
Well, you get a higher R^2 for the reason I stated: adding any predictor will not lower R^2 and will almost always raise it. If it didn't look like a quadratic would help and you only have a sample size of 11, then yes, it makes perfect sense that the other terms wouldn't be significant anymore.
 

Mean Joe

TS Contributor
#6
The coefficient of determination measures how much of the variance in Y is explained by the model. So adding more variables to a model will not lower the R^2, and will usually raise it.

The non-significant result for the coefficients means that they are not statistically distinguishable from 0. Maybe the coefficient estimated from your data is 0.30. That does not mean it is 0, especially with such a small sample. I suppose that with your data the linear model had a larger coefficient (hence a lower p-value); when you added the quadratic term, the coefficient for the linear term became smaller (some of its effect is being taken up by the quadratic term), hence its p-value increased.

Basically the p-value depends on the value of coefficient/standard error for the X term. I'm wondering, since the R^2 does not change much (0.85 to 0.89), if the standard error for the X term also does not change much?
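To make the coefficient / standard error ratio concrete, here is a minimal sketch under stated assumptions: synthetic, truly linear data with 11 points, an arbitrary seed, and a hypothetical helper `ols_t_stats`. It is not the original poster's data, so it only illustrates what can happen to the standard error of the X term when X^2 is added.

```python
import numpy as np

def ols_t_stats(X, y):
    """Return (coefficients, standard errors, t statistics) for an OLS fit.

    X must already include a column of ones for the intercept.
    """
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = float(resid @ resid) / (n - p)   # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)     # covariance of the estimates
    se = np.sqrt(np.diag(cov))
    return beta, se, beta / se

# Synthetic data (assumption): Y really is linear in X.
rng = np.random.default_rng(1)
n = 11
x = np.linspace(1.0, 11.0, n)
y = 3.0 + 2.0 * x + rng.normal(scale=1.5, size=n)

X_lin = np.column_stack([np.ones(n), x])            # Y = a + bX
X_quad = np.column_stack([np.ones(n), x, x ** 2])   # Y = a + bX + cX^2

b_lin, se_lin, t_lin = ols_t_stats(X_lin, y)
b_quad, se_quad, t_quad = ols_t_stats(X_quad, y)

# X and X^2 are strongly correlated over this range.
corr_x_x2 = float(np.corrcoef(x, x ** 2)[0, 1])

print("corr(X, X^2):", round(corr_x_x2, 3))
print("linear model    b, se, t:", b_lin[1], se_lin[1], t_lin[1])
print("quadratic model b, se, t:", b_quad[1], se_quad[1], t_quad[1])
```

In this sketch X and X^2 carry nearly the same information, so the fit cannot pin down b and c separately. That shows up as a much larger standard error on the X term in the quadratic model, which pulls its t statistic toward 0 even though the overall fit (and R^2) is no worse.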
 
#7
Oh ok, thank you:)

I do understand that R^2 will increase even if the variable added is the squared value of the first variable.

What I don't understand is why, when I perform hypothesis tests on the coefficients (in the quadratic model), I find that they are not significant at the 95% confidence level...

I mean, purely from looking at the data points and the drawn regression lines, the quadratic model seems to fit a bit better than the linear one. For the linear one I get very clear statistical significance (t values of about 10 and -15).

Can you just explain this further? I just started out with statistics, as you have already noticed! :)

Thanks

EDIT: I hadn't seen the previous post
 