Simple linear regression - Do I include the constant in the equation?

#1
Hello,

I am hoping someone can help me with a few regression questions. I am running a number of simple linear regressions. Should I include a constant (intercept) in the equation?

Also, when I run the regressions with the constant, my r^2 values are all very low (0.00006 - 0.05). I realize r^2 only tells me what proportion of the variability in the dependent variable is explained by the independent variable, but when the values are that low, is the ability to predict meaningful at all (even when the results are significant at p < 0.01)?

Need some help...Thanks
 
#2
Hi grensot,

You usually need to include the constant in the regression equation; leaving it out forces the fitted line through the origin, which biases the slope estimate unless the true intercept really is zero. An R^2 value of ~0.05 indicates that the equation explains almost none of the variability, so it is not useful for making predictions. I wouldn't use the equation even if the independent variables are significant.
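To make the point about the constant concrete, here is a minimal sketch on synthetic data (the numbers and variable names are illustrative, not from the poster's data) comparing a fit with and without the intercept:

```python
# Sketch: simple linear regression with vs. without a constant (intercept),
# on synthetic data where the true intercept is 3 and the true slope is 0.5.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 3.0 + 0.5 * x + rng.normal(0, 1, size=100)

# With intercept: design matrix is [1, x]
X_const = np.column_stack([np.ones_like(x), x])
b_const, *_ = np.linalg.lstsq(X_const, y, rcond=None)

# Without intercept: the fitted line is forced through the origin
b_noconst, *_ = np.linalg.lstsq(x.reshape(-1, 1), y, rcond=None)

print("with constant:    intercept=%.2f  slope=%.2f" % tuple(b_const))
print("without constant: slope=%.2f" % b_noconst[0])
# Omitting a genuinely nonzero intercept distorts the slope estimate.
```

The through-the-origin fit absorbs the missing intercept into the slope, which is why the constant is normally kept in.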
 

JohnM

TS Contributor
#3
grensot said:
(question quoted from post #1 above)
This can happen when you have a large sample size - a relatively low r^2 value (indicating a "weak" association) can be statistically significant.

Remember, statistical significance says nothing about usefulness, and only tells you if the statistic is larger than what would be typically found by chance.
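A quick simulation can show the gap between significance and usefulness (the numbers below are made up for illustration): with a weak true relationship, the regression barely beats just predicting the mean, even though the slope would test as significant at this sample size.

```python
# Sketch: statistical significance vs. practical usefulness.
# With r^2 near 0.01, the regression barely improves on predicting the mean.
import numpy as np

rng = np.random.default_rng(1)
n = 5000
x = rng.normal(size=n)
y = 0.1 * x + rng.normal(size=n)      # weak true relationship

slope, intercept = np.polyfit(x, y, 1)
pred = intercept + slope * x

rmse_model = np.sqrt(np.mean((y - pred) ** 2))
rmse_mean = np.sqrt(np.mean((y - y.mean()) ** 2))
print(f"RMSE using the regression:     {rmse_model:.3f}")
print(f"RMSE just predicting the mean: {rmse_mean:.3f}")
# The two errors are nearly identical: "significant" but not useful.
```

The prediction error improves by well under one percent, which is exactly the situation the low-r^2 posts above describe.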
 
#4
After reviewing my output some more, I am convinced that there is no real relationship here, although I would like to know why a larger sample size (mine is n = 295) can produce significance at p < 0.05 when r^2 is very low and the slope is practically 0 (0.0024).

any thoughts?
 

JohnM

TS Contributor
#5
In general, the larger the sample size, the more precise and sensitive a statistic is to differences - the larger the sample, the greater the chance of detecting a very small difference.

Here's a link that shows how to compute a t-statistic to see if "r" is significantly different from 0.

t = r * sqrt [ (n-2) / (1-r^2) ]

The sample size "n" is in the numerator, so in general, the larger the sample size, the more likely it is that r will be statistically significant.

http://davidmlane.com/hyperstat/B134689.html
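The formula above is easy to compute directly. Here is a small sketch (the r values tried are arbitrary, chosen only to bracket the significance threshold) showing how small a correlation clears the p < .05 bar at the poster's sample size of n = 295:

```python
# Sketch: t statistic for testing whether a correlation r differs from 0,
# t = r * sqrt((n - 2) / (1 - r^2)), evaluated at n = 295.
import math

def t_for_r(r, n):
    """t statistic (df = n - 2) for H0: correlation = 0."""
    return r * math.sqrt((n - 2) / (1 - r**2))

n = 295
for r in (0.05, 0.115, 0.30):
    print(f"r = {r:.3f}  r^2 = {r*r:.4f}  t = {t_for_r(r, n):.2f}")
# With df = 293, |t| above roughly 1.97 means p < .05, so even r around
# 0.115 (r^2 about 0.013) is statistically significant at this n.
```

This is why an n of 295 can flag an association as significant even when r^2 says the relationship explains almost nothing.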