Sample variance s^2?

#1
Can anyone give me a definitive description of s^2, sigma^2 and sxx for regression?
I know that we can obtain an estimate of the true population variance from the sample variance using sigmahat^2 = (n/(n-1)) s^2.
However, two different sources show different ways to calculate s^2. One says to use the 1/n version and the other says the 1/(n-1) version!
I thought that s^2 was the same as sxx for regression, i.e. the 1/n version.
In what situation would you use the two different versions of s^2?
I'm very confused!
 

BioStatMatt

TS Contributor
#2
This is a great question. However, the answer is somewhat involved. There are three different estimators for the variance of a random variable in play here. The "1/n" version corresponds to the "maximum likelihood" estimator for variance (under a normal model). However, this estimator is biased. That is, it introduces a systematic error: on average it underestimates the true variance. The 1/(n-1) version is the unbiased version. That is why we typically use this as the estimator of variance. In the case of simple linear regression, we divide the residual sum of squares by n-2 because that gives an unbiased estimator of the error variance in this case (two parameters, the intercept and the slope, are estimated). The 1/(n-2) version is what is used to calculate MSE (mean squared error) in simple linear regression. It is interesting that each of these estimators is consistent. That is, as the sample size increases, each of these estimators will converge to the same value, the true variance.
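To make the difference between the 1/n and 1/(n-1) versions concrete, here is a minimal sketch in plain Python. The function name `variance_estimators`, the sample data, and the simulation settings (sample size 5, standard normal draws, 20000 repetitions) are illustrative assumptions, not something taken from either of the sources mentioned in the question.

```python
import random
import statistics


def variance_estimators(data):
    """Return (1/n estimate, 1/(n-1) estimate) of the variance of `data`."""
    n = len(data)
    mean = sum(data) / n
    ss = sum((x - mean) ** 2 for x in data)  # sum of squared deviations
    return ss / n, ss / (n - 1)


# Illustrative sample (hypothetical data).
sample = [2, 4, 4, 4, 5, 5, 7, 9]
var_n, var_n1 = variance_estimators(sample)

# The standard library exposes both versions:
# statistics.pvariance divides by n, statistics.variance divides by n-1.
lib_n = statistics.pvariance(sample)
lib_n1 = statistics.variance(sample)

# Small simulation of the bias: draw many samples of size 5 from a
# distribution whose true variance is 1 and average each estimator.
random.seed(0)
reps, size = 20000, 5
sum_n = sum_n1 = 0.0
for _ in range(reps):
    draws = [random.gauss(0.0, 1.0) for _ in range(size)]
    est_n, est_n1 = variance_estimators(draws)
    sum_n += est_n
    sum_n1 += est_n1
avg_n = sum_n / reps    # should sit near (size-1)/size = 0.8
avg_n1 = sum_n1 / reps  # should sit near the true variance, 1.0
```

The two estimates always differ by the factor n/(n-1), which is the relationship quoted in the question. In the simulation, the average of the 1/n estimates should land near 0.8 rather than 1, which is the systematic error described above.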

My advice to you: we almost never use the 1/n version, and almost always use the 1/(n-1) version outside the context of regression. Look carefully to see which divisor is used when you encounter new types of statistics.
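For the regression case, the following is a rough sketch of how the n-2 divisor enters the MSE in simple linear regression. The function name `simple_linear_regression_mse` and the data are hypothetical. It also assumes the common textbook convention that Sxx is the plain sum of squared deviations of x with no divisor at all; check how your own source defines it.

```python
def simple_linear_regression_mse(x, y):
    """Fit y = intercept + slope * x by least squares.

    Returns (slope, intercept, mse), where mse = SSE / (n - 2).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Assumed convention: Sxx and Sxy are raw sums, not divided by n or n-1.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual (error) sum of squares around the fitted line.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # Divide by n - 2 because two parameters were estimated.
    return slope, intercept, sse / (n - 2)


# Hypothetical data for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept, mse = simple_linear_regression_mse(x, y)
```

With these five points the residual sum of squares is divided by 3 rather than 5 or 4, so the MSE is larger than either the 1/n or the 1/(n-1) style of calculation would give.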

~Matt
