I have a simple conceptual question:

In the simple linear regression problem, where the true relationship is,

\( y = ax + b + e \)

the error terms, \( e \), are assumed to be normally distributed \( N(0,\sigma^2) \).

However, linear regression only yields estimates \( \alpha \approx a \) and \( \beta \approx b \). The fitted equation is,

\( y = \alpha x + \beta + \epsilon \)

where \( \epsilon \) is the observable residual instead of the unobservable error \( e \).
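To make the setup concrete, here is a minimal simulation sketch (Python with numpy; the true parameters, seed, and sample size are arbitrary choices for illustration) showing the distinction between the unobservable errors \( e \) and the observable residuals \( \epsilon \):

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for reproducibility

# Simulate the true model y = a*x + b + e with normal errors
a, b, sigma = 2.0, 1.0, 0.5     # illustrative values, not from any real data
x = rng.uniform(0, 10, size=100)
e = rng.normal(0, sigma, size=100)   # the unobservable errors
y = a * x + b + e

# Least-squares fit gives estimates alpha ~ a and beta ~ b
alpha, beta = np.polyfit(x, y, deg=1)

# Residuals: the observable stand-ins for the errors
resid = y - (alpha * x + beta)

# With an intercept in the model, OLS residuals sum to (numerically) zero,
# which is one way they differ from the true errors e
print(abs(resid.sum()) < 1e-8)
```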

1. How can one test the normality assumption of the error terms if the true errors are unobservable?

2. The residual is an estimator of the error. What kind of estimator is it? Is it unbiased? Maximum likelihood?

3. The true heart of my question is this: if the reasoning is to test the normality of the errors by testing the normality of their estimators, the residuals, how does one justify a normality test given that the residuals are correlated? From my understanding, the residuals are all correlated through their common dependence on the fitted line, while goodness-of-fit tests for normality assume independent, identically distributed (iid) random variables.
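For what it's worth, the correlation I mean can be seen directly from the hat matrix: the residual covariance is \( \sigma^2 (I - H) \), which is not diagonal. A sketch of what I mean (Python with numpy/scipy; Shapiro–Wilk stands in for any iid-based normality test, and all simulation parameters are arbitrary):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)   # arbitrary seed for reproducibility

n = 50
x = rng.uniform(0, 10, size=n)
X = np.column_stack([np.ones(n), x])   # design matrix with intercept

# Hat matrix H = X (X'X)^{-1} X'.  Residual covariance is sigma^2 (I - H),
# whose off-diagonal entries are nonzero: residuals are correlated by construction.
H = X @ np.linalg.solve(X.T @ X, X.T)
resid_cov = np.eye(n) - H              # covariance up to the factor sigma^2
off_diag = resid_cov - np.diag(np.diag(resid_cov))
print(np.abs(off_diag).max() > 0)      # nonzero off-diagonal terms exist

# Yet the usual practice is to apply an iid-based test to the residuals anyway:
y = 2.0 * x + 1.0 + rng.normal(0, 0.5, size=n)
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta_hat
stat, p = stats.shapiro(resid)         # Shapiro-Wilk assumes iid samples
```

This is exactly the tension I am asking about: the test's iid assumption versus the built-in correlation of the residuals.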

Thanks,

mity