Normality assumption for Regression

#1
I have an outcome variable that is composed of one item on a 3 point scale to assess participation (none-1, member of 1 program-2, member of both programs-3). n=766

The test for normality is significant for the residuals. Is this because the range is too small to be converted to any type of distribution? The QQ plots look crazy as do the scatterplots.

Can I still proceed with MLR or should I collapse this item into a dichotomous (Y/N) and perform logistic instead? Thank you!
 

hlsmith

Less is more. Stay pure. Stay poor.
#2
You could also use multinomial logistic regression, since you have 3 groups that seem more categorical than a 3-point scale.
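To make the multinomial idea concrete, here is a minimal sketch of how such a model turns a predictor into three category probabilities. The coefficients and the single predictor `x` are hypothetical values chosen only for illustration, not estimates from the poster's data; in practice a stats package would fit them.

```python
import math

def multinomial_probs(x, coefs):
    """Category probabilities under a multinomial logit model.

    coefs maps each non-reference category to (intercept, slope);
    category 1 ("none") is treated as the reference with score 0.
    """
    scores = {1: 0.0}
    for cat, (b0, b1) in coefs.items():
        scores[cat] = b0 + b1 * x
    # Subtract the maximum score before exponentiating for numerical stability.
    m = max(scores.values())
    exp_scores = {c: math.exp(s - m) for c, s in scores.items()}
    total = sum(exp_scores.values())
    return {c: e / total for c, e in exp_scores.items()}

# Hypothetical coefficients: one (intercept, slope) pair per non-reference category.
coefs = {2: (-0.5, 0.8), 3: (-1.5, 1.2)}
probs = multinomial_probs(1.0, coefs)
for cat in sorted(probs):
    print(cat, round(probs[cat], 3))
```

Each non-reference category gets its own slope, so the model does not assume the three levels are ordered.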
 

noetsi

Fortran must die
#4
Normality is not needed for much of linear analysis (its primary role, I believe, is in confidence intervals). You could transform the data to make it normal, but if you have a three-point DV it's better to run ordinal or multinomial regression (or possibly convert it to two levels and do binary logistic regression).
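As a rough illustration of the ordinal option, the sketch below shows how a proportional-odds (cumulative logit) model maps a linear predictor to probabilities for the three ordered levels. The cutpoints and the values of `eta` are made-up numbers for illustration; a real analysis would estimate them from the data.

```python
import math

def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))

def ordinal_probs(eta, cutpoints):
    """Category probabilities under a proportional-odds model.

    P(Y <= k) = logistic(cutpoint_k - eta); the individual category
    probabilities are differences of consecutive cumulative probabilities.
    """
    cumulative = [logistic(c - eta) for c in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs

# Hypothetical cutpoints separating levels 1|2 and 2|3.
cutpoints = [0.0, 1.5]
low = ordinal_probs(-1.0, cutpoints)   # small linear predictor
high = ordinal_probs(2.0, cutpoints)   # large linear predictor
print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

A larger linear predictor shifts probability toward the higher levels, which is the sense in which the model respects the ordering none < one program < both programs.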

When you have fewer than five levels you are normally going to run into nonnormal data.
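A quick simulation shows why this happens. The sketch below fits ordinary least squares to a simulated 3-level outcome with one binary predictor; the predictor, the outcome probabilities, and the seed are all assumptions made up for the example (only n = 766 comes from the question). Because the outcome can take only three values, the residuals pile up on a handful of points instead of spreading out like a bell curve.

```python
import random

def ols_simple(x, y):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

rng = random.Random(0)
n = 766  # sample size from the original question

# Hypothetical binary predictor and outcome probabilities.
x = [rng.randint(0, 1) for _ in range(n)]
y = [
    rng.choices([1, 2, 3], weights=[0.6, 0.3, 0.1] if xi == 0 else [0.3, 0.4, 0.3])[0]
    for xi in x
]

intercept, slope = ols_simple(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# With 2 fitted values and 3 outcome levels there are at most 6 distinct residuals.
distinct = sorted({round(r, 10) for r in residuals})
print(len(distinct), "distinct residual values out of", n, "observations")
```

With that few distinct residual values and a large n, a formal normality test will almost always reject, and the QQ plot will show steps rather than a line.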
 
#5
Thank you very much!
 

noetsi

Fortran must die
#6
Also remember that what is really at issue is the normality of the residual terms (well some disagree, but we won't go there). :p

Certain estimators for regression, such as maximum likelihood, do require normality I believe, at least in the context of specific methods. That is a separate issue. For example, when you do SEM, which commonly uses maximum likelihood, normality is a major issue. When nonnormality occurs you have to use a different estimator, like robust weighted least squares.

The point being: pay attention not just to the method, like regression, but also to what estimator is being used when you consider the importance of normality.