Distribution of DV in multiple OLS regression

#1
Hello,

I am running a multiple linear OLS regression on a DV with the following frequency distribution. The DV is the mean of 5 Likert-scale items (thus, pseudo-metric as common in social sciences).

I know that the DV does not have to be normally distributed in multiple linear OLS regression, but that the residuals should be. I checked the normality of the residuals and everything seems fine from this perspective (the P-P and Q-Q plots look fine, and the Shapiro-Wilk test on the standardized residuals is not significant).

I would like to run a multiple linear OLS regression with this DV and would like to know whether you think this is justifiable. I do not want to run an ordinal regression for various reasons.

 

Karabiner

TS Contributor
#4
This looks decidedly non-normal to me (there are 3 peaks). You could have a look at a Q-Q plot in order to get an additional impression.

But your sample size is large enough, so that non-normality won't matter anyway.

With kind regards

Karabiner
 
#5
So you can't use ordinal regression because you are analyzing averages, which are not discrete?

As far as I know, the normality assumption applies only to the residuals.

The DV doesn't look normally distributed, but it is quite symmetrical, so even if this were the distribution of the residuals I probably wouldn't rule out regression.
Just for fun, did you try a normality test? What p-value do you get from the following Shapiro-Wilk test? http://www.statskingdom.com/320ShapiroWilk.html
 
#6
[QUOTE]This looks decidedly non-normal to me (there are 3 peaks). You could have a look at a Q-Q plot in order to get an additional impression.[/QUOTE]
Hi Karabiner, thanks for your answer. I thought the normality assumption applies only to the residuals?

[QUOTE]But your sample size is large enough, so that non-normality won't matter anyway.[/QUOTE]
Sample size is about 130.

[QUOTE]So you can't use ordinal regression because you are analyzing averages, which are not discrete?[/QUOTE]
Well, I guess that's open for debate. Researchers in my field interpret the mean of multiple Likert-scale items as continuous.

[QUOTE]The DV doesn't look normally distributed, but it is quite symmetrical, so even if this were the distribution of the residuals I probably wouldn't rule out regression.
Just for fun, did you try a normality test? What p-value do you get from the following Shapiro-Wilk test? http://www.statskingdom.com/320ShapiroWilk.html[/QUOTE]
The graph shows the plain and simple frequency distribution of my dependent variable. Thanks for the link, I'll try the test later.
 
#7
[QUOTE]Well, I guess that's open for debate. Researchers in my field interpret the mean of multiple Likert-scale items as continuous.[/QUOTE]

I assume the debate is about single Likert-scale items, but you are using an average of several Likert-scale items.
 
#10
Thanks for all your answers.

I also checked the scatter plot of the standardized residuals against the standardized expected values. You can identify three diagonal lines in the plot. I guess that's due to the three peaks in the frequency distribution of my DV. Is this problematic?

I read the following paper and blog post in this regard and the authors do not seem to mention any problems in this respect:

 