Dependent variable as a percentage: which model to use?


New Member

I am dealing with a dependent variable that is a percentage. Which regression model would be better in this case?
The histogram and kernel density of this dependent variable show that many values are concentrated around 0 and 1, with the rest spread in between. The values are not nicely distributed within, say, 0.2-0.8 or 0.3-0.7, so I don't think I can use the usual linear regression model here.
I found that some texts and online guides suggest using a GLM with a logit link and the binomial family.
Is that right? Or should I use a GLM with a logit link and leave the family at its default, which is Gaussian?
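
To make concrete what "logit link and binomial family" means for a proportion, here is a minimal sketch in plain Python. It assumes a single predictor and uses made-up data, not the data from this thread. The helper names (`inv_logit`, `fit_fractional_logit`) are hypothetical, and the fit is a hand-rolled Newton-Raphson rather than the routine of any particular package. This setup is often called a fractional logit; in practice one would normally use a statistics package's GLM routine, usually with robust standard errors.

```python
import math

def inv_logit(z):
    """Map a linear predictor to the (0, 1) interval."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_fractional_logit(x, y, iterations=25):
    """Fit E[y | x] = inv_logit(b0 + b1 * x) by Newton-Raphson.

    Solves the binomial (Bernoulli) quasi-likelihood score equations,
    which remain usable when y is a proportion in [0, 1].
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        # Score vector (g0, g1) and information matrix [[h00, h01], [h01, h11]]
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = inv_logit(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        # Newton step: solve the 2x2 system H * delta = g
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-10:
            break
    return b0, b1

# Made-up data: proportions piled up at 0 and 1, with some values in between
x = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
y = [0.0, 0.0, 0.1, 0.0, 0.35, 0.6, 1.0, 0.9, 1.0, 1.0]

b0, b1 = fit_fractional_logit(x, y)
fitted = [inv_logit(b0 + b1 * xi) for xi in x]
print(f"intercept={b0:.3f}, slope={b1:.3f}")
```

Unlike a straight line, the fitted values from this model always stay strictly between 0 and 1, even when many observations sit exactly at the boundaries.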

Besides, do I need to do anything to the coefficients after running the GLM? Can I interpret them as in a linear model, or do I have to transform them first, for example by taking the log or the exp?
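
On interpretation, here is a small sketch of what the coefficients of a logit-link model mean. The intercept and slope below are hypothetical values chosen for illustration, not the ones from the attached output. With a logit link the coefficients are on the log-odds scale, so exp(coefficient) is a multiplicative change in the odds p/(1-p), and the change in the proportion itself depends on where you evaluate it.

```python
import math

def inv_logit(z):
    """Convert a log-odds value into a proportion in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients, NOT taken from the attached output
intercept = -1.0
slope = 0.8

# exp(slope): factor by which the odds p/(1-p) change per one-unit increase in x
odds_ratio = math.exp(slope)

# Predicted proportions at x0 and at x0 + 1
x0 = 1.0
p0 = inv_logit(intercept + slope * x0)
p1 = inv_logit(intercept + slope * (x0 + 1.0))

# Odds at the two points; their ratio equals exp(slope)
odds0 = p0 / (1.0 - p0)
odds1 = p1 / (1.0 - p1)

# Marginal effect on the proportion scale at x0: dp/dx = slope * p * (1 - p)
marginal_effect = slope * p0 * (1.0 - p0)

print(f"odds ratio per unit of x: {odds_ratio:.3f}")
print(f"predicted proportion at x0: {p0:.3f}, at x0 + 1: {p1:.3f}")
print(f"marginal effect at x0: {marginal_effect:.3f}")
```

So the raw coefficient cannot be read as "a one-unit change in x changes the percentage by this much", as it could in a linear model. The sign still gives the direction of the effect.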

I have attached a Word file with the graphs of the kernel density and the histogram, followed by the coefficients as reported after running the GLM with logit link and binomial family.

I would be grateful if anyone could give some insight into whether I followed the right approach for the model and how to interpret the coefficients.

Thank you very much


New Member
A normal distribution is not a requirement for regression. If your variable is censored at 0 and/or 1 by your measurement technique, then you have a different sort of problem (I believe you will want to use a probit regression). However, an odd distribution on its own is not a problem for regression unless it reflects other underlying violations of the assumptions of the regression model you are using.