Non-normally distributed predictors in regression modelling

#1
I want to use linear regression modelling to look at the association of symptom severity with various outcomes. The problem is that symptom severity is strongly left-skewed, and the usual transformations I have used in the past (cubic, log, etc.) do not produce a normal or near-normal distribution. A colleague suggested a rank-order inverse transformation, which also does not help.

What other options do I have in terms of using this predictor in a regression model?

Any advice would be gratefully received :)
 

Karabiner

TS Contributor
#2
Who told you that a predictor (or the dependent variable, by the way) has to
be normally distributed in a regression? This is absolutely not true.

With kind regards

Karabiner
 

hlsmith

Less is more. Stay pure. Stay poor.
#3
Some fields are historically transformation-heavy. Transformation decisions are usually based on the model residuals (error terms), and made only as needed.
 
#4
Thanks for the replies.

I thought one of the assumptions of linear regression modelling was that continuous variables need a Gaussian distribution, and that anything else violates the assumption? I'm not a statistician (probably very obvious!), so of course I may be wrong!
 

Karabiner

TS Contributor
#5
Well, nobody here is a statistician either. I'd be curious which source misled you to the assumption
you stated.

There is an assumption that the prediction errors (residuals) from the model should be normally
distributed (in the population from which the sample was drawn).
But if the sample size is large enough (about n > 30 or so, as a rule of thumb), even that
assumption can usually be violated without serious consequences for the tests and confidence
intervals of the coefficients.
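
To make the difference between a skewed predictor and skewed residuals concrete, here is a minimal
Python sketch on simulated data. Everything in it is an assumption for illustration only: the
variable names, the Beta-distributed "severity" score, the coefficients, and the sample size are
made up and have nothing to do with the original poster's data. It builds a strongly left-skewed
predictor, fits an ordinary least-squares line, and compares the skewness of the predictor with
the skewness of the residuals.

```python
import random
import statistics


def skewness(values):
    """Third standardized moment; about 0 for a symmetric distribution."""
    values = list(values)
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean(((v - mean) / sd) ** 3 for v in values)


def fit_simple_ols(x, y):
    """Least-squares intercept and slope for the model y = a + b * x."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 2000

# Hypothetical left-skewed severity score on a 0-10 scale (Beta(5, 1) scaled by 10).
severity = [10 * rng.betavariate(5, 1) for _ in range(n)]

# Simulated outcome: linear in severity, with normally distributed errors.
outcome = [2.0 + 0.5 * s + rng.gauss(0, 1) for s in severity]

intercept, slope = fit_simple_ols(severity, outcome)
residuals = [y - (intercept + slope * x) for x, y in zip(severity, outcome)]

print("skewness of predictor:", round(skewness(severity), 2))
print("estimated slope:      ", round(slope, 2))
print("skewness of residuals:", round(skewness(residuals), 2))
```

In a simulation like this the predictor stays clearly skewed, while the residuals should come out
roughly symmetric and the slope close to the value used to generate the data. The residuals are
what the assumption is about, so they are the thing to inspect (for example with a histogram or a
Q-Q plot) before deciding on any transformation.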

With kind regards

Karabiner
 