Thoughts on Multicollinearity

#1
Greetings everyone,

This might take one response or several. I would like to compile an exhaustive list of issues surrounding multicollinearity. In my studies, it isn't always clear to me when multicollinearity is actually a problem.

I know how to spot the issue (large standard errors, high correlations between predictors, large VIFs, etc.). This suggests that the parameter estimates are not stable from one sample to the next. There are a number of remedial measures we can take (standardizing or centering the variables). I read that centering a variable doesn't change the interpretation of its coefficient. I suppose it would change the interpretation of the intercept. But if the interpretation of the intercept doesn't make sense to begin with, then we don't have to worry about it.
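
For concreteness, here's the kind of check I mean. This is just a sketch on simulated data, assuming numpy and statsmodels are available:

```python
# Sketch of the usual multicollinearity diagnostics on made-up data:
# large standard errors, high pairwise correlation, large VIFs.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)   # nearly collinear with x1
y = 1 + 2 * x1 + 3 * x2 + rng.normal(size=n)

X = sm.add_constant(np.column_stack([x1, x2]))
fit = sm.OLS(y, X).fit()
print(fit.bse)                      # inflated standard errors on x1, x2
print(np.corrcoef(x1, x2)[0, 1])    # high pairwise correlation
for i in (1, 2):                    # VIFs for the two predictors
    print(variance_inflation_factor(X, i))
```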

To my knowledge, multicollinearity is not a problem when we are predicting a response (predictive modeling). If I took a model with high multicollinearity and predicted a response, I would get the same result as from a model with centered variables. I have great trust in the people who have taught me, but this last point eludes me a bit: if the parameter estimates aren't stable from one sample to the next, why should I believe the predicted response wouldn't change drastically? (I know that multicollinearity can cause predictors to have opposite signs; is this a hint of an explanation?)
 

ondansetron

TS Contributor
#2
> This might take one response or several. I would like to compile an exhaustive list of issues surrounding multicollinearity. In my studies, it isn't always clear to me when multicollinearity is actually a problem.
I think, in general, the issue of MC is pretty interesting, but it is also quite confusing for people outside of stats who don't have the insight you're getting in your MS program.

> I know how to spot the issue (large standard errors, high correlations between predictors, large VIFs, etc.). This suggests that the parameter estimates are not stable from one sample to the next. There are a number of remedial measures we can take (standardizing or centering the variables). I read that centering a variable doesn't change the interpretation of its coefficient. I suppose it would change the interpretation of the intercept. But if the interpretation of the intercept doesn't make sense to begin with, then we don't have to worry about it.
I've been taught that multicollinearity can still be present even with relatively low VIFs, because it depends on the whole picture of "things not adding up" and the partial effects/relationships in the model being unclear. As for centering: if you center all of the variables (or it otherwise makes sense to set them all to zero simultaneously, which is easiest when they're all centered), then the intercept gains a logical interpretation it previously lacked, namely the expected response when every predictor is at its mean.
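
A quick sketch of the centering point (simulated data, statsmodels assumed): the slopes don't move, and the intercept becomes the expected response at the predictor means.

```python
# Centering shifts the intercept to ~mean(y) but leaves the slopes alone.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(loc=10, size=n)
x2 = rng.normal(loc=5, size=n)
y = 3 + 1.5 * x1 - 2 * x2 + rng.normal(size=n)

raw = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()
cen = sm.OLS(y, sm.add_constant(
    np.column_stack([x1 - x1.mean(), x2 - x2.mean()]))).fit()

print(raw.params)   # [intercept, b1, b2]
print(cen.params)   # same b1, b2; intercept is now ~ mean(y)
```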

> To my knowledge, multicollinearity is not a problem when we are predicting a response (predictive modeling). If I took a model with high multicollinearity and predicted a response, I would get the same result as from a model with centered variables. I have great trust in the people who have taught me, but this last point eludes me a bit: if the parameter estimates aren't stable from one sample to the next, why should I believe the predicted response wouldn't change drastically? (I know that multicollinearity can cause predictors to have opposite signs; is this a hint of an explanation?)
The key is that while the individual estimates may be unstable, the weighted sum over the collinear group is stable (as far as I can tell). Any one of the collinear variables can have a highly unstable estimated beta, but the group as a whole is much steadier. I haven't searched for a more formal treatment, but my best guess as to why predictions are generally unaffected while inferences on the betas are, is that the predicted Y-values are really just a weighted sum (part of which comes from the collinear group and possibly part from a noncollinear group). I also think your question about opposite signs helps explain this, along with the unstable magnitudes: errors in the individual betas largely offset each other.
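
For what it's worth, a quick simulation (numpy only, made-up data) seems to bear this out: refit the same model on fresh samples and the betas swing wildly, while the prediction at a fixed point barely moves.

```python
# Unstable betas, stable weighted sum: resample, refit, compare spreads.
import numpy as np

rng = np.random.default_rng(2)
x_new = np.array([1.0, 1.0, 1.0])   # predict at const=1, x1=1, x2=1 (consistent with x2 ~ x1)
betas, preds = [], []
for _ in range(1000):
    n = 100
    x1 = rng.normal(size=n)
    x2 = x1 + rng.normal(scale=0.05, size=n)   # heavily collinear pair
    y = 1 + 2 * x1 + 3 * x2 + rng.normal(size=n)
    X = np.column_stack([np.ones(n), x1, x2])
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    betas.append(b)
    preds.append(x_new @ b)

betas = np.asarray(betas)
print(betas.std(axis=0))   # b1 and b2 swing a lot from sample to sample
print(np.std(preds))       # the prediction at x_new barely moves
```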
 

Dason

Ambassador to the humans
#3
I think it helps to take it to the extreme case and think about that. Imagine a situation where

X=Y=Z

If we want to predict Z using X and Y as predictors we could have the prediction equation

Zhat = aX + bY

And as long as a + b = 1 we will always have a perfect prediction. But if we want to estimate the "true" values of a and b, what should we choose? Using a = 1 and b = 0 works. Using a = 0.5 and b = 0.5 works. Using a = 1000 and b = -999 works.

We can get perfect predictions even though we have little to no certainty about what the individual values of a and b should be. If you add even a tiny bit of noise to the system, the estimates become identifiable, but they are unique only because of that random noise, so it makes sense that there will be a lot of variability in them.
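
A minimal numpy sketch of this extreme case (the noise scales are arbitrary choices for illustration): Y and Z are both X plus a tiny bit of noise, the fitted pair (a, b) swings by hundreds from sample to sample, yet a + b stays pinned near 1.

```python
# X = Y = Z up to tiny noise: individual coefficients are wild,
# but their sum (which drives the predictions) is pinned near 1.
import numpy as np

rng = np.random.default_rng(3)
for _ in range(5):
    n = 50
    x = rng.normal(size=n)
    y = x + rng.normal(scale=1e-6, size=n)   # Y is X plus a whisper of noise
    z = x + rng.normal(scale=1e-3, size=n)   # Z is X plus slightly more noise
    a, b = np.linalg.lstsq(np.column_stack([x, y]), z, rcond=None)[0]
    print(f"a = {a:10.1f}, b = {b:10.1f}, a + b = {a + b:.4f}")
```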
 

hlsmith

Not a robit
#4
I would note that MC may not impact predictions much, but as alluded to, the standard errors will be inflated. "Inflated" is a bit of a misnomer, though, since they are what they are: they honestly reflect the extra uncertainty, and the resulting confidence intervals won't be precise.
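
A small sketch of that point (simulated with numpy/statsmodels): as the correlation between two predictors climbs, the reported standard errors grow, and they are honestly reporting the extra uncertainty rather than being "wrong."

```python
# Standard errors grow as two predictors become more collinear.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 200
for scale in (1.0, 0.3, 0.05):   # less noise in x2 => more collinearity
    x1 = rng.normal(size=n)
    x2 = x1 + rng.normal(scale=scale, size=n)
    y = 1 + 2 * x1 + 3 * x2 + rng.normal(size=n)
    fit = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()
    r = np.corrcoef(x1, x2)[0, 1]
    print(f"corr = {r:.3f}, se(b1) = {fit.bse[1]:.3f}, se(b2) = {fit.bse[2]:.3f}")
```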
 

noetsi

Fortran must die
#6
My thought is that it rarely matters, and the solutions may be worse than the problem. If the variables are related to each other in their impact on the DV, then they simply are. Trying to artificially separate out their effects may well distort reality.