Dealing with multicollinearity

I have the following model:
y = a + b1*x1 + b2*x2 + b3*x1*x2 + b4*x3 + other terms

Let's say I am only interested in the coefficient b3. In this model, b3 is 0.54 with a t value of 4.50, and b2 is 2.00 with a t value of 3.00. Now I add the interaction term x2*x3:

y = a + b1*x1 + b2*x2 + b3*x1*x2 + b4*x3 + b5*x2*x3 + other terms

After this change, b3 is 0.53 with a t value of 4.30, and b2 is 300 with a t value of 0.001. Note that the coefficient b1 doesn't change much.

I would like to know whether I should add the interaction term x2*x3. To reiterate, I am only interested in the coefficient b3.


Less is more. Stay pure. Stay poor.
If the interaction term is significant, you should include it, since there would appear to be a conditional relationship between X2 and X3. When an interaction is significant, you also keep its main-effect terms in the model regardless of their significance.

Not sure where the question about multicollinearity is here?
Thank you. I suspected multicollinearity because the standard error and the magnitude of b2 both increased by a large amount when I added the interaction term x2*x3 to the model.


New Member
Multicollinearity happens when two or more independent variables are highly correlated. You can check for it using measures like the VIF (Variance Inflation Factor) or tolerance (which is 1/VIF). You can also inspect the correlation matrix (which you should always do before analysis).
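
To make the VIF and tolerance check concrete, here is a minimal Python sketch for the special case of exactly two predictors, where the VIF of either one reduces to 1/(1 - r^2), with r their Pearson correlation. The function names and the toy data are made up for illustration. With more than two predictors, r^2 would have to be replaced by the R^2 from regressing each predictor on all the others.

```python
def pearson_r_squared(x, z):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_z = sum(z) / n
    sxz = sum((a - mean_x) * (b - mean_z) for a, b in zip(x, z))
    sxx = sum((a - mean_x) ** 2 for a in x)
    szz = sum((b - mean_z) ** 2 for b in z)
    return (sxz * sxz) / (sxx * szz)


def vif_two_predictors(x, z):
    """Return (VIF, tolerance) for a model with only these two predictors.

    Assumption: two predictors only, so R^2 equals the squared correlation.
    """
    tolerance = 1.0 - pearson_r_squared(x, z)
    return 1.0 / tolerance, tolerance


# Hypothetical toy data, not taken from the thread.
x = [1, 2, 3, 4, 5]
z = [2, 4, 5, 4, 5]

vif, tolerance = vif_two_predictors(x, z)
print(f"VIF = {vif:.3f}, tolerance = {tolerance:.3f}")
```

For this toy data the squared correlation is 0.6, so the VIF comes out at about 2.5 and the tolerance at about 0.4.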


Less is more. Stay pure. Stay poor.
Yes, there is obvious collinearity between X2 and the X2*X3 term. This makes absolute sense. The result is larger SE values, which lowers statistical power when the null hypothesis is false.

A way to tackle this is to center your variables if they are continuous. Please try this out if applicable and report back your SE values for X2 before and after. The following thread shows how this happens. Side note: it now becomes inappropriate to interpret the X2 main term on its own, because the effect of X2 is established as conditional on X3.
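
As a rough illustration of why centering helps, the Python sketch below compares the correlation between X2 and the X2*X3 product before and after subtracting each variable's mean. The data are hypothetical: a balanced grid with X2 from 10 to 20 and X3 from 50 to 60, chosen so that both means are far from zero. The helper names are also made up for this example. On a balanced grid the centered correlation is exactly zero; on real data it will usually shrink without vanishing.

```python
import math


def pearson_r(x, z):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_z = sum(z) / n
    sxz = sum((a - mean_x) * (b - mean_z) for a, b in zip(x, z))
    sxx = sum((a - mean_x) ** 2 for a in x)
    szz = sum((b - mean_z) ** 2 for b in z)
    return sxz / math.sqrt(sxx * szz)


def center(values):
    """Subtract the sample mean from every value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


# Hypothetical data: every combination of X2 in 10..20 and X3 in 50..60.
x2 = [float(a) for a in range(10, 21) for _ in range(50, 61)]
x3 = [float(b) for _ in range(10, 21) for b in range(50, 61)]

# Correlation of X2 with the raw interaction term.
raw_product = [a * b for a, b in zip(x2, x3)]
r_raw = pearson_r(x2, raw_product)

# Correlation of centered X2 with the interaction of the centered variables.
x2c = center(x2)
x3c = center(x3)
centered_product = [a * b for a, b in zip(x2c, x3c)]
r_centered = pearson_r(x2c, centered_product)

print(f"raw:      corr(X2, X2*X3) = {r_raw:.3f}")
print(f"centered: corr(X2, X2*X3) = {r_centered:.3f}")
```

The raw correlation is high because, when X3 sits far from zero, the product X2*X3 behaves almost like a rescaled copy of X2. Centering removes that shared component.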