Include or Exclude a Variable in Regression?

#1
Dear all,

Dr Timothy Z Keith advised that a new variable should not be included unless it is
a common cause (associated with both the dependent variable and the factor).

On the other hand, Dr Samprit Chatterjee suggested that we should ask two questions
before making such a decision:

1) Is the coefficient of the new variable significant?
2) Does the new variable change the coefficients of the other variables substantially?


Finally, Dr Wayne Winston suggested that our decision may be based on p-values
and t-values.

Their advice is so different. Which one should I follow?

Thank you very much in advance.

Sincerely
Marty
 

hlsmith

#2
It depends on your purpose. If you are trying to examine a single variable, say an exposure or intervention, then Keith is right that you only need to control for confounders in that model. Controlling for other independent variables won't help you better understand the relationship between the exposure and the outcome.
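To make the confounder point concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration: the variable names (z for a confounder, w for a predictor of the outcome only, x for the exposure), the effect sizes, and the helper ols_coefs, which is a hypothetical function built on NumPy's least-squares solver.

```python
import numpy as np

def ols_coefs(y, *columns):
    """Return OLS coefficients (intercept first) for y regressed on the given columns."""
    X = np.column_stack([np.ones(len(y)), *columns])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

rng = np.random.default_rng(0)
n = 20_000

z = rng.normal(size=n)            # confounder: drives both exposure and outcome
w = rng.normal(size=n)            # affects the outcome only, independent of the exposure
x = 0.8 * z + rng.normal(size=n)  # exposure
# Simulated outcome; the true exposure effect is set to 0.5 (illustrative value).
y = 0.5 * x + 1.0 * z + 0.7 * w + rng.normal(size=n)

b_crude = ols_coefs(y, x)[1]         # exposure only: confounded by z
b_adj_z = ols_coefs(y, x, z)[1]      # adjusted for the confounder
b_adj_zw = ols_coefs(y, x, z, w)[1]  # additionally adjusted for the non-confounder

print(f"crude: {b_crude:.3f}, adjusted for z: {b_adj_z:.3f}, adjusted for z and w: {b_adj_zw:.3f}")
```

Under these assumed effect sizes, the crude slope should land well above 0.5 (its large-sample value is 1.62/1.64, about 0.99), the slope adjusted for z should sit near 0.5, and adding w should barely move it. Adding w can still tighten the standard error, but it does not change what the exposure coefficient is estimating.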


Side note: focusing solely on significance levels puts blinders on you, in a bad way. What if the p-value is 0.055? Do you ignore the variable?
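A small sketch of why a hard 0.05 cutoff can mislead. The estimate and the two standard errors are made-up numbers, and the p-values use a large-sample normal approximation rather than the exact t distribution; both helper functions are hypothetical names for this illustration.

```python
import math

def two_sided_p(estimate, std_error):
    """Two-sided p-value under a normal approximation (assumes a large sample)."""
    z = abs(estimate / std_error)
    return math.erfc(z / math.sqrt(2))

def ci95(estimate, std_error):
    """Approximate 95% confidence interval under the same normal approximation."""
    half_width = 1.959964 * std_error
    return (estimate - half_width, estimate + half_width)

estimate = 1.0  # same hypothetical coefficient in both scenarios
se_a = 0.521    # hypothetical standard error, slightly larger
se_b = 0.499    # hypothetical standard error, slightly smaller

p_a, p_b = two_sided_p(estimate, se_a), two_sided_p(estimate, se_b)
ci_a, ci_b = ci95(estimate, se_a), ci95(estimate, se_b)

print(f"A: p = {p_a:.3f}, 95% CI = ({ci_a[0]:.2f}, {ci_a[1]:.2f})")
print(f"B: p = {p_b:.3f}, 95% CI = ({ci_b[0]:.2f}, {ci_b[1]:.2f})")
```

The two scenarios fall on opposite sides of 0.05, yet the point estimate is identical and the intervals overlap almost completely, so treating one variable as "in" and the other as "out" on that basis alone would be hard to defend.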
 
#3
Thank you very much, hlsmith. You cleared my clouds of confusion.