Stepwise Regression Limitations Explanation

#1
I've recently been working on building a model and have come across a number of different approaches. I'm particularly interested in the limitations of stepwise regression: it attracts a huge amount of criticism online, yet I can't find much material detailing why it's a poor method to use. Specifically, I've seen the following claims:

- R-squared values are biased too high.
- p-values are too low due to multiple comparisons.
- Parameter estimates are biased high.

Could someone please explain briefly how Stepwise regression causes the above claims?
 

hlsmith

Less is more. Stay pure. Stay poor.
#2
"p-values are too low due to multiple comparisons" well you run more than one model, so you may be inclined to correct for false discovery. If I throw all variables into a model and then do that over and over, I run the risk of finding spurious correlations.

In general, if you are not using knowledge of the context to guide your rationale, this can lead to the problems above.

The issues behind your first and last critiques seem similar.
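A minimal sketch of that multiple-comparisons risk (entirely synthetic data; every predictor here is pure noise, so anything that comes out "significant" is spurious by construction):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n, p = 100, 20

# 20 candidate predictors and an outcome that are ALL pure noise:
# there is no true relationship anywhere.
X = rng.normal(size=(n, p))
y = rng.normal(size=n)

model = sm.OLS(y, sm.add_constant(X)).fit()
# p-values for the 20 slopes (skip the intercept)
pvals = model.pvalues[1:]

# With 20 tests at alpha = 0.05 we expect ~1 "significant" predictor
# by chance alone - exactly what stepwise would then keep.
print("spuriously significant:", np.sum(pvals < 0.05))
```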
 
#3
The basic problem with stepwise is that it relies heavily on chance. The variables stepwise selects in one survey might be totally different from those it selects in another survey. There is also a serious problem with misspecification when predictors are correlated both with each other and with the DV: if stepwise excludes one and includes the other, the model is misspecified.

The last variable included/first excluded can be particularly wrong. Stepwise gets blasted by statisticians - I once read a chapter entitled "Death to Stepwise: Think for Yourself" :p
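To see that chance-dependence concretely, here's a small simulation (scikit-learn's SequentialFeatureSelector standing in for classic p-value stepwise - my own substitution, but the sample-to-sample instability is the same idea):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.feature_selection import SequentialFeatureSelector

rng = np.random.default_rng(0)
n, p = 80, 8

# Two correlated predictors (columns 0 and 1), both related to y,
# plus six pure-noise columns.
x0 = rng.normal(size=n)
x1 = 0.9 * x0 + 0.45 * rng.normal(size=n)   # strongly correlated with x0
noise = rng.normal(size=(n, p - 2))
X = np.column_stack([x0, x1, noise])
y = x0 + x1 + rng.normal(size=n)

# Re-run forward selection on bootstrap resamples: the chosen set
# can change from resample to resample.
for b in range(5):
    idx = rng.integers(0, n, size=n)
    sfs = SequentialFeatureSelector(
        LinearRegression(), n_features_to_select=2, direction="forward"
    )
    sfs.fit(X[idx], y[idx])
    print("resample", b, "selected columns:", np.flatnonzero(sfs.get_support()))
```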
 

hlsmith

Less is more. Stay pure. Stay poor.
#4
noetsi has an interesting point with the collinearity of estimators. Perhaps two variables overlap in predicting the outcome. The model may grab the better of the two, but the second never gets a fair shake at inclusion, since the analyst assumes it is not significant. Yet the second variable may be a better predictor to use in practice (cheaper/easier to collect, etc.), even though the first trumps it statistically.
 

rogojel

TS Contributor
#5
But one should check the VIF and, if it is large, take the appropriate steps, right? This is IMO not a good argument against stepwise.
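For reference, that check is cheap in statsmodels (synthetic data; the VIF > 10 cut-off is just a common rule of thumb):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(1)
x0 = rng.normal(size=200)
x1 = 0.95 * x0 + 0.1 * rng.normal(size=200)  # nearly collinear with x0
x2 = rng.normal(size=200)

exog = sm.add_constant(np.column_stack([x0, x1, x2]))
# VIF per predictor (skip column 0, the intercept); VIF > 10 is a
# common rule-of-thumb flag for problematic collinearity.
for i in range(1, exog.shape[1]):
    print(f"x{i - 1}: VIF = {variance_inflation_factor(exog, i):.1f}")
```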

regards
 

hlsmith

Less is more. Stay pure. Stay poor.
#6
Partially agree, since you can set the criteria used in stepwise regression, so feasibly you may be able to control for this during the process. But I would guess many folks who use it miss this step, especially since it typically requires running a non-stepwise model first.


So you conduct a regression to get VIFs, then run stepwise afterwards and drop or otherwise address potential collinearity concerns. Seems roundabout. Plus, what about the other assumptions for model fit or appropriateness? Stepwise is not telling you that you have leverage points, etc.; it is automated around a few basic criteria.
 

CB

Super Moderator
#7
To me, bias is the biggest problem here. The bias occurs because you intentionally throw out variables that are non-significant.

Imagine the following scenario. You have a variable, X1, that (in the population) actually has a moderate effect size. You also have some other variables, X2-X5, which you would consider as part of your model. (But we'll focus on X1.)

Now imagine we conducted repeated studies, each time randomly drawing a sample from the population and estimating a regression model. The (estimated) sample coefficient for X1 would vary: sometimes it would be smaller than the true parameter value, and sometimes larger. Make sense?

Importantly, the cases when the sample coefficient for X1 is smaller will also tend to be the cases when the coefficient is not statistically significant.

If, each time we collected a sample, we used stepwise regression to exclude non-significant predictors, we would tend to systematically exclude all the instances in which the effect of X1 is relatively small.

Across stepwise models estimated on repeated samples, the average estimate of the effect of X1 will therefore be larger than the true parameter value of X1. In other words, the sample coefficient based on a stepwise regression is biased.

(NB: If using SPSS, this problem occurs regardless of whether you actually click "stepwise", "forward", or "backward" selection - all are broadly stepwise methods. Further, it also applies if you manually exclude predictors based on their p-values.)
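A small simulation of this scenario, assuming a true X1 coefficient of 0.3 (keeping the estimate only when it reaches p < .05, which is effectively what stepwise does):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
true_beta = 0.3          # X1's true (moderate) effect in the population
n, n_studies = 50, 2000

all_estimates, kept_estimates = [], []
for _ in range(n_studies):
    x1 = rng.normal(size=n)
    y = true_beta * x1 + rng.normal(size=n)
    fit = sm.OLS(y, sm.add_constant(x1)).fit()
    b, p = fit.params[1], fit.pvalues[1]
    all_estimates.append(b)
    if p < 0.05:         # stepwise keeps X1 only when "significant"
        kept_estimates.append(b)

# The unconditional mean is unbiased; conditioning on significance
# throws away the small estimates, inflating the average.
print("mean over all studies:   ", np.mean(all_estimates))
print("mean over 'kept' studies:", np.mean(kept_estimates))
```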
 

hlsmith

Less is more. Stay pure. Stay poor.
#8
CB, I like your post, but I am unsure whether a person wouldn't exclude that variable even if they used a non-automated approach. I agree that, in the arena of publication bias or publishing in general, we are likely to see those samples with greater effects - the standard error then comes into play somewhat, reminding us of the distribution of effects.

Side note: stepwise typically has inclusion/exclusion criteria to help catch those small-effect variables.
 

CB

Super Moderator
#9
CB, I like your post, but I am unsure whether a person wouldn't exclude that variable even if they used a non-automated approach. I agree that, in the arena of publication bias or publishing in general, we are likely to see those samples with greater effects
Yes. Bias happens whether the selection is automated, whether predictors are binned manually based on p-values, or whether papers aren't published because of "insignificant" results. Preregistration + predetermined analyses + publishing whatever you find (in some format) is one better option.
 

Englund

TS Contributor
#11
I must say that stepwise regression has its place in an analyst's toolbox - not so much in the toolbox of an academic who runs experimental trials, for example. But in, say, e-commerce companies with lots and lots of data on customer purchases and online behaviour, I'd say it is greatly advantageous to use stepwise regression.

I once built a predictive model for a large Swedish company that wanted to predict the probability that a customer would not place an order within one year. I had a couple of hundred variables to work with, and I could identify a dozen which would surely have an impact on the DV.

Within many fields, you don't care whether you've included the 'correct' IVs; the only thing you care about is whether the model gives accurate predictions. And if you can ensure that your model does that - why not stepwise?

So, what I did was use stepwise regression to find the 'best' model (or, in other words, a good model) based on the lowest out-of-sample validation error. I did this for three different time periods and then averaged the predictions from the three models. By building models on three different time periods, the variables which do not affect the DV are expected to cancel out when the predicted probabilities are averaged.

TL;DR - Stepwise regression can be useful if you don't care whether you accidentally include variables which do not affect the DV. Or, in other words: if you only care about the model's predictive capability.
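A rough sketch of that averaging scheme (entirely synthetic stand-ins for the three time periods; the selector, feature counts, and data-generating details are illustrative assumptions, not Englund's actual pipeline):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.feature_selection import SequentialFeatureSelector

rng = np.random.default_rng(3)
n, p = 300, 15

def make_period():
    # Stand-in for one time period's data; only the first 3 of 15
    # candidate variables actually drive the outcome.
    X = rng.normal(size=(n, p))
    logits = X[:, 0] + 0.8 * X[:, 1] - 0.6 * X[:, 2]
    y = (rng.random(n) < 1 / (1 + np.exp(-logits))).astype(int)
    return X, y

X_new = rng.normal(size=(10, p))   # customers we want to score

# One selected-then-fitted model per period, predictions averaged.
probs = []
for _ in range(3):
    X, y = make_period()
    sfs = SequentialFeatureSelector(
        LogisticRegression(max_iter=1000), n_features_to_select=5,
        direction="forward",
    ).fit(X, y)
    cols = sfs.get_support()
    clf = LogisticRegression(max_iter=1000).fit(X[:, cols], y)
    probs.append(clf.predict_proba(X_new[:, cols])[:, 1])

# Spurious variables differ across periods, so their influence tends
# to wash out in the averaged probability.
avg_prob = np.mean(probs, axis=0)
print(avg_prob.round(3))
```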
 

hlsmith

Less is more. Stay pure. Stay poor.
#12
Agreed. I was going to mention this as well. On occasion, when I have many variables, I will run a stepwise to get a feeling for the covariates I may need to control for.
 

CB

Super Moderator
#13
Within many fields, you don't care whether you've included the 'correct' IVs; the only thing you care about is whether the model gives accurate predictions. And if you can ensure that your model does that - why not stepwise?
I guess my main responses to this would be:
  1. Stepwise regression is not set up to maximise prediction accuracy - it's based purely on the significance of predictors
  2. Stepwise regression will give you an overly optimistic estimate of prediction accuracy

Using cross-validation is great, and that helps to deal with point 2. But it doesn't really deal with point 1. If you want to maximise out-of-sample prediction accuracy, stepwise regression isn't really the best tool - that's not what stepwise regression attempts to achieve. Some other options off the top of my head would be AIC, BIC, cross-validation (for selection, not just validation), or lasso.
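For instance, a cross-validated lasso selects and shrinks in one step, tuned directly for prediction (a minimal scikit-learn sketch on synthetic data):

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(5)
n, p = 200, 30

# 30 candidate predictors, only 3 of which truly matter.
X = rng.normal(size=(n, p))
y = 2 * X[:, 0] - 1.5 * X[:, 1] + X[:, 2] + rng.normal(size=n)

# LassoCV picks the penalty by cross-validation, i.e. it is tuned
# for out-of-sample prediction rather than p-values of predictors.
lasso = LassoCV(cv=5).fit(X, y)
print("chosen alpha:", lasso.alpha_)
print("nonzero coefficients:", np.flatnonzero(lasso.coef_))
```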