Regression - What to do with insignificant variables?

#1
Please pardon me if you find this question very silly, but this doubt has been troubling me for some time now, whenever I want to run a regression.

I am working in SAS. I have a dataset with 24,000 observations and about 50 independent variables. There are no missing values or outliers, and dummy coding for the categorical variables is already done, so data preparation is complete. Now, when I run a regression model on this dataset, there are a few variables (8 of them) for which the p-value is > 0.05, i.e. these variables are statistically insignificant.

My question is: what next? Do we remove all of these variables from the final regression equation, so that instead of 50 independent variables we have 42? Or do we remove one insignificant variable at a time and re-run the regression model to see whether any previously insignificant variable now becomes significant?
 

Miner

TS Contributor
#2
It is not a silly question.


I remove variables one at a time and reevaluate the model at each iteration. However, I am in a field (industrial statistics) where it is relatively easy to validate the final model. In fields where this is not feasible, the practice might be different.
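For anyone who wants to see the one-at-a-time procedure written out, here is a rough sketch in Python rather than SAS. It is only an illustration under stated assumptions, not the exact routine I or any package uses: the function names (`ols_p_values`, `backward_eliminate`) and the column names are made up, the data are simulated, the 0.05 cutoff is taken from the question, and the p-values use a normal approximation to the t distribution, which should be reasonable only when the sample is large (as with 24,000 observations).

```python
import math
import random


def solve_normal_equations(X, y):
    """Return (beta, inverse of X'X) via Gauss-Jordan elimination.

    Assumes X'X is invertible (no perfectly collinear columns).
    """
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    # Augment X'X with the identity matrix so elimination yields the inverse.
    aug = [xtx[i] + [1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    for col in range(k):
        # Partial pivoting: bring the largest remaining entry into the pivot row.
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    inv = [row[k:] for row in aug]
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    return beta, inv


def ols_p_values(X, y):
    """Two-sided p-values for each coefficient (normal approximation)."""
    n, k = len(X), len(X[0])
    beta, inv = solve_normal_equations(X, y)
    resid = [yi - sum(b * x for b, x in zip(beta, row)) for row, yi in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)  # residual variance
    p_values = []
    for j in range(k):
        z = beta[j] / math.sqrt(sigma2 * inv[j][j])
        p_values.append(math.erfc(abs(z) / math.sqrt(2)))
    return p_values


def backward_eliminate(columns, y, alpha=0.05):
    """Drop the least significant variable, refit, and repeat.

    columns maps a variable name to its list of values.
    The intercept is always kept and never tested for removal.
    """
    kept = list(columns)
    n = len(y)
    while kept:
        X = [[1.0] + [columns[name][i] for name in kept] for i in range(n)]
        p = ols_p_values(X, y)[1:]  # skip the intercept's p-value
        worst = max(range(len(kept)), key=lambda j: p[j])
        if p[worst] <= alpha:
            break  # every remaining variable is significant at alpha
        kept.pop(worst)  # remove only ONE variable, then re-evaluate
    return kept


# Simulated data: x1 and x2 truly affect y, the noise columns do not.
random.seed(0)
n = 500
columns = {
    name: [random.gauss(0, 1) for _ in range(n)]
    for name in ("x1", "x2", "noise1", "noise2")
}
y = [
    2.0 * columns["x1"][i] - 3.0 * columns["x2"][i] + random.gauss(0, 1)
    for i in range(n)
]
kept = backward_eliminate(columns, y)
print(kept)
```

In the simulated data the two real predictors should survive. A pure-noise column will usually be dropped but can occasionally slip through, since each one has roughly a 5% chance of looking significant by luck. That is one reason the final model still needs to be validated.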
 

Karabiner

TS Contributor
#3
"Now, when I run a regression model on this dataset"
For what purpose? What do you want to achieve with your analysis?
A model which can be generalized to other datasets? Maximum R²?
Theoretical insight about the relationships between your dependent
and independent variables? Or something else?

You could maybe also tell us what the study is about and where these data
come from and what the variables actually represent.

With kind regards

K.