- Thread starter noetsi

If I get time this weekend I will see about setting up an example using the work.heart data set.

For time series, I think there is agreement that multivariate approaches add little predictive value relative to univariate models, so I will not be using multivariate approaches to predict, only to show relationships between variables.

Of note, you can't just jam all the variables into the model and expect it to identify the true/best model. It only optimizes over the terms it is given. So, for example, if you include a variable that is an effect of both the covariates and the outcome, the model doesn't know this and will treat it as just another covariate.

Lasso imposes a constraint on the beta coefficients. This is a modification of the typical least squares optimization problem.

See this wiki link under "Geometric Interpretation". https://en.wikipedia.org/wiki/Lasso_(statistics)

In this modified optimization there is a tuning parameter, lambda. When lambda = 0 we get the OLS beta coefficients. But as lambda increases, some betas are shrunk to exactly zero, and this is due to the shape of the constraint region. Basically, lasso is a trade-off between bias and variance: we trade more bias in the coefficients for less variance in our predictions. I think ridge regression can be used to mitigate multicollinearity.
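To make the lambda-tuning point concrete, here is a minimal sketch (not from the thread; the data are synthetic and scikit-learn's `alpha` parameter plays the role of lambda in the discussion above) showing coefficients being driven to exactly zero as the penalty grows:

```python
# Sketch: increasing the Lasso penalty shrinks coefficients toward zero,
# setting the irrelevant ones to exactly zero.
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
# Only the first two predictors actually matter in the true model.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

# alpha = 0 corresponds to plain OLS.
ols = LinearRegression().fit(X, y)
print("OLS:      ", np.round(ols.coef_, 2))

for alpha in [0.1, 1.0]:
    lasso = Lasso(alpha=alpha).fit(X, y)
    print(f"alpha={alpha}:", np.round(lasso.coef_, 2))
# As alpha grows, the irrelevant coefficients hit exactly 0 and the
# relevant ones are shrunk toward 0 (the bias we accept for less variance).
```

With a large enough penalty the three irrelevant coefficients are exactly zero, which is the variable-selection behavior being discussed.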



If you get rid of the intercept how do you deal with reference levels of the dummy variables?

Why would one want to use lasso to choose variables when you know it biases the results? Biasing the parameter estimates would seem to be the cardinal sin of regression. Nothing is worse.

Because it can improve your predictive ability. https://en.wikipedia.org/wiki/Bias–variance_tradeoff

This is on page 5 of this article about LASSO. It made me wonder....

"The following are OBSOLETE! • When testing interactions, main effects should also be included"

http://www.misug.org/uploads/8/1/9/1/8191072/bgillespie_variable_selection_using_lasso.pdf