# Lasso regression

#### noetsi

##### Loves R
Hey hlsmith, did I do this correctly? All the variables, including the dummy variable, have two levels.

First I randomly split my data into two pieces and ran this code:

Code:
ods graphics on;
proc glmselect data=randomdata plots=all;
   partition fraction(validate=0.3);
   class pd1 pd2 pd3 pd4 pd5 pd6 pd7 pd8 pd9 pd10 pd11 pd12 pd13 pd14 pd15 pd16 pd17
         pd18 pd19 pd20 pd21 pd22 pd23 pd24 pd25 pd26 pd27 pd28 pd29 pd30 pd31;
   model dvd = pd1 pd2 pd3 pd4 pd5 pd6 pd7 pd8 pd9 pd10 pd11 pd12 pd13 pd14 pd15 pd16 pd17
               pd18 pd19 pd20 pd21 pd22 pd23 pd24 pd25 pd26 pd27 pd28 pd29 pd30 pd31
         / selection=lasso(stop=none choose=validate);
run;
ods graphics off;

But this message showed up, so I don't know whether there is an error or not.

Selection stopped because all candidate effects for entry are linearly dependent on effects in the model.

This is what I ended up with. So this suggests these are the most important variables, right?

Parameter Estimates

| Parameter | DF | Estimate |
| --- | --- | --- |
| Intercept | 1 | 1.034753 |
| pd3_0 | 1 | -0.114219 |
| pd6_0 | 1 | -0.011136 |
| pd7_0 | 1 | -0.131273 |
| pd10_0 | 1 | 0.054962 |
| pd11_0 | 1 | 0.019537 |
| pd14_0 | 1 | -0.085000 |
| pd17_0 | 1 | -0.137026 |
| pd19_0 | 1 | -0.020757 |
| pd20_0 | 1 | -0.238826 |
| pd21_0 | 1 | -0.010070 |
| pd22_0 | 1 | -0.085069 |
| pd23_0 | 1 | -0.169836 |
| pd26_0 | 1 | -0.016727 |
| pd28_0 | 1 | -0.024876 |
| pd29_0 | 1 | -0.135407 |
| pd30_0 | 1 | 0.047060 |


#### noetsi

##### Loves R
hlsmith, do you know what this means for LASSO and whether it's important?

I think that always happens when you choose cross validation for LASSO, but I am not sure.

Also, does this warning matter for adaptive lasso/lasso?

WARNING: The adaptive weights for the LASSO method are not uniquely determined because the full least squares model is singular.

It still makes model selections despite these errors, and I can use the selected variables in logistic regression without generating any warnings.

#### noetsi

##### Loves R
It might be tied to an issue that appears when I run logistic regression on the reduced data set used to choose the LASSO variables.

This does not occur when I have the entire data set, but I split it into two pieces first: one to choose the lasso variables, and one to run the logistic regression. I thought the whole point of lasso was to choose variables when you have too little data relative to your number of variables. My data set is 224 cases after being split, which is not that small, although there are only 24 cases in one level of the DV.

I guess this means I can't use lasso on this data set. It does still estimate the lasso coefficients, but I am not sure they are valid given this error.

#### hlsmith

##### Less is more. Stay pure. Stay poor.
How are you splitting the data? Perhaps split it putting a little more data into the training set, and use a different seed for the splitting. There is at least one subgroup, based on the predictors, that is getting perfectly predicted, which is likely a '0' outcome.

Are you then fitting the model on the holdout set?
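A sketch of the adjustment hlsmith suggests, assuming the same GLMSELECT setup as above; the 80/20 split and the seed value are arbitrary illustrations, not recommendations:

```sas
/* SEED= controls the pseudo-random split used by PARTITION,    */
/* and VALIDATE=0.2 puts more of the data into training (80/20) */
proc glmselect data=randomdata plots=all seed=12345;
   partition fraction(validate=0.2);
   class pd1-pd31;
   model dvd = pd1-pd31 / selection=lasso(stop=none choose=validate);
run;
```

Re-running with a different SEED= value produces a different random split; if the "linearly dependent" message appears only for some seeds, that points to a sparse subgroup in those particular splits.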

#### noetsi

##### Loves R
This is what I was going to do. But splitting the data in two this way leaves only about 224 non-missing cases, of which only 24 are at one level of the DV. I don't think that is enough to actually analyze with the logistic regression; I am talking about the holdout data set used for that regression.

Given the warnings, I don't think the lasso run is actually generating the correct variable list, or at least I am not sure it is. I have been unable to find documentation on whether the list of variables generated is accurate when these warnings occur.

Given the coding, how do I change the amount that goes into training for LASSO? And I don't know what you mean by a different seed.

Do you know if given the warnings the lasso generates reasonable results?

#### hlsmith

##### Less is more. Stay pure. Stay poor.
So "data=randomdata" is 100% of your data?

If not, run the model with 100% of the data and see if you get the error. Then follow up.

#### noetsi

##### Loves R
randomdata is the half of the overall data I used to do lasso.

Even when using the entire data set to do LASSO I get the following warning:
WARNING: The adaptive weights for the LASSO method are not uniquely determined because the full least squares model is singular.

and in the results I get the same message as before. What is strange is that when I ran logistic regression with 31 predictors on the same data set, I got no warnings at all, so it works fine.

I thought lasso was intended to work with small data sets relative to the number of predictors (sparse data), but it does not seem to work that way.

#### hlsmith

##### Less is more. Stay pure. Stay poor.
It still has the same underpinnings as regression, but incorporates penalties that regularize coefficients toward zero. I'll look at the code tomorrow; perhaps SAS isn't using logistic as its base. I remember they didn't originally offer it with base SAS.
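For reference, the penalty in question is the L1 term in the standard lasso objective (generic notation, not tied to the SAS output above):

$$\hat{\beta} \;=\; \arg\min_{\beta}\; \sum_{i=1}^{n}\bigl(y_i - x_i^{\top}\beta\bigr)^2 \;+\; \lambda \sum_{j=1}^{p}\lvert\beta_j\rvert$$

Larger values of the tuning parameter $\lambda$ shrink more coefficients exactly to zero, which is what drops variables from the model; selecting $\lambda$ is what options like choose=validate or choose=sbc are doing.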

#### noetsi

##### Loves R
proc glmselect only runs linear models, although for selection purposes my understanding is that this does not matter.
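(For what it is worth, if the linear approximation ever becomes a concern, PROC HPGENSELECT can run the lasso on an actual logistic model. A minimal sketch, assuming dvd is coded 0/1 and the same predictor names; all other options left at their defaults:)

```sas
proc hpgenselect data=dvddu;
   class pd1-pd31;
   model dvd(event='1') = pd1-pd31 / dist=binary link=logit;
   selection method=lasso(choose=sbc stop=none);
run;
```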

I decided the issue was the use of K-fold validation, which requires larger data sets than I have, so I tried using SBC as the criterion instead of validation. I got no warning this time when I ran the code, but the note still appears in the results (I have not been able to figure out whether it is an error or something that always occurs when you run lasso).

I am also not sure I ran the code correctly; for whatever reason I always get overwhelmed by the SAS documentation.
Code:
ods graphics on;
proc glmselect data=dvddu plots=all;
   /* partition fraction(validate=0.3); */
   class pd1 pd2 pd3 pd4 pd5 pd6 pd7 pd8 pd9 pd10 pd11 pd12 pd13 pd14 pd15 pd16 pd17
         pd18 pd19 pd20 pd21 pd22 pd23 pd24 pd25 pd26 pd27 pd28 pd29 pd30 pd31;
   model dvd = pd1 pd2 pd3 pd4 pd5 pd6 pd7 pd8 pd9 pd10 pd11 pd12 pd13 pd14 pd15 pd16 pd17
               pd18 pd19 pd20 pd21 pd22 pd23 pd24 pd25 pd26 pd27 pd28 pd29 pd30 pd31
         / selection=lasso(stop=none choose=sbc);
run;
ods graphics off;
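A note on the adaptive-weights warning: SAS issues it for the ADAPTIVE suboption of LASSO, which derives per-coefficient weights from a full least squares fit, and with collinear dummies that fit is singular. The code above uses plain LASSO; for comparison, the adaptive variant would look like this (a sketch only, same variable list assumed, written with the pd1-pd31 shorthand):

```sas
/* ADAPTIVE derives per-coefficient weights from a preliminary full
   least squares fit; if that fit is singular, SAS warns that the
   adaptive weights are "not uniquely determined" */
proc glmselect data=dvddu plots=all;
   class pd1-pd31;
   model dvd = pd1-pd31
         / selection=lasso(adaptive stop=none choose=sbc);
run;
```

Plain selection=lasso skips that preliminary fit, which is why it does not trigger the adaptive-weights warning.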

#### hlsmith

##### Less is more. Stay pure. Stay poor.
Yeah, that is a good book. Next you have to stumble onto their book *Computer Age Statistical Inference* and read the chapter that says LASSO is the same as Bayesian models in some cases.

#### noetsi

##### Loves R
SBC is BIC, which I think stands for Bayesian Information Criterion.

Most of the book was beyond me, but I figured it would be useful to others. I did find some of the early material useful.

I am not sure what you mean by this:
"Next you have to stumble onto their book statistical inference in the computer age"

I don't even try to learn Bayesian methods. I would be delighted just to learn the basics of regression...

#### hlsmith

##### Less is more. Stay pure. Stay poor.
The book below. Well, if you are going to use a procedure, it is good to know what it is doing - regularizing, much like Bayesian priors do. Superficially learning about many, if not all, concepts helps them eventually click in the future. Quit wearing blinders and accept the idea that all procedures are approachable and related.

https://web.stanford.edu/~hastie/CASI_files/PDF/casi.pdf