# Interpreting dummy variables.

#### noetsi

##### Fortran must die
After all these years reading regression this should be simple to do...

The "Impact" values are the regression slopes for the dummy variables.

I should say that the excluded reference group here does not seem like a good choice to me: it is those under 16, of whom we have extremely few and who most likely earn very little. I cannot change it; it was decided by the federal government.

That said, I don't see how every dummy variable can be positive. Some have to earn less than others. Is there a way to say (I have not seen this addressed) that one level did better relative to another level? Formally, as I understand it, you are comparing customers in the category to those not in it. But my audience will want us to discuss how one level did relative to another.
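To make the question concrete, here is a minimal sketch with made-up numbers (the level names and mean earnings are hypothetical, not from the actual data). It assumes the model contains only this one set of dummies, in which case each slope is that level's mean minus the reference group's mean. Under that assumption every slope can be positive simply because the reference group earns the least, and two levels can be compared by subtracting their slopes.

```python
# Hypothetical mean earnings by level; "under_16" stands in for the reference group.
group_means = {
    "under_16": 2000.0,
    "16_24": 9000.0,
    "25_44": 21000.0,
    "45_plus": 18000.0,
}
reference = "under_16"

# With a single set of dummies, each OLS slope equals the level's mean
# minus the mean of the excluded reference group.
slopes = {
    level: mean - group_means[reference]
    for level, mean in group_means.items()
    if level != reference
}

def contrast(level_a, level_b):
    """Difference between two non-reference levels: slope_a - slope_b."""
    return slopes[level_a] - slopes[level_b]

print(slopes)                          # every slope is positive here
print(contrast("45_plus", "25_44"))    # negative: 45_plus earns less than 25_44
```

With other predictors in the model the slopes are adjusted differences rather than raw mean differences, but the difference between two slopes from the same set of dummies still compares those two levels directly. Getting a standard error for that difference needs the covariance of the two estimates, which this sketch does not cover.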

What I did say [not certain this is true formally for regression]

#### hlsmith

##### Less is more. Stay pure. Stay poor.
Yeah, what you have seems fine. And yes, you are correct that the estimates you grab from the multiple linear regression apply to the base case, that is, with the other variables at their reference levels, so your presentation above seems fine.

#### noetsi

##### Fortran must die
I am running PROC GENMOD, which here is essentially OLS (the distribution is Normal and the link function is Identity). My dependent variable has two levels (0 and 1). As I understand it, the slope is therefore the increase in the chance of being in one of these levels (I believe level 1, but the documentation leaves me uncertain). Or the decrease, of course, if the slope is negative.

With dummy variables it is the mean difference as always, but it still reflects the increased (or decreased) chance of being at one of the levels (again, I assume this is level 1).
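A small sketch of that reading, using made-up data and plain Python rather than SAS: with a 0/1 dependent variable and a single dummy predictor, the OLS slope works out to the difference in the proportion of 1s between the two groups, and the intercept is the proportion of 1s in the reference group. The data below are hypothetical, and the one-predictor setup is an assumption made to keep the arithmetic visible.

```python
# Hypothetical data: x is a dummy (1 = in the category), y is the binary DV.
x = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
y = [0, 0, 0, 1, 0, 1, 1, 1, 1, 0]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Closed-form simple OLS: slope = Sxy / Sxx, intercept = mean_y - slope * mean_x.
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
s_xx = sum((xi - mean_x) ** 2 for xi in x)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# Proportion of y == 1 within each group.
y_in_cat = [yi for xi, yi in zip(x, y) if xi == 1]
y_in_ref = [yi for xi, yi in zip(x, y) if xi == 0]
p_cat = sum(y_in_cat) / len(y_in_cat)
p_ref = sum(y_in_ref) / len(y_in_ref)

print(slope, p_cat - p_ref)   # the two values agree up to rounding
print(intercept, p_ref)       # likewise
```

Because the numeric values 0 and 1 enter the model directly, the slope here describes the change in the proportion of 1s. Recoding the DV as `1 - y` would flip the sign of the slope.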

I ask this because PROC LOGISTIC, unlike most software, by default models the chance of being at level 0, not 1.

#### hlsmith

##### Less is more. Stay pure. Stay poor.
The output or log likely tells you which are the reference groups for the DV and the IVs.

Also, as mentioned before, this model would be kicking out probability values and using MLE.

#### noetsi

##### Fortran must die
I don't understand what you mean by "this model would be kicking out the probability values."

Do you mean that it shows the increased probability of being at a certain level of the DV?

#### hlsmith

##### Less is more. Stay pure. Stay poor.
You are putting a binary DV into a linear model (dist=normal). You aren't gonna get log odds out of it, correct?
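As a rough illustration of the difference, here is a sketch with hypothetical proportions (not from the data in this thread). For a single dummy predictor, an identity-link model estimates a difference in probabilities, while a logit-link model estimates a difference in log odds. Switching which DV level is modelled only flips the sign of either quantity.

```python
import math

# Hypothetical proportions of DV == 1 in the reference group and in the dummy's category.
p_ref, p_cat = 0.25, 0.40

def log_odds(p):
    """Log odds (logit) of a probability strictly between 0 and 1."""
    return math.log(p / (1 - p))

# Identity link: the dummy's coefficient is a difference in probabilities.
prob_diff = p_cat - p_ref

# Logit link: the dummy's coefficient is a difference in log odds (a log odds ratio).
log_odds_ratio = log_odds(p_cat) - log_odds(p_ref)

# Modelling level 0 instead of level 1 changes only the sign of each quantity.
prob_diff_level0 = (1 - p_cat) - (1 - p_ref)
log_odds_ratio_level0 = log_odds(1 - p_cat) - log_odds(1 - p_ref)

print(prob_diff, log_odds_ratio)
print(prob_diff_level0, log_odds_ratio_level0)
```

So a slope from the linear model reads directly as a change in probability, with no back-transformation from the log-odds scale.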