Dummy Variables and Multiple Regression

#1
Hi, if someone could clear this up for me, it would be much appreciated.

Is it possible to create a single dummy variable that covers more than one outcome?

For example, if I had a column of numbers (say, the number of windows in a house) and I wanted to create a dummy variable so that every house with 1, 2, 3 or 4 windows gets a value of 1 and every other house gets a value of 0, is this possible, and if so, how do I do it in Excel?

I don't want to have to create a new column for each value (one for 1, one for 2, one for 3, etc.), especially as many of those columns would contain only a single 1.

Hope that makes sense :S. Thanks in advance.

:)
 

Dr.D

New Member
#2
Hi,

Dummy variables are created for variables with more than one outcome.

Each outcome becomes a new variable, so you would need a separate column (hence a separate variable) for each of 1, 2, 3 and 4. Under each column you will have 1s and 0s as responses, depending on the column. For houses with 1 window, you put 1 under the column '1'; for all other houses, you put 0.


Not sure how to do it automatically in Excel, but you can enter the values manually, or do it in SPSS.
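
For what it's worth, if all you need is the single grouped dummy from #1 (1 for houses with 1 to 4 windows, 0 otherwise), one column is enough; separate columns are only needed when each window count should get its own coefficient. In Excel, assuming the counts sit in column A starting at A2, something like =IF(AND(A2>=1,A2<=4),1,0) filled down should do it. Below is a minimal Python sketch of both approaches. The window counts are made up and the function names are hypothetical, just for illustration.

```python
# Illustrative data only: number of windows per house (made up).
windows = [1, 3, 6, 4, 2, 8, 5, 1]


def grouped_dummy(values, group):
    """Return 1 where the value belongs to `group`, else 0."""
    return [1 if v in group else 0 for v in values]


def separate_dummies(values, levels):
    """Return one 0/1 column per level (one dummy per outcome)."""
    return {level: [1 if v == level else 0 for v in values] for level in levels}


# Single dummy: 1 for houses with 1, 2, 3 or 4 windows, 0 otherwise.
few_windows = grouped_dummy(windows, {1, 2, 3, 4})
print(few_windows)

# One column per window count, as in the separate-column approach.
columns = separate_dummies(windows, [1, 2, 3, 4])
for level, column in columns.items():
    print(level, column)
```

Adding the four separate columns together row by row gives back the single grouped column, which is one way to check the two agree.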
 
#3
Okay, thanks. Think I've got my head round it...

Just another quick question. If you were doing a regression analysis and developing a model to, say, assess what factors cause people to buy cars, what would be an acceptable number of independent variables to have at the end, once you've worked through them and decided which ones to reject by looking at the individual coefficients?

I mean, a lot of the models I've looked at in textbooks etc. only seem to have about 3 independents, but at the moment my model is tending towards 6! Would this be a problem?

Also, one final thing. When you've done a correlation matrix, what values would suggest you should reject independents at this stage, and how much judgement can you use to say "yes, the correlation is low, but this might actually be significant for this study"?

Thank you soo much

:)
 

Dr.D

New Member
#4
Well, there is no recommended number of independent variables for regression models.

Researchers will tell you that concise models are best, but six are OK. Just remember the sample size rule of thumb: N > 50 + 8p (where p is the number of x-variables/predictors). This means that if you have 6 x-variables:

8 x 6 = 48, and 50 + 48 = 98, so you need more than 98 cases/observations for the regression analysis. Bear this in mind.
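
The same rule of thumb, N > 50 + 8p, as a small Python sketch in case you want to try other numbers of predictors. The function names are hypothetical, and the rule itself is only a guideline, not a hard requirement.

```python
def min_sample_size(p, base=50, per_predictor=8):
    """Threshold from the rule of thumb N > base + per_predictor * p."""
    return base + per_predictor * p


def enough_cases(n, p):
    """True if n observations exceed the threshold for p predictors."""
    return n > min_sample_size(p)


# Compare a 3-predictor textbook model with a 6-predictor model.
for p in (3, 6):
    print(p, "predictors: N should exceed", min_sample_size(p))
```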


When you look at the correlations, you should be able to see from the p-values which ones are significant, even if they are small. The usual convention for the size of a correlation is:

.1 = small
.3 = moderate
.5 = strong

To see whether a correlation is significant, check whether its p-value is smaller than .05.
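
As a rough sketch of how those two checks fit together, the Python snippet below labels each correlation by size and flags whether its p-value is below .05. The variable names and the (r, p) pairs are made up for illustration, and the "negligible" label for correlations under .1 is my own assumption rather than part of the convention.

```python
def correlation_size(r):
    """Label |r| using the .1 / .3 / .5 convention.

    'negligible' for |r| < .1 is an assumed label, not part of the convention.
    """
    size = abs(r)
    if size >= 0.5:
        return "strong"
    if size >= 0.3:
        return "moderate"
    if size >= 0.1:
        return "small"
    return "negligible"


def is_significant(p_value, alpha=0.05):
    """True if the p-value is below the chosen significance level."""
    return p_value < alpha


# Made-up (r, p-value) pairs for three hypothetical predictors.
results = {"income": (0.52, 0.001), "age": (0.12, 0.03), "distance": (-0.31, 0.20)}

for name, (r, p) in results.items():
    flag = "significant" if is_significant(p) else "not significant"
    print(name, correlation_size(r), flag)
```

Note that size and significance are separate questions: a small correlation can be significant, and a moderate one can fail to reach significance in a small sample.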