Dummy coding help for categorical variables


I have a questionnaire with 28 items, 20 of which are Likert attitude questions. There are 5 main variables (constructs?), each measured by 4 items that will be averaged. These are the main dependent variables (e.g., one variable is student autonomy level).
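To make the averaging step concrete, here is a minimal Python sketch of turning 20 Likert items into 5 construct scores. The item numbering (items 1-4 for the first construct, and so on) and every construct name except autonomy are placeholders assumed for illustration, not taken from the actual questionnaire.

```python
from statistics import mean

# Assumed layout: four consecutive items per construct.
# Only "autonomy" is a real construct name; the rest are placeholders.
CONSTRUCTS = {
    "autonomy": [1, 2, 3, 4],
    "construct_2": [5, 6, 7, 8],
    "construct_3": [9, 10, 11, 12],
    "construct_4": [13, 14, 15, 16],
    "construct_5": [17, 18, 19, 20],
}

def construct_scores(responses):
    """Average the four Likert ratings that belong to each construct.

    responses: dict mapping item number -> rating for one respondent.
    """
    return {
        name: mean(responses[item] for item in items)
        for name, items in CONSTRUCTS.items()
    }

# One made-up respondent: rating 3 everywhere except the autonomy items.
respondent = {item: 3 for item in range(1, 21)}
respondent.update({1: 4, 2: 5, 3: 3, 4: 4})
scores = construct_scores(respondent)
print(scores["autonomy"])  # average of 4, 5, 3, 4
```

Each respondent ends up with five scores, and those five numbers would be the dependent variables in the later analyses.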

The independent variables will be 6 items covering demographics: age (continuous), gender (dichotomous, m/f), major (dichotomous), years of study as a distance student (1, 2, 3, 4), previous experience as a distance student (dichotomous, yes/no), and status as a student (three categories).

I want to do ANOVAs and chi-square tests to compare the strength of the 5 dependent variables.

I also want to do a multiple regression to see how well the independent variables predict the dependent ones. For example, as age increases, does student autonomy level increase? I will also look at interactions with the other variables such as gender. For example, are changes in autonomy due to age different for males and females?
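A rough sketch of what the age-by-gender interaction would look like as regression input, assuming the Male=1, Female=2 coding from Question 1 below. The function name `design_row` is made up for illustration, and this only builds the predictor columns; it does not fit the model.

```python
def design_row(age, gender_code):
    """Build one row of predictors: [intercept, age, female, age x female].

    gender_code uses the raw coding Male=1, Female=2 and is shifted to 0/1.
    """
    if gender_code not in (1, 2):
        raise ValueError(f"expected gender code 1 or 2, got {gender_code!r}")
    female = gender_code - 1  # Male -> 0, Female -> 1
    # The product column is zero for males, so its coefficient measures
    # how much the age slope for females differs from the slope for males.
    return [1, age, female, age * female]

print(design_row(30, 1))  # a 30-year-old male
print(design_row(30, 2))  # a 30-year-old female
```

With gender coded 0/1, the interaction column is simply age for one group and zero for the other, which is what makes the "different slopes" question readable from a single coefficient.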

My questions are:

Question 1) I want to start by putting the raw questionnaire data into an Excel document. Would I code the categorical data like this?


Faculty (32) Home Ed =1, Hum Dev=2
Gender (33) Male=1, Female=2
Age (34) Continuous
Ent Term (35) Fall 2006=1, Spring 2006=2, 2005=3, 2004=4
Status (36) Full-time=1, Part-time=2, Other=3
Prev Exp (37) Yes=1, No=2

Question 2) Before I run my statistical analysis in SPSS, do I need to change the above coding so that 0s are used?

Faculty (32) Home Ed =0, Hum Dev=1
Gender (33) Male=0, Female=1
Age (34) Continuous
Ent Term (35) Fall 2006=0, Spring 2006=1, 2005=2, 2004=3
Status (36) Full-time=0, Part-time=1, Other=2
Prev Exp (37) Yes=0, No=1
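For comparison, here is a small Python sketch of the two kinds of recoding involved, under stated assumptions. A two-category variable can simply be shifted from 1/2 to 0/1. A three-category variable such as Status is, under the usual dummy-coding convention, split into separate 0/1 columns with one category left out as the reference, rather than kept as a single 0/1/2 column. The function names, column names, and the choice of Full-time as the reference are illustrative assumptions, not SPSS behaviour.

```python
# Codes copied from the Question 1 list; the labels are made up for column names.
STATUS_LABELS = {1: "full_time", 2: "part_time", 3: "other"}

def recode_binary(code):
    """Shift a 1/2 code to 0/1 (e.g. Male=1, Female=2 becomes 0, 1)."""
    if code not in (1, 2):
        raise ValueError(f"expected 1 or 2, got {code!r}")
    return code - 1

def dummy_code_status(code, reference=1):
    """Turn a Status code into 0/1 indicator columns.

    The reference category gets no column of its own; a respondent in the
    reference category has 0 in every indicator column.
    """
    if code not in STATUS_LABELS:
        raise ValueError(f"unknown status code {code!r}")
    return {
        f"status_{label}": int(code == value)
        for value, label in STATUS_LABELS.items()
        if value != reference
    }

print(recode_binary(2))      # Female under the 1/2 coding
print(dummy_code_status(2))  # Part-time
print(dummy_code_status(1))  # Full-time, the assumed reference category
```

Three categories become two columns here, so each regression coefficient compares one category against the reference instead of treating 0, 1, 2 as evenly spaced amounts.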

Question 3) Why? I don't really understand dummy coding. Is it just to make the statistical analysis work out right? Do I even need to understand the "why" of dummy coding? Wouldn't it be easier to use the same coding in Excel and SPSS?

Many thanks...