Multiple response variable missing

#1
If I have a multiple response variable and none of the categories is checked, the subject should be coded as missing, right?

Example:
Race
RWhite
RBlack
RChinese
ROther

Participants can choose more than one option (1 = checked, 0 = not checked). If a participant did not check any of the categories, I should code all the race categories as missing (.) and NOT as 0s, correct? I believe that if he/she did not answer this question at all and I code all the categories as 0, I will under-represent the 1s, since there will be extra 0s from people who in reality should be excluded due to missing data.

Can anyone clarify this?
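To make the under-representation concrete, here is a small sketch with made-up data (the five participants below are hypothetical, not from any real survey). It compares the percentage of RWhite = 1 when the all-zero rows are kept as 0s versus when they are treated as missing and dropped from the denominator.

```python
# Hypothetical data: one tuple per participant in the order
# (RWhite, RBlack, RChinese, ROther); 1 = checked, 0 = not checked.
responses = [
    (1, 0, 0, 0),
    (0, 1, 0, 0),
    (1, 0, 0, 1),
    (0, 0, 0, 0),  # checked nothing: assumed to have skipped the question
    (0, 0, 0, 0),  # checked nothing: assumed to have skipped the question
]

def pct_white(rows):
    """Percentage of rows with RWhite checked."""
    return 100 * sum(r[0] for r in rows) / len(rows)

# Keep only participants who checked at least one category.
answered = [r for r in responses if any(r)]

pct_all_zero_kept = pct_white(responses)     # denominator is 5
pct_missing_excluded = pct_white(answered)   # denominator is 3

print(pct_all_zero_kept, round(pct_missing_excluded, 1))
```

With these numbers the share of White respondents is 40% if the two blank rows count as 0s, and about 66.7% if they are excluded as missing.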
Thanks!
Marvin
 

hlsmith

Less is more. Stay pure. Stay poor.
#2
If I follow you, yes: it should be classified as missing and should not be incorporated in the percentages.

However, depending on your purpose, how many missing values you have, and what other statistics you may run, some people classify all the missing cases as their own race group and use that group in analyses to make sure the missingness is not affecting the statistics/hypotheses. That would be the case if the data are not missing at random.
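A rough sketch of the "missing as its own group" idea, using made-up rows. The labels, the `race_groups` helper, and the "Missing" group name are assumptions for illustration only.

```python
from collections import Counter

# Hypothetical labels matching the order of the indicator columns.
LABELS = ("White", "Black", "Chinese", "Other")

# Hypothetical data: 1 = checked, 0 = not checked.
rows = [
    (1, 0, 0, 0),
    (1, 1, 0, 0),  # checked two categories
    (0, 0, 0, 0),  # checked nothing
    (0, 0, 1, 0),
]

def race_groups(row):
    """Return the checked labels, or ["Missing"] if nothing was checked."""
    checked = [label for label, flag in zip(LABELS, row) if flag == 1]
    return checked or ["Missing"]

# Count how many participants fall in each group (multi-checkers count in each).
counts = Counter(group for row in rows for group in race_groups(row))
print(counts)
```

The "Missing" group can then be carried into later comparisons to see whether people who skipped the question differ from the rest.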
 

noetsi

Fortran must die
#3
There are methods to address missing data, for example multiple imputation. This assumes you don't believe the data are missing at random.
 

hlsmith

Less is more. Stay pure. Stay poor.
#4
noetsi, I am familiar with imputation but not with the assumption that it is not missing at random. I can see pros and cons for this.
 
#5
So it is valid to code all the race variables as missing if a participant did not answer any of them. But if a participant checked at least one race, the others should be coded as 0. This is what I have been taught.

Id  rwhite  rblack  rchinese  rother
1   0       0       0         0
2   1       0       0         0


It should be:

Id  rwhite  rblack  rchinese  rother
1   .       .       .         .
2   1       0       0         0
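The recode above could be sketched like this. The variable names are lowercased versions of the ones in the first post, `recode_unanswered` is a hypothetical helper, and `None` stands in for the missing value (.). It also assumes a row of all 0s means the question was skipped, which depends on how the data were collected.

```python
# Assumed names for the four race indicator variables.
RACE_VARS = ["rwhite", "rblack", "rchinese", "rother"]

def recode_unanswered(record):
    """Return a copy of the record; if no race box is checked,
    set every race variable to None (missing) instead of 0."""
    out = dict(record)
    if not any(record[v] == 1 for v in RACE_VARS):
        for v in RACE_VARS:
            out[v] = None
    return out

# The two example participants as originally coded.
raw = [
    {"id": 1, "rwhite": 0, "rblack": 0, "rchinese": 0, "rother": 0},
    {"id": 2, "rwhite": 1, "rblack": 0, "rchinese": 0, "rother": 0},
]

recoded = [recode_unanswered(r) for r in raw]
for r in recoded:
    print(r)
```

Participant 1 ends up with all four variables missing, while participant 2 keeps the 1 for rwhite and 0s elsewhere.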
 

noetsi

Fortran must die
#7
hlsmith wrote: "noetsi, I am familiar with imputation but not with the assumption that it is not missing at random. I can see pros and cons for this."

If the data are missing completely at random (MCAR), I don't think it is seen as a major problem, and it is not something you would spend significant effort addressing. Commonly you use these methods when the missing data are not assumed to be MCAR.
 

hlsmith

Less is more. Stay pure. Stay poor.
#8
How were these data collected? Was each race category a question of its own, so that people could potentially either not answer or answer "no" to each one?