Data cleaning for a categorical variable

#1
Hi,

I am trying to run a chi-square test on 200 observations from my dataset on Internet use. Only one observation has gender reported as Other. It gives me an extreme value when I include it in the chi-square table. Would it be correct to code it as a missing value, or is deleting the entire observation a better way to deal with it?

Thanks.

Sandeep
 

hlsmith

Not a robit
#2
Dropping it seems like the simplest action, given that it is a single observation. That would mean you modified your exclusion criteria post hoc, so you cannot generalize the results to that gender group, which you probably weren't planning on doing anyway.
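To see why a single observation can be awkward in a chi-square table, here is a rough sketch in plain Python. The counts are made up for illustration (they are not the poster's data), and the helper names `expected_counts` and `chi_square` are hypothetical. It compares the expected cell counts and the Pearson statistic with and without the one-person "Other" row.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(table):
    """Pearson chi-square statistic: sum of (observed - expected)^2 / expected."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical 200 observations: gender by internet use (high, low).
full = {"Male": [60, 40], "Female": [55, 44], "Other": [1, 0]}

full_table = list(full.values())
# Dropping the single "Other" observation leaves a 2x2 table of 199 people.
reduced_table = [counts for label, counts in full.items() if label != "Other"]

min_expected_full = min(min(row) for row in expected_counts(full_table))
min_expected_reduced = min(min(row) for row in expected_counts(reduced_table))
stat_full = chi_square(full_table)
stat_reduced = chi_square(reduced_table)

print("smallest expected count, full table:   ", round(min_expected_full, 2))
print("smallest expected count, reduced table:", round(min_expected_reduced, 2))
print("chi-square, full table:   ", round(stat_full, 3))
print("chi-square, reduced table:", round(stat_reduced, 3))
```

With a row total of 1, the expected counts in that row are necessarily below 1, which is well under the usual rule of thumb of about 5 per cell for the chi-square approximation.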
 
#3
@hlsmith Thanks. This makes perfect sense. Just out of curiosity, what if I code "Other" in gender as a missing value? Do you think that would change any of my statistics or interpretation?
 

hlsmith

Not a robit
#4
Well, coding it as missing will do one of two things: 1.) every time you look at counts with percentages, it will add another group called missing with a value of 1; or 2.) it will be ignored. Which one depends on your software and the options used.
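As a small illustration of those two behaviours, here is a sketch in plain Python with made-up values, where `None` stands in for the "Other" response recoded as missing. The `percentages` helper is hypothetical; real statistics packages expose this as an option on their frequency tables.

```python
from collections import Counter

# Hypothetical gender column; None marks the value recoded as missing.
gender = ["Male", "Female", "Female", "Male", None, "Female"]

# Behaviour 1: missing is reported as its own group.
counts_with_missing = Counter("Missing" if g is None else g for g in gender)

# Behaviour 2: missing is silently ignored, so the denominator shrinks.
counts_ignoring_missing = Counter(g for g in gender if g is not None)


def percentages(counts):
    """Convert a table of counts into percentages of its own total."""
    total = sum(counts.values())
    return {label: 100 * n / total for label, n in counts.items()}


print(percentages(counts_with_missing))
print(percentages(counts_ignoring_missing))
```

Note that the percentages for the other groups differ between the two tables because the denominators differ.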

When running some procedures, such as regression, many programs will perform what is called listwise deletion. That means the entire row will be excluded from the model if a variable with a missing value is used in it.
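A minimal sketch of listwise deletion in plain Python, assuming a tiny made-up dataset and a hypothetical helper `listwise_delete`. It is only meant to show that a row is dropped when a variable used in the model is missing, and kept when the missing variable is left out of the model.

```python
# Hypothetical rows; None marks a missing value.
rows = [
    {"gender": "Male", "age": 34, "hours_online": 12.0},
    {"gender": None, "age": 29, "hours_online": 20.5},   # gender recoded as missing
    {"gender": "Female", "age": None, "hours_online": 8.0},
    {"gender": "Female", "age": 41, "hours_online": 15.5},
]


def listwise_delete(rows, model_vars):
    """Keep only rows that are complete on every variable used in the model."""
    return [r for r in rows if all(r[v] is not None for v in model_vars)]


# A model that uses gender loses the row where gender is missing.
with_gender = listwise_delete(rows, ["gender", "hours_online"])

# A model that does not use gender keeps that row, but loses the row missing age.
without_gender = listwise_delete(rows, ["age", "hours_online"])

print(len(rows), len(with_gender), len(without_gender))
```

So the observation coded as missing on gender only disappears from analyses that actually include gender.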