Handling missing data

#1
1. Do you think it makes sense to impute missing demographic information (e.g. gender, race, age) using the EM algorithm?

2. I am using AMOS to test a latent mediation model.

I know that I can use AMOS's FIML method and keep all cases with missing values, but I feel that I should delete the cases that have no responses on any of the 7 items of the dependent variable.

I wonder whether I can use FIML even when respondents gave no responses for the dependent variable (but did respond to the items for the predictors and mediators).

3. Because AMOS cannot perform bootstrapping with missing data, I think I will use the Sobel test instead of bootstrapping.
I wonder whether it would be better to impute the missing values using the EM algorithm and then bootstrap. In that case, there are many missing values for the demographics, and I am not comfortable imputing demographic variables. (I will run a two-group analysis by gender and by race, and I am not sure whether I can run a two-group analysis using imputed gender values.)
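
For reference, the Sobel test is simple enough to compute by hand from two path estimates and their standard errors. Below is a minimal Python sketch, assuming a and b are the unstandardized predictor-to-mediator and mediator-to-outcome paths and se_a, se_b are their standard errors as read off the AMOS output. The function names and the example numbers are made up for illustration.

```python
import math

def sobel_z(a, se_a, b, se_b):
    """First-order Sobel z statistic for the indirect effect a*b."""
    return (a * b) / math.sqrt(b**2 * se_a**2 + a**2 * se_b**2)

def two_sided_p(z):
    """Two-sided p-value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical path estimates, not taken from any real output
z = sobel_z(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
p = two_sided_p(z)
print(f"z = {z:.3f}, p = {p:.4f}")
```

The test assumes the product a*b is normally distributed, which is the usual reason bootstrapping is preferred when it is available.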

Your opinions would be a big help for my research. I couldn't find any information about these questions.

Thank you in advance.
 
#3
General question: what type of missingness do you have? Not all missingness can be successfully imputed.
Little's MCAR test was significant at the .001 level, indicating that the data are not MCAR.
I am not sure how I can tell whether the missingness is MAR or not.
I got significantly different paths between girls and boys, and I think these results are very meaningful.
However, given that 5% of the values for gender are missing, I wonder whether I can use these results.
- The sample sizes are different. The whole-group analysis used about 880 cases, whereas the two-group analyses used about 840 cases because of the missing values on gender. Would that be okay?
 

hlsmith

Omega Contributor
#4
Do you have a theory for the missingness? Can the missingness be explained by other variables in the dataset or by extraneous variables not in the dataset?
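
One rough way to look at the first part of that question is to build a missingness indicator for the variable (e.g. gender) and see whether an observed variable differs between cases with and without it. The Python sketch below uses Welch's t statistic. The toy rows, the column names and the helper names are all hypothetical. A large difference suggests the missingness is related to observed data (so not MCAR), but no check of this kind can separate MAR from NMAR, since that depends on the values that were never observed.

```python
import math
from statistics import mean, variance

def welch_t(xs, ys):
    """Welch's t statistic for the difference in means of two samples."""
    se = math.sqrt(variance(xs) / len(xs) + variance(ys) / len(ys))
    return (mean(xs) - mean(ys)) / se

def missingness_t(rows, target, covariate):
    """Compare `covariate` between rows where `target` is missing vs observed."""
    miss = [r[covariate] for r in rows
            if r[target] is None and r[covariate] is not None]
    obs = [r[covariate] for r in rows
           if r[target] is not None and r[covariate] is not None]
    return welch_t(miss, obs)

# Hypothetical toy data: None marks a missing response
rows = [
    {"gender": "F", "age": 14},
    {"gender": "M", "age": 15},
    {"gender": None, "age": 17},
    {"gender": "F", "age": 13},
    {"gender": None, "age": 18},
    {"gender": "M", "age": 14},
]
t = missingness_t(rows, "gender", "age")
print(f"Welch t for age by gender-missingness: {t:.2f}")
```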
 
#5
Do you have a theory for the missingness? Can the missingness be explained by other variables in the dataset or by extraneous variables not in the dataset?
My guess is NMAR. I decided to try the EM algorithm anyway so that I can bootstrap in AMOS.
I also decided to exclude the two participants who did not respond to any of the 7 items for the dependent variable.
I am not sure whether it would be better to keep them and impute their missing values even though they did not provide any responses for the dependent variable
(they did answer the other questions about the predictors).
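
To make the bootstrapping step concrete, here is a minimal Python sketch of a percentile bootstrap for an indirect effect. It is not what AMOS does internally: it assumes complete data and a simple observed-variable model (x to m to y, estimated by ordinary least squares) rather than a latent model. The data are simulated and all names are hypothetical. If the data were filled in by a single EM imputation first, the bootstrap would treat the imputed values as if they were observed, so the interval would tend to be too narrow.

```python
import random
from statistics import mean

def _sxy(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = mean(u), mean(v)
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))

def indirect_effect(x, m, y):
    """a*b, with a from m ~ x and b the coefficient of m in y ~ m + x."""
    sxx, smm, sxm = _sxy(x, x), _sxy(m, m), _sxy(x, m)
    a = sxm / sxx
    b = (_sxy(m, y) * sxx - _sxy(x, y) * sxm) / (smm * sxx - sxm**2)
    return a * b

def bootstrap_ci(x, m, y, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    return (estimates[int(alpha / 2 * n_boot)],
            estimates[int((1 - alpha / 2) * n_boot) - 1])

# Simulated complete data (hypothetical): true a = 0.5, true b = 0.4
rng = random.Random(42)
x = [rng.gauss(0, 1) for _ in range(200)]
m = [0.5 * xi + rng.gauss(0, 1) for xi in x]
y = [0.4 * mi + rng.gauss(0, 1) for mi in m]
lo, hi = bootstrap_ci(x, m, y)
print(f"indirect effect = {indirect_effect(x, m, y):.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
```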