Proc MI fully conditional specification

#1
Hi everybody,

I'm using multiple imputation with the fully conditional specification (FCS) method in SAS (PROC MI, FCS statement) to replace missing values in 26 variables. I chose FCS because the missing value pattern is arbitrary. All variables except one have missing values, with the percentage missing per variable ranging from 0.2% to 36.4% (median 14.1%). This is the first time I'm conducting multiple imputation, and I assume that is a lot of values to impute. All variables are binary, so I used the logistic option in the FCS statement. The syntax is as follows.

proc mi data = datain out = dataout nimpute = 5 seed = 123456;
   class var1 var2 ...var25 var26;
   fcs nbiter=100
      logistic(var1/details)
      logistic(var2/details)
      ...
      logistic(var25/details)
      logistic(var26/details);
   var var1 var2 ... var25 var26;
run;

Variables in the var statement were sorted by % missing values (descending).

Although the procedure ran to completion, I got the warning message: "The maximum likelihood estimates for the FCS method logistic model for variable var1 in an iteration process may not exist. The resulting posterior predictive distribution of the parameters used in the imputation process is based on the maximum likelihood estimates in the last maximum likelihood iteration."

This message was shown for about 9 of the 26 variables. For those variables (which have low sample proportion estimates in the non-imputed set), the sample proportion estimates after imputation were considerably higher than in the non-imputed set. Although I do not understand what could have gone wrong in the statistics behind the FCS method, I presumed that variables with low sample proportions were perhaps not appropriate to use as predictors in the imputation. For now, I specified an explicit model in the FCS statement for each variable that triggered a warning, each time leaving out all the other variables that triggered warnings.
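
A common reason for maximum likelihood estimates not to exist in a logistic model is complete or quasi-complete separation, which tends to show up as empty or near-empty cells when a rare outcome is cross-tabulated against a predictor. Whether that is what happened here is an assumption, not something the warning confirms. A minimal sketch of how such cells could be inspected before imputing, using the dataset and variable names from the syntax above (var9 is just an arbitrary predictor picked for illustration):

```sas
/* Sketch only: datain, var1 and var9 follow the naming used in this post. */
/* Cross-tabulate a variable that triggered the warning against one        */
/* predictor. Cells with zero or very few observations hint at separation. */
/* PROC FREQ drops missing values from the table by default.               */
proc freq data=datain;
   tables var1*var9 / nocol nopercent;
run;
```

Repeating this for each predictor of a flagged variable would show which combinations are too sparse to support the logistic model.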

Suppose the warning was shown for var1 through var8:

proc mi data = datain out = dataout nimpute = 5 seed = 123456;
   class var1 var2 ...var25 var26;
   fcs nbiter=100
      logistic(var1 = var9 var10 var11 var12 ... var25 var26/details)
      logistic(var2 = var9 var10 var11 var12 ... var25 var26/details)
      ...
      logistic(var25/details)
      logistic(var26/details);
   var var1 var2 ... var25 var26;
run;

I no longer received any warning messages. Sample proportion estimates calculated from the imputed dataset were acceptable (although still somewhat inflated for variables with low sample proportions).
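
The comparison described here can be sketched as follows. This assumes the dataset names datain and dataout from the first PROC MI call and relies on the _Imputation_ index variable that PROC MI adds to the output dataset; var1 stands in for any of the flagged variables.

```sas
/* Sketch only: datain, dataout and var1 follow the naming used in this post. */

/* Proportion among the observed (non-missing) values in the original data */
proc freq data=datain;
   tables var1;
run;

/* Proportion within each completed dataset; the row percentages give */
/* the distribution of var1 separately for each value of _Imputation_ */
proc freq data=dataout;
   tables _Imputation_*var1 / nocol nopercent;
run;
```

If the row percentages differ sharply from the observed proportion in every completed dataset, that points to the imputation model rather than to chance variation between imputations.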

My questions:
- is this good practice; am I trying to impute too many missing values?
- what is the exact problem causing the warning messages?
- did I address the problem in an appropriate way or should I do something else?

thanks in advance for any reply or comment,

Kind regards,

Philippe
 

noetsi

#2
One thing that may be an issue is that the FCS algorithm sometimes does not converge. In theory this has to be checked for every variable separately (in practice I suspect most people don't do that when there are a lot of variables). It may be that the algorithm is converging for some variables but not others. Trace plots are recommended to check for this. A major hang-up is that trace plots apparently only work with interval variables, and to date I have found no way to check non-interval variables for convergence failure. I think one solution for non-convergence is changing the number of burn-in iterations (the nbiter= option).

Code for trace plots, in case they help:

ods graphics on;
proc mi data=Fish3 seed=1305417 out=outex8;
   class Species;
   /* plots=trace requests trace plots for the FCS iterations */
   fcs plots=trace
      logistic(Species= Height Width Height*Width /details);
   var Species Height Width;
run;
ods graphics off;

If you think this is the issue you might want to look up FCS convergence.

We have a massive thread on MI in the context of SAS in Applied Statistics. Mainly me begging for information... :p My PROC MI document is now at 42 pages of links and code and getting larger. It's not a simple process.