How to handle missing data?


I just finished an online study in which participants filled out several questionnaires at baseline and after 6 weeks (posttest). There were two groups: a control group and an intervention group. The intervention group received online treatment, and the control group was a waiting-list condition. Since it was an online study, we had high dropout. Dropout was much higher in the intervention group (71%) than in the control group (49%). How should I account for the missing data? Should I use multiple imputation?

Hope you can help me!




Less is more. Stay pure. Stay poor.
Was the intervention randomized? Are the subjects with complete data comparable between the two treatment groups?
Yes, it was randomized. There are no significant differences between the two groups at baseline.

Could I do an analysis on the completed cases only?


No cake for spunky
There really are two issues here. Missing data can be handled by multiple imputation and many other approaches, depending on data type. But as I understand it, when the dropout rate differs between the intervention and control groups, this creates additional issues for the analysis. I have not seen multiple imputation used for that, although I am hardly an expert. To use multiple imputation you have to assume the data are missing at random (MAR) rather than missing not at random (MNAR), and I wonder whether that is a reasonable assumption if the dropout rate was significantly different.


Less is more. Stay pure. Stay poor.
Yes, re-compare baseline characteristics among the completers. Also, don't just focus on significant differences; look at the actual values, since this time around the tests will have lower power and a greater chance of a type II error.
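
One way to "look at the actual values" rather than p-values is a standardized mean difference (SMD) on each baseline variable, computed among completers only. Below is a minimal sketch of that idea. The scores and the function name are made up for illustration and are not from the study. A common rule of thumb treats an absolute SMD above roughly 0.1 as worth a closer look, but that threshold is a convention, not a test.

```python
from math import sqrt
from statistics import mean, stdev


def standardized_mean_difference(group_a, group_b):
    """Return (mean_a - mean_b) divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a, var_b = stdev(group_a) ** 2, stdev(group_b) ** 2
    pooled_sd = sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (mean(group_a) - mean(group_b)) / pooled_sd


# Hypothetical baseline questionnaire scores, completers only.
completers_intervention = [12, 15, 14, 10, 13, 16]
completers_control = [14, 18, 17, 15, 16, 19, 13, 17]

smd = standardized_mean_difference(completers_intervention, completers_control)
print(f"Baseline SMD among completers: {smd:.2f}")
```

Unlike a p-value, the SMD does not shrink toward "nothing to see" just because dropout reduced the sample size.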

So it is a good thing that the baseline covariates were balanced prior to the intervention. Now if there is a difference, those differences can help explain and possibly impute the missing data. Noetsi, perhaps they can split the sample in two, an intervention group and a non-intervention group, impute within each group, and then merge the two back together.
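
To make the split, impute, merge idea concrete, here is a rough sketch in plain Python. Everything in it is an assumption for illustration: the data are invented, the field names `arm`, `base` and `post` are hypothetical, and the imputation model is just a within-arm regression of posttest on baseline plus random noise. It is deliberately simplified. A proper multiple imputation would also draw the regression parameters themselves, so the standard error here is too optimistic, and real work should use dedicated software.

```python
import random
from statistics import mean, variance


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b, residual SD)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    sd = (sum(r * r for r in residuals) / (len(xs) - 2)) ** 0.5
    return a, b, sd


def impute_within_arm(rows, rng):
    """Fill missing posttest values using only this arm's completers."""
    done = [r for r in rows if r["post"] is not None]
    a, b, sd = fit_line([r["base"] for r in done], [r["post"] for r in done])
    return [
        r if r["post"] is not None
        else {**r, "post": a + b * r["base"] + rng.gauss(0, sd)}
        for r in rows
    ]


def pooled_effect(data, m=20, seed=1):
    """Impute each arm separately m times, merge, pool with Rubin's rules."""
    rng = random.Random(seed)
    estimates, within = [], []
    for _ in range(m):
        filled = {
            arm: impute_within_arm([r for r in data if r["arm"] == arm], rng)
            for arm in ("intervention", "control")
        }
        t = [r["post"] for r in filled["intervention"]]
        c = [r["post"] for r in filled["control"]]
        estimates.append(mean(t) - mean(c))
        within.append(variance(t) / len(t) + variance(c) / len(c))
    # Rubin's rules: total variance = within + (1 + 1/m) * between.
    total_var = mean(within) + (1 + 1 / m) * variance(estimates)
    return mean(estimates), total_var ** 0.5


# Hypothetical data: post is None for dropouts.
data = [
    {"arm": "intervention", "base": 20, "post": 14},
    {"arm": "intervention", "base": 24, "post": 17},
    {"arm": "intervention", "base": 18, "post": 13},
    {"arm": "intervention", "base": 22, "post": None},
    {"arm": "intervention", "base": 26, "post": None},
    {"arm": "intervention", "base": 19, "post": None},
    {"arm": "control", "base": 21, "post": 20},
    {"arm": "control", "base": 25, "post": 23},
    {"arm": "control", "base": 17, "post": 18},
    {"arm": "control", "base": 23, "post": 22},
    {"arm": "control", "base": 20, "post": None},
    {"arm": "control", "base": 24, "post": None},
]

estimate, se = pooled_effect(data)
print(f"Pooled difference in posttest means: {estimate:.2f} (SE {se:.2f})")
```

Imputing within each arm lets the intervention group keep its own baseline-to-posttest relationship, but it still rests on MAR: if people left because the treatment was not working for them, the completers' regression will not describe the dropouts.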

One side note: you are missing A LOT of data, so imputation may still be a limited approach.


TS Contributor
In a clinical trial, one has to define beforehand how to deal with missing data.
Now we have the problem of finding a sound, unbiased approach after the missing
data pattern is known.

What could have been defined beforehand? One approach could have been an
intention-to-treat (ITT) analysis with Last Observation Carried Forward (LOCF)
substitution. In addition, a per-protocol analysis with study completers could be
performed (expressly as a secondary analysis).
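
For illustration, here is a small sketch contrasting the two analyses. With only a baseline and a posttest, LOCF amounts to carrying the baseline score forward, i.e. assuming zero change for every dropout. The numbers and names below are invented and are not the study data.

```python
from statistics import mean

# Hypothetical records: (arm, baseline score, posttest score or None if dropped out).
participants = [
    ("intervention", 20, 14),
    ("intervention", 24, 17),
    ("intervention", 18, None),
    ("intervention", 22, None),
    ("intervention", 26, None),
    ("control", 21, 20),
    ("control", 25, 23),
    ("control", 17, None),
    ("control", 23, 22),
]


def mean_change(rows, arm, completers_only):
    """Mean posttest-minus-baseline change for one arm."""
    changes = []
    for row_arm, base, post in rows:
        if row_arm != arm:
            continue
        if post is None:
            if completers_only:
                continue  # per-protocol: drop non-completers
            post = base  # LOCF: carry the last observed value forward
        changes.append(post - base)
    return mean(changes)


def effect(rows, completers_only):
    """Difference in mean change, intervention minus control."""
    return (mean_change(rows, "intervention", completers_only)
            - mean_change(rows, "control", completers_only))


itt_effect = effect(participants, completers_only=False)
pp_effect = effect(participants, completers_only=True)
print(f"ITT with LOCF: {itt_effect:.2f}")
print(f"Per-protocol:  {pp_effect:.2f}")
```

In this toy data the LOCF estimate is much smaller in magnitude than the completers-only estimate, because the arm with more dropouts gets more zero-change values. Whether LOCF is actually conservative in a given study depends on what the dropouts would have done, which is exactly what is unknown.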

By the way, you should always report the sample size when you describe a problem
with data analysis.

With kind regards