Composite Scores - Approaches to Missing Data

#1
I have some data from an assessment form that has 25 items, all on a 5-point scale. What I'm looking at is the sum of these items.

For a few items, there are missing values, which obviously affects the composite score that I'm looking at. I'm just wondering what approach is best to deal with this? My sample size is small (25), so I don't want to discard any participant data unless I have to.
 

jepo

#2
I'm a bit unsure what your data actually look like. Do you have only one assessment form?

If you have many assessments, I suppose you could try to impute the missing data. First you should diagnose whether the data are missing completely at random (MCAR) or only missing at random (MAR), where the missingness may depend on the values that were observed.

The diagnosis can be done at least by:
1. forming two groups, one containing the assessments with missing values and the other the complete assessments, and then comparing these groups on each variable (e.g. testing whether the group means differ)
2. comparing the missing-data pattern to the one expected under a truly random missing-data process (I have not tried this)

After you have classified your data as MAR or MCAR you can choose appropriate imputation methods. Do you think this gives enough "ground" to start with?

Some programs provide tools for this purpose (I've only tried the SPSS Missing Values package).
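
For anyone who wants to try the first diagnostic by hand, here is a rough sketch in plain Python. The function name `compare_missing_vs_complete`, the use of `None` for a skipped item, and the toy data are all my own assumptions, not anything from SPSS. It only reports group means and a Welch t statistic per item; it does not compute p-values, and with very few incomplete forms most items will simply be skipped.

```python
from math import sqrt
from statistics import mean, variance


def compare_missing_vs_complete(forms):
    """Compare item means between incomplete and complete forms.

    forms: list of equal-length lists; None marks a missing response
    (an assumed encoding). Returns {item_index: (mean_incomplete,
    mean_complete, welch_t)} for items with at least two observed
    responses in each group.
    """
    incomplete = [f for f in forms if None in f]
    complete = [f for f in forms if None not in f]
    results = {}
    for j in range(len(forms[0])):
        a = [f[j] for f in incomplete if f[j] is not None]
        b = [f[j] for f in complete]
        if len(a) < 2 or len(b) < 2:
            continue  # too few observations to compare on this item
        se = sqrt(variance(a) / len(a) + variance(b) / len(b))
        t = (mean(a) - mean(b)) / se if se > 0 else 0.0
        results[j] = (mean(a), mean(b), t)
    return results


# Toy data: 6 forms with 3 items each (made up for illustration).
forms = [
    [4, 5, None],
    [3, 4, 2],
    [5, 5, 4],
    [2, None, 1],
    [4, 4, 3],
    [3, 3, 3],
]
results = compare_missing_vs_complete(forms)
for item, (m_inc, m_com, t) in results.items():
    print(f"item {item}: incomplete={m_inc}, complete={m_com}, t={t:.2f}")
```

Large differences between the two groups would argue against MCAR; small ones are consistent with it but do not prove it.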
 
#3
Thank you! A very helpful post.

I consulted with the author of the assessment form and he said much the same. So I determined that the missing values, which were quite minimal, were MCAR. In this case the author suggested that I use a value of 3, essentially a neutral score on the 5-point scale being used.

Thanks again!

PS: Sorry for the lack of clarity in my original post. I had multiple forms.
 
#4
If all you care about is the sum and not which questions got which scores, the mean response on each questionnaire gives you just as much information. So perhaps you can compute the mean of all questions answered for each form--some will be sum/25, others sum/24, etc.

I'm a little concerned that if you plug in 3 for missing values, you'll bias the mean towards 3 (or the sum towards 75).
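
To make that concrete, here is a small sketch comparing the two scoring rules. The function names and the `None` encoding for a skipped item are just my assumptions for illustration. The prorated sum rescales the mean of the answered items back to the 25-item range; the neutral fill plugs in 3.

```python
def prorated_sum(responses):
    """Mean of the answered items, rescaled to the full number of items."""
    answered = [r for r in responses if r is not None]
    if not answered:
        return None  # nothing answered, so no score can be formed
    return sum(answered) / len(answered) * len(responses)


def neutral_fill_sum(responses, neutral=3):
    """Sum after replacing every missing item with the neutral value."""
    return sum(neutral if r is None else r for r in responses)


# A respondent who answered 5 on 23 items and skipped the last two.
form = [5] * 23 + [None, None]
print(prorated_sum(form))      # 125.0
print(neutral_fill_sum(form))  # 121
```

The respondent who gave the top rating on everything they answered ends up at 121 rather than 125 under the neutral fill, i.e. pulled toward 75.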
 
#5
multiple imputation

I agree, don't just plop in a mean score. Many have shown that even complete-case analysis (or dropping the variables that have missing values) performs better than this.

Although a bit advanced, you can do multiple imputation as well. SAS has PROC MI for this, see http://support.sas.com/rnd/app/papers/miv802.pdf

You'll blow your advisor's mind if you can pull this off - trust me.
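
If you don't have SAS, the general idea can be sketched in plain Python. To be clear, this is not what PROC MI does: it is a deliberately simplified illustration that fills each missing item with a random draw from the observed responses on that same item (a hot-deck draw with an approximate Bayesian bootstrap step), ignores relationships between items, and assumes every item has at least one observed response. The function names, the `None` encoding, and the toy data are all hypothetical. The pooling step uses Rubin's rules: average the estimates, and add the between-imputation variance (inflated by 1 + 1/m) to the average within-imputation variance.

```python
import random
from statistics import mean, variance


def impute_once(forms, rng):
    """Return a copy of forms with each None replaced by a random donor value."""
    filled = [list(f) for f in forms]
    for j in range(len(forms[0])):
        observed = [f[j] for f in forms if f[j] is not None]
        # Approximate Bayesian bootstrap: resample the donor pool first,
        # so the imputations reflect uncertainty about the item distribution.
        donors = [rng.choice(observed) for _ in observed]
        for f in filled:
            if f[j] is None:
                f[j] = rng.choice(donors)
    return filled


def pooled_mean_sum(forms, m=20, seed=0):
    """Estimate the mean composite score over m imputed data sets.

    Returns (pooled_estimate, total_variance) combined by Rubin's rules.
    """
    rng = random.Random(seed)
    estimates, within = [], []
    for _ in range(m):
        sums = [sum(f) for f in impute_once(forms, rng)]
        estimates.append(mean(sums))
        within.append(variance(sums) / len(sums))  # squared SE of the mean
    q_bar = mean(estimates)
    w = mean(within)           # average within-imputation variance
    b = variance(estimates)    # between-imputation variance
    return q_bar, w + (1 + 1 / m) * b


# Toy data: 5 forms with 3 items each (made up for illustration).
forms = [
    [4, 5, None],
    [3, 4, 2],
    [5, 5, 4],
    [2, None, 1],
    [4, 4, 3],
]
estimate, total_var = pooled_mean_sum(forms)
print(f"pooled mean sum = {estimate:.2f}, SE = {total_var ** 0.5:.2f}")
```

A real analysis would use an imputation model that conditions on the other items, which is the part PROC MI handles for you.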
 
#6
How much of your data are missing? If the percentage is really small (like under 5%), then any approach is likely to give you the same outcome, even mean imputation (which has been shown to be the worst solution).

The greater the percentage of missing data, the more important it is to use a good technique.
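
A quick way to check the percentage, as a minimal sketch: the function name, the `None` encoding for a blank, and the example data (25 forms of 25 items with 10 blanks) are assumptions for illustration only.

```python
def fraction_missing(forms):
    """Share of all item responses that are missing (None)."""
    cells = [r for f in forms for r in f]
    return sum(r is None for r in cells) / len(cells)


# 25 forms x 25 items, with one blank on each of the first 10 forms.
forms = [[3] * 25 for _ in range(25)]
for i in range(10):
    forms[i][i] = None

print(f"{fraction_missing(forms):.1%}")  # 1.6%
```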

Karen
 
#7
Karen, it's definitely less than 5%.

Thanks so much for the response - and also to AtlasFrysmith and SmilingSara for your input.

I agree with Karen that since I'm missing very little data, any approach will likely give the same outcome.