Likert item analysis: ANOVA and Kruskal-Wallis assumptions violated

#1
Hi there,

I am trying to analyse whether three groups (sample sizes 27, 26 and 23) differ in their responses to 20 questionnaire items, each measured on a 5-point Likert scale. The items all investigate different things, so it is not appropriate or of interest to combine them into Likert scales for broader constructs.

I am well aware of the controversy surrounding the use of ANOVA for Likert data. I have been advised that it could be justified, and I know of papers suggesting that ANOVA is relatively robust in most situations. However, my data are also non-normal and contain multiple extreme outliers. As these outliers represent real Likert responses, I believe that they have value and should not be excluded. Overall, I feel there are a lot of arguments against using one-way ANOVAs to analyse the data at this point.

I have therefore also tried analysing the data with the Kruskal-Wallis test, which at first seemed more appropriate. However, the data violated the assumption that distributions across groups should be similar, so the analysis can only tell me about the distributions of Likert responses rather than the median responses, which would be easier to interpret (I think?). Following a guide to conducting and reporting the analysis, I arrived at findings like this:
'Likert ratings increased from Group 1 (mean rank = 33.78), to Group 2 (mean rank = 38.41), to Group 3 (mean rank = 42.77), but the differences were not statistically significant, χ2(2) = 2.218, p = .330.'
Talking about 'mean rank' seems irrelevant to my discussion of whether, and in what ways, groups differed in their Likert responses.
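For readers unsure what the 'mean rank' figures above refer to, here is a minimal sketch of how such a value is usually obtained: all responses are pooled, ranked from 1 to N with tied responses sharing the average of their positions, and the ranks are then averaged within each group. The function names and the toy responses are hypothetical and are not the data from this thread.

```python
from collections import defaultdict


def average_ranks(values):
    """Rank values from 1..N; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = tied_rank
        start = end + 1
    return ranks


def mean_ranks(groups):
    """groups: dict mapping a group label to its list of Likert responses (1-5)."""
    labels = [label for label, resp in groups.items() for _ in resp]
    pooled = [r for resp in groups.values() for r in resp]
    totals = defaultdict(float)
    for label, rank in zip(labels, average_ranks(pooled)):
        totals[label] += rank
    return {label: totals[label] / len(groups[label]) for label in groups}


# Hypothetical responses to a single item, for illustration only.
toy = {"Group 1": [1, 2, 2, 3], "Group 2": [2, 3, 4], "Group 3": [3, 4, 5, 5]}
print(mean_ranks(toy))
```

A higher mean rank only says that a group's responses tend to sit higher in the pooled ordering; it is not a mean or a median on the original 1-5 scale.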

My supervisor is suggesting that I stick to one-way ANOVA, but I do not see how its results can be justified for this data. At the same time, Kruskal-Wallis does not seem suitable for my questions of interest.

I would greatly appreciate any advice on how I can best analyse my data at this point!
 

ondansetron

TS Contributor
#2
A couple of points of clarification before shooting off possibly ill-informed advice.
1) What specifically are you trying to answer/know?
2) Have you collected other variables? If so, what are they and how are they measured?
3) Why does your advisor insist on ANOVA?
 
#3
Hi there,

Thank you for your reply!
1. The questionnaires are just being used to understand whether different groups vary in their experiences/perceptions of the main task. For example, whether they all found it equally difficult. It is exploratory analysis, so I don't have to include everything in my write-up, but analysing the questionnaires might provide important insight into the main results.
2. Yes, the main study used coded interview transcripts that were analysed using one-way ANOVA and met the assumptions for this.
3. I think mainly because of past advice she has had that ANOVA is robust, and also because the alternative (Kruskal-Wallis) makes the results difficult to interpret.

Hope that helps.
 

ondansetron

TS Contributor
#4
I think it would be helpful if you could add more specific details to each point:
1) Are you trying to show equivalence (which a standard hypothesis test cannot do) or are you trying to show evidence of a difference? What is the outcome of interest (a score or multiple scores on a survey)? How would you frame this in the context of specific variables from your study? What is the variable with the three groups? What are the other important variables? Pretend you're talking specifics with your advisor or a peer so feedback can be specific.
2) What was analyzed with ANOVA? The outcome of interest or participant characteristics? If the latter, the ANOVA is misapplied. What did the variables analyzed with ANOVA look like? If they're 5-point Likert scales, ANOVA isn't appropriate regardless of what the tests of assumptions show, because theoretically we know the Y variable in each group is not normally distributed (bounded 1-5, probably asymmetric, and certainly not continuous).
3) What is your advisor's formal background? We have some very good members of the forum with formal backgrounds in statistics, psychology, epidemiology, and so on (there are more), but not everyone is well trained in statistics, and some misunderstand or misuse terms like "robust". Without further reasoning on your advisor's part, I can't feel comfortable saying the advice is well supported. Kruskal-Wallis isn't really more difficult to interpret. In its most general application, it tests a null of equality of the distributions (shape and location). This is a more general question, but it isn't necessarily less valuable than saying something about the mean or median. This is why I'm trying to nail down more specifics of your question (e.g. do any of the 3 treatments affect the distribution of perceived task difficulty as measured by a Likert item?). If we assume the distributions have the same shape and differ only in location, then we can use KW to test the null of equality of medians (but only in that case).
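To make the point about what Kruskal-Wallis tests a bit more concrete, here is a minimal sketch, not a definitive implementation. It computes the tie-corrected H statistic for exactly three groups, together with each group's share of responses at levels 1 to 5 so the shapes can be compared directly. The function names and toy responses are hypothetical. The p-value uses the usual chi-square approximation, which for 2 degrees of freedom reduces to exp(-H/2). In practice a library routine such as scipy.stats.kruskal would normally be used instead.

```python
import math
from collections import Counter


def kruskal_wallis_3(groups):
    """Return (H, p) for exactly three groups of ordinal responses.

    H is tie-corrected. p uses the chi-square approximation; exp(-H / 2)
    is the chi-square tail probability for 2 degrees of freedom only,
    which is why this sketch is restricted to three groups.
    """
    if len(groups) != 3:
        raise ValueError("this sketch only handles exactly three groups")
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    counts = Counter(pooled)
    if len(counts) < 2:
        raise ValueError("all responses are identical; H is undefined")
    # Midrank for each distinct response value (ties share an average rank).
    rank_of, seen = {}, 0
    for value in sorted(counts):
        c = counts[value]
        rank_of[value] = seen + (c + 1) / 2
        seen += c
    # Uncorrected H from each group's rank sum.
    h = 12 / (n * (n + 1)) * sum(
        sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups
    ) - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    h /= 1 - sum(c**3 - c for c in counts.values()) / (n**3 - n)
    return h, math.exp(-h / 2)


def response_proportions(group):
    """Share of responses at each Likert level 1..5 within one group."""
    counts = Counter(group)
    return [counts.get(level, 0) / len(group) for level in range(1, 6)]


# Hypothetical responses to one item, for illustration only.
toy = [[1, 2, 2, 3], [2, 3, 4], [3, 4, 5, 5]]
h, p = kruskal_wallis_3(toy)
print(f"H = {h:.3f}, p = {p:.3f}")
for g in toy:
    print([round(prop, 2) for prop in response_proportions(g)])
```

As a sanity check on the reporting in the first post, exp(-2.218 / 2) is about 0.330, which matches the quoted p-value for χ2(2) = 2.218. Reporting the per-level proportions alongside H is one way to describe how the distributions differ without relying on medians.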