Help with Kruskal-Wallis Test and Post-hoc Dunn's Test

#1
Hi everyone,

I am currently helping out with a clinical research project which involves comparing heart rates between different groups. There are 4 groups which I have separated according to the type of treatment the patients received. The heart rate data is non-parametric according to the Shapiro-Wilk test.

I first carried out a Kruskal-Wallis test, which showed a statistically significant difference in heart rate between the 4 groups. However, I would like to know whether there is a statistically significant difference between individual groups (e.g. group 1 vs group 3, or group 2 vs group 3). I checked around online and some people recommended Dunn's test. Is this the right approach? If so, should I use an unadjusted or adjusted Dunn's test? Are there any alternatives with more advantages?

Not sure if these will help but here are the median heart rates for each group:
Group 1: 65 (n=126)
Group 2: 70 (n=148)
Group 3: 76 (n=58)
Group 4: 73 (n=86)

The main aspect I would like to find out is whether group 1 vs group 3 is significant and whether group 2 vs group 3 is significant.

I am using Stata and these are the results (attached) when I use an unadjusted Dunn's test.


[Attachments: Kruskal Wallis.PNG, Dunn's test.PNG]

Any help is greatly appreciated :)
 

Karabiner

TS Contributor
#2
Note that there is no such thing as non-parametric data. You performed a normality test on your dependent variable. After this gave a significant result (which is no surprise, since with large sample sizes such as yours even small deviations from normality lead to a rejection of the null hypothesis), you decided to use a non-parametric test.

But this is based on a wrong assumption. An ANOVA does not assume a normally distributed dependent variable. It assumes that the variable is normally distributed within each group or, more simply, that the prediction errors from the model (the residuals) are normally distributed.

But this assumption matters only in small samples (n < 30 or so); in larger samples the central limit theorem makes the F-test robust to non-normal residuals. With your 400+ subjects, you can use one-way ANOVA with post-hoc pairwise comparison tests.

But you can of course use K-W & Dunn's test if you do not need to compare means (note that the K-W does not compare medians). I do not know why you have 4 groups if you were only interested in 3 of them, or why you performed 6 comparisons if you were interested in only 2 of them, but 1 vs. 3 and 2 vs. 3 certainly are statistically significant here, even after a possible Bonferroni adjustment.
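
To make the Dunn's test and Bonferroni step more concrete, here is a minimal pure-Python sketch of what the procedure computes. It is not the Stata implementation and may differ from it in details (for instance, the p-values here are two-sided). The function names `average_ranks` and `dunn_test` and the heart-rate values are invented for illustration only; they are not the data from this thread.

```python
from itertools import combinations
from math import sqrt
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks (ties get their average rank) and the tie term sum(t^3 - t)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_term = 0.0
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = avg
        t = end - start + 1
        tie_term += t**3 - t
        start = end + 1
    return ranks, tie_term


def dunn_test(groups, pairs=None):
    """Dunn's z, two-sided p, and Bonferroni-adjusted p for each requested pair.

    Ranks are taken over ALL groups pooled together, as in the Kruskal-Wallis
    test, even when only some pairs are compared.
    """
    labels = list(groups)
    pooled = [x for g in labels for x in groups[g]]
    n_total = len(pooled)
    ranks, tie_term = average_ranks(pooled)
    mean_rank, pos = {}, 0
    for g in labels:
        n = len(groups[g])
        mean_rank[g] = sum(ranks[pos:pos + n]) / n
        pos += n
    # Variance of the ranks, corrected for ties.
    variance = n_total * (n_total + 1) / 12 - tie_term / (12 * (n_total - 1))
    if pairs is None:
        pairs = list(combinations(labels, 2))
    results = {}
    for a, b in pairs:
        se = sqrt(variance * (1 / len(groups[a]) + 1 / len(groups[b])))
        z = (mean_rank[a] - mean_rank[b]) / se
        p = 2 * (1 - NormalDist().cdf(abs(z)))
        # Bonferroni: multiply by the number of comparisons actually made.
        results[(a, b)] = (z, p, min(1.0, p * len(pairs)))
    return results


# Hypothetical heart rates, NOT the data from this thread.
example = {
    "group1": [58, 62, 65, 66, 70],
    "group2": [64, 68, 70, 72, 75],
    "group3": [71, 74, 76, 79, 83],
    "group4": [66, 70, 73, 75, 80],
}
planned = [("group1", "group3"), ("group2", "group3")]
for pair, (z, p, p_adj) in dunn_test(example, planned).items():
    print(pair, round(z, 2), round(p, 4), round(p_adj, 4))
```

Passing only the two planned pairs means the Bonferroni factor is 2 rather than 6, which is the practical consequence of deciding on the comparisons of interest in advance.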

With kind regards

Karabiner
 
#3
Dear Karabiner,

Thank you very much for your comments and advice regarding terminology. I just had a couple of follow-up questions about the Kruskal-Wallis test vs. ANOVA that you mentioned.

I decided to plot histograms just to see the distribution of heart rate values per patient subgroup (attached). In some cases, the data does look normally distributed but the Shapiro-Wilk test does not see it as a normal distribution (p-values below). Is there any reason why this is the case? I realise that you said "this assumption is only needed in a small sample", but is it still ok to proceed with one-way ANOVA? And would it matter which post-hoc test I used?

[Attachment: 1621695375109.png (histograms of heart rate per subgroup, with Shapiro-Wilk p-values)]


One of my figures in this research paper was going to be 4 boxplots on the same graph, comparing the heart rates in all subgroups. However, I was going to use my results from the K-Wallis (or maybe ANOVA now) to give the levels of significance. But if my tests are comparing means rather than medians, is it still ok to represent this as a boxplot (which shows medians, IQR etc.)? Sorry if this is a silly question.

And just out of mathematical interest, is there any test which compares medians instead of means? When would these be recommended instead?

Many thanks for the help, really appreciate it,

Ali
 


Karabiner

TS Contributor
#4
Ali wrote: "I decided to plot histograms just to see the distribution of heart rate values per patient subgroup (attached). In some cases, the data does look normally distributed but the Shapiro-Wilk test does not see it as a normal distribution (p-values below)."
They do not look normal, in my view. And if the sample size is large enough, the test will almost certainly reject the null hypothesis that the sample data are from a population where the variable is exactly (!) normally distributed, because real data are never exactly normal.
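
The effect of sample size on a normality test can be illustrated with a small sketch. Note the swap: Shapiro-Wilk is not available in the Python standard library, so this uses the Jarque-Bera test (based on skewness and kurtosis) instead, with its asymptotic chi-square p-value, which is only rough for small n. The data are synthetic: a fixed, mildly right-skewed shape around 70 built from quantiles, so only the sample size changes. The names `jarque_bera` and `mildly_skewed_sample` are made up for this example.

```python
from math import exp
from statistics import NormalDist, fmean


def jarque_bera(sample):
    """Jarque-Bera statistic and asymptotic p-value (chi-square with 2 df)."""
    n = len(sample)
    mean = fmean(sample)
    m2 = fmean((x - mean) ** 2 for x in sample)
    m3 = fmean((x - mean) ** 3 for x in sample)
    m4 = fmean((x - mean) ** 4 for x in sample)
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3
    statistic = n / 6 * (skewness ** 2 + excess_kurtosis ** 2 / 4)
    # For 2 degrees of freedom the chi-square upper tail is exp(-x / 2).
    return statistic, exp(-statistic / 2)


def mildly_skewed_sample(n, sigma=0.2):
    """Deterministic sample of size n with the same mildly skewed shape for every n."""
    z = NormalDist()
    return [70 * exp(sigma * z.inv_cdf((i + 0.5) / n)) for i in range(n)]


# Same distribution shape, increasing sample size.
for n in (30, 300, 5000):
    statistic, p_value = jarque_bera(mildly_skewed_sample(n))
    print(n, round(statistic, 2), p_value)
```

Because the shape of the data stays the same while the statistic grows roughly in proportion to n, the same mild deviation goes from unremarkable at n = 30 to a clear rejection at n = 5000.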
Ali wrote: "And just out of mathematical interest, is there any test which compares medians instead of means? When would these be recommended instead?"
It is actually called the median test (also known as Mood's median test). It can be used when the aim is to compare medians across groups.
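
As a rough sketch of how the median test works: each observation is classified as above or not above the grand median of all groups combined, and the resulting 2 x k table of counts is tested with a chi-square statistic. The version below makes some assumptions that statistical packages may handle differently: values equal to the grand median are counted as "not above", and no continuity correction is applied. The names `chi2_sf` and `median_test` are made up for this example.

```python
from math import erfc, exp, gamma, sqrt
from statistics import median


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution with integer df."""
    y = x / 2
    if df % 2 == 0:
        a, q = 1.0, exp(-y)          # closed form for df = 2
    else:
        a, q = 0.5, erfc(sqrt(y))    # closed form for df = 1
    # Step up two degrees of freedom at a time via the incomplete-gamma recurrence.
    while a < df / 2:
        q += y ** a * exp(-y) / gamma(a + 1)
        a += 1
    return q


def median_test(groups):
    """Chi-square statistic and p-value for a median test on a list of samples."""
    pooled = [x for g in groups for x in g]
    grand_median = median(pooled)
    above = [sum(x > grand_median for x in g) for g in groups]
    not_above = [len(g) - a for g, a in zip(groups, above)]
    n_total = len(pooled)
    statistic = 0.0
    for row in (above, not_above):
        row_total = sum(row)
        if row_total == 0:
            raise ValueError("all values fall on one side of the grand median")
        for g, observed in zip(groups, row):
            expected = row_total * len(g) / n_total
            statistic += (observed - expected) ** 2 / expected
    return statistic, chi2_sf(statistic, len(groups) - 1)


# Hypothetical heart rates, NOT the data from this thread.
statistic, p_value = median_test([
    [58, 62, 65, 66, 70],
    [64, 68, 70, 72, 75],
    [71, 74, 76, 79, 83],
    [66, 70, 73, 75, 80],
])
print(round(statistic, 3), round(p_value, 4))
```

Because the test keeps only whether each value lies above the grand median, it discards much of the information in the data and generally has less power than the Kruskal-Wallis test.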

With kind regards

Karabiner