
1) ANOVA (and other regression models) do not assume that the marginal ("overall") distribution of the dependent variable is normal (link). ANOVA assumes that the distribution of the DV is normal *conditional on the predictors*, i.e. that the residuals are normal.
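To make point 1 concrete, here is a small sketch with simulated data (SciPy; all numbers are hypothetical, not from the thread): every group is normal, so the residuals pass a normality test, while the pooled "marginal" DV clearly fails one.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# three normal groups with very different means: each conditional
# distribution is normal, but the pooled ("marginal") DV is trimodal
groups = [rng.normal(m, 1.0, 200) for m in (0.0, 5.0, 10.0)]
pooled = np.concatenate(groups)
resid = np.concatenate([g - g.mean() for g in groups])

print(stats.shapiro(pooled).pvalue)  # typically tiny: the marginal DV is non-normal
print(stats.shapiro(resid).pvalue)   # residuals look normal
```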

2) A non-significant Levene's test statistic indicates only that you failed to reject the null hypothesis of equal variances; it does not demonstrate that the variances are equal, especially when the test has low power.

3) A Kruskal-Wallis test is a non-parametric alternative to ANOVA, but it tests a completely different null hypothesis (that the mean rank is the same in every group, which is not the same thing as equality of means or of medians).

Hope that helps!

- How do I know whether Levene's test gives a non-significant result because of low power, rather than because the variances really are equal?

- Several sources state that the general assumptions of ANOVA include both normal distribution of the DV (which is what I have been testing with Shapiro-Wilk) and homogeneity of variances (tested with Levene's test). See e.g. here: https://statistics.laerd.com/statistical-guides/one-way-anova-statistical-guide-2.php. That is why I'm concerned about the distributions, although ANOVA is fairly robust to non-normality

- Non-parametric alternatives are recommended for small-sample (30 participants per condition) Likert-scale data, see e.g. https://dl.acm.org/citation.cfm?id=1753686

- Kruskal-Wallis doesn't use means; it uses medians. See e.g. here: http://www.statisticshowto.com/kruskal-wallis/, https://statistics.laerd.com/spss-tutorials/kruskal-wallis-h-test-using-spss-statistics.php
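Whatever one calls its null hypothesis, Kruskal-Wallis operates purely on ranks, which is easy to verify in SciPy: any strictly increasing transformation of the data (here, log) leaves the ranks, and therefore the H statistic, unchanged. Data below are simulated, not from the thread.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
a, b, c = (rng.exponential(scale=s, size=30) for s in (1.0, 1.5, 2.0))

h_raw = stats.kruskal(a, b, c)
# ranks are invariant under a strictly increasing transform such as log,
# so the Kruskal-Wallis H statistic (and p-value) is identical
h_log = stats.kruskal(np.log(a), np.log(b), np.log(c))
print(h_raw.statistic, h_log.statistic)
```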

- I use Dunn's post-hoc test for pairwise comparisons.
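For reference, Dunn's pairwise z statistic is just a comparison of mean ranks. A minimal sketch (no tie correction; the function name and structure are mine, not from any particular package):

```python
import numpy as np
from scipy import stats

def dunn_z(groups, i, j):
    """Dunn's z statistic and unadjusted two-sided p-value for comparing
    groups i and j after a Kruskal-Wallis test (minimal sketch, no tie
    correction)."""
    all_vals = np.concatenate(groups)
    ranks = stats.rankdata(all_vals)
    # split the pooled ranks back into the original groups
    sizes = [len(g) for g in groups]
    idx = np.cumsum([0] + sizes)
    mean_ranks = [ranks[idx[k]:idx[k + 1]].mean() for k in range(len(groups))]
    n = len(all_vals)
    se = np.sqrt(n * (n + 1) / 12.0 * (1.0 / sizes[i] + 1.0 / sizes[j]))
    z = (mean_ranks[i] - mean_ranks[j]) / se
    p = 2 * stats.norm.sf(abs(z))  # adjust for multiplicity afterwards
    return z, p
```

The resulting p-values still need a multiplicity adjustment (e.g. Bonferroni) across all pairwise comparisons.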

Thanks a lot

Just to build on what CB said: you should run the ANOVA first and then do the diagnostics, i.e. check whether the residuals are normal, whether the variances are heterogeneous (such as a horn shape in the residual plot), etc. This is generally much easier and more sensible than running checks before the test. If your residuals show patterns and/or non-normality, you might want to either use more advanced techniques (such as a data transformation) or move to a non-parametric test. These generally have lower power, so you might want to consider gathering more data in that case.
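Those diagnostics can be sketched numerically without plotting (SciPy, simulated data; the Spearman check on |residual| vs. fitted is my own crude stand-in for eyeballing the residual plot for a horn shape):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
groups = [rng.normal(m, 1.0, 30) for m in (0.0, 0.5, 1.0)]

# in a one-way ANOVA the fitted values are just the group means
fitted = np.concatenate([np.full(len(g), g.mean()) for g in groups])
resid = np.concatenate([g - g.mean() for g in groups])

# normality of the residuals
print(stats.shapiro(resid))
# crude heteroscedasticity check: does |residual| grow with the fitted value?
print(stats.spearmanr(fitted, np.abs(resid)))
```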

regards

- How do I know whether Levene's test gives a non-significant result because of low power, rather than because the variances really are equal?

https://ncss-wpengine.netdna-ssl.co.../PASS/Levene_Test_of_Variances-Simulation.pdf page 553-11
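The power question can also be answered by simulation for your own sample size and effect size. A rough Monte-Carlo sketch (SciPy; the sample sizes and SD ratios below are hypothetical illustrations):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def levene_power(n, sd_ratio, n_sim=2000, alpha=0.05):
    """Monte-Carlo estimate of Levene's power for two normal groups of
    size n whose standard deviations differ by sd_ratio."""
    hits = 0
    for _ in range(n_sim):
        a = rng.normal(0.0, 1.0, n)
        b = rng.normal(0.0, sd_ratio, n)
        if stats.levene(a, b).pvalue < alpha:
            hits += 1
    return hits / n_sim

# with n=30 per group and a 1.5x SD ratio, power is only moderate
print(levene_power(30, 1.5))
```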

- Several sources outline that the general assumptions of ANOVA assume both normal distribution of the DVs (...). See e.g. here: https://statistics.laerd.com/statistical-guides/one-way-anova-statistical-guide-2.php

Careful reading is highly recommended.

With kind regards

Karabiner

To my knowledge CBear is correct in that the Wilcoxon-Mann-Whitney/Kruskal-Wallis test is based on the null hypothesis that the MEAN of the ranks is the same. I am sorry, but I am too lazy to search for sources for that.

I am sorry, but I don't trust the sources that luckycat gives in post 3. We all know that some sources on the internet are not reliable.

But Fagerland and Sandvik (2009), "Performance of five two-sample location tests for skewed distributions with unequal variances", say that it is not generally true that the Wilcoxon-Mann-Whitney is a test of medians.

Have a look at Fagerland, Sandvik and Mowinckel (2015) where the abstract says:

Results

The Welch U test (the T test with adjustment for unequal variances) and its associated confidence interval performed well for almost all situations considered. The Brunner-Munzel test also performed well, except for small sample sizes (10 in each group). The ordinary T test, the Wilcoxon-Mann-Whitney test, the percentile bootstrap interval, and the bootstrap-t interval did not perform satisfactorily.

Conclusions

The difference between the means is an appropriate effect measure for comparing two independent discrete numerical variables that have both lower and upper bounds. To analyze this problem, we encourage more frequent use of parametric hypothesis tests and confidence intervals.

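The unequal-variance adjustment behind the Welch test mentioned in that abstract is a one-liner in SciPy; a sketch with simulated data (the means, SDs, and sample sizes below are hypothetical):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
a = rng.normal(0.0, 1.0, 30)
b = rng.normal(0.8, 2.0, 30)  # different mean AND different variance

# Welch's t test: equal_var=False applies the unequal-variance adjustment
t, p = stats.ttest_ind(a, b, equal_var=False)
print(t, p)
```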