Mann-Whitney U test vs t-test

#1
Hello fellow statisticians,

Problem definition: I need to test whether the difference between the "mean" utilization metrics of two machines is statistically significant.

Given Data: Utilisation data is available for two machines for more than 100 days. Therefore, the mean, standard deviation, and standard error can be calculated over the 100 days.

My approach: Calculate the mean and standard error for each machine, then use a t-test with unequal variances to test the difference in means. I followed the approach described here: "https://www.cscu.cornell.edu/news/statnews/stnews73.pdf"
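For reference, the unequal-variance t statistic and its Welch-Satterthwaite degrees of freedom can be sketched in plain Python as below. This is only an illustration: the function names and the simulated utilisation values are made up, and the p-value uses a normal approximation to the t distribution, which is reasonable only because each sample has 100 or more observations.

```python
import math
import random
from statistics import NormalDist, mean, variance


def welch_t(a, b):
    """Return the unequal-variance t statistic and Welch-Satterthwaite df."""
    na, nb = len(a), len(b)
    # Squared standard errors of the two sample means
    se2_a = variance(a) / na
    se2_b = variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


def approx_two_sided_p(t):
    """Two-sided p-value from a normal approximation (assumes large df)."""
    return 2 * (1 - NormalDist().cdf(abs(t)))


# Hypothetical data: 100 days of utilisation (%) for each machine
rng = random.Random(42)
machine_a = [rng.gauss(70, 8) for _ in range(100)]
machine_b = [rng.gauss(73, 12) for _ in range(100)]

t_stat, df = welch_t(machine_a, machine_b)
print(f"t = {t_stat:.3f}, df = {df:.1f}, approx p = {approx_two_sided_p(t_stat):.4f}")
```

With small samples the p-value should come from the t distribution with `df` degrees of freedom rather than from the normal approximation used here.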

Clarifications: What is the best test here to test the statistical significance of the difference in "means"? Is it the t-test or the Mann Whitney U test? I have read many academic papers where the t-test is used far more often than the Mann Whitney U test (without testing for normality). Is it because they take the central limit theorem for granted and so use the t-test? Also, some claim that the Mann Whitney U test is not really a test of the statistical significance of a difference in means, but rather a test for medians.

It's quite hard for me to understand the logic behind choosing these tests.
 

Karabiner

TS Contributor
#2
>Clarifications: What is the best test here
>to test the statistical significance of the
>difference in "means"? Is it t-test or Mann
>Whitney u test?

The U-test is a test for rank data/ordinal data. It is therefore not able to compare means.

>Also, some claims that Mann Whitney
>U test is really not used to test the statistical
>significance of the means rather it is used for medians.

The Mann-Whitney is not a test for the comparison of medians. Instead, the Median Test is a test for medians.
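To make the point about ranks concrete, here is a minimal Python sketch. It is not taken from any library, and the function name and sample values are made up for illustration. It computes the U statistic by counting pairs, then shows that U is unchanged by a monotone transformation of the data, whereas the difference in means is not.

```python
from statistics import mean


def mann_whitney_u(a, b):
    """U statistic for sample a: count of pairs (x, y) with x > y; ties count 0.5."""
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


# Hypothetical samples
a = [1, 2, 3, 7]
b = [2, 3, 4, 5]

# Cubing is strictly increasing, so it preserves the ordering (the ranks)
a_cubed = [x ** 3 for x in a]
b_cubed = [y ** 3 for y in b]

print("U raw:   ", mann_whitney_u(a, b))
print("U cubed: ", mann_whitney_u(a_cubed, b_cubed))
print("mean difference raw:   ", mean(a) - mean(b))
print("mean difference cubed: ", mean(a_cubed) - mean(b_cubed))
```

Dividing U by `len(a) * len(b)` gives the proportion of pairs in which the value from `a` exceeds the value from `b` (ties counted as one half), which is a statement about ordering rather than about means.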

With kind regards

Karabiner
 
#3
Thank you for your reply. So can we use a t-test to compare the means always irrespective of the normality rule?
 
#4
>So can we use a t-test to compare the means always irrespective of the normality rule?

NO! Of course not. The t-test is still based on the normal distribution, so the data must still be reasonably close to normally distributed. But the t-test is relatively robust to moderate deviations from normality, and with the Welch test it is also robust to non-constant variance. The Mann Whitney test, on the other hand, is sensitive to non-constant variance.

Sometimes in conversation it is said that the t-test is "more non parametric" than the Mann Whitney test. But of course you will not see such a fluffy statement in a published paper. (Search for Fagerland and Sandvik for published papers.)

But the t-test is not robust to outliers. Outliers will have a dramatic influence on both the mean and the t-test. In that case the Mann Whitney test will be more robust. But before switching, one can try a transformation such as taking the log or the square root.

But why aren't permutation tests used more?
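For anyone who wants to try one, a two-sample permutation test for a difference in means can be sketched in a few lines of plain Python. This is a Monte Carlo version under assumed settings: the function name, the number of permutations, the seed, and the simulated utilisation values are all illustrative choices, not anything prescribed.

```python
import random


def permutation_test(a, b, n_perm=10_000, seed=0):
    """Two-sided Monte Carlo permutation p-value for a difference in means."""
    rng = random.Random(seed)
    na, nb = len(a), len(b)
    observed = sum(a) / na - sum(b) / nb
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        # Randomly reassign the pooled observations to the two groups
        rng.shuffle(pooled)
        diff = sum(pooled[:na]) / na - sum(pooled[na:]) / nb
        if abs(diff) >= abs(observed):
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero
    return (hits + 1) / (n_perm + 1)


# Hypothetical data: 100 days of utilisation (%) for each machine
rng = random.Random(1)
machine_a = [rng.gauss(70, 8) for _ in range(100)]
machine_b = [rng.gauss(73, 12) for _ in range(100)]

print("permutation p-value:", permutation_test(machine_a, machine_b, n_perm=2000))
```

Comparing this p-value with the one from the t-test on the same data is a quick check: if the two differ considerably, the conditions for the t-test probably do not hold.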
 

gianmarco

TS Contributor
#5
"The permutation test is useful even if we plan to use the two-sample t test. Rather than relying on Normal quantile plots of the two samples and the central limit theorem, we can directly check the Normality of the sampling distribution by looking at the permutation distribution. Permutation tests provide a “gold standard” for assessing two-sample t tests. If the two P-values differ considerably, it usually indicates that the conditions for the two-sample t don’t hold for these data. Because permutation tests give accurate P-values even when the sampling distribution is skewed, they are often used when accuracy is very important." (Moore, McCabe, Craig, Introduction to the Practice of Statistics, New York: W. H. Freeman and Company, 2009).