t-test vs. Wilcoxon-test

#1
Hi there,

I am new in this community. I hope you can help me out.
For my master's thesis, I have a dataset of 6,043 observations. These observations contain the spread differential (delta) of matched securities (I compare one conventional corporate bond with a green corporate bond of the same issuer).
What I want to investigate is whether delta is significantly negative, i.e. H_0: delta >= 0 vs. H_1: delta < 0.

Therefore I ran two tests on these data:
1. one sample one tailed t-test
2. Wilcoxon test
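The post does not show the actual calls, but the two tests were presumably run roughly like this (a sketch; `delta` is a placeholder name for the spread-differential column, and the simulated values below only stand in for the real data):

```r
# Placeholder data standing in for the 6,043 observed spread differentials
set.seed(1)
delta <- rnorm(6043, mean = -0.05, sd = 1)

# 1. One-sample, one-tailed t-test: H0 mean(delta) >= 0 vs H1 mean(delta) < 0
t_res <- t.test(delta, mu = 0, alternative = "less")

# 2. Wilcoxon signed-rank test of the same one-sided location hypothesis
w_res <- wilcox.test(delta, mu = 0, alternative = "less")

t_res$p.value
w_res$p.value
```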

The results (run in R) show that test 1 rejects H_0 in favour of H_1, while test 2 fails to reject H_0.

Now I am somewhat frustrated, since the two tests give contradictory outcomes for my hypothesis.
What are your opinions?

Best Regards,
Emre
(Screenshot of the R test output attached)
 

hlsmith

Less is more. Stay pure. Stay poor.
#2
Can you provide a histogram of your data, so we can understand what you are working with? Also, see the quick simulation below, where both tests generate comparable frequentist conclusions. And why do you think the t-test would not be a good fit?

Code:
# Simulated data with the same sample size and a clearly negative mean
X <- rnorm(6043, -1, 1)
hist(X)
t.test(X, mu = 0, alternative = "less")
wilcox.test(X, mu = 0, alternative = "less")
 

Dason

Ambassador to the humans
#3
A histogram would be nice to see. If we are looking at a symmetric distribution, then the t-test and the Wilcoxon test should give comparable results. But if there is extreme skew, they might disagree. The t-test is a test of the mean; the Wilcoxon test is, roughly, a test of the median. In the skewed case these can be very different. Here is an example:

Code:
> y <- rexp(10000)
> mean(y)
[1] 1.003803
> median(y)
[1] 0.6841946
> t.test(y, mu = .9, alternative = "greater")

        One Sample t-test

data:  y
t = 10.177, df = 9999, p-value < 2.2e-16
alternative hypothesis: true mean is greater than 0.9
95 percent confidence interval:
 0.9870244       Inf
sample estimates:
mean of x 
 1.003803 

> wilcox.test(y, mu = .9, alternative = "greater")

        Wilcoxon signed rank test with continuity correction

data:  y
V = 22960851, p-value = 1
alternative hypothesis: true location is greater than 0.9
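For contrast, a quick sketch of the symmetric case, where mean and median coincide and the two tests line up (simulated normal data, arbitrary seed):

```r
set.seed(42)
# Symmetric population shifted slightly below zero: mean = median = -0.1,
# so both tests are addressing essentially the same location hypothesis
z <- rnorm(10000, mean = -0.1, sd = 1)

t_p <- t.test(z, mu = 0, alternative = "less")$p.value
w_p <- wilcox.test(z, mu = 0, alternative = "less")$p.value

# With this sample size and shift, both p-values come out tiny:
# the two tests agree that the location is below zero
c(t = t_p, wilcoxon = w_p)
```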
 
#4
Hi, thanks for the replies @hlsmith and @Dason,
The histogram shows that my data are skewed. That's why the t-test rejects H_0 (supporting the alternative "less") while the Wilcoxon test does not.
What would you conclude based on my data?
 

(Histogram of delta attached)
obh

Active Member
#5
Hi Emre,

The t-test and the Wilcoxon test don't test the same thing, so you shouldn't expect to get the same result...

1. If you want to check whether the probability that a random value falls below zero equals the probability that it falls above zero, choose the Wilcoxon test.

2. If you want to check whether the mean equals zero, use the t-test.

Despite the above, the Wilcoxon test is still often used as a substitute for the t-test, as it has fewer assumptions, and one may say it checks a similar question.

With symmetrical data, you will usually get similar results.

Asymmetrical example
[-5, -4, -3, -2, 14]
The average is 0.
The estimate of the probability of a randomly selected number being smaller than zero is 0.8.

Symmetrical example
[-3, -2, -1, 1, 2, 3]
The average is 0.
The estimate of the probability of a randomly selected number being smaller than zero is 0.5.
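The two toy vectors above can be checked directly in R (tiny n, so this is only illustrative of the mean-vs-sign distinction):

```r
asym <- c(-5, -4, -3, -2, 14)   # mean 0, but 4 of 5 values below 0
sym  <- c(-3, -2, -1, 1, 2, 3)  # mean 0, values balanced around 0

mean(asym)       # 0
mean(sym)        # 0

# Proportion of values below zero = estimate of P(X < 0)
mean(asym < 0)   # 0.8
mean(sym < 0)    # 0.5
```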
 
#7
Hi,
To use a t.test, you must check the normality of the distribution of your data with a shapiro.test.
If yes, you use the t.test; if not, you use wilcox.test.
F.
Well, not really. First of all, the distribution of the sample data itself is not that interesting.
What might matter for the one-sample t-test is whether the data come from a
normally distributed population (in the two-sample case: whether each sample is
drawn from a normally distributed population). But with n > 6000 it is really,
really not relevant for the validity of the test (i.e. for the random sampling
distribution of the mean) whether the population is normally distributed.
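This point can be checked by simulation: draw many samples of n = 6000 from a clearly non-normal (exponential) population with true mean 1, and count how often a 5%-level t-test of H0: mu = 1 falsely rejects (a quick sketch, not part of the original post):

```r
set.seed(123)
n_sims <- 2000

# For each simulated sample: heavily skewed data, but the null is TRUE
# (the exponential with rate 1 has mean exactly 1)
rejections <- replicate(n_sims, {
  x <- rexp(6000)
  t.test(x, mu = 1)$p.value < 0.05
})

# Empirical type I error rate: close to the nominal 0.05 despite the skew,
# because at n = 6000 the sampling distribution of the mean is ~normal
mean(rejections)
```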

With kind regards

Karabiner
 

Dason

Ambassador to the humans
#8
Hi,
To use a t.test, you must check the normality of the distribution of your data with a shapiro.test.
If yes, you use the t.test; if not, you use wilcox.test.
F.
I disagree for various reasons. One of the big ones is that it depends on what you are truly interested in. The two tests address different questions. In some cases those questions are essentially the same - in some cases they aren't. But it isn't as simple as you make it seem.