Dear Community,
We are trying to perform a Wilcoxon signed-rank test in R, but we suspect it is not working properly because our dataset is too large. We have around 10,000 paired samples: the number of hospital appointments per patient before and after an intervention. About half of the population have a difference of zero, i.e. the same number of appointments in the one-year period before and after the intervention.
The data is in a straightforward format, e.g.:
Patient PRE POST
1 3 9
2 7 6
3 2 1
etc…
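To make the layout concrete, here is a minimal sketch that rebuilds only the three example rows above and summarizes the paired differences. The data frame name `toy` is hypothetical, and the column names `PRE_followup` and `POST_followup` are assumed to match the ones used in our call below; the real data has around 10,000 rows.

```r
# Hypothetical toy data: only the three example rows shown above
toy <- data.frame(
  Patient       = c(1, 2, 3),
  PRE_followup  = c(3, 7, 2),
  POST_followup = c(9, 6, 1)
)

# Paired difference per patient, PRE minus POST, which is the
# direction wilcox.test(x, y, paired = TRUE) uses (x - y)
toy$diff <- toy$PRE_followup - toy$POST_followup

# Share of patients with no change (about half in our full data)
prop_zero <- mean(toy$diff == 0)

# Counts of negative, zero, and positive differences
sign_table <- table(factor(sign(toy$diff), levels = c(-1, 0, 1)))

print(toy)
print(prop_zero)
print(sign_table)
```

Running the same summary on the full data would show how many pairs are ties, which the signed-rank test drops before ranking.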
We have run the following code in R:
test5 <- wilcox.test(mydata$PRE_followup, mydata$POST_followup,
                     mu = 0, alternative = "two.sided", paired = TRUE,
                     conf.int = TRUE, conf.level = 0.95,
                     exact = FALSE, correct = FALSE)
Which gives us:
Wilcoxon signed rank test
data: mydata$PRE_followup and mydata$POST_followup
V = 34697374, p-value < 2.2e-16
alternative hypothesis: true location shift is not equal to 0
95 percent confidence interval:
0.5000322 0.9999830
sample estimates:
(pseudo)median
0.5000005
Because of the size of our sample, we are getting p-values that seem far too significant. Is there a correction that can be applied? Or is there an alternative test that our incessant googling has missed?
Thank you very much!