rank sums, exact distribution

fed2

Active Member
#1
Is there an easy way to compute the exact distribution of rank sums for, say, n observations split into two equal groups, n1 = n2 = n/2?

For example, with n = 4 and ranks 1, 2, 3, 4, group 1 could have rank sum 1 + 2 = 3, or 2 + 3 = 5, and so on.

Obviously there will be (n choose n1) distinct ways to assign ranks to group 1, but some of the sums will coincide. For example, 1 + 4 = 2 + 3. How do I determine the number of combinations contributing to each of the possible sums?
 

katxt

Active Member
#2
In the original Mann-Whitney paper, I think they calculated this sort of thing with recurrence relations, building up the distribution using values already found earlier in the table. This makes me suspect that there is no closed form giving the frequency as a function of n and rank sum. However, it is probably possible to use this approach to get exact answers for particular values of n and rank sum.
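To illustrate the idea, here is a minimal Python sketch of one standard recurrence for this counting problem. It is not necessarily the exact form used in the original paper. The assumption is that there are no ties, so the ranks are 1..n, and the function names are made up for illustration. The recurrence splits on whether the largest rank n is in group 1: N(n, k, s) = N(n-1, k, s) + N(n-1, k-1, s-n).

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_rank_sums(n, k, s):
    """Number of k-subsets of the ranks {1, ..., n} whose sum is s."""
    if k == 0:
        return 1 if s == 0 else 0
    if n == 0 or s <= 0:
        return 0
    # Either rank n is left out of the group, or it is included.
    return count_rank_sums(n - 1, k, s) + count_rank_sums(n - 1, k - 1, s - n)


def rank_sum_distribution(n, k):
    """Map each attainable rank sum to its number of combinations."""
    lowest = k * (k + 1) // 2            # ranks 1..k
    highest = k * (2 * n - k + 1) // 2   # ranks n-k+1..n
    return {s: count_rank_sums(n, k, s) for s in range(lowest, highest + 1)}


print(rank_sum_distribution(4, 2))
```

Dividing each count by (n choose k) gives the exact null probability of that rank sum.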
 

katxt

Active Member
#4
Or, if you want the table for a specific smallish n, you can just write a simple program with nested loops in a couple of minutes.
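As a rough sketch of that brute-force approach in Python: instead of hand-written nested loops it uses itertools.combinations, which does the same enumeration for any group size. The function name is just for illustration, and no ties are assumed.

```python
from collections import Counter
from itertools import combinations


def rank_sum_counts(n, n1):
    """Tally how many ways group 1 (size n1) can reach each rank sum."""
    ranks = range(1, n + 1)
    return Counter(sum(group) for group in combinations(ranks, n1))


counts = rank_sum_counts(4, 2)
for s in sorted(counts):
    print(s, counts[s])
```

With 12 per group this enumerates (24 choose 12) = 2,704,156 combinations, which is still quick on an ordinary machine.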
 

fed2

Active Member
#8
Hi guys, thanks for your responses and guidance. I think both will be instructive for me.

Actually, my interest in the subject is related to sample sizes that are pretty small, definitely fewer than 12 per group or so, and I am really more interested in exact stratified Wilcoxon (van Elteren) type tests. The reason is that SAS does not seem to have an exact version of that test implemented. I suspect R does, but I have to find the package.

I was just looking for a way to easily QC the p-values I was getting out of my computer. I thought maybe there was an easy way to work this out quickly in Excel, at least for reasonably small n. Maybe it will occur to me when I look at these references.
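For that kind of QC, a minimal Python sketch along these lines might do for the plain two-sample case (unstratified, no ties). The function name is hypothetical, and the two-sided value doubles the smaller tail, which is one common convention; other software may define it differently.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def exact_rank_sum_pvalues(n, n1, w_obs):
    """Exact lower, upper and two-sided p-values for group 1's rank sum."""
    counts = Counter(sum(g) for g in combinations(range(1, n + 1), n1))
    total = sum(counts.values())  # equals (n choose n1)
    lower = Fraction(sum(c for s, c in counts.items() if s <= w_obs), total)
    upper = Fraction(sum(c for s, c in counts.items() if s >= w_obs), total)
    # Assumed convention: double the smaller tail, capped at 1.
    two_sided = min(Fraction(1), 2 * min(lower, upper))
    return lower, upper, two_sided


lower, upper, two_sided = exact_rank_sum_pvalues(8, 4, 10)
print(float(lower), float(upper), float(two_sided))
```

For 4 vs 4 with the smallest possible rank sum of 10, this gives 1/70 one-sided and 2/70 two-sided, which is easy to check by hand.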

Thanks!
 

obh

Well-Known Member
#9
Hi Fed,

If I remember correctly, the calculation should be fast up to ~17, so 12 should be fast enough on any computer.
But why calculate it yourself when you can use R?
R contains both the exact method and the normal approximation, but when there are ties it uses only the approximation!
wilcox.test(x1, x2, alternative = "two.sided", paired = FALSE, exact = TRUE)
If you have ties, you may consider calculating as per the article I sent before (or use simulation).
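For small samples with ties, one further option is to enumerate the permutation distribution of the midranks directly. Below is a minimal Python sketch under that assumption. It is not what wilcox.test does and not necessarily the method in the article, the function names are hypothetical, and the two-sided value again doubles the smaller tail.

```python
from fractions import Fraction
from itertools import combinations


def midranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [Fraction(0)] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = Fraction(i + j + 2, 2)  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def exact_pvalue_with_ties(x1, x2):
    """Two-sided exact p-value, conditional on the observed midranks."""
    ranks = midranks(list(x1) + list(x2))
    n1 = len(x1)
    w_obs = sum(ranks[:n1])
    # Every way of picking n1 of the pooled midranks is equally likely.
    sums = [sum(g) for g in combinations(ranks, n1)]
    lower = Fraction(sum(s <= w_obs for s in sums), len(sums))
    upper = Fraction(sum(s >= w_obs for s in sums), len(sums))
    return min(Fraction(1), 2 * min(lower, upper))


print(exact_pvalue_with_ties([1, 2, 2], [2, 3, 4]))
```

The enumeration grows as (n choose n1), so this is only practical for the small group sizes discussed here.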

Another option is to use the following online calculator. It gets the same results as R and also shows the equivalent R code.
Its exact method is only available up to 20 per group, so 12 should be okay.
https://www.statskingdom.com/170median_mann_whitney.html
 

fed2

Active Member
#10
I just wanted to follow up on this: I think the R package covering a broad range of exact non-parametric tests is 'coin'.

Also, it seems that someone did solve the recurrence relation @katxt mentioned above, although they used Mathematica. I had never come across that recurrence, thanks again!
 
