# rank sums, exact distribution

#### fed2

##### Active Member
Is there an easy way to compute the exact distribution of rank sums with, say, n observations split evenly into two groups (n1 = n2 = n/2)?

For example, with n = 4 and ranks 1, 2, 3, 4, group 1 could have 1 + 2 = 3, or 2 + 3 = 5, and so on.

Obviously there will be (n choose n1) distinct ways to assign ranks to group 1, but some of the sums will be redundant. For example, 1 + 4 = 2 + 3. How do I determine the number of combinations contributing to each of the possible sums?

#### katxt

##### Well-Known Member
In the original Mann-Whitney paper, I think they calculated this sort of thing with recurrence relations, building up the distribution using values already found earlier in the table. This makes me suspect that there is no closed form giving the frequency as a function of n and ranksum. However, it is probably possible to use this approach to get exact answers for particular values of n and ranksum.
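To make the recurrence idea concrete, here is a minimal sketch in Python. It is not taken from the Mann-Whitney paper; it just uses the standard observation that a group of k ranks drawn from 1..n either contains rank n or it does not, so count(n, k, s) = count(n-1, k, s) + count(n-1, k-1, s-n). The function name `rank_sum_counts` is made up for illustration.

```python
def rank_sum_counts(n, k):
    """Return {rank sum: number of k-subsets of {1..n} with that sum}.

    Assumes no ties, so the ranks are exactly 1..n.
    """
    # Largest possible sum: the k biggest ranks.
    max_sum = sum(range(n - k + 1, n + 1))
    # ways[j][s] = number of j-subsets of the ranks processed so far with sum s
    ways = [[0] * (max_sum + 1) for _ in range(k + 1)]
    ways[0][0] = 1  # the empty group has sum 0
    for r in range(1, n + 1):
        # Go downward in j so that rank r is added to a group at most once.
        for j in range(min(k, r), 0, -1):
            for s in range(r, max_sum + 1):
                ways[j][s] += ways[j - 1][s - r]
    return {s: c for s, c in enumerate(ways[k]) if c}


if __name__ == "__main__":
    for s, c in rank_sum_counts(4, 2).items():
        print(s, c)
```

For n = 4 and two per group this gives one combination each for sums 3, 4, 6 and 7, and two for sum 5, which matches the 1 + 4 = 2 + 3 example in the first post. Dividing each count by (n choose k) turns the table into the exact null distribution.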

#### fed2

##### Active Member
Thanks! I'll have to look into it.

#### katxt

##### Well-Known Member
Or, if you are wanting the table for a specific smallish n, you can just write a simple program with nested loops in a couple of minutes.
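As a rough illustration of the brute-force route, the sketch below lists every possible group 1 and tallies the sums. It uses `itertools.combinations` in place of hand-written nested loops, which is a substitution on my part, and the function name `rank_sum_counts_brute` is hypothetical. It assumes no ties, so the ranks are simply 1..n.

```python
from collections import Counter
from itertools import combinations


def rank_sum_counts_brute(n, n1):
    """Tally how many ways each rank sum can occur for a group of size n1."""
    ranks = range(1, n + 1)
    # One entry per possible group 1; Counter collapses equal sums.
    return Counter(sum(group) for group in combinations(ranks, n1))


if __name__ == "__main__":
    counts = rank_sum_counts_brute(4, 2)
    for s in sorted(counts):
        print(s, counts[s])
```

The number of groups to enumerate is (n choose n1), so this is only practical for small n, but it is handy as an independent check on a cleverer method.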

#### katxt

##### Well-Known Member
How big do you want n to be allowed to get?

#### fed2

##### Active Member
Hi guys, thanks for your responses and guidance. I think both will be instructive for me.

Actually, my interest in the subject is related to sample sizes that are pretty small, definitely less than 12 per group or so, and in actuality I am more interested in the exact stratified Wilcoxon (van Elteren) type tests. The reason is that SAS does not seem to have an exact version of that test implemented. I suspect R does, but I have to find the package.

I was sort of just looking for a way to easily QC the p-values I was getting out of my computer. I thought maybe there was an easy way to figure this quickly in Excel, at least for reasonably small n. Maybe it will occur to me when I look at these references.

Thanks!

#### obh

##### Well-Known Member
Hi Fed,

If I remember correctly the calculation should be fast until ~17, so 12 should be fast enough for any computer.
But why calculate it yourself when you can use R?
R contains both the exact method and the approximation, but in case of ties it uses only the approximation!
wilcox.test(x1, x2, alternative = "two.sided", paired = FALSE, exact = TRUE)
If you have ties, you may consider calculating per the article I sent before (or using simulation).

Another option is to use the following online calculator. It gets the same results as R, and also shows you the equivalent R code.
Its exact method is only available up to 20 per group, so 12 should be okay.
https://www.statskingdom.com/170median_mann_whitney.html
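For QC purposes, a hand-rolled check of the exact two-sided p-value might look like the sketch below. It is only a sketch under assumptions: no ties in the data, and the two-sided p-value taken as twice the smaller tail probability (capped at 1), which I believe is the convention R's `wilcox.test` follows, but please verify against your own output. The function name `exact_rank_sum_pvalue` is hypothetical.

```python
from itertools import combinations


def exact_rank_sum_pvalue(x1, x2):
    """Exact two-sided rank-sum p-value by full enumeration (no ties allowed)."""
    pooled = sorted(list(x1) + list(x2))
    if len(set(pooled)) != len(pooled):
        raise ValueError("ties present; this sketch assumes no ties")
    # Rank of each value in the pooled sample (1 = smallest).
    rank = {v: i + 1 for i, v in enumerate(pooled)}
    observed = sum(rank[v] for v in x1)
    # Rank sum of every possible group 1 under the null hypothesis.
    sums = [sum(g) for g in combinations(range(1, len(pooled) + 1), len(x1))]
    lower = sum(s <= observed for s in sums) / len(sums)
    upper = sum(s >= observed for s in sums) / len(sums)
    # Assumed convention: double the smaller tail, cap at 1.
    return min(1.0, 2 * min(lower, upper))


if __name__ == "__main__":
    print(exact_rank_sum_pvalue([1.1, 2.2, 3.3], [4.4, 5.5, 6.6]))
```

With three values per group and complete separation, only 1 of the 20 possible groupings is as extreme in each direction, so the two-sided value comes out as 2 × 1/20 = 0.1.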

#### fed2

##### Active Member
I just wanted to follow up on this: I think the R package for a broad range of exact non-parametric tests is 'coin'.

Also, it seems that someone did solve the recurrence relation @katxt mentioned above, although they used Mathematica. I never knew about that recurrence, thanks again!


#### katxt

##### Well-Known Member
And more than likely the dynamic programming article obh mentioned is effectively a recurrence relation as well.