test for paired proportions


I'm not sure which test is the right one for my research and would love to hear your thoughts.

I have a small group (<30) of people who were asked to say as many words as possible within 1 minute, all in the same language, L1. For each person I calculated the proportion of verbs. The participants were then asked to do the same thing again, this time in another language, L2. Now I have 2 sets of data: the proportions for L1 and for L2. The samples are paired, yet the total number of words in each language is different. I know that I can't use the Wilcoxon test for paired samples, but I can't think what would be the right thing to do here.
My H1 is that the proportion in L2 is greater than in L1.


Active Member
Hi Michael,

Let's take a different question: a school examination where you compare the average marks of girls and boys. Which test would you use?
(An examination mark may also be a proportion of questions ...)

Unlike the 2-sample proportion test, you don't compare 2 proportions; you compare the "marks" of each person, where a mark is that person's verb proportion.
So the sample size is the number of people, not the number of words?

You may decide to compare the overall proportion (subject 1 verbs + ... + subject n verbs) / (subject 1 total words + ... + subject n total words).
But in this case, you lose the "pairs" and you also give a different weight to every person.
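
To make the weighting point concrete, here is a minimal sketch with made-up counts (the numbers are purely hypothetical, not from the study). It compares the mean of the per-person proportions with the pooled proportion for one language.

```python
# Hypothetical counts for three speakers in one language: (verbs, total words).
counts = [(5, 10), (6, 40), (9, 30)]

# Per-person proportions: every speaker gets the same weight.
per_person = [verbs / total for verbs, total in counts]
mean_of_proportions = sum(per_person) / len(per_person)

# Pooled proportion: speakers who produced more words get more weight.
pooled = sum(v for v, _ in counts) / sum(t for _, t in counts)

print(f"mean of per-person proportions: {mean_of_proportions:.3f}")
print(f"pooled proportion:              {pooled:.3f}")
```

With these invented counts the two summaries differ (about 0.317 versus 0.250), because the speaker with 40 words dominates the pooled figure.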

I would try the paired t-test, and if the data don't meet its assumptions I would go to the Wilcoxon signed-rank test.
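
A rough sketch of both tests in Python, assuming SciPy is available and using invented proportions (the lists `p_l1` and `p_l2` are hypothetical, one entry per person, in the same order). The one-sided alternative matches the H1 that the proportion in L2 is greater than in L1.

```python
from scipy import stats

# Hypothetical verb proportions, one value per person, same order in both lists.
p_l1 = [0.20, 0.25, 0.18, 0.30, 0.22, 0.27, 0.15, 0.24]
p_l2 = [0.26, 0.24, 0.25, 0.33, 0.30, 0.29, 0.24, 0.235]

# Paired t-test on the within-person differences (L2 minus L1), one-sided.
t_res = stats.ttest_rel(p_l2, p_l1, alternative="greater")

# Wilcoxon signed-rank test on the same differences, one-sided.
w_res = stats.wilcoxon(p_l2, p_l1, alternative="greater")

print(f"paired t: t = {t_res.statistic:.3f}, p = {t_res.pvalue:.4f}")
print(f"Wilcoxon signed-rank: p = {w_res.pvalue:.4f}")
```

The `alternative` argument of `ttest_rel` needs a reasonably recent SciPy. With a sample this small, a histogram or normal Q-Q plot of the differences is a sensible check before trusting the t-test.
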
The response variable is the number of words. Let's guess that it is Poisson distributed (with an expected value of mu).

There is a fixed effect, language (L1 and L2). Let's call that factor L.
There is a random effect, the person. Let's call that factor P.

So I suggest the Poisson regression model with the factors L and P:

log(mu) = a + L + P_i

The random effect P_i takes different values for each individual i, (and it is assumed to be normally distributed with zero mean and variance sigma^2).
That means that some individuals say more words and some say fewer.


Active Member
Hi Greta,

If I understand correctly, the question was to compare the proportion of verbs out of the total words between 2 languages.
While L is the number of all words per time unit?

Independently of that, what do you think about the other two options:
1. Two-sample proportion test. In this case, the weight of each subject depends on his or her personal number of words per minute.
2. Paired t-test, or if the differences are not close to normally distributed (as may be expected), the Wilcoxon signed-rank test.
"If I understand correctly, the question was to compare the proportion of verbs out of the total words between 2 languages."
Ehh, well, yes! The question was about comparing the proportions.

But I believe that the conclusion would be very similar (or even exactly the same) if it was based on the Poisson count.

Let us sum the word counts from both languages for each individual and call that n_i. Then let's condition on n_i and use a binomial model, so that the proportion of L1 words is p_i for individual i (and the proportion of L2 words is 1 - p_i).

Then use the logit model to estimate the effect of language and the random effect of person i (P_i):

log(p/(1-p)) = a + b*L + P_i

The model can then be fitted by maximum likelihood, and a likelihood ratio test can be used to test the effect of language.
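
A minimal sketch of the conditioning idea, under a strong simplifying assumption that is mine and not part of the model above: if the counts follow the Poisson model with an additive person effect on the log scale, the person effect cancels once you condition on n_i, so every person shares the same probability p that a word is in L1. The likelihood ratio test of "no language effect" is then a test of p = 0.5. The counts are invented, and note that this tests the total word count per language, not the verb proportion.

```python
import math

# Hypothetical total word counts per person: (words in L1, words in L2).
word_counts = [(32, 25), (28, 27), (40, 31), (22, 24), (35, 26), (30, 21)]

total_l1 = sum(l1 for l1, _ in word_counts)
total_all = sum(l1 + l2 for l1, l2 in word_counts)
total_l2 = total_all - total_l1

# Maximum likelihood estimate of the shared probability that a word is in L1.
p_hat = total_l1 / total_all
b_hat = math.log(p_hat / (1 - p_hat))  # the same estimate on the logit scale


def log_lik(p):
    # Binomial log-likelihood, dropping the constant that does not depend on p.
    return total_l1 * math.log(p) + total_l2 * math.log(1 - p)


# Likelihood ratio statistic for H0: p = 0.5 (no language effect).
lr_stat = max(2 * (log_lik(p_hat) - log_lik(0.5)), 0.0)

# Survival function of a chi-square with 1 df, written with the stdlib erfc.
p_value = math.erfc(math.sqrt(lr_stat / 2))

print(f"p_hat = {p_hat:.3f}, logit = {b_hat:.3f}")
print(f"LR statistic = {lr_stat:.3f}, p-value = {p_value:.4f}")
```

Keeping a genuine random effect P_i on the logit scale, as in the equation above, needs mixed-model software (for example a binomial GLMM routine); this sketch only covers the special case where that variance is zero.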

I believe that this would be very close to (but maybe better than) the paired t-test or the Wilcoxon signed-rank test.

If the number of words is not too small, the observed proportion will be approximately normal, and the count of words (in the Poisson model) will also be approximately normal. But maybe the ML estimates will be slightly better than the mean of proportions, and the likelihood ratio test better than the Wald test in the paired t-test.

But it is similar to the case where a paired t-test is the same as a two-way ANOVA with treatment as one factor and person as the other factor.
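
That equivalence can be checked numerically. The sketch below uses invented proportions (the `data` values are hypothetical) and computes the paired t statistic and the F statistic for language from a two-way ANOVA without interaction. The two should satisfy F = t^2.

```python
import math
from statistics import mean

# Hypothetical verb proportions; row i is person i, columns are L1 and L2.
data = [(0.20, 0.26), (0.25, 0.24), (0.18, 0.25), (0.30, 0.33), (0.22, 0.30)]
n = len(data)

# Paired t statistic computed from the within-person differences.
diffs = [l2 - l1 for l1, l2 in data]
d_bar = mean(diffs)
s2 = sum((d - d_bar) ** 2 for d in diffs) / (n - 1)
t_stat = d_bar / math.sqrt(s2 / n)

# Two-way ANOVA without interaction: language (2 levels) and person (n levels).
grand = mean(x for row in data for x in row)
lang_means = [mean(row[j] for row in data) for j in range(2)]
person_means = [mean(row) for row in data]
ss_lang = n * sum((m - grand) ** 2 for m in lang_means)
ss_resid = sum(
    (data[i][j] - person_means[i] - lang_means[j] + grand) ** 2
    for i in range(n)
    for j in range(2)
)

# Language has 1 df; the residual has (n - 1) * (2 - 1) = n - 1 df.
f_stat = (ss_lang / 1) / (ss_resid / (n - 1))

print(f"t^2 = {t_stat ** 2:.6f}, F = {f_stat:.6f}")
```

The person factor absorbs the between-person variation, which is exactly what taking differences does in the paired t-test.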