Comparing proportions between two partially overlapping samples coming from one population

johnstatistic

New Member
Hello there,
I have a population of colorectal polyps, divided into 3 groups by size: 1-3 mm, 1-5 mm, 1-10 mm. So, each group contains the subjects from the smaller group.
Now, I want to compare the proportion of a binary characteristic X among these groups.
It seems that a chi-square test is not enough, as the subjects overlap among these groups.
Which test do you think is appropriate for comparing proportions between these groups?
Thanks.

hlsmith

Less is more. Stay pure. Stay poor.
Most of the time you are breaking the independence assumption between observations, so you are out of luck. However, if you had a ton of data you may be able to figure something out.

What is your sample size of unique patients, and how much overlap are we talking about?

johnstatistic

New Member
The overlap between groups is of a high percentage. However, the whole sample size is large, about 3200 polyps.
Do you think a phi correlation coefficient can work?
Thanks again.

katxt

Active Member
This sounds like it might be a straightforward job for a resampling test. If you resample the polyps then the independence problem goes away.
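One way the resampling idea could be sketched, assuming each polyp is an independent unit with a size in mm and a 0/1 flag for X (the function name, arguments and toy data below are made up for illustration, not taken from the OP's dataset): resample whole polyps with replacement, rebuild the nested groups inside each resample, and look at the spread of the difference in proportions.

```python
import random

def bootstrap_nested_diff(sizes_mm, has_x, cut_a, cut_b, n_boot=2000, seed=0):
    """Percentile bootstrap interval for
    p(X | size <= cut_a) - p(X | size <= cut_b), resampling whole polyps."""
    rng = random.Random(seed)
    polyps = list(zip(sizes_mm, has_x))
    n = len(polyps)

    def diff(sample):
        a = [x for s, x in sample if s <= cut_a]
        b = [x for s, x in sample if s <= cut_b]
        if not a or not b:
            return None  # a group came out empty in this resample
        return sum(a) / len(a) - sum(b) / len(b)

    observed = diff(polyps)
    diffs = []
    for _ in range(n_boot):
        resample = [polyps[rng.randrange(n)] for _ in range(n)]
        d = diff(resample)
        if d is not None:
            diffs.append(d)
    diffs.sort()
    lo = diffs[int(0.025 * len(diffs))]
    hi = diffs[min(len(diffs) - 1, int(0.975 * len(diffs)))]
    return observed, lo, hi

# Toy data (hypothetical): 200 polyps, X present only in the 4-5 mm ones.
toy_sizes = [1, 2, 3, 4, 5] * 40
toy_x = [1 if s >= 4 else 0 for s in toy_sizes]
print(bootstrap_nested_diff(toy_sizes, toy_x, cut_a=3, cut_b=5))
```

An interval that excludes 0 would suggest the two nested groups differ in the proportion of X. If several polyps come from the same patient, resampling patients rather than polyps would be the analogous choice.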

hlsmith

Less is more. Stay pure. Stay poor.
So polyps were clustered in people? If so, do you know which person each polyp was in, so you can control for the within-person variability? Also, are there any fixed effects at the person level you would need to control for, say obesity, genes, etc.?

fed2

Active Member
seems like you need to define the populations to be mutually exclusive. if you don't, you may be able to conjure a statistical test, but it will be equivalent to the test which compares mutually exclusive populations. how can a population be different from a subset of itself unless the corresponding partition is significant, in some sense? you may use a t-test.

katxt

Active Member
you may be able to conjure a statistical test, but it will be equivalent to the test which compares mutually exclusive populations.
It sounds sort of right but I can't quite see why it has to be true.
I still maintain that a resampling test would take care of all those difficulties. And is not conjuring.

fed2

Active Member
I still maintain that a resampling test would take care of all those difficulties.
you might be a statistician if: you care more about debating whether assumptions for a statistical test are met ...

say you got 3 dudes, dude 1 to 3: consider 1/2 (Y1 + Y2) - 1/3 (Y1 + Y2 + Y3) = 0. that simplifies to 1/2 (Y1 + Y2) = Y3, so I think it basically goes from there that you must have equality of the partition means, etc.
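A quick numeric check of that algebra, as a sketch with a made-up helper name: the difference between a subset's mean and the mean of the whole set is the subset-vs-rest difference scaled by n_rest / n_all, so one is zero exactly when the other is.

```python
from fractions import Fraction

def nested_vs_disjoint(subset, rest):
    """Return (mean(subset) - mean(subset + rest),
               mean(subset) - mean(rest),
               n_rest / n_all) as exact fractions."""
    n_sub, n_rest = len(subset), len(rest)
    mean_sub = Fraction(sum(subset), n_sub)
    mean_rest = Fraction(sum(rest), n_rest)
    mean_all = Fraction(sum(subset) + sum(rest), n_sub + n_rest)
    return mean_sub - mean_all, mean_sub - mean_rest, Fraction(n_rest, n_sub + n_rest)

# The 3-dudes case with arbitrary 0/1 values: Y1 = 1, Y2 = 0, Y3 = 1.
nested, disjoint, weight = nested_vs_disjoint([1, 0], [1])
print(nested, disjoint, weight)
print(nested == weight * disjoint)
```

Since the weight is a fixed positive number, testing whether the nested difference is zero asks the same question as testing the disjoint difference.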

katxt

Active Member
OK. Summary so far, as I understand it. There are 3 groups G1, G2, G3 and 3 sizes, small, medium and large S, M, L.
G1 = S, G2 = S+M and G3 = S+M+L.
We want to test for differences between the groups.
For G1 vs G2 test S vs M. For G2 vs G3 test L vs S+M. For G1 vs G3 test M+L vs S.
t-tests are OK because of the very large sample size, or use some other test for proportions.
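As a sketch of what "some other test for proportions" might look like on the disjoint bins, here is a pooled two-proportion z-test in plain Python. The counts are invented for illustration (they only add up to roughly the 3200 polyps mentioned above), and the function name is hypothetical.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # 2 * (1 - Phi(|z|))
    return z, p_value

# Hypothetical (polyps with X, total polyps) per disjoint size bin.
S, M, L = (300, 1500), (260, 1000), (230, 700)

def pool(*bins):
    # Merge disjoint bins by adding their counts.
    return sum(b[0] for b in bins), sum(b[1] for b in bins)

print("G1 vs G2 (S vs M):  ", two_proportion_z(*S, *M))
print("G2 vs G3 (S+M vs L):", two_proportion_z(*pool(S, M), *L))
print("G1 vs G3 (S vs M+L):", two_proportion_z(*S, *pool(M, L)))
```

This still assumes polyps are independent of each other, so it would not cover the case where one patient contributes several polyps.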

hlsmith

Less is more. Stay pure. Stay poor.
I would still like to hear the OP say that a single person is not contributing more than one polyp to the dataset!