2 groups with ratio(s); How to test between-/within-group differences?



Hi statistics community

I am wondering if anyone could please give me advice on my rather simple (I assume) statistical problem.

I am working with animal breeding data. I have two groups of animals: group 1, in which both partners come from the same region, and group 2, in which the partners come from different regions. For both groups I have breeding success as the ratio of surviving young to eggs laid, and hatching success as the ratio of hatched young to eggs laid.
I thought it would be good to work with ratios because the first group laid 200 eggs in total, while the second group laid only 50.

1) I see a difference in breeding success between the two groups. Group 1 has about 25% and group 2 about 15% breeding success. How would I be able to test whether this difference is significant?

2) I see that in both groups hatching success is larger than breeding success: hatching success is 40% in group 1 and 20% in group 2. How would I be able to test whether the loss (the difference between hatching success and breeding success) is significant within each group and between groups?

I assume that because I work with ratios, which correct for the differences in egg number, I don't need to worry about these differences in the statistical tests, correct?

If anyone could give me advice on which statistical procedures may be the most appropriate to follow, I would greatly appreciate it. Any suggestions involving Excel, R or SPSS would be great... with simple Excel being my naive preference.

This is my first post here, so I am looking forward to any replies,
Thanks in advance for your thoughts,
Hi and welcome

The first question seems like a 2x2 table.

---------breeding success------------breeding failure
group 1-----50---------------------------150-------
group 2-----8-----------------------------42-------

Now use Fisher's exact test to check whether there is any association between pairing type and breeding success. Note that you should put the real counts in this table, not percentages.
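For readers who want to see what the test does with the table above, here is a minimal sketch in plain Python. The function name `fisher_exact_two_sided` is only an illustrative helper, not something from the thread, and the two-sided p-value is computed by the common convention of summing all tables that are no more probable than the observed one. The counts are the approximate ones from the table above, so the real counts should be substituted.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Illustrative helper (assumption, not from the thread): sums the
    hypergeometric probabilities of all tables with the same margins
    that are no more probable than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability of x successes in row 1 with all margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance so ties with the observed table are included.
    total = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )
    return min(1.0, total)


# Counts from the table above: group 1 = 50 successes, 150 failures;
# group 2 = 8 successes, 42 failures.
p_value = fisher_exact_two_sided(50, 150, 8, 42)
print(f"two-sided p-value: {p_value:.4f}")
```

In R the same analysis is `fisher.test()` applied to a 2x2 matrix of counts; the sketch above is only meant to make the calculation explicit.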
For the second question, you can first run a similar analysis to see whether there is any association between group (1 or 2) and hatching proportion.

For checking if the loss rate was different between the two groups, the same test can help:

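---------Hatched and bred------------------Hatched but not bred
group 1-----N1-----------------------------------N2-------
group 2-----N3-----------------------------------N4-------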

Now put the real numbers in cells N1 to N4 and run a Fisher's test.

For assessing the significance of loss rate within each group (again Fisher's test):

In each group:------Hatching success--------------hatching failure-----
Breeding Success:--------N1------------------------N2 (= 0)------------
Breeding Failure :---------N3------------------------N4-----------------


Less is more. Stay pure. Stay poor.

The Fisher's exact test would be appropriate for the first question.

However, I am unable to wrap my mind around the second question, which should probably have three variables (hatching success, breeding success, paternity). I am not sure you can compare hatching success with breeding success without accounting for paternity if your first Fisher's test comes up significant. If so, one option may be the Cochran-Mantel-Haenszel test or some type of multivariate procedure. But it still seems wrong. Do you record each bird as having bred yes/no (with multiple counts per bird, or just once)? What if a bird failed to breed and so produced no egg? You would be missing that information. And do you control for each bird if there are repeated counts? I am still very confused, which is probably conveyed by the lack of organization in this post.

Perhaps you need to provide more details about what data you have. Multiple data for each bird pair?


You may be able to use Fisher's test, but the results would probably need to be presented with the prior information clearly laid out.

If we were racing two types of cars, and one type always broke down, the cars eligible to win the race would be disproportionate: say two cars of the first type versus 50 cars of the second type (I realize we would probably be examining proportions, so this is a little bit off). It just seems like the first test may influence the second, especially if the type A cars still in the race were not completely broken down but running at a lower efficiency level, clunking along. You could compare how many broke down and how many won, but the type of car could still be affecting the outcome. Sorry that I cannot provide a better example, but it seems like you would be omitting potentially useful information, similar to a hierarchical model (probably another poor example).

Also, for the first question, reporting relative risks could help put tangible numbers alongside the p-values, which may show differences.
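As a rough illustration of the relative-risk idea, the sketch below computes the ratio of the two breeding-success proportions with an approximate 95% confidence interval on the log scale. The helper name `relative_risk` and the use of the large-sample (log-normal) interval are assumptions made for this example, and the counts are the approximate ones from the 2x2 table earlier in the thread.

```python
from math import exp, log, sqrt


def relative_risk(success_1, total_1, success_2, total_2, z=1.96):
    """Relative risk of group 1 vs group 2 with an approximate CI.

    Illustrative helper (assumption): uses the standard large-sample
    standard error of log(RR); z=1.96 gives roughly a 95% interval.
    """
    risk_1 = success_1 / total_1
    risk_2 = success_2 / total_2
    rr = risk_1 / risk_2
    se_log_rr = sqrt(1 / success_1 - 1 / total_1 + 1 / success_2 - 1 / total_2)
    lower = exp(log(rr) - z * se_log_rr)
    upper = exp(log(rr) + z * se_log_rr)
    return rr, lower, upper


# 50 of 200 eggs succeeded in group 1, 8 of 50 in group 2 (approximate counts).
rr, lower, upper = relative_risk(50, 200, 8, 50)
print(f"relative risk: {rr:.2f} (approx. 95% CI {lower:.2f} to {upper:.2f})")
```

An interval that includes 1 would be consistent with no difference between the groups; with only 50 eggs in group 2 the large-sample interval is a rough guide at best.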


Thank you all for your replies and thoughts so far!

My raw data are eggs. If individuals breed repeatedly over the study period (several decades), they are counted repeatedly. They are also counted repeatedly if they produce an egg with a different partner. So every egg is counted.
Do you suggest that, in order to answer whether parent origin influences breeding/hatching success, I should count each pair only once? I thought this might skew the picture because the decision about who is paired is made by humans, so naturally I would have fewer group 2 breeding pairs (people prefer keeping group 1 pairs).
I intended to avoid that by using ratios and expressing breeding/hatching success as fraction of laid eggs.

hlsmith, I don't understand why the second test, but not the first one, would have three variables. I thought of origin as a nominal variable (values would be "same" or "different"); the other two variables would then be breeding and hatching success. Am I wrong?

I will post the data more structured again:

Group 1
- parents from same region, i.e. both from region A or both from region B
breeding success 25% (= 50 of 200 eggs yield a surviving young)
hatching success 40% (= 80 of 200 eggs hatch)

Group 2
- parents from different regions, i.e. one from region A and one from region B
breeding success 14% (= 7 of 50 eggs yield a surviving young)
hatching success 20% (= 10 of 50 eggs hatch)

Thinking about your question, hlsmith, I realize that maybe I should not lump all eggs per group together. Perhaps I should count the number of pairs in each group, compute breeding and hatching success for each pair, compute the means per group, and then check whether the mean breeding success differs statistically between groups using Fisher's exact test (and the same for hatching success), like you suggest. That would answer my first question.
Then I could compute the difference between hatching and breeding success within each group and again use Fisher to test if the difference in group 1 is significantly different from the difference in group 2. Yes?

I always write "success" above, thinking of a ratio, but you suggest that I should use the actual number of hatched and survived young, correct? Would the different number of eggs in both groups, which is a result of human preference of group 1 pair constellations, not bias the outcome then?

...and this morning I thought this was a simple question.
Thanks a lot for your help and input,
kind regards, N


This might still be a little more complex than it seems; it depends on how much of the variability we want to try to account for at a time. But yes, use the actual numbers in calculations, not the percentages.
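To make "use the actual numbers" concrete, here is a small sketch that turns the egg counts posted above into the count pairs that would go into each 2x2 table. The dictionary layout and the helper name `count_tables` are assumptions for illustration only. Note that it treats every egg as independent, which ignores the repeated counts per pair discussed in this thread.

```python
# Counts as posted in the thread: eggs laid, eggs hatched, young surviving.
groups = {
    "group 1 (same region)": {"eggs": 200, "hatched": 80, "survived": 50},
    "group 2 (different regions)": {"eggs": 50, "hatched": 10, "survived": 7},
}


def count_tables(group):
    """Return (success, failure) count pairs for one group (illustrative helper)."""
    return {
        # Surviving young vs. eggs that did not yield a surviving young.
        "breeding": (group["survived"], group["eggs"] - group["survived"]),
        # Hatched eggs vs. eggs that did not hatch.
        "hatching": (group["hatched"], group["eggs"] - group["hatched"]),
        # Among hatched eggs only: survived vs. lost after hatching.
        "loss": (group["survived"], group["hatched"] - group["survived"]),
    }


tables = {name: count_tables(group) for name, group in groups.items()}

for name, table in tables.items():
    for kind, (successes, failures) in table.items():
        share = successes / (successes + failures)
        print(f"{name:28s} {kind:9s} {successes:3d} vs {failures:3d} ({share:.1%})")
```

Each row pair (group 1 and group 2 for the same kind of table) gives the four cell counts for one Fisher's exact test.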


Thanks again for all your comments the other day. I eventually did it with the Fisher's exact test like you suggested... and my results seem biologically meaningful, so all is good. :)



Not to fuddle things up again, but if you followed birds for decades, could you also control for maternal age?


Hehe.. thanks for the comment, hlsmith.
Actually, I am just about to compose a thread with a question concerning this... give me a minute :)