In the first experiment, I conducted 99 independent trials in a 2x2 factorial design, recording each trial as a success or a failure. The data are as follows:

FactorA  FactorB  No_successes  No_failures
0        0        25            0
0        1        25            0
1        0        16            8
1        1        24            1

Looking at the data, it seems clear that the only major drop in success rate occurs when FactorA=1 and FactorB=0. However, a GLM with binomial errors gives p = 1 for the interaction, while p < 0.01 for both main effects. That doesn't seem to make much sense!
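To make the pattern concrete, here is a minimal Python sketch that computes the success rate in each cell and the difference-of-differences contrast on the probability scale. It is only descriptive arithmetic on the counts from the table, not a model fit; the names `cells`, `success_rate` and `interaction` are my own.

```python
# Counts from the table: (FactorA, FactorB) -> (successes, failures)
cells = {
    (0, 0): (25, 0),
    (0, 1): (25, 0),
    (1, 0): (16, 8),
    (1, 1): (24, 1),
}


def success_rate(successes, failures):
    """Observed proportion of successes in one cell."""
    return successes / (successes + failures)


rates = {cell: success_rate(*counts) for cell, counts in cells.items()}

# Difference of differences on the probability scale:
# (effect of B when A=1) minus (effect of B when A=0).
interaction = (rates[(1, 1)] - rates[(1, 0)]) - (rates[(0, 1)] - rates[(0, 0)])

for (a, b), rate in sorted(rates.items()):
    print(f"A={a} B={b}: success rate = {rate:.3f}")
print(f"interaction contrast (probability scale) = {interaction:.3f}")
```

Note that both cells with FactorA=0 have zero failures, so their observed success rate is exactly 1 and the corresponding empirical log-odds are not finite. That may matter for how a logit-link model behaves on these data.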

In the second, larger data set (same 2x2 factorial design) the general picture is similar, but the differences in proportions between cells are even larger: for A=1 and B=0 the success rate is 0.37, while in the three other cells it is 0.79-0.95. When I analyze the data assuming binomial errors (as I should), p for the interaction is 0.45. When I analyze the same data assuming normally distributed errors (which is obviously wrong, but...), the significance of the main effects doesn't change dramatically, but p for the interaction drops below 0.001!
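One thing that may be relevant when comparing the two analyses: a binomial GLM with the usual logit link measures the interaction on the log-odds scale, while a normal-errors model measures it on the raw proportion scale, and the same four proportions can give quite different contrasts on the two scales. The sketch below illustrates this with the counts from the first experiment, since I have not listed raw counts for the second one. Adding 0.5 to every count is an assumption made only so that the log-odds are finite for the cells with zero failures; `corrected_rate` and `interaction_contrast` are hypothetical helper names, not part of any library.

```python
import math


def logit(p):
    """Log-odds of a proportion p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def interaction_contrast(p00, p01, p10, p11, scale=lambda p: p):
    """Difference of differences of the four cell proportions on a given scale."""
    return (scale(p11) - scale(p10)) - (scale(p01) - scale(p00))


def corrected_rate(successes, failures, k=0.5):
    """Success proportion after adding k to both counts (an assumption, see above)."""
    return (successes + k) / (successes + failures + 2 * k)


# Counts from the first experiment, keyed by "AB".
counts = {"00": (25, 0), "01": (25, 0), "10": (16, 8), "11": (24, 1)}
p = {cell: corrected_rate(s, f) for cell, (s, f) in counts.items()}

on_probability_scale = interaction_contrast(p["00"], p["01"], p["10"], p["11"])
on_logit_scale = interaction_contrast(p["00"], p["01"], p["10"], p["11"], scale=logit)

print(f"contrast on the probability scale: {on_probability_scale:.3f}")
print(f"contrast on the log-odds scale:    {on_logit_scale:.3f}")
```

These are only point contrasts, with no standard errors or tests attached, so they say nothing by themselves about significance. They just show that "interaction" means something different under the two error assumptions.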

It seems quite apparent that a GLM with a binomial error distribution seriously underestimates the interaction between the factors. Is there a procedure that improves the chances of detecting interactions in such data?