Chi squared and phi coefficient


New to statistics and I'm sure that will be obvious!

I'm taking over from someone at work who conducts the following analysis:

A test group are receiving an email.

A control group are not.

The two categories are:

Does sign up to online account management

Does not sign up to online account management

Here are the observed results:

Test: 12870 signed up, 49184 did not.
Control: 1535 signed up, 9993 did not.

If I do a chi squared test I get the following results:

Chi squared stat: 340.38
P value < 0.00001

This shows the result to be highly significant, i.e. we reject the null hypothesis that there is no association between group and the categories.
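For anyone wanting to check the figures, here is a minimal Python sketch of the chi squared calculation on the 2x2 table above. It assumes the plain Pearson statistic with no continuity correction, which appears to match the 340.38 quoted; the function name `chi_squared_2x2` is made up for this illustration.

```python
import math

def chi_squared_2x2(a, b, c, d):
    """Pearson chi squared (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Rows: test, control. Columns: signed up, did not sign up.
chi2 = chi_squared_2x2(12870, 49184, 1535, 9993)

# With 1 degree of freedom, P(chi squared > x) equals erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"chi squared = {chi2:.2f}, p value = {p_value:.3g}")
```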

However, my colleague then chose to calculate the phi coefficient, which comes out at 0.068145.

The above was calculated using R. I assume you don't need to multiply it by 100 or anything?

From Google I can see that a figure of 0.068 means that there is 'no or negligible relationship'.

My question is:

These two results completely contradict each other. Chi squared is incredibly significant and phi says there's no association. Is there an association here or not?! Does receiving the email make you more likely to sign up?

Are there rules about when to use the phi coefficient rather than chi squared, or vice versa?

Could there be rules regarding sample size? I thought my numbers above were fairly large. We're not talking single digits.

I'm aware there are other methods of working out correlation, so any advice or links to particular articles that would help would be appreciated.

Thank you

Dr Strangelove


Active Member
The chi squared p value is so small simply because your samples are so large. The small p value doesn't tell you how big the effect is. However, it does give strong evidence that the effect is real, i.e. not zero.
The phi = 0.068 says the correlation is small. But even quite small effects can have significant p values if the sample size is large enough.
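A rough Python sketch of that point, assuming the usual 2x2 relation phi = sqrt(chi squared / N). The "1/100 size" table is hypothetical (the same proportions with every count divided by 100, so the counts are fractional) and is only there to show how the statistics behave; it is not real data.

```python
import math

def chi_squared_2x2(a, b, c, d):
    """Pearson chi squared (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def phi_2x2(a, b, c, d):
    """Phi coefficient for a 2x2 table: sqrt(chi squared / N)."""
    return math.sqrt(chi_squared_2x2(a, b, c, d) / (a + b + c + d))

full = (12870, 49184, 1535, 9993)        # observed counts from the question
small = tuple(x / 100 for x in full)     # hypothetical: same proportions, 1/100 the size

results = {}
for label, table in (("full", full), ("1/100 size", small)):
    chi2 = chi_squared_2x2(*table)
    p = math.erfc(math.sqrt(chi2 / 2))   # p value for 1 degree of freedom
    results[label] = (chi2, phi_2x2(*table), p)
    print(f"{label}: chi squared = {chi2:.2f}, phi = {results[label][1]:.4f}, p = {p:.3g}")
```

Phi comes out at about 0.068 in both cases, while chi squared shrinks in proportion to the sample size and the smaller table is no longer significant at the 5% level.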
Thank you for replying. From what you are saying, I shall make sure I use phi in addition to chi squared tests in future. Otherwise I could be fooled into thinking the groups are related to the categories (if I get a significant p value). So as phi is so small, I should report that the correlation is too small to deem that receiving the email has a significant effect on signing up to online account management.

Does that all make sense or am I missing anything do you think?

Thank you


TS Contributor
If you use "significant" in two different ways, then this causes confusion.
The test result means that you have a statistically significant association (i.e. in the population, the association is not exactly = 0.00000000000), regardless of its strength. The phi coefficient indicates that in the present sample the association might be of little practical "significance".

With kind regards



Active Member
So as phi is so small, I should report that the correlation is too small to deem that receiving the email has a significant effect on signing up to online account management.
I wouldn't use phi to make a judgement. Look at the actual percentages who sign and don't and decide for yourself if the effect is important to you.
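As a sketch of what "look at the actual percentages" could mean in practice, the Python below computes the sign-up rate in each group, the difference, and the ratio. The 95% interval is an addition for illustration and assumes the simple normal (Wald) approximation for a difference of two proportions; other interval methods exist.

```python
import math

test_signed, test_not = 12870, 49184
control_signed, control_not = 1535, 9993

n_test = test_signed + test_not
n_control = control_signed + control_not

test_rate = test_signed / n_test              # share of the test group who signed up
control_rate = control_signed / n_control     # share of the control group who signed up
difference = test_rate - control_rate         # absolute difference in sign-up rates
ratio = test_rate / control_rate              # relative sign-up rate, test vs control

# Approximate 95% interval for the difference (Wald / normal approximation).
se = math.sqrt(test_rate * (1 - test_rate) / n_test
               + control_rate * (1 - control_rate) / n_control)
ci_low, ci_high = difference - 1.96 * se, difference + 1.96 * se

print(f"test: {test_rate:.1%}, control: {control_rate:.1%}")
print(f"difference: {difference:.1%} (approx. 95% CI {ci_low:.1%} to {ci_high:.1%})")
print(f"ratio: {ratio:.2f}")
```

On these counts the test group signs up at roughly 20.7% against roughly 13.3% for the control group, a gap of about 7 percentage points. Whether that gap justifies sending the email is a business judgement rather than a statistical one.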
Can you or anyone else expand on this please? I do find this stuff hard to learn. How do I make a judgement then? Chi squared says one thing, phi says something else, and now I'm being told to just do a quick division and decide whether I think that's important or not?

Thanks for your replies I appreciate it I just really want to understand this.
I'm still not clear on why to even use phi. There are dozens of examples of chi squared on YouTube where they conclude that x does have an effect on y because of the tiny p value. Why don't they mention taking any more steps?


Active Member
Very often the effect size is used at the planning stage where you can work out how big your sample needs to be for a given effect size. Working in reverse, your small effect size after the experiment is saying that your experiment is "overpowered". That is, it is capable of detecting very small differences, not that the difference is not real or important. You could have had the same result with a smaller experiment.
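A back-of-the-envelope Python sketch of that planning calculation, under stated assumptions: a 1 degree of freedom chi squared test, effect size w taken to be the observed phi of about 0.068, a two-sided 5% significance level, 80% power, and the common normal approximation N = ((z_alpha + z_power) / w)^2. The function name `required_total_n` is made up for this example, and dedicated power software may give slightly different numbers.

```python
import math
from statistics import NormalDist

def required_total_n(w, alpha=0.05, power=0.80):
    """Approximate total sample size for a 1-df chi squared test with effect size w."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for the two-sided test
    z_power = z.inv_cdf(power)           # quantile matching the target power
    return math.ceil(((z_alpha + z_power) / w) ** 2)

n_needed = required_total_n(0.068)
n_actual = 12870 + 49184 + 1535 + 9993

print(f"approx. total N needed: {n_needed}, actual total N: {n_actual}")
```

Under these assumptions a total sample somewhere under two thousand would have had a good chance of detecting an effect of this size, against the 73582 actually used.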
Effect size is a way of describing the size of an effect against the background variability. Although it looks quantitative, it is essentially descriptive. It is especially favoured by social scientists. Other areas tend to say there is a difference, it's this big, and it's up to you to decide whether it's important or not. I personally don't quote effect sizes.
There are certainly many situations where the effect size is not a good indicator of whether a result is useful, and yours may be one of them. You know that the email makes a difference. You decide whether the difference is worth the trouble of the email.