Why use Chi-square?

#1
Hi,

I am putting together a statistics course about inference techniques to use when
the predictor variable(s) are categorical (by which I mean, cannot be put into any
meaningful order). I am including both Fisher's exact test for mxn tables (we are
using R, which does this) and the chi-square test. My understanding is that the
two differences between these techniques are:
1) Fisher's exact test is for sampling without replacement, and chi-square is for
sampling with replacement.
2) The chi-square test makes use of the normal approximation.

My question is: given new computational tools, and the fact that R, for instance, can
implement Fisher's exact test for mxn tables, what is the argument for ever using
the chi-square test for mxn tables?

I suppose I could hypothesize a situation in which the sample size is large compared
to the population, the sampling is done with replacement, and the sample size is
large enough that the normal approximation is good, but I suspect such situations
occur very rarely in real life.

On the other hand, I am also familiar with the work on exact versus approximate
confidence intervals for proportions, and the fact that unless the sample size is
small and the proportion is very close to 0 or 1, approximate confidence intervals are
preferable. So it seems plausible that chi-square may in practice give better results.

I am grateful for any help with this!

Eugenie
 

Mean Joe

TS Contributor
#2
"what is the argument for ever using the chi-square test for mxn tables?"
It's faster, and its p-value often differs little from the exact one (a difference on the order of 0.02 or less).
There are times when the results do differ quite a lot. When any expected cell frequency is <5, I'd consider an exact test.

I usually go with the chi-square at first, to see what kind of results to expect, then consider using an exact test.
Also be aware that the increase in computing power that made the exact test feasible and appealing has also brought more data and more variables, and therefore more tables to test, so the faster, less exact test wasn't exactly killed by technology.
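
To make the comparison concrete, here is a minimal sketch for a single 2x2 table, written in Python with only the standard library rather than R. The function names and the counts are made up for illustration, and the chi-square version applies no continuity correction (R's chisq.test applies one by default for 2x2 tables, so its p-value will differ). It only illustrates the "expected count below 5" rule of thumb and is not a general mxn implementation.

```python
from math import comb, erfc, sqrt

def expected_counts(table):
    """Expected cell counts under independence for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    return [[r * k / n for k in col_totals] for r in row_totals]

def chi_square_p(table):
    """Pearson chi-square p-value for a 2x2 table (1 df, no continuity correction)."""
    expected = expected_counts(table)
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    # For 1 degree of freedom, P(chi2 > stat) = erfc(sqrt(stat / 2)).
    return erfc(sqrt(stat / 2))

def fisher_exact_p(table):
    """Two-sided Fisher exact p-value for a 2x2 table."""
    (a, b), (c, d) = table
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Sum over all tables that are no more probable than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))

small = [[3, 1], [1, 3]]  # made-up counts; every expected count is 2
print(min(min(row) for row in expected_counts(small)))
print(round(chi_square_p(small), 4), round(fisher_exact_p(small), 4))
```

For this small table every expected count is 2.0, and the two p-values come out at 0.1573 (chi-square) and 0.4857 (exact), which is the kind of disagreement that would push me to the exact test.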
 
#3
Thanks, that is very helpful. Could you explain about using Chi-squared then using exact afterwards? I had the impression from
the piecemeal reading I have done that it isn't good practice to use two different significance tests, as it increases the risk of
Type I error. My understanding is not at all subtle about this, though, so a good example like this would be very helpful.
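
For reference, the textbook version of this concern can be sketched as follows (Python, standard library only; the function names are made up for illustration). It assumes the k tests are independent, which is not the case for a chi-square test and an exact test run on the same table, so for that pair it overstates the inflation. The worry there is more about picking whichever p-value looks better.

```python
def familywise_error_rate(alpha, k):
    """Chance of at least one false positive in k independent tests at level alpha."""
    return 1 - (1 - alpha) ** k

def bonferroni_level(alpha, k):
    """Per-test level that keeps the familywise rate at or below alpha."""
    return alpha / k

for k in (1, 2, 10, 100):
    print(k, round(familywise_error_rate(0.05, k), 4), bonferroni_level(0.05, k))
```

With two independent tests at the 0.05 level the familywise rate is 0.0975, and with ten it is about 0.40.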
 

Mean Joe

TS Contributor
#6
"Could you explain about using Chi-squared then using exact afterwards? I had the impression from
the piecemeal reading I have done that it isn't good practice to use two different significance tests."
I didn't mean to use chi-square to get some results and then Fisher to get other results.

I get the chance to analyze data, which can have hundreds of variables. To understand the data, I'll look at means, frequencies, etc.
It often happens that there are about 10 variables that I want to take another look at, in tandem. With software nowadays, it is just as fast to do a chi-square test while looking at the crosstab frequencies as it is to do a crosstab by itself.
Then I'll talk with the researcher, and (s)he will narrow down to just a couple of tables to test. At that point, having seen the raw data and knowing the category frequencies (and also knowing the chi-square test results, which come almost for free), I may want to do an exact test.
I would not want to do an exact test until I know I want/need it. At that point, I would only use the exact test; anything from the chi-square would be ignored.

Really, if you see the raw data then you have an idea of what the test result should be. In one scenario, you can see that there is not much difference between the categories, and you wouldn't even feel like wasting the time with an exact test. Sometimes a statistical test comes back with significant results, but looking at the raw data (the crosstab frequency) you see there might be a weakness in the test.

Are you going to teach both chi-square and Fisher? Or just the best one?
 
#7
I am planning to teach both, since certainly they will see chi-squared referred to, and I want them to know what it does and how it works and what its limitations are. In addition, I have to confess, I haven't yet learned what the best exact version of a goodness-of-fit test is for proportions, and I want to talk about that. But I also want to give students a sense of how modern computing has changed best practice. It seemed from what I have been reading that for the sort of situations you meet in most textbooks, you might as well do an exact test rather than chi-square, given its easy implementation in R. Does this get bogged down when you have about 10 variables? I can imagine it might, but I haven't tried it.

It is very interesting to hear about what the data you deal with in practice is like. Do you generally work with a given researcher
through the design, pilot, and full analysis phases, or do they tend to come to you with data? What areas does the research tend
to be in? I am wondering where these hundreds of variables are coming from, and how you can hope to get anything significant
with so much potential for coincidental interactions among them. I don't know anything about this sort of thing, and I am curious how it works.
 
#8
In every statistics class I have ever taken, chi-square is brought up. Even if there are better methods, it's wise to at least discuss it. It might be helpful to distinguish between the chi-square test of independence (a specific test) and the chi-square distribution, which is used in a wide range of statistics from structural equation models to logistic regression. If you are new, the difference can be confusing.
 

Mean Joe

TS Contributor
#9
"It is very interesting to hear about what the data you deal with in practice is like. Do you generally work with a given researcher
through the design, pilot, and full analysis phases, or do they tend to come to you with data? what areas does the research tend
to be in?"
For instance, I work with doctors who treat patients (patients with cancer, or patients with diabetes, or others). I am a "secondary statistician"; the researchers do work with a "primary statistician" through the design, pilot, etc, but when it comes time for analysis then I finally come in myself. So it does happen that multiple statisticians are involved, and they come in at different times during the research.

All the variables come from: treatment (there can be several treatments such as radiation, chemo, several medications, etc.), which can lead to several binary variables; death, cause of death, survival time, etc.; and of course the basics: age, sex, ethnic group, socioeconomic status (which can be measured by several variables). Survival time itself can be split into binary variables (survival_lessthan_1year, survival_greaterthan_10years, etc.). And you can run many tests of binary variables for treatment by survival, until you get a chance to talk with the researcher and know what (s)he wants. And sometimes you find something in your explorations that is news to the researcher, which (s)he may or may not be interested in when you meet.
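
As a rough sketch of how one continuous variable turns into several binary ones and then into crosstabs (Python, standard library only): the patient records, the "chemo" and "survival_years" field names, and the helper functions below are all made up for illustration. The sketch also ignores censoring, which a real analysis of survival times cannot do.

```python
def survival_indicators(years):
    """Split a survival time in years into the binary variables named above."""
    return {
        "survival_lessthan_1year": years < 1,
        "survival_greaterthan_10years": years > 10,
    }

def two_by_two(rows, flag_a, flag_b):
    """Crosstab of two binary fields as [[yes/yes, yes/no], [no/yes, no/no]]."""
    table = [[0, 0], [0, 0]]
    for row in rows:
        table[0 if row[flag_a] else 1][0 if row[flag_b] else 1] += 1
    return table

# Hypothetical records, not real data.
patients = [
    {"chemo": True, "survival_years": 0.5},
    {"chemo": True, "survival_years": 12.0},
    {"chemo": False, "survival_years": 0.8},
    {"chemo": False, "survival_years": 3.0},
    {"chemo": True, "survival_years": 11.0},
]
rows = [{**p, **survival_indicators(p["survival_years"])} for p in patients]
print(two_by_two(rows, "chemo", "survival_greaterthan_10years"))
```

Every treatment indicator crossed with every survival indicator gives another 2x2 table, which is how the number of candidate tests grows so quickly.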

"how you can hope to get anything significant
with so much potential for co-incidental interactions among them"
It takes some experience. You need more than statistical tests to do statistics; you also need to know the area of research. Check your significant statistics with accepted theory. If you find something that can't be supported by accepted theory, then be cautious in reporting it. Talk with others. I guess as a statistician you never work alone.
Also, you can run several tests to see whether significance holds in all of them; if it doesn't, can you understand why? E.g., back to survival times: you would probably start with the whole study population and the continuous variable, then do more tests to see whether some subjects with very long survival times are biasing your results. (Survival times are calculated from diagnosis to last contact. Subjects in lower SES may only visit the doctor once in 5 years, so the last contact date may be 3 years ago and the survival time is censored.) It takes some experience.