# Goodness-of-fit test

#### jackson3000

##### New Member
I have a question regarding goodness-of-fit tests. A general rule of thumb with the chi-square goodness-of-fit test is that (1) no more than 20% of your expected frequencies can be less than five, and (2) none of your expected frequencies can be less than one. However, I am dealing with a data set with some small expected frequencies, and I am wondering if anyone can suggest an alternative test.

For example, in one case, I have expected frequencies of 8.2 for group 1 and 0.8 for group 2, and my observed frequencies are 5 for group 1 and 4 for group 2. Running a chi-square test would violate rules 1 and 2 above. Again, I would appreciate any guidance about a possible alternative goodness-of-fit test to use in this instance.
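As a quick check of the two rules of thumb against these numbers, here is a minimal Python sketch. The function names `rule_of_thumb_check` and `pearson_statistic` are illustrative and do not come from any library; the thresholds are the ones stated in the post.

```python
def rule_of_thumb_check(expected):
    """Check the two rules of thumb for the chi-square goodness-of-fit test."""
    share_below_five = sum(e < 5 for e in expected) / len(expected)
    return {
        "share_below_five": share_below_five,
        "rule_1_ok": share_below_five <= 0.20,  # no more than 20% below five
        "rule_2_ok": min(expected) >= 1,         # none below one
    }


def pearson_statistic(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over the categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


expected = [8.2, 0.8]  # group 1, group 2 (values from the post)
observed = [5, 4]

check = rule_of_thumb_check(expected)
statistic = pearson_statistic(observed, expected)
print(check)
print(round(statistic, 2))
```

Both rules fail for these data, and most of the statistic comes from the group whose expected frequency is 0.8, which is the situation the rules of thumb are meant to flag.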

#### mmercker

##### Member
Hi,

In such a situation, it is recommended to consider Fisher's exact test instead of a chi-square test (e.g., Field et al., 2012: "Discovering Statistics Using R"). If you want to read more about why violating these assumptions creates problems for the chi-square test, you will also find some information in Howell, D.C. (2006): "Statistical Methods for Psychology".

#### jackson3000

##### New Member
Hello, mmercker. I appreciate your response. However, I have a follow-up question. I am aware that there is a difference between a chi-square goodness-of-fit test and a chi-square test of independence. As I understand it, Fisher's exact test is used in the same way as the chi-square test of independence. However, I need a goodness-of-fit test that examines the differences between expected frequencies and observed frequencies. Can Fisher's exact test be used to assess goodness of fit?

#### hlsmith

##### Not a robit
What is your underlying purpose in using a test? If your answer is simply "comparing observed to expected," please be specific about why you need this information or what you plan to answer based on it.

#### jackson3000

##### New Member
However, in some instances, I have a small sample size. For instance, there are nine comments in the data set in which players are described as "intelligent." Therefore, my expected frequencies are 6.3 for group 1 (black players) and 2.7 for group 2 (white players). My observed frequencies are 2 for group 1 and 7 for group 2. If I were to run the chi-square goodness-of-fit test with these data, I would be violating the rule that no more than 20% of expected frequencies can be less than five. Is there an alternative to the chi-square goodness-of-fit test to use in this case?

#### hlsmith

##### Not a robit
Just curious what this is for?

You have many limitations here. Players are not identical apart from race, so some truly are going to be more athletic or uglier.

In addition, you do not know the poster's race (I would imagine), which could influence the comments.

Also, what if a poster comments on multiple players? How would you handle those comments?

Also, are some posts part of the same thread? If so, prior posts may influence a poster's responses, so the comments are not independent.

I understand what you are trying to do, but if you plan to do anything with these data or want to interpret them, you absolutely cannot do so on the basis of a goodness-of-fit test alone.

#### jackson3000

##### New Member
This is for a research project. If you can provide your email address, I would be happy to send you the full 10,000 research report. I always welcome feedback on my work. Also, thank you for pointing out these limitations. Now, notwithstanding the limitations mentioned, I would like to go back to my basic question. I have expected frequencies and observed frequencies for two groups. My expected frequencies are 6.3 for group 1 and 2.7 for group 2. My observed frequencies are 2 for group 1 and 7 for group 2. If I were to run the chi-square goodness-of-fit test with these data, I would be violating the rule that no more than 20% of expected frequencies can be less than five. Is there an alternative to the chi-square goodness-of-fit test to use in this case?
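With only two categories, one standard exact option is the exact binomial test, which does not rely on the chi-square approximation. The following is a minimal sketch, not a recommendation for this particular study: it assumes an expected proportion of 0.7 for group 1 (inferred from 6.3 out of 9), and the function names `binomial_pmf` and `exact_binomial_p_value` are illustrative. The two-sided p-value here sums the probabilities of all outcomes that are no more likely than the observed one, which is one common convention.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def exact_binomial_p_value(k, n, p):
    """Two-sided exact p-value: total probability of outcomes no more likely than k."""
    p_observed = binomial_pmf(k, n, p)
    tolerance = 1e-7 * p_observed  # guards against floating-point ties
    total = 0.0
    for j in range(n + 1):
        pmf = binomial_pmf(j, n, p)
        if pmf <= p_observed + tolerance:
            total += pmf
    return total


n_comments = 9          # 2 + 7 observed comments
observed_group_1 = 2    # observed frequency for group 1
expected_share = 0.7    # assumption: 6.3 / 9, the expected proportion for group 1

p_value = exact_binomial_p_value(observed_group_1, n_comments, expected_share)
print(p_value)
```

The test still assumes that the nine comments are independent, so the limitations raised earlier in the thread apply to it as well.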