Chi-Squared Tests


TS Contributor
For both examples here, we will need to compute a chi-square statistic:

X^2 = summation of [ (O - E)^2 / E ]
O = number of oberved frequencies
E = number of expected frequencies

Typically the chi-square test is used in situations where the data are arranged in a frequency table. The value of the computed chi-square statistic is compared to the critical chi-square statistic, whose value depends on the significance level (alpha) and the number of degrees of freedom, computed by:

(r-1) * (c-1)
r = number of rows in the table
c = number of columns in the table

1. Chi-Square Test of Independence

This is used to test for independence between variables, and is best demonstrated using a contingency table.

(use the attached image at the bottom-left for this example)

A study was conducted in a supermarket to see if attaching a "Do Not Litter" message to the daily specials flyer would have any effect on the rate of littering.

As customers entered the store, they were either handed a flyer without the message (Control) or with the message.

At the end of the day, the store was searched for flyers - their location, either in a trash can, left lying out (litter), or not located (removed from the store) along with the presence/absence of the message, was noted.

Use the formula: (row total * column total)/grand total to compute expected frequencies for each cell in the 2 * 3 table

Ho: Instructions and Location are independent
Ha: Instructions and Location are not independent
Use alpha = .05 for the level of significance

The chi-square statistic is computed as:
X^2 = summation of [ (O - E)^2 / E ]

= (41-61.66)^2/61.66 + ... + (499-478.64)^2/478.64
= 25.79

Since this is a 2x3 table, the degrees of freedom (df) is computed as (2-1)*(3-1) = 1*2 = 2

The critical value (table) of the chi-square statistic at alpha=.05 with 2 df is 5.99

Therefore we reject Ho and conclude that the location in which the flyers were left depended on the instructions.

2. Chi-Square Test of Goodness-of-Fit

This is used to test whether a set of observed frequencies "fits" a predicted or hypothesized distribution.

A local advertising firm wanted to know if any one of the local stations
has a significantly larger proportion of the television viewers of the
evening news broadcasts. The firm randomly selected 1,000 people and asked
them to specify which stations news program they prefer. The responses were
as follows:

TV Station A B C D
Number of Viewers 241 288 263 208

Using a 1% level of significance, test the null hypothesis that these 4 TV
stations have equal shares of the evening news audience.

Test the following hypothesis:
H0: the 4 TV stations have equal shares of the evening news audience.
Ha: ...unequal shares...

If they have equal shares,

the expected Number of Viewers is 1000/4=250

By the chi squared goodness-of-fit test,

Let Oi denote the ith observed number, E denote the expected number.

The test statistic:

Chisq0 = sum[(Oi-Ei)^2/Ei]


At alpha=0.01, the critical value is:
The rejection region is Chisq0 > 11.34

Chisq0 > 11.34

We reject H0 at 0.01 level. The 4 TV stations do not have equal shares of the evening news audience.