# Chi-Squared Tests

#### JohnM

##### TS Contributor
For both examples here, we will need to compute a chi-square statistic:

X^2 = summation of [ (O - E)^2 / E ]
where:
O = observed frequency in a cell
E = expected frequency in that cell if the null hypothesis is true
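
As a minimal sketch of the formula above, the statistic can be computed in Python by pairing each observed count with its expected count. The function name `chi_square_statistic` and the sample counts are made up for illustration and are not taken from the examples in this post.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over paired observed/expected counts."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Illustrative counts only (hypothetical, not from the examples in this post)
print(chi_square_statistic([18, 22, 20], [20, 20, 20]))
```

For a contingency table, flatten the cells into two lists in the same order before calling the function.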

Typically the chi-square test is used in situations where the data are arranged in a frequency table. The computed chi-square statistic is compared to the critical chi-square value, which depends on the significance level (alpha) and the number of degrees of freedom. For a contingency table, the degrees of freedom are:

(r-1) * (c-1)
where:
r = number of rows in the table
c = number of columns in the table

For a goodness-of-fit test on a single row of categories, the degrees of freedom are the number of categories minus 1.
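
Comparing the statistic to a critical value is equivalent to comparing the upper-tail probability (p-value) to alpha. The sketch below is one way to get that probability without a printed table, using only the standard library; it assumes the degrees of freedom are a positive integer and relies on the closed-form expressions for the chi-square tail. The name `chi2_sf` is a hypothetical helper, not something from the original post.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df = 2m: exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df = 2m+1:
    # erfc(sqrt(x/2)) + exp(-x/2) * sum_{k=1}^{m} (x/2)^(k-1/2) / Gamma(k+1/2)
    total = 0.0
    term = math.sqrt(half) / math.gamma(1.5)  # k = 1 term
    for k in range(1, (df - 1) // 2 + 1):
        total += term
        term *= half / (k + 0.5)  # step from the k-th term to the (k+1)-th
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


# The tabled critical values used in this post should give tail areas close to alpha
print(chi2_sf(5.99, 2))   # close to 0.05
print(chi2_sf(11.34, 3))  # close to 0.01
```

Rejecting when `chi2_sf(statistic, df) < alpha` gives the same decision as rejecting when the statistic exceeds the critical value.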

------------------------------------------------
1. Chi-Square Test of Independence

This is used to test for independence between variables, and is best demonstrated using a contingency table.

(use the attached image at the bottom-left for this example)

A study was conducted in a supermarket to see if attaching a "Do Not Litter" message to the daily specials flyer would have any effect on the rate of littering.

As customers entered the store, they were handed a flyer either without the message (Control) or with the message.

At the end of the day, the store was searched for flyers. For each flyer, two things were noted: its location (in a trash can, left lying out as litter, or not located, meaning it was removed from the store) and the presence or absence of the message.

Use the formula (row total * column total)/grand total to compute the expected frequency for each cell in the 2 x 3 table.
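
The expected-frequency formula can be sketched in Python as follows. The observed counts for the flyer study are in the attached image and are not reproduced here, so the table below is hypothetical and only illustrates the calculation; `expected_frequencies` is likewise an illustrative name.

```python
def expected_frequencies(table):
    """Expected cell counts under independence: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical 2 x 3 counts (NOT the flyer data from the attached image)
hypothetical = [
    [30, 20, 50],
    [10, 40, 50],
]
for row in expected_frequencies(hypothetical):
    print(row)
```

Each row of the result lines up with the same row of the observed table, so the two can be compared cell by cell in the chi-square sum.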

Ho: Instructions and Location are independent
Ha: Instructions and Location are not independent
Use alpha = .05 for the level of significance

The chi-square statistic is computed as:
X^2 = summation of [ (O - E)^2 / E ]

= (41-61.66)^2/61.66 + ... + (499-478.64)^2/478.64
= 25.79

Since this is a 2x3 table, the degrees of freedom (df) is computed as (2-1)*(3-1) = 1*2 = 2

The critical value (table) of the chi-square statistic at alpha=.05 with 2 df is 5.99

Since 25.79 > 5.99, we reject Ho and conclude that the location in which the flyers were left depended on the instructions.

------------------------------------------------
2. Chi-Square Test of Goodness-of-Fit

This is used to test whether a set of observed frequencies "fits" a predicted or hypothesized distribution.

A local advertising firm wanted to know if any one of the local stations
has a significantly larger proportion of the television viewers of the
evening news. A sample of 1000 viewers was surveyed, asking
them to specify which station's news program they prefer. The responses were
as follows:

| TV Station | A | B | C | D |
|---|---|---|---|---|
| Number of Viewers | 241 | 288 | 263 | 208 |

Using a 1% level of significance, test the null hypothesis that these 4 TV
stations have equal shares of the evening news audience.

Test the following hypotheses:
H0: the 4 TV stations have equal shares of the evening news audience.
Ha: ...unequal shares...

If they have equal shares, the expected number of viewers for each station is 1000/4 = 250.

By the chi squared goodness-of-fit test,

Let Oi denote the ith observed number and Ei denote the ith expected number.

The test statistic:

Chisq0 = sum[(Oi-Ei)^2/Ei]
=(241-250)^2/250+(288-250)^2/250+(263-250)^2/250+(208-250)^2/250
=0.324+5.776+0.676+7.056
=13.83 (rounded)

df=4-1=3

At alpha=0.01, the critical value is:
Chisq(3,0.99)=11.34
The rejection region is Chisq0 > 11.34

Since Chisq0 exceeds 11.34, it falls in the rejection region.

We reject H0 at the 0.01 level and conclude that the 4 TV stations do not have equal shares of the evening news audience.
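
The goodness-of-fit calculation above can be checked with a few lines of Python. This is a sketch using the viewer counts and the tabled critical value quoted in this post; the variable names are illustrative.

```python
# Observed viewers for stations A, B, C, D (from the table above)
observed = [241, 288, 263, 208]

n = sum(observed)                               # total sample size
expected = [n / len(observed)] * len(observed)  # equal shares under H0

statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

critical_value = 11.34  # tabled chi-square value for alpha = 0.01, df = 3
reject_h0 = statistic > critical_value

print(round(statistic, 3), df, reject_h0)  # 13.832 3 True
```

Computing the sum without rounding each term gives 13.832, which leads to the same decision.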