# Question about chi-square test and single columns

#### newb95

##### New Member
Hi guys, and apologies if this is already answered somewhere; I couldn't find it.
As far as I understand, a chi-square test is the way to go here.

I have a table with 3 groups and 4 categories for each group (the values shown are made up, just for the example):

| | G1 | G2 | G3 | TOTAL |
|---|---|---|---|---|
| C1 | 10 | 30 | 20 | 60 |
| C2 | 30 | 40 | 50 | 120 |
| C3 | 10 | 5 | 60 | 75 |
| C4 | 40 | 40 | 30 | 110 |

I would like to test, separately within each group, whether there is a significant difference in the distribution among the categories (C1-C4).
What bothers me is whether it is valid to run a one-way test separately on each column and ignore the information about the totals in each category. If not, what would be the most valid way to proceed?

#### gianmarco

##### TS Contributor
I do not understand why you want to do that. Do you know in which context the chi-sq test is used and which question it actually aims to address?

#### hlsmith

##### Less is more. Stay pure. Stay poor.
Please provide more information. Given your description, I am led to think you are trying to conduct single-sample tests of each group against a constant.

#### newb95

##### New Member
Thank you for the input so far. For each group separately, I would like to know whether there is a significant difference in values among the categories.

#### obh

##### Well-Known Member
It seems that you have put 3 independent columns in one table. If you want to test each group separately, how about running a one-way ANOVA for each group?

#### katxt

##### Well-Known Member
I assume that the data are suitable for a chi-square test (count data, independent observations, etc.). One problem is that doing several tests produces multiple p-values, which increases the chance of a false positive. You can do a chi-square test on the 4x3 table as a whole and then, only if that is significant, test each column separately.
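As a rough illustration of that two-step procedure, here is a minimal Python sketch using only the standard library. Several things in it are assumptions rather than part of the thread: the function names (`chi2_sf`, `independence_test`, `gof_uniform`) are made up for this example, the per-column test takes "all four categories equally likely" as its null hypothesis (the original question does not state the expected proportions), and the 0.05 threshold is just the conventional choice. In practice one would more likely reach for `scipy.stats.chi2_contingency` and `scipy.stats.chisquare`; the hand-rolled p-value below only keeps the sketch dependency-free.

```python
from math import erfc, exp, gamma, sqrt

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable, integer df >= 1."""
    h = x / 2.0
    # Closed-form starting point: df = 2 (even case) or df = 1 (odd case).
    if df % 2 == 0:
        a, q = 1.0, exp(-h)
    else:
        a, q = 0.5, erfc(sqrt(h))
    # Step the gamma shape parameter up by 1 until it reaches df / 2.
    while a < df / 2.0:
        q += h ** a * exp(-h) / gamma(a + 1.0)
        a += 1.0
    return q

def independence_test(table):
    """Chi-square test of independence on a list of rows of counts."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_tot) - 1) * (len(col_tot) - 1)
    return stat, df, chi2_sf(stat, df)

def gof_uniform(counts):
    """One-way (goodness-of-fit) test against equal expected counts (assumed null)."""
    expected = sum(counts) / len(counts)
    stat = sum((c - expected) ** 2 / expected for c in counts)
    df = len(counts) - 1
    return stat, df, chi2_sf(stat, df)

# Example counts from the first post (rows C1-C4, columns G1-G3, totals omitted).
table = [[10, 30, 20],
         [30, 40, 50],
         [10, 5, 60],
         [40, 40, 30]]

stat, df, p = independence_test(table)
print(f"overall: chi2={stat:.2f}, df={df}, p={p:.4g}")

# Only look at the individual columns if the overall test is significant.
if p < 0.05:
    for j, name in enumerate(["G1", "G2", "G3"]):
        column = [row[j] for row in table]
        s, d, pj = gof_uniform(column)
        print(f"{name}: chi2={s:.2f}, df={d}, p={pj:.4g}")
```

Note that the per-column tests ignore the row totals entirely, so each one only answers "are the four categories equally frequent within this group?", which is a different question from the one the overall test answers.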

#### obh

##### Well-Known Member
Are the data in the table counts? I assumed they were averages.

#### gianmarco

##### TS Contributor
> I would like to test, separately within each group, whether there is a significant difference in the distribution among the categories (C1-C4).

Apologies if I go back to the very core of your main question, but I do not fully understand why you would want to "modify" the chi-sq test when it already fits your situation.

What I mean is that the chi-sq test is meant to test whether a significant association exists between the levels of two categorical variables, i.e. between your table's row and column categories. So, the chi-sq test (of independence) will address this question, provided that this is the question you actually wish to investigate.
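For reference, the standard quantities behind the test of independence can be written as follows. The symbols are notation chosen for this note, not taken from the thread: $O_{ij}$ is the observed count in row $i$ and column $j$, $R_i$ and $C_j$ are the row and column totals, $N$ is the grand total, and $r$ and $c$ are the numbers of rows and columns.

$$
E_{ij} = \frac{R_i\,C_j}{N}, \qquad
\chi^2 = \sum_{i=1}^{r}\sum_{j=1}^{c} \frac{(O_{ij}-E_{ij})^2}{E_{ij}}, \qquad
\mathrm{df} = (r-1)(c-1)
$$

For the 4x3 example table this gives $\mathrm{df} = (4-1)(3-1) = 6$.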

Also, should the "regular" chi-sq test prove significant, a usual follow-up would be to calculate the standardized residuals for each cell. They will help you locate which cells contribute significantly to the rejection of the null hypothesis of independence, i.e. where there is a significant "connection" (either positive or negative) between the levels of the two categorical variables.

See for example the following small cross-tabulation (taken from the literature) of university faculties vs funding categories:

Code:
              A  B  C  D  E
Geology       3 19 39 14 10
Biochemistry  1  2 13  1 12
Chemistry     6 25 49 21 29
Zoology       3 15 41 35 26
Physics      10 22 47  9 26
Engineering   3 11 25 15 34
Microbiology  1  6 14  5 11
Botany        0 12 34 17 23
Statistics    2  5 11  4  7
Mathematics   2 11 37  8 20
The chi-sq test is significant (p < 0.01). The table of standardized residuals (see attached .jpg, and focus on absolute values larger than 1.96) indicates that there is a negative association between Geology and funding category E (i.e., the observed frequency for Geology in category E is lower than expected under the hypothesis of independence). On the other hand, Zoology has a "positive" connection with funding category D, Physics with A, and Engineering with E.
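Since the attached image may not be visible, the residuals can be recomputed from the table above. The sketch below is only one way to do it, and it rests on assumptions: it uses the adjusted standardized residual, (observed - expected) / sqrt(expected * (1 - row share) * (1 - column share)), which may or may not be the exact definition behind the attached table, and the function name `adjusted_residuals` is made up for this example. It may also flag cells beyond the four mentioned above.

```python
from math import sqrt

departments = ["Geology", "Biochemistry", "Chemistry", "Zoology", "Physics",
               "Engineering", "Microbiology", "Botany", "Statistics", "Mathematics"]
categories = ["A", "B", "C", "D", "E"]
counts = [
    [3, 19, 39, 14, 10],
    [1, 2, 13, 1, 12],
    [6, 25, 49, 21, 29],
    [3, 15, 41, 35, 26],
    [10, 22, 47, 9, 26],
    [3, 11, 25, 15, 34],
    [1, 6, 14, 5, 11],
    [0, 12, 34, 17, 23],
    [2, 5, 11, 4, 7],
    [2, 11, 37, 8, 20],
]

def adjusted_residuals(table):
    """Adjusted standardized residual for every cell of a contingency table."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    result = []
    for i, row in enumerate(table):
        out = []
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            # Variance of (observed - expected) under independence.
            variance = expected * (1 - row_tot[i] / n) * (1 - col_tot[j] / n)
            out.append((observed - expected) / sqrt(variance))
        result.append(out)
    return result

residuals = adjusted_residuals(counts)

# List only the cells whose residual exceeds 1.96 in absolute value.
for dept, row in zip(departments, residuals):
    for cat, r in zip(categories, row):
        if abs(r) > 1.96:
            print(f"{dept:<13}{cat}  {r:+.2f}")
```

A negative value means fewer observations than expected under independence, a positive value means more.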

Hope this helps
Gm
