Correlation with nominal variables

Newbe

New Member
Dear all,

I have two data sets and I would like to measure correlation between variables.

In the first data set I have three classes: c1, c2, c3. The classes represent the difficulty level of an exam, and I converted them into numbers: 1 (easiest), 2 (middle), and 3 (hardest).

In the second data set, the classes are nominal: they are types of exams that are not necessarily easier or harder (I could assume an order, but I am not sure about it). I used a similar numeric encoding as above, but I am not sure whether I am biasing the data this way.

My question is what type of correlation can I use in both cases with the exam results of students? I've seen some answers on the web but I was confused. I tried Pearson correlation in both cases and I got some results that confirmed my initial hypothesis but I am not sure if this is just an illusion.

Thanks very much for your help as I have limited experience with stats.

staassis

Member
You can test for potential association between Class Type and Exam Type using the chi-square test for independence. If there is an association, you can estimate its strength using Cramér's V and Goodman & Kruskal's lambda. This is one of the most statistically efficient approaches, even though it involves ignoring the ordinal nature of Class Type.
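As a rough illustration, here is a minimal Python sketch of that workflow. It assumes NumPy and SciPy are available, and the contingency table is made-up data (rows as difficulty levels, columns as exam types), not anything from your data sets. The helper names `cramers_v` and `gk_lambda` are just for this example.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical counts (made up for illustration):
# rows = difficulty level (1, 2, 3), columns = exam type (A, B, C)
table = np.array([
    [20, 15, 5],
    [10, 20, 10],
    [5, 10, 25],
])

# Chi-square test for independence between the row and column variables
chi2, p_value, dof, expected = chi2_contingency(table)

def cramers_v(chi2_stat, table):
    """Cramér's V: sqrt(chi2 / (n * (min(rows, cols) - 1)))."""
    n = table.sum()
    k = min(table.shape) - 1
    return float(np.sqrt(chi2_stat / (n * k)))

def gk_lambda(table):
    """Goodman & Kruskal's lambda for predicting the column variable from the row variable."""
    n = table.sum()
    max_col_total = table.sum(axis=0).max()
    return float((table.max(axis=1).sum() - max_col_total) / (n - max_col_total))

v = cramers_v(chi2, table)
lam = gk_lambda(table)
print(f"chi2 = {chi2:.3f}, dof = {dof}, p = {p_value:.2g}")
print(f"Cramér's V = {v:.3f}, lambda = {lam:.3f}")
```

With your own data you would first cross-tabulate the two categorical variables to get the table of counts. Neither statistic depends on which numbers you assign to the categories, which avoids the encoding worry.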

A more statistically efficient approach would involve programming a randomization test specific to your problem.
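A minimal sketch of what such a randomization (permutation) test could look like in Python, assuming NumPy only. The data are made up, and the chi-square statistic is used here just as a generic placeholder; a version tailored to your problem would swap in a statistic that uses the ordering of the difficulty levels. The function names are hypothetical.

```python
import numpy as np

def chi2_stat(x, y, n_x, n_y):
    """Pearson chi-square statistic from two integer-coded label arrays."""
    table = np.zeros((n_x, n_y))
    np.add.at(table, (x, y), 1)  # cross-tabulate the labels
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / table.sum()
    return float(((table - expected) ** 2 / expected).sum())

def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Shuffle y to break any association and compare with the observed statistic."""
    rng = np.random.default_rng(seed)
    n_x, n_y = int(x.max()) + 1, int(y.max()) + 1
    observed = chi2_stat(x, y, n_x, n_y)
    exceed = 0
    for _ in range(n_perm):
        if chi2_stat(x, rng.permutation(y), n_x, n_y) >= observed:
            exceed += 1
    # Add-one correction so the estimated p-value is never exactly zero
    return observed, (exceed + 1) / (n_perm + 1)

# Hypothetical data: expand a made-up table of counts into one label pair per student
counts = np.array([[20, 15, 5], [10, 20, 10], [5, 10, 25]])
rows, cols = np.indices(counts.shape)
x = np.repeat(rows.ravel(), counts.ravel())  # difficulty level, coded 0..2
y = np.repeat(cols.ravel(), counts.ravel())  # exam type, coded 0..2

observed, p_perm = permutation_p_value(x, y)
print(f"observed statistic = {observed:.3f}, permutation p-value = {p_perm:.4f}")
```

Shuffling one variable keeps both sets of category totals fixed while destroying any link between them, so the permuted statistics show what to expect under independence.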

Newbe

New Member
Thank you for your reply. If I understand correctly, I should use the chi-square test with both data sets, not only with the second one. Thanks again.

staassis

Member
Yes, because you seem to want to explore the relationship between the variables in the two data sets.