# Correlation between dichotomous and ordinal variables

#### ldwg

##### New Member
Hi everyone,

I would be very grateful for your feedback on what is possibly a simple question: I want to calculate the correlation between a dichotomous independent variable and an ordinal dependent variable. Which coefficient fits my needs?

Here is an example of my items:

Code:
1) Do you work in a team
[ ] yes     [ ] no

2) How satisfied are you with your job?
very unsatisfied  [ ]  [ ]  [ ]  [ ]  [ ]  [ ]  [ ]  very satisfied

I assume the first variable is dichotomous/nominal and the second one is ordinal, right? Tied ranks will occur, right? Is Somas' d a good option? Preferably, I'll work with SPSS.

ldwg

#### ldwg

##### New Member
Let me rephrase my question: What coefficient is appropriate to measure an association between a dichotomous and an ordinal variable?
Anyone? Thanks very much. Ldwg

#### ldwg

##### New Member
Hi all,

I still would be thankful for any advice.
How about point-biserial correlation? Would that be an option? Or is Kendall's tau-b an appropriate option, even though one item is dichotomous?

Ldwg

#### ldwg

##### New Member
How about the Mann–Whitney U test? Might that test be applicable?
Still thankful for feedback
Ldwg

Reference: Howell, D. C. (2006). Statistical Methods for Psychology (7th ed.). Wadsworth Publishing.

#### CB

##### Super Moderator
I assume the first variable is dichotomous/nominal and the second one is ordinal, right?
This would be the conventional way to describe these variables, yes.

Is Somas' d a good option?
Do you mean Somers' d? I don't know much about this coefficient, but I believe it's used when both variables are ordinal.

How about point-biserial correlation? Would that be an option? Or is Kendall's tau-b an appropriate option, even though one item is dichotomous?
A point-biserial correlation is used when one variable is continuous and the other is dichotomous; Kendall's tau when both are ordinal. Neither is particularly well-suited to the problem.
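To make the distinction concrete, here is a minimal sketch of a point-biserial correlation, which is simply Pearson's r computed with the dichotomous variable coded 0/1. The data and the names `team` and `satisfaction` are made up for illustration. Note that the calculation treats the 7-point scale as if its steps were equally spaced, which is the reason it is not an ideal fit for an ordinal item.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical responses: team membership coded 1 = yes, 0 = no,
# and job satisfaction on the 1..7 scale treated as numeric scores.
team = [1, 1, 1, 1, 1, 0, 0, 0, 0]
satisfaction = [5, 6, 7, 4, 6, 3, 4, 2, 5]

# Point-biserial correlation = Pearson r with a 0/1 coded variable.
r_pb = pearson_r(team, satisfaction)
print(round(r_pb, 3))
```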

How about the Mann–Whitney U test? Might that test be applicable?
This would be the most conventional way to go, I think. Note that you'd be rephrasing your question as being about differences between the groups formed by the nominal variable as opposed to being about a correlation.
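As a rough illustration of that reframing, the sketch below computes the Mann–Whitney U statistic by hand for two groups of ordinal scores, using average ranks for ties (which will be common on a 7-point scale). The data and function names are hypothetical, and no p-value is computed; in practice a statistics package (for example `scipy.stats.mannwhitneyu` in Python, or the nonparametric tests in SPSS) would be used for the test itself. The rank-biserial value at the end is one common way to turn U back into an association-style effect size.

```python
def midranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(group1, group2):
    """Return (U1, U2) computed from the rank sum of group1."""
    n1, n2 = len(group1), len(group2)
    ranks = midranks(list(group1) + list(group2))
    rank_sum_1 = sum(ranks[:n1])
    u1 = rank_sum_1 - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    return u1, u2

# Hypothetical satisfaction scores (1..7) split by the yes/no item.
in_team = [5, 6, 7, 4, 6]
not_in_team = [3, 4, 2, 5]

u1, u2 = mann_whitney_u(in_team, not_in_team)

# Rank-biserial correlation: share of pairs where the team member scores
# higher minus share where they score lower (ties count half each way).
rank_biserial = 2 * u1 / (len(in_team) * len(not_in_team)) - 1
print(u1, u2, rank_biserial)
```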

#### spunky

##### Doesn't actually exist
i'd do a polychoric correlation here... i know, i know, the definition says it's for when both are polytomous ordinal variables, but the polychoric correlation works for any n x m table, including the special case of a 2 x 2 table (the polychoric infinite series looks the same as the tetrachoric series in that case)... in yours you have a... what? 2 (yes/no) x 7 (very unsatisfied...very satisfied) table so... i dunno, my 2 cents would be to use a polychoric correlation.

#### ldwg

##### New Member
Thanks for all the feedback. My ordinal data is far from normally distributed. As far as I have read, Somers' d seems quite a good choice. What do you say?

#### spunky

##### Doesn't actually exist
As far as I have read, Somers' d seems quite a good choice.
Uhm... I don't think so. please, allow me to quote Somers himself (the guy who invented it) in the paper where he developed this statistic (citation would be: Somers, R. H. (1962). A new asymmetric measure of association for ordinal variables. American Sociological Review, 27, 799-811).

with regard to his Somers' d, he said: "It should be emphasized that the remarks contained in this paper relate specifically to ordered contingency tables, meaning that both variables have a natural ordering (i.e., are ordinal variables) like "status," "education," "degree of agreement," etc. They do not apply in situations where one or both variables are nominal, i.e., without a natural ordering."

sadly you fall into the first situation he mentions (one variable is nominal), since your question about teams cannot be meaningfully ordered...

#### CB

##### Super Moderator
My ordinal data is far from normally distributed.
It's difficult to imagine a situation in which ordinal data would be normally distributed. Luckily, normally distributed data isn't an assumption of any of the tests mentioned in this thread.