How to compare the ranking of six items to a correct ranking for each participant.

#1
Hi, I'm new here and have googled quite a bit before asking this.

So I have participants who had to give a rank to six items (Situations).

For example:

                Situation1  Situation2  Situation3  Situation4  Situation5  Situation6
Participant 1:  6           4           5           3           1           2
Participant 2:  5           6           3           4           2           1
Participant 3:  1           6           4           5           3           2

There is one CORRECT ranking, a perfect solution. Which is:
6 4 5 3 2 1

Now I want to express in a single number how well a participant has ranked compared to the perfect solution.

Participant 1 should have a high value, because he/she is only off once.
Participant 2 should have a lower value, because he/she is off multiple times.
Participant 3 should have a high value, because the order is correct, only shifted one position to the right.

Why? I want to correlate this number with another variable I measured for each participant, to assess whether certain traits make a participant good at ordering those situations.

I do know that Kendall's tau would be the thing to use for such a problem. But that compares columns, and I have the data in rows. And over 400 participants...

Thank you very much for your help. I use SPSS and Jamovi for stats.
 

gianmarco

TS Contributor
#2
Just a quick suggestion; maybe someone else can provide sounder help.

To come up with a value that expresses how close each participant is to the "correct" ranking, you might calculate the absolute difference between the rank assigned by the participant and the corresponding "correct" rank for each item. When you add up those six differences for a participant, you get a "score"; the smaller the score, the closer that participant's ranking is to the "correct" one.

In your example, P1 would have a score of 2; P2 would have 6; P3 would have 12.
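The calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `abs_diff_score` and the dictionary layout are made up for the example, and the data are the three example rows and the reference ranking from post #1.

```python
# Sum of absolute rank differences between a participant and the reference.
# Function name and data layout are illustrative assumptions.

CORRECT = [6, 4, 5, 3, 2, 1]

participants = {
    "Participant 1": [6, 4, 5, 3, 1, 2],
    "Participant 2": [5, 6, 3, 4, 2, 1],
    "Participant 3": [1, 6, 4, 5, 3, 2],
}


def abs_diff_score(ranks, reference):
    """Return the sum of absolute differences between two rankings."""
    if len(ranks) != len(reference):
        raise ValueError("rankings must have the same length")
    return sum(abs(r - c) for r, c in zip(ranks, reference))


scores = {name: abs_diff_score(r, CORRECT) for name, r in participants.items()}
for name, score in scores.items():
    print(name, score)
```

For the three example rows this prints 2, 6 and 12. A score of 0 means the participant reproduced the reference ranking exactly.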

best
Gm
 

Karabiner

TS Contributor
#3
Spearman correlation coefficients for these three rows with the reference row are 0.943, 0.714 and -0.029, respectively.
"I do know that Kendall's tau would be the thing to use for such a problem. But that compares columns, and I have the data in rows. And over 400 participants..."
Is this really a problem? You rearrange the data from rows into columns (e.g. in SPSS by using CASETOVARS),
and then you perform the correlations (in SPSS CORRELATIONS reference_var WITH subject_1 subject_2 [..] subject_400).
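As a cross-check outside SPSS, the row-wise coefficients can also be computed directly, without rearranging anything. The following is a minimal Python sketch, assuming every row is a tie-free ranking of the same six items; the function names are hypothetical, and the Kendall variant shown is tau-a (no tie correction).

```python
from itertools import combinations

CORRECT = [6, 4, 5, 3, 2, 1]
rows = [
    [6, 4, 5, 3, 1, 2],  # Participant 1
    [5, 6, 3, 4, 2, 1],  # Participant 2
    [1, 6, 4, 5, 3, 2],  # Participant 3
]


def spearman_rho(ranks, reference):
    """Spearman's rho for two tie-free rankings of the same items."""
    n = len(ranks)
    d2 = sum((r - c) ** 2 for r, c in zip(ranks, reference))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


def kendall_tau(ranks, reference):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(ranks)
    s = 0
    for i, j in combinations(range(n), 2):
        product = (ranks[i] - ranks[j]) * (reference[i] - reference[j])
        s += (product > 0) - (product < 0)  # +1 concordant, -1 discordant
    return s / (n * (n - 1) / 2)


rhos = [round(spearman_rho(r, CORRECT), 3) for r in rows]
taus = [round(kendall_tau(r, CORRECT), 3) for r in rows]
print(rhos)
print(taus)
```

For the three example rows this gives Spearman values of 0.943, 0.714 and -0.029, matching the figures quoted above, and Kendall values of 0.867, 0.6 and 0.067. With tied ranks the shortcut Spearman formula used here is no longer exact, so a tie-aware routine would be needed in that case.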

With kind regards

Karabiner
 
#4
Thank you very much for your answers! I highly appreciate it!


As for the first suggestion, I wouldn't know how I could justify this scientifically. Using a tested method would be preferred.

As for the CASETOVARS suggestion.
Hmm... I'm not sure yet if that will send me down a rabbit hole...

My data are very complicated. In fact I have around 150 items, which are then computed into lots of other variables, and I end up with almost 400 variables and 430 valid cases.

Also, there is a twist I didn't mention before, because I didn't want to make it too complicated... At the moment I do not have a ranking of these six items; instead, participants assigned a value from 0 to 100 to each, and I need to turn these values into ranks. That would be easier after rearrangement too, I think.

But to be honest, I have never rearranged data like this, and I believe I would have to delete all variables not used for the ranking in order to make SPSS do what I want. That again would break my data flow. I couldn't go back to what I had, except by some awkward fusion of the earlier data with what I do afterwards... It all seems very cumbersome to me, or am I missing something?
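Turning one participant's six 0 to 100 values into ranks can be done row by row, so no rearrangement is strictly needed for that step. The Python sketch below is only an illustration: the function name `to_ranks` and the ratings are made up, it assumes that a higher value should receive a higher rank (6 = highest), and it gives tied values their average rank.

```python
def to_ranks(values):
    """Rank values in ascending order (smallest = 1); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = average_rank
        pos = end + 1
    return ranks


# Hypothetical ratings for one participant (invented for illustration).
ratings = [95, 60, 80, 40, 10, 10]
ranks = to_ranks(ratings)
print(ranks)  # [6.0, 4.0, 5.0, 3.0, 1.5, 1.5]
```

Because values on a 0 to 100 scale can easily coincide, ties like the two 10s here are likely in real data, and whichever agreement measure is used afterwards should be one that handles tied ranks.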
 

Karabiner

TS Contributor
#5
You have a dataset with many variables. You will possibly end up with two distinct sets,
and you could then add the correlation coefficients from set 2 to the individual
data from set 1. Maybe I am missing something, but I cannot see why this should be an
uncommonly big problem.
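The idea of keeping two sets and adding the coefficients back can be pictured as a lookup by participant ID. The Python sketch below is purely illustrative: the variable names `id`, `trait` and `agreement`, and the trait values, are invented, and only two participants are shown.

```python
# Set 1: the full dataset, one record per participant
# (hypothetical variable names and values).
main_data = [
    {"id": 1, "trait": 3.2},
    {"id": 2, "trait": 4.1},
]

# Set 2: one agreement coefficient per participant, computed separately.
agreement = {1: 0.943, 2: 0.714}

# Attach the coefficient to each record by matching on the participant ID.
# Set 1 itself is left unchanged, so the original data flow stays intact.
merged = [{**row, "agreement": agreement.get(row["id"])} for row in main_data]
print(merged)
```

The key point is that the original file never has to be cut down or overwritten; the second set only needs to carry the participant ID and the new coefficient.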

With kind regards

Karabiner
 

gianmarco

TS Contributor
#6
"As for the first suggestion, I wouldn't know how I could justify this scientifically. Using a tested method would be preferred."
Not to be defensive, but calculating absolute differences is not that unknown in data analysis. The mean (and median) absolute deviation is a robust "substitute" for the standard deviation. Also, Manhattan Distance rests on pair-wise absolute differences.

So, if your goal is to come up with a single descriptive value that can "measure" how much each participant differs from the "ideal" ranking, I would not discard what I suggested. Among other things, that "measure" has an intuitive meaning (at least for me).

Actually, what I suggested is the Manhattan distance between each participant's ranking and the "correct" ranking.
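Applied to two rankings, this distance is also known as Spearman's footrule. Since the original question asked for a value that is high when a participant is close to the correct ranking, one option is to rescale it. The Python sketch below is an assumption-laden illustration, not an established scoring rule from this thread: the function name is hypothetical, and it relies on the fact that for tie-free rankings of n items the largest possible footrule distance is floor(n²/2), which is 18 for six items.

```python
CORRECT = [6, 4, 5, 3, 2, 1]


def footrule_similarity(ranks, reference):
    """Rescale the Manhattan (footrule) distance to 1 = identical, 0 = maximally distant.

    Assumes both inputs are tie-free rankings of the same n items.
    """
    n = len(ranks)
    distance = sum(abs(r - c) for r, c in zip(ranks, reference))
    max_distance = (n * n) // 2  # reached, for example, by the fully reversed ranking
    return 1 - distance / max_distance


examples = {
    "P1": [6, 4, 5, 3, 1, 2],
    "P2": [5, 6, 3, 4, 2, 1],
    "P3": [1, 6, 4, 5, 3, 2],
}
similarity = {k: round(footrule_similarity(v, CORRECT), 3) for k, v in examples.items()}
print(similarity)  # {'P1': 0.889, 'P2': 0.667, 'P3': 0.333}
```

The rescaling does not change the order of participants, so any rank-based correlation with another variable keeps the same magnitude and only changes sign compared with using the raw distance.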

Over to you
Best
 