Likert analysis: chi-square vs t-test vs Mann-Whitney-Wilcoxon

#1
Hello. Hopefully this doesn’t classify as a repeated post as I’m asking a more specific question.
I have a questionnaire with five Likert questions in it (I thought I had seven, but on reflection, the options for two of these questions are not on a Likert scale).

Of these questions, there are 3 categories:
  • 2 questions look at the reach of the product (1 = heard of it, 2 = seen it used)
  • 1 question asks about confidence in using the product
  • 2 questions ask about COVID (1 = impact on your work, 2 = impact on your development)

When I look at similar research papers, they have used chi-squared to analyse their Likert data. However, Dr Google always points me in the direction of statistical research papers (that are well above my head!) that say to maybe use Mann-Whitney-Wilcoxon or a t-test (or both).

My independent variables will be demographic information, i.e. age vs confidence using the product, years’ experience vs reach of the product, etc.

My questions are (I'm aware of the debate over whether Likert is ordinal or interval, and thus whether you should use parametric or non-parametric tests… but I am also aware that there is no definitive answer to this debate):
1 - do I analyse the questions individually, or group them as “one” if they have a pair? I believe a Likert scale should technically ask "the same" type of question a few times in different ways
2 - based on question one, what test should I use for the Likert analysis?
 

Karabiner

TS Contributor
#2
When I look at similar research papers, they have used chi-squared to analyse their Likert data.
Chi² is useful for categorical variables. If your variables are Likert-type, then they are
on an ordinal scale. Chi² would not make full use of the information from these items.
My questions are (I'm aware of the debate over whether Likert is ordinal or interval,
That debate is around whether a Likert scale (which is a sum scale of 2+ Likert items) is ordinal or interval.
Not so much whether a single Likert item is ordinal or interval.

If you only have bivariate associations to analyse, then non-parametric tests would be quite suitable.

If you want to correlate a Likert item with an interval-scaled variable (like age), or with an ordinal-scaled variable (like level of education), then use Spearman's rank correlation. Likert versus a binary variable (like gender): U test. Likert versus a categorical variable: Kruskal-Wallis H test.
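That menu of tests can be sketched with `scipy.stats`, if the analysis ends up being done in Python. Every number below is invented purely for illustration; this is a sketch of which call fits which pairing, not an analysis of any real data.

```python
from scipy import stats

likert = [4, 2, 5, 3, 3, 4, 1, 5, 2, 4]          # one Likert item, scored 1-5

# Likert item vs an interval/ordinal variable (e.g. age): Spearman's rho
age = [23, 31, 45, 29, 52, 38, 27, 60, 33, 41]
rho, p_rho = stats.spearmanr(age, likert)

# Likert item vs a binary variable (e.g. sex): Mann-Whitney U test
group_m = [4, 2, 5, 3, 3]
group_f = [4, 1, 5, 2, 4]
u, p_u = stats.mannwhitneyu(group_m, group_f, alternative="two-sided")

# Likert item vs a categorical variable with 3+ levels: Kruskal-Wallis H test
role_a, role_b, role_c = [4, 2, 5], [3, 3, 4], [1, 5, 2, 4]
h, p_h = stats.kruskal(role_a, role_b, role_c)

print(rho, p_rho, u, p_u, h, p_h)
```

Each call returns a test statistic and a p value; it is the pile of p values that matters for the multiple-testing discussion further down the thread.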

HTH

Karabiner
 
#3
Hi

Where I’m getting confused is reading a journal and them saying “we used chi-squared”, so I was looking for a one-stop-shop stats test.

What you are saying is that the test I use depends on my… variables.

My demographic (independent) variables are

  • Sex – m/f
  • Age (interval)
  • Current role (categorical)
  • Years in role (categorical)
  • Level of qualification (categorical)

So, if I want to find out if any answers to my Likert questions depend on those, I have to run a separate test on each of the above, per Likert question?

I.e. to find out if the age of the participant affects how confident they are using the product, I use Spearman's rank correlation?

To find out if the age of the participant affects how knowledgeable the participant is, I also use Spearman's rank correlation?

If I want to find out if there’s a link between the level of qualification that someone has and how knowledgeable they are, I use the Kruskal-Wallis H test?
 

Karabiner

TS Contributor
#4
Years in role (categorical)
Level of qualification (categorical)
The first is almost certainly ordinal, the second possibly (you say "level" of qualification, which sounds like
responses can be ordered).
So, if I want to find out if any answers to my Likert questions depend on those, I have to run a separate test on each of the above, per Likert question?
It depends on what you want to find out. If these one-to-one relationships are what you are
interested in, then you can do it as described.

With kind regards

Karabiner
 
#5
The first is almost certainly ordinal, the second possibly (you say "level" of qualification, which sounds like
responses can be ordered).
The answers to this question I have put in groups:
0-1 years
2-5 years
5-10 years, etc.
That is why I suggested it is categorical.

It depends on what you want to find out. If these one-to-one relationships are what you are
interested in, then you can do it as described.
Pretty much, yes. I want to see, say, whether age or number of years’ experience might affect their confidence of use, or their knowledge. So I could report the mode for the Likert response and then drill down further, i.e. most respondents agree that they are confident enough to use the product; however, there is a statistically significant difference when confidence is compared with years’ experience, showing that those who have X (i.e. 2-5 years’ experience) are statistically less confident than all other groups.

Am I on the right lines?

I bet you guys get a little bit frustrated by how students who are completing high-level degrees are required to use statistical analysis when they only have very basic training in it (i.e. t-test, ANOVA, etc.). I feel like I'm trying to do a three-point turn or a reverse parallel park when I've only just learnt how to change gear.
 

Karabiner

TS Contributor
#6
I bet you guys get a little bit frustrated by how students who are completing high-level degrees are required to use statistical analysis when they only have very basic training in it (i.e. t-test, ANOVA, etc.).
Why should we be frustrated? That is not our problem. The questions stay the same.
 

katxt

Well-Known Member
#7
students who are completing high-level degrees are required to use statistical analysis when they only have very basic training in it (i.e. t-test, ANOVA, etc.).
This is what supervisors are for - to train you up and show you how to do things. The frustrating thing for a student is to have a supervisor who won't or can't give good advice.
 

katxt

Well-Known Member
#10
Years in role (categorical) and Level of qualification (categorical) both look ordinal to me or can be made so. I like Karabiner's advice but you will need to be careful with the number of p values you generate and make some allowance in your critical p value to avoid false positives.
 
#11
Years in role (categorical) and Level of qualification (categorical) both look ordinal to me or can be made so. I like Karabiner's advice but you will need to be careful with the number of p values you generate and make some allowance in your critical p value to avoid false positives.
thanks for the reply Katxt.

Of course, you are correct: they ARE both ordinal even though I created the categories, i.e. there is an order to how many years’ experience someone has, even if the bands are mine.

I am confused by your comment that I need to be careful with the number of p values I generate and make some allowance in my critical p value to avoid false positives.

Are you talking, for comparison, ANOVA over t-tests, i.e. you do an ANOVA because doing lots of t-tests inflates the Type I error chance?

I have 5 demographics (independent variables) and 5 Likert questions; therefore, I need to run 25 tests (some with a different stats test). Is this what you are alluding to?
 

katxt

Well-Known Member
#12
I have 5 demographics (independent variables) and 5 Likert questions; therefore, I need to run 25 tests (some with a different stats test). Is this what you are alluding to?
Yes, exactly that. If you use p<0.05 for significance, then even if there is no connection anywhere between anything, in 25 tests you will more than likely get at least one p value less than 0.05 just by chance - a false positive. This has always been a thorny problem for researchers. A common solution is to reduce the cutoff p value, but just how much to do that has never really been agreed on. A conservative method is Bonferroni, which says use 0.05/number of tests. Other methods have been suggested, but they don't really improve all that much on Bonferroni. Perhaps you could use p<0.005, but you should use something small.
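The Bonferroni arithmetic is simple enough to check in a couple of lines, using the thread's own numbers:

```python
# Bonferroni: divide the overall alpha by the number of tests in the family.
alpha = 0.05
n_tests = 25                     # 5 demographics x 5 Likert items
bonferroni_cutoff = alpha / n_tests
print(bonferroni_cutoff)         # 0.002, slightly stricter than the 0.005 suggested above
```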
 
#13
Yes, exactly that. If you use p<0.05 for significance, then even if there is no connection anywhere between anything, in 25 tests you will more than likely get at least one p value less than 0.05 just by chance - a false positive. This has always been a thorny problem for researchers. A common solution is to reduce the cutoff p value, but just how much to do that has never really been agreed on. A conservative method is Bonferroni, which says use 0.05/number of tests. Other methods have been suggested, but they don't really improve all that much on Bonferroni. Perhaps you could use p<0.005, but you should use something small.
I know this is all about stats n math n stuff.... but reducing to 0.005 means that there’s a 99.5% chance any significant result did not occur by chance, yes?

Seems a small margin!

When you say, Bonferroni, do you mean Bonferroni correction? Do you run this as a post hoc?
 
#14
Further, if I’m using age, for example, I’m looking at whether age influences one of my 5 Likert questions,

then I’m looking at whether level of experience influences my 5 Likert questions
etc., etc., for my 5 demographics / independent variables.

Because of this, both age and level of experience will be compared to one of the Likert questions..... therefore, the chance that one of my 5 demographics would throw a false positive (Type I) for one of the Likert questions would increase?

hence the correction needed?

Now, if I am comparing 5 demographics with my 5 Likert questions, is it divided by 25.... or is it just divided by the number of questions that each demographic is compared against (i.e. 5)?
Ie
age compared to likert question 1
age compared to likert question 2
age compared to likert question 3
age compared to likert question 4
age compared to likert question 5

or, keep going, so
experience compared to likert question 1
experience compared to likert question 2
experience compared to likert question 3
experience compared to likert question 4
experience compared to likert question 5
etc., until you have the 25 combinations

Also, sorry for the basic question:
The independent variable would be age - this is the thing that changes the dependent, i.e. age changes the responses to the Likert questions.
The dependent variables would be the Likert responses.
 

katxt

Well-Known Member
#15
Now, if I am comparing 5 demographics with my 5 Likert questions, is it divided by 25.... or is it just divided by the number of questions that each demographic is compared against (i.e. 5)?
It's the total number of p values that counts.
Seems a small margin!
I agree. It is a small margin. That is just a consequence of having so many combinations and p values.
In any research you run the risk of making a false claim of a true connection or difference when one does not exist, just because the data happened, by an unfortunate chance, to produce a low p value. Generally, p = 0.05 gives you a 95% protection against such a false claim for any particular test if in fact there is no difference. It's like Russian roulette with 1 bullet in 20 chambers. You have 1 chance in 20 of shooting yourself in the foot, which is usually considered an acceptable risk. Unfortunately, you are taking 25 shots and are very likely to lose your foot.
A possible way out is to explain the multiple-p problem, then say that results with p<0.005 are established, those with p<0.01 are probable but not yet established, and those with p<0.05 are interesting but without convincing evidence, though worthy of future research.
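The "Russian roulette" arithmetic can be checked directly. A minimal sketch, assuming the 25 tests are independent and every null hypothesis is true:

```python
# Family-wise error rate: P(at least one p < alpha across m independent tests
# when there is genuinely nothing to find).
def familywise_error(alpha, m):
    return 1 - (1 - alpha) ** m

print(familywise_error(0.05, 1))     # one test: the usual 5% risk
print(familywise_error(0.05, 25))    # 25 tests: roughly 0.72 - "very likely to lose your foot"
```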
 
#16
Phew, I might have got it... I’ve mapped them out like this:

[Attachment: Screenshot 2022-06-30 101112.jpg]

Qual level, years’ experience and current age group are all categories that I created, but based on the suggestions above, they are still ordinal data, as there is a logic and order to the options.

Due to the number of tests I will run (25 overall), the chance of a Type I error is inflated, so I need to apply a Bonferroni correction. As suggested above, I'll use 0.005 as the cutoff for significance, but make comments on non-significant results near .005.
 
#17
Furthermore.... I have more questions that I would like to compare; 5 are ranking questions, where I’d like to see if different ages, for example, ranked the options in a different order. Dr Google tells me the Mann-Whitney U test could be used (ordinal, non-parametric data).

Now, these “new” ranking questions, when added to the questions above, mean that for “age” as an independent variable, I want to see if age influences the responses to 13 different questions.

Age is one of 5 demographic topics.

So in total, I’m running 5*13 tests.

The Type I error chance must be off the chart! But how can you correct for that without your corrected significance cutoff being .0000???????????
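For what it's worth, the Bonferroni cutoff for 65 tests is small but nowhere near .0000. A quick check, assuming the 5 × 13 count above:

```python
# Bonferroni cutoff for 5 demographics x 13 questions = 65 tests.
alpha = 0.05
n_tests = 5 * 13
cutoff = alpha / n_tests
print(round(cutoff, 5))    # 0.00077 - tiny, but not zero
```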
 

katxt

Well-Known Member
#18
I'm afraid that you have problems. Basically you are trying to discover too much all at once.
If you google multiple p corrections you will find quite a few ways folk have devised to improve on Bonferroni, but realistically they won't help much in your situation.
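One of those improvements is the Holm step-down procedure, which spends the alpha budget a little less wastefully than plain Bonferroni while still controlling the family-wise error rate. A pure-Python sketch; the p values at the bottom are invented for illustration:

```python
def holm_reject(p_values, alpha=0.05):
    """Holm step-down: compare the sorted p values against alpha/m,
    alpha/(m-1), ... and stop at the first failure. Returns rejection
    flags in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break                     # every later (larger) p also fails
    return reject

# Plain Bonferroni with 3 tests uses 0.05/3 ~ 0.0167 for everything and would
# reject only the first of these; Holm also catches the second.
print(holm_reject([0.001, 0.02, 0.2]))   # [True, True, False]
```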
If you are still in the planning stage, and haven't collected your data yet, you might think about redirecting your efforts into just a few key questions.
 

Karabiner

TS Contributor
#19
So if the sample size is large enough, then sex, age, current role, years in role, and level of qualification could jointly
be used to predict the other variables, using multiple linear regression, or logistic regression (for yes/no
variables). That would reduce the number of analyses.
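A minimal sketch of that joint-model idea using plain `numpy` least squares; the sample size, coding, and effect sizes below are all invented, and treating a 1-5 item as interval scaled is exactly the debate from earlier in the thread:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 40
age = rng.integers(20, 65, n)            # years
years_band = rng.integers(0, 4, n)       # experience bands coded 0-3
sex = rng.integers(0, 2, n)              # 0/1 dummy
# Fake a 1-5 Likert item that depends on age and experience.
likert = np.clip(np.round(2 + 0.03 * age + 0.4 * years_band
                          + rng.normal(0, 0.7, n)), 1, 5)

# One joint fit instead of several separate bivariate tests.
X = np.column_stack([np.ones(n), age, years_band, sex])
coef, *_ = np.linalg.lstsq(X, likert, rcond=None)
print(coef)   # intercept, age slope, experience slope, sex slope
```

In practice a routine that also reports standard errors and p values (ordinal or logistic regression, say) would be the more defensible choice for Likert outcomes; this only shows the shape of a single joint model.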

With kind regards

Karabiner
 

katxt

Well-Known Member
#20
That would reduce the number of analyses.
There are strange problems with multiple p's that I have never really got my head around. Say we do a 2-way ANOVA with interactions. This produces three p values. Should we do an adjustment or not? We never seem to, but perhaps we should. This isn't quite like multiple regression, but you get the same number of slope p values whether you do the regressions singly or all at once (more or less), so should you adjust your critical p with multiple regression? I just don't know.
I guess in the end it comes down to how much risk of a false positive you are prepared to take. And that will depend on the consequences to you or others if you claim something to be true that isn't.