# Which statistical test do I use?

#### AniBioUSA

##### New Member
I have a survey project and I'm not the best at stats.
I'm comparing the means of answers given based on the continent the responder is from.
So I have North America (7.20), South America (5.40), Europe (5.84), Africa (6.95), Asia (5.91) and Australia (5.90). The mean answer each group gave for a question is in the brackets. The overall mean across all the responders is 6.20. I want to compare this data in RStudio. Do I want to run a one-way ANOVA or a Kruskal-Wallis test? There are a total of 8 questions, but they are all independent of each other, so I'll have to do the same thing 8 times, correct?

#### obh

##### Active Member
Hi Ani,

I'm not sure what you are trying to do.

A one-way ANOVA will only tell you whether, based on the samples, you can reject the H0 that mean(North America) = mean(South America) = ... = mean(Australia).

The Kruskal-Wallis test is the non-parametric counterpart of the one-way ANOVA: it addresses a similar question under different assumptions.
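
To make the two tests concrete, here is a minimal sketch of how each test statistic is computed from raw answers. It is plain Python with made-up data (three small groups instead of six continents), and the function names `anova_f` and `kruskal_h` are illustrative only, not from any library. It computes the statistics but not the P values.

```python
from itertools import chain

def anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(chain.from_iterable(groups)) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of the answers around their own group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

def kruskal_h(groups):
    """Return the tie-corrected Kruskal-Wallis H statistic."""
    pooled = sorted(chain.from_iterable(groups))
    n_total = len(pooled)
    rank_of = {}   # value -> mid-rank (tied values share the average rank)
    tie_term = 0
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        t = j - i                              # number of tied values
        rank_of[pooled[i]] = (i + 1 + j) / 2   # average of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j
    h = 12 / (n_total * (n_total + 1)) * sum(
        sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups
    ) - 3 * (n_total + 1)
    return h / (1 - tie_term / (n_total ** 3 - n_total))

# Made-up answers on a 1-10 scale for three hypothetical groups.
toy_groups = [[7, 8, 6, 9], [5, 4, 6, 5], [6, 7, 5, 6]]
print(anova_f(toy_groups))
print(kruskal_h(toy_groups))
```

The ANOVA works on the answers themselves, while Kruskal-Wallis replaces them with ranks first, which is why it does not rely on the normality assumption.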

Do all 8 questions measure the same thing, or different aspects?

#### Dason

Just to confirm - you have access to the raw data and not just the means right?

#### AniBioUSA

##### New Member
Hello,
I do have the raw data.
What I want to do is compare the means of all the continents for each question they were asked in the survey. I'm just unsure of how I do this.

Thank you.

#### obh

##### Active Member
"There are a total of 8 questions but they are all independent of each other so I'll have to do the same thing 8 times correct?"
Correct, if each question measures something different. Otherwise you can average the questions that measure the same thing and analyze that average.

So you can do the one-way ANOVA if the data meets the assumptions, or otherwise Kruskal-Wallis.
Do you use a Likert scale? How many possible answers are there?

#### AniBioUSA

##### New Member
We have a sample size of 500.
We used a 1-10 scale so 10 possible answers.

How do I check for normality so I know which test to use?

#### obh

##### Active Member
Hi Ani,

The data isn't normally distributed, since it takes discrete values over a limited range (it can't be more than 10 or less than 1).

But since there are 10 possible answers and the sample size is more than 30, I assume it should be okay (central limit theorem).

You don't need to test the data for normality, as the average tends toward a normal distribution.
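
The central limit theorem argument can be illustrated with a small simulation. This is only a sketch: the answer weights below are invented to give a strongly skewed 1-10 distribution, and skewness (0 for a symmetric, normal-like shape) is used as a rough measure of how far from normal a distribution is.

```python
import random
import statistics

def skewness(values):
    """Population skewness: 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

random.seed(1)  # fixed seed so the run is repeatable
scale = list(range(1, 11))
weights = [30, 20, 14, 10, 8, 6, 5, 3, 2, 2]  # made-up, skewed toward low answers

# Individual answers: clearly not normal.
raw = random.choices(scale, weights=weights, k=5000)

# Means of many samples of 30 answers each: much closer to normal.
means = [
    statistics.fmean(random.choices(scale, weights=weights, k=30))
    for _ in range(2000)
]

raw_skew = skewness(raw)
mean_skew = skewness(means)
print("skewness of individual answers:", round(raw_skew, 2))
print("skewness of sample means:      ", round(mean_skew, 2))
```

The individual answers stay skewed no matter how many you collect; it is the distribution of the group averages that becomes approximately normal.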

#### AniBioUSA

##### New Member
So in that case I can just run an ANOVA for each picture right?

#### obh

##### Active Member
I assume the normality assumption should be okay, as the sample size is very big (500).
Another ANOVA assumption you need to know about is equal variances between the groups. (This is more important if the groups' sizes aren't similar.)
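
A quick way to look at this assumption is to compare the group sizes and sample variances side by side. The sketch below uses invented answers for three groups; `variance_summary` is an illustrative helper, not a library function. A large ratio between the biggest and smallest variance, combined with very unequal group sizes, is a warning sign, and in that case a method that does not assume equal variances (for example Welch's ANOVA) may be worth considering.

```python
import statistics

def variance_summary(groups):
    """Return ({name: (n, sample variance)}, largest/smallest variance ratio)."""
    summary = {
        name: (len(values), statistics.variance(values))
        for name, values in groups.items()
    }
    variances = [var for _, var in summary.values()]
    return summary, max(variances) / min(variances)

# Made-up answers on a 1-10 scale; real data would have far more rows.
answers = {
    "Europe": [6, 5, 7, 6, 5, 6, 7, 6],
    "Africa": [9, 3, 10, 2, 8, 7],
    "Asia": [6, 6, 5, 7],
}

summary, ratio = variance_summary(answers)
for name, (n, var) in summary.items():
    print(f"{name}: n = {n}, variance = {var:.2f}")
print(f"largest / smallest variance = {ratio:.1f}")
```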

#### AniBioUSA

##### New Member
Oh right. So ya the group sizes aren't similar at all. What does that change?

#### AniBioUSA

##### New Member
Sorry for the brief absence. You have been so helpful Obh.
I decided to run the ANOVA for each individual question. I've been going over the P-values and they just don't seem right but this project is more complicated than I am used to in terms of number of questions and variables for each question.

Question 1: P=0
Question 2: P=0
Question 3: P= 0.0007
Question 4: P= 0.0823
Question 5: P=0
Question 6: P= 0.8138
Question 7: P= 0.0003
Question 8: P= 0.0029

#### obh

##### Active Member
Hi Ani,

Glad to try to help, it takes time to learn. Is there a connection between the questions? Say, some questions measuring the same thing?
What doesn't seem okay in the results?

#### AniBioUSA

##### New Member
After reviewing them more closely I think they do make sense.
There are no connections between the questions.

So my null hypothesis would be that there is no statistically significant relationship between what continent you are from and how you answer each question.
So questions 4 and 6 were the only ones with p-values >0.05, so they are the only ones where I failed to reject the null hypothesis, correct?
That would seem to be true as those are the only two where the means didn't vary greatly. They are pretty well grouped around the overall mean.
The other questions have large variation in the means from each country compared to the overall mean.

I wish I could show you my data but I can't seem to upload it. I hope my description helps. Again, I can't thank you enough for your incredibly valuable help!

#### obh

##### Active Member
Hi Ani,

Even question 4 is potentially significant.
0.05 is commonly used, but there is actually no big difference between 0.05 and 0.08; it is only a rule of thumb.
It only says that if you decide the difference is significant, the probability of an error is relatively big: 0.08.

But since your sample size is very big (500), and we need to have a cutoff value, I would say not significant. (Even if it were significant, the effect size is probably very small.)
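
The effect size remark can be checked with eta squared, the share of the total variation that is explained by the groups. For a one-way ANOVA it can be recovered from the F value and the two degrees of freedom. The sketch below is an illustration under assumptions: it uses the F value of 1.96 reported for question 4 later in the thread, and it assumes degrees of freedom of 5 and 494 (six continents, all 500 respondents answering), which the thread does not state.

```python
def eta_squared_from_f(f_stat, df_between, df_within):
    """Eta squared = SS_between / SS_total, rewritten in terms of F and the df."""
    return (f_stat * df_between) / (f_stat * df_between + df_within)

# F = 1.96 is taken from the thread; df = (5, 494) is an assumption.
eta_sq = eta_squared_from_f(1.96, 5, 494)
print(round(eta_sq, 3))  # 0.019
```

Under those assumptions, continent would account for roughly 2% of the variation in the answers to question 4.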

"So questions 4 and 6 were the only ones with p-values >0.05 they are the only ones were I failed to reject the null hypothesis correct?"
Yes

You can always attach a zip file.

#### AniBioUSA

##### New Member
I hope that worked. I have never converted to a zip file. I'm doing this whole thing on my brother's tablet because I don't have a computer.
So a friend of mine who is taking the same science class got this from the teacher. It tells us how we need to write this for the project.

Statistical conventions
• Means and standard errors/standard deviations (and medians and interquartile ranges/confidence limits), with their associated sample sizes, are given in the format X ± SE = 10.20 ± 1.01 g, N = 15.
• For statistical tests, give the name of the test followed by a colon, the test statistic and its value, the degrees of freedom or sample size (whichever is the convention for the test) and the P value (note that F values have two degrees of freedom). The different parts of the statistical quotation are separated by a comma.
• If the test statistic is quoted with degrees of freedom, these are reported as a subscript to the test statistic. For example:
ANOVA: $F_{1,11} = 7.89$, P = 0.017
Kruskal-Wallis test: $H_{11} = 287.8$, P = 0.001
Chi-square test: $\chi^2 = 0.19$, P = 0.91
Paired t test: $t_{12} = 1.99$, P = 0.07
• If the test is quoted with the sample size, this should follow the test statistic value. For example:
Spearman rank correlation: $r_S = 0.80$, N = 11, P < 0.01
Wilcoxon signed-ranks test: T = 6, N = 14, P < 0.01
Mann-Whitney U test: U = 74, $N_1 = N_2 = 17$, P < 0.02
• P values for significant outcomes can be quoted as below a threshold significance value (e.g. P < 0.05, 0.01, 0.001), but wherever possible should be quoted as an exact value. Marginally non-significant outcomes can be indicated as exact probability values or as P < 0.1. Non-significant outcomes should be indicated with an exact probability value whenever possible, or as NS or P > 0.05, as appropriate for the test. State whether a test is one tailed or two tailed.
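
As an illustration of these conventions, the sketch below formats an ANOVA result as a single line. The helper `format_anova` and its `floor` parameter are hypothetical, the degrees of freedom (5, 494) are an assumption (six continents, 500 respondents), and plain text cannot show the subscript, so the two degrees of freedom are simply written right after the F. A P value that software prints as 0 has only been rounded to 0, so it is reported as below a threshold instead.

```python
def format_anova(f_stat, df_between, df_within, p, floor=0.0001):
    """Format an ANOVA result as 'ANOVA: Fdf1,df2 = value, P ...'.

    P values smaller than `floor` are reported as 'P < floor', because a
    printed value of 0 only means the value was rounded away.
    """
    p_text = f"P < {floor}" if p < floor else f"P = {p:.4f}"
    # When typeset, the two degrees of freedom go in a subscript after F.
    return f"ANOVA: F{df_between},{df_within} = {f_stat:.2f}, {p_text}"

# F and P values as reported in the thread; df = (5, 494) is an assumption.
print(format_anova(1.96, 5, 494, 0.0823))
print(format_anova(7.34, 5, 494, 0.0))
```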

I'm thinking that means I need to write is as:
Picture 1: F1, 5=7.34, p=0<0.05
Picture 2: F1, 5=8.37, p=0<0.05
Picture 3: F1, 5=4.31, p=0.0007<0.05
Picture 4: F1, 5=1.96, p=0.0823>0.05
Picture 5: F1, 5=6.46, p=0<0.05
Picture 6: F1, 5=2.09, p=0.8138>0.05
Picture 7: F1, 5=4.81, p=0.0003<0.05
Picture 8: F1, 5=3.65, p=0.0029<0.05

5 is the degrees of freedom and the F statistical value?

I need to use the Fisher test to find the exact probability correct?

How do I know if it is one tailed or two tailed?
Can I do this in excel? I really didn't get the hang of Rstudio.

#### obh

##### Active Member
Hi Ani,

"I hope that worked. I have never converted to a zip file. I'm doing this whole thing on my brothers tablet because I don't have a computer."
You probably should buy a laptop/computer, as it is much easier to work on than a tablet ...

"Picture 1: F1, 5=7.34, p=0<0.05"
Usually it is written F(1, 5) = 7.34 ...
1 and 5 are the degrees of freedom.
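
For reference, the two degrees of freedom of a one-way ANOVA come from the number of groups and the total number of answers: groups minus 1 (between groups) and answers minus groups (within groups). The sketch below assumes six continents and 500 respondents who all answered the question; the individual group sizes are made up, since the thread does not give them.

```python
def anova_degrees_of_freedom(group_sizes):
    """Return (df_between, df_within) for a one-way ANOVA."""
    k = len(group_sizes)          # number of groups
    n_total = sum(group_sizes)    # total number of answers
    return k - 1, n_total - k

# Hypothetical group sizes for six continents, adding up to 500.
sizes = [210, 40, 120, 30, 80, 20]
print(anova_degrees_of_freedom(sizes))  # (5, 494)
```

Under those assumptions the result would be quoted as F(5, 494). The degrees of freedom printed in the RStudio ANOVA table are the ones to report.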

"I need to use the Fisher test to find the exact probability correct?"
Why?

"How do I know if it is one tailed or two tailed?"
I suggest you look at http://www.statskingdom.com/180Anova1way.html and let me know your conclusion.
Just run a simple example and look at the F chart and the tails information.

"Can I do this in excel? I really didn't get the hang of Rstudio."
If your teacher asked for RStudio, then I assume you should use it.
It is very easy to install, but you need a computer ...

#### AniBioUSA

##### New Member
I hope to get one when I graduate.
The teacher didn't specify that Rstudio should be used but we learned a little on how to use it in our computers class Junior year.

I used this calculator and it all came up the same as when I ran RStudio before. Thank you for the tip. This site gives more information to help me know what I'm looking at.

They all came out right tailed.
4 and 6 came out as not statistically significant and the others were.

#### obh

##### Active Member
ANOVA uses only the right tail, for the following reason:

"When performing ANOVA test, we try to determine if the difference between the averages reflects a real difference between the groups
or is due to the random noise inside each group.
The F statistic represents the ratio of the variance between the groups and the variance inside the groups"

In a regular test, the statistic may be too big (right tail) or too small (left tail).
In the ANOVA test, when the F statistic is "too small", it just says that the variance between the groups is very small relative to the variance inside the groups.
That is consistent with H0 (the averages are equal), so there is actually no such thing as "too small" / a left tail.
(A similar idea applies in the chi-square test for goodness of fit.)

It is interesting to know (but not surprising) that with only two groups, the two-sample t-test with pooled variance (two-tailed) and the ANOVA test (right-tailed) give the same result.
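
The two-group case can be checked numerically: the square of the pooled-variance t statistic equals the ANOVA F statistic, which is why a two-tailed t-test and a right-tailed F test agree. A minimal sketch with made-up data and illustrative function names:

```python
import math
import statistics

def pooled_t(a, b):
    """Two-sample t statistic using the pooled (equal-variance) estimate."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * statistics.variance(a)
           + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(sp2 * (1 / na + 1 / nb))

def anova_f_two_groups(a, b):
    """One-way ANOVA F statistic for exactly two groups."""
    grand = statistics.fmean(a + b)
    ss_between = sum(len(g) * (statistics.fmean(g) - grand) ** 2 for g in (a, b))
    ss_within = sum((x - statistics.fmean(g)) ** 2 for g in (a, b) for x in g)
    # df_between = 2 - 1 = 1, df_within = N - 2
    return (ss_between / 1) / (ss_within / (len(a) + len(b) - 2))

# Made-up answers for two hypothetical groups.
a = [7, 8, 6, 9, 7]
b = [5, 4, 6, 5]

t = pooled_t(a, b)
f = anova_f_two_groups(a, b)
print(t ** 2, f)  # the two numbers agree up to rounding error
```

Both a large positive t and a large negative t turn into a large F after squaring, so the two tails of the t-test end up together in the right tail of the F distribution.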

#### AniBioUSA

##### New Member
oh I see. So does that mean that ANOVA tests are one-tailed?

Thank you again so much for all of your help. I really appreciate it. I want to go into my freshman year with a good project under my belt. Thank you so much.