# 3 groups - What statistical analysis to use (Newbie)

#### Ahyanvirg

##### New Member
I have 3 groups: Pass = 533, Clinical = 58, Double Invalid = 28.

A total of 619 participants were given 4 types of test and were later categorised, based on their scores, as pass, clinical or double invalid. Can these data be used for statistical analysis? If so, can you suggest which test to use?

We want to know if scores from 4 different types of test can predict the outcome (pass, clinical or double invalid).

Regards,
Marianne

#### Karabiner

##### TS Contributor
If you use the scores of the 4 tests to categorize participants into 3 groups,
then what sense does it make to ask whether the scores of the 4 tests are
associated with the categorization?

With kind regards

Karabiner

#### Ahyanvirg

##### New Member
Can you suggest a better way to do it?

#### Karabiner

##### TS Contributor
It is not a question of a better way, but a question of what sense your goal makes.
If you use the 4 scores for categorization, then necessarily they are associated with
the categorization. The way you use them for categorization (e.g. relative weighting)
determines how close these associations are. But maybe I am missing something
and you have a theoretical and/or practical and/or more specific question regarding
that matter.

With kind regards

Karabiner

#### Ahyanvirg

##### New Member
To give you a better overview of my data: these 619 participants underwent the same 4 testing procedures and were all successful. Since they passed these 4 tests, they were then given a final assessment, in which some passed, some failed, and others had a double invalid profile. Given this scenario, we want to know whether the scores from these four tests can predict, or are related to, the final assessment outcome. We are trying to build a matrix so that, from the results of these four tests, we can estimate the result of the final assessment without asking everyone to take it. I hope that makes sense.

#### Karabiner

##### TS Contributor
Were there subjects which underwent the 4 tests but failed?
On which scale level are the test results measured (interval, ordinal,
or categorical)?

With kind regards

Karabiner

#### Ahyanvirg

##### New Member
The data are measured using rating scales.

Test 1 = scale of 1-5 (with 2 factors being measured; minimum passing sum score of 4)
Test 2 = scale of 1-7 (with 6 factors being measured; minimum passing sum score of 25)
Test 3 = scale of 1-5 (with 7 factors being measured; minimum passing sum score of 21)
Test 4 = scale of 1-5 (with 5 factors being measured; minimum passing sum score of 15)

Tests are given in chronological order. If you fail one test, you do not proceed to the remaining tests. We are interested in the profile of those who passed all 4 tests and were subjected to the final assessment. So to answer your question, there were no subjects who underwent all 4 tests but failed.
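The gating rule described above can be written down as a small sketch. The thresholds are the minimum passing sum scores quoted in this post; the names `MIN_PASSING_SUM` and `reaches_final_assessment` are made up for illustration only.

```python
# Minimum passing sum scores as quoted in the post, in the order the tests are given.
# (Dictionaries keep insertion order in Python 3.7+.)
MIN_PASSING_SUM = {"test1": 4, "test2": 25, "test3": 21, "test4": 15}


def reaches_final_assessment(sum_scores):
    """Return (True, None) if all four tests are passed,
    otherwise (False, name of the first test that stops the participant).

    sum_scores maps a test name to its sum score; a test that was never
    taken is simply missing from the mapping.
    """
    for test, minimum in MIN_PASSING_SUM.items():
        score = sum_scores.get(test)
        if score is None or score < minimum:
            return False, test
    return True, None


print(reaches_final_assessment({"test1": 4, "test2": 25, "test3": 21, "test4": 15}))
print(reaches_final_assessment({"test1": 6, "test2": 24}))
```

Because every one of the 619 participants satisfies this rule, their scores on each test are cut off from below at the passing threshold, which is worth keeping in mind when interpreting any association with the final outcome.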

#### Karabiner

##### TS Contributor
If you were able to dichotomize your dependent variable (e.g. "passed" vs. "other"), then you
could try binary logistic regression in order to obtain an equation for prediction of group
membership by the 4 scores.

Before that, you could inspect your data graphically, e.g. compare the test variables between
groups using box-and-whisker plots.
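As a rough illustration of the binary logistic regression idea, here is a minimal sketch using only the Python standard library. The twelve rows are invented toy data, not the data from this thread, and the fitting routine (plain gradient ascent on standardized scores) is just one simple way to obtain the coefficients; all names are hypothetical.

```python
import math

# Invented toy data (NOT the thread's data): four test sum scores and a
# dichotomized outcome, 1 = "pass", 0 = "other" (clinical or double invalid).
rows = [
    (8, 36, 30, 22, 1), (9, 38, 31, 23, 1), (7, 33, 28, 20, 1),
    (6, 30, 27, 19, 1), (10, 40, 33, 24, 1), (5, 28, 24, 17, 1),
    (8, 35, 29, 21, 1), (4, 25, 21, 16, 1), (7, 31, 26, 18, 0),
    (5, 27, 23, 16, 0), (4, 26, 22, 15, 0), (6, 29, 25, 18, 0),
]


def standardize(column):
    """Rescale a column to mean 0 and standard deviation 1."""
    mean = sum(column) / len(column)
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    return [(v - mean) / sd for v in column]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def linear(beta, x):
    return sum(b * v for b, v in zip(beta, x))


def mean_log_likelihood(X, y, beta):
    """Average of y*z - log(1 + exp(z)) over all rows, computed stably."""
    total = 0.0
    for x, yi in zip(X, y):
        z = linear(beta, x)
        total += yi * z - (max(z, 0.0) + math.log1p(math.exp(-abs(z))))
    return total / len(y)


def fit_logistic(X, y, lr=0.5, n_iter=200):
    """Fit coefficients by gradient ascent on the mean log-likelihood."""
    beta = [0.0] * len(X[0])
    for _ in range(n_iter):
        grad = [0.0] * len(beta)
        for x, yi in zip(X, y):
            err = yi - sigmoid(linear(beta, x))
            for j, v in enumerate(x):
                grad[j] += err * v / len(y)
        beta = [b + lr * g for b, g in zip(beta, grad)]
    return beta


y = [r[4] for r in rows]
cols = [standardize([r[j] for r in rows]) for j in range(4)]
# Design matrix: intercept column followed by the four standardized scores.
X = [[1.0] + [cols[j][i] for j in range(4)] for i in range(len(rows))]

beta = fit_logistic(X, y)
probs = [sigmoid(linear(beta, x)) for x in X]
null_ll = mean_log_likelihood(X, y, [0.0] * 5)
fit_ll = mean_log_likelihood(X, y, beta)

print("coefficients (intercept, test1..test4):", [round(b, 3) for b in beta])
print("mean log-likelihood, all-zero vs fitted:", round(null_ll, 4), round(fit_ll, 4))
```

Each entry of `probs` is the estimated probability of "pass" for one participant, which is the kind of prediction equation meant above. For a real analysis, an established routine such as `statsmodels.api.Logit` or scikit-learn's `LogisticRegression` would likely be preferable, since they also report standard errors or offer regularization.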

With kind regards

Karabiner

#### abdiasis.jama2

##### New Member
Recode your ordinal variables and run the non-parametric Kruskal-Wallis H test to check whether there is a significant difference among the groups.
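A minimal sketch of the Kruskal-Wallis H statistic for one score variable across the three groups, using only the standard library. The ten scores are invented toy values, not the thread's data, and the function name is hypothetical. The p-value line relies on the fact that, with exactly three groups (2 degrees of freedom), the chi-square upper tail is `exp(-H / 2)`; it does not generalize to other group counts.

```python
import math


def kruskal_wallis_h(groups):
    """Return the tie-corrected Kruskal-Wallis H statistic."""
    pooled = sorted(v for g in groups for v in g)
    n_total = len(pooled)

    # Assign each distinct value the average of the ranks it occupies.
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        t = j - i                              # size of this tie block
        rank_of[pooled[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j

    rank_sum_part = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n_total * (n_total + 1)) * rank_sum_part - 3 * (n_total + 1)
    return h / (1 - tie_term / (n_total ** 3 - n_total))


# Invented toy sum scores on a single test, split by final outcome.
scores = {
    "pass": [30, 32, 35, 33],
    "clinical": [27, 29, 28],
    "double_invalid": [25, 26, 31],
}

h = kruskal_wallis_h(list(scores.values()))
p_value = math.exp(-h / 2)  # chi-square approximation, valid for 3 groups only
print("H =", round(h, 4), "approx. p =", round(p_value, 4))
```

The test would be repeated for each of the four score variables. In practice `scipy.stats.kruskal` does the same job and handles any number of groups.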

Your sample size is quite large. After recoding the data as numeric, you could run an ANOVA F test, provided that homogeneity of variances is respected; check this with Levene's test and boxplots.
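For the parametric route, the one-way ANOVA F statistic for a single score variable can be sketched as follows. The scores are invented toy values, not the thread's data, and the function name is hypothetical. No p-value is computed here because that needs the F distribution.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented toy sum scores on a single test, split by final outcome.
groups = [
    [30, 32, 35, 33],   # pass
    [27, 29, 28],       # clinical
    [25, 26, 31],       # double invalid
]

f_stat, df_b, df_w = one_way_anova_f(groups)
print("F(%d, %d) = %.4f" % (df_b, df_w, f_stat))
```

In practice `scipy.stats.f_oneway` returns both F and its p-value, and `scipy.stats.levene` covers the homogeneity-of-variance check mentioned above.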

This parametric option is explored in my recent book,
Statistics guide for students and researchers with SPSS illustration, on Amazon.

#### abdiasis.jama2

##### New Member
If you want to predict the dependent variable,
you could run multiple linear regression after scatterplot visualization.
A high ANOVA F value will reject the null hypothesis, meaning the overall relationship is significant.
The t values will indicate the significance of each independent variable.