Hi everyone,
I'm looking for some advice on finding the most appropriate way to analyse my data.
I would like to test the difference in the number of mistakes students make on two tests. Let's say that the students take a biology test and a geography test, and we count the number of mistakes made on each. I would like to know which test is associated with more mistakes (geography or biology).
My first instinct was to test the difference with a t-test, but these are count data, which I understand are not appropriate for that analysis. My tutor has advised me to use binomial logistic regression, but I don't think that is appropriate either: the outcome is not a single dichotomous 'yes'/'no' variable, because each participant has a score in both categories of the outcome (mistakes on the biology AND the geography test). Would it be appropriate to record the data in long format and analyse them with logistic regression? The count variable would then be transformed into a proportion.
EDIT
Would it be possible to test this with a linear mixed model after the counts have been transformed into proportions? For example:
proportion of test mistakes ~ test (geography or biology) + (1 | id)
Example data
ID  Test  Proportion
1   Geo   17
1   Bio   23
2   Geo   5
2   Bio   17
3   Geo   20
3   Bio   60
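To make the question concrete, here is a minimal Python sketch (standard library only) of the simplest paired comparison on the example data above. It reshapes the long-format rows into one Bio minus Geo difference per student and computes a paired t statistic. As far as I understand, with exactly two measurements per student and no missing values, a random-intercept model of the form above should give essentially the same test of the test effect, but please treat that as my assumption rather than a settled answer. The names `long_rows`, `paired_differences` and `paired_t` are just illustrative, and the scores are used exactly as listed, without assuming how they were scaled.

```python
import math
import statistics

# Example data from the post, in long format: (student id, test, score).
long_rows = [
    (1, "Geo", 17), (1, "Bio", 23),
    (2, "Geo", 5), (2, "Bio", 17),
    (3, "Geo", 20), (3, "Bio", 60),
]


def paired_differences(rows):
    """Return one Bio minus Geo difference per student, ordered by id."""
    by_id = {}
    for student_id, test, score in rows:
        by_id.setdefault(student_id, {})[test] = score
    return [by_id[i]["Bio"] - by_id[i]["Geo"] for i in sorted(by_id)]


def paired_t(diffs):
    """Return the paired t statistic and its degrees of freedom."""
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample SD with n - 1).
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / std_err, n - 1


diffs = paired_differences(long_rows)
t_stat, df = paired_t(diffs)
print("differences:", diffs)
print("t =", round(t_stat, 3), "on", df, "degrees of freedom")
```

With only three students this is purely illustrative. It also does not address my concern about counts or proportions violating the t-test's assumptions; it only shows the paired structure that any model would need to respect.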
Thanks so much for your help.