What kind of regression to use for competition results

#1
Hi, we have botanical data and need to analyze the following kind of scenario, described in terms of students for simplicity of explanation:
We have a series of competitions or trials. Each one involves 2 individuals at a time, and each trial has 2 new individuals. We test for a binary score, win or lose, and want to see whether parameters of the individuals affect this result. The parameters are a mix of continuous ones, such as age, and categorical ones, such as the school each individual comes from. So the trials are independent of one another, but naturally the result of each trial depends not only on the winner's parameters but also on the opponent's. We could consider any simple mixed-effects and/or nested logistic model. Any suggestions? Is it possible to do a regression analysis on this, or to use any other statistics? Someone suggested Bayesian inference, since we have some prior data on the categorical parameters, but we prefer not to use Bayesian methods. Thanks for the help.
 
#3
Maybe you can use logistic regression if you want to see the effect. That is one option; I cannot say more because I have not seen the data.
Thanks. The problem with using logistic regression, even mixed-effects logistic regression, is that we either lose some of the data if we include only the winner, or falsely multiply the data lines, since the 2 individuals in a trial are not independent in their outcome; in fact, their results are exact inverses. So perhaps we are wrong in how we use the data.
 

noetsi

No cake for spunky
#4
Why do you lose data in logistic regression - that is why do you only have to include the winners? Include the winners coded as 1 and the losers as 0.

I don't understand why the individuals in the trial are not independent? What makes them that way? Obviously they vary on some dimension but that does not mean they are not independent of each other statistically.
 
#5
I attach an example of the data; maybe it will make our problem clearer or help find where we are wrong:

The following data (also in the attachment) come from this experiment: we take subjects (= subject id) from two schools (= group) and hold a spelling contest between them.
The outcome is win or lose, i.e. 1 or 0 (= result). We also record each subject's previous English grade (= grade), and we might add more columns in the future
with grades in other related subjects (e.g., literature, second-language grades, etc.).
Thanks.

line no.  trial no.  subject id  group  grade  result (win=1; lose=0)
1         1          15          1      2      0
2         1          21          2      2.15   1
3         2          8           1      2.15   0
4         2          27          2      1.9    1
1         3          18          1      2      0
2         3          9           2      2.25   1
3         4          3           1      2.05   1
4         4          5           2      2.1    0


In the example we have 4 trials and 2 data lines for each trial, one for each of the 2 subjects. Note that the results are complementary within each trial: result=1 for one subject means result=0 for the other. So one problem is that part of the data is redundant with regard to the effect of group on the result, since one line determines the other (note that this is not so for the grade parameter, since it is a continuous variable, not a dichotomous one). So even if we choose nested models, they still require independence of the data points at each level, which we don't have here at the level of each trial. Or can the points be dependent?
Of course, winning or losing is a result of the individual parameters of the winner, but also of the loser, i.e. of who the competitor was and what his grades were (although we do not intend to regress on the subject id as well, as it is meaningless). So how can we find the effects on the result of the group (which should be a random effect, since it has many diverse effects, probably also on the grade parameter, at least to some extent) as well as of the grades?
When we thought of a mixed and nested effects logistic regression model, we found the possible problem of redundant data in the group->result effect.
So, any suggestions? Is there a suitable regression model? Should we use another statistical approach?

thanks!
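
The redundancy described above can be made concrete with a small sketch. It assumes, as in the example table, that every trial pits one group-1 subject against one group-2 subject. The helper name `to_trial_rows` and the choice of "group-1 grade minus group-2 grade" as the covariate are illustrative assumptions, not something taken from the attachment.

```python
# Long-format rows copied from the example table in the post:
# (trial, subject_id, group, grade, result)
rows = [
    (1, 15, 1, 2.00, 0),
    (1, 21, 2, 2.15, 1),
    (2, 8, 1, 2.15, 0),
    (2, 27, 2, 1.90, 1),
    (3, 18, 1, 2.00, 0),
    (3, 9, 2, 2.25, 1),
    (4, 3, 1, 2.05, 1),
    (4, 5, 2, 2.10, 0),
]


def to_trial_rows(rows):
    """Collapse the two subject rows of each trial into one row per trial.

    Returns (trial, grade_diff, group1_won) tuples, where grade_diff is the
    group-1 grade minus the group-2 grade (an assumed, illustrative covariate).
    """
    by_trial = {}
    for trial, _subject, group, grade, result in rows:
        by_trial.setdefault(trial, {})[group] = (grade, result)

    out = []
    for trial in sorted(by_trial):
        grade1, result1 = by_trial[trial][1]
        grade2, result2 = by_trial[trial][2]
        # The two results are complementary, so the second row adds no
        # information about the outcome beyond what the first one gives.
        assert result1 + result2 == 1
        out.append((trial, round(grade1 - grade2, 2), result1))
    return out


trial_rows = to_trial_rows(rows)
for row in trial_rows:
    print(row)
```

With one row per trial, no outcome is counted twice and no subject is dropped: the loser's grade still enters through the difference.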
 
#6
I think the data can still be analyzed with logistic regression. In logistic regression, each coefficient for a categorical predictor compares that category with a reference category, which you can choose to be the category you wish to compare against (try to learn more about the interpretation of the logistic regression model). Thus, you can still compare the desired variables against the desired category. However, one thing to consider is whether there is interaction between the variables in the model.
For reference, see the book by Razia Azen and Cindy M. Walker on categorical data analysis.
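
One possible way to keep logistic regression while avoiding both the data loss and the duplicated rows raised earlier in the thread is to use one row per trial: the outcome is whether the group-1 subject won, and the covariate is the difference in grades. The intercept then plays the role of the group effect and the slope the role of the grade effect (a paired-comparison setup in the spirit of a Bradley-Terry model). This is only a sketch under assumptions, not a method proposed by any poster: the function name `fit_logistic` is hypothetical, the Newton-Raphson loop is a plain-Python stand-in for a statistics package, and four trials are far too few for any real inference.

```python
import math

# One row per trial, derived from the example table in the thread:
# grade_diff = group-1 grade minus group-2 grade; y = 1 if the group-1 subject won.
grade_diff = [2.00 - 2.15, 2.15 - 1.90, 2.00 - 2.25, 2.05 - 2.10]
group1_won = [0, 0, 0, 1]


def fit_logistic(xs, ys, iters=25):
    """Fit logit P(y=1) = b0 + b1*x by Newton-Raphson; returns (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            # Gradient of the log-likelihood
            g0 += y - p
            g1 += (y - p) * x
            # Negative Hessian (observed information)
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * delta = gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


b0, b1 = fit_logistic(grade_diff, group1_won)
print("intercept (group 1 vs group 2):", b0)
print("slope (grade difference):", b1)
```

A random effect for group, as the original poster wants, would need a mixed-model package and many more groups than the two in the example; this sketch only shows the one-row-per-trial layout.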