# Binomial regression with x as the dependent variable and n as an independent variable

#### emily.stats

##### Member
Hi all

I have a sample size of 100. Out of those 100, a certain number are successful at stage one. Of those successful at stage one, some are successful at stage two. I want to regress the number successful at stage two on the number successful at stage one and a number of other factors. The challenge is that the number at stage two cannot be greater than the number at stage one.

Thanks

#### SiBorg

##### New Member

I'm no expert but it sounds like you want to perform logistic regression with 'successful at stage 2' as your dependent variable.

You then have predictor variables, one of which is the binary variable 'successful at stage 1', plus your other predictor variables.

You'll then see if any of the predictor variables (including successful at stage 1) are significant predictors of successful at stage 2.

Given your small sample size, however, you can't really afford many predictors in the model. So a backwards or forwards stepwise strategy, where the software drops or adds predictors depending on their significance, would probably work best.

#### bryangoodrich

##### Probably A Mammal

I don't quite understand this constraint you want to impose. Are your data going to be like (1 for success; 0 not):

Code:
1st 2nd
0   0
1   0
1   1
0   0
1   0
1   1
Or might you have

Code:
1st 2nd
0   1
1   0
1   1
0   0
1   0
1   1
The second has successes in the 2nd stage without successes in the 1st. In the first example, the 2nd stage is 0 wherever the 1st stage is 0. If your data is like the first example, then the constraint you want is implicit in the data, and no constraint on the model is necessary. If your data is like the second, I have to wonder why you would want to constrain your model: that would impose a constraint that doesn't reflect the data (phenomena) you observe. The easy solution would be to simply remove those observations where you have a (0, 1) pair between your 1st and 2nd stages. Then your data is prepared for the model you want to use without having to impose the constraint on the model. However, you would have to state clearly that you've imposed that constraint on your data.
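A minimal sketch of that filtering step, assuming individual-level data held as (1st, 2nd) pairs. The variable names are illustrative and the toy rows are copied from the second example.

```python
# Individual-level outcomes as (stage1, stage2) pairs; 1 = success, 0 = not.
# Toy rows copied from the second example in this post.
pairs = [(0, 1), (1, 0), (1, 1), (0, 0), (1, 0), (1, 1)]

# Keep only rows consistent with "stage 2 success requires stage 1 success",
# i.e. drop every (0, 1) pair.
kept = [(s1, s2) for s1, s2 in pairs if not (s1 == 0 and s2 == 1)]
dropped = len(pairs) - len(kept)

print(kept)
print(dropped)
```

Reporting `dropped` alongside the results is one simple way to make the imposed constraint explicit.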

Edit: The above comment follows my thoughts on how to model this. Just do a logistic regression using the data as-is, checking if one is a useful predictor of the other. Sometimes you'll have success/failure pairs of (0, 0), (0, 1), (1, 0), or (1, 1), but the number of 1's in either the first or second stage shouldn't matter. If a (0, 1) pair is unattainable from the process that generated the data, then it's even stranger that you would have impossible data. Hence my wondering why you need this constraint at all.

#### SiBorg

##### New Member

Note: If being successful at stage 1 is a prerequisite for getting a go at stage 2, then it can't be a predictor of success at stage 2....
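To illustrate with a hedged toy example (the rows below are made up, and `entrants` is just an illustrative name): once you restrict to the people who actually attempted stage 2, the stage 1 indicator is 1 for everyone, so it has no variation left to explain anything.

```python
# Made-up individual-level rows: (stage1, stage2), 1 = success, 0 = not.
rows = [(0, 0), (1, 0), (1, 1), (0, 0), (1, 0), (1, 1)]

# Only stage 1 successes get an attempt at stage 2.
entrants = [(s1, s2) for s1, s2 in rows if s1 == 1]

# Among entrants the stage 1 indicator takes a single value,
# so it cannot distinguish stage 2 successes from failures.
stage1_values = {s1 for s1, _ in entrants}
print(stage1_values)
```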

#### emily.stats

##### Member

The data will look like
n s1 s2
100 80 79
100 70 60
100 50 25
etc

#### SiBorg

##### New Member

If you are interested in what determines success at stage 2, AND if success at stage 1 is required to go on to stage 2, then just do a logistic regression with your dependent variable as success at stage 2 (and ignore success at stage 1). The reason is that success at stage 2 implies success at stage 1 so stage 1 can be ignored.

Or maybe not..... I'm guessing 'binomial regression' is different to logistic regression in that you're not considering individuals but the whole sample....


#### bryangoodrich

##### Probably A Mammal

The data will look like
n s1 s2
100 80 79
100 70 60
100 50 25
etc
So, as we've inquired, s2 is necessarily no greater than s1? To be in stage 2 requires you to be successful at stage 1? Then by that design, the data itself will meet your constraint, correct?
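A quick way to confirm that on grouped data, as a sketch that assumes rows of the form (n, s1, s2) like the three quoted above:

```python
# Grouped rows as posted in the thread: (n, s1, s2).
groups = [(100, 80, 79), (100, 70, 60), (100, 50, 25)]

# The design implies 0 <= s2 <= s1 <= n for every row.
violations = [(n, s1, s2) for n, s1, s2 in groups if not (0 <= s2 <= s1 <= n)]
print(violations)
```

An empty list means every row already satisfies the constraint, so nothing extra needs to be imposed on the data.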

#### emily.stats

##### Member

Thanks for the help so far. Just to clarify, I want to be able to predict s2 based on s1 and other variables.

#### emily.stats

##### Member

I'm still interested in the answer to this question. Any more ideas?
Thanks

#### SiBorg

##### New Member

I still don't understand the problem.

Two questions. (1) Is success at stage 1 essential to obtain success at stage 2?

And (2) What, exactly, are you trying to predict with your predictor variables? Chance of success at stage 2? For one person with a certain combination of predictor variables?

#### emily.stats

##### Member

Yes, success at stage 1 is essential for success at stage 2.
We are trying to predict the chance of success at stage 2 using results from stage 1 and other predictor variables.
It is easy to measure success at stage 1 but not at stage 2, so for future studies we want to measure only stage 1 and use it as a basis for estimating stage 2 success.
Thanks
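One way to read this setup, offered as a sketch and not as anyone's recommended model: treat the s1 people who pass stage 1 as the trials and s2 as the number of successes among them, so that a predicted s2 can never exceed s1. The snippet below estimates a single pooled conditional success rate from the three example rows and uses it to predict s2 for a new s1. The pooled rate ignores the other predictor variables and any group-to-group differences; a binomial regression would replace that constant with a function of the covariates. The names `p_hat` and `predict_s2` are illustrative.

```python
# Grouped rows from the thread: (n, s1, s2).
groups = [(100, 80, 79), (100, 70, 60), (100, 50, 25)]

# Pooled estimate of P(success at stage 2 | success at stage 1).
total_s1 = sum(s1 for _, s1, _ in groups)
total_s2 = sum(s2 for _, _, s2 in groups)
p_hat = total_s2 / total_s1

def predict_s2(s1, p=p_hat):
    """Expected number of stage 2 successes given s1 stage 1 successes."""
    return s1 * p

print(p_hat)
print(predict_s2(60))
```

The three rows have quite different conditional rates (79/80, 60/70, 25/50), and that spread is what the other predictors would need to explain.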

#### SiBorg

##### New Member

Ok, that is helpful. And do you want a probability of passing stage 2 for an individual taking these exams or the percentage of the whole group that will pass?

#### emily.stats

##### Member

The data will look like
n s1 s2
100 80 79
100 70 60
100 50 25
etc
I want to predict s2 from s1

#### SiBorg

##### New Member

OK, so you want to predict the number successful at stage 2 from the number successful at stage 1 plus other predictor covariates.

I don't think it's logistic regression you need then. Maybe some form of linear regression would be appropriate, but I'm really not sure.
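As a rough sketch of what a plain linear fit would do here, assuming ordinary least squares of s2 on s1 using only the three example rows (far too few for real inference, and with the other covariates left out): the code below computes the slope and intercept by hand. Nothing in this fit knows that s2 cannot exceed s1, so its predictions can break that constraint.

```python
# Example rows from the thread: stage 1 and stage 2 success counts.
s1 = [80, 70, 50]
s2 = [79, 60, 25]

# Ordinary least squares for s2 = intercept + slope * s1.
mean_x = sum(s1) / len(s1)
mean_y = sum(s2) / len(s2)
sxx = sum((x - mean_x) ** 2 for x in s1)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(s1, s2))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

def predict(x):
    """Fitted number of stage 2 successes for x stage 1 successes."""
    return intercept + slope * x

print(round(slope, 3), round(intercept, 3))
print(predict(90))
```

With these three rows the fitted line predicts roughly 96.5 stage 2 successes for 90 stage 1 successes, which is impossible under the design. Modelling the proportion s2/s1 instead would avoid that.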

Anyone?