Normal logistic regression vs a mixed model; which to use? benefits gained?

I have some evidence to suggest that one vaccine was far superior over another and resulted in the local elimination of a disease. Because it affects pharmaceutical companies the information is sensitive so I will share as much as I can.

I analysed the following data set using logistic regression:
Month; Type of vaccine used; Number of vaccines administered; Absence of disease (Yes/No)

The absence of disease is my dichotomous response variable. We chose the variable this way because our disease is quite rare and our surveillance for it isn't the best. The variable is taken as true if there is a continuous 6-month period without any cases.

I have this data spanning 12 years for 5 different districts. One particular vaccine was used throughout, except for roughly 28 months around years 10 and 11. Towards the end of this period the disease was simultaneously eliminated from all 5 districts.

At first it seems obvious that the temporary vaccine change made all the difference, BUT while the different vaccine was in use there was also a slight, concurrent increase in the number of vaccines administered (for some districts the highest over the whole study period, but not for others).

My results (by way of Odds ratios from my intercepts, supported by p-values) suggest that indeed the vaccine change was the key factor.

Unfortunately for me, someone in our organisation (a senior and respected person) has said that I have to redo the analysis using a mixed model. I've since researched mixed models a lot (previously I knew nothing about them) but I still can't see why it's necessary to apply one here. Before I dare contest this person's view I need to be surer than I am now.

I can see that one can incorporate the district as a variable and because the district has repeated measures one can add a random effect to it in a mixed model. Conversely my method involved 5 models (one for each district) plus 1 more (for the entire region as a whole). I don't see why the mixed model would be a better option than my method. Is there something important I'm not seeing?
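For reference, this is roughly how I understand the mixed model being proposed: a single logistic model for all districts with a random intercept per district. The symbols are my own notation and are assumptions, not something the senior person specified: $Y_{dt}$ is absence of disease in district $d$ and month $t$, $V_{dt}$ indicates that the alternative vaccine was in use, $N_{dt}$ is the number of vaccines administered, and $u_d$ is the district random intercept.

```latex
\operatorname{logit} \Pr(Y_{dt} = 1)
  = \beta_0 + \beta_1 V_{dt} + \beta_2 N_{dt} + u_d,
\qquad u_d \sim \mathcal{N}(0, \sigma_u^2), \quad d = 1, \dots, 5
```

My five separate models would instead estimate a separate $\beta_0$, $\beta_1$ and $\beta_2$ for each district, and my regional model would drop $u_d$ altogether.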

Many many thanks for reading this far!


Less is more. Stay pure. Stay poor.
Did this person say which variable should be entered as a random effect? If you control for random effects of district, it could account for differences between the subsamples (districts). Were the characteristics of the people in the districts comparable, as well as the results? Was vaccine use the same between districts?

I believe you can run a mixed model and see how much variance the district random effect explains. If it is a trivial amount, then you may be able to use your original approach, provided it is appropriate.
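One common way to judge whether that amount is trivial is the intraclass correlation on the latent logistic scale, which treats the residual variance as π²/3. This is a minimal sketch: the function name is hypothetical, and the variances in the loop are placeholder values, not estimates from your data. You would plug in the district random-intercept variance reported by whatever software fits the mixed model.

```python
import math


def latent_icc(random_intercept_variance: float) -> float:
    """Share of latent-scale variance due to the district random intercept.

    Uses the standard latent-variable convention for logistic models,
    where the level-1 residual variance is fixed at pi^2 / 3.
    """
    if random_intercept_variance < 0:
        raise ValueError("a variance cannot be negative")
    residual_variance = math.pi ** 2 / 3
    return random_intercept_variance / (random_intercept_variance + residual_variance)


# Placeholder variances only, to show how the ICC responds.
for variance in (0.0, 0.1, 1.0, 3.0):
    print(f"sigma_u^2 = {variance:.1f} -> ICC = {latent_icc(variance):.3f}")
```

An ICC close to zero would suggest that the districts add little beyond the fixed effects. With only 5 districts the variance estimate itself will be imprecise, so treat the number as a rough guide.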
No, the person did not say which variable, but now that I understand mixed models, district is the only one that would make sense.

There is indeed a big difference between the districts.

From my results 3 are inconclusive and 2 show very strongly that the vaccine was the major factor. Additionally, the combined one for the whole area also supported the different vaccine. None indicated that the number of vaccinations was the strongest effect.


Is there a threat to the Stable Unit Treatment Value Assumption (SUTVA)?

SUTVA has two components:

1. No interference (units do not interfere with each other): treatment applied to one unit does not affect the outcome for another unit

2. There is only a single version of each treatment level (potential outcomes must be well defined)

A typical example: I run an experiment that trains people to be better at getting jobs, but the non-experimental group gets hosed because the treatment group takes all the jobs and fewer are left. Thus the experimental group affects the outcome of the non-experimental group. For part 2, the requirement is that there are not multiple doses of the intervention (vaccine).

Think of the mixed model as it is used in meta-analyses. You pool the different districts together. You control for random effects because, beyond random sampling variation, the districts will actually have different effects, and you want to account for that additional variation that exists beyond chance. If you don't address it, your confidence intervals around the ORs will be too narrow.
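To make the meta-analysis analogy concrete, here is a minimal sketch of DerSimonian-Laird random-effects pooling. It is a standard meta-analysis estimator that I am swapping in for illustration, not the mixed logistic model itself. The function name is hypothetical, and the district log odds ratios and standard errors are invented numbers, not your results.

```python
import math


def pool_log_odds_ratios(log_ors, std_errors):
    """Pool per-district log odds ratios with fixed and random effects."""
    k = len(log_ors)
    if k < 2 or k != len(std_errors):
        raise ValueError("need at least two estimates, one standard error each")

    # Fixed-effect (inverse-variance) pooling: ignores between-district variation.
    w = [1.0 / se ** 2 for se in std_errors]
    fixed = sum(wi * y for wi, y in zip(w, log_ors)) / sum(w)
    fixed_se = math.sqrt(1.0 / sum(w))

    # DerSimonian-Laird estimate of the between-district variance tau^2.
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_ors))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects pooling: tau^2 is added to every district's variance.
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    random = sum(wi * y for wi, y in zip(w_star, log_ors)) / sum(w_star)
    random_se = math.sqrt(1.0 / sum(w_star))

    return {"fixed": fixed, "fixed_se": fixed_se,
            "random": random, "random_se": random_se, "tau2": tau2}


# Invented values for 5 districts: three weak effects, two strong ones.
result = pool_log_odds_ratios([0.2, 0.1, 0.3, 2.0, 2.4],
                              [0.5, 0.6, 0.5, 0.4, 0.45])
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

When the districts disagree, tau2 is positive and the random-effects standard error is larger than the fixed-effect one, which is the wider confidence interval I mean.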

Yes there is a very definite threat to the SUTVA assumption!

Elimination of the disease in one district reduces the infection pressure on each of its neighbours.

Aaargh! So how do I deal with that?


I saw this text in the Imbens and Rubin (2015) book on causal inference:

Another classic example of interference between units arises in settings with immunizations against
infectious diseases. The causal effect of your immunization versus no immunization will
surely depend on the immunization of others: if everybody else is already immunized
with a perfect vaccine, and others can therefore neither get the disease nor transmit it,
your immunization is superfluous. However, if no one else is immunized, your treatment
(immunization with a perfect vaccine) would be effective relative to no immunization. In
such cases, sometimes a more restrictive form of SUTVA can be considered by defining
the unit to be the community within which individuals interact, for example, schools in
educational settings, or specifically limiting the number of units assigned to a particular

This may be generalized to looking at a vaccine and neighboring non-vaccine area versus perhaps another non-vaccine area not near a vaccine area. Or other combinations.
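One possible way to set up such comparisons, as a rough sketch only, is to compute for each district the share of its neighbours that have a given status in a given month (for example, disease-free, or using the alternative vaccine) and then compare or adjust on that share. The function name, the district labels A to E and the adjacency map are all hypothetical; I do not know the real geography.

```python
def neighbour_share(adjacency, status):
    """Return, per district, the fraction of neighbouring districts with status True.

    adjacency: dict mapping each district to a list of its neighbours.
    status: dict mapping each district to True/False for the month in question.
    Districts with no neighbours get a share of 0.0.
    """
    shares = {}
    for district, neighbours in adjacency.items():
        if not neighbours:
            shares[district] = 0.0
            continue
        with_status = sum(1 for n in neighbours if status[n])
        shares[district] = with_status / len(neighbours)
    return shares


# Hypothetical layout: five districts in a row, A-B-C-D-E.
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"],
             "D": ["C", "E"], "E": ["D"]}
# Hypothetical month in which only A and B are disease-free.
disease_free = {"A": True, "B": True, "C": False, "D": False, "E": False}

print(neighbour_share(adjacency, disease_free))
```

A share computed from the previous month could then be entered as an extra covariate, so that a district's outcome is allowed to depend on what is happening next door.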