I have collected mortality data and several covariates (treatment and vital parameters) from patients in two different hospitals. My goal is to analyse the effects of treatments on mortality while adjusting for covariates, and logistic regression would be my first choice for this.

Since patients within each hospital are not independent of each other, and since treatment may differ systematically between the two hospitals, I would like to adjust my analysis for the fact that observations are clustered within hospitals.
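To make this concern concrete, here is a minimal sketch on simulated data (not the actual patient records). It assumes a hypothetical setting in which one hospital treats far more often and also has higher baseline mortality, while the true treatment effect is zero. All names (`simulate`, `fit_logistic`, `solve`) and all numeric settings are illustrative assumptions. The logistic regression is fitted with a plain Newton-Raphson loop so that the example needs only the standard library.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_logistic(X, y, n_iter=15):
    """Maximum-likelihood logistic regression via Newton-Raphson; returns coefficients."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for xi, yi in zip(X, y):
            eta = sum(b * x for b, x in zip(beta, xi))
            mu = 1.0 / (1.0 + math.exp(-eta))
            w = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (yi - mu) * xi[j]
                for k in range(p):
                    hess[j][k] += w * xi[j] * xi[k]
        # Newton step: beta <- beta + (X' W X)^{-1} X' (y - mu)
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta


def simulate(n_per_hospital=3000, seed=1):
    """Hypothetical data: treatment frequency and baseline mortality differ by hospital."""
    rng = random.Random(seed)
    rows = []
    for hospital in (0, 1):
        p_treat = 0.2 if hospital == 0 else 0.8  # assumed treatment rates
        for _ in range(n_per_hospital):
            treated = 1 if rng.random() < p_treat else 0
            # Assumed truth: no treatment effect, higher mortality in hospital 1.
            logit = -1.0 + 0.0 * treated + 1.5 * hospital
            died = 1 if rng.random() < 1.0 / (1.0 + math.exp(-logit)) else 0
            rows.append((treated, hospital, died))
    return rows


rows = simulate()
y = [d for _, _, d in rows]
# Columns: intercept, treatment (hospital ignored).
naive = fit_logistic([[1.0, t] for t, _, _ in rows], y)
# Columns: intercept, treatment, hospital indicator (hospital as fixed effect).
adjusted = fit_logistic([[1.0, t, h] for t, h, _ in rows], y)

print("treatment log-odds, hospital ignored: ", round(naive[1], 2))
print("treatment log-odds, hospital adjusted:", round(adjusted[1], 2))
```

Under these assumptions, ignoring hospital attributes the between-hospital difference in mortality to the treatment, whereas including the hospital indicator brings the treatment coefficient back towards its simulated value of zero. The sketch only illustrates why some form of adjustment is needed. It does not reproduce the mixed model or the GEE.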

I have specified different models:

1. Logistic regression including hospital as a fixed-effect covariate (there are only two hospitals, and I was also interested in the effect of hospital on mortality).

2. Random-intercept logistic model with hospital as a random effect and the other covariates as fixed effects.

3. Generalized Estimating Equations (GEE) with a logit link function, assuming an exchangeable working correlation structure and using robust standard errors to adjust for clustered observations.
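
For concreteness, the three specifications can be written roughly as follows. The notation is introduced here for illustration only: $Y_{ij}$ indicates death of patient $i$ in hospital $j \in \{1, 2\}$, $\mathbf{x}_{ij}$ collects treatment and the other covariates, and $h_j$ is a 0/1 hospital indicator.

$$
\begin{aligned}
\text{(1)}\quad & \operatorname{logit} P(Y_{ij}=1) = \beta_0 + \mathbf{x}_{ij}^\top \boldsymbol{\beta} + \gamma\, h_j \\
\text{(2)}\quad & \operatorname{logit} P(Y_{ij}=1 \mid u_j) = \beta_0 + \mathbf{x}_{ij}^\top \boldsymbol{\beta} + u_j, \qquad u_j \sim N(0, \sigma_u^2) \\
\text{(3)}\quad & \operatorname{logit} P(Y_{ij}=1) = \beta_0^{*} + \mathbf{x}_{ij}^\top \boldsymbol{\beta}^{*}, \qquad \operatorname{Corr}(Y_{ij}, Y_{kj}) = \alpha \ \text{ for } i \neq k
\end{aligned}
$$

The asterisks in (3) mark that GEE coefficients are marginal (population-averaged), whereas the coefficients in (1) and (2) are conditional on hospital.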

I expected all three models to address the clustering adequately and therefore to give similar results, but they do not. The mixed model and the GEE (models 2 and 3) produce similar results, whereas the fixed-effect model (model 1) is very different. In one case a certain treatment has a significant effect on mortality, and in the other it does not. The same is true for the significance of the covariates.

Which model would you consider most adequate to adjust for the fact that patients were treated in two different hospitals? Does anyone have an idea why treating hospital as a fixed effect makes such a big difference compared with the random-effects approach? Shouldn't the results be very similar when there are only two hospitals?

Cheers,

Patrick