So I was still interested in this problem, but for the wrong reasons. I believe this is a GLMM that can be fit by lmer in the lme4 package, and that is what made it interesting to me. The wrong part is that I have no experience with GLMMs, lmer or lme4, so that's not helpful.

But I thought about what I would do if someone held a gun to my head.

My model has fixed effects for the subgroup and newcriteria levels, and a random intercept for each cycle.

That leads to the model in R of:

promoted ~ subgroup+newcriteria + (1|cycle)
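Written out, and assuming the default logit link for the binomial family, that formula would correspond to something like the following. Here i indexes rows and j indexes cycles; the symbols (p, beta, b, sigma) are my own notation for illustration, not anything taken from the lme4 output:

```latex
\operatorname{logit}(p_{ij}) = \beta_0 + \beta_1\,\mathrm{subgroup}_{ij} + \beta_2\,\mathrm{newcriteria}_{ij} + b_j,
\qquad b_j \sim N(0, \sigma^2_{\mathrm{cycle}})
```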

My code was this:

Code:

```
library(lme4)
setwd('C:/Users/Jeremiah/Documents/Fall08/Research/lme/')
data = read.table("data.txt")
colnames(data) = c("cycle", "promoted", "notpromoted", "subgroup", "newcriteria")
data$cycle = factor(data$cycle)
data$subgroup = factor(data$subgroup)
data$newcriteria = factor(data$newcriteria)
summary(data)
response = cbind(data$promoted, data$notpromoted) # first column is success
# lme4 >= 1.0 fits this with glmer(); lmer(..., family=) is the older interface
fit1 = lmer(response ~ subgroup + newcriteria + (1|cycle), data=data, family=binomial)
summary(fit1)
```

And that leads to this summary:

Code:

```
Generalized linear mixed model fit by the Laplace approximation
Formula: response ~ subgroup + newcriteria + (1 | cycle)
   Data: data
   AIC   BIC logLik deviance
 44.08 47.64 -18.04    36.08
Random effects:
 Groups Name        Variance Std.Dev.
 cycle  (Intercept) 0.056011 0.23667
Number of obs: 18, groups: cycle, 9

Fixed effects:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)   2.46866    0.08453  29.206   <2e-16 ***
subgroup1    -0.21176    0.09702  -2.183   0.0291 *
newcriteria1  0.46537    0.25136   1.851   0.0641 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
            (Intr) sbgrp1
subgroup1   -0.109
newcriteri1 -0.002 -0.381
```

Based on that, you have statistically significant evidence (p = 0.029) that subgroup 1 is less likely to be promoted, but the evidence that the new criteria has a non-zero fixed effect is only marginal (p = 0.064). This is a more-data situation: another cycle might just tilt it one way or the other.
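To put those numbers on a more readable scale, here is a rough sketch that converts the reported log-odds estimates into odds ratios, approximate Wald intervals and promotion probabilities. It assumes the default logit link and that subgroup1 and newcriteria1 are contrasts against a reference level coded 0; the variable names are made up for illustration, and the numbers are just copied from the summary.

```r
# Estimates and standard errors copied from the summary output above
est = c(intercept = 2.46866, subgroup1 = -0.21176, newcriteria1 = 0.46537)
se  = c(intercept = 0.08453, subgroup1 = 0.09702, newcriteria1 = 0.25136)

# Odds ratios: exponentiate the log-odds coefficients
odds_ratio = exp(est[c("subgroup1", "newcriteria1")])

# Approximate 95% Wald intervals on the log-odds scale
z = qnorm(0.975)
wald_ci = cbind(lower = est - z * se, upper = est + z * se)

# Promotion probability for a "typical" cycle (random effect = 0),
# with newcriteria at its reference level, by subgroup
p_subgroup0 = plogis(est[["intercept"]])
p_subgroup1 = plogis(est[["intercept"]] + est[["subgroup1"]])

print(round(odds_ratio, 3))
print(round(wald_ci, 3))
print(round(c(subgroup0 = p_subgroup0, subgroup1 = p_subgroup1), 3))
```

The Wald interval for subgroup1 stays below zero while the one for newcriteria1 straddles zero, which is the same story the p-values tell.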

I am not really sure about diagnostics. Like I said, this is new turf for me. Seek out an expert on GLMMs.

BTW that number of observations line is kind of misleading in this context. Consider:

Code:

```
# reshape to help lmer out:
# one observation per row
newdata = matrix(0, sum(data$promoted) + sum(data$notpromoted), ncol(data)-1)
index = 1
for(i in 1:nrow(data)) {
  promoted = data$promoted[i]
  notpromoted = data$notpromoted[i]
  # factor columns come back as their labels, so this assumes numeric labels
  newrow = as.numeric(as.matrix(data[i,]))
  newrow = newrow[-3] # drop the notpromoted count
  newrow[2] = 1 # level for promoted
  for(j in seq_len(promoted)) { # seq_len() skips the loop when the count is 0
    newdata[index,] = newrow
    index = index + 1
  }
  newrow[2] = 0 # level for notpromoted
  for(j in seq_len(notpromoted)) {
    newdata[index,] = newrow
    index = index + 1
  }
}
newdata = as.data.frame(newdata)
colnames(newdata) = colnames(data)[-3]
newdata$cycle = factor(newdata$cycle)
newdata$promoted = factor(newdata$promoted)
newdata$subgroup = factor(newdata$subgroup)
newdata$newcriteria = factor(newdata$newcriteria)
summary(newdata)
# lme4 >= 1.0 fits this with glmer(); lmer(..., family=) is the older interface
fit = lmer(promoted ~ subgroup+newcriteria +(1|cycle), data=newdata, family=binomial)
summary(fit)
```
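For what it's worth, the same reshape can probably be done without the nested loops. This is only a sketch on a made-up data frame with the same column layout as above; `toy` and `expand_counts` are hypothetical names of mine, not part of the original code.

```r
# Hypothetical grouped data with the same column layout as above
toy = data.frame(
  cycle       = c(1, 1, 2, 2),
  promoted    = c(3, 0, 2, 1),
  notpromoted = c(1, 2, 0, 4),
  subgroup    = c(0, 1, 0, 1),
  newcriteria = c(0, 0, 1, 1)
)

expand_counts = function(d) {
  n_each = d$promoted + d$notpromoted
  # repeat each row index once per individual in that row
  rows = rep(seq_len(nrow(d)), times = n_each)
  out = d[rows, c("cycle", "subgroup", "newcriteria")]
  # within each original row: the promoted 1s first, then the notpromoted 0s
  out$promoted = unlist(Map(function(s, f) rep(c(1, 0), times = c(s, f)),
                            d$promoted, d$notpromoted))
  rownames(out) = NULL
  out[, c("cycle", "promoted", "subgroup", "newcriteria")]
}

long = expand_counts(toy)
print(nrow(long))
print(table(cycle = long$cycle, promoted = long$promoted))
```

Rows with a zero count just contribute nothing, so there is no special case to handle.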

So now there is one observation per row, with this summary:

Code:

```
Generalized linear mixed model fit by the Laplace approximation
Formula: promoted ~ subgroup + newcriteria + (1 | cycle)
   Data: newdata
  AIC  BIC logLik deviance
 9460 9491  -4726     9452
Random effects:
 Groups Name        Variance Std.Dev.
 cycle  (Intercept) 0.056011 0.23667
Number of obs: 17081, groups: cycle, 9

Fixed effects:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)   2.46866    0.08453  29.206   <2e-16 ***
subgroup1    -0.21176    0.09702  -2.183   0.0291 *
newcriteria1  0.46508    0.25133   1.850   0.0642 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
            (Intr) sbgrp1
subgroup1   -0.109
newcriteri1 -0.002 -0.381
```

You'll note that even the variance and Std. Dev. estimates for the random effect are exactly the same, yet the number of observations has jumped from 18 to 17081. It is interesting that the likelihood has changed, which suggests that internally something different is being done even though the estimates are essentially the same (newcriteria1 differs only in the fourth decimal place).
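One possible explanation for the likelihood gap, which I have not checked against what lmer does internally: a grouped binomial likelihood carries binomial-coefficient terms that do not depend on the parameters, while the one-row-per-trial version does not. The sketch below shows that effect with a plain glm (no random effects, so it is not the same model as above) on made-up counts.

```r
# Hypothetical grouped counts (not the promotion data from this thread)
grouped = data.frame(
  x       = c(0, 0, 1, 1),
  success = c(18, 15, 9, 12),
  failure = c(2, 5, 11, 8)
)

# Grouped fit: two-column response, successes first
fit_grouped = glm(cbind(success, failure) ~ x, data = grouped, family = binomial)

# The same data with one 0/1 row per trial
long = data.frame(
  x = rep(grouped$x, times = grouped$success + grouped$failure),
  y = unlist(Map(function(s, f) rep(c(1, 0), times = c(s, f)),
                 grouped$success, grouped$failure))
)
fit_long = glm(y ~ x, data = long, family = binomial)

# Same coefficients...
print(rbind(grouped = coef(fit_grouped), long = coef(fit_long)))

# ...but different log-likelihoods; the gap is the sum of log binomial coefficients
ll_gap = as.numeric(logLik(fit_grouped)) - as.numeric(logLik(fit_long))
binom_const = sum(lchoose(grouped$success + grouped$failure, grouped$success))
print(c(ll_gap = ll_gap, binom_const = binom_const))
```

Since that constant does not involve the coefficients, it shifts AIC, BIC and logLik without moving the estimates, so likelihood-based comparisons should only be made between fits that use the same data layout.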

edit and ps: According to MASS, the first column in the response matrix is considered "success", which is different from what I said earlier =) It's also confusing because if you use the factor version of the response (a column of 1s and 0s), the first factor level is considered failure and everything else is considered success. That's how I messed it up in my head.
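A quick way to convince yourself of both conventions is an intercept-only fit. This sketch uses a plain glm and made-up counts (58 successes out of 80), so the fitted probability should come back as 0.725 when success is coded the way you intended:

```r
# Hypothetical counts in two rows: 58 successes and 22 failures overall
s = c(30, 28)
f = c(10, 12)

# Two-column response: the FIRST column is treated as success
fit_sf = glm(cbind(s, f) ~ 1, family = binomial)
fit_fs = glm(cbind(f, s) ~ 1, family = binomial) # columns swapped

# Factor response: the first level is failure, everything else is success
y = factor(rep(c(1, 0), times = c(sum(s), sum(f))), levels = c(0, 1))
fit_factor = glm(y ~ 1, family = binomial)

# Back-transform each intercept to a probability
print(c(cbind_sf = plogis(coef(fit_sf)[[1]]),
        cbind_fs = plogis(coef(fit_fs)[[1]]),
        factor01 = plogis(coef(fit_factor)[[1]])))
```

Swapping the columns flips the probability to its complement, while the factor response with levels 0 then 1 agrees with the successes-first matrix.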