Pretest-posttest design at different institutions

I have data that consists of numerical scores ranging from 0-100 on a test, matched pre/post by student. (So for each student we have a pretest score and a posttest score.) All students were enrolled in a course that used a specific curriculum, but we have data from multiple schools with differing student populations. There is no control group, but we are comparing to published shift scores at other institutions.

We would like to determine which of the shifts from pretest to posttest are significant (in other words, at which institutions the posttest scores differ significantly from the pretest scores). I originally did this for each institution separately using a paired-samples t-test.
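For concreteness, the per-institution analysis described above might look like the following sketch. The school names, sample sizes, and scores are simulated purely for illustration (they are not our data), and scipy's `ttest_rel` stands in for whatever software is actually used.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical data: school -> (pretest, posttest), matched by student.
# Sample sizes and true gains are made up for illustration only.
schools = {}
for name, n, gain in [("A", 30, 8.0), ("B", 120, 3.0), ("C", 15, 0.0)]:
    pre = np.clip(rng.normal(50, 12, n), 0, 100)
    post = np.clip(pre + rng.normal(gain, 10, n), 0, 100)
    schools[name] = (pre, post)

# One paired-samples t-test per school (the original analysis).
results = {}
for name, (pre, post) in schools.items():
    t, p = stats.ttest_rel(post, pre)
    results[name] = {
        "n": len(pre),
        "mean_shift": float(np.mean(post - pre)),
        "t": float(t),
        "p": float(p),
    }
    print(name, results[name])
```

Each school gets its own test at the nominal significance level, which is exactly the setup the reviewer's Type I error concern applies to.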

One of our reviewers pointed out that running a separate test for each institution may inflate the risk of a Type I error, and suggested that we do our analysis using ANOVA instead. I can see how to use ANOVA to determine whether the pretest scores (or the posttest scores) differ across schools, but I cannot see how to use ANOVA to determine whether the pretest and posttest scores at a single institution differ from one another (this seems like going back to a t-test).

I think there may be some way to do this with post-hoc tests, but I am not sure how to interpret the output of those tests.

This is all complicated by the fact that we have very different numbers of students at the different schools, and so many repeated-measures ANOVA designs do not seem appropriate.


Cookie Scientist
The "ANOVA" way of asking this question (a mixed design with pre/post as the within-student factor and school as the between-student factor) essentially amounts to pooling the data from all of your schools and then doing one big paired t-test on everybody. It also gives you a way to test for school differences and for the improvement*school interaction, but since you aren't really interested in those questions (at least I didn't get that sense from your post), the effect you will end up talking about is basically just a paired t-test on the pooled data. This doesn't strike me as an obviously better way of looking at the data, but I guess you're somewhat at the mercy of the reviewers at this point.
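To make the pooled comparison concrete, here is a rough sketch using simulated gain scores (posttest minus pretest); the school labels, sample sizes, and effect sizes are all made up. It relies on two standard facts: a paired t-test is a one-sample t-test on the difference scores, and with only two time points the improvement*school interaction can be tested as a one-way ANOVA on those difference scores.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical gain scores (post - pre) per school, with unequal group sizes.
gains = {
    "A": rng.normal(8.0, 10.0, 30),
    "B": rng.normal(3.0, 10.0, 120),
    "C": rng.normal(0.0, 10.0, 15),
}

# "One big paired t-test": pool every student's gain and test it against zero.
pooled = np.concatenate(list(gains.values()))
t_pooled, p_pooled = stats.ttest_1samp(pooled, 0.0)

# Improvement*school interaction: do the mean gains differ between schools?
f_school, p_school = stats.f_oneway(*gains.values())

print(f"pooled gain: t = {t_pooled:.2f}, p = {p_pooled:.4f}")
print(f"gain by school: F = {f_school:.2f}, p = {p_school:.4f}")
```

Note that the pooled test weights each student equally, so large schools dominate the estimate; that is one reason it is not obviously better than looking at schools separately.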

How many different schools do you have? If you have a substantial number of schools, then the more correct way to do this analysis is really a multilevel model, with students nested within schools. But if you have only a small number of schools (fewer than 5, say), the ANOVA model is probably fine. I'm told that multilevel models can be hard to fit when the random factor (i.e., schools) has very few levels, although I've never tried it personally. But anyway, again, this may not be the stage in your paper's life to change the basic statistical method of your analysis.
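If you do go the multilevel route, a minimal sketch might look like the following. It assumes statsmodels and pandas are available, models the gain score with a random intercept per school, and uses simulated data; the column names `gain` and `school`, the eight schools, and all the numbers are hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)

# Hypothetical data: 8 schools of very different sizes, each with its own
# true mean gain drawn around an overall gain of 5 points.
sizes = [15, 30, 45, 60, 80, 100, 120, 150]
rows = []
for school_id, n in enumerate(sizes):
    school_mean_gain = 5.0 + rng.normal(0.0, 3.0)
    for g in rng.normal(school_mean_gain, 10.0, n):
        rows.append({"school": f"S{school_id}", "gain": g})
data = pd.DataFrame(rows)

# Random-intercept model: gain_ij = overall gain + school effect_j + error_ij.
model = smf.mixedlm("gain ~ 1", data, groups=data["school"])
result = model.fit()

overall_gain = float(result.fe_params["Intercept"])
p_value = float(result.pvalues["Intercept"])
print(result.summary())
```

The fixed intercept is the estimated average gain across schools, and the group variance in the summary describes how much the schools differ from one another. Unequal school sizes are not a problem for this kind of model, but with only a handful of schools the variance estimate can be unstable.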