# Mixed Model ANOVA Error terms

#### DrJBN

##### New Member
Can someone show me, or point me to a resource that shows, how to compute the error sum of squares produced in a repeated-measures ANOVA *without* using the typical subtraction process, SS_Total - (SS_Subjects + SS_Effect)?

I realize the term represents the interaction of the treatment with subjects, i.e., the treatment affects each participant differently; I am just trying to see its calculation more directly.

#### Jake

Here is an annotated R session that hopefully sheds some light on what these different SS and MS values really represent. The example I use is a simple 2-level repeated-measures ANOVA, in other words, just a "paired t-test," but the principles apply just the same to more complicated datasets such as the between-within case.
Code:
> # set random number seed
> set.seed(12345)
>
> # create and examine wide-format dataset
> wide <- data.frame(pre=rnorm(10))
> wide$post <- wide$pre + 1 + rnorm(10)
> wide
pre         post
1   0.5855288  1.469281011
2   0.7094660  3.526778061
3  -0.1093033  1.261324550
4  -0.4534972  1.066719284
5   0.6058875  0.855355461
6  -1.8179560 -0.001056128
7   0.6300986  0.743741030
8  -0.2761841  0.392238305
9  -0.2841597  1.836552908
10 -0.9193220  0.379401697
>
> # create and examine long-format dataset
> long <- data.frame(stack(wide), subject=factor(rep(1:10, 2)))
> names(long)[1:2] <- c("score", "time")
> long
score time subject
1   0.585528818  pre       1
2   0.709466018  pre       2
3  -0.109303315  pre       3
4  -0.453497173  pre       4
5   0.605887456  pre       5
6  -1.817955968  pre       6
7   0.630098551  pre       7
8  -0.276184105  pre       8
9  -0.284159744  pre       9
10 -0.919322002  pre      10
11  1.469281011 post       1
12  3.526778061 post       2
13  1.261324550 post       3
14  1.066719284 post       4
15  0.855355461 post       5
16 -0.001056128 post       6
17  0.743741030 post       7
18  0.392238305 post       8
19  1.836552908 post       9
20  0.379401697 post      10
>
> # do the ANOVA the traditional way
> aovmod <- aov(score ~ subject + time + subject:time, data=long)
> summary(aovmod)
Df Sum Sq Mean Sq
subject       9 11.756   1.306
time          1  8.269   8.269
subject:time  9  3.189   0.354
> # like you said, SS_subject*time = SS_total - SS_subject - SS_time
>
> # and the F for the time effect is MS_time / MS_subject*time
> c(F_time = 8.269 / 0.354)
F_time
23.35876
>
> # an equivalent way to do this ANOVA is as a one-sample t-test on the
> # difference scores between "post" and "pre" scores for each subject
> diffs <- wide$post - wide$pre
> t.test(diffs)

One Sample t-test

data:  diffs
t = 4.8308, df = 9, p-value = 0.0009328
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
0.6837866 1.8881689
sample estimates:
mean of x
1.285978

> c(F_time = 4.8308 ^ 2)
F_time
23.33663
> # the same F except for a little rounding error
>
> # the subject*time MS is literally the variance of these differences
> # except that, for uninteresting mathematical reasons, we have to
> # divide them by square root of 2 first to get the exact same MS value
> var(diffs/sqrt(2))
[1] 0.354318
> # compare to ANOVA table above
>
> # in other words, these 10 differences ARE our 10 subject*time
> # interaction effects (multiplied by a constant)
> diffs
[1] 0.8837522 2.8173120 1.3706279 1.5202165 0.2494680 1.8168998 0.1136425
[8] 0.6684224 2.1207127 1.2987237
> # so their variance, i.e. mean square, represents MS_subject*time
>
> # another interesting thing is that the subject mean square
> # represents the variance of the 10 subject "main effects"
> # in other words, the variance of these subject means
> (sub_effects <- c(subject = rowMeans(wide)))
subject1   subject2   subject3   subject4   subject5   subject6
1.0274049  2.1181220  0.5760106  0.3066111  0.7306215 -0.9095060
subject7   subject8   subject9  subject10
0.6869198  0.0580271  0.7761966 -0.2699602
> # except, again, for uninteresting mathematical reasons, we have to
> # first multiply them by a constant (sqrt 2) to get the exact same MS
> var(sub_effects * sqrt(2))
[1] 1.306209
> # compare to ANOVA table above
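One more way to see it, which answers the original question directly: each subject-by-time interaction effect is a cell's score minus its subject mean, minus its time mean, plus the grand mean, so the error SS falls out with no subtraction of other SS terms. A standalone sketch that recreates the simulated data from the session above:

```r
# Recreate the simulated data from the session above (seed 12345)
set.seed(12345)
wide <- data.frame(pre = rnorm(10))
wide$post <- wide$pre + 1 + rnorm(10)
long <- data.frame(stack(wide), subject = factor(rep(1:10, 2)))
names(long)[1:2] <- c("score", "time")

# Cell-level interaction effects:
# score - subject mean - time mean + grand mean
subj_mean <- ave(long$score, long$subject)  # each subject's mean, per row
time_mean <- ave(long$score, long$time)     # each time point's mean, per row
effects   <- long$score - subj_mean - time_mean + mean(long$score)

# Their sum of squares IS the subject:time error SS -- no subtraction needed
sum(effects^2)  # 3.189, matching the ANOVA table
```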

#### DrJBN

##### New Member
Thanks for the reply. I apologize for not replying sooner, but I've been doing some thinking.

I was aware of the relationship between the paired-samples t-test and the repeated measures ANOVA, and it was that relationship that was giving me problems.

As you demonstrate, you can get the error for the ANOVA by looking at the differences between the two samples. That the error represents the interaction of treatment with subjects becomes clear there. If the treatment had a constant effect, independent of the subject, then the differences would all be the same and the error zero.
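That last point is easy to check numerically; with made-up scores where the treatment adds the same constant for every subject, the error term vanishes:

```r
# If the treatment shifts every subject by exactly the same amount,
# the difference scores are constant and the S x T error is zero
pre  <- c(1, 4, 2, 6)   # made-up baseline scores
post <- pre + 3         # identical treatment effect for everyone
diffs <- post - pre     # every difference is 3
var(diffs)              # 0: no subject-by-treatment interaction
```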

What I could not do was "see" the literal extension to more than two levels (e.g., A, B, C). Where could I "see" the treatment by subjects interaction among the three variables?
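One way to "see" it with three levels: the interaction effects are the cell residuals left after removing both the subject means and the treatment means, and their sum of squares is the error SS. A sketch with made-up scores (the A, B, C columns are hypothetical):

```r
# Made-up scores for 4 subjects under three treatment levels
m <- cbind(A = c(8, 2, 7, 5),
           B = c(5, 4, 2, 2),
           C = c(6, 3, 4, 1))

# Subject-by-treatment interaction effects, one per cell:
# score - subject mean - treatment mean + grand mean
effects <- sweep(sweep(m, 1, rowMeans(m)), 2, colMeans(m)) + mean(m)
effects          # here the interaction among A, B, and C is visible

# Their sum of squares is the error (S x T) sum of squares
sum(effects^2)   # 15.833 for these made-up numbers
```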

I did finally make some sense of it, at least for myself (I think) by working with the 2-level case again.

| A | B |
|---|---|
| 8 | 5 |
| 2 | 4 |
| 7 | 2 |
| 5 | 2 |

Doing the calcs for a simple repeated measures ANOVA gave me

| A | B | Subject Mean |
|---|---|---|
| 8 | 5 | 6.5 |
| 2 | 4 | 3 |
| 7 | 2 | 4.5 |
| 5 | 2 | 3.5 |

| Source | Sum of Squares |
|---|---|
| Total | 37.88 |
| Effect | 10.12 |
| Between Subjs. | 14.38 |
| Error (S x T) | 13.38 |
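Those hand calculations can be checked with `aov()` in the same style as Jake's session (my numbers, reshaped to long format):

```r
# The four subjects' scores under treatments A and B, in long format
d <- data.frame(score = c(8, 2, 7, 5,   # A
                          5, 4, 2, 2),  # B
                trt  = rep(c("A", "B"), each = 4),
                subj = factor(rep(1:4, 2)))

# Subject and treatment main effects; the residual line is the
# subject-by-treatment (error) term
summary(aov(score ~ subj + trt, data = d))
# Sum Sq: subj 14.375, trt 10.125, Residuals (S x T) 13.375
```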

Then I approached it differently, by subtracting the mean of each subject from that subject's A and B scores:

| A - Mean | B - Mean |
|---|---|
| 1.5 | -1.5 |
| -1 | 1 |
| 2.5 | -2.5 |
| 1.5 | -1.5 |

The individual differences have now been removed completely, so the "error" must be contained within each variable rather than in the differences. That is, the SS of each centered column, taken around that column's own mean, represents how the treatment affected the individuals differently at that level of the variable, rather than how the difference in the variables varied across subjects. Each column's SS here is 6.69, and the two together (6.69 + 6.69 = 13.38) recover the error sum of squares.
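A quick numerical check of that idea: after centering each subject's scores, the SS of each column around its own column mean, summed over columns, recovers the error SS exactly:

```r
# Same four subjects, centered on each subject's own mean
A <- c(8, 2, 7, 5)
B <- c(5, 4, 2, 2)
subj_mean <- (A + B) / 2
A_c <- A - subj_mean   #  1.5, -1.0,  2.5,  1.5
B_c <- B - subj_mean   # -1.5,  1.0, -2.5, -1.5

# SS of each centered column around its own mean
ss <- function(x) sum((x - mean(x))^2)
c(A = ss(A_c), B = ss(B_c), total = ss(A_c) + ss(B_c))
# 6.6875 + 6.6875 = 13.375: the error (S x T) sum of squares
```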