What's the difference between running an ANCOVA and running an ANOVA with the residuals (of the covariate) as the response?

#1
Hope that title wasn't too confusing. Anyways...say I'm interested in how a covariate might be influencing my response variable. Besides running it as an ANCOVA, I've also heard that you can run a regression of the covariate with the response variable. Then you can use those residuals from that regression as the response variable in an ANOVA. What's the difference between the two analyses? When is one preferred over the other? Is one typically more appropriate?
 

Jake

Cookie Scientist
#2
"Besides running it as an ANCOVA, I've also heard that you can run a regression of the covariate with the response variable. Then you can use those residuals from that regression as the response variable in an ANOVA."
This is almost true, but not quite. You're missing one step: you also need to regress the independent variable (IV) on the covariate and save those residuals too. If you then regress the DV residuals (the ones you already mentioned) on the IV residuals (the ones I just mentioned), the resulting coefficient and sums of squares are identical to what you'd have gotten from the ANCOVA. If you have multiple IVs (i.e., more than 2 cells in the design, so more than one dummy-coded predictor), you'd need to residualize each IV separately and then regress the DV residuals on all of the IV residuals together. This equivalence is known as the Frisch-Waugh-Lovell theorem, which you can google to read more about.
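If it helps to see the equivalence numerically, here is a minimal sketch in Python with NumPy. The simulated data, the variable names, and the little `ols` helper are all made up for illustration; any regression routine would do the same job.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Hypothetical data: a two-cell IV (dummy coded) and a covariate whose
# mean differs between the cells, so the two are correlated.
group = rng.integers(0, 2, size=n).astype(float)
covariate = rng.normal(size=n) + 0.8 * group
y = 1.0 + 0.5 * group + 2.0 * covariate + rng.normal(size=n)


def ols(X, y):
    """Least-squares fit of y on X; returns (coefficients, residuals)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta


ones = np.ones(n)

# The ANCOVA as a single regression: intercept + IV + covariate.
beta_full, resid_full = ols(np.column_stack([ones, group, covariate]), y)

# Residualize the DV and the IV on the covariate (with an intercept).
Z = np.column_stack([ones, covariate])
_, y_res = ols(Z, y)
_, g_res = ols(Z, group)

# Regress the DV residuals on the IV residuals. No intercept is needed
# because both sets of residuals already have mean zero.
beta_fwl, resid_fwl = ols(g_res[:, None], y_res)

ancova_coef = beta_full[1]
fwl_coef = beta_fwl[0]
sse_full = resid_full @ resid_full
sse_fwl = resid_fwl @ resid_fwl

print("IV coefficient, ANCOVA:    ", ancova_coef)
print("IV coefficient, residuals: ", fwl_coef)
print("Residual SS, ANCOVA:       ", sse_full)
print("Residual SS, residuals:    ", sse_fwl)
```

The two coefficients and the two residual sums of squares should agree up to floating-point error.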


"What's the difference between the two analyses? When is one preferred over the other? Is one typically more appropriate?"
If you follow the procedure that I outlined above, that is, you residualize both the DV and the IVs with respect to the covariate, then there's virtually no difference. The only tiny difference is that the denominator degrees of freedom for the hypothesis tests will differ by the number of covariates you adjusted for, because the regression on the residuals has no way of knowing that estimating the covariate slopes already used up those degrees of freedom. This will make no practical difference to the significance level unless your sample size is truly tiny, but technically the degrees of freedom from the ANCOVA are the correct ones.
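To make the degrees-of-freedom point concrete, here is a rough sketch with a deliberately tiny sample. The data and the `fit` helper are hypothetical, and I'm assuming the regression on the residuals is fit with an intercept, as most software does by default.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8  # deliberately tiny so the df difference is visible

# Hypothetical data: two cells of four observations, one covariate.
group = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0])
covariate = rng.normal(size=n) + 0.5 * group
y = 0.5 * group + covariate + rng.normal(size=n)


def fit(X, y):
    """OLS fit; returns coefficients, standard errors, residual df, residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    df = len(y) - X.shape[1]
    cov = (resid @ resid / df) * np.linalg.inv(X.T @ X)
    return beta, np.sqrt(np.diag(cov)), df, resid


ones = np.ones(n)

# ANCOVA: intercept + IV + covariate, so residual df = n - 3.
b_full, se_full, df_full, _ = fit(np.column_stack([ones, group, covariate]), y)

# Residualize the DV and the IV on the covariate.
Z = np.column_stack([ones, covariate])
_, _, _, y_res = fit(Z, y)
_, _, _, g_res = fit(Z, group)

# Regression on residuals: intercept + IV residuals, so residual df = n - 2.
b_res, se_res, df_res, _ = fit(np.column_stack([ones, g_res]), y_res)

t_full = b_full[1] / se_full[1]
t_res = b_res[1] / se_res[1]

print("ANCOVA:    coef", b_full[1], "t", t_full, "df", df_full)
print("Residuals: coef", b_res[1], "t", t_res, "df", df_res)
```

The coefficient is the same in both fits. Only the residual degrees of freedom differ (by one here, since there is one covariate), which slightly changes the standard error and the t statistic.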

If you don't follow the procedure that I outlined, but instead do what you originally described (residualizing only the DV), then it is not, in general, equivalent to ANCOVA. It's only equivalent in the special case where the IV and the covariate are perfectly orthogonal, that is, when the covariate has the same mean in each cell of the design, which will rarely be true in the real world. The procedure you described is basically a semi-partial correlation analysis.
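Here is a small sketch of that contrast, again with made-up data and hypothetical helper names. Case A lets the covariate mean differ between cells; case B centers the same covariate within each cell, so it has the same mean in both.

```python
import numpy as np


def compare(group, covariate, y):
    """Return (ANCOVA IV coefficient, IV coefficient from DV-only residualizing)."""
    ones = np.ones(len(y))

    # ANCOVA: intercept + IV + covariate.
    beta, *_ = np.linalg.lstsq(np.column_stack([ones, group, covariate]), y, rcond=None)
    ancova = beta[1]

    # Residualize only the DV on the covariate, then run the "ANOVA" on it.
    Z = np.column_stack([ones, covariate])
    beta_z, *_ = np.linalg.lstsq(Z, y, rcond=None)
    y_res = y - Z @ beta_z
    beta_r, *_ = np.linalg.lstsq(np.column_stack([ones, group]), y_res, rcond=None)
    return ancova, beta_r[1]


rng = np.random.default_rng(2)
n = 200
group = np.repeat([0.0, 1.0], n // 2)
noise = rng.normal(size=n)

# Case A: covariate mean differs between cells (IV and covariate correlated).
cov_a = rng.normal(size=n) + 1.0 * group
y_a = 2.0 * group + 2.0 * cov_a + noise

# Case B: same covariate, centered within each cell (orthogonal to the IV).
cov_b = cov_a.copy()
for g in (0.0, 1.0):
    cov_b[group == g] -= cov_b[group == g].mean()
y_b = 2.0 * group + 2.0 * cov_b + noise

ancova_a, dv_a = compare(group, cov_a, y_a)
ancova_b, dv_b = compare(group, cov_b, y_b)
r2_a = np.corrcoef(group, cov_a)[0, 1] ** 2

print("Case A (correlated): ANCOVA", ancova_a, "DV-only", dv_a)
print("Case B (orthogonal): ANCOVA", ancova_b, "DV-only", dv_b)
```

In case B the two coefficients match. In case A they don't: with one IV and one covariate, the DV-only coefficient works out to the ANCOVA coefficient shrunk by a factor of 1 - r², where r is the correlation between the IV and the covariate.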
 
#3
Great, thank you for the really informative reply! That makes sense. I think I'll just stick with the ANCOVA then. Seems way more straightforward.
 

Jake

Cookie Scientist
#4
Yes, I agree. Mainly I think the alternative method is interesting as a way of understanding what ANCOVA is doing "under the hood." But in practice you wouldn't normally literally do it that way.