# Comparing groups - Normalization to control

#### mterr

##### New Member
Hello everyone!

This is my first time around. I know a few things about classical hypothesis testing, p-values and CIs, but beyond that, my statistical knowledge is limited. I have come across a question that I can't answer and can't find any useful resources on, despite considerable effort.

Given two experimental runs, each with its own experimental & control group, I would like to test for a statistically significant difference between the first and second experimental group (using Welch's t-test). Because of inter-experimental variability, this requires normalization of the values in the experimental groups to their respective control group first.

The obvious (and often used) solution would be to divide each value in the experimental group by the mean of the corresponding control group (I have seen this described as "calculating the fold change" or "normalizing the data by setting the untreated control group as 1"). Other approaches I have seen are mean centering (subtracting the control group mean from each individual value in the experimental group) and standardizing using z- or t-scores.
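To make the three options concrete, here is a minimal Python sketch (standard library only). The function names and the toy numbers are made up for illustration, and the z-score version assumes that "standardizing" means using the control group's mean and SD, which is only one possible reading.

```python
from statistics import mean, stdev

def fold_change(experimental, control):
    """Divide each experimental value by the control-group mean."""
    control_mean = mean(control)
    return [x / control_mean for x in experimental]

def mean_center(experimental, control):
    """Subtract the control-group mean from each experimental value."""
    control_mean = mean(control)
    return [x - control_mean for x in experimental]

def z_score(experimental, control):
    """Standardize with the control group's mean and sample SD (assumed reading)."""
    control_mean, control_sd = mean(control), stdev(control)
    return [(x - control_mean) / control_sd for x in experimental]

# Toy data, invented for illustration only.
control = [9.0, 10.0, 11.0]
experimental = [18.0, 20.0, 22.0]

print(fold_change(experimental, control))
print(mean_center(experimental, control))
print(z_score(experimental, control))
```

Note that all three treat the control mean (and SD) as a fixed constant, even though it is itself an estimate from a small sample.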

Is this a valid approach? What about the points made in http://biostat.mc.vanderbilt.edu/wiki/pub/Main/TatsukiKoyama/RetreatPoster2009.pdf, namely that Type I error rates are inflated because the variability in the control group mean is ignored, and that expressing effects as fold changes can hide true differences?

I was able to replicate their findings regarding inflation of Type I error rates with a simulation in R of my own. Interestingly, regardless of whether I used dividing by the control average, mean centering or transforming to z/t-scores, as soon as the SD in the control groups went up from 1 even a little bit, inflation of Type I error rates occurred at a worrisome rate. (I can post the code if wanted.)
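What follows is not the R code mentioned above, but a minimal Python sketch of the same kind of simulation, under stated assumptions: only mean centering is shown, there is no treatment effect in either run (so every rejection is a false positive), the experimental SD is fixed at 1, each run has a random offset shared by its control and experimental group, and the Welch statistic is compared with a normal critical value rather than a t critical value (a reasonable approximation at n = 50 per group). The function names, parameters and defaults are hypothetical.

```python
import random
from statistics import NormalDist, mean, variance

def welch_statistic(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / se

def simulate_type1_rate(control_sd, n=50, n_sims=1000, alpha=0.05, seed=1):
    """Share of null simulations in which the normalized groups differ 'significantly'."""
    rng = random.Random(seed)
    # Assumption: normal approximation to the t critical value.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_sims):
        normalized = []
        for _run in range(2):
            offset = rng.gauss(0, 5)  # inter-experimental shift of this run
            control = [rng.gauss(offset, control_sd) for _ in range(n)]
            experimental = [rng.gauss(offset, 1.0) for _ in range(n)]
            control_mean = mean(control)
            # Mean centering: the control mean is treated as a known constant.
            normalized.append([x - control_mean for x in experimental])
        if abs(welch_statistic(normalized[0], normalized[1])) > critical:
            rejections += 1
    return rejections / n_sims

for sd in (0.01, 1.0, 2.0):
    print(f"control SD = {sd}: false positive rate = {simulate_type1_rate(sd):.3f}")
```

In this setup the test only sees the spread of the experimental values, while the noise in the two control means is carried into the normalized group means unaccounted for. That would explain why the false positive rate rises with the control SD.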

I would greatly appreciate any insights or pointers to useful resources. Surely I can't be the first one struggling with this?

On a related note: I have never had to use regression models before and don't quite get the "linear regression normalization" that the poster proposes as a solution to my problem:
"A simple linear regression model can be fit to test for the difference between the groups of interest taking into account the difference in the control groups. It is simply a test for an interaction effect." I know how to fit linear models, but what is meant by "testing for an interaction effect" in this context?
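For reference, one common reading of "testing for an interaction effect" in a two-run, two-group design is sketched below. It is an assumption, since the poster's exact specification is not quoted in this thread. The model would be y = b0 + b1*run + b2*treated + b3*run*treated + error, using all four groups unnormalized. With 0/1 dummy coding, the least-squares estimate of b3 equals the difference of differences (E2 - C2) - (E1 - C1), so testing b3 = 0 asks whether the treatment effect differs between runs. The Python function below (hypothetical name, standard library only) computes that estimate and its ordinary least-squares standard error directly from the group means. Note that, unlike Welch's test, this assumes a common variance in all four groups.

```python
from math import sqrt
from statistics import mean

def interaction_test(c1, e1, c2, e2):
    """Run-by-treatment interaction in a saturated 2x2 linear model.

    Returns (estimate, standard_error, t_statistic, degrees_of_freedom).
    c1/e1 are control/experimental values of run 1, c2/e2 those of run 2.
    """
    groups = [c1, e1, c2, e2]
    means = [mean(g) for g in groups]
    # b3: treatment effect in run 2 minus treatment effect in run 1.
    estimate = (means[3] - means[2]) - (means[1] - means[0])
    # Residual variance: the fitted values of the saturated model are the group means.
    df = sum(len(g) for g in groups) - 4
    rss = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    sigma2 = rss / df
    # Each of the four group means contributes sigma2 / n to the variance of b3.
    se = sqrt(sigma2 * sum(1 / len(g) for g in groups))
    return estimate, se, estimate / se, df

# Toy data, invented for illustration only.
est, se, t, df = interaction_test(
    c1=[1.0, 2.0, 3.0], e1=[3.0, 4.0, 5.0],
    c2=[11.0, 12.0, 13.0], e2=[15.0, 16.0, 17.0],
)
print(est, se, t, df)
```

The t statistic would then be compared with a t distribution on `df` degrees of freedom. The standard error includes a term for each control group, so the uncertainty in the control means is no longer ignored.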

#### msekn

##### New Member
Hi, were you able to figure this out? Especially the specification of their regression?