Interaction of treatment with confounder

In a multivariate regression analysis, I examine the effect that a treatment method has on subjects' hemoglobin levels. Since this is a retrospective study, I could not balance gender (or age) across groups by randomization before data collection, so I control for gender in my analyses. The treatment effect remains significant when I control for gender. However, when I include an interaction term between gender and treatment, the treatment effect is no longer significant, while the interaction term is. Unfortunately, I don't know what to conclude from this: should I report the interaction, and what exactly does it mean in that case? Hopefully someone can help me with this, thank you!

Geschlecht means Gender, Gruppen_Zwei is the treatment and HB_postop is the hemoglobin level.



TS Contributor
It would be better to call this multiple regression analysis; "multivariate" is commonly
used when several dependent variables are analysed at the same time.
Is it true that you do not have males in the first group?

Maybe you should post the main output from the regression analysis
(including coefficients, standard errors, p-values, R²).

With kind regards

Thank you for your reply and advice!
You're right, there are almost no males in the first condition. There are two; I do not know why there is no red dot for them.
There are 2 males and 25 females in the first condition and 9 males and 19 females in the second.
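
To make the imbalance concrete, here is a rough sketch of how much each of the four cells contributes to the uncertainty of a treatment-by-gender interaction estimate. It uses only the counts given above and assumes (my assumption, not something stated in the thread) that the residual variance is the same in every cell, in which case the variance of the interaction contrast is proportional to the sum of 1/n over the cells.

```python
# Cell counts reported in the thread (condition x gender).
cells = {
    ("condition 1", "male"): 2,
    ("condition 1", "female"): 25,
    ("condition 2", "male"): 9,
    ("condition 2", "female"): 19,
}

# Assumption: equal residual variance sigma^2 in every cell. The interaction
# contrast is a difference of differences of the four cell means, so its
# variance is sigma^2 * sum(1 / n_cell); each cell contributes 1 / n_cell.
contributions = {cell: 1.0 / n for cell, n in cells.items()}
total = sum(contributions.values())
shares = {cell: c / total for cell, c in contributions.items()}

n_total = sum(cells.values())
print("total n =", n_total)
for cell, share in shares.items():
    print(cell, f"n={cells[cell]}", f"share of variance={share:.0%}")
```

Under that assumption the cell with two men accounts for most of the variance of the interaction estimate, which is one way to see why the interaction rests almost entirely on those two observations.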


Less is more. Stay pure. Stay poor.
Well this is why you just don't run a model without designing it a priori. If you have content knowledge to expect heterogeneous treatment effects and you run the model then you report the interaction or that you failed to reveal anything. However, when a person goes dredging and finds an interaction this is the worst scenario - mostly since it is likely not real.

Now to your apparent issue, you can put a jitter on the last plot to see all of the values. If you only have 2 males in one of the treatment arms - that is obvious crap. How can you conclude that they aren't below average just by chance or that they represent the population of all males you will target results toward? I believe Frank Harrell has a paper stating you need a certain number of people just for an empty regression model, and once you add an interaction the number of people needed is pretty large. Your sample is trivially small - not large enough to make any inferences from. So it seems you have n=55 "observations". So without randomization you have to condition (which you don't have enough data for) or use weights (which you likely can't show common support for [overlapping of propensity scores]), so you can't create exchangeability between the treatment groups. And we haven't even mentioned Age, which I am guessing you probably also looked for an interaction with - and aren't talking about that, right?
I didn't expect heterogeneous treatment effects, but I included gender to control for the unequal number of females and males in one group compared to the other, as you would usually do when controlling for a variable that could not be randomized beforehand. And we could not randomize, as this is a clinical, retrospective study. So you would conclude that there is no information at all in this data and that no conclusions can be drawn about the effects of treatment?


Less is more. Stay pure. Stay poor.
Well, as your title references, you are trying to control for a confounder - a common cause of treatment assignment and the outcome. OK, let us say that is gender. Controlling for that then closes the backdoor path between the outcome and treatment and your results are now conditional, so your estimate for treatment is for the reference group of gender.

If controlling for the confounder is your purpose, then your job is done. However, you also need to remember that you can't try to interpret the gender coefficient - because your model was designed to identify the treatment effect on hemoglobin, and gender could have its own confounder(s) (table 2 fallacy) and is mediated through treatment. Given this you are done. But I will point out that you have a pretty small sample size for even trying to pull this off.
Thank you for your advice! I have now re-leveled the gender variable to see if the treatment effect is significant for the female group (by making them the baseline). In fact, I then find that for females getting the treatment is associated with a significant difference in outcomes and the interaction is still significant. For the whole group, the treatment effect is also significant if I do not include an interaction, which can be explained by the fact that most patients of the two conditions are women. What would be the logical conclusion now? There was a significant effect for treatment when adjusting for gender. Or should I write there was a significant effect of treatment which means females getting the treatment is associated with a significant difference in hemoglobin level. For men, this could not be found. However, the sample size was extremely small. So in other words, should I report the treatment effect for the whole group and just say that I controlled for gender (and ignore the interaction) or should I say that the treatment effect was found for females but not for males, but this could be due to the small sample size. The latter would, however, as I understood it from you, pay too much attention to the interaction, which probably came about because there are hardly any men in the sample and therefore men as such are not represented, and the observations are rather random products. Would be great to get your advice on that! Thank you!
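
The re-leveling described above can be illustrated with a small sketch. In a model with treatment, gender and their interaction (two binary predictors), the fit reproduces the four cell means exactly, so each coefficient is just a contrast of cell means. The cell means below are hypothetical numbers made up for illustration, not the poster's data, and `saturated_coefficients` is a helper name invented here.

```python
# Hypothetical cell means of postoperative hemoglobin (g/dL); NOT the thread's data.
# Keys are (gender, treated) with treated = 1 for the treatment group.
cell_mean = {
    ("female", 0): 10.0,
    ("female", 1): 11.0,
    ("male", 0): 12.5,
    ("male", 1): 12.0,
}

def saturated_coefficients(reference):
    """Coefficients of y ~ treatment * gender with `reference` as gender baseline.

    With two binary predictors and their product, the model is saturated, so
    every coefficient is a simple contrast of the four cell means.
    """
    other = "male" if reference == "female" else "female"
    effect_ref = cell_mean[(reference, 1)] - cell_mean[(reference, 0)]
    effect_other = cell_mean[(other, 1)] - cell_mean[(other, 0)]
    return {
        "intercept": cell_mean[(reference, 0)],
        "treatment": effect_ref,  # treatment effect in the reference gender only
        "gender": cell_mean[(other, 0)] - cell_mean[(reference, 0)],
        "interaction": effect_other - effect_ref,  # difference between the two effects
    }

female_ref = saturated_coefficients("female")
male_ref = saturated_coefficients("male")
print("female baseline:", female_ref)
print("male baseline:  ", male_ref)
```

The point of the sketch is that the "treatment" row changes with the baseline because it is the effect within the reference gender, while the interaction only flips sign. Neither version is an effect "for the whole group".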


Less is more. Stay pure. Stay poor.
What type of venue or dissemination of results are you planning?

I would provide descriptive stats for the sample and control for gender, but only report treatment effect estimate. How much does the treatment estimate change when excluding gender - if not much, I would just exclude it from the model and mention this in the Discussion section or limitations. Another option may be to just drop all of the men from the analysis and just report results for sample of females.
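
A sketch of the "how much does the estimate change" check, using the cell counts from this thread but hypothetical cell means (the real means were not posted), and assuming for illustration that the second condition is the treated one. It relies on the fact that in a linear model with treatment and gender as additive binary predictors, the treatment coefficient equals a weighted average of the within-gender differences, with weights n0*n1/(n0+n1) per gender.

```python
# (gender, treated) -> (n, mean). Counts are from the thread; the means are
# hypothetical, and treating condition 2 as "treated" is an assumption.
cells = {
    ("female", 0): (25, 10.0),
    ("female", 1): (19, 11.0),
    ("male", 0): (2, 12.0),
    ("male", 1): (9, 13.0),
}

def group_mean(treated):
    """Mean outcome of one treatment group, ignoring gender."""
    num = sum(n * m for (g, t), (n, m) in cells.items() if t == treated)
    den = sum(n for (g, t), (n, m) in cells.items() if t == treated)
    return num / den

def adjusted_effect():
    """Treatment coefficient of y ~ treatment + gender (no interaction)."""
    num = den = 0.0
    for gender in ("female", "male"):
        n0, m0 = cells[(gender, 0)]
        n1, m1 = cells[(gender, 1)]
        weight = n0 * n1 / (n0 + n1)  # small cells get little weight
        num += weight * (m1 - m0)
        den += weight
    return num / den

crude = group_mean(1) - group_mean(0)
adjusted = adjusted_effect()
print(f"crude difference:    {crude:.3f}")
print(f"adjusted for gender: {adjusted:.3f}")
```

With these made-up means the within-gender effect is the same for both genders, yet the crude difference is larger than the adjusted one, simply because men are concentrated in one arm. If the two numbers were close in the real data, dropping gender would change little.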

But once again this is given you have a biologically plausible rationale for biological sex affecting hemoglobin levels. Which is possible, given menstruation (given ages) or diet, etc. But gender may also be a latent variable/construct. But does biological sex affect treatment? If not, you are just adding another independent predictor - which can be beneficial for reducing variance, but your sample size is small so you have that issue.
It is an assignment, so it is not something that will end up in a journal. Nevertheless, I thought it was important for me to understand what the data was telling me. So thank you very much for your helpful responses, it really helped me a lot. I now have a much better understanding of what it means to include an interaction term in a regression analysis and I will definitely include the limitations as you suggested.


Less is more. Stay pure. Stay poor.
To blow your mind even more, there are two types of interactions that can be in a model: multiplicative, which you did, and additive. They presume that the combination of two terms has an effect on the multiplicative or the additive scale, so the presence of both is greater than adding their individual effects or greater than multiplying their individual effects.
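
A tiny worked example of why the scale matters, with hypothetical mean outcomes for two binary factors A and B (the numbers are invented purely for illustration). The additive check compares the joint effect with the sum of the individual differences; the multiplicative check compares the joint ratio with the product of the individual ratios.

```python
# Hypothetical mean outcomes: y_ab is the mean when A = a and B = b.
y00, y10, y01, y11 = 1.0, 2.0, 3.0, 6.0

# Additive scale: joint difference minus the sum of the individual differences.
# Zero means no interaction on the additive scale.
additive_contrast = (y11 - y00) - ((y10 - y00) + (y01 - y00))

# Multiplicative scale: joint ratio divided by the product of individual ratios.
# One means no interaction on the multiplicative scale.
multiplicative_ratio = (y11 / y00) / ((y10 / y00) * (y01 / y00))

print("additive contrast:   ", additive_contrast)
print("multiplicative ratio:", multiplicative_ratio)
```

Here the same four means show an interaction on the additive scale (the contrast is not zero) but none on the multiplicative scale (the ratio is one), so whether an "interaction" is present depends on the scale you choose.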

Also, what you have isn't actually an interaction in the nuanced sense. What you reference is actually effect modification. Interaction is where the predictors impact each other (think pharmaceuticals); effect modification is when the effect of one variable differs across levels (typically categories) of another.