Interaction effects regression with dummy variables

Hello everyone,

I am analysing the effect of export diversification during the financial crisis.
My dependent variable is log(GDP per capita), and I have several independent variables.
To measure diversification I calculated an index called HI, which takes values from 0 to 1. There are three different ways to measure diversification and therefore three different indices (HI1, HI2, HI3). I also created a dummy variable for the time period before and after the start of the financial crisis (years 2004, 2005, 2006 --> crisis_dummy=0; years 2007, 2008, 2009 --> crisis_dummy=1).
To find out whether export diversification influences log(GDP per capita) before and during the crisis, I want to estimate the following equation:
log(GDPpc) = a*crisis_dummy + b1*HI1 + b2*HI2 + b3*HI3 + d1*crisis_dummy*HI1 + d2*crisis_dummy*HI2 + d3*crisis_dummy*HI3 + (2 more independent variables as control variables).

My questions are:
1. Is this a proper estimation equation for my research question (does a lower HI help to increase, or at least not decrease, log(GDP per capita) during the financial crisis)?
2. Is the use of the dummy variable correct?
3. Besides multicollinearity, do you see other potential problems?

I hope I have explained my question properly and that somebody can help me with my problem.

Thank you


Less is more. Stay pure. Stay poor.
So you have one value of, say, H1 before and after the crisis, not a traditional time series? Yes, multicollinearity is an issue for sure. Why can't you run three models? Also, are the H values centrally located between 0 and 1, or are they clustered near 0 or 1?
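A quick way to see how serious the multicollinearity concern is would be to look at the pairwise correlations between the three indices. The following is only a minimal sketch: the index values are made up for illustration, and `pearson` is a hypothetical helper name, not something from the original posts.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Made-up index values for five countries (illustration only).
hi = {
    "HI1": [0.10, 0.25, 0.40, 0.55, 0.80],
    "HI2": [0.12, 0.22, 0.45, 0.50, 0.85],
    "HI3": [0.70, 0.20, 0.55, 0.10, 0.35],
}

# Correlation for every pair of indices.
names = list(hi)
corr = {}
for i, a in enumerate(names):
    for b in names[i + 1:]:
        corr[(a, b)] = pearson(hi[a], hi[b])
        print(f"corr({a}, {b}) = {corr[(a, b)]:.2f}")
```

If two indices turn out to be very highly correlated, putting them (and their interactions with the dummy) in the same model will make the individual coefficients hard to pin down, which is one argument for fitting one model per index.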
Thank you for the quick response.
OK, so you mean something like:
log(GDPpc) = a*crisis_dummy + b1*HI1 + d1*crisis_dummy*HI1 + (2 more independent variables as control variables)
log(GDPpc) = a*crisis_dummy + b2*HI2 + d2*crisis_dummy*HI2 + (2 more independent variables as control variables)
log(GDPpc) = a*crisis_dummy + b3*HI3 + d3*crisis_dummy*HI3 + (2 more independent variables as control variables) ?

Yes, the index values lie between 0 and 1. Why does that matter?


Less is more. Stay pure. Stay poor.
I was referring to:

model 1: y = intercept + time + H1 + time*H1 + Error
model 2: y = intercept + time + H2 + time*H2 + Error
model 3: y = intercept + time + H3 + time*H3 + Error
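To make the interpretation of one of these models concrete: the slope of H is the coefficient on H when time = 0, and the coefficient on H plus the coefficient on time*H when time = 1. Here is a rough sketch in plain Python. The data are synthetic and noise-free, the "true" coefficients are made up, and `ols` is a hypothetical helper that solves the normal equations; in practice you would use a statistics package that also reports standard errors.

```python
def ols(X, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Synthetic observations: (time dummy, H). Values are made up.
obs = [(0, 0.1), (0, 0.3), (0, 0.5), (0, 0.7), (0, 0.9),
       (1, 0.2), (1, 0.4), (1, 0.6), (1, 0.8), (1, 0.95)]

# Made-up "true" coefficients: intercept, time, H, time*H.
true_b = (1.0, -0.2, -0.5, -0.8)

# Design matrix with columns: intercept, time, H, time*H.
X = [[1.0, float(t), h, t * h] for t, h in obs]
y = [sum(b * x for b, x in zip(true_b, row)) for row in X]

b0, b_time, b_h, b_int = ols(X, y)

slope_before = b_h          # effect of H when time = 0
slope_during = b_h + b_int  # effect of H when time = 1
print(f"slope of H before crisis: {slope_before:.3f}")
print(f"slope of H during crisis: {slope_during:.3f}")
```

The interaction coefficient on its own is the difference between the two slopes, so a test of that coefficient asks whether the effect of H changed between the two periods.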

It sometimes becomes difficult to interpret the model when the covariate is bounded: a two-unit increase may not be feasible, or plus or minus one standard deviation of the covariate may not be symmetrical around the variable.
Thank you for the hint. Do you have any advice on how I could overcome that difficulty (literature, other threads in this forum, etc.)?
But you have already helped me a lot, thank you for that!


Less is more. Stay pure. Stay poor.
I don't run many models with such an issue; I wonder if people typically use some type of data transformation. I would also look to your field's literature to see how others have used these variables in their models and how they reported results. You always have to be cautious, though, since you can come across published studies that still contain poor analyses.
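As one example of the kind of transformation meant here, a logit transform is sometimes applied to variables bounded between 0 and 1 so that they are mapped onto the whole real line. Whether it suits these indices is an assumption, not something established in this thread. In the sketch below, `logit`, `inv_logit`, and the `eps` clipping constant are illustrative choices.

```python
import math

def logit(h, eps=1e-6):
    """Map an index in [0, 1] to the real line; eps keeps 0 and 1 finite."""
    h = min(max(h, eps), 1 - eps)
    return math.log(h / (1 - h))

def inv_logit(z):
    """Map a logit-scale value back to the (0, 1) interval."""
    return 1 / (1 + math.exp(-z))

# Made-up index values, including some close to the bounds.
values = [0.02, 0.10, 0.50, 0.90, 0.98]
transformed = [logit(h) for h in values]

for h, z in zip(values, transformed):
    print(f"H = {h:.2f} -> logit(H) = {z:+.3f}")
```

The cost is that coefficients are then on the logit scale rather than in units of the index. A simpler alternative is to keep the index as it is and report effects for a feasible change, such as 0.1, instead of a full unit.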