graphing an interaction in Stata

#1
Hello,

I am trying to find the correct way to graph an interaction effect between two continuous variables in Stata. My regression code is

nestreg: regress y (c1 c2) (x1 x2) (x1_x_x2)

with y the outcome variable, c1 and c2 the controls, x1 and x2 the independent variables, and x1_x_x2 a product term created with 'generate x1_x_x2 = x1*x2'. All coefficients are statistically significant, as are the two R2-change values.

Next, I created a new variable called x2_groups, sorted the data by x2 values, and then divided the variable into low (1), medium (2), and high (3) groups.

To graph the interaction, I used the following code:

twoway (lfit y x1 if x2_groups==1) (lfit y x1 if x2_groups==2) (lfit y x1 if x2_groups==3), ytitle(y) xtitle(x1) legend(label(1 "x2 low") label(2 "x2 medium") label(3 "x2 high"))

This produces the attached graph.

However, I was cautioned that this was not the correct way to do this, and I have been searching for the proper graphing method. Should I use the predicted value of y from the model, i.e., by placing

predict pr_y

after the model and substituting this new pr_y value for y in the graph?

I appreciate any help that can be offered about this. Please ask any questions and I can clarify if I have not been clear.

Thank you.
 

bukharin

RoboStataRaptor
#2
I agree that this is not the correct approach. In the graph, all you've done is fit a separate simple linear regression of y on x1 within each of the 3 groups of x2. This is equivalent to the model:
regress y c.x1##i.x2_groups

(the ## means a factorial interaction - c.x1, i.x2_groups and their interaction. The c. tells Stata to treat x1 as a continuous variable and the i. tells Stata to treat x2_groups as a categorical variable)

In other words, it forces your x2 into groups rather than modelling the continuous relationship specified in your original model, which is:
regress y c.x1##c.x2
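
As a quick check, here is a minimal sketch suggesting that the ## notation fits the same model as a hand-made product term. It assumes Stata 11 or later (where factor-variable notation is available) and uses the auto dataset, with price, mpg and weight as stand-ins for y, x1 and x2. The names mpg_x_weight and b_manual are made up for illustration.

```stata
sysuse auto, clear

* hand-made product term, as in the original post
generate mpg_x_weight = mpg*weight
regress price mpg weight mpg_x_weight
scalar b_manual = _b[mpg_x_weight]   // keep the interaction coefficient for comparison

* same model written with factor-variable notation
regress price c.mpg##c.weight

* the two interaction coefficients should agree up to rounding
display "manual: " b_manual "   factor-variable: " _b[c.mpg#c.weight]
```

The advantage of the second form is that margins knows mpg and weight enter the interaction, so it can vary them consistently.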

Essentially you need a 3d graph - y as a function of both x1 and x2. You can create 3d graphs in Stata (there are different techniques), or an alternative might be to plot y vs x1 at representative values (rather than groups) of x2.

Here's a simple example using the auto dataset:
Code:
sysuse auto, clear
regress price c.mpg##c.weight
sum mpg weight // to determine values to plot
margins, at(weight=2000 mpg=(12 41)) at(weight=3000 mpg=(12 41)) at(weight=4000 mpg=(12 41))
marginsplot, noci plotopts(msymbol(none))
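
For the "3d" option mentioned above, one possibility is a contour plot of the predicted values over a grid of the two predictors. This is only a sketch: it assumes Stata 12 or later (for twoway contour), the 30 x 30 grid size is arbitrary, and price_hat and the scalar names are made up for illustration.

```stata
sysuse auto, clear
regress price c.mpg##c.weight

* keep the coefficients and the observed ranges before replacing the data
scalar b0 = _b[_cons]
scalar b1 = _b[mpg]
scalar b2 = _b[weight]
scalar b3 = _b[c.mpg#c.weight]
summarize mpg
local mpg_min = r(min)
local mpg_max = r(max)
summarize weight
local wt_min = r(min)
local wt_max = r(max)

* build a 30 x 30 grid spanning the observed ranges of mpg and weight
clear
set obs 900
generate mpg    = `mpg_min' + (`mpg_max' - `mpg_min') * mod(_n - 1, 30) / 29
generate weight = `wt_min'  + (`wt_max'  - `wt_min')  * floor((_n - 1) / 30) / 29

* predicted price at each grid point, computed directly from the coefficients
generate price_hat = scalar(b0) + scalar(b1)*mpg + scalar(b2)*weight + scalar(b3)*mpg*weight

twoway contour price_hat weight mpg, ytitle("weight") xtitle("mpg") title("Predicted price")
```

Note that this replaces the data in memory with the grid, so wrap it in preserve/restore if you need the original data afterwards.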
 
#3
I don't have Stata 12, so marginsplot doesn't work.

How can I create my 3 lines to show the interaction (line 3) between continuous variable 1 (line 1) and continuous variable 2 (line 2)?
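
One possible way to draw such lines without margins or marginsplot is to plug the estimated coefficients into twoway function, which is available in older versions of Stata. The sketch below mirrors the auto example above (price vs mpg at weight = 2000, 3000 and 4000). It uses a hand-made product term so it does not depend on factor-variable notation. The name mpg_x_weight is made up for illustration, and any control variables in your own model would need to be held at fixed values (for example their means) inside each function.

```stata
sysuse auto, clear
generate mpg_x_weight = mpg*weight
regress price mpg weight mpg_x_weight

* store the coefficients so they can be substituted into the plotted functions
local b0 = _b[_cons]
local b1 = _b[mpg]
local b2 = _b[weight]
local b3 = _b[mpg_x_weight]

* one predicted line per representative value of weight; x stands for mpg
twoway (function y = `b0' + `b1'*x + `b2'*2000 + `b3'*x*2000, range(12 41)) ///
       (function y = `b0' + `b1'*x + `b2'*3000 + `b3'*x*3000, range(12 41)) ///
       (function y = `b0' + `b1'*x + `b2'*4000 + `b3'*x*4000, range(12 41)), ///
       ytitle("Predicted price") xtitle("mpg") ///
       legend(order(1 "weight = 2000" 2 "weight = 3000" 3 "weight = 4000"))
```

The slopes of the three lines differ only because of the product term, so the spread between them is the interaction being visualised.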