Help on interpreting linear regression

Lukan27

New Member
Help on interpreting linear regression estimates

I'm currently working on a larger assignment, and I need some input on how to interpret the results from my linear regression. I'm fairly sure I've got it right, but as a precaution some input would be lovely.

I've searched the web and thought a lot about this, but I can't seem to reach a final conclusion.

This is a part of my regression (which is enough to illustrate the point):

Code:
> summary(fittest1)

Call:
lm(formula = homicides_any_method ~ gdp * education, data = raw_data)

Residuals:
   Min     1Q Median     3Q    Max
 -3000  -2082  -1399   -124  42265

Coefficients:
                Estimate Std. Error t value Pr(>|t|)
(Intercept)    3.442e+03  1.234e+03   2.790  0.00593 **
gdp           -6.125e-02  1.686e-01  -0.363  0.71692
education     -1.201e+02  1.690e+02  -0.711  0.47828
gdp:education  1.845e-03  1.528e-02   0.121  0.90404
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 5372 on 158 degrees of freedom
(15 observations deleted due to missingness)
Multiple R-squared:  0.03071,	Adjusted R-squared:  0.0123
F-statistic: 1.669 on 3 and 158 DF,  p-value: 0.176

> summary(fittest2)

Call:
lm(formula = log_any_homicide_rate ~ log_gdp * log_education,
    data = raw_data)

Residuals:
    Min      1Q  Median      3Q     Max
-2.2483 -0.6268 -0.0109  0.4907  2.7438

Coefficients:
                      Estimate Std. Error t value Pr(>|t|)
(Intercept)             1.3101     1.8372   0.713  0.47684
log_gdp                 0.4819     0.2856   1.687  0.09355 .
log_education           2.6292     0.8446   3.113  0.00220 **
log_gdp:log_education  -0.3949     0.1245  -3.173  0.00181 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.998 on 158 degrees of freedom
(15 observations deleted due to missingness)
Multiple R-squared:  0.3181,	Adjusted R-squared:  0.3052
F-statistic: 24.57 on 3 and 158 DF,  p-value: 4.201e-13

Well, the first model uses the raw numbers, and the second has a logarithm applied to all variables. I quickly saw that without transforming the data, everything would be quite useless. As we can see, model 1 is far from significant, neither overall nor for any individual variable. Model 2 is the opposite, with the exception of log_gdp, which comes close (p = 0.094). So there's really no doubt that model 2 is far better. But I'm confused about the estimates of the individual variables and their interaction.

See, as I'm interpreting it, in model 1 we have an estimate of -6.125e-02 on gdp, -1.201e+02 on education and (positive) 1.845e-03 on their interaction. This is where I'm unsure: for every one-unit increase in gdp, homicides change by -6.125e-02, and for every one-unit increase in education they change by -1.201e+02. This makes sense, since we should assume that wealth and education mean less tendency to commit homicide. But what about the interaction? Does gdp:education mean a 1.845e-03 increase in homicides? So both of these in combination mean an increase in homicides? This makes no sense, at least regarding our assumption/theory that these two factors should reduce crime/homicides.
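
To make the question concrete, here is a minimal sketch of how the fitted model 1 combines these terms. It relies on the fact that `gdp * education` in an R formula expands to `gdp + education + gdp:education`, so the gdp slope is `b_gdp + b_int * education` rather than a single number. The coefficients are copied from `summary(fittest1)`; the education values and the helper name `gdp_slope` are hypothetical, chosen only for illustration and not taken from the data.

```r
# Coefficients copied from summary(fittest1) above
b_gdp <- -6.125e-02
b_edu <- -1.201e+02
b_int <-  1.845e-03

# Change in predicted homicides per one-unit increase in gdp,
# evaluated at a given education level (hypothetical helper)
gdp_slope <- function(education) b_gdp + b_int * education

# Hypothetical education values, for illustration only
edu_values <- c(0, 5, 10, 15)
print(data.frame(education = edu_values, gdp_slope = gdp_slope(edu_values)))
```

In this reading the gdp coefficient alone is the gdp slope when education is 0, and the interaction coefficient says how much that slope shifts per unit of education. Given the standard errors in the summary, none of these terms is distinguishable from zero in model 1 anyway.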

It's essentially the same problem in model 2, just inverted: now log_gdp:log_education is negative, but log_gdp and log_education are positive. So in model 2, do log_gdp and log_education each mean an x% increase in homicides, while their interaction means an x% decrease?

And why does the log transformation seemingly cause this inversion? Is it because that's the true/real interaction effect, or something else?

Any help appreciated.

Lukan27

New Member
You know what, I think I figured it out. Since model 2 is log-transformed, to state an x% change in a variable you have to reverse the log transformation. E.g., take a 1% increase in a variable and ask what change that means for the dependent variable: the log_gdp coefficient is 0.4819, so 1.01^0.4819 = 1.0048 = 0.481% = 1/0.481 = 2.08%, so a 1% increase in GDP means a 2.08% increase in homicides, everything else held constant. However, the log_gdp and log_education interaction results in a 2.55% decrease, so in total 2.08% + 0.38% - 2.55% = -0.09%! So a 1% increase in both GDP and education means (on average, I guess) a 0.09% decrease in homicides!
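
A quick sketch to sanity-check this arithmetic, since I'm not sure about the reciprocal step. It reads the percentage directly off 1.01^coefficient (so 1.01^0.4819 is roughly 1.0048, i.e. about 0.48%), and it evaluates the GDP effect at a given level of log_education, because with an interaction the log_gdp slope is `b_gdp + b_int * log_education`. The coefficients are copied from `summary(fittest2)`; the helper names and the value `log_edu <- 2` are hypothetical and not taken from the data.

```r
# Coefficients copied from summary(fittest2) above
b_gdp <-  0.4819
b_edu <-  2.6292
b_int <- -0.3949

# Percent change in the response implied by a `pct` percent change in a
# predictor, for a log-log slope (elasticity) `e` (hypothetical helper)
pct_change <- function(e, pct = 1) ((1 + pct / 100)^e - 1) * 100

# Slope of log_any_homicide_rate with respect to log_gdp at a given
# log_education (hypothetical helper)
gdp_elasticity <- function(log_education) b_gdp + b_int * log_education

# Effect of a 1% gdp increase using the main-effect coefficient alone,
# which corresponds to log_education = 0
print(pct_change(b_gdp))

# Same effect at a hypothetical log_education value
log_edu <- 2
print(pct_change(gdp_elasticity(log_edu)))
```

Under these assumptions there is no single percentage for the GDP effect: it changes with log_education, so adding the three back-transformed coefficients together may not be the right way to combine them.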

Correct me if I'm wrong..