Multiple Regression on SPSS - Correlations and Coefficients

#1
Sorry if this question sounds laughable to any of you. I'm really new to stats and find it quite challenging.

I'm a 3rd year Biology student and am investigating the effect of several factors on the number of bats that pass the detector device each night.

I have run a multiple linear regression with bat passes as the dependent variable, and temperature, night length, and temperature/night length interaction (temperature*night length) as independent variables.

The SPSS output gives me a 'Correlations' table and a 'Coefficients' table. I am really struggling to understand how for example passes and temperature have a statistically significant relationship under the correlations table, but are not statistically significant on the coefficients table.

If someone could please explain this to me in very plain layman's terms I would be extremely grateful.

Thanks so much.
 

ondansetron

TS Contributor
#2
Never apologize for asking questions, and don't worry if you feel that they're at a lower level (it's all relative). Experience and questioning is how we all learn, so definitely step up to the plate!

If you're comfortable, post some of your output that's confusing, we might be able to help more quickly (if not, it's alright).

Otherwise, let's start with this: what's the p-value for the coefficient on temperature*nightlength? If the interaction is significant, it doesn't make sense to test temperature or night length individually, because the significant interaction already tells us that both are needed for prediction. In other words, don't let the nonsignificant temperature bother you (and don't remove it from the model) if an interaction of temperature with some other term is significant. (Also, interaction terms can introduce multicollinearity into a model, which inflates the standard errors, so main effects like temp might appear nonsignificant.)

If the interaction is not significant, the nonsignificant temperature coefficient could still be due to multicollinearity (i.e. night length might carry the information provided by temperature, plus additional information for predicting the number of passes).
 
#3
This is the most promising start I've had all day, thank you so much. Your explanation of the interaction effect was really clear.

Please find attached two screenshots from the SPSS output of the multiple regression; one of the coefficients table and one of the correlations table.

I think my confusion is stemming from the fact that from what I understand, in the most simple terms: correlation tells me if a relationship exists, and regression tells me how strong the relationship is, if there is one.

How therefore, can I have a correlation between temperature and passes of .535, with a p-value of .002, but the coefficient for the same relationship is not significant? In other words, how can I have a significant relationship in which the factors have no significant effect on each other?

I feel like I'm missing something really important and likely quite obvious, but I just can't make sense of it.
 

ondansetron

TS Contributor
#4
This is the most promising start I've had all day, thank you so much. Your explanation of the interaction effect was really clear.
Glad it was helpful. Feel free to hit the "thanks" button if you'd like!

Also, just to clarify, was the global F-test significant? (I presume it was, but we should only go further if it is.)

I think my confusion is stemming from the fact that from what I understand, in the most simple terms: correlation tells me if a relationship exists, and regression tells me how strong the relationship is, if there is one.
The Pearson correlation will tell you whether a linear relationship exists and its strength. You could have a very strong non-linear relationship that has a Pearson correlation coefficient close to zero (think of a parabola). The regression allows you to examine the relationships and the magnitude of change in Y given a one-unit increase in a particular X variable, after accounting for the other X variables in the model. In other words, it allows you to say, "If we have temperature in the model, how does passes change with a one-hour increase in night length (assuming you measure night length in hours), i.e. what is left over?" For my example, assume there was no interaction, because that would complicate the example more than needed right now.
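To make the parabola point concrete, here is a minimal sketch in Python (not SPSS). The data are made up for illustration and `pearson_r` is a hand-written helper, not anything from your output. It shows a perfect but non-linear relationship producing a Pearson correlation of essentially zero, next to a perfectly linear one.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: x is symmetric around zero, and y is fully determined by x.
x = [i / 10 for i in range(-50, 51)]
y_parabola = [v ** 2 for v in x]      # perfect relationship, but not linear
y_line = [2 * v + 1 for v in x]       # perfect linear relationship

r_parabola = pearson_r(x, y_parabola)  # essentially 0
r_line = pearson_r(x, y_line)          # essentially 1
print(f"parabola: r = {r_parabola:.3f}   line: r = {r_line:.3f}")
```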

How therefore, can I have a correlation between temperature and passes of .535, with a p-value of .002, but the coefficient for the same relationship is not significant? In other words, how can I have a significant relationship in which the factors have no significant effect on each other?
See my explanation above. Once you put it into a regression with additional independent variables (X variables), you're examining the relationship of Passes and Temperature after accounting for the other terms in the model.
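A small simulation may help show what "after accounting for the other terms" means. This is only a sketch under assumptions: the numbers are invented (not your bat data), passes are generated from night length alone, and temperature merely tracks night length. The marginal correlation between temperature and passes comes out strong, yet once night length is removed from both variables there is almost nothing left.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def residualize(y, x):
    """Residuals of y after a simple linear regression on x (with intercept)."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return [(b - my) - slope * (a - mx) for a, b in zip(x, y)]

random.seed(1)
n = 1000
# Assumed data-generating process (hypothetical, for illustration only):
night = [random.gauss(9, 1) for _ in range(n)]              # hours
temp = [30 - 2 * h + random.gauss(0, 1) for h in night]     # colder on longer nights
passes = [50 - 5 * h + random.gauss(0, 2) for h in night]   # driven by night length only

# What the Correlations table reports: temperature vs passes, ignoring everything else.
r_marginal = pearson_r(temp, passes)

# What the regression coefficient reflects: the relationship that is left over
# after night length has been removed from both temperature and passes.
r_partial = pearson_r(residualize(temp, night), residualize(passes, night))

print(f"marginal r = {r_marginal:.2f}, r after removing night length = {r_partial:.2f}")
```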

I feel like I'm missing something really important and likely quite obvious, but I just can't make sense of it.
Looking specifically at your regression output, it shows that, after accounting for the main effects, the interaction is not significant. (Similarly, if you look just at temperature, it would say that, after accounting for night length and the interaction of temp and night length, temperature is not significant; however, as I mentioned in an earlier post, it doesn't make any sense to do this.) If you have the interaction (or any higher-order term, such as a square) in the model, you do not test the lower-order term (in this case, temp) while the higher-order term is present. The VIFs for temp and the interaction are very high, which is expected (but is definitely deflating the t-stats for those coefficients by inflating their standard errors). If you do believe that an interaction is reasonable, I would try centering: subtract the mean from temp and from night length before multiplying them, then refit the model with temp and night length as well as the centered interaction. This could help reduce multicollinearity and might give you different results.
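Here is a rough sketch of why centering tends to help, again in Python on invented data (independent, normally distributed temperature and night length; that independence is an assumption, not a claim about your study). The `vif` helper is hand-written for exactly three predictors and computes 1 / (1 - R^2) from regressing one predictor on the other two.

```python
import random

def residualize(y, x):
    """Residuals of y after a simple linear regression on x (with intercept)."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return [(b - my) - slope * (a - mx) for a, b in zip(x, y)]

def vif(target, other1, other2):
    """Variance inflation factor of `target` given two other predictors."""
    other2_perp = residualize(other2, other1)   # part of other2 not explained by other1
    step1 = residualize(target, other1)         # remove other1 from the target
    step2 = residualize(step1, other2_perp)     # then remove what is left of other2
    mean_t = sum(target) / len(target)
    ss_total = sum((t - mean_t) ** 2 for t in target)
    ss_resid = sum(r ** 2 for r in step2)
    return ss_total / ss_resid                  # equals 1 / (1 - R^2)

random.seed(42)
n = 200
# Hypothetical predictors, for illustration only.
temp = [random.gauss(12, 3) for _ in range(n)]    # degrees C
night = [random.gauss(9, 1) for _ in range(n)]    # hours

# Raw interaction: product of the variables as measured.
raw_product = [t * h for t, h in zip(temp, night)]

# Centered interaction: subtract each mean first, then multiply.
mean_t, mean_h = sum(temp) / n, sum(night) / n
centered_product = [(t - mean_t) * (h - mean_h) for t, h in zip(temp, night)]

vif_raw = vif(temp, night, raw_product)
vif_centered = vif(temp, night, centered_product)
print(f"VIF for temp, raw interaction:      {vif_raw:.1f}")
print(f"VIF for temp, centered interaction: {vif_centered:.2f}")
```

Centering changes what the temp coefficient means (the effect of temp at the average night length rather than at a night length of zero), but it leaves the interaction coefficient and the overall fit unchanged.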

If you don't have a strong reason to suspect interaction, I would recommend dropping it from the model, and refitting the model with only temp and night length. My guess is that if you rerun the model without the interaction you will see temp become more significant.
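For what it's worth, that guess can be illustrated with a simulation. This is a sketch under stated assumptions, not a prediction for your data: the passes are generated with real temperature and night length effects and no true interaction, and `t_stat` is a hand-written helper that computes the usual OLS t statistic by removing the other predictors step by step.

```python
import math
import random

def residualize(y, x):
    """Residuals of y after a simple linear regression on x (with intercept)."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return [(b - my) - slope * (a - mx) for a, b in zip(x, y)]

def t_stat(y, target, others):
    """t statistic for `target` in an OLS fit of y on target plus `others`.

    An intercept is always included; `others` must hold at least one predictor.
    """
    # Orthogonalise the other predictors against each other.
    basis = []
    for col in others:
        for b in basis:
            col = residualize(col, b)
        basis.append(col)
    # Remove the other predictors (and the intercept) from y and from the target.
    y_perp, t_perp = y, target
    for b in basis:
        y_perp = residualize(y_perp, b)
        t_perp = residualize(t_perp, b)
    sxx = sum(t * t for t in t_perp)
    beta = sum(t * v for t, v in zip(t_perp, y_perp)) / sxx
    rss = sum((v - beta * t) ** 2 for t, v in zip(t_perp, y_perp))
    dof = len(y) - (len(others) + 2)        # intercept + target + others
    return beta / math.sqrt(rss / dof / sxx)

random.seed(0)
n = 200
# Hypothetical data: both main effects are real, there is no true interaction.
temp = [random.gauss(12, 3) for _ in range(n)]
night = [random.gauss(9, 1) for _ in range(n)]
passes = [2 * t + 3 * h + random.gauss(0, 10) for t, h in zip(temp, night)]
interaction = [t * h for t, h in zip(temp, night)]

t_with = t_stat(passes, temp, [night, interaction])   # model including temp*night
t_without = t_stat(passes, temp, [night])             # model with main effects only
print(f"t for temp with interaction: {t_with:.2f}, without: {t_without:.2f}")
```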

At this point, the choice is yours for the next step.