Two linear regressions or one moderated multiple regression?

mxw

New Member
#1
Hi everyone,

I would REALLY appreciate any help on this since I've exhausted a lot of options for help already.

I have a data set with multiple DVs and two within-subjects IVs (2 x 3 design). Say I want to look at one of the DVs and use regression to examine whether there are any differences between the levels of one IV as a result of the second IV.

Initially, I split my data into two sections according to the levels of one of the IVs. So now I have two sets of data, each with the same remaining IV. I ran linear regressions on these two sets and found that there's a significant effect for one set but not the other. That seems to indicate that there's a difference between the two sets of data, and this difference might be due to the different levels of the IV that we used to split the data.

However, when I ran a multiple regression, I noticed that there were no significant effects or interactions. I have a pretty strong a priori hypothesis for why there might be an interaction though, and the standard errors are pretty big. Is it acceptable to use two linear regressions rather than one multiple linear regression, and could I justify why that might be more appropriate?

My apologies if I didn't explain it too well. Please let me know if clarification is necessary.
 

rogojel

TS Contributor
#2
hi,
this looks very interesting. Could you give us some more info: data or residual plots, matrix plots, output from the SE you used, etc.?

Just as a first idea: strictly speaking, an interaction means a change in the value of the coefficients, not a change in the significance of the effect, so you might not see anything in the multiple regression.
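To make that distinction concrete, here is a minimal sketch in Python. The data are made up for illustration, the p-values use a normal approximation rather than the t distribution, and the two subsets are treated as independent, which a within-subjects design would not strictly satisfy. It fits a slope in each subset and then tests the difference between the two slopes directly.

```python
from math import sqrt
from statistics import NormalDist

def simple_ols(x, y):
    """Return (slope, standard error of the slope) for y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return slope, sqrt(sse / (n - 2) / sxx)

def two_sided_p(z):
    """Two-sided p-value from a z statistic (normal approximation)."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical data: a predictor with three levels, measured in two subsets.
x = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]
y_a = [2.1, 1.7, 2.4, 1.9, 2.9, 2.4, 3.3, 2.6, 3.6, 3.0, 4.1, 3.5]  # low noise
y_b = [2.3, 1.2, 3.0, 1.8, 2.8, 1.6, 3.6, 2.2, 3.2, 1.9, 4.3, 2.7]  # high noise

slope_a, se_a = simple_ols(x, y_a)
slope_b, se_b = simple_ols(x, y_b)
p_a = two_sided_p(slope_a / se_a)
p_b = two_sided_p(slope_b / se_b)

# The interaction question: do the two slopes differ from each other?
z_diff = (slope_a - slope_b) / sqrt(se_a ** 2 + se_b ** 2)
p_diff = two_sided_p(z_diff)

print(f"subset A: slope={slope_a:.3f}, p={p_a:.4f}")
print(f"subset B: slope={slope_b:.3f}, p={p_b:.4f}")
print(f"difference in slopes: z={z_diff:.3f}, p={p_diff:.4f}")
```

In this made-up example one slope is significant and the other is not, yet the test of the difference between the slopes is not significant, because the second subset is simply noisier.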

regards
 
#3
The problem, as I see it, is that the lack of an effect in one stratum could be due to a loss of statistical power when you reduce your sample size (as you do when you split the file). Thus the loss of an effect in one of the regression analyses does not necessarily indicate the presence of an interaction.
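As a rough illustration of the power argument, the sketch below uses a normal approximation to the power of a two-sided test on a regression slope. The effect size, predictor spread, residual spread and sample sizes are all hypothetical values chosen for the example, not numbers from this thread.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(beta, sd_x, sd_resid, n, alpha=0.05):
    """Approximate power of a two-sided test of a regression slope.

    Uses a normal approximation: the expected z statistic is
    beta * sd_x * sqrt(n) / sd_resid.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    expected_z = beta * sd_x * sqrt(n) / sd_resid
    return (1 - z.cdf(z_crit - expected_z)) + z.cdf(-z_crit - expected_z)

# Same true slope and same noise; only the sample size changes.
for n in (120, 60):
    print(n, round(approx_power(beta=0.3, sd_x=0.8, sd_resid=1.5, n=n), 3))
```

With the effect held fixed, halving the number of observations lowers the chance of detecting it, so a non-significant result in one half of a split file is weak evidence that the effect is absent there.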

Effect modification can be thoroughly examined in a table, as recommended in this article: http://ije.oxfordjournals.org/content/41/2/514.short. However, I am not sure if it will solve your issue. It will take some time to read and understand, so I suggest you skip it if you have little time.

Also, the thought crossed my mind that you should make sure you included the interaction term correctly in the regression model. At what measurement level were the two measures, and how did you create the interaction term (in which statistical package, for instance)?
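For reference, a moderated regression adds the product of the two predictors as its own column. The sketch below is a minimal pure-Python illustration with made-up data and hypothetical variable names (`time`, `cond`, `dv`). It ignores the repeated-measures structure, so it shows only how the term is built and what its coefficient means, not a full analysis.

```python
def solve(a, b):
    """Solve the linear system a * coef = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical data: time has three levels, cond is a dichotomous IV coded 0/1.
time = [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3]
cond = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
dv = [2.0, 2.9, 3.4, 1.8, 2.5, 3.6, 2.2, 2.4, 3.1, 1.5, 2.3, 2.2]

# Columns: intercept, time, cond, and the interaction term time * cond.
design = [[1.0, t, c, t * c] for t, c in zip(time, cond)]
b0, b_time, b_cond, b_inter = ols(design, dv)

# Slopes from fitting each condition separately, for comparison.
def slope_within(level):
    rows = [[1.0, t] for t, c in zip(time, cond) if c == level]
    ys = [d for d, c in zip(dv, cond) if c == level]
    return ols(rows, ys)[1]

slope0, slope1 = slope_within(0), slope_within(1)
print(f"b_time={b_time:.3f}  b_inter={b_inter:.3f}")
print(f"slope0={slope0:.3f}  slope1={slope1:.3f}  diff={slope1 - slope0:.3f}")
```

With a 0/1 moderator, the coefficient on the product term equals the difference between the two separately fitted slopes, so its significance test answers the interaction question directly. Entering the two predictors without the product column gives a main-effects-only model.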
 

mxw

New Member
#4
Thanks for the responses. I'll try to post some more information on the data soon.

I'm not well versed in stats since we mostly use it as a means to an end, but I'm not sure if the sample size necessarily changes when I split the data. It's been split by a within-subjects IV--so the number of data points for the DV along every IV remains the same. I did notice that the df changes, however, from 170 (when the data was split) to 140ish (when I did a multiple regression on all of the data).

I tried doing a moderated regression through SPSS where I basically entered the two variables in different blocks (https://statistics.laerd.com/spss-tutorials/img/mcd/regression-all-transferred.png). I've also done RM ANOVAs and found the same thing (no surprises there). The IV I split the data by was dichotomous and my other IV was interval, although there were only 3 possible values for the interval IV.
 

rogojel

TS Contributor
#5
"I'm not sure if the sample size necessarily changes when I split the data. It's been split by an within-subjects IV--so the number of data points for the DV along every IV remains the same. I did notice that the df changes however, from 170 (when the data was split) to 140ish (when I did a multiple regression on all of the data)."
Just to clarify this: my understanding is that you have something like a set of triplets M = {(Y, X1, X2)}. Splitting according to the levels of X2 would mean that you now have two sets, M1 = {(Y, X1, X2) | X2 < L} and M2 = {(Y, X1, X2) | X2 >= L}. This necessarily means that you have fewer elements in each set than in M (unless one of the sets is empty, in which case the splitting makes no sense).

Is this the case?
 

mxw

New Member
#6
Sorry if I wasn't too clear on that. You're right though, if I split the data then there will be fewer elements in both sets.
 

mxw

New Member
#7
I did some more reading and realized that I actually hadn't included an interaction term at all in the multiple regression. I should also add that I'm doing a moderated multiple regression.

I've attached the output from SPSS and you can see that p < .05 for Time for one of the regressions (I've highlighted it) but not the other, and there are no significant interactions.

Again, I'd really appreciate it if someone could shed some light on why this is happening and what it could possibly mean.