Mixed Model: Subjects vs. Random Factor

#1
Hi,
When using SPSS to do a Linear Mixed Model it first presents a dialogue box with boxes for Subjects and Repeated Measures.

Later on, when you build your model and specify the random factors, SPSS lists the variables you entered at the beginning at the bottom of the dialogue box. If you move one of these to the right-hand 'Combinations' box, the procedure adds this variable as a SUBJECT specification in the syntax.

My question is this: What is the difference between adding your individual's subject identifier here and adding the ID as a main-effect random factor (which would be /RANDOM in the syntax)?

If I don't have the SUBJECT specification, is SPSS doing anything with the variables that I entered at the very beginning of the procedure?

Many Thanks,

S.
 
#2
I've never used the dialog boxes for Mixed (for this exact reason--it's too hard to tell what the heck they're doing).

So are you basically asking about the difference between using a /Random subcommand and a /Repeated? You can have a |Subjects option in both.

They're doing totally different things, though. The Repeated statement adjusts the residuals to take into account that they're correlated within the same Subject. (And you can choose the exact pattern of correlations by specifying the covariance structure with COVTYPE.)

The Random statement is actually creating a new parameter in the model to take the effect of Subject into account. In a simple one-way repeated measures model (with or without a between subjects factor), you can actually get the exact same result either way, if you specify the covariance matrices right. But if your model gets complicated, you have to be more careful.
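To make the "exact same result either way" remark concrete, here is a minimal Python sketch (not SPSS output) of the within-subject covariance implied by each approach in the simple case. The function names and the variance values are hypothetical, chosen only for illustration. It assumes a single random intercept on the Random side and a compound-symmetry structure on the Repeated side.

```python
# Sketch: covariance among one subject's repeated measures under two setups.
# All names and numbers here are hypothetical illustrations.

def random_intercept_cov(n_times, intercept_var, residual_var):
    """Random intercept: every pair shares intercept_var; the diagonal adds residual_var."""
    return [
        [intercept_var + (residual_var if i == j else 0.0) for j in range(n_times)]
        for i in range(n_times)
    ]

def compound_symmetry_cov(n_times, common_cov, extra_diag_var):
    """Compound symmetry: common_cov off the diagonal, common_cov + extra_diag_var on it."""
    return [
        [common_cov + extra_diag_var if i == j else common_cov for j in range(n_times)]
        for i in range(n_times)
    ]

v_random = random_intercept_cov(3, intercept_var=2.0, residual_var=1.0)
v_repeated = compound_symmetry_cov(3, common_cov=2.0, extra_diag_var=1.0)

same = v_random == v_repeated
print(same)
```

One caveat under these assumptions: a compound-symmetry covariance can be negative, while a random-intercept variance cannot, so the two only coincide when the common covariance is non-negative.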

If you describe your model more specifically (and it's not too complicated), I can try to help you specify it. This gets pretty tricky.

Karen
 
#3
Hi Karen,
Thanks so much for your reply.

I think I'm actually concerned just with the Random statement. I'm using it, as you say, to take the effect of my subject into account. However, I don't know what the difference between these two statements is:

/RANDOM=fullgroup Individual | COVTYPE(VC).

and


/RANDOM=fullgroup | SUBJECT(Individual) COVTYPE(VC).

In the first I have the individual and their group as main effects, whereas in the second the individual is listed using the SUBJECTS statement. What is different between these two?

Many Thanks,

S.
 
#4
The first one is creating a random intercept for both Fullgroup and Individual.

Since you're not specifying a subject, there must be more than one observation for each fullgroup and for each individual, or it wouldn't run.

The second one is creating a random slope for Fullgroup (and is specifically leaving out a random intercept) across Individuals.

It would be the same as saying:

/RANDOM=fullgroup*Individual | COVTYPE(VC).

In case you're not familiar with the random intercept/random slope terminology (and many people who primarily use ANOVA aren't), what this essentially means is the first is adding a parameter to the model to specifically measure the variance of the Fullgroup measures AND a parameter to measure the variance among individuals.

The second one is saying the slope of fullgroup on the DV differs across individuals, and the extra parameter is measuring how much the slope differs.

Either is a plausible model in specific situations. The first is more common, but I did just see an example of the second. It's pretty rare, though, to fit a random slope without a random intercept. (In the case I saw, the intercepts couldn't vary, because they were all, by definition, 0).
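As a rough illustration of the difference between the two /RANDOM specifications, the Python sketch below lists which random effects each one would estimate for a small made-up data set. The rows, level names, and function names are hypothetical, and the sketch assumes fullgroup is a categorical factor whose levels vary within individuals. It only enumerates the effects; it does not fit a model.

```python
# Sketch: random effects implied by each /RANDOM specification.
# Hypothetical data: 2 fullgroup levels x 3 individuals, each pair observed twice.
rows = [(g, i) for g in ("g1", "g2") for i in ("a", "b", "c")] * 2

def main_effect_columns(rows):
    """/RANDOM=fullgroup Individual: one effect per level of each factor."""
    groups = sorted({g for g, _ in rows})
    people = sorted({i for _, i in rows})
    return [("fullgroup", g) for g in groups] + [("Individual", i) for i in people]

def within_subject_columns(rows):
    """/RANDOM=fullgroup | SUBJECT(Individual): fullgroup effects within each individual."""
    cols = []
    for person in sorted({i for _, i in rows}):
        for g in sorted({g for g, i in rows if i == person}):
            cols.append((g, person))
    return cols

def interaction_columns(rows):
    """/RANDOM=fullgroup*Individual: one effect per observed combination."""
    return sorted(set(rows))

print(len(main_effect_columns(rows)))
print(len(within_subject_columns(rows)))
print(set(within_subject_columns(rows)) == set(interaction_columns(rows)))
```

In this toy layout the first specification yields five effects (two for fullgroup, three for Individual), while the second yields six, one per fullgroup-by-individual combination, which matches the set produced by the interaction form.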

Hope that helps and doesn't confuse. It took me years to figure this stuff out.

Karen
 
#6
I was wondering if you could offer me some help, Karen. I am trying to do a nested repeated measures mixed model. My design is very messy, has a small sample size, and is unbalanced. I am trying to figure out how to make it work so I understand it in SPSS. I think I am close but could use a boost in confidence! Here are the basics:

I am doing a Before-After-Control-Impact design.

Treatment 1 (impact) - 2 streams - measurements taken at 3 stations per stream
Treatment 2 (control - no treatment) - 3 streams - measurements taken at 3 stations per stream

Measurements were taken each year for 5 years (within subject factor) but there are many missing values (some stations were dry during certain years).

Before - 2009-2011
After - 2012-2013 (Treatment 1 was applied after 2011 measurements)

I have tried building my model in SPSS like this:

(the fixed term is the interaction term I am mostly interested in between Before-After and Control Impact)

FIXED = BA*CI
METHOD = REML
RANDOM = INTERCEPT station(stream) | SUBJECT (station) COVTYPE (VC)
REPEATED = year | SUBJECT (stream*station) COVTYPE (AR1)

For the random factor, do I want to include the intercept?
And for the random subject grouping, do I group by stream or station? When I group by:
Station: DF = 24.127 F = 6.23 Sig = 0.003
Stream: DF = 5.784 F = 4.139 Sig = 0.068

I am thinking I will use alpha 0.10 instead of 0.05 because of the small sample size and lots of noise in my data.

Any advice or input you or anyone else may have would be so appreciated!! My other option is to average my stations to get one value per stream per year to simplify the model, but my supervisor prefers that I make the nested version work to represent the data better.
 
#7
Hi Andrea,

This one is tricky because you have two levels of repeats. One is before/after the intervention and the other is the repeat over time within a condition.

In other words, you have a 3 level model.

I don't like giving too much advice on how to analyze these without getting every nitty-gritty detail. Details make a big difference in these models.

I offer an online workshop that explains all of this, and it's 16 hours of instruction. There's just a lot to it.

But your random statement doesn't make a lot of sense. It's saying that each station has a different effect of station within a stream.

At the very least, keep the intercept in the random statement, but remove the station(stream). You may also need a random slope in there, but it would be for something like CI, not station(stream).

The hard part is that either station or stream could be included as the subject and make sense in certain situations. That's why I would need to ask you 20 questions and possibly run some graphs and see what's going on in the data.

Also, if you include an interaction in the fixed statement, you also need to include the main effects.
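One way to see why the main effects belong alongside the interaction is to write out what each term represents in a 2x2 Before-After by Control-Impact layout. The Python sketch below uses hypothetical cell means and assumes dummy coding with control/before as the reference cell (SPSS may pick a different reference category); the function name is also made up for illustration.

```python
# Sketch: what BA, CI, and BA*CI represent under dummy coding.
# Cell means are hypothetical; control/before is the assumed reference cell.
cell_means = {
    ("control", "before"): 10.0,
    ("control", "after"): 11.0,
    ("impact", "before"): 10.5,
    ("impact", "after"): 14.0,
}

def baci_effects(means):
    """Split the four cell means into intercept, main effects, and interaction."""
    intercept = means[("control", "before")]
    ba = means[("control", "after")] - intercept      # after vs. before, control streams
    ci = means[("impact", "before")] - intercept      # impact vs. control, before period
    # Interaction: the before-to-after change in impact minus the same change in control.
    interaction = (means[("impact", "after")] - means[("impact", "before")]) - ba
    return {"intercept": intercept, "BA": ba, "CI": ci, "BA*CI": interaction}

effects = baci_effects(cell_means)
print(effects)

# All four terms are needed to rebuild the impact/after cell mean.
rebuilt = sum(effects.values())
print(rebuilt == cell_means[("impact", "after")])
```

Under these assumptions the interaction is a difference of differences, and it only has that meaning when BA and CI are in the model to absorb the two baseline differences.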

Here are a few free resources I've put together on this to help you get started. I particularly recommend the webinar, as it shows SPSS syntax. It's a simpler example than this one, but like I said, it should get you started understanding what each part of the random statement is doing.

http://www.theanalysisfactor.com/random-intercept-and-random-slope-models-webinar/
http://www.theanalysisfactor.com/resources/mixed-multilevel-models/

Best,
Karen
 
#8
Hi Karen,
It is so nice of you to offer statistical advice to others! I am in desperate need...
For my thesis, I have 120 crowdfunding projects created by 110 creators (10 creators created 2 projects). My DV is how much money each project collected, and my IVs are Fduration (how long each project's funding lasted, in days) and Nbacker (each project's total number of backers). So all my IVs and my DV are at the project level, but because 10 pairs of projects have the same creators, I was wondering whether I should conduct a mixed model analysis. So I ran a null model (putting creatorID as the subject and not including any IVs) in SPSS. The ICC is 4.96/(4.96+1.64)=75% (calculated from the covariance parameter estimates: the residual's estimate is 1.64, and the CreatorID intercept's estimate is 4.96). Did I do this right? If yes, I think I should do a mixed model, right?
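The ICC arithmetic above can be checked with a short Python sketch. The function name is hypothetical; the two variance estimates are the ones quoted in the post.

```python
# Sketch: intraclass correlation from the two covariance parameter estimates.
def intraclass_correlation(intercept_var, residual_var):
    """ICC = between-creator variance / (between-creator + residual variance)."""
    return intercept_var / (intercept_var + residual_var)

icc = intraclass_correlation(intercept_var=4.96, residual_var=1.64)
print(round(icc, 4))  # 0.7515
```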

Then, since all my IVs are at the project level and none are at the creator level, how can I control for the creator effect? Do I need to include a random variable (creatorID) in my model in SPSS? Or do I just include creatorID as the subject, without putting anything in the random variable box? Thanks for any advice.
 
#9
Mixed models are tough to explain or decide in text, but I'll give you some initial impressions.

1. Don't use the boxes at all--use syntax. It's too easy to mess these up (in my experience).
2. Yes, a random intercept for creator will control for creator. But I'm honestly surprised that the model will run if 90% of creators have only one project. For them, creator and project are confounded. Perhaps it's because you don't have any creator level variables. <shrug>
 
#10
Dear Karen,

I am confused about what the ‘subject’ and ‘random effect builder’ in SPSS GLMM are. Do the variables added to ‘Subject’ refer to the random intercept, and those in the ‘effect builder’ to the random slope? Could you please help explain the difference between them? It would ease the interpretation of the random effects.

For example, my study measured mood repeatedly for 2 hours (Level-1) from participants (Level-2) who were exposed to 2 different lighting oscillation patterns (Level-3) that were associated with a particular EH_setting. There are 4 EH_settings, each with its own group of participants. Each participant within an EH_setting was exposed to 2 different oscillations. So in GLMM, the

1. data structure was:

SUBJECTS= EH_setting*RID*oscillation

REPEATED_MEASURES=time

COVARIANCE_TYPE=DIAGONAL

2. the random effect was:

RANDOM EFFECTS= oscillation USE_INTERCEPT=TRUE

SUBJECTS= EH_setting*RID

COVARIANCE_TYPE=AR1



Referring to your reply dated 30 Oct 2010, does my model mean that the random slope of oscillation on mood differs across participants within an EH_setting? Is RANDOM EFFECTS= oscillation the random slope, and SUBJECTS= EH_setting*RID (i.e. participants) the random intercept?

Your help will be very much appreciated, as I am really confused and urgently seeking some guidance. Thank you in advance for your time and kind advice.

Attachment: GLMM_SPSS_random effect.jpg