Multilevel Models

#1
I am starting to use this 7 years after I studied it, and I have largely forgotten the method. I have lots of questions, but to start with, there seems to be disagreement on whether you should center your predictors at the lower (individual) level or not. And if you do, should you use group-mean or grand-mean centering?

Does centering at the upper level (group level) actually change the slope, not just the intercept? I know it only changes the intercept at level 1.
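The level 1 claim can be checked on a single-level toy regression. The sketch below uses made-up numbers and a hypothetical helper, `simple_ols`. It covers only grand-mean centering of one predictor with no random effects, so it does not settle what group-mean centering does to the slope.

```python
def simple_ols(x, y):
    """Return (intercept, slope) of a one-predictor least-squares fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Made-up illustration data (not from any study).
x = [1.0, 2.0, 4.0, 5.0, 8.0]
y = [3.0, 4.5, 7.0, 9.5, 14.0]

x_mean = sum(x) / len(x)
x_centered = [xi - x_mean for xi in x]  # grand-mean centering

raw_intercept, raw_slope = simple_ols(x, y)
cen_intercept, cen_slope = simple_ols(x_centered, y)

# The slope is unchanged; the intercept moves to the predicted value at mean x.
print("slope, raw vs centered:", raw_slope, cen_slope)
print("intercept, raw vs centered:", raw_intercept, cen_intercept)
```

With the centered predictor the intercept equals the mean of y, which is the raw intercept plus the slope times the mean of x.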

There is also disagreement on whether you should center dummy predictors. Some say yes, some no.
 

hlsmith

Less is more. Stay pure. Stay poor.
#2
Good questions! I am guessing @Jake has answered these for you already in the past via the chat box. When you get all of the answers, let me know and I will write them down!

A comment: I have never understood centering categorical variables. It seems problematic if prevalences are extreme. In LASSO regression it seems like people center when entering both continuous and categorical variables, but it seems like there would be issues there as well. And what about when there are >2 groups?
 
#3
I try to write down in my tomes when Jake answers something (you too). So my tome says Jake said.... :) I don't think he answered this one, though he did address other issues in MLM.

For dummy variables, and some don't agree, this is what one professor wrote for his class.

"Note that while it may seem inappropriate at first to center a dummy variable, in HLM it actually is quite useful. If the dummy is uncentered, the intercept is the average value when the dummy variable is 0. If the dummy variable is centered, the intercept then becomes the mean adjusted for the proportion of cases with the dummy variable=1. For example, if the indicator for sex variable is centered around the grand mean, this centered predictor can take two values. If the subject is female, it will equal the proportion of male students in the sample. If the subject is male, it will equal to minus the proportion of female students in the sample. Zero on this variable becomes the average proportion of female students."

I am still working through that example.
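The professor's example can be worked through numerically. This is a minimal sketch with invented scores and a hypothetical `simple_ols` helper, using plain OLS rather than HLM. It only shows what values the centered dummy takes and where the intercept lands.

```python
def simple_ols(x, y):
    """Return (intercept, slope) of a one-predictor least-squares fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Invented data: female = 1, male = 0; scores are arbitrary.
female = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
score = [78.0, 85.0, 90.0, 70.0, 65.0, 88.0, 72.0, 95.0, 80.0, 68.0]

p_female = sum(female) / len(female)       # proportion of females
centered = [d - p_female for d in female]  # grand-mean centered dummy
# Females now score 1 - p_female (the proportion of males);
# males score -p_female (minus the proportion of females).

raw_intercept, raw_slope = simple_ols(female, score)
cen_intercept, cen_slope = simple_ols(centered, score)

male_scores = [s for s, d in zip(score, female) if d == 0]
print("uncentered intercept (mean for males):", raw_intercept)
print("centered intercept (overall mean):", cen_intercept)
print("female-male difference, both codings:", raw_slope, cen_slope)
```

The sex difference (the slope) is the same under both codings. Only the meaning of the intercept changes, from the male mean to the overall mean.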
 
#4
I am not sure what this means. Does it mean that if you grand-mean center the level 1 variables, or don't center them, you have to group-mean center the level 2 variables?

"When using the predictor in the raw scale or within grand-mean centering, it is critical to include the group means of the predictor at Level 2 to properly disaggregate the effects. The effect obtained for the predictor at Level 1 will then be the within-group effect and the effect obtained at Level 2 will then be the difference between the between- and within-group effects, sometimes called the contextual effect"
 
#5
I always find these types of comments distressing. If high-level statisticians disagree with each other, how do mere mortals such as myself know what is correct?

"General and specialized multilevel modeling software, both free (e.g., R) and commercial (e.g., HLM and SAS), are readily available. However, together with the growth of MLM as an analytic technique, several myths regarding the method abound and are found in many well-respected journals suggesting that both authors and reviewers may not be fully aware of more recent developments in the field related to the analysis of clustered data. We highlight some of these myths and golden rules which deserve some attention as newer studies, which we focus on in the View Today section of each myth, may have clarified some prior ambiguous modeling related issues."

http://faculty.missouri.edu/huangf/data/pubdata/SPQ_Myths/05 Multilevel myths_complete.pdf
 
#6
The way I interpret this is that if you group-mean center the data, you will eliminate all bias tied to groups (ignoring SE issues).

"Another way is to include the group-mean centered level one variable, also known as centering within context (CWC). If the researcher is interested in the association of the level one predictor (Xij) on the outcome (Yij), group-mean centering is the best option because group-mean centering removes all between group variation (Dalal & Zickar, 2012). Group mean centering or using demeaned data (i.e., subtracting the group mean from variables) effectively eliminates the group-level effect from the variable and reduces the ICC of the predictor variable to zero as all of the clusters will have a mean of zero for the centered variable."
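What the quoted passage says about the centered predictor can be verified directly. The sketch below uses invented values and hypothetical helper names (`group_means`, `between_share`). The between-group share of variance it computes is a simple descriptive stand-in for the ICC, not a model-based estimate.

```python
from collections import defaultdict


def group_means(values, groups):
    """Mean of `values` within each label in `groups`."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return {g: sums[g] / counts[g] for g in sums}


def between_share(values, groups):
    """Share of the total sum of squares that lies between group means."""
    means = group_means(values, groups)
    grand = sum(values) / len(values)
    total = sum((v - grand) ** 2 for v in values)
    between = sum((means[g] - grand) ** 2 for g in groups)
    return between / total


# Invented predictor values for three clusters.
group = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]
x = [1.0, 2.0, 3.0, 4.0, 6.0, 8.0, 10.0, 11.0, 15.0]

means = group_means(x, group)
x_cwc = [xi - means[g] for xi, g in zip(x, group)]  # centering within context

print("cluster means after centering:", group_means(x_cwc, group))
print("between-group share, raw:", between_share(x, group))
print("between-group share, centered:", between_share(x_cwc, group))
```

Every cluster mean of the centered variable is zero, so nothing about the predictor differs between clusters any more. That is also why the between-group information has to be added back (for example as the cluster means) if it matters for the research question.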

So, particularly when running OLS, why wouldn't you always group-mean center?

Does this mean that if you have 25 or more groups, White SEs will effectively deal with the SE problems OLS causes by ignoring between-group effects (lack of independence)?

"Based on Mplus documentation, the "type = complex" option applies the well-known Huber White standard error adjustments and retains the parameter estimates.

The standard error adjustment uses a sandwich estimation procedure (Berger, Graham, & Zeileis, 2017) which may account for the clustering when the number of groups is approximately 25 or more (see Huang, 2014, 2016). With few clusters however, the standard errors may still be misestimated (Bell & McCaffrey, 2002; Cameron & Miller, 2015)."
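To make the sandwich idea concrete, here is a rough sketch for a one-predictor OLS fit. It is the generic cluster-robust formula without any small-sample correction, not the Mplus implementation, and the data are invented so the arithmetic can be followed by hand. Four clusters are used only to keep it short, which is far fewer than the roughly 25 the quote mentions.

```python
import math

# Invented data: the predictor and the error are both constant within a cluster.
cluster = [0, 0, 1, 1, 2, 2, 3, 3]
x_by_cluster = [-3.0, -1.0, 1.0, 3.0]
e_by_cluster = [1.0, -1.0, -1.0, 1.0]
x = [x_by_cluster[g] for g in cluster]
y = [2.0 + 0.5 * x_by_cluster[g] + e_by_cluster[g] for g in cluster]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
xc = [xi - x_bar for xi in x]
sxx = sum(v * v for v in xc)
slope = sum(v * (yi - y_bar) for v, yi in zip(xc, y)) / sxx
intercept = y_bar - slope * x_bar
resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

# Conventional OLS variance of the slope: assumes independent errors.
naive_var = (sum(r * r for r in resid) / (n - 2)) / sxx

# Cluster-robust variance: sum x*residual within each cluster first, then
# square the cluster totals. Because x is centered, the slope part of the
# sandwich reduces to this one ratio.
score_by_cluster = {}
for g, v, r in zip(cluster, xc, resid):
    score_by_cluster[g] = score_by_cluster.get(g, 0.0) + v * r
cluster_var = sum(s * s for s in score_by_cluster.values()) / sxx ** 2

naive_se, cluster_se = math.sqrt(naive_var), math.sqrt(cluster_var)
print("slope:", slope)
print("naive SE:", naive_se, "cluster-robust SE:", cluster_se)
```

The slope is identical under both approaches, which matches the quote's point that the parameter estimates are retained. Only the standard error changes, and in this toy case the clustered one is larger.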
 
#7
[Jake] Technically speaking, you could estimate a random variance for as little as 2 areas. Do you have any particular reason to believe 7 would be too few?
[Hlsmith] You only have 7 clusters, and it is generally true that it is a somewhat small number of clusters, which can, thus, lead to biased results. I am not sure how far you would be prepared to go with your modeling but a quick Wikipedia search showed a number of viable solutions to this problem

But not on this topic per se.
 
#9
Since I asked too many questions, let me ask the most important ones.

1) When is it better to use group-mean centering and when grand-mean centering at the first level?

2) If you center your first level, do you have to center your higher (2nd) level variables? One comment said you had to include the group means for 2nd level variables when grand-mean centering first level variables, but I am unsure what that means. Another article says you only grand-mean center at the highest level (and that makes sense to me).

This is the passage I am confused about:

"When using the predictor in the raw scale or within grand-mean centering, it is critical to include the group means of the predictor at Level 2 to properly disaggregate the effects. The effect obtained for the predictor at Level 1 will then be the within-group effect and the effect obtained at Level 2 will then be the difference between the between- and within-group effects, sometimes called the contextual effect.

If you included the group means in your model at Level 2, then you will obtain exactly the same within-group effect estimate (and p-value) for your Level 1 predictor regardless of which method of centering you use. On the other hand, if you haven’t included the group means in the model at Level 2, then group-mean centering will generate an estimate of the within-group effect that will differ from the mish mosh estimate you previously obtained with the raw scale or grand-mean centering. The significance of the within-group effect might well differ from the mish mosh estimate you had before."
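The passage can be checked with a small numerical sketch. Assumptions: plain OLS instead of a mixed model, invented noise-free data with a within-group effect of 1 and a between-group effect of 3, and a hypothetical helper `ols` that solves the normal equations. A random-intercept model follows the same logic, but its estimates need not match OLS exactly.

```python
def ols(columns, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via normal equations."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[c])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Invented data: three groups of three observations.
group = ["a"] * 3 + ["b"] * 3 + ["c"] * 3
x = [1.0, 2.0, 3.0, 4.0, 6.0, 8.0, 10.0, 11.0, 15.0]
means = {g: sum(v for v, h in zip(x, group) if h == g) / 3 for g in set(group)}
x_mean = [means[g] for g in group]             # group mean, a Level 2 variable
x_cwc = [xi - m for xi, m in zip(x, x_mean)]   # group-mean centered x

# Within-group effect 1.0, between-group effect 3.0, no noise.
y = [5.0 + 1.0 * w + 3.0 * m for w, m in zip(x_cwc, x_mean)]

raw_fit = ols([x, x_mean], y)      # raw x plus group means
cwc_fit = ols([x_cwc, x_mean], y)  # centered x plus group means
blended = ols([x], y)              # raw x alone, no group means

print("within effect, raw vs centered x:", raw_fit[1], cwc_fit[1])
print("group-mean coefficient with raw x (contextual):", raw_fit[2])
print("group-mean coefficient with centered x (between):", cwc_fit[2])
print("raw x alone (the 'mish mosh'):", blended[1])
```

With the group means in the model, the Level 1 coefficient is the within-group effect under either coding. The group-mean coefficient is the between-group effect (3) when x is group-mean centered, and the contextual effect (3 - 1 = 2) when x is raw. Leaving the group means out gives a blend that falls between 1 and 3.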
 
#11
I spent a lot of time working my way through hundreds of recent articles on MLM (analyses that use it). Thankfully I did not read all of them, but I noticed something that confused me. I expected analysis of the variances and discussion of the random slopes and intercepts (how the groups varied in their slopes and intercepts). Instead, it read a lot like an OLS analysis with group factors: this slope was significant or it was not, rather than level 1 effects varying this much across level 2 units, or anything like that.

I expected the analysis to be very different from OLS analysis, and I did not find this. There was no cross-level analysis either, which seems to be one of the big advantages of this method.
 
#12
How do you see how the slope for one specific group differs from another's when you have random slopes? I cannot find this in any output, and have not found anyone who references this in any article on MLM.

I understand the variation will be different by group, but I want to see how the slopes for specific units (groups) differ from one another.
 
#13
I think it is done this way. They generate a fixed effect, then list how each group's random slope and intercept deviates from that value.

"The random-effects estimates represent the estimated deviation from the mean intercept and slope for each batch (Output 77.5.8). Therefore, the intercept for the first batch is close to 102.7 while the intercepts for the other two batches are greater than 102.7. The second batch has a slope less than the mean slope of –0.526, while the other two batches have slopes greater than –0.526."


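The arithmetic behind that reading is just fixed effect plus deviation. In the sketch below the two fixed effects come from the quoted passage, but the per-batch deviations are HYPOTHETICAL placeholders chosen only to match the directions the quote describes. The real values were in the attached output and are not reproduced here.

```python
# Fixed (mean) effects as stated in the quoted passage.
FIXED = {"intercept": 102.7, "slope": -0.526}

# HYPOTHETICAL deviations, for illustration only (not the real output).
HYPOTHETICAL_DEVIATIONS = {
    "batch 1": {"intercept": -0.05, "slope": 0.02},
    "batch 2": {"intercept": 0.40, "slope": -0.20},
    "batch 3": {"intercept": 0.60, "slope": 0.20},
}

# Group-specific coefficient = fixed effect + that group's estimated deviation.
batch_coefs = {
    batch: {term: FIXED[term] + dev[term] for term in FIXED}
    for batch, dev in HYPOTHETICAL_DEVIATIONS.items()
}


def slope_gap(first, second):
    """Difference in slope between two specific batches."""
    return batch_coefs[first]["slope"] - batch_coefs[second]["slope"]


for batch, coefs in batch_coefs.items():
    print(batch, coefs)
print("slope gap, batch 3 vs batch 2:", slope_gap("batch 3", "batch 2"))
```

So comparing two specific groups comes down to comparing their deviations. In lme4, as far as I know, `ranef()` returns the deviations and `coef()` the combined values, but check the documentation for whichever package you use.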
 
#14
Although not pertinent to any of my questions :p I found this interesting. It deals with when you should have random slopes.

"In the presence of heterogeneity, note that while FE models with naïve SEs are the most anticonservative, neither FE models with “robust” standard errors nor RE models with only random intercepts are much better."

I mention it because using robust SEs is sometimes offered as a solution to this issue.

https://link.springer.com/article/10.1007/s11135-018-0802-x

This is interesting as well. I wonder how often real-world data are normally distributed. I assume they mean normality at the 2nd level here. I wonder whether the estimates are asymptotically correct under non-normality, as they are for first-level effects. Of course, sample size is often small at the 2nd level anyhow.

"A key assumption of RE models is that the random effects representing the level-2 entities are drawn from a Normal distribution. However, “the Normality of [the random coefficients] is clearly an assumption driven more by mathematical convenience than by empirical reality” (Beck and Katz 2007:90). Indeed, it is often an unrealistic assumption, and it is important to know the extent to which different estimates are biased when that assumption is broken."
 
#15
"None of the software packages produced any inadmissible solutions for Conditions 6 through 15 in which all slopes had nonzero variance components in the population. Therefore, we focus on reporting results from Conditions 1 through 5, for which lme4 produced a troubling number of nonconvergent and error-laden solutions."

This is an interesting recent review of statistical packages doing MLM.

http://www-personal.umich.edu/~bwest/mccoach_etal_2018_mlmcompare.pdf
 
#16
OK, several authors make this point:

"The choice of which mean to use for centering should be guided by the research question, i.e., what parameter is being tested. Raudenbush and Bryk (2002) argue that if the purpose is to estimate the effect of a level 1 variable (e.g., sibling bullying) adjusting for level 2 variables (parents’ maltreatment history), then the best centering approach is group-mean centering. This also addresses unobserved cross-level confounding as described above. If the purpose is to estimate the effect of a level 2 variable adjusting for level 1 variables, the best approach is to grand-mean center."

So what do you do if you are interested in both level 1 and level 2, as I am? Run it with group-mean centering and interpret the level 1 effects and interactions, and then run it a second time with grand-mean centering and interpret the level 2 effects? Which model results do you then report? I mean, which model is better in terms of AIC can change when you do this.

Or if you group-mean center the level 1 variables and grand-mean center the level 2 variables (which is the only way you can center level 2), can you interpret both levels with the same model? No one I read raised this.
 
#17
Is this true? [It shows up in a book about fixed effects, but I have never seen this suggested in discussions of group centering before.]

"In the literature on multilevel models (Bryk and Raudenbusch 1992; Goldstein 1987; Kreft et al. 1995), the practice of subtracting person-specific means from each time-varying variable is referred to as group-mean centering. Although it is well-known that using group-mean centered variables can produce substantially different results, this literature has not generally made the connection to fixed effects models nor has it been recognized that group-mean centering controls for all time-invariant covariates."

I am a bit confused by the phrase "subtracting person-specific means" that the writer uses. I thought group-mean centering subtracted the group mean of a variable (the mean associated with a group, not an individual).
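One way to read the quote: in panel data the person is the group (level 2) and the repeated measurements are the level 1 units, so the person-specific mean is the group mean. The sketch below uses invented data and hypothetical helpers (`simple_slope`, `demean_by`, `within_and_pooled`). It shows that a slope estimated from person-mean centered variables does not move when an arbitrary person-constant effect is added to the outcome, while the pooled slope does.

```python
from collections import defaultdict


def simple_slope(x, y):
    """Slope of a one-predictor least-squares fit."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx


def demean_by(values, ids):
    """Subtract each id's own mean from its values (group-mean centering)."""
    totals, counts = defaultdict(float), defaultdict(int)
    for v, i in zip(values, ids):
        totals[i] += v
        counts[i] += 1
    return [v - totals[i] / counts[i] for v, i in zip(values, ids)]


# Invented panel: three persons, three time points each.
person = ["p1"] * 3 + ["p2"] * 3 + ["p3"] * 3
x = [1.0, 2.0, 3.0, 4.0, 6.0, 8.0, 10.0, 11.0, 15.0]


def within_and_pooled(person_effect):
    """True slope is 2.0; person_effect stands in for unobserved time-invariant traits."""
    y = [2.0 * xi + person_effect[p] for xi, p in zip(x, person)]
    within = simple_slope(demean_by(x, person), demean_by(y, person))
    pooled = simple_slope(x, y)
    return within, pooled


within_a, pooled_a = within_and_pooled({"p1": 10.0, "p2": -20.0, "p3": 5.0})
within_b, pooled_b = within_and_pooled({"p1": -7.0, "p2": 0.0, "p3": 40.0})

print("within slopes:", within_a, within_b)
print("pooled slopes:", pooled_a, pooled_b)
```

Anything that is constant within a person drops out when the person mean is subtracted, which is the sense in which group-mean centering controls for time-invariant covariates.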