Does centering at the upper level (group level) actually change the slope, not just the intercept? I know it only changes the intercept at level 1.

There is also disagreement about whether you should center dummy predictors. Some say yes, some no.

- Thread starter noetsi

Comment: I have never understood centering categorical variables. It seems problematic if prevalences are extreme. In LASSO regression, people seem to center both continuous and categorical variables, but it seems like there would be issues there as well. And what about when there are more than two groups?

For dummy variables, and some don't agree, this is what one professor wrote for his class.

"Note that while it may seem inappropriate at first to center a dummy variable, in HLM it actually is quite useful. If the dummy is uncentered, the intercept is the average value when the dummy variable is 0. If the dummy variable is centered, the intercept then becomes the mean adjusted for the proportion of cases with the dummy variable=1. For example, if the indicator for sex variable is centered around the grand mean, this centered predictor can take two values. If the subject is female, it will equal the proportion of male students in the sample. If the subject is male, it will equal to minus the proportion of female students in the sample. Zero on this variable becomes the average proportion of female students."

I am still working through that example.
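One way to work through the professor's example is a small sketch in Python. The data below are made up for illustration (the names `female` and `score` are hypothetical, not from the quoted class notes), and a plain single-level regression stands in for the level-1 equation of an HLM.

```python
from statistics import mean

# Hypothetical toy data: female = 1 for female students, 0 for male students.
female = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
score = [72, 75, 70, 64, 66, 74, 61, 73, 71, 65]

p_female = mean(female)                     # proportion of cases with dummy = 1
centered = [f - p_female for f in female]   # grand-mean centered dummy


def simple_ols(x, y):
    """Return (intercept, slope) of a one-predictor least-squares fit."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


b0_raw, b1_raw = simple_ols(female, score)     # uncentered dummy
b0_cen, b1_cen = simple_ols(centered, score)   # centered dummy

print(sorted(set(centered)))   # the two values the centered dummy can take
print(b0_raw, b1_raw)          # intercept is the mean score when the dummy is 0
print(b0_cen, b1_cen)          # intercept is the overall mean; slope is unchanged
```

In this toy sample 60% of the students are female, so the centered dummy is 0.4 for a female student (the proportion of males) and -0.6 for a male student (minus the proportion of females), which matches the description in the quote. Only the intercept moves; the slope for the dummy is the same either way.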

"When using the predictor in the raw scale or within grand-mean centering, it is critical to include the group means of the predictor at Level 2 to properly disaggregate the effects. The effect obtained for the predictor at Level 1 will then be the within-group effect and the effect obtained at Level 2 will then be the

"General and specialized multilevel modeling software, both free (e.g., R) and commercial (e.g., HLM and SAS), are readily available. However, together with the growth of MLM as an analytic technique, several myths regarding the method abound and are found in many well-respected journals suggesting that both authors and reviewers may not be fully aware of more recent developments in the field related to the analysis of clustered data. We highlight some of these myths and golden rules which deserve some attention as newer studies, which we focus on in the View Today section of each myth, may have clarified some prior ambiguous modeling related issues."

http://faculty.missouri.edu/huangf/data/pubdata/SPQ_Myths/05 Multilevel myths_complete.pdf

The way I interpret this is that if you group-mean center the data, you will eliminate all bias tied to groups (ignoring SE issues).

"Another way is to include the group-mean centered level one variable, also known as centering within context (CWC). If the researcher is interested in the association of the level one predictor (Xij) on the outcome (Yij), group-mean centering is the best option because group-mean centering removes all between group variation (Dalal & Zickar, 2012). Group mean centering or using demeaned data (i.e., subtracting the group mean from variables) effectively eliminates the group-level effect from the variable and reduces the ICC of the predictor variable to zero as all of the clusters will have a mean of zero for the centered variable."

So, particularly when running OLS, why wouldn't you always group-mean center?
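The claim in the quote that group-mean centering leaves every cluster with a mean of zero on the centered variable can be checked directly. This is only a sketch on made-up numbers; the cluster labels and values are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pvariance

# Hypothetical (cluster, x) pairs; any clustered level-1 predictor would do.
rows = [("a", 2.0), ("a", 4.0), ("a", 6.0),
        ("b", 10.0), ("b", 12.0), ("b", 14.0),
        ("c", 5.0), ("c", 7.0), ("c", 9.0)]

by_cluster = defaultdict(list)
for cluster, x in rows:
    by_cluster[cluster].append(x)
cluster_mean = {c: mean(xs) for c, xs in by_cluster.items()}

# Centering within context (CWC): subtract each observation's own cluster mean.
x_cwc = [(c, x - cluster_mean[c]) for c, x in rows]


def between_variance(pairs):
    """Variance of the cluster means, i.e. the between-cluster part of a variable."""
    groups = defaultdict(list)
    for c, v in pairs:
        groups[c].append(v)
    return pvariance([mean(vs) for vs in groups.values()])


print(between_variance(rows))    # positive: clusters differ in their mean x
print(between_variance(x_cwc))   # no between-cluster variation is left
```

Because the centered predictor has no between-cluster variation left, it cannot carry any group-level information, which is also why a group-level (between) effect of that predictor cannot be estimated from the centered variable alone.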

Does this mean that if you have 25 or more groups, White SEs will effectively deal with the SE problems that OLS causes by ignoring between-group effects (lack of independence)?

"Based on Mplus documentation, the "type = complex" option applies the well-known Huber White standard error adjustments and retains the parameter estimates.

The standard error adjustment uses a sandwich estimation procedure (Berger, Graham, & Zeileis, 2017) which may account for the clustering when the number of groups is approximately 25 or more (see Huang, 2014, 2016). With few clusters however, the standard errors may still be misestimated (Bell & McCaffrey, 2002; Cameron & Miller, 2015)."
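To make "retains the parameter estimates" concrete, here is a rough sketch of a cluster-robust sandwich variance for a one-predictor OLS fit, written in plain Python. It uses the basic CR0 form with no small-sample correction, which is an assumption on my part and not necessarily what Mplus computes; the simulated data, the seed, and the function name `ols_cluster_se` are all made up for illustration.

```python
import random
from collections import defaultdict
from math import sqrt


def ols_cluster_se(x, y, groups):
    """OLS of y on [1, x]; return (coefs, naive SEs, cluster-robust CR0 SEs)."""
    n = len(y)
    sx, sxx = sum(x), sum(v * v for v in x)
    sy, sxy = sum(y), sum(a * b for a, b in zip(x, y))
    det = n * sxx - sx * sx
    inv = [[sxx / det, -sx / det], [-sx / det, n / det]]   # (X'X)^-1
    b0 = inv[0][0] * sy + inv[0][1] * sxy
    b1 = inv[1][0] * sy + inv[1][1] * sxy
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

    # Naive SEs assume independent errors: s^2 * (X'X)^-1.
    s2 = sum(e * e for e in resid) / (n - 2)
    naive = [sqrt(s2 * inv[0][0]), sqrt(s2 * inv[1][1])]

    # "Meat" of the sandwich: sum over clusters of (X_g' e_g)(X_g' e_g)'.
    score = defaultdict(lambda: [0.0, 0.0])
    for xi, ei, g in zip(x, resid, groups):
        score[g][0] += ei
        score[g][1] += xi * ei
    meat = [[sum(s[i] * s[j] for s in score.values()) for j in range(2)]
            for i in range(2)]

    # Sandwich: (X'X)^-1 * meat * (X'X)^-1.
    tmp = [[sum(inv[i][k] * meat[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
    cov = [[sum(tmp[i][k] * inv[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
    return (b0, b1), naive, [sqrt(cov[0][0]), sqrt(cov[1][1])]


# Simulated example: 30 clusters of 10, a cluster-level predictor, and a
# shared cluster effect that makes observations within a cluster dependent.
random.seed(1)
x, y, groups = [], [], []
for g in range(30):
    xg, ug = random.gauss(0, 1), random.gauss(0, 1.5)
    for _ in range(10):
        x.append(xg)
        groups.append(g)
        y.append(1.0 + 0.5 * xg + ug + random.gauss(0, 1))

coefs, naive, robust = ols_cluster_se(x, y, groups)
print(coefs)    # the same point estimates serve both sets of SEs
print(naive)    # SEs that ignore the clustering
print(robust)   # cluster-robust SEs
```

The point estimates are computed once and only the standard errors change. Whether the adjusted SEs can be trusted with a given number of clusters is the question the quoted passage addresses; this sketch does not settle it.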

"Another way is to include the group-mean centered level one variable, also known as centering within context (CWC). If the researcher is interested in the association of the level one predictor (Xij) on the outcome (Yij), group-mean centering is the best option because group-mean centering removes all between group variation (Dalal & Zickar, 2012). Group mean centering or using demeaned data (i.e., subtracting the group mean from variables) effectively eliminates the group-level effect from the variable and reduces the ICC of the predictor variable to zero as all of the clusters will have a mean of zero for the centered variable."

[Hlsmith] You only have 7 clusters, which is generally considered a somewhat small number and can therefore lead to biased results. I am not sure how far you would be prepared to go with your modeling, but a quick Wikipedia search showed a number of viable solutions to this problem.

But not on this topic per se.

Since I asked too many questions, let me ask the most important ones.

1) When is it better to group-mean center, and when to grand-mean center, at the first level?

2) If you center your first-level variables, do you have to center your higher (2nd) level variables? One comment said you had to include the group means for 2nd-level variables when grand-mean centering first-level variables, but I am unsure what that means. Another article says you only grand-mean center at the highest level (and that makes sense to me).

This is what I am confused about: what they mean here.

"When using the predictor in the raw scale or within grand-mean centering, it is critical to include the **group means** of the predictor at Level 2 to properly disaggregate the effects. The effect obtained for the predictor at Level 1 will then be the within-group effect and the effect obtained at Level 2 will then be the *difference* between the between- and within-group effects, sometimes called the *contextual effect.*

If you included the group means in your model at Level 2, then you will obtain exactly the same within-group effect estimate (and p-value) for your Level 1 predictor regardless of which method of centering you use. On the other hand, if you haven’t included the group means in the model at Level 2, then group-mean centering will generate an estimate of the within-group effect that will differ from the mish mosh estimate you previously obtained with the raw scale or grand-mean centering. The significance of the within-group effect might well differ from the mish mosh estimate you had before."
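What the quote means by "include the group means at Level 2" can be illustrated with ordinary least squares on simulated data. This is a sketch only: the data, the seed, the assumed within effect (1.0) and between effect (3.0), and the helper `ols` are all made up, and a single-level regression stands in for the mixed model.

```python
import random
from statistics import mean


def ols(columns, y):
    """Least-squares coefficients for y on an intercept plus the given columns."""
    X = [[1.0, *row] for row in zip(*columns)]
    k = len(X[0])
    # Augmented normal equations [X'X | X'y], solved by Gauss-Jordan elimination.
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(A[r][i]))   # partial pivoting
        A[i], A[p] = A[p], A[i]
        A[i] = [v / A[i][i] for v in A[i]]
        for r in range(k):
            if r != i:
                f = A[r][i]
                A[r] = [vr - f * vi for vr, vi in zip(A[r], A[i])]
    return [A[i][k] for i in range(k)]


random.seed(0)
x, xbar, y = [], [], []
for g in range(20):
    centre = random.gauss(0, 2)                   # groups differ in their mean x
    xs = [centre + random.gauss(0, 1) for _ in range(15)]
    m = mean(xs)
    for xi in xs:
        x.append(xi)
        xbar.append(m)                            # group mean, a Level 2 variable
        # Assumed within-group effect 1.0 and between-group effect 3.0.
        y.append(2.0 + 1.0 * (xi - m) + 3.0 * m + random.gauss(0, 1))

grand = mean(x)
x_gmc = [xi - grand for xi in x]                  # grand-mean centered
x_cwc = [xi - mi for xi, mi in zip(x, xbar)]      # group-mean centered

m_raw = ols([x, xbar], y)        # raw x plus group means
m_gmc = ols([x_gmc, xbar], y)    # grand-mean centered x plus group means
m_cwc = ols([x_cwc, xbar], y)    # group-mean centered x plus group means
m_mix = ols([x], y)              # raw x alone: a blend of within and between

print(m_raw[1], m_gmc[1], m_cwc[1])   # the same within-group slope three times
print(m_raw[2], m_cwc[2])             # contextual effect vs. between-group effect
print(m_mix[1])                       # the blended ("mish mosh") slope
```

With the group means in the model, the level-1 slope is identical under all three codings of x. The coefficient on the group mean is the contextual effect (between minus within) when x is raw or grand-mean centered, and the between effect itself when x is group-mean centered. Leaving the group means out gives a single slope that mixes the two.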

I expected the analysis to be very different from an OLS analysis, and I did not find this. There was no cross-level analysis either, which seems to be one of the big advantages of this method.

I understand the variation will differ by group, but I want to see how the slopes for specific units (groups) differ from those of other units.

"The random-effects estimates represent the estimated deviation from the mean intercept and slope for each batch (Output 77.5.8). Therefore, the intercept for the first batch is close to 102.7 while the intercepts for the other two batches are greater than 102.7. The second batch has a slope less than the mean slope of –0.526, while the other two batches have slopes greater than –0.526."

Although not pertinent to any of my questions, I found this interesting. It deals with when you should have random slopes:

"In the presence of heterogeneity, note that while FE models with naïve SEs are the most anticonservative, neither FE models with “robust” standard errors nor RE models with only random intercepts are much better."

I mention it because using robust SEs is sometimes offered as a solution to this issue.

https://link.springer.com/article/10.1007/s11135-018-0802-x

This is interesting as well. I wonder how often real-world data are normally distributed. I assume they mean normality at the 2nd level here. I wonder whether the estimates are still asymptotically correct without normality, as they are for first-level effects. Of course, the sample size is often small at the 2nd level anyhow.

"A key assumption of RE models is that the random effects representing the level-2 entities are drawn from a Normal distribution. However, “the Normality of [the random coefficients] is clearly an assumption driven more by mathematical convenience than by empirical reality” (Beck and Katz 2007:90). Indeed, it is often an unrealistic assumption, and it is important to know the extent to which different estimates are biased when that assumption is broken."

"In the presence of heterogeneity, note that while FE models with naïve SEs are the most anticonservative, neither FE models with “robust” standard errors nor RE models with only random intercepts are much better."

"[...] Conditions 6 through 15 in which all slopes had nonzero variance components in the population. Therefore, we focus on reporting results from Conditions 1 through 5, for which lme4 produced a troubling number of nonconvergent and error-laden solutions."

This is an interesting recent review of statistical packages for MLM:

http://www-personal.umich.edu/~bwest/mccoach_etal_2018_mlmcompare.pdf

"The choice of which mean to use for centering should be guided by the research question, i.e., what parameter is being tested. Raudenbush and Bryk (2002) argue that if the purpose is to estimate the effect of a [...] to grand-mean center."

So what do you do if you are interested in

Or, if you group-mean center the level 1 variables and grand-mean center the level 2 variables (which is the only way you can center level 2), can you interpret both levels with the same model? No one I read raised this.

"In the literature on multilevel models (Bryk and Raudenbush 1992; Goldstein 1987; Kreft et al. 1995), the practice of subtracting person-specific means from each time-varying variable is referred to as group-mean centering. Although it is well-known that using group-mean centered variables can produce substantially different results, this literature has not generally made the connection to fixed effects models nor has it been recognized that group-mean centering controls for all time-invariant covariates."

I am a bit confused by the phrase "subtracting person-specific means" that the writer uses. I thought group-mean centering subtracted the group-level mean of a variable (the mean associated with a group, not an individual).
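One reading that would reconcile the two, offered as an assumption and not something the quote states outright: in repeated-measures data the person is the group, with occasions nested within persons, so the "person-specific mean" is just that group's mean. The sketch below simulates such data (the seed, sample sizes, and the assumed true slope of 0.5 are made up) and compares a pooled slope with the slope from person-mean centered data, which is the within (fixed effects) estimator.

```python
import random
from collections import defaultdict
from statistics import mean

random.seed(2)
person, x, y = [], [], []
for p in range(40):
    trait = random.gauss(0, 1)              # unobserved, time-invariant
    for _ in range(6):                      # six occasions per person
        xi = trait + random.gauss(0, 1)     # x is correlated with the trait
        person.append(p)
        x.append(xi)
        # Assumed true slope 0.5; the trait also affects y directly.
        y.append(0.5 * xi + 2.0 * trait + random.gauss(0, 1))


def demean_by(values, ids):
    """Subtract each id's own mean from its values (group-mean centering)."""
    groups = defaultdict(list)
    for v, i in zip(values, ids):
        groups[i].append(v)
    means = {i: mean(vs) for i, vs in groups.items()}
    return [v - means[i] for v, i in zip(values, ids)]


def slope(u, v):
    """Slope of a one-predictor least-squares fit of v on u."""
    mu, mv = mean(u), mean(v)
    return (sum((a - mu) * (b - mv) for a, b in zip(u, v))
            / sum((a - mu) ** 2 for a in u))


x_w, y_w = demean_by(x, person), demean_by(y, person)

pooled = slope(x, y)             # ignores persons; picks up the trait
centered_x_only = slope(x_w, y)  # person-mean centered x, raw y
within = slope(x_w, y_w)         # both demeaned: the fixed effects estimator

print(pooled, centered_x_only, within)
```

Centering x alone already gives the same slope as demeaning both variables, because the centered predictor is uncorrelated with anything that is constant within a person. That is the sense in which group-mean centering "controls for all time-invariant covariates"; the pooled slope, by contrast, absorbs the effect of the omitted trait.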