interaction effect?

noetsi

Fortran must die
#1
We have individuals with various disabilities. We are analyzing the impact of services on income for our customers. I have a theory that the impact of the services provided on income varies with the type of disability. Would it be correct to specify an interaction effect between disability type and spending on a service to test this?

Not sure that is the best way to do this.
 

noetsi

Fortran must die
#3
Thanks hlsmith. Are you saying as a starting point my approach would be correct? Can you actually use interaction terms that way (would they do what I suggested)?

I am not sure how you would use MLM this way although I will look into that. Would services be nested inside disabilities?
 

noetsi

Fortran must die
#5
Income is the DV (interval). My theory is that spending on specific services (or how many services), both interval predictors, has a different impact on income depending on the level of another predictor (what disability group you are in).
 

hlsmith

Less is more. Stay pure. Stay poor.
#6
Yeah, so you can model this as a MLM with your group level being which disability group they are in. This will allow each group to have its own slope; if the slopes differ between groups, then spending has a different effect in each group. It will also allow each group to have its own intercept.

This is what I mean (see link). And I am not saying you have to code it in R, but this is how they can have random intercepts and slopes, which allows you to tease out the differences.

http://mfviz.com/hierarchical-models/
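
For reference, a random intercept and random slope model of the kind described above is commonly written as below. The notation is an assumption for illustration and does not come from the thread or the linked page: i indexes customers, j indexes disability groups, and x is spending on the service.

```latex
\begin{aligned}
\text{income}_{ij} &= (\beta_0 + u_{0j}) + (\beta_1 + u_{1j})\, x_{ij} + \varepsilon_{ij}, \\
(u_{0j}, u_{1j})^\top &\sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma}_u), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2).
\end{aligned}
```

Here u_{0j} shifts the intercept for group j and u_{1j} shifts the slope, so a nonzero variance for u_{1j} corresponds to spending having a different effect across groups.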
 

noetsi

Fortran must die
#7
Thanks. Given that I am relatively new to MLM, is it valid to take the much simpler route of specifying an interaction between disability group and spending? I would like to start there.
 
#10
Would it be correct to specify an interaction effect between disability type and spending on a service to test this?
Yes. I think that it would be very natural to investigate that possibility.

But in my view, you do not have to think about it (the disabilities) as a random effect. It could be a fixed effect. It depends on how you think about it. I mean you have not randomly selected these disabilities, but rather you have this list of disabilities. That is fixed. But you can still test for different intercepts and different slopes.
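
A minimal sketch of the fixed-effect version, assuming simulated data and plain NumPy least squares (the variable names, group count, and effect sizes are all made up for illustration). It compares a model with group-specific intercepts and one common slope against a model that also lets the slope differ by group, using the usual nested-model F statistic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical simulated data: 5 disability groups, spending on one service.
n_groups, n_per_group = 5, 200
group = np.repeat(np.arange(n_groups), n_per_group)
spending = rng.uniform(0, 10, size=group.size)
true_intercepts = np.array([20.0, 22.0, 18.0, 25.0, 21.0])  # made-up values
true_slopes = np.array([1.0, 1.5, 0.5, 2.0, 1.0])           # made-up values
income = (true_intercepts[group] + true_slopes[group] * spending
          + rng.normal(0, 1, group.size))

# Dummy variables for groups 1..4 (group 0 is the reference level).
dummies = (group[:, None] == np.arange(1, n_groups)).astype(float)


def rss(X, y):
    """Residual sum of squares from an ordinary least squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


ones = np.ones((group.size, 1))
# Reduced model: group-specific intercepts, one common slope.
X_reduced = np.hstack([ones, spending[:, None], dummies])
# Full model: adds spending x dummy products, so each group gets its own slope.
X_full = np.hstack([X_reduced, dummies * spending[:, None]])

rss_reduced, rss_full = rss(X_reduced, income), rss(X_full, income)
q = X_full.shape[1] - X_reduced.shape[1]   # number of interaction terms
df_resid = group.size - X_full.shape[1]    # residual degrees of freedom
f_stat = ((rss_reduced - rss_full) / q) / (rss_full / df_resid)
print(f"F({q}, {df_resid}) = {f_stat:.1f}")
```

A large F relative to the F(q, df_resid) reference distribution suggests the slopes are not all equal. Most statistics packages report this as the test of the interaction term.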
 

hlsmith

Less is more. Stay pure. Stay poor.
#11
you do not have to think about it (the disabilities) as a random effect. It could be a fixed effect.
Talk more about this and the selection between the two options.

It is my understanding that you can get category intercepts and slopes from both processes, but the MLM will address the within and between variability. I think of MLM as addressing that groups have a different data generating process composed of different coefficients, and that you are controlling for the similarities within the groups. It seems like you are referring to the theory when one should be select - and I would like to hear more about that!
 

noetsi

Fortran must die
#12
Yes. I think that it would be very natural to investigate that possibility.

But in my view, you do not have to think about it (the disabilities) as a random effect. It could be a fixed effect. It depends on how you think about it. I mean you have not randomly selected these disabilities, but rather you have this list of disabilities. That is fixed. But you can still test for different intercepts and different slopes.
I actually had not thought of them as a random effect until hlsmith raised that. I was just thinking of them as a moderating factor, which is why I brought up interaction. We have disabilities that clearly behave differently than others on some dimensions.
 

noetsi

Fortran must die
#13
Talk more about this and the selection between the two options.

It is my understanding that you can get category intercepts and slopes from both process, but the MLM will address the within and between variability. I think of MLM as addressing that groups have a different data generating process composed of different coefficients that you are controlling for the similarities within the groups. It seems like you are referring to the theory when one should be select - and I would like to hear more about that!
I never thought of MLM as an interaction effect, or as mirroring one, although it does this of course. I think of it as a way of exploring higher-level factors that influence lower-level variables. So if individuals are nested in schools, what about the school influences the individual and thus the DV? Which reminds me of the indirect effects of SEM, although I assume the impact is probably direct, not indirect.
 
#14
you do not have to think about it (the disabilities) as a random effect. It could be a fixed effect.
Talk more about this and the selection between the two options.
Someone said: You can check whether an effect is random or fixed by thinking about what would happen if you hypothetically repeated the study. If another set of levels would appear in a repeated experiment, then it would be a random effect. But if the same levels would appear, it would be a fixed effect (even if the number of levels is large). If you randomly select schools and another set of schools would appear the second time, the schools are a random effect, and you would make your inference to all the schools, i.e. to the population of schools. But if you have a list of disabilities and (I guess) the same levels of disabilities would be used, it would be a fixed effect.

I would say that this is about how you think about your model.

Maybe this is similar to an extra motivation I have for why you should randomize your experiment: randomization will reveal what restrictions you have on randomization. "So you don't want to do a completely randomized design. You want a randomized block design. Fine!" Or: "You don't want to do a completely randomized design. You want a split plot design - a sort of multilevel design. Good to know!"


It seems like you are referring to the theory when one should be select
I did not understand this sentence.
when one should be select
 

noetsi

Fortran must die
#16
To me random effects are a confusing concept because different people mean different things by the term. Gelman has a great discussion of it somewhere.

To me random effects in MLM just mean the variables don't behave the same way in different groups. Or that the relationship with the DV varies across groups.
 

hlsmith

Less is more. Stay pure. Stay poor.
#17
Hmm. Schools is a hard one to digest. People love to use schools as a random effects example, but how are you going to get new schools, since you are sampling schools? So each time you let each school be a group? Also, what if you get the same schools, but they have different effects due to change between time periods in the new sampling? I feel like you are onto something here, but this example and idea still aren't clear yet!
 
#18
Well, if you take a sample of 100 schools out of maybe 4000 schools in the state, then that would be a random sample in my view. (And you take a sample of students from each school.) But if you take all 21 counties in my country, I would think of that as a fixed factor, just like Noetsi's list of disabilities.

Or if you sample 100 fields and treat different subplots with different levels of nitrogen.

But if you have, say, all the hospitals that there are in the state (and that is all you are interested in), then I don't think that it is wrong to use the random effect model in a mixed model. I just guess that there would be a larger inference space if you think of it as a random effect. At least an author, Anderson, talked like that in an old book. Maybe the hospitals can be thought of as coming from a superpopulation model.
 

noetsi

Fortran must die
#19
To be clear, we have all the disabilities and the entire population of interest. It is unlikely that the disabilities will change in future years (these definitions rarely change), although how many people have each one will. And since it is a judgement call who has what, there is a potential problem with reliability. One counselor might decide a customer had a disability; another counselor might not.
 

noetsi

Fortran must die
#20
Another strange wrinkle I have not seen in the literature. The disability effect has 5 levels (five types of disability). The federal government is controlling for them by placing 4 dummy variables in the model, which I cannot remove. So if I want to test an interaction effect between disability group and how many of a specific type of service a customer got (count of services), how exactly do I do this?

So the model is income = countofservicetype1, disab1, disab2, disab3, disab4 (and many other control variables). Do I build the interaction as income = countofservicetype1, disab1, countofservicetype1*disab1, disab2, countofservicetype1*disab2, ... and so on?

All the examples of interaction I have ever run into roll up what they are interested in into one variable and specify an interaction with that, rather than breaking it down into its categories and then analyzing the interaction.
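
A minimal sketch of the dummy-by-dummy construction asked about above, assuming simulated data and plain NumPy least squares. The names countofservicetype1 and disab1..disab4 follow the post; the data, the effect sizes, and the omission of the other control variables are assumptions for illustration. With four dummies already in the model, the interaction is the set of four products countofservicetype1*disab1 through countofservicetype1*disab4. This is the same thing software does internally when a single categorical variable is interacted with a numeric one.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 5 disability types coded 0..4; type 0 is the omitted reference.
n = 2000
disability = rng.integers(0, 5, size=n)
countofservicetype1 = rng.poisson(3, size=n).astype(float)

# The four dummies already in the model (disab1..disab4).
disab = np.column_stack([(disability == k).astype(float) for k in range(1, 5)])

# One product term per dummy: countofservicetype1*disab1, ..., countofservicetype1*disab4.
interactions = disab * countofservicetype1[:, None]

# Simulated income with a different service slope in each group (made-up values).
true_slopes = np.array([2.0, 3.0, 1.0, 2.5, 0.5])
income = (30.0 + 2.0 * disability
          + true_slopes[disability] * countofservicetype1
          + rng.normal(0, 1, n))

# Columns: intercept, countofservicetype1, disab1..disab4, then the 4 products.
X = np.column_stack([np.ones(n), countofservicetype1, disab, interactions])
beta, *_ = np.linalg.lstsq(X, income, rcond=None)

# The main-effect coefficient is the slope for the reference group; each other
# group's slope is the main effect plus that group's interaction coefficient.
slope_reference = beta[1]
slopes_by_group = np.concatenate([[slope_reference], slope_reference + beta[6:10]])
print(np.round(slopes_by_group, 2))
```

Each interaction coefficient is the difference between that group's slope and the reference group's slope. The overall interaction test is therefore a joint test that all four product coefficients are zero, not four separate t-tests.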