GLLAMM in R

jpkelley

TS Contributor
#1
Hi all,
I seem to recall that some of you (spunky, Dason, and others) might be doing research using latent variable models and their variants.

As I start to ask more questions about the temporal sequences of signaling, aggression, and courtship in my experiments on birds (and now spiders), I'm looking into Generalized Linear Latent and Mixed Models. I haven't done an exhaustive search for resources, but I'm curious if any of you might have some ideas of R packages that are currently under development for these sorts of models OR if there are papers outlining how to manually construct such models piecemeal. I mention the latter because this is what I ended up having to do for some recent predator census data. I'm okay doing this long-hand while the models are still being put through the proverbial tests in the literature.

Just curious if you all had some ideas of where to begin.

Best,
Patrick
 

spunky

Can't make spagetti
#2
have you tried flexmix? that's the one i've been using a lot lately for finite mixture regressions and i believe the most recent version of it allows for some GLM models (poisson and negative binomial... probably more, but i haven't checked it out...)
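
For anyone following along, here is a minimal sketch of a two-component Poisson mixture regression in flexmix. The data are simulated and the names (x, y, cls, d) and coefficients are made up purely for illustration; the flexmix part only runs if the package is installed.

```r
# Hypothetical simulated data: two hidden groups with different Poisson slopes
set.seed(42)
n   <- 200
x   <- runif(n, 0, 2)
cls <- rbinom(n, 1, 0.5)                       # true (unobserved) group label
eta <- ifelse(cls == 1, 0.2 + 1.2 * x, 2.0 - 0.8 * x)
y   <- rpois(n, lambda = exp(eta))
d   <- data.frame(x = x, y = y)

# Fit a 2-component mixture of Poisson GLMs, only if flexmix is available
if (requireNamespace("flexmix", quietly = TRUE)) {
  library(flexmix)
  fit <- flexmix(y ~ x, data = d, k = 2,
                 model = FLXMRglm(family = "poisson"))
  print(parameters(fit))           # per-component intercept and slope (log scale)
  print(table(clusters(fit), cls)) # recovered components vs. simulated groups
}
```

The component labels are arbitrary, so component 1 in the fit need not correspond to cls == 1 in the simulation.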

so... why do you need latent variable modeling? i'm just asking because, at least in this province of academia, we've elevated latent variables to the pantheon of almost mystical creatures so i'm curious as to how other people use them or conceive of them...
 

jpkelley

TS Contributor
#3
Thanks, spunky. I reviewed the flexmix documentation. It looks promising. That said, your question forced me to examine my approach...again.

Essentially, my problem comes down to this. Every data set I'm now dealing with involves more than a set of exogenous variables. To make better sense of the data, I'm envisioning some sort of SEM approach that would enable me to test multiple causal scenarios that each include different sets of my exogenous/endogenous variables. My data sets have grown from small groups of factors that exhibited little or no collinearity to larger sets of 8-10 variables, and I'm quite certain (from the a priori knowledge I have about my study system) that some of the causal path segments are most likely indirect causal links. From the little bit of SEM experience I have, I understand that modeling latent variables will make my parameter estimates much more accurate (I'm guessing by the way they "correct" the error of each parameter estimate). This is, of course, my naive view. Mainly, I'm trying to approach SEM using GLMMs since each causal path might have a unique error structure and might be measured using different experimental designs.

A bad example: if I want to know if rainfall across a single year at two sites influences breeding activity (number of nests) either directly or indirectly through vegetation density, I might have measured vegetation density at each site using multiple subplots. Thus, I might want to include plot as a random factor for the path between vegetation and number of nests, while also accounting for the Poisson error structure. On the upstream side, since there might be only one rain gauge at each site, I would somehow model how rainfall each month influences vegetation AND how rainfall influences nest number (while including a latent variable representative of insect abundance or something).
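
A rough sketch of that rainfall example, fitted piecewise in base R: a linear model for the rainfall-to-vegetation path and a Poisson GLM for the paths into nest number. Everything here is hypothetical (simulated data, invented effect sizes, and the names rain, veg, nests, subplot), and it deliberately leaves out the plot random effect and the latent variable, so it only shows the skeleton of the idea.

```r
# Hypothetical simulated data: 2 sites x 12 months, one rain gauge per site,
# vegetation and nests recorded on 5 subplots per site
set.seed(1)
gauge      <- expand.grid(month = 1:12, site = c("A", "B"))
gauge$rain <- runif(nrow(gauge), min = 10, max = 120)   # monthly rainfall (mm)
plots      <- expand.grid(subplot = 1:5, month = 1:12, site = c("A", "B"))
d          <- merge(plots, gauge, by = c("month", "site"))

# Invented "true" effects, used only to generate the data
d$veg   <- 0.5 + 0.02 * d$rain + rnorm(nrow(d), sd = 0.5)
d$nests <- rpois(nrow(d), lambda = exp(-0.5 + 0.4 * d$veg + 0.002 * d$rain))

# Path 1: rainfall -> vegetation (Gaussian errors)
m_veg  <- lm(veg ~ rain, data = d)
# Path 2: vegetation -> nests and rainfall -> nests (Poisson errors, log link)
m_nest <- glm(nests ~ veg + rain, family = poisson, data = d)

a        <- unname(coef(m_veg)["rain"])    # rain -> veg
b        <- unname(coef(m_nest)["veg"])    # veg -> log(expected nests)
direct   <- unname(coef(m_nest)["rain"])   # rain -> log(expected nests)
indirect <- a * b                          # rain -> veg -> log(expected nests)

cat("direct effect of rain (log scale):  ", direct, "\n")
cat("indirect effect via veg (log scale):", indirect, "\n")
```

If lme4 is installed, the second path could instead be fitted with glmer(nests ~ veg + rain + (1 | site:subplot), family = poisson, data = d) to add a random intercept per subplot. The product a * b is only a simple summary of the indirect path on the log scale, assuming a linear vegetation model and no rain-by-vegetation interaction.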

Just trying to get some ideas about what's out there.
 

spunky

Can't make spagetti
#4
I'm quite certain (from the a priori knowledge I have about my study system) that some of the causal path segments are most likely indirect causal links. From the little bit of SEM experience I have, I understand that modeling latent variables will make my parameter estimates much more accurate (I'm guessing by the way they "correct" the error of each parameter estimate). This is, of course, my naive view.
well then i recommend the sem package for starters, and i guess you can expand to flexmix depending on which parameterisation you'd like to put on your data (although i guess if you're settling for poisson and the like flexmix might be the best way to go...) besides, path analysis is right up your alley, it was originally developed in biology (genetics)....

welcome to the land of "the-things-we-cannot-see-but-are-believed-to-exist" ... or, as my advisor puts it, "every latent variable is an act of faith"

are you familiar with bayesian networks? in my personal view, they're gonna be (or most likely already are) the next hot stuff that's gonna take over structural equation modeling...
 

jpkelley

TS Contributor
#5
OK, then SEM will be my first step, and I'll move to flexmix when I get frustrated. Of course, it sounds like flexmix will offer its own suite of frustrations. You're right, path analysis should be up my alley considering the Sewall Wright link, but it's surprising how few people in my field use the tool.

Not yet familiar with Bayesian networks, but you better believe I'm going to be looking into them now. Thanks for the heads-up. Never a dull moment, eh?

Thanks again, spunky. Your suggestions for the next step were precisely what I was looking for.