Association estimate (partial reverse temporality)


Less is more. Stay pure. Stay poor.
I have a project where I modeled variables associated with a target variable using logistic regression. My issue is that some covariates are causes of the target and some are effects of it. This seems bizarre, yet it is regularly done in medicine. Say I am looking at an injury: some variables associated with it are its causes and others are secondary signs of the injury, such as pain. If I throw them all in the model, is that fine, since I am looking for associations and not causality? Some covariates may be outcomes of both the causes and the target (colliders, and controlling for a collider can open up backdoor paths), so in that case the target is the mediator on the path. The final active set of covariates makes sense to clinicians (e.g., mechanism of injury is a fall, pain, etc.), but it feels weird analytically. I will post a directed graph to show the context.

@Jake - this is partially up your alley - any input?


No cake for spunky
What is the difference between a cause and an effect here? I am not sure what you mean. If you are saying there is two-way causation, then that violates key assumptions of any regression approach I know of.

I would look at SEM if you think that is occurring. It has methods to address this.


Less is more. Stay pure. Stay poor.
I haven't seen any "two-way" causation assumptions before. Can you elaborate? Statistical models are directionally agnostic. They don't know if it is Y = X or X = Y.

I will post a graphic tomorrow, but it is like I am trying to find predictors of a mediator using its parents and children. There is a thing called a Markov blanket: all you need to know about a variable is its direct causes and direct effects (plus any other direct causes of those effects). Anything further upstream or downstream is conditionally independent of it once you control for the blanket variables. However, there is some incest in my model, since the parent term is a direct cause of both the target variable and its child, so the parent's effect on the child is also mediated by the target. If this hasn't been written up, that would be a pseudo-interesting term to use, though slightly incorrect (since both direct and indirect effects flow from the parent).
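To make the blanket idea concrete, here is a minimal sketch that reads a Markov blanket off a list of directed edges. The node names `parent`, `target`, and `child` and the helper `markov_blanket` are made up for illustration; they only mirror the structure described above (parent causes target and child, target causes child).

```python
def markov_blanket(edges, node):
    """Return the Markov blanket of `node` in a DAG given as (cause, effect) pairs.

    The blanket is the node's parents, its children, and the other
    parents of its children.
    """
    parents = {a for a, b in edges if b == node}
    children = {b for a, b in edges if a == node}
    co_parents = {a for a, b in edges if b in children and a != node}
    return parents | children | co_parents


# Hypothetical graph matching the description: parent -> target -> child <- parent
edges = [("parent", "target"), ("target", "child"), ("parent", "child")]
blanket = markov_blanket(edges, "target")
print(sorted(blanket))  # ['child', 'parent']
```

In this small graph the parent shows up twice, once as a direct cause of the target and once as a co-parent of the child, which is the "incest" mentioned above.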

X -> Y -> Z <- X. I want to model Y, or find variables associated with it; there are no loops in this. Another way to think about it is that I am trying to model the target variable, whose relationship with Z is confounded by X.
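A small simulation may help show what happens when both the cause and the effect go into the logistic model. This is only a sketch under made-up assumptions: the coefficients, the sample size, and the Gaussian form of Z are all invented for illustration, and `fit_logistic` is a hand-rolled Newton-Raphson fit rather than any particular library's routine.

```python
import numpy as np


def fit_logistic(X, y, n_iter=25):
    """Fit logistic regression by Newton-Raphson; X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))   # fitted probabilities
        w = p * (1.0 - p)                        # Bernoulli variance weights
        grad = X.T @ (y - p)                     # score vector
        hess = X.T @ (X * w[:, None])            # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta


rng = np.random.default_rng(0)
n = 20000

# Assumed data-generating process for X -> Y -> Z <- X (all numbers are illustrative)
x = rng.normal(size=n)                                    # cause
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(-0.5 + 1.0 * x))))  # binary target
z = 0.8 * x + 1.5 * y + rng.normal(size=n)                # effect of both X and Y

ones = np.ones(n)
beta_x_only = fit_logistic(np.column_stack([ones, x]), y)     # [intercept, x]
beta_x_and_z = fit_logistic(np.column_stack([ones, x, z]), y)  # [intercept, x, z]

print("Y ~ X     :", np.round(beta_x_only, 2))
print("Y ~ X + Z :", np.round(beta_x_and_z, 2))
```

Under these assumed numbers, the log-odds of Y given X and Z work out analytically to -1.625 - 0.2*X + 1.5*Z, so the coefficient on X moves from about 1.0 to about -0.2 once Z is included. Both fits describe valid associations with Y, but the coefficient on X in the second model should not be read as the effect of X on Y.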