design question

noetsi

No cake for spunky
#1
We have a service that we are trying to evaluate. The dependent variable is income gain. The problem is that those who get the service may differ in many ways from those who don't (most of which we probably don't measure). For instance, someone who gets the service may have a more serious disability than those who don't, and this affects their income (we measure severity, but the measure is not nuanced enough to be truly effective). Random assignment is not a possibility: the data were collected in the past, and even if they had not been, we are a state agency and can't legally do this (I doubt RDD or interrupted time series is feasible either).

My best effort so far has been to find a list of those who are eligible for the service and compare those who got it with those who did not. But there are problems with this, including that those who need it might not be known to need it.

Any suggestions would be appreciated.
 

hlsmith

Less is more. Stay pure. Stay poor.
#2
If I were in your shoes, I would restrict to those eligible and run a propensity score model to get weights to use in your outcome model. As you mentioned, you may not have access to all the variables, but this would be better than a naive or conditional model. For the propensity score model, gradient boosted trees or ensembles are best, but logistic regression would be sufficient. For the outcome model you can just use a simple regression with weights derived from the propensity scores.

Once you get the propensity scores, many like to use the standardized (often called stabilized) inverse weights: take the inverse of each person's probability of being in the group they are actually in, multiplied by the prevalence of that group.
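
As a rough sketch of that weighting rule, assuming you already have an estimated propensity score for each person: the function name `stabilized_weights` and the four toy records below are made up for illustration, not taken from any real data.

```python
def stabilized_weights(treated, propensity):
    """Stabilized inverse probability weights.

    treated:    list of 0/1 flags (1 = got the service)
    propensity: estimated P(got the service | covariates) for each person
    """
    if len(treated) != len(propensity):
        raise ValueError("treated and propensity must have the same length")

    # Prevalence of treatment in the sample (share who got the service).
    prevalence = sum(treated) / len(treated)

    weights = []
    for t, p in zip(treated, propensity):
        if not 0.0 < p < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        if t == 1:
            # Treated: prevalence of treatment / probability of being treated.
            weights.append(prevalence / p)
        else:
            # Untreated: prevalence of non-treatment / probability of not being treated.
            weights.append((1.0 - prevalence) / (1.0 - p))
    return weights


# Toy, made-up values purely for illustration.
treated = [1, 1, 0, 0]
propensity = [0.8, 0.5, 0.5, 0.2]
weights = stabilized_weights(treated, propensity)
```

People whose group membership was "surprising" given their covariates (low probability of being where they are) get larger weights, which is what rebalances the two groups.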
 

noetsi

No cake for spunky
#3
If you were in my shoes you would quit and go find a less boring job. Oh and know what to do. :p

I have read about propensity scores but know little about them. Any suggestions on where to start? I have an interval dependent variable, and I don't know what you mean by an outcome model.
 

hlsmith

Less is more. Stay pure. Stay poor.
#4
You model the propensity scores, convert them to weights, and put those into the outcome model. The outcome model is typically a simple regression of the outcome on the exposure.
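
To make "outcome model" concrete, here is a minimal sketch of a weighted simple regression of the outcome on the exposure, written out by hand so nothing is hidden. The helper name `weighted_simple_regression`, the income figures, and the weights are all hypothetical; in practice you would more likely use a statistics package's weighted regression and robust standard errors.

```python
def weighted_simple_regression(x, y, w):
    """Weighted least squares fit of y = intercept + slope * x.

    x: exposure (here 0/1 for no service / service)
    y: outcome (e.g. income gain)
    w: weights derived from the propensity scores
    """
    total_w = sum(w)
    # Weighted means of exposure and outcome.
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total_w

    # Weighted covariance of x and y, and weighted variance of x.
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))

    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Toy, made-up values purely for illustration.
treated = [1, 1, 0, 0]
income_gain = [1200.0, 800.0, 500.0, 300.0]
weights = [0.625, 1.0, 1.0, 0.625]

intercept, slope = weighted_simple_regression(treated, income_gain, weights)
# With a single 0/1 exposure, the slope is the weighted mean outcome of the
# treated minus the weighted mean outcome of the untreated.
```

The slope is the weighted estimate of the service's effect on income gain; it is only as good as the covariates that went into the propensity model.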

Can you provide a histogram of the outcome variable?
 

noetsi

No cake for spunky
#5
The dependent variable is income two quarters after closure, so it takes many, many distinct values. I can try to build a histogram if that would be helpful.