Propensity Score


Hello there,
I am conducting a study on the relationship between age and pregnancy complications/birth outcomes. I have a small sample size (300 participants) and I am using multiple logistic regression to account for confounders. My advisor (a senior statistician) recommended that I compute propensity scores and include them in my logistic model, because she was worried that I might need to adjust for many confounders, which could make my confidence intervals very wide.
I tried reading more about the use of propensity scores in regression, but I am still not clear on how to do it or why I should do it. I would appreciate it if someone could help me with that.

Thank you so much
Hi NadaN, I believe you can calculate a propensity score (PS), i.e. the probability of being in a particular group given the covariates, and then either adjust for it (e.g. in a logistic regression) or restrict the comparison to individuals who have similar PS values.
There are a number of good reviews, e.g. Benedetto (Statistical primer: propensity score matching and its alternatives), whose abstract seems to answer your question about why you might want to use propensity scores rather than adjust via a multivariable model:
"Propensity score (PS) methods offer certain advantages over more traditional regression methods to control for confounding by indication
in observational studies. Although multivariable regression models adjust for confounders by modelling the relationship between covariates
and outcome, the PS methods estimate the treatment effect by modelling the relationship between confounders and treatment assignment.
Therefore, methods based on the PS are not limited by the number of events, and their use may be warranted when the number
of confounders is large, or the number of outcomes is small. The PS is the probability for a subject to receive a treatment conditional on a
set of baseline characteristics (confounders). The PS is commonly estimated using logistic regression, and it is used to match patients with
similar distribution of confounders so that difference in outcomes gives unbiased estimate of treatment effect"
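
To make the "adjust for it" option concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the data are simulated, the names `exposed`, `confounders` and `outcome` are hypothetical, and the exposure is a binary stand-in (the age question in this thread is continuous, which this sketch does not handle). The small `fit_logistic` helper is a plain Newton-Raphson fit written out so the example has no dependencies beyond NumPy; in practice you would probably use your usual stats package.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Fit a logistic regression by Newton-Raphson. X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        grad = X.T @ (y - p)
        hess = X.T @ (X * w[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Simulated data (assumption): 300 subjects, 5 confounders, binary exposure and outcome.
rng = np.random.default_rng(0)
n = 300
confounders = rng.normal(size=(n, 5))
lin_exposure = confounders @ np.array([0.5, -0.3, 0.2, 0.0, 0.4])
exposed = rng.binomial(1, 1.0 / (1.0 + np.exp(-lin_exposure)))
lin_outcome = -1.5 + 0.7 * exposed + confounders @ np.array([0.4, 0.3, 0.0, -0.2, 0.1])
outcome = rng.binomial(1, 1.0 / (1.0 + np.exp(-lin_outcome)))

# Step 1: propensity score = P(exposed | confounders), estimated by logistic regression.
X_ps = np.column_stack([np.ones(n), confounders])
ps = 1.0 / (1.0 + np.exp(-X_ps @ fit_logistic(X_ps, exposed)))

# Step 2: outcome model with the exposure and the single PS covariate
# in place of the five separate confounders.
X_out = np.column_stack([np.ones(n), exposed, ps])
beta_out = fit_logistic(X_out, outcome)
odds_ratio = np.exp(beta_out[1])
print(f"PS-adjusted odds ratio for exposure: {odds_ratio:.2f}")
```

The point of the design is that the outcome model spends one parameter on the PS instead of one per confounder, which matters when an outcome has few events.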


Less is more. Stay pure. Stay poor.
@NadaN -

What is your full outcome model? I am assuming there are more covariates than just age. As mentioned by @statlimp, you calculate the probability of being in a particular group (traditionally using logistic regression), then put this weight in your outcome model, which balances the baseline covariates that may confound the estimates from your outcome model. But to do this you need a variable of interest. Age is continuous. You can use age, but it gets much, much more confusing with a non-categorical variable. You could use age groups, but that would be frowned on, since you would be butchering a continuous variable and losing its full relationship with the outcome variable.
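
A rough sketch of the weighting idea, in Python, assuming a binary group variable (not continuous age). The data are simulated and the names `x`, `exposed` and `mean_diff` are hypothetical; the weights shown are the common inverse-probability-of-treatment form, which is one choice among several. The sketch only checks that weighting balances the covariate between groups; the weights would then go into a weighted outcome model, which is not shown here.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Fit a logistic regression by Newton-Raphson. X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        beta = beta + np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (y - p))
    return beta

# Simulated data (assumption): one confounder x that drives group membership.
rng = np.random.default_rng(1)
n = 300
x = rng.normal(size=n)
exposed = rng.binomial(1, 1.0 / (1.0 + np.exp(-x)))

# Propensity score from a logistic regression of group on the confounder.
X_ps = np.column_stack([np.ones(n), x])
ps = 1.0 / (1.0 + np.exp(-X_ps @ fit_logistic(X_ps, exposed)))

# Inverse probability weights: 1/PS for the exposed, 1/(1 - PS) for the unexposed.
weights = np.where(exposed == 1, 1.0 / ps, 1.0 / (1.0 - ps))

def mean_diff(values, group, w):
    """Weighted mean of `values` in group 1 minus weighted mean in group 0."""
    m1 = np.average(values[group == 1], weights=w[group == 1])
    m0 = np.average(values[group == 0], weights=w[group == 0])
    return m1 - m0

diff_before = mean_diff(x, exposed, np.ones(n))
diff_after = mean_diff(x, exposed, weights)
print(f"Confounder mean difference, unweighted: {diff_before:.3f}")
print(f"Confounder mean difference, weighted:   {diff_after:.3f}")
```

If the weighted difference is not close to zero for your real covariates, that is usually a sign the PS model needs rethinking before you trust any weighted estimate.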

Moreover, tell us more about your planned model. Your advisor may have misspoken, or you may not have relayed an accurate interpretation of their comment. I am happy to help; just provide an update or more information.

Hello @statlimp and @hlsmith,
Thank you for taking the time to answer my question.
Here is the table I made for my variables:
[Attachment: Screen Shot 2018-12-23 at 11.23.05 AM.png]
Some of the outcomes have few observations. For example, UTI is present in only 40 women.

I hope this clarifies my question more.



Less is more. Stay pure. Stay poor.
Propensity scores are a viable option when you have a single variable whose effect you want to unconfound and estimate. It is not apparent that you have this scenario!!!