Number of variables that can be included in an exact logistic regression

#1
I'm currently analyzing a dichotomous outcome in a small dataset of only 100 individuals. The outcome occurs in only 25 of them, and I have some predetermined confounders that I would like to adjust for. I know that I am limited to, at most, 2-3 variables if I were to fit an ordinary binary logistic regression. I have read that exact logistic regression may be preferred when the outcome is relatively rare, although I don't know whether that recommendation assumes a large sample size.

Is an exact logistic regression an appropriate test?
If I use an "exact" logistic regression can I include more confounders?
Is there a number of outcome events below which an exact logistic regression is preferred? (If it matters, assume a similar sample size of 100 individuals.)

Thanks for your help,

Kar
 
#2
Hi,
I recently ran some logistic regression analyses with 3 categorical predictors in SAS. I ran into quasi-complete separation problems when I started to include interactions, and tried to use exact tests to deal with them. SAS ran out of memory before it could finish the calculations, despite running on a fast Core i5 ASUS laptop that is about 2 years old. In "Logistic Regression Using SAS" (2nd ed.) by Allison, he runs PROC LOGISTIC with 2 predictor variables and reports that the default maximum likelihood analysis took 0.1 seconds while the exact analysis took 3.5 minutes on the same data. So exact tests are computationally intensive, and I'm guessing you might run into trouble as you add more predictor variables. No harm in trying, though.
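
For reference, here is a minimal sketch of what an exact analysis request looks like in PROC LOGISTIC; the dataset and variable names (mydata, y, a, b) are placeholders, not taken from either of our analyses:

proc logistic data=mydata descending;
   class a b / param=ref;        /* categorical predictors */
   model y = a b;                /* the usual ML fit is still reported */
   exact a b / estimate=both;    /* exact tests plus exact parameter and odds ratio estimates */
run;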

Maximum likelihood logistic regression is asymptotically unbiased (good with large N), but its behavior in small samples can be poor. Exact tests are a good way to deal with small samples, but I don't know of a cutoff for what counts as too small. I can say, though, that it's not the total sample size that matters so much as the number of cases in the less frequent outcome category. Here is a guess at how you could decide whether your sample is small enough to worry (I only use SAS, so sorry if this doesn't help with different software):

By default, SAS reports confidence intervals for odds ratios that are based on Wald chi-square statistics, which are sensitive to small sample sizes. Profile-likelihood confidence intervals are recommended for small samples. Request both types of confidence intervals, and if they don't agree well, you may be in small-sample territory. You can request both by adding "CLODDS=BOTH" after the slash in your MODEL statement in PROC LOGISTIC, as in the sketch below.
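
Concretely, with placeholder dataset and variable names, that request looks like this:

proc logistic data=mydata descending;
   model y = x1 x2 x3 / clodds=both;   /* report both Wald and profile-likelihood CIs for the odds ratios */
run;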

Another thing to consider for small samples is penalized likelihood (Firth's method). This is a variant of maximum likelihood that is less biased in small samples. When my interaction tests didn't work and the exact tests ran out of memory, penalized likelihood resolved my separation problems and allowed the analysis to converge. You can request it by adding "FIRTH" after the slash in your MODEL statement.
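
Again with placeholder names, a Firth fit with an interaction might look like this; CLPARM=PL just adds profile-likelihood CIs for the parameters, which are often suggested alongside the Firth correction:

proc logistic data=mydata descending;
   class a b / param=ref;
   model y = a b a*b / firth clparm=pl;   /* Firth's penalized likelihood; copes better with separation */
run;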

Hope this helps!