I'm currently working on a churn modelling exercise that uses logistic regression. The challenge I'm facing is that the model is not returning good results. My hypothesis is that the churn rate in my data set is too low for logistic regression to pick up anything meaningful.

Some context:

- The sample population is 250,000, of which about 0.3% (approx. 700) churn each month, over a 3-month period

- Predictors include education, marital status, gender, income level, etc., most of which require dummy coding

My questions are therefore:

- Is the number of churners in the data set too small for any meaningful logistic regression?

- If that's the case, would it be OK to remove some of the non-churner records from the sample population to create a data set that contains a larger proportion of churners?

- Is there an ideal ratio of churners to non-churners in the data set that allows for a meaningful logistic regression?
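
To make the second question concrete, here is a rough sketch of the kind of downsampling I have in mind. The data is synthetic (one month with 700 churners out of 250,000), and the field name `churned`, the function name, and the 4:1 ratio are placeholders I made up for illustration, not something I have settled on.

```python
import random


def downsample_non_churners(records, non_churners_per_churner, seed=0):
    """Keep every churner and a random subset of the non-churners.

    `records` is a list of dicts with a (hypothetical) boolean key "churned".
    """
    churners = [r for r in records if r["churned"]]
    non_churners = [r for r in records if not r["churned"]]
    # Never ask for more non-churners than actually exist.
    n_keep = min(len(non_churners), non_churners_per_churner * len(churners))
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    return churners + rng.sample(non_churners, n_keep)


# Synthetic stand-in for one month of data: 250,000 customers, 700 churners.
population = [{"customer_id": i, "churned": i < 700} for i in range(250_000)]

sample = downsample_non_churners(population, non_churners_per_churner=4)
churn_rate = sum(r["churned"] for r in sample) / len(sample)
print(len(sample), churn_rate)
```

With these placeholder numbers the reduced data set has 3,500 records and a 20% churn rate, instead of roughly 0.3% in the full population. Whether a model fitted on such a sample is still meaningful is exactly what I'm unsure about.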

Help much appreciated!