# Poisson regression or transform data

#### Csmithni

##### New Member
Hi all,

I’m trying to do some analysis of the extent to which private schools are located in poor rural districts in Uganda.

So far, I have run a Poisson regression with the number of private schools on the left-hand side and, on the right-hand side, the % of the district population living in poverty, the % of the district population living in rural areas, the district school-aged population, and the number of government schools.

I’ve chosen Poisson because the left-hand variable is count data. Now I’m wondering whether I would be better off transforming that variable into the number of private schools per 1,000 school-aged district population and using linear regression instead. I guess the main reason would be the easier interpretation of the coefficients. I’m very new to regression so would appreciate some advice. Should I be looking to do some test to see whether the transformed data would have a normal distribution?

#### hlsmith

##### Less is more. Stay pure. Stay poor.
General comments: an attribute of the Poisson distribution (and so of Poisson regression) is that the mean and variance are equal. A basic rule of thumb is that if the mean count of the dependent variable is 8 or larger, the Poisson distribution may be approximated by the normal distribution. If you think about it, if you had a mean of 8 or more schools and the variance was 8, the standard deviation would be around 3, and around 99% of the expected data would be above 0. So worrying about impossible negative estimates would be almost moot, and a linear regression model would be a reasonable alternative. Or you can just use Poisson regression: why not simply employ the appropriate model? Its selection, though, depends heavily on the dispersion attribute I mentioned (mean equal to variance).
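To make the arithmetic behind that rule of thumb concrete, here is a minimal sketch using only the Python standard library. It assumes a count that is exactly Poisson with mean 8 (an idealised case, not the Uganda data) and compares the exact chance of a positive count with the chance under a normal distribution that has the same mean and variance. The function names are made up for illustration.

```python
import math

def poisson_prob_positive(mean: float) -> float:
    """P(X > 0) for X ~ Poisson(mean), which is 1 - exp(-mean)."""
    return 1.0 - math.exp(-mean)

def normal_prob_positive(mean: float) -> float:
    """P(Y > 0) for Y ~ Normal(mean, variance=mean), the matching normal approximation."""
    sd = math.sqrt(mean)
    z = (0.0 - mean) / sd
    # Standard normal CDF evaluated at z, written with the error function.
    cdf_at_zero = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return 1.0 - cdf_at_zero

mean_count = 8.0                 # assumed mean count (the rule-of-thumb threshold)
sd = math.sqrt(mean_count)       # Poisson: variance equals the mean
p_pois = poisson_prob_positive(mean_count)
p_norm = normal_prob_positive(mean_count)

print(f"standard deviation: {sd:.2f}")
print(f"P(count > 0), Poisson: {p_pois:.4f}")
print(f"P(value > 0), normal approximation: {p_norm:.4f}")
```

Under these assumptions the standard deviation comes out a little under 3, and both probabilities are above 99%, which is why negative predictions are rarely a practical concern at that mean.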

#### Csmithni

##### New Member
Thank you so much for your reply. The mean of private schools per district is 5.16 and the variance is 27.6 so perhaps I have missed a key point here. Would negative binomial regression be more appropriate here?

Perhaps!
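As a rough check on the figures quoted above (mean 5.16, variance 27.6), the sketch below computes the variance-to-mean ratio, which would be close to 1 for Poisson data. It also computes a method-of-moments value for the dispersion parameter under the common negative binomial form with variance = mean + alpha * mean^2. Two caveats: these are marginal summaries that ignore the covariates, whereas the Poisson regression assumption concerns the variance conditional on the predictors, and the variable names are only illustrative.

```python
# Summary statistics quoted in the thread for private schools per district.
mean_count = 5.16
variance = 27.6

# Under a Poisson distribution this ratio would be about 1.
dispersion_ratio = variance / mean_count

# Method-of-moments alpha, assuming variance = mean + alpha * mean**2 (NB2 form).
alpha_mom = (variance - mean_count) / mean_count**2

print(f"variance / mean: {dispersion_ratio:.2f}")
print(f"method-of-moments alpha: {alpha_mom:.2f}")
```

A ratio of roughly 5 suggests the raw counts are overdispersed relative to Poisson, which is consistent with considering negative binomial regression. Fitting both models and comparing them on the actual data would be the more reliable test.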