Comparing lower and upper quartiles while adjusting for covariates (Public Health)

Hi Everyone - I am new to Talk Stats and grateful for the help.

I am working on a public health project and trying to assess the degree to which a certain risk factor (let's call it Soda) influences the health outcome of interest (we'll say Diabetes). Note: this is not the actual study, but it should be close enough to answer my question. My group and I have run a number of other analyses, including multivariate (MV) linear regression, stepwise regression and regional analyses, and all have supported our hypothesis that Soda can increase diabetes.

The data are ecological, at the US county level, and come from survey datasets. So we have the average amount of soda consumed per week (1 can, 1.1 cans, 10 cans, 11.5 cans, etc.) and the % of diabetics by county. We also have a number of covariates for known diabetes risk factors, such as obesity, education, income, certain ethnicities, etc. Also, both the diabetes and soda data are normally distributed.

I wanted to do an additional test to show more precisely how much soda results in an increased diabetes prevalence (aside from inferring it from the beta of the multivariate linear regression). To do this, I split the Soda data into quartiles (<5 cans, 5-10, 11-15, 16+) and compared the mean diabetes prevalence of the upper quartile with that of the lower quartile, in effect saying that increasing soda intake by about 10 cans results in a 33% increase in the diabetes rate (from a mean county rate of, say, 6% in the <5 cans counties to 8% in the 16+ cans counties, which is a 33% relative increase). I checked the statistical significance of the difference between the two groups of counties with both 95% CIs and a t-test.
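For anyone who wants to reproduce the unadjusted comparison described above, here is a minimal sketch using only the Python standard library. The county values are invented for illustration (they are not from the study), the cut points are the ones quoted in the post, and the helper name welch_t is my own. It computes the absolute and relative difference in mean prevalence between the lowest and highest soda groups, plus Welch's t statistic.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical county records: (mean cans of soda per week, % diabetic).
# These numbers are made up purely to illustrate the calculation.
counties = [
    (1.0, 5.5), (2.5, 6.0), (4.0, 6.5), (3.0, 6.0),      # < 5 cans
    (6.0, 6.5), (8.0, 7.0), (9.5, 7.0),                  # 5-10 cans
    (11.0, 7.5), (13.0, 7.5), (15.0, 8.0),               # 11-15 cans
    (16.5, 7.5), (18.0, 8.0), (20.0, 8.5), (17.0, 8.0),  # 16 or more cans
]

# Keep only the two extreme groups, using the cut points from the post.
low = [prev for cans, prev in counties if cans < 5]
high = [prev for cans, prev in counties if cans >= 16]


def welch_t(a, b):
    """Return Welch's t statistic and approximate degrees of freedom for mean(b) - mean(a)."""
    va = stdev(a) ** 2 / len(a)  # squared standard error of mean(a)
    vb = stdev(b) ** 2 / len(b)  # squared standard error of mean(b)
    t = (mean(b) - mean(a)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


absolute_diff = mean(high) - mean(low)         # in percentage points
relative_increase = absolute_diff / mean(low)  # as a fraction of the low-group mean
t_stat, df = welch_t(low, high)

print(f"absolute difference: {absolute_diff:.2f} percentage points")
print(f"relative increase:   {relative_increase:.1%}")
print(f"Welch t = {t_stat:.2f} on about {df:.1f} degrees of freedom")
```

Note that going from 6% to 8% is a 2 percentage point absolute difference and a 33% relative increase; it helps to report both so readers do not confuse them. This sketch does not adjust for any covariates.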

I have three questions regarding this. 1 - Is there any way to statistically justify this comparison? 2 - Is there any way to do this comparison while adjusting for the various aforementioned covariates (ethnicity, education, etc.)? 3 - Is there any OTHER way that I am not thinking of to get at the same result?



TS Contributor
Use logistic regression. The outcome is diabetes, and one of your covariates is the exposure cut into quartiles; if you like, you can include other covariates as well.
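A rough sketch of what that could look like with county-level data, in plain Python. This assumes you have, for each county, the number of survey respondents and the number reporting diabetes, which the original post does not say; all numbers below are invented, obesity stands in for the full covariate list, and design_row, solve and fit_logistic are my own hypothetical helper names. In practice you would more likely use a statistics package; the hand-rolled Newton-Raphson fit is only there to keep the example self-contained.

```python
from math import exp

# Hypothetical grouped data, invented for illustration. One row per county:
# (soda group 0-3, obesity rate, number of diabetics, number of respondents).
counties = [
    (0, 0.25, 28, 500), (0, 0.30, 33, 500),
    (1, 0.27, 32, 500), (1, 0.32, 37, 500),
    (2, 0.28, 36, 500), (2, 0.33, 40, 500),
    (3, 0.30, 39, 500), (3, 0.35, 44, 500),
]


def design_row(group, obesity):
    """Intercept, dummies for soda groups 1-3 (group 0 is the reference), obesity."""
    return [1.0] + [1.0 if group == g else 0.0 for g in (1, 2, 3)] + [obesity]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= factor * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def fit_logistic(rows, iterations=25):
    """Newton-Raphson fit of a binomial logistic regression on grouped counts."""
    X = [design_row(group, obesity) for group, obesity, _, _ in rows]
    dim = len(X[0])
    beta = [0.0] * dim
    for _ in range(iterations):
        grad = [0.0] * dim
        hess = [[0.0] * dim for _ in range(dim)]
        for x, (_, _, cases, n) in zip(X, rows):
            mu = 1.0 / (1.0 + exp(-sum(b * v for b, v in zip(beta, x))))
            for i in range(dim):
                grad[i] += x[i] * (cases - n * mu)  # score contribution
                for j in range(dim):
                    hess[i][j] += x[i] * x[j] * n * mu * (1.0 - mu)  # information
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta


beta = fit_logistic(counties)
# Coefficient 3 is the dummy for the top soda group versus the bottom group,
# so its exponential is the obesity-adjusted odds ratio for that contrast.
odds_ratio_top_vs_bottom = exp(beta[3])
print(f"adjusted odds ratio, top vs bottom soda group: {odds_ratio_top_vs_bottom:.2f}")
```

The exponentiated coefficient on the top-group dummy is the odds ratio of diabetes for the highest soda group relative to the lowest, holding obesity fixed. Adding more covariates means adding more columns to design_row.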