Dose Response and logistic regression

#1
Hello, and thank you for your time. This post is fairly long so that you can get an accurate picture.

I'm investigating the association between a (lifetime mean) vitamin level and the occurrence of a disease (yes/no).

I am using a data set of 4600 patients (after cleaning), of whom 160 developed the disease and the rest did not.

I ran a logistic regression with disease status (yes/no) as the dependent variable. The independent variables were the covariates I adjusted for, chosen using a stepwise approach and the clinical literature. I coded my variable of interest as a dichotomous variable for each patient (1 = insufficient, 0 = sufficient) using the accepted cutoff. I then did the same for deficient vs. sufficient, and again for severely deficient vs. sufficient, each time using the same model and always comparing against sufficient levels.

The odds ratios were always significant, and the more severe the deficiency, the larger the odds ratio (Exp(B) was bigger each time).
=====
Since this looks like a convincing trend, I want to show a dose response. How would you do that, both analytically and graphically?
=====
The only idea I came up with is this: taking 75 and above as normal, instead of making dichotomous variables for just three levels (insufficient vs. sufficient, deficient vs. sufficient, and severely deficient vs. sufficient), I would do so for many levels, say every 5 nmol interval. For each one I would record the odds ratio from the same adjusted logistic regression, and then show a scatter plot with the odds ratio on the y-axis and the vitamin level on the x-axis. Since the odds ratios increase gradually, it should look nice. I could also calculate a regression line, R-squared, and so on. How does that sound? Any other ideas?
====
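The banding idea described above can be sketched roughly as follows. This is only a minimal illustration, assuming the data are available as (vitamin level, disease yes/no) pairs. It computes crude, unadjusted odds ratios per 5 nmol band against the pooled group at 75 and above, whereas the post uses an adjusted logistic regression, so the numbers would differ. The function names and the toy records are hypothetical and not taken from the real data set.

```python
from collections import defaultdict

REFERENCE_CUTOFF = 75  # nmol; "75 and above are normal" per the post
BAND_WIDTH = 5         # nmol; the interval width proposed in the post


def band_start(level):
    """Lower edge of the 5-nmol band a level falls in (e.g. 62.3 -> 60)."""
    return int(level // BAND_WIDTH) * BAND_WIDTH


def crude_odds_ratios(records):
    """records: iterable of (vitamin_level, has_disease) pairs.

    Returns {band_start: odds_ratio} for each band below the cutoff,
    each compared with the pooled reference group (level >= 75).
    Bands with a zero cell are skipped, since the ratio is undefined.
    """
    cases = defaultdict(int)
    controls = defaultdict(int)
    for level, has_disease in records:
        key = "ref" if level >= REFERENCE_CUTOFF else band_start(level)
        if has_disease:
            cases[key] += 1
        else:
            controls[key] += 1

    if cases["ref"] == 0 or controls["ref"] == 0:
        raise ValueError("reference group needs both cases and controls")
    ref_odds = cases["ref"] / controls["ref"]

    result = {}
    # Drop the reference key before sorting so only numeric bands remain.
    for key in sorted(k for k in set(cases) | set(controls) if k != "ref"):
        if cases[key] == 0 or controls[key] == 0:
            continue
        result[key] = (cases[key] / controls[key]) / ref_odds
    return result


# Toy records, invented purely to exercise the function.
toy_records = (
    [(80, True)] + [(80, False)] * 9
    + [(62, True)] * 2 + [(62, False)] * 8
    + [(41, True)] * 3 + [(41, False)] * 6
)
toy_ratios = crude_odds_ratios(toy_records)
for start, ratio in toy_ratios.items():
    print(f"{start}-{start + BAND_WIDTH} nmol: OR = {ratio:.2f}")
```

The resulting band-to-odds-ratio mapping is what would go on the proposed scatter plot (band on the x-axis, odds ratio on the y-axis). With only 160 cases, many narrow bands are likely to contain very few cases, which is why the sketch skips bands with an empty cell.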

Thank you

David
 

hlsmith

Less is more. Stay pure. Stay poor.
#2
Are you able to fit a proportional hazards model? Logistic regression looks at a cross-section of time, neglecting the increase in risk with time.

The dose-response between categories is likely not uniform, so I would just use the group with the lowest odds as the reference group and compare the odds of the two other categories to it. You can plot these odds ratios with corrected (e.g., Bonferroni) confidence intervals. I would not try to make too many inferences, since I am guessing the data are self-reported and not based on sequential bioassays across time, and there are likely plenty of potential confounders. I would imagine three categories are sufficient as well; if you use too many, you'll get sparsity in the subgroups, the confidence intervals will blow up, and you could end up with complete data separation (convergence issues).
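As a rough illustration of the Bonferroni-corrected intervals mentioned above, here is a minimal sketch. It assumes crude 2x2 counts and Wald intervals on the log odds ratio, not the adjusted model from the original post. The function name and the counts are invented for the example; the counts only happen to add up to 4600 patients and 160 cases.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def bonferroni_or_intervals(reference, groups, alpha=0.05):
    """Wald intervals for odds ratios versus a shared reference group.

    reference: (cases, non_cases) for the reference category.
    groups: {name: (cases, non_cases)} for the categories compared to it.
    alpha is split across the len(groups) comparisons (Bonferroni).
    Returns {name: (odds_ratio, lower, upper)}.
    """
    ref_cases, ref_non = reference
    # Two-sided critical value at the Bonferroni-adjusted level.
    z = NormalDist().inv_cdf(1 - alpha / (2 * len(groups)))
    out = {}
    for name, (cases, non) in groups.items():
        log_or = log((cases * ref_non) / (non * ref_cases))
        # Standard error of the log odds ratio for a 2x2 table.
        se = sqrt(1 / cases + 1 / non + 1 / ref_cases + 1 / ref_non)
        out[name] = (exp(log_or), exp(log_or - z * se), exp(log_or + z * se))
    return out


# Invented counts for illustration only, not the poster's data.
example = bonferroni_or_intervals(
    reference=(20, 1480),
    groups={"insufficient": (40, 1560), "deficient": (100, 1400)},
)
for name, (odds_ratio, low, high) in example.items():
    print(f"{name}: OR={odds_ratio:.2f}, CI=({low:.2f}, {high:.2f})")
```

Each extra category makes the adjusted critical value larger and the per-group counts smaller, so the intervals widen on both counts, which is the sparsity concern raised above.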