p-trend

#1
I have data on waist circumference that is divided into quartiles, and we are measuring a variable called e' for each of these quartiles (described in the table below). I need help doing the following in Stata:
1. Calculate a p-value for this variable (e') across the quartiles, and then
2. adjust for various other variables like age, sex, race, baseline SBP, DBP, HR, prior h/o DM, trial, treatment assignment, change in SBP, and E’
3. I also need to calculate p-trend
Thanks for your help.
Gopi

Waist circumference quartiles
--------------------------------------------------------------------------------------
           <26.6     26.6-29.9   29.9-33.9   >33.9      P for trend   Adj. P value
--------------------------------------------------------------------------------------
Mean E’    7.3±1.1   7.6±1.2     7.6±1.5     7.5±1.20
--------------------------------------------------------------------------------------
 
#2
Why is it that people feel such a compulsion to bin? Binning just throws away data! Don't bin, and this whole problem reduces to a very straightforward multiple linear regression, which gives you a clear, clean answer, with error bars, quantifying how much an increase in each independent variable contributes to an increase in the dependent variable, controlling for all the other variables.
 

Link

Ninja say what!?!
#3
I hope you don't get offended, Ichbin. I think there ARE times when binning is beneficial, though. I'd have to consider whether this is one of them.

However, if you talk to a doctor, s/he would tell you that a BMI over 25 is considered clinically overweight, a BMI over 30 is considered obese, and a BMI over 40 is considered morbidly obese. When conducting the analyses, I think it'd be more beneficial to the reader to bin, so that they have a comparison of "normal" body mass to obese body mass.
 
#4
I'm not offended, but I would argue that the right time to form such bins is after the data analysis, when you are getting ready to present your results to patients. Then you can apply some pre-determined objective rule to set the bin borders (e.g. where the measured factor creates a <25% risk, >75% risk, and in between). But setting the bins beforehand at round numbers is totally arbitrary, needlessly throws away information, and (as you can see in this case) actually complicates the analysis.
 

Link

Ninja say what!?!
#5
Good point there. I agree that setting bins is somewhat arbitrary. I'm kinda iffy about setting bins after the data analysis if you know beforehand that there are standard cutoff points and that you'll be using them when you present your findings anyway. However, I do understand the efficiency that's lost. I guess it comes down to weighing the pros and cons.

Going to the original question of interest though...
First off, I want to verify that you know what a p-value is. What does it mean? What would it mean in your scenario? Assuming that you do know the answer to that question (and please answer in your reply post so that we can correct and guide you if you're wrong), if the data has already been binned, it looks like ANOVA would work for your question 1.

For question 2, I'd set up a linear regression model and indicators for each group. I would then add in the indicators for each of three of the groups to compare to the fourth that I choose as my base group. This would allow me to adjust for the other variables.
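
To make the indicator setup concrete, here is a minimal sketch in Python (not Stata) with made-up data. Everything in it is an assumption for illustration: the helper names `ols` and `design_row`, the single covariate (age), the choice of quartile 1 as the base group, and the coefficients used to generate the outcome. It only shows that each indicator coefficient is the adjusted difference between that quartile and the base quartile.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def design_row(quartile, age, base=1):
    """Intercept, one 0/1 indicator per non-base quartile, then the covariate."""
    dummies = [1.0 if quartile == q else 0.0 for q in (1, 2, 3, 4) if q != base]
    return [1.0] + dummies + [float(age)]


# Hypothetical data: two subjects per quartile, outcome built from known coefficients
quartiles = [1, 1, 2, 2, 3, 3, 4, 4]
ages = [50, 60, 55, 65, 45, 70, 52, 58]
true_coef = [7.0, 0.5, 0.3, 0.1, -0.02]  # intercept, Q2, Q3, Q4 (each vs Q1), age

X = [design_row(q, a) for q, a in zip(quartiles, ages)]
y = [sum(b * x for b, x in zip(true_coef, row)) for row in X]

coef = ols(X, y)  # recovers the generating coefficients, since y has no noise
print([round(c, 4) for c in coef])
```

In Stata the same model would typically be fit with factor-variable notation, something like `regress eprime i.wcq age` (the variable names `eprime` and `wcq` are placeholders), with the remaining adjustment variables added to the list.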

For your question 3, I'm still unfamiliar with the term trend p-value. I've heard it here and there, but no one in my university really ever uses this term. Could you clarify what you mean by this? If you're just trying to determine whether there's a trend across the quartiles, I'd just use correlation. Try this link: http://talkstats.com/showthread.php?t=4715
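
One common reading of "p for trend" (an assumption here, since the term wasn't defined in the question) is the test on the slope when the quartile number 1-4 is entered as a single numeric score. With no other covariates this is equivalent to testing the correlation between the score and the outcome. A minimal Python sketch with toy numbers follows. The function name `trend_test` and the data are hypothetical, and only the t statistic is computed; the p-value comes from a t distribution with n - 2 degrees of freedom, which any stats package reports.

```python
import math


def trend_test(scores, outcome):
    """Slope and t statistic from regressing the outcome on an ordinal score."""
    n = len(scores)
    mean_x = sum(scores) / n
    mean_y = sum(outcome) / n
    sxx = sum((x - mean_x) ** 2 for x in scores)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(scores, outcome))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(scores, outcome))
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    return slope, slope / se_slope


# Toy data (hypothetical): one observation per quartile score
scores = [1, 2, 3, 4]
outcome = [1.0, 2.0, 3.0, 5.0]

slope, t_stat = trend_test(scores, outcome)
print(slope, t_stat)
```

For the adjusted version, the same numeric score would go into the regression alongside the other covariates in place of the separate indicators, and the p-value on its coefficient is read as the p for trend.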
 

Dason

Ambassador to the humans
#6
I agree that if you think a linear trend is appropriate then there is no reason to bin. If there's some other sort of trend that you can justify then go ahead and model that with the continuous variable. But what if you don't know what the trend will be? Or you do but you can't model it very well? Binning in this case allows us to break free from the restriction of the linearity that we would be imposing on ourselves before. Now I know there are ways around this that allow you to keep the variable continuous but sometimes it's just easier/saner to bin and be done with it.