Logistic regression with continuous variables

rbody

New Member
#1
I have levels of eight biochemical markers (continuous variables) and want to determine whether each marker helps to predict the presence or absence of a certain diagnosis.

Thus when I enter each marker into a logistic regression analysis I can determine the odds ratio for a 1 unit increase in marker levels.

For some markers a 1 unit increase is massive in proportion to the standard deviation whereas for other markers it is negligible.

Is anyone aware of a suitable way that I could transform my data in order to make the odds ratios more meaningful?

I have toyed with the following but each seems to have its limitations:

a). Divide biomarker levels into quintiles/tertiles (thus converting the variable to an ordinal scale)
b). Divide levels of all/selected biomarkers by 10

I noticed in the literature that one group used a log base 2 (log2) transformation of their continuous variables, but I wasn't sure how valid this approach would be.

I'd really appreciate any advice/suggestions!
 
#2
You want approach b, or at least a modification of it.

You don't have to divide all of your predictors by 10--just the ones that make sense to do so. It is totally legitimate to multiply some predictors by 100 and others by 10, for example. It won't change your results, p-values, etc. at all--just how the odds ratios are interpreted.
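The point that rescaling changes only the interpretation can be checked with a small simulation. This is a rough sketch, not anyone's actual analysis: the marker values, the coefficients used to simulate the diagnosis, and the hand-written fitter `fit_logistic` are all made up for illustration (in practice you would use your usual statistics package).

```python
import math
import random


def fit_logistic(x, y, iters=25):
    """Fit logit P(y=1) = b0 + b1*x by Newton-Raphson.

    Returns the slope b1 and its standard error (from the inverse
    information matrix at the last iterate before the final update).
    """
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p           # score for the intercept
            g1 += (yi - p) * xi    # score for the slope
            h00 += w               # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + inverse(information) * score
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b1, math.sqrt(h00 / det)


# Hypothetical data: one marker and a simulated yes/no diagnosis.
random.seed(1)
marker = [random.gauss(50, 15) for _ in range(500)]
diagnosis = [
    1 if random.random() < 1.0 / (1.0 + math.exp(-(-5.0 + 0.1 * m))) else 0
    for m in marker
]

# Fit once on the raw marker and once on the marker divided by 10.
b1_raw, se_raw = fit_logistic(marker, diagnosis)
b1_div10, se_div10 = fit_logistic([m / 10 for m in marker], diagnosis)

z_raw = b1_raw / se_raw        # Wald z statistic, raw scale
z_div10 = b1_div10 / se_div10  # Wald z statistic, rescaled

print("odds ratio per 1 unit:  ", math.exp(b1_raw))
print("odds ratio per 10 units:", math.exp(b1_div10))
print("z statistics:", z_raw, z_div10)
```

Dividing the marker by 10 multiplies both the coefficient and its standard error by 10, so the z statistic (and hence the p-value) is the same in both fits; only the unit the odds ratio refers to has changed.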

Taking approach a will just throw away information in your predictor. There is no way to interpret a regression parameter for a predictor that is treated as ordinal. Predictors need to be entered as either numerical or nominal.

Taking the log of X (a predictor) is used either to change the shape of the relationship between X and Y, so that it is no longer linear in X itself, or to pull in outliers in X.
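On the log base 2 transformation mentioned in the original question: one consequence worth knowing is that a 1-unit increase in log2(X) corresponds to a doubling of X, so the odds ratio from such a model reads as "per doubling of the marker". A minimal sketch of that arithmetic follows; the marker levels and the coefficient `beta_log2` are hypothetical numbers chosen only for illustration.

```python
import math

# Hypothetical marker levels on very different scales.
levels = [0.5, 3.0, 40.0, 1200.0]

# Doubling any level raises log2(level) by exactly one unit.
steps = [math.log2(2 * x) - math.log2(x) for x in levels]
print(steps)

# Hypothetical coefficient from a model fitted on log2(marker).
beta_log2 = 0.35
or_per_doubling = math.exp(beta_log2)
print("odds ratio per doubling of the marker:", or_per_doubling)
```

Whether the transformation is appropriate still depends on the shape of the relationship and on outliers, as described above; the sketch only shows how the resulting odds ratio would be read.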

Karen
 

rbody

New Member
#3
Thanks, Karen - that's really helpful!

Is it reasonable for my decision about which variables to multiply/divide to be arbitrary? Or would you use set criteria regarding the size of (for example) the standard deviation in order to help decide? (e.g. variables with sd above a stated level should be divided by 10)

Rick.
 
#4
Hi Rick,

You should multiply them so that a one unit change makes sense.

Here is a really simple example that I use in my logistic regression workshop.

Y = whether or not a student goes on academic probation after the first semester in college

X1 = high school GPA (measured on a 4-point scale)
X2 = SAT score

If you don't rescale either predictor, neither odds ratio makes sense. A one-unit change in GPA is huge-- a 0.1-unit change makes much more sense.

Likewise, a one-unit change in SAT score is too small. In fact, the way it's scored, you couldn't have two SAT scores one unit apart--they go by 10's.

I don't remember the odds ratio for GPA in my example, but for SAT score it's something like 1.03. Tiny, but significant, because a one-unit difference in SAT score is tiny. But if you divide SAT score by 10, 10 units become 1 unit, so the odds ratio is now for a 10-point difference in SAT score.

Likewise, you could multiply GPA by 10 (essentially changing it from a 4 to a 40 point scale). Now a 1 unit change is meaningful.
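The rescaled odds ratios can also be worked out directly from the original ones, because the odds ratio for a c-unit change is the 1-unit odds ratio raised to the power c. A short sketch, using the approximate SAT value of 1.03 quoted above and a purely hypothetical GPA odds ratio (the real one isn't given in this thread):

```python
import math

# SAT: odds ratio per 1 point, approximate value quoted above.
or_sat_per_point = 1.03
# After dividing SAT by 10, one unit is 10 points.
or_sat_per_10_points = or_sat_per_point ** 10
print("SAT, per 10 points:", round(or_sat_per_10_points, 2))  # about 1.34

# GPA: hypothetical odds ratio per 1 full GPA point (made up for illustration).
or_gpa_per_point = 0.20
# After multiplying GPA by 10, one unit is 0.1 GPA points.
or_gpa_per_tenth = or_gpa_per_point ** 0.1
print("GPA, per 0.1 points:", round(or_gpa_per_tenth, 2))

# Same thing on the coefficient scale: beta for c units is c * beta.
beta_sat = math.log(or_sat_per_point)
assert math.isclose(math.exp(10 * beta_sat), or_sat_per_10_points)
```

So an odds ratio that looks negligible per SAT point is a more readable figure per 10 points, with no change to the fitted model itself.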

The meaning of the scale is more important than the standard deviation.

Does that help?

Karen