Logistic regression cutoff at 0.3

#1
Hi,

I have built a logistic regression model with a misclassification rate of around 18%. When I look at the ROC curve (area under the curve is 0.76) and compute the accuracy at various cutoff points, the best accuracy is at a probability cutoff of 0.3. Is 0.3 acceptable? Shouldn't the chosen cutoff be greater than 0.5?

Thanks
 

Dason

Ambassador to the humans
#2
Well, if you're going to say that the cutoff should always be greater than .5, then realize that if you recoded your outcome so that 0 was success and 1 was failure, the same cutoff would now be less than .5. So just saying "it should be > .5" doesn't make any sense. If you're asking whether it makes sense for it to be that far away from .5, then why not?
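
To make the recoding point concrete, here is a small Python sketch. The probabilities are made-up numbers for illustration, not output from the model in the question, and `classify` is a hypothetical helper. It shows that a 0.3 cutoff on P(success) gives the same decisions as a 0.7 cutoff on P(failure) once the outcome is recoded.

```python
def classify(probs, cutoff):
    """Label 1 when the predicted probability of class 1 is at least the cutoff."""
    return [1 if p >= cutoff else 0 for p in probs]

# Hypothetical predicted probabilities of "success" (coded 1).
p_success = [0.05, 0.25, 0.35, 0.60, 0.90]
labels = classify(p_success, 0.3)

# Recode the outcome: failure is now coded 1, so the predicted probability is 1 - p.
p_failure = [1 - p for p in p_success]

# The same decision rule on the recoded scale: predict failure when P(failure) > 0.7.
labels_recoded = [1 if q > 0.7 else 0 for q in p_failure]

# Flip the recoded labels back to the original coding for comparison.
flipped = [1 - y for y in labels_recoded]
print(labels == flipped)
```

So whether a cutoff sits above or below .5 depends only on which outcome happens to be coded as 1.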
 

hlsmith

Less is more. Stay pure. Stay poor.
#3
If you slap confidence bands on it, or test that it is not equal to 0.5, and see that it is different, then traditionally people flip the outcome groups so that the curve is convex instead of concave.

But as Dason wrote it can go either way.
 
#4
Maybe this link can be of help.

I think of a 'safe' traffic junction where the death risk is almost zero, and compare that with a very dangerous place where the death rate is, say, 3% = 0.03. Although that is much lower than 0.50, you don't want to play Russian roulette by driving there, where the death risk is about one in 33 (≈0.03). So a reasonable cutoff value would be much lower than 0.03 (which is also much lower than 0.50).
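
One way to put numbers on that intuition is to attach a cost to each kind of mistake. This is my own framing, not something from the link, and the costs below are invented for illustration. If a false alarm costs `cost_fp` and a missed event costs `cost_fn`, flagging a case has the lower expected cost whenever p * cost_fn > (1 - p) * cost_fp, which rearranges to p > cost_fp / (cost_fp + cost_fn). A rough Python sketch, with `cost_based_cutoff` as a hypothetical helper name:

```python
def cost_based_cutoff(cost_fp, cost_fn):
    """Probability above which flagging a case has lower expected cost.

    Derived from p * cost_fn > (1 - p) * cost_fp.
    Both costs are assumed to be positive numbers chosen by the analyst.
    """
    return cost_fp / (cost_fp + cost_fn)

# Equal costs for both kinds of error give the familiar 0.5.
equal_costs = cost_based_cutoff(1, 1)

# Assumed example: missing a dangerous junction is 99 times worse than a false alarm.
unequal_costs = cost_based_cutoff(1, 99)

print(equal_costs, unequal_costs)
```

Under this assumption 0.5 is only the natural cutoff when both errors are equally bad; the more serious a missed event is, the lower the cutoff goes.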


I used to think about the "cut-off-issue" as choosing a value on a continuous scale to classify something, like choosing a centimeter value to classify someone as "tall" or "short". For example in allergy medicine there is a need in clinical practice to classify persons as "allergic" or "not-allergic". That classification is taken from a continuous measure that can be from zero to very high values. But that is another kind of cut off issue.