Logistic Regression and Decision Tree Produced Different Results

#1
Dear all,

I am running both logistic regression and decision tree (CHAID) analysis for some data.

The logistic regression output told me which variables were most important.
I expected CHAID to put those same variables at the top of the tree,
since I was told a decision tree puts the most important variables at the top.

Unexpectedly, it didn't: the decision tree produced a different result.
Can anybody explain why?

Thank you in advance.

Jack
 

hlsmith

Not a robit
#2
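Well, how different were the two? And how are you defining "important" in the logistic regression (coefficient size, p-value, something else)?

Keep in mind that a decision tree behaves like a collection of interaction terms: each path from the root to a leaf combines every variable it splits on along the way. Your logistic regression likely didn't include all the interactions, pairwise and higher, up to the order of the number of variables in the dataset. The result may also depend on the criteria used for splits and on how the tree buckets the variables.

Why do you want a tree? I bet if you looked at the "Variable Importance" metrics from a random forest, they would be a little closer to the logistic results.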

P.S. Logistic regression is considered a parametric model and a decision tree a nonparametric one, so the two are different to start with.
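
Here is a small sketch of how the two rankings can disagree, using only the Python standard library. Everything in it is an assumption for illustration: the cell counts are made up (not Jack's data), `fit_logistic` and `chi_square` are hypothetical helpers, and the chi-square statistic only stands in for the kind of association test CHAID uses to pick its first split, not a full CHAID implementation. In the toy data, `x1` has a modest monotone effect, while `x2` has a U-shaped effect (high outcome rate at levels 0 and 2, low at level 1).

```python
import math

# Hypothetical aggregated data: (x1, x2) -> (n_obs, n_positive).
cells = {
    (0, 0): (100, 70), (0, 1): (100, 10), (0, 2): (100, 70),
    (1, 0): (100, 90), (1, 1): (100, 30), (1, 2): (100, 90),
}

def fit_logistic(cells, lr=0.5, n_iter=5000):
    """Main-effects logistic regression, fitted by plain gradient ascent
    on the mean log-likelihood. x2 enters as a single linear term."""
    total = sum(n for n, _ in cells.values())
    b = [0.0, 0.0, 0.0]  # intercept, x1, x2
    for _ in range(n_iter):
        grad = [0.0, 0.0, 0.0]
        for (x1, x2), (n, pos) in cells.items():
            row = (1.0, x1, x2)
            z = sum(bj * xj for bj, xj in zip(b, row))
            p = 1.0 / (1.0 + math.exp(-z))
            resid = pos - n * p  # observed minus expected positives
            for j in range(3):
                grad[j] += resid * row[j]
        b = [bj + lr * g / total for bj, g in zip(b, grad)]
    return b

def chi_square(cells, var_index):
    """Pearson chi-square of one predictor (by key position) vs. the outcome."""
    by_level = {}
    for key, (n, pos) in cells.items():
        tot_n, tot_pos = by_level.get(key[var_index], (0, 0))
        by_level[key[var_index]] = (tot_n + n, tot_pos + pos)
    total_n = sum(n for n, _ in by_level.values())
    total_pos = sum(pos for _, pos in by_level.values())
    rate = total_pos / total_n
    stat = 0.0
    for n, pos in by_level.values():
        exp_pos, exp_neg = n * rate, n * (1 - rate)
        stat += (pos - exp_pos) ** 2 / exp_pos
        stat += ((n - pos) - exp_neg) ** 2 / exp_neg
    return stat

b0, b1, b2 = fit_logistic(cells)
chi_x1 = chi_square(cells, 0)
chi_x2 = chi_square(cells, 1)
print(f"logistic coefficients: x1={b1:.3f}, x2={b2:.3f}")
print(f"chi-square vs outcome: x1={chi_x1:.1f}, x2={chi_x2:.1f}")
```

On this toy data the main-effects logistic fit gives `x1` a coefficient of about 0.85 and `x2` a coefficient of essentially zero, while the chi-square statistic is 25 for `x1` and 200 for `x2`. So the logistic model would call `x1` the important variable, whereas a chi-square-based tree would split on `x2` first. Neither is wrong; they are measuring different things.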