Dear All,

I have a pretty hard time of interpreting the result generated by the commands rpart(). For example, below is my code of using the rpart() for a classification tree. I grow a tree called fit, and then I summary(tree):

fit <- rpart(Kyphosis ~ Age + Number + Start,

method="class", data=kyphosis)

summary(fit)

> summary(fit) # detailed summary of splits

Call:

rpart(formula = Kyphosis ~ Age + Number + Start, data = kyphosis,

method = "class")

n= 81

CP nsplit rel error xerror xstd

1 0.17647059 0 1.0000000 1.0000000 0.2155872

2 0.01960784 1 0.8235294 0.8823529 0.2056488

3 0.01000000 4 0.7647059 0.8823529 0.2056488

Variable importance

Start Age Number

64 24 12

Node number 1: 81 observations, complexity param=0.1764706

predicted class=absent expected loss=0.2098765 P(node) =1

class counts: 64 17

probabilities: 0.790 0.210

left son=2 (62 obs) right son=3 (19 obs)

Primary splits:

Start < 8.5 to the right, improve=6.762330, (0 missing)

Number < 5.5 to the left, improve=2.866795, (0 missing)

Age < 39.5 to the left, improve=2.250212, (0 missing)

Surrogate splits:

Number < 6.5 to the left, agree=0.802, adj=0.158, (0 split)

Node number 2: 62 observations, complexity param=0.01960784

predicted class=absent expected loss=0.09677419 P(node) =0.7654321

class counts: 56 6

probabilities: 0.903 0.097

left son=4 (29 obs) right son=5 (33 obs)

Primary splits:

Start < 14.5 to the right, improve=1.0205280, (0 missing)

Age < 55 to the left, improve=0.6848635, (0 missing)

Number < 4.5 to the left, improve=0.2975332, (0 missing)

Surrogate splits:

Number < 3.5 to the left, agree=0.645, adj=0.241, (0 split)

Age < 16 to the left, agree=0.597, adj=0.138, (0 split)

Node number 3: 19 observations

predicted class=present expected loss=0.4210526 P(node) =0.2345679

class counts: 8 11

probabilities: 0.421 0.579

Node number 4: 29 observations

predicted class=absent expected loss=0 P(node) =0.3580247

class counts: 29 0

probabilities: 1.000 0.000

Node number 5: 33 observations, complexity param=0.01960784

predicted class=absent expected loss=0.1818182 P(node) =0.4074074

class counts: 27 6

probabilities: 0.818 0.182

left son=10 (12 obs) right son=11 (21 obs)

Primary splits:

Age < 55 to the left, improve=1.2467530, (0 missing)

Start < 12.5 to the right, improve=0.2887701, (0 missing)

Number < 3.5 to the right, improve=0.1753247, (0 missing)

Surrogate splits:

Start < 9.5 to the left, agree=0.758, adj=0.333, (0 split)

Number < 5.5 to the right, agree=0.697, adj=0.167, (0 split)

Node number 10: 12 observations

predicted class=absent expected loss=0 P(node) =0.1481481

class counts: 12 0

probabilities: 1.000 0.000

Node number 11: 21 observations, complexity param=0.01960784

predicted class=absent expected loss=0.2857143 P(node) =0.2592593

class counts: 15 6

probabilities: 0.714 0.286

left son=22 (14 obs) right son=23 (7 obs)

Primary splits:

Age < 111 to the right, improve=1.71428600, (0 missing)

Start < 12.5 to the right, improve=0.79365080, (0 missing)

Number < 3.5 to the right, improve=0.07142857, (0 missing)

Node number 22: 14 observations

predicted class=absent expected loss=0.1428571 P(node) =0.1728395

class counts: 12 2

probabilities: 0.857 0.143

Node number 23: 7 observations

predicted class=present expected loss=0.4285714 P(node) =0.08641975

class counts: 3 4

probabilities: 0.429 0.571

1. According to the complexity table, the smallest xerror is generated by the 4 splits or 5 nodes. However, the summary table lists the node number from 1 to 23. Why is this?

2. If we select the tree by one standard error rule, the third tree has the smallest xerror that is 0.8823529, and its standard error is 0.2056488. The xerror of the firs tree that is non-split is located within the one standard error of the third tree. So is the best model that a tree without splitting? This sounds really strange, but it is correct by the one standard error rule.

3. Why are the trees produced by rpart() and tree() different? I understand this is a big topic, but could you give me some resources that I can read and find the answer?

Thank you so much for reading my questions.

I have a pretty hard time of interpreting the result generated by the commands rpart(). For example, below is my code of using the rpart() for a classification tree. I grow a tree called fit, and then I summary(tree):

fit <- rpart(Kyphosis ~ Age + Number + Start,

method="class", data=kyphosis)

summary(fit)

> summary(fit) # detailed summary of splits

Call:

rpart(formula = Kyphosis ~ Age + Number + Start, data = kyphosis,

method = "class")

n= 81

CP nsplit rel error xerror xstd

1 0.17647059 0 1.0000000 1.0000000 0.2155872

2 0.01960784 1 0.8235294 0.8823529 0.2056488

3 0.01000000 4 0.7647059 0.8823529 0.2056488

Variable importance

Start Age Number

64 24 12

Node number 1: 81 observations, complexity param=0.1764706

predicted class=absent expected loss=0.2098765 P(node) =1

class counts: 64 17

probabilities: 0.790 0.210

left son=2 (62 obs) right son=3 (19 obs)

Primary splits:

Start < 8.5 to the right, improve=6.762330, (0 missing)

Number < 5.5 to the left, improve=2.866795, (0 missing)

Age < 39.5 to the left, improve=2.250212, (0 missing)

Surrogate splits:

Number < 6.5 to the left, agree=0.802, adj=0.158, (0 split)

Node number 2: 62 observations, complexity param=0.01960784

predicted class=absent expected loss=0.09677419 P(node) =0.7654321

class counts: 56 6

probabilities: 0.903 0.097

left son=4 (29 obs) right son=5 (33 obs)

Primary splits:

Start < 14.5 to the right, improve=1.0205280, (0 missing)

Age < 55 to the left, improve=0.6848635, (0 missing)

Number < 4.5 to the left, improve=0.2975332, (0 missing)

Surrogate splits:

Number < 3.5 to the left, agree=0.645, adj=0.241, (0 split)

Age < 16 to the left, agree=0.597, adj=0.138, (0 split)

Node number 3: 19 observations

predicted class=present expected loss=0.4210526 P(node) =0.2345679

class counts: 8 11

probabilities: 0.421 0.579

Node number 4: 29 observations

predicted class=absent expected loss=0 P(node) =0.3580247

class counts: 29 0

probabilities: 1.000 0.000

Node number 5: 33 observations, complexity param=0.01960784

predicted class=absent expected loss=0.1818182 P(node) =0.4074074

class counts: 27 6

probabilities: 0.818 0.182

left son=10 (12 obs) right son=11 (21 obs)

Primary splits:

Age < 55 to the left, improve=1.2467530, (0 missing)

Start < 12.5 to the right, improve=0.2887701, (0 missing)

Number < 3.5 to the right, improve=0.1753247, (0 missing)

Surrogate splits:

Start < 9.5 to the left, agree=0.758, adj=0.333, (0 split)

Number < 5.5 to the right, agree=0.697, adj=0.167, (0 split)

Node number 10: 12 observations

predicted class=absent expected loss=0 P(node) =0.1481481

class counts: 12 0

probabilities: 1.000 0.000

Node number 11: 21 observations, complexity param=0.01960784

predicted class=absent expected loss=0.2857143 P(node) =0.2592593

class counts: 15 6

probabilities: 0.714 0.286

left son=22 (14 obs) right son=23 (7 obs)

Primary splits:

Age < 111 to the right, improve=1.71428600, (0 missing)

Start < 12.5 to the right, improve=0.79365080, (0 missing)

Number < 3.5 to the right, improve=0.07142857, (0 missing)

Node number 22: 14 observations

predicted class=absent expected loss=0.1428571 P(node) =0.1728395

class counts: 12 2

probabilities: 0.857 0.143

Node number 23: 7 observations

predicted class=present expected loss=0.4285714 P(node) =0.08641975

class counts: 3 4

probabilities: 0.429 0.571

**My questions are:**1. According to the complexity table, the smallest xerror is generated by the 4 splits or 5 nodes. However, the summary table lists the node number from 1 to 23. Why is this?

2. If we select the tree by one standard error rule, the third tree has the smallest xerror that is 0.8823529, and its standard error is 0.2056488. The xerror of the firs tree that is non-split is located within the one standard error of the third tree. So is the best model that a tree without splitting? This sounds really strange, but it is correct by the one standard error rule.

3. Why are the trees produced by rpart() and tree() different? I understand this is a big topic, but could you give me some resources that I can read and find the answer?

Thank you so much for reading my questions.

Last edited: