# p-values in regression trees

#### rogojel

##### TS Contributor
Hi,
I am analyzing some pretty hopeless datasets where the link between the DVs and IVs is quite weak. I have observed, however, that when I take the two groups resulting from the first partition in the tree, I can generally get a nicely low p-value with a t-test. Is this some property of trees, I wonder? Is there a theorem pointing in this direction, or am I possibly detecting a weak signal?

regards

#### Dason

I don't quite understand what you're asking.

#### hlsmith

##### Less is more. Stay pure. Stay poor.
Is the program kicking out p-values at your splits (partitions), and are you finding that, within a certain subgroup, the outcome differs significantly between the groups at the second level?

A pictorial example would be great. Is this coming from a single decision tree that you have run a couple of times?

#### rogojel

##### TS Contributor
Hi,
yes, it is a single tree and a continuous DV. Imagine that I have the first partition, which gives me two subsets: one where the partition condition is TRUE (e.g. Volume > 5) and one where the condition is FALSE. If I take the two subsets and run a t-test on them, like

t.test(dataset[condition,]$dv, dataset[!condition,]$dv)

I always get a low p-value (< 0.05). My question is whether this is to be expected, as the normal behavior of partitions, or whether it is something one might consider a signal.

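Now that I think of it, it looks like a case of multiple comparisons: the tree picks the best of many candidate splits, and the t-test is then run on the split that was chosen precisely because it separates the DV most.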
Regards
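
One way to check the multiple-comparisons suspicion is a small simulation on pure noise. The sketch below is only an illustration under stated assumptions: `best_split` and `one_run` are hypothetical helpers written for this example (not taken from any tree package), the predictors are uniform, the DV is standard normal and independent of them, and the root split is found CART-style by minimising the within-group sum of squares. It compares the naive t-test on the data used to find the split with the same test on a held-out half.

```r
# Pure-noise simulation: how often is the first tree split "significant"?
set.seed(1)

# Hypothetical helper: exhaustive search for the root split that minimises
# the within-group sum of squares of y (the usual regression-tree criterion).
best_split <- function(X, y, min_leaf = 10) {
  best <- list(sse = Inf, var = NA, cut = NA)
  for (j in seq_len(ncol(X))) {
    xs <- sort(unique(X[, j]))
    cuts <- (head(xs, -1) + tail(xs, -1)) / 2  # midpoints between values
    for (cut in cuts) {
      left <- X[, j] <= cut
      if (sum(left) < min_leaf || sum(!left) < min_leaf) next
      sse <- sum((y[left] - mean(y[left]))^2) +
        sum((y[!left] - mean(y[!left]))^2)
      if (sse < best$sse) best <- list(sse = sse, var = j, cut = cut)
    }
  }
  best
}

# Hypothetical helper: one simulated dataset in which y is unrelated to X.
one_run <- function(n = 200, p = 5) {
  X <- matrix(runif(n * p), n, p)
  y <- rnorm(n)
  train <- seq_len(n) <= n / 2  # first half finds the split

  s <- best_split(X[train, , drop = FALSE], y[train])
  condition <- X[, s$var] <= s$cut

  # Naive: t-test on the same rows that were used to choose the split.
  naive <- t.test(y[train & condition], y[train & !condition])$p.value

  # Held-out: apply the chosen split to rows the search never saw.
  a <- y[!train & condition]
  b <- y[!train & !condition]
  holdout <- if (length(a) >= 2 && length(b) >= 2) {
    t.test(a, b)$p.value
  } else {
    NA_real_  # too few held-out rows on one side to run the test
  }

  c(naive = naive, holdout = holdout)
}

p_values <- replicate(200, one_run())  # 2 x 200 matrix of p-values
rates <- rowMeans(p_values < 0.05, na.rm = TRUE)
print(rates)
```

If the selection explanation is right, the `naive` rejection rate should come out far above the nominal 5% even though there is no signal at all, while the `holdout` rate should stay close to 5%. On real data the same idea applies: a split that still separates the DV on rows that were not used to grow the tree is better evidence of a signal than the in-sample t-test.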