negative R² in output of randomForest package

alexa_cgn

New Member
Hi everyone,
I am currently writing my master's thesis about random forests and have just started working with R.
When I run my model, the output looks like this:
Mean of squared residuals: 0.0002441535
% Var explained: -8.82

Can anyone explain to me why I get a negative R²? I always thought that a negative R² is not possible...

I would appreciate any help!

Thank you!

Alexandra

hlsmith

Less is more. Stay pure. Stay poor.
Yeah, you can't square a number and get a negative number (you are on track with that conclusion).

Dason

For things outside of a linear model it's possible to get a negative R^2 since the way that it gets calculated in those situations doesn't guarantee that it will be positive.

noetsi

No cake for spunky
Substantively it is a meaningless number. Obviously you can't have less than 0 explained variance. It would be like a chemical that broke down before it entered solution (a negative pH).

rogojel

TS Contributor
hi,
just my two cents:

The formula, if I recall it correctly, is R² = 1 - (variance of residuals / total variance). In an ordinary least-squares regression with an intercept, the residuals cannot have a larger variance than the total on the data the model was fitted to, so R² is never negative.

I could imagine that, in the case of random forests, the way the residual variance is estimated (whatever that is for random forests) can in some cases lead to a larger number than the estimate of the total variance.
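
To make that concrete, here is a minimal sketch in plain Python (not R, and not the package's own code). It assumes the reported figure is 100 × (1 - MSE / variance of y), with the MSE taken from predictions on observations the model did not see during fitting, which is my understanding of how randomForest's "% Var explained" is derived from out-of-bag predictions. The function name `pseudo_r2` and the toy numbers are made up for illustration.

```python
from statistics import fmean


def pseudo_r2(y_true, y_pred):
    """Return 1 - MSE / Var(y), using the divide-by-n variance of y."""
    n = len(y_true)
    mean_y = fmean(y_true)
    # Mean squared error of the predictions.
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    # Variance of the response around its own mean.
    var_y = sum((t - mean_y) ** 2 for t in y_true) / n
    return 1.0 - mse / var_y


y = [1.0, 2.0, 3.0, 4.0, 5.0]

# Always predicting the mean of y gives exactly zero.
r2_mean = pseudo_r2(y, [3.0] * 5)

# Predictions worse than the mean push the value below zero.
r2_bad = pseudo_r2(y, [5.0, 4.0, 3.0, 2.0, 1.0])

# Predictions close to the truth give a value near one.
r2_good = pseudo_r2(y, [1.1, 1.9, 3.2, 3.8, 5.1])

print(r2_mean, r2_bad, r2_good)
```

Under that assumption, a value such as -8.82 % would mean the predictions have a somewhat larger mean squared error than simply predicting the mean of the response for every observation.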

regards

noetsi

No cake for spunky
Also, I think the formula for adjusted R² does allow negative values, but that simply means your model has no predictive value.
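
For reference, the usual textbook definition of adjusted R² for a linear model with n observations and p predictors (the symbols n and p are mine, not from the randomForest output) is:

$$
\bar{R}^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}
$$

Rearranging, this is negative whenever R² < p / (n - 1), so a weak model with many predictors relative to the sample size can end up below zero.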

hlsmith

Less is more. Stay pure. Stay poor.
Looking up the source documentation for the R package would be very handy!