negative R² in output of randomForest package

#1
Hi everyone,
I am currently writing my master's thesis about random forests and have just started to work with R.
When I run my model, the output looks like this:
Mean of squared residuals: 0.0002441535
% Var explained: -8.82

Can anyone explain to me why I get a negative R²? I always thought that a negative R² was not possible...

I would appreciate any help!

Thank you!

Alexandra
 

Dason

Ambassador to the humans
#3
For things outside of a linear model it's possible to get a negative R², since the way it gets calculated in those situations doesn't guarantee that it will be non-negative.
 

noetsi

Fortran must die
#4
Substantively it is a meaningless number. Obviously you can't have less than 0 explained variance. It would be like a chemical that broke down before it entered solution (a negative pH).
 

rogojel

TS Contributor
#5
hi,
just my two cents:

The formula, if I recall it correctly, is R² = 1 - (variance of residuals / total variance). In an ordinary least-squares regression with an intercept, the residuals on the fitted data cannot have a larger variance than the total, so R² will never be negative.

I could imagine that, in the case of random forests, the way the residual variance is estimated (whatever that is for random forests) can in some cases lead to a larger number than the estimate of the total variance.
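
To make that concrete, here is a small sketch in plain Python (not the randomForest code itself). It assumes the quantity is computed as 1 - MSE / Var(y), and it uses a hypothetical stand-in for the model: each point is predicted by the mean of all the other points, so every prediction comes from data that did not include that point. As far as I know, randomForest reports something in the same spirit, based on out-of-bag predictions, but please check the package documentation before relying on that. The function names below are made up for this illustration.

```python
def r_squared(y, predictions):
    """1 - (mean squared residual) / (variance of y), both dividing by n."""
    n = len(y)
    mean_y = sum(y) / n
    mse = sum((yi - pi) ** 2 for yi, pi in zip(y, predictions)) / n
    total_var = sum((yi - mean_y) ** 2 for yi in y) / n
    return 1.0 - mse / total_var


def in_sample_mean_predictions(y):
    """Predict every point with the mean of ALL points (fitted on the same data)."""
    mean_y = sum(y) / len(y)
    return [mean_y] * len(y)


def held_out_mean_predictions(y):
    """Predict each point with the mean of the OTHER points (point is held out)."""
    total = sum(y)
    n = len(y)
    return [(total - yi) / (n - 1) for yi in y]


y = [1.0, 2.0, 3.0, 4.0, 5.0]

r2_in = r_squared(y, in_sample_mean_predictions(y))
r2_out = r_squared(y, held_out_mean_predictions(y))

print("in-sample R^2:", r2_in)    # 0.0
print("held-out  R^2:", r2_out)   # -0.5625
```

The in-sample version can never do worse than the mean, so it bottoms out at 0. The held-out version has no such guarantee: a model that has learned nothing predicts slightly worse than the overall mean, and the formula then gives a negative number.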

regards
 

noetsi

Fortran must die
#6
Also, I think the formula for adjusted R-squared does allow negative values, but that simply means your model has no predictive value.
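
A quick sketch of that point, assuming the usual textbook definition: adjusted R² = 1 - (1 - R²)(n - 1)/(n - p - 1), with n observations and p predictors. The function name and the numbers are only illustrative.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 for n observations and p predictors (intercept not counted)."""
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# A weak model: ordinary R^2 is small but positive.
adj = adjusted_r_squared(r2=0.05, n=20, p=3)
print(adj)  # negative: the penalty for 3 predictors outweighs the 5% fit
```

The penalty factor (n - 1)/(n - p - 1) is greater than 1 whenever p > 0, so a small enough R² gets pushed below zero.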