# calculating goodness of fit and choosing the right model without plotting

#### paul15071992

##### New Member
I'm trying to estimate the value of an object based on its characteristics. To do this, I'm building the regression model price ~ . from similar objects, with at least 20 observations for each variable.

I ran into the following problem: none of the regression models worked for all my data, so I decided to try all of the following methods:
Code:
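    library(robustbase)  # ltsReg(), lmrob()
    library(robust)      # lmRob()
    # `dat` is a placeholder name for the data frame holding price and the predictors
    model.lm <- lm(price ~ ., data = dat)
    model.lmLog <- lm(log(price) ~ ., data = dat)
    model.ltsReg <- ltsReg(price ~ ., data = dat)
    model.ltsRegLog <- ltsReg(log(price) ~ ., data = dat)
    model.lmrob <- lmrob(price ~ ., data = dat)
    model.lmrobLog <- lmrob(log(price) ~ ., data = dat)
    model.lmRob <- lmRob(price ~ ., data = dat)
    model.lmRobLog <- lmRob(log(price) ~ ., data = dat)
    model.glm <- glm(price ~ ., data = dat)
    model.glmLog <- glm(price ~ ., family = gaussian(link = "log"), data = dat)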
My question is: how can I decide which of these models fits the current data best, without plotting the results?

As far as I know, R-squared isn't trustworthy here: because the data is corrupted, the R-squared will be corrupted too.

Any ideas?

Thank you!

#### consuli

##### Member
> none of the regression models worked for all my data
> (...)
> because the data is corrupted

What does that mean? Does your data contain missing values, or something else? Your question cannot be answered until you have explained the problem with your data in detail.

#### paul15071992

##### New Member
By corrupted data, I mean data which contains outliers.

#### consuli

##### Member
Maybe

Spearman Rank Correlation or
Gaussian Copula Correlation

#### paul15071992

##### New Member
could you please be more explicit?
thank you!

#### consuli

##### Member
Well, I have only heard of these as measures of association for corrupted data. I do not know anything further, sorry.

#### paul15071992

##### New Member
does anyone know if BIC and AIC are suitable for this job?

#### consuli

##### Member
> does anyone know if BIC and AIC are suitable for this job?

It depends. BIC and AIC are based on maximum likelihood, which in turn rests on an assumed distribution. The more probability the assumed distribution puts in the tails, the less sensitive the maximum likelihood will be to outliers.

Imho the sensitivity to outliers follows this ranking:
1. Spearman Rank Correlation (least sensitive)
2. Gaussian Copula Correlation
3. Maximum-likelihood-based measure whose assumed distribution has a lot of probability in the tails
4. Maximum-likelihood-based measure whose assumed distribution has little probability in the tails (most sensitive)

Here the standard Pearson correlation is a special case of a maximum-likelihood-based goodness-of-fit measure with medium probability in the tails (the assumed normal distribution has kurtosis = 3).
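
To illustrate the two ends of that ranking, here is a minimal sketch on simulated data (the variables `x`, `y` and `y_out` are made up for illustration and are not the poster's data). It computes Pearson and Spearman correlations with base R's `cor()` before and after a single gross outlier is planted in an otherwise clean linear relationship.

```r
set.seed(1)

# Clean, almost perfectly linear relationship (simulated, illustrative only)
x <- 1:30
y <- 2 * x + rnorm(30)

# Same data with one gross outlier planted at the last observation
y_out <- y
y_out[30] <- -500

# Pearson uses the raw values, Spearman only their ranks
pearson_clean  <- cor(x, y,     method = "pearson")
spearman_clean <- cor(x, y,     method = "spearman")
pearson_out    <- cor(x, y_out, method = "pearson")
spearman_out   <- cor(x, y_out, method = "spearman")

print(round(c(pearson_clean  = pearson_clean,
              spearman_clean = spearman_clean,
              pearson_out    = pearson_out,
              spearman_out   = spearman_out), 3))
```

In this setup the single outlier should be enough to flip the sign of the Pearson correlation, while the Spearman correlation stays clearly positive because the outlier only moves one rank. The same idea could be applied to the correlation between observed and fitted prices of each candidate model, though that is a suggestion rather than an established selection rule.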

In any case, AIC and BIC are not standardized goodness-of-fit measures, so you can only compare models fitted to the same dataset, not against another study, for instance.
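
A minimal sketch of such a within-dataset comparison, assuming simulated data (the data frame `dat` and its columns `size` and `age` are hypothetical stand-ins for the real predictors). It covers only the maximum-likelihood fits from the list above; the robust fits are not maximum-likelihood fits and are left out here. One caveat is built in: a model for `log(price)` has its likelihood on a different scale than a model for `price`, so its AIC is shifted by `2 * sum(log(price))` (the change-of-variables term) before the values are compared.

```r
set.seed(42)

# Simulated data, illustrative only: prices with multiplicative noise
n <- 100
dat <- data.frame(size = runif(n, 20, 200), age = runif(n, 0, 50))
dat$price <- exp(3 + 0.01 * dat$size - 0.005 * dat$age + rnorm(n, sd = 0.2))

# Three maximum-likelihood fits on the same dataset
fit_lm     <- lm(price ~ size + age, data = dat)
fit_glmlog <- glm(price ~ size + age, family = gaussian(link = "log"), data = dat)
fit_lmlog  <- lm(log(price) ~ size + age, data = dat)

# AIC for the models whose response is price itself
aic_lm     <- AIC(fit_lm)
aic_glmlog <- AIC(fit_glmlog)

# AIC of the log(price) model, moved to the price scale:
# the log-likelihood loses sum(log(price)), so the AIC gains twice that
aic_lmlog_raw <- AIC(fit_lmlog)
aic_lmlog     <- aic_lmlog_raw + 2 * sum(log(dat$price))

# Smaller is better, but only relative to the other models on this dataset
print(c(lm = aic_lm, glm_log = aic_glmlog, lm_log = aic_lmlog))
```

`BIC()` can be swapped in for `AIC()` in the same way, with the same adjustment for the log-response model. Without the adjustment, the log(price) model would look far better than it is simply because its response is on a smaller scale.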

I am sorry, I do not have the time to explain this to you in detail at the moment. Someone else will have to help you out.

#### paul15071992

##### New Member
@consuli thank you!

is there somebody who has the time and is willing to explain to me what consuli started?

thank you!