Need help

#1
Hello,
I got an assignment in which I need to predict blood pressure from weight (kg), BMI (body mass index), age, pulse, stress index, and duration of high blood pressure in years. The database contains 20 patients.
After some tests and analysis, I found that the stress, pulse, and duration parameters aren't significant. I plotted the residuals and removed one observation. After that, all of the remaining parameters are significant, with R squared = 0.99. The problem is that the correlation coefficient between weight and BMI is 0.88, and the VIF is 8.25 for weight and 5.17 for BMI. Is multicollinearity a problem here, considering it is a small sample? If so, what should I do? Or is the model okay and I can accept it? I'm really confused about the VIF cutoff: some say <10 is good, some say 5, some even 4.
Since BMI is weight (kg) divided by height (m) squared, how should I interpret the coefficients? Take the coefficient of weight, for example: if weight increases by one unit, blood pressure increases by the weight coefficient, provided the other variables do not change. But if weight changes, BMI will also change, so I won't be able to interpret it that way. Does that mean I have to remove one of those variables?
Sorry, I am a total beginner at this.
Thanks :)
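
For reference, the VIF of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on all the other predictors in the model. The sketch below only reworks the numbers quoted in the post (r = 0.88, VIF = 8.25 and 5.17); the function names are made up for illustration. It suggests that the weight/BMI correlation alone would give a VIF of about 4.4, so the higher reported values are consistent with the other predictors in the model also explaining part of weight and BMI.

```python
def vif_from_r_squared(r_squared):
    """VIF of a predictor, given the R^2 from regressing it on the other predictors."""
    return 1.0 / (1.0 - r_squared)


def r_squared_from_vif(vif):
    """Invert the VIF formula: share of a predictor's variance explained by the others."""
    return 1.0 - 1.0 / vif


# Numbers quoted in the post above.
r_weight_bmi = 0.88
reported_vif = {"weight": 8.25, "bmi": 5.17}

# If weight and BMI were the only two predictors, R^2 would equal r^2,
# so this is a lower bound on the VIF of either variable in a larger model.
pairwise_vif = vif_from_r_squared(r_weight_bmi ** 2)

# R^2 implied by each reported VIF (predictor regressed on all other predictors).
implied_r_squared = {name: r_squared_from_vif(v) for name, v in reported_vif.items()}

print(f"VIF from the pairwise correlation alone: {pairwise_vif:.2f}")
for name, r2 in implied_r_squared.items():
    print(f"{name}: reported VIF {reported_vif[name]} implies R^2 = {r2:.3f}")
```

Since the standard error of a coefficient grows with the square root of its VIF, the same numbers can be read as roughly how much wider the confidence intervals are than they would be with uncorrelated predictors.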
 

obh

Active Member
#2
Hi Statmess,

1. 20 is a very small sample...
2. Generally, you shouldn't let a significance level choose the model; start with the model suggested by theory or prior research.
3. VIF: the cutoff is only a rule of thumb, as you already mentioned. Above 2.5 you should suspect multicollinearity; above 10 it is definitely multicollinearity.
Multicollinearity is only a problem if you want to know how a change in each IV influences the DV. It doesn't affect the prediction of the DV itself.
This also answers your BMI/Weight question.
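
The point about coefficients versus predictions can be checked with a small simulation. This is only a sketch under made-up assumptions (two standardized predictors, no intercept, true coefficients of 1, noise SD of 1, n = 20 as in the assignment), not the actual blood pressure data. It compares uncorrelated predictors with predictors correlated at 0.95.

```python
import math
import random


def fit_two_predictors(x1, x2, y):
    """OLS for y = b1*x1 + b2*x2 (no intercept), solved from the 2x2 normal equations."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


def simulate(rho, n=20, reps=500, seed=0):
    """Return (SD of the b1 estimates, in-sample RMSE of fitted values vs the true signal)."""
    rng = random.Random(seed)
    b1_estimates, squared_errors = [], []
    for _ in range(reps):
        x1 = [rng.gauss(0, 1) for _ in range(n)]
        # x2 has population correlation rho with x1.
        x2 = [rho * a + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1) for a in x1]
        signal = [a + b for a, b in zip(x1, x2)]      # assumed true model: b1 = b2 = 1
        y = [s + rng.gauss(0, 1) for s in signal]     # add noise with SD 1
        b1, b2 = fit_two_predictors(x1, x2, y)
        b1_estimates.append(b1)
        fitted = [b1 * a + b2 * b for a, b in zip(x1, x2)]
        squared_errors.extend((f - s) ** 2 for f, s in zip(fitted, signal))
    mean_b1 = sum(b1_estimates) / reps
    sd_b1 = math.sqrt(sum((b - mean_b1) ** 2 for b in b1_estimates) / (reps - 1))
    rmse = math.sqrt(sum(squared_errors) / len(squared_errors))
    return sd_b1, rmse


sd_low, rmse_low = simulate(rho=0.0)
sd_high, rmse_high = simulate(rho=0.95)
print(f"rho=0.00: SD of b1 = {sd_low:.3f}, RMSE of fitted values = {rmse_low:.3f}")
print(f"rho=0.95: SD of b1 = {sd_high:.3f}, RMSE of fitted values = {rmse_high:.3f}")
```

With correlated predictors the individual coefficient estimate should vary much more from sample to sample, while the error of the fitted values should stay about the same. Note that this only covers fitted values for data with the same correlation pattern; predicting for a patient whose weight and BMI fall outside that pattern is a different matter.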

I understand this is only an assignment...