Mahalanobis distance among clusters obtained by discriminant function analysis

#1
Hi everyone, I am new here, and first of all I would like to say hello. I have a question. I am trying to assess similarity among groups. I ran a discriminant function analysis using 10 variables, and now I would like to see how similar the groups are. From my reading, I figured I could measure Mahalanobis distances among the group centroids, but I do not know how to do it. Does anyone have an idea how to do it? I am currently using SPSS.

Thank you for your help.

sensemilla
 
#3
Hello Myles,
first of all, thank you so much for replying to my question.

No, I have not used Mahalanobis distances before. As I said, I want to see similarities between the clusters that DA gives me, and I think a good approach would be to measure the distance between the group means (centroids). I have been looking into how to do it, and I found that xlstat has an option, when you run a DA, to give you the Mahalanobis distances among the clusters. So at the moment I am using that.

Do you have experience using this method?
 
#4
I have used the Mahalanobis distance to identify outliers in a multiple regression.

This guy explains it well.

http://www.talkstats.com/showthread.php/786-Mahalanobis-distance

Tabachnick & Fidell's book covers the process for identifying multivariate outliers really clearly in Chapter 4.

To calculate Mahalanobis distances in SPSS, you just need to run a regression using the variables you plan to use in a set of analyses as the predictors and any variable (including ID number) as the outcome variable. Under the "Save" button in the regression window, there is an option to save the Mahalanobis distance for each respondent. When the predictors are roughly multivariate normal, these saved values approximately follow a chi-squared distribution with degrees of freedom equal to the number of predictor variables used. Any participant whose distance is significant on that distribution would likely be a multivariate outlier.
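If you want to see what that check does outside SPSS, here is a rough sketch in plain Python. It is only an illustration under some assumptions of mine: exactly two predictors (so the chi-squared tail with 2 degrees of freedom has the closed form exp(-d2/2)), a strict cutoff of alpha = 0.001, and made-up data. The function names are hypothetical, not anything SPSS provides. For more predictors you would need a proper chi-squared tail function, for example `scipy.stats.chi2.sf`.

```python
from math import exp


def squared_mahalanobis_2d(rows):
    """Squared Mahalanobis distance of each (x, y) row from the sample centroid."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    # Sample variances and covariance (n - 1 in the denominator).
    sxx = sum((x - mx) ** 2 for x, _ in rows) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in rows) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in rows) / (n - 1)
    det = sxx * syy - sxy ** 2
    distances = []
    for x, y in rows:
        dx, dy = x - mx, y - my
        # Quadratic form with the inverse of the 2x2 covariance matrix.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


def chi2_tail_df2(d2):
    """Upper-tail chi-squared probability, valid only for 2 degrees of freedom."""
    return exp(-d2 / 2.0)


def flag_outliers(rows, alpha=0.001):
    """Indices of rows whose squared distance is significant at alpha (assumed cutoff)."""
    return [i for i, d2 in enumerate(squared_mahalanobis_2d(rows))
            if chi2_tail_df2(d2) < alpha]


# Made-up data: 30 ordinary respondents plus one extreme case at index 30.
sample = [(i % 5, (i * 3) % 7) for i in range(30)] + [(50, -40)]
print(flag_outliers(sample))
```

Note that with very small samples no case can ever reach the cutoff, because the largest possible squared distance is (n - 1)^2 / n.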

To give a small piece of advice: when selecting scales to use for calculating these distances, I would avoid using very highly correlated scales (r of .75 or higher) as separate predictors, as that seems to make the test exceedingly sensitive.
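On the original question of distances between group centroids: I have not done this in SPSS myself, but the calculation can be sketched in plain Python. This is a minimal illustration, assuming the usual definition that uses the pooled within-group covariance matrix. The data and all function names are made up for the example, and it returns D rather than D squared, so check which one your software reports before comparing numbers.

```python
from itertools import combinations
from math import sqrt


def mean_vector(rows):
    """Column means of a list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def pooled_covariance(groups):
    """Summed within-group cross-products divided by (N - number of groups)."""
    p = len(next(iter(groups.values()))[0])
    sscp = [[0.0] * p for _ in range(p)]
    n_total = 0
    for rows in groups.values():
        centre = mean_vector(rows)
        n_total += len(rows)
        for row in rows:
            dev = [x - m for x, m in zip(row, centre)]
            for i in range(p):
                for j in range(p):
                    sscp[i][j] += dev[i] * dev[j]
    dof = n_total - len(groups)
    return [[value / dof for value in line] for line in sscp]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x


def centroid_distances(groups):
    """Mahalanobis distance D between every pair of group centroids."""
    cov = pooled_covariance(groups)
    centroids = {name: mean_vector(rows) for name, rows in groups.items()}
    result = {}
    for a, b in combinations(centroids, 2):
        diff = [x - y for x, y in zip(centroids[a], centroids[b])]
        weighted = solve(cov, diff)  # cov^-1 * diff without forming the inverse
        result[(a, b)] = sqrt(sum(d * w for d, w in zip(diff, weighted)))
    return result


# Made-up example: group B is group A shifted by 3 units on the first variable.
groups = {
    "A": [(0, 0), (2, 1), (1, 2), (3, 3)],
    "B": [(3, 0), (5, 1), (4, 2), (6, 3)],
}
print(centroid_distances(groups))
```

In this toy data the two centroids are 3 units apart in plain Euclidean terms, but the variables are positively correlated within groups, so the Mahalanobis distance comes out larger. That is the same effect that makes highly correlated scales so influential.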