Multivariate outliers

Hi everyone!

I have a question about detecting multivariate outliers in a correlation analysis. I'm calculating Spearman correlations between 5 variables (in a within-subject design), and it was suggested that I compute the Mahalanobis distance across all 5 variables to detect multivariate outliers. However, each correlation lives in a two-dimensional space, so its outliers may differ from those in the full five-dimensional space, if I understand correctly. But detecting outliers for each correlation separately doesn't seem like a good idea either.
Could someone please clarify this? Thanks!



Hi Maria,

You should screen for both univariate and multivariate outliers. Recommendations vary with sample size, but the usual approach for univariate outliers is to look at the raw data and compute z-scores (via the Analyse > Descriptive Statistics menu). Any case with a z-score beyond ±3.29 (the exact threshold depends on sample size) should be treated as a univariate outlier (for a reference, see Ghasemi & Zahediasl, 2012).

Once you've dealt with those, run the multivariate outlier analysis using Mahalanobis distances across all 5 variables (if you're unsure whether you will have outliers or not, see Zijlstra et al., 2010 for a justification). Again, you run it on the raw data rather than on the correlations themselves, and the cutoff is determined by the number of variables: the squared distance is conventionally compared against a chi-square critical value with 5 degrees of freedom. So you don't need to screen each pair of variables separately; the analysis already takes all 5 variables into account.
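In case it helps to see the two screening steps outside a menu-driven package, here is a minimal Python sketch using NumPy. Treat it as an illustration under assumptions rather than a recipe: the data are simulated, the function names `univariate_flags` and `mahalanobis_sq` are made up for this example, and the cutoff of 20.515 is the chi-square critical value for 5 degrees of freedom at alpha = .001, which is a common convention but not something fixed by the references above.

```python
import numpy as np

Z_CUT = 3.29           # univariate z-score threshold mentioned above
CHI2_CUT_DF5 = 20.515  # chi-square critical value, df = 5, alpha = .001 (assumed convention)


def univariate_flags(X, z_cut=Z_CUT):
    """Return a boolean mask of rows with |z| > z_cut on at least one variable."""
    z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    return (np.abs(z) > z_cut).any(axis=1)


def mahalanobis_sq(X):
    """Return the squared Mahalanobis distance of each row from the sample mean."""
    diff = X - X.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
    # Row-wise quadratic form diff_i' * cov_inv * diff_i
    return np.einsum("ij,jk,ik->i", diff, cov_inv, diff)


# Simulated raw data (hypothetical): 200 cases, 5 positively correlated variables.
rng = np.random.default_rng(0)
cov = np.full((5, 5), 0.8) + 0.2 * np.eye(5)
X = rng.multivariate_normal(np.zeros(5), cov, size=200)

# Planted case: moderate on every single variable, but the alternating signs
# contradict the positive correlations between the variables.
X[0] = [2, -2, 2, -2, 2]

uni = univariate_flags(X)       # step 1: univariate screen
d2 = mahalanobis_sq(X)          # step 2: multivariate screen on the raw data
multi = d2 > CHI2_CUT_DF5

print("univariate outlier rows:  ", np.flatnonzero(uni))
print("multivariate outlier rows:", np.flatnonzero(multi))
```

The planted case is meant to show why the multivariate screen runs on all 5 variables at once: each of its values is only about 2 standard deviations from the mean, so the z-score screen passes it, yet its pattern across the variables is very unusual and the Mahalanobis distance picks that up.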

Hope that helps.