best test for this non-normal distribution?


I've recently been working on a report for some research I conducted in September, and I've run into a bit of a problem during the analysis as I am relatively new to stats. I would really appreciate any help anyone could provide! Thank you!

The scenario:

I'm looking to see whether the height above the low water mark and the width of crabs are related. Fundamentally: two continuous variables which are not normally distributed, either when plotted on a scatter graph or after being transformed with the appropriate equations. I then looked into an appropriate non-parametric test (I have been taught the Mann-Whitney, Kruskal-Wallis and Wilcoxon tests), but because both variables are continuous, none of these could be applied.

I haven't been taught Spearman's rank, but is that appropriate? And can I perform a GLM even if the data are non-parametric?

I'm sorry if I'm asking a silly question, I would really appreciate your pity all the same haha!

Thanks again,



TS Contributor
Hi Laura,

Sadly, I didn't quite understand what kind of results you need. If you want to calculate correlations, you can use either Spearman's or Kendall's correlation coefficient, since neither of them requires the variables to be normally distributed. Spearman's correlation is the more commonly used of the two, since it is just the Pearson correlation calculated on the ranks.
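To make the "Pearson on the ranks" idea concrete, here is a minimal sketch in plain Python. The data are made up for illustration (they are not Laura's measurements), and the helper names `average_ranks`, `pearson` and `spearman` are my own. In practice a library routine such as `scipy.stats.spearmanr` would normally be used instead.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    sorted_vals = sorted(values)
    # index() gives the 0-based position of the first tie, count() the number of ties.
    return [sorted_vals.index(v) + (sorted_vals.count(v) + 1) / 2 for v in values]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: height above low water (m) and crab width (mm).
height = [0.5, 0.5, 1.0, 1.0, 1.5, 2.0, 2.0, 2.5]
width = [11.0, 14.0, 13.0, 19.0, 18.0, 25.0, 22.0, 40.0]

print("Pearson: ", round(pearson(height, width), 3))
print("Spearman:", round(spearman(height, width), 3))
```

Because only the ranks enter the calculation, a single very wide crab cannot dominate the result the way it can with the ordinary Pearson coefficient.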

You can also use general linear models, since these do not require normality in the variables, only in the residuals. Still, the best analysis will depend on your hypothesis or research questions. I hope this helps.
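As a rough illustration of the point about residuals, the sketch below fits a straight line by ordinary least squares and then looks at the residuals rather than at the raw variables. The data are invented, the function names are my own, and skewness is only one crude symptom of non-normality; a Q-Q plot or a formal test on the residuals would be the more usual check.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def skewness(values):
    """Moment-based sample skewness; values near 0 suggest symmetry."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical data: height above low water (m) and crab width (mm).
height = [0.5, 0.5, 1.0, 1.0, 1.5, 2.0, 2.0, 2.5, 3.0, 3.0]
width = [11.0, 14.0, 13.0, 19.0, 18.0, 25.0, 22.0, 27.0, 30.0, 34.0]

intercept, slope = fit_line(height, width)
# Residual = observed width minus the width predicted by the fitted line.
residuals = [w - (intercept + slope * h) for h, w in zip(height, width)]

print("slope:", round(slope, 3), "intercept:", round(intercept, 3))
print("skewness of width:    ", round(skewness(width), 3))
print("skewness of residuals:", round(skewness(residuals), 3))
```

The normality assumption applies to `residuals`, so that is the list to inspect, whatever the distributions of `height` and `width` look like on their own.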

Thank you, terzi! :)

Sorry, I didn't explain myself very well, I've been revising all holiday so my brain is a little dead!

I took one sample, in which a variable number of y values (width of crab) corresponded to each x value (height above low water). I was trying to investigate whether width is related to height above low water (H0: width does not vary with height above low water; H1: width varies with height above low water). Looking at past research, the trend should be that width increases with height above low water.

Sorry for the confusion,
thank you again for your help!




TS Contributor
First, try using a correlation coefficient. Spearman's correlation is a good choice because it does not demand normality. To test for a relationship, you can test the hypothesis that the correlation coefficient is different from zero.
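One way to run that test without leaning on any distributional formula is a permutation test: shuffle one variable many times and see how often the shuffled correlation is at least as strong as the observed one. This is only a sketch under my own assumptions (made-up data, a two-sided test, 5000 shuffles, helper names of my choosing); statistical packages typically report a p-value for Spearman's correlation directly.

```python
import random


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    sorted_vals = sorted(values)
    return [sorted_vals.index(v) + (sorted_vals.count(v) + 1) / 2 for v in values]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_permutation_test(x, y, n_permutations=5000, seed=0):
    """Two-sided permutation test of H0: Spearman's correlation is zero."""
    rx, ry = average_ranks(x), average_ranks(y)
    observed = pearson(rx, ry)
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    shuffled = ry[:]
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any real link between x and y
        if abs(pearson(rx, shuffled)) >= abs(observed) - 1e-12:
            extreme += 1
    # Adding 1 to both counts keeps the estimated p-value above zero.
    return observed, (extreme + 1) / (n_permutations + 1)


# Hypothetical data: height above low water (m) and crab width (mm).
height = [0.5, 0.5, 1.0, 1.0, 1.5, 2.0, 2.0, 2.5]
width = [11.0, 14.0, 13.0, 19.0, 18.0, 25.0, 22.0, 40.0]

rho, p_value = spearman_permutation_test(height, width)
print("Spearman's rho:", round(rho, 3), "p-value:", round(p_value, 4))
```

A small p-value would be evidence against H0 (no relationship between width and height).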

If your sample size is large enough (about 30 would be fine if you have only one IV) and you wish to study the relationship more deeply, you can fit a regression model; it does not matter that the variables are non-normal. Just be careful with the assumptions.

One more question: how do I deal with missing data (i.e. no widths recorded at a given height above shore) when performing Spearman's rank? Is it OK to exclude them for the purpose of the test, or do I have to do something else?

Thank you! :)


TS Contributor
Hi again, Laura!

Actually, that one is a really tricky question. The short, possibly incorrect answer is yes, you can eliminate those observations. The truthful, long answer is that the decision depends on the number of missing values and on the possible reasons why those observations are missing. I'd suggest dropping the observations (this is called listwise deletion) if the percentage of missing values is very low and you can assume that the missing values are not related to the widths (i.e., it is not the case that you were more likely to measure only subjects with low widths, for example).
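In code, listwise deletion amounts to keeping only the complete pairs and noting how much was thrown away. A minimal sketch, assuming missing widths are stored as `None` and using invented records and a function name of my own:

```python
# Hypothetical records of (height above low water in m, crab width in mm).
# None marks a height at which no width was recorded.
records = [
    (0.5, 11.0),
    (0.5, 14.0),
    (1.0, None),
    (1.5, 18.0),
    (2.0, 25.0),
    (2.0, 22.0),
    (2.5, None),
    (3.0, 34.0),
]


def listwise_delete(pairs):
    """Keep only the pairs in which both values are present."""
    return [(h, w) for h, w in pairs if h is not None and w is not None]


complete = listwise_delete(records)
missing_share = 1 - len(complete) / len(records)

print("complete pairs:", len(complete), "of", len(records))
print("share dropped: ", round(missing_share, 2))
```

The share dropped is worth reporting alongside the test result, since the deletion is only easy to defend when that share is small and the gaps look unrelated to width.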

I hope this sheds some light.