What kind of test for comparing right-skewed data?

ISAK

New Member
#1
Hi

I'm working on a project involving adipose tissue from 10 different locations in each of 30 individuals, which gives 300 biopsies.

The biopsies have been analysed with regard to "adipocyte count" and "area per adipocyte" using stereology and digital image processing. The adipocyte count per biopsy ranges from 154 to 852, so the 300 biopsies do not all have the same adipocyte count. Each adipocyte has an area and a diameter, which means the mean of one biopsy is not based on the same number of observations as the mean of another.
The data do not fit a normal distribution and transformation does not help (the transformed data are still not normal); the data are right-skewed.

We wish to compare the adipocytes of the different locations and see if there is a difference in adipocyte size (area).
We are discussing how to handle the data and how to make the comparison, and we can't seem to reach agreement.

Should the data be compared using Mann-Whitney (non-parametric test for unpaired data) or Wilcoxon (non-parametric test for paired data)? (some say the data is paired and some say it is not)

Another suggested approach: should the means of the biopsies be compared using a Student's t-test? (Comparing the means does not seem like the right option when the mean does not represent non-parametric data so well.)
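To illustrate the concern about the mean, here is a minimal sketch using simulated values only. The log-normal distribution and its parameters are assumptions chosen purely to produce right skew; the real areas in this thread reportedly stay non-normal even after transformation, so this is not a model of the actual data.

```python
import random
import statistics

random.seed(1)

# Hypothetical biopsy: 500 simulated adipocyte areas from a log-normal
# distribution (right-skewed). The parameters are made up for illustration.
areas = [random.lognormvariate(8.0, 0.6) for _ in range(500)]

biopsy_mean = statistics.fmean(areas)
biopsy_median = statistics.median(areas)

# In a right-skewed sample the long upper tail pulls the mean above the median.
print(f"mean   = {biopsy_mean:.0f}")
print(f"median = {biopsy_median:.0f}")
```

Whichever summary is used, each biopsy is reduced to a single value, so the comparison works with 30 individuals x 10 locations regardless of how many cells each biopsy contained.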

Hoping for some input!
Thanks.
 

ondansetron

TS Contributor
#2
Regarding the title: data are neither parametric nor nonparametric. Those terms refer to broad classes of statistical methods.

Let's back up before determining a test: what are the specific questions you hope to answer?
 

ISAK

New Member
#3
....
Right-skewed data, i.e. data not fitting a normal distribution.

Which test should be used to compare the adipocytes from different locations?
 

ondansetron

TS Contributor
#4
You're still not asking a specific question.
If you're in medicine this is basically asking for the workup for a cough without knowing any other information on the patient.
If you're in biology this is basically asking someone where vertebrates live.

What is it that you are hoping to learn, in specific terms? No one can reasonably suggest a "test" or approach without a specific question or set of questions to tackle.
 

ISAK

New Member
#5
Sorry, I'm trying. (We wish to compare the adipocytes of the different locations and see if there is a difference in adipocyte size (area). We are discussing how to handle the data and how to make the comparison, and we can't seem to reach agreement.)

We want to learn if there is a significant difference in cell size between 10 different locations, but we can't agree on a statistical test to use. That is why I tried to explain the data in my post.

For the moment, we're not interested in anything about the individual the biopsy arises from.
 
#8
Hi, I believe non-parametric tests (Mann-Whitney for 2 locations or Kruskal-Wallis for >2 locations) would be fine. But maybe you should first try applying a transformation to your data in case it makes them normal. More complex options are also possible, but that should be a good start.
 

ISAK

New Member
#9
"Hi, I believe non-parametric tests (Mann-Whitney for 2 locations or Kruskal-Wallis for >2 locations) would be fine. But maybe you should first try applying a transformation to your data in case it makes them normal. More complex options are also possible, but that should be a good start."
Thanks!

I've already tried to transform the data but it does not make it normal.
 

ondansetron

TS Contributor
#10
"YES! :)
The sites are of great importance and the 10 sites are the same for every individual."
I don't think either test recommended by @CamilleJosion would be appropriate, given that each individual has 10 measurements. This is a kind of repeated-measures design, unless you have reason to believe biopsies from the same person are in no way related (which doesn't make much sense to me).

I would think a mixed model with patient ID as a random effect and site as a fixed effect would be appropriate. It may be beneficial to use a generalized linear mixed model so you can pick a link function and a random component (the response distribution) to appropriately model the outcome (I have an idea but want to see what some others might suggest if they're more familiar).
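One way such a model could be written down is sketched here. The Gamma response with a log link is an assumption (a common choice for a positive, right-skewed outcome); no specific distribution or link is named in this thread. Here y_ij is a biopsy-level size summary (for example the mean area) for individual i = 1, ..., 30 at site j = 1, ..., 10, the Gamma is parameterised by its mean mu_ij and a dispersion phi, and site 1 is taken as the reference level.

```latex
\begin{aligned}
y_{ij} \mid u_i &\sim \mathrm{Gamma}(\mu_{ij}, \phi), \\
\log \mu_{ij} &= \beta_0 + \beta_j + u_i, \qquad \beta_1 = 0, \\
u_i &\sim N(0, \sigma_u^2).
\end{aligned}
```

Under this sketch, "no difference in adipocyte size between sites" corresponds to beta_2 = ... = beta_10 = 0, and the random intercept u_i carries the correlation between the 10 biopsies taken from the same person.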
 

Karabiner

TS Contributor
#11
"Hi, I believe non-parametric tests (Mann-Whitney for 2 locations or Kruskal-Wallis for >2 locations) would be fine. But maybe you should first try applying a transformation to your data in case it makes them normal. More complex options are also possible, but that should be a good start."
But neither t-tests nor ANOVA require normal data.

With kind regards

Karabiner
 

ondansetron

TS Contributor
#13
"t-tests do require normal data, and ANOVA requires that the residuals are iid and follow a N(0, s^2).
Best,
Camille"
I think the answer is yes and no: @Karabiner is probably referring to the fact that with two groups the central limit theorem may be applicable, or he may have been making the distinction that the data (unconditional Y) need not be normally distributed so long as the errors are drawn from a normal distribution with common variance.
 

Karabiner

TS Contributor
#14
"t-tests do require normal data,"
Consider 2 samples from normally distributed populations, with mean1 = 100,
mean2 = 250, and SD = 50. The total sample would show a bimodal distribution.
According to your requirement, this would mean a t-test cannot be carried out.
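The example can be checked with a small simulation. This is only a sketch: the group size of 5000, the seed, and the 25-unit counting windows are arbitrary choices made for illustration, and `count_in` is a helper defined here.

```python
import random
import statistics

random.seed(42)

n = 5000  # per group; large so that the shape is easy to see
group1 = [random.gauss(100, 50) for _ in range(n)]
group2 = [random.gauss(250, 50) for _ in range(n)]
pooled = group1 + group2


def count_in(values, low, high):
    """Number of values in the half-open interval [low, high)."""
    return sum(low <= v < high for v in values)


# Compare 25-unit-wide windows around each group mean and around the midpoint.
near_mean1 = count_in(pooled, 87.5, 112.5)
near_middle = count_in(pooled, 162.5, 187.5)
near_mean2 = count_in(pooled, 237.5, 262.5)
print(near_mean1, near_middle, near_mean2)

# Residuals: each observation minus its own group mean.
m1, m2 = statistics.fmean(group1), statistics.fmean(group2)
residuals = [x - m1 for x in group1] + [x - m2 for x in group2]
print(statistics.fmean(residuals), statistics.stdev(residuals))
```

The pooled sample has a dip between the two group means, while the residuals are centred on zero with a spread close to the common SD. The normality assumption concerns the residuals, not the pooled outcome.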
"ANOVA requires that the residuals are iid and follow a N(0, s^2)."
Only if samples are small; otherwise this assumption is not considered necessary.

BTW, the t-test and one-way ANOVA with 2 groups are equivalent; therefore the same
assumptions apply.
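A quick numerical check of this equivalence, as a sketch on made-up data (group sizes, means, and the seed are arbitrary). It applies to the pooled-variance (Student) t-test, not to Welch's unequal-variance version: with two groups, the ANOVA F statistic equals the square of the t statistic.

```python
import math
import random
import statistics

random.seed(7)

# Two made-up groups, deliberately of unequal size.
a = [random.gauss(100, 50) for _ in range(12)]
b = [random.gauss(130, 50) for _ in range(15)]
na, nb = len(a), len(b)
mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)

# Pooled-variance (Student) t statistic.
ss_a = sum((x - mean_a) ** 2 for x in a)
ss_b = sum((x - mean_b) ** 2 for x in b)
pooled_var = (ss_a + ss_b) / (na + nb - 2)
t_stat = (mean_a - mean_b) / math.sqrt(pooled_var * (1 / na + 1 / nb))

# One-way ANOVA F statistic for the same two groups.
grand_mean = statistics.fmean(a + b)
ss_between = na * (mean_a - grand_mean) ** 2 + nb * (mean_b - grand_mean) ** 2
ms_between = ss_between / (2 - 1)          # k - 1 degrees of freedom, k = 2
ms_within = (ss_a + ss_b) / (na + nb - 2)  # N - k degrees of freedom
f_stat = ms_between / ms_within

# The two printed values agree up to floating-point rounding.
print(t_stat ** 2, f_stat)
```

Because the F distribution with (1, N - 2) degrees of freedom is the distribution of a squared t variable with N - 2 degrees of freedom, the two-sided p-values also coincide.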

With kind regards

Karabiner