Transforming two non-normal sets of data for 2 sample t-test?

#1
Hey all,

Ok so I'm trying to compare the means of two sets of data using a 2-sample t-test. Unfortunately, both sets of data are non-normal. It appears Box-Cox or Johnson transforms will work, but my question is: wouldn't I need to transform both sets of data using the same transform parameters (lambda for Box-Cox and the transform function for the Johnson) in order for the comparison to be valid? I would think that if I use two different sets of transform parameters, then the data would be shifted by differing amounts, potentially spreading out the means (incorrectly??). Hopefully this makes sense! Thanks.
 

Dason

Ambassador to the humans
#4
That should be more than enough data for the CLT to kick in. You shouldn't need to transform. Then again you could do something like a Wilcoxon test instead.

But if you post histograms of your two groups we could help decide further whether or not your data is well behaved enough to choose between the two.
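
For anyone who wants to try both options side by side, here is a rough sketch in plain Python (standard library only). The data are made up: two right-skewed lognormal samples of 200 points each, which is an assumption and not the poster's data. The helper names `welch_test` and `rank_sum_test` are hypothetical, and both p-values use a large-sample normal approximation (the rank-sum version also skips the tie correction to the variance), so treat the numbers as illustrative rather than exact.

```python
import math
import random
from statistics import NormalDist, fmean, variance


def welch_test(a, b):
    """Welch (unequal-variance) statistic with a two-sided p-value.

    Assumption: the p-value comes from the standard normal instead of
    the t distribution, which is only reasonable for large samples.
    """
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    t = (fmean(a) - fmean(b)) / se
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p


def rank_sum_test(a, b):
    """Wilcoxon rank-sum (Mann-Whitney U) with a normal approximation.

    Tied values get average ranks; no tie correction is applied to the
    variance, so p-values are approximate when ties are common.
    """
    n1, n2 = len(a), len(b)
    pooled = sorted([(x, 0) for x in a] + [(x, 1) for x in b])
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of ranks i+1 .. j
        rank_sum_a += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_a - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return u, p


# Made-up skewed data, purely for illustration.
rng = random.Random(1)
group_a = [rng.lognormvariate(0.0, 0.6) for _ in range(200)]
group_b = [rng.lognormvariate(0.2, 0.6) for _ in range(200)]

print("Welch:    t = %.3f, p = %.4f" % welch_test(group_a, group_b))
print("Rank-sum: U = %.1f, p = %.4f" % rank_sum_test(group_a, group_b))
```

Keep in mind the two tests answer slightly different questions: the first compares means, while the rank-sum test is about whether one group tends to give larger values than the other.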
 

Dason

Ambassador to the humans
#7
As you can see, my distribution definitely is skewed...not sure if it is enough to throw out the CLT or not.
It doesn't look bad enough to me given what your sample sizes are. But you could bootstrap the sampling distribution of the difference in sample means and see if that distribution is approximately normal - that's what we really care about.
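
A minimal sketch of that bootstrap check, using only the Python standard library. The data below are invented skewed samples (an assumption, not the poster's data), and `bootstrap_mean_diffs` and `skewness` are hypothetical helper names. Each replicate resamples the raw observations with replacement within each group, recomputes the two means, and stores their difference; the collection of those differences approximates the sampling distribution of the difference in sample means.

```python
import random
from statistics import fmean, stdev


def bootstrap_mean_diffs(a, b, n_boot=2000, seed=0):
    """Bootstrap replicates of mean(a) - mean(b).

    Each replicate draws len(a) values from a and len(b) values from b,
    with replacement, and records the difference of the resampled means.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        resample_a = rng.choices(a, k=len(a))
        resample_b = rng.choices(b, k=len(b))
        diffs.append(fmean(resample_a) - fmean(resample_b))
    return diffs


def skewness(values):
    """Simple moment-based skewness; values near 0 suggest symmetry."""
    m = fmean(values)
    s = stdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


# Made-up right-skewed data, purely for illustration.
rng = random.Random(42)
group_a = [rng.lognormvariate(0.0, 0.8) for _ in range(200)]
group_b = [rng.lognormvariate(0.2, 0.8) for _ in range(200)]

diffs = bootstrap_mean_diffs(group_a, group_b)

print("skewness of raw group A:      %.3f" % skewness(group_a))
print("skewness of bootstrap diffs:  %.3f" % skewness(diffs))
```

If the skewness of the bootstrap differences is close to zero, and a histogram or normal Q-Q plot of `diffs` looks roughly bell-shaped, that is some reassurance that the t-test's normal approximation is reasonable even though the raw data are skewed.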
 
#8
But you could bootstrap the sampling distribution of the difference in sample means and see if that distribution is approximately normal - that's what we really care about.
Are you saying I should create a distribution of the differences between each sample's mean and see if that is normal? Meaning if I had means of 2, 7, 12, and 24, I would want to see whether (24-2), (12-2), (7-2), (24-7), (12-7), and (24-12) are normally distributed? And that would validate the t-test? Thanks!