Transformation then Standardization...?

MCG

New Member
#1
Hi,

I was wondering if anyone could answer what is probably quite an obvious question...

I've got a dataset with 9 variables, of which I'm looking to combine 8 into pairs (after standardization) so that I end up with a dataset of 5 variables.

One of the variables (which I intend to combine) is positively skewed. I'm planning to do a log transformation to sort this out... but is it then okay to convert the new log score into a z score (i.e. to standardize it)?
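
In other words, something like this (a rough Python sketch of what I mean - the column name "x" and the made-up data are just placeholders):

import numpy as np
import pandas as pd

# Hypothetical positively skewed variable (placeholder data)
df = pd.DataFrame({"x": np.random.lognormal(mean=0, sigma=1, size=172)})

# Step 1: log transformation to reduce the positive skew
# (add a small constant first if the variable can be zero)
df["x_log"] = np.log(df["x"])

# Step 2: standardize the transformed score into a z score
df["x_log_z"] = (df["x_log"] - df["x_log"].mean()) / df["x_log"].std()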

I've looked in a few books (Howell, Field), but none seem to talk about transformation followed by standardization...

I'd be really grateful for any help.
MCG
 

JohnM

TS Contributor
#2
It's fine to do this. Generally I'm not a "fan" of transforming, due to the difficulties in interpreting and explaining the results in terms of the original scales.

As long as you understand what you're doing and you explain it thoroughly in any report that is published or distributed, then it's fine.

What you may want to try is running the analysis both with and without the transformation, to make sure you get consistent results.
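
For example, a rough sketch of the sort of side-by-side check I mean (Python; the variable names and the simple correlation are only placeholders for whatever your actual analysis is):

import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "a": rng.lognormal(size=172),  # the positively skewed variable
    "b": rng.normal(size=172),     # its partner in the pair
    "y": rng.normal(size=172),     # some outcome of interest
})

def zscore(s):
    return (s - s.mean()) / s.std()

# Combined score without the transformation
combo_raw = zscore(df["a"]) + zscore(df["b"])

# Combined score with a log transformation of the skewed variable first
combo_log = zscore(np.log(df["a"])) + zscore(df["b"])

# Run the same analysis on both and compare - here just a correlation
print(stats.pearsonr(combo_raw, df["y"]))
print(stats.pearsonr(combo_log, df["y"]))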
 

MCG

New Member
#3
Hi - thanks, that's great.

I have another quick question... the 9 variables that I'm looking at come out with highly significant statistics on the Kolmogorov-Smirnov and Shapiro-Wilk tests of normality (implying that they're non-normally distributed)... what's your opinion on the reliability of these tests for N = 172? I understand that it decreases the larger the sample size gets, but I can't find any definition of 'large'...
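
(For reference, a quick Python sketch of the two tests, assuming scipy - "x" is just a placeholder for one of the variables:)

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.lognormal(size=172)  # placeholder for one of the 9 variables

# Shapiro-Wilk test
print(stats.shapiro(x))

# Kolmogorov-Smirnov test against a normal with the sample's mean and SD
# (the p-value is only approximate when the parameters are estimated
#  from the same data; packages often apply a Lilliefors correction)
print(stats.kstest(x, "norm", args=(x.mean(), x.std())))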

Obviously normality affects the generalisability of results, so I'm assuming that it's best to run two analyses: 1. with transformation, 2. without transformation. Is there anything else I can do?

Thanks for your help!
MCG
 

JohnM

TS Contributor
#4
The SW is usually for small sample sizes - the KS can be used on larger ones, I believe...

"Large" depends on the application. If you were evaluating or setting norms for an academic achievement test, then 172 would be considered too small. For a lot of applications, though, it would be considered large..

The best all-around method to test for normality is to do the following:

(1) run the Anderson-Darling test*
(2) do a normal probability plot

*this is becoming widely recognized as the best overall statistical test for "goodness of fit" to a hypothesized distribution
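
Roughly, both steps look like this in Python (a sketch assuming scipy and matplotlib; "x" is a placeholder variable):

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(2)
x = rng.lognormal(size=172)  # placeholder variable

# (1) Anderson-Darling test for normality
ad = stats.anderson(x, dist="norm")
print(ad.statistic, ad.critical_values, ad.significance_level)

# (2) Normal probability plot - points should fall close to a straight line
stats.probplot(x, dist="norm", plot=plt)
plt.show()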

If you get similar results for transformed and non-transformed data, then generalizability shouldn't be an issue.

Another way to deal with non-normality is to use a nonparametric procedure, which can have higher power when the departure from normality is extreme - but I can't really recommend anything unless you tell me the nature of your study.