Different tests for normality

#1
Hi,
Can anybody give a brief description of the different non-parametric tests of distribution (such as tests for normality)? What are the differences between the Kolmogorov-Smirnov, Lilliefors, and Shapiro-Wilk tests for normality?
 
#2
The KS test can be used to test for any distribution, including but not limited to a normal distribution, but you must completely specify the distribution to test against. In other words, you can't use it to just test for normality; you have to use it to test for normality with a specific mean and standard deviation. If you plug in the mean and standard deviation estimated from your data, you violate the assumptions of the test: the fitted distribution sits closer to the data than a pre-specified one would, so the test becomes conservative (its p-values come out too large).
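To make the distinction concrete, here is a minimal sketch in base R using `ks.test` from the stats package. The simulated data, the seed, and the null parameters (mean 10, standard deviation 2) are illustrative assumptions, not anything from the original question.

```r
set.seed(1)                          # arbitrary seed, only for reproducibility
x <- rnorm(200, mean = 10, sd = 2)   # simulated data (illustrative values)

# Proper use: the null distribution N(10, 2) is fixed in advance,
# not estimated from x.
res_fixed <- ks.test(x, "pnorm", mean = 10, sd = 2)

# Improper use: the parameters are estimated from the same data, so the
# statistic is compared against the wrong null distribution.
res_plugin <- ks.test(x, "pnorm", mean = mean(x), sd = sd(x))

print(res_fixed$p.value)
print(res_plugin$p.value)
```

Both calls run without complaint, which is exactly why the second one is easy to misuse: nothing in the output warns you that the p-value is no longer calibrated.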

The Lilliefors test is a modified version of the KS test designed to account for the fact that you plug in the estimated mean and standard deviation. It uses the same test statistic but a different, corrected null distribution. That distribution has no simple closed form, so you will need to look up critical values in a table (or let software approximate them).
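If you have no table at hand, the corrected null distribution can be approximated by simulation. The sketch below is one way to do that in base R; the helper names `lillie_stat` and `lillie_mc` are made up for this post, and the number of simulations is an arbitrary choice. It relies on the fact that the statistic does not change when the data are shifted or rescaled, so simulating from a standard normal is enough.

```r
# KS statistic computed with the mean and sd estimated from the sample itself
# (hypothetical helper, not part of any package)
lillie_stat <- function(x) {
  z <- sort((x - mean(x)) / sd(x))
  n <- length(z)
  p <- pnorm(z)
  i <- seq_len(n)
  max(max(i / n - p), max(p - (i - 1) / n))
}

# Monte Carlo p-value: share of simulated normal samples whose statistic
# is at least as large as the observed one
lillie_mc <- function(x, n_sim = 2000) {
  d_obs  <- lillie_stat(x)
  d_null <- replicate(n_sim, lillie_stat(rnorm(length(x))))
  list(statistic = d_obs, p.value = mean(d_null >= d_obs))
}

set.seed(42)          # arbitrary seed
x <- rexp(50)         # clearly non-normal illustrative data
res <- lillie_mc(x)
print(res)
```

In practice you would more likely call a packaged implementation (for example `lillie.test` in the nortest package, if you have it installed), but the simulation shows where the corrected critical values come from.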

The Shapiro-Wilk test and its variants, such as the Shapiro-Francia test, are specific to the normal distribution and are essentially formalizations of the "eyeball" test of doing a QQ plot. They are among the most powerful tests for normality against most typical alternatives.
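The link to the QQ plot can be seen numerically. The sketch below runs `shapiro.test` from the stats package and, next to it, computes the squared correlation between the sorted data and approximate expected normal order statistics, which is the idea behind the Shapiro-Francia statistic. The data, the seed, and the variable names are illustrative assumptions.

```r
set.seed(7)            # arbitrary seed
x <- rnorm(100)        # illustrative data
n <- length(x)

sw <- shapiro.test(x)  # Shapiro-Wilk test; sw$statistic is W

# Straightness of the QQ plot as a single number: squared correlation
# between the ordered sample and approximate normal scores
# (plotting positions (i - 3/8) / (n + 1/4)).
scores <- qnorm(ppoints(n, a = 3/8))
qq_r2  <- cor(sort(x), scores)^2

print(sw$statistic)
print(qq_r2)
```

Both numbers sit close to 1 when the QQ plot is close to a straight line, and drop as the plot bends.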
 

Miner

TS Contributor
#3
Another difference is that the K-S test is relatively insensitive to the tail area of the distribution. It responds more to deviations in the middle of the distribution. The Anderson-Darling test is more sensitive to the tail area.
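One way to see where the tail sensitivity comes from: the Anderson-Darling statistic weights the squared gap between the empirical and hypothesized CDFs by 1 / (F(1 - F)), which is about 4 in the middle (F = 0.5) but about 100 at F = 0.01. Below is a rough sketch that computes the statistic by hand for a fully specified normal null. The function name `ad_stat_norm` is made up for this post, the data are simulated, and no p-value is computed (that needs tables or a package).

```r
# Anderson-Darling statistic against a fully specified normal distribution
# (hypothetical helper; returns the statistic only, no p-value)
ad_stat_norm <- function(x, mean = 0, sd = 1) {
  xs <- sort(x)
  n  <- length(xs)
  i  <- seq_len(n)
  # log F(x_(i)) and log(1 - F(x_(n+1-i))), computed on the log scale
  # so that extreme tail values do not underflow
  log_lower <- pnorm(xs, mean, sd, log.p = TRUE)
  log_upper <- pnorm(rev(xs), mean, sd, lower.tail = FALSE, log.p = TRUE)
  -n - mean((2 * i - 1) * (log_lower + log_upper))
}

set.seed(3)                # arbitrary seed
x <- rt(300, df = 3)       # heavy-tailed illustrative data

print(ks.test(x, "pnorm")$statistic)   # KS distance against N(0, 1)
print(ad_stat_norm(x))                 # Anderson-Darling statistic against N(0, 1)
```

The two statistics are on different scales, so they can't be compared directly; the point is only that the log terms in the Anderson-Darling sum blow up for observations far out in the tails, while the KS distance is capped by the largest CDF gap.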
 

Dason

Ambassador to the humans
#4
Two of my favorite tests from R:

Code:
> SnowsPenultimateNormalityTest
function(x){


        # the following function works for current implementations of R
        # to my knowledge, eventually it may need to be expanded
        is.rational <- function(x){
                rep( TRUE, length(x) )
        }


        tmp.p <- if( any(is.rational(x))) {
                0
        } else {
                # current implementation will not get here if length
                # of x is positive.  This part is reserved for the
                # ultimate test
                1
        }


        out <- list(
                p.value = tmp.p,
                alternative = strwrap(paste('The data does not come from a',
        'strict normal distribution (but may represent a distribution',
        'that is close enough)'), prefix="\n\t"),
                method = "Snow's Penultimate Normality Test",
                data.name = deparse(substitute(x))
        )


        class(out) <- 'htest'
        out
}
and
Code:
> SnowsCorrectlySizedButOtherwiseUselessTestOfAnything
function(x,
            data.name=deparse(substitute(x)), alternative='You Are Lucky',
                                                                 ...,
                                                                 seed) {
    if( !missing(seed) ) {
        if( is.numeric(seed) ) {
            set.seed(seed)
        } else {
            char2seed(seed)
        }
    }

    tmp.p <- runif(1)

    out <- list(
                p.value = tmp.p,
                data.name=data.name,
                method = "Snow's Correctly Sized But Otherwise Useless Test of Anything",
                alternative=alternative)
    if( !missing(seed) ) out$seed <- seed
    names(tmp.p) <- 'Random Uniform Value'
    out$statistic <- tmp.p

    class(out) <- 'htest'
    return(out)
}
(found in the TeachingDemos package if you're wondering...)