Correlation p-value from a bootstrap

trinker

ggplot2orBust
#1
I'm working through some machine learning material that deals with simulation/bootstrapping and came across a point where a p-value is reported but not how it was calculated. This is precisely the type of work I'm moving toward, so I want to understand how to do it. Here's what the author of the tutorial states:

The plot reveals, in dramatic fashion, just how much the data clusters around the mean, which as you recall from above is nearly 0. It also dramatizes the outlier status of the actual value (-0.2411) that was observed. In 10,000 random iterations, only 10 correlation coefficients were calculated to be less than the actual observed value and the actual observed value was nearly 3 (2.8) standard deviations away from the mean. In short, the probability of observing a random value as extreme as the actual value observed (-0.2411) is just 0.4%.
This is the information we have:

Code:
min(mycors.v)
## [1] -0.2772
max(mycors.v)
## [1] 0.3129
range(mycors.v)
## [1] -0.2772 0.3129
mean(mycors.v)
## [1] -0.0001494
sd(mycors.v)
## [1] 0.08685
This is based on 10,000 replications. So we're asking for the probability that a value as extreme as -0.2411 occurs by chance if the null hypothesis is true.

Using R (please mix theory in if you think I need it), how can I jump to this 0.4%?

I assume pnorm or pt will be of use here, maybe. I know I could use cor.test:

Code:
> cor.test(cor.data.df[, 1], cor.data.df[, 2])

        Pearson's product-moment correlation

data:  cor.data.df[, 1] and cor.data.df[, 2]
t = -2.8651, df = 133, p-value = 0.004848
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 -0.39400960 -0.07520945
sample estimates:
       cor 
-0.2411029
But somehow the guy jumped from a simulation to the p-value. Here's the tutorial if it helps:

http://www.matthewjockers.net/wp-content/uploads/2013/09/TAWR.pdf (p. 54-57)

Maybe the simulation was just for demo purposes and wasn't used to calculate the p-value. In retrospect this seems the most logical explanation.
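
For what it's worth, here's a rough normal-approximation check using the simulated mean and sd reported above (pnorm rather than pt, since I have no obvious df to use). I'm not claiming this is how the author got to 0.4%, it's just a sketch of what I meant by pnorm being of use:

Code:
# rough sketch: treat the simulated null distribution as approximately
# normal with the mean and sd reported above, and ask how extreme the
# observed correlation is under that approximation
obs     <- -0.2411
null.m  <- -0.0001494
null.sd <- 0.08685

z <- (obs - null.m)/null.sd   # roughly -2.8, matching the tutorial
pnorm(z)                      # one-tailed tail area
2 * pnorm(-abs(z))            # two-tailed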
 

spunky

Doesn't actually exist
#2
trinker, i haven't gone over the example in your pdf but intuition tells me that all this person did was calculate the empirical quantile of the simulated distribution.

so (s)he asked R to count how many correlations were greater (in absolute value, so (s)he should have used both tails) than -0.2411, then divided that count by the number of simulated correlations (10k) to get the proportion of the 10k correlations that were more extreme than |-0.2411|
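
something along these lines, assuming the tutorial built mycors.v by shuffling one of the two columns and recomputing the correlation each time (that's my guess, i haven't checked the pdf):

Code:
# sketch of how a null distribution like mycors.v could be generated:
# shuffle one column so any real association is destroyed, recompute the
# correlation, and repeat many times
set.seed(42)
mycors.v <- replicate(10000, cor(cor.data.df[, 1], sample(cor.data.df[, 2])))
mean(mycors.v)   # should be near 0
sd(mycors.v)     # spread of the null distribution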
 

trinker

ggplot2orBust
#3
Hmm something like:

Code:
sum(abs(mycors.v) > 0.2411)/10000
Is this a legitimate approach that I could report as a p-value in a journal?
 

trinker

ggplot2orBust
#5
I have at least one use for this: http://stats.stackexchange.com/a/22333/7482

Someone suggested this for a distance measure I was trying to get, but I didn't have the skills or understanding at the time. I've since written an R function to measure the distance, and now I can generate p-values. Very cool.
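
Roughly what I have in mind is a generic wrapper like this (distance_fn is just a stand-in for my actual function, and the right shuffling scheme depends on the data):

Code:
# hypothetical wrapper: permutation p-value for an arbitrary statistic.
# distance_fn is a stand-in for the real distance measure, and shuffling y
# against x is just one possible null.
perm_pvalue <- function(x, y, distance_fn, R = 10000) {
  obs  <- distance_fn(x, y)
  null <- replicate(R, distance_fn(x, sample(y)))
  # one-tailed here (big distances = extreme); the +1 keeps p from being exactly 0
  (sum(null >= obs) + 1)/(R + 1)
}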

This is going to be my LRA presentation this year at the beginning of December. No one in my field is doing this sort of work.
 

trinker

ggplot2orBust
#6
@Spunky do you have any papers out, or know of anyone else's work (a paper) that uses this approach (or something similar) and does a decent write-up of the methods section?
 

spunky

Doesn't actually exist
#7
@Spunky do you have any papers out, or know of anyone else's work (a paper) that uses this approach (or something similar) and does a decent write-up of the methods section?
by saying "this approach" do you mean the permutation-test approach described in your pdf or the actual computation of the empirical rejection rate from simulated data?
 

trinker

ggplot2orBust
#8
More so the...
permutation-test approach described in your pdf
though it doesn't have to be with word/language use (I doubt you have anything on that). But both would be great. I assume one is a discussion of theory while the other is an application? I.e., we can use the permutation test because of the theory behind it from computing...

the empirical rejection rate from simulated data
All of this is good because I wasn't even sure what these things were called, so I couldn't even use my google fu.
 

spunky

Doesn't actually exist
#9
though it doesn't have to be with word/language use (I doubt you have anything on that). But both would be great. I assume one is a discussion of theory while the other is an application? I.e., we can use the permutation test because of the theory behind it from computing
i don't really use permutation tests very much, but didn't Jake once work on something like this? i mean, he even has a 'nonNestedPermutation.R' script in his TS Dropbox folder.
 

Jake

Cookie Scientist
#10
You might find the following instructional script I had lying around useful... it illustrates the difference between bootstrapping and a permutation test, in the context of a simple difference between two group means. When we bootstrap, we obtain a confidence interval around the difference in means, from a sample of bootstrapped statistics centered around the observed difference in means. When we do the permutation test, we obtain a p-value for the test of a null hypothesis of 0 difference in means, from a sample of permuted statistics centered around 0. These should agree in simple cases. I've heard stories that in more complicated cases they might not agree, but I don't know what those cases are.
Code:
# Bootstrap of group difference -------------------------------------------


library(boot)

# make up data
set.seed(12345)
dat <- stack(list(a = rnorm(25, mean=10), b = rnorm(25, mean=10.5)))
t.test(values ~ ind, data=dat, var.equal=T)

# define bootstrap statistic
bootDiff <- function(data, indices, print=F){
  newdata <- data[indices,]
  if(print) print(indices)
  return(diff = mean(newdata$values[26:50]) - mean(newdata$values[1:25]))
}

# examine indices
boot(data=dat, statistic=bootDiff, R=2, strata=rep(1:2, each=25), print=T)
# this calls bootDiff() 3 times:
# once with ordered indices (to get the empirical test statistic),
# and twice with indices sampled *with replacement* within each group

# bootstrap!
resamples <- 1000
results1 <- boot(data=dat, statistic=bootDiff, R=resamples, strata=rep(1:2, each=25))
results1
boot.ci(results1)
plot(density(results1$t))


# Permutation test --------------------------------------------------------


str(dat)

# define permutation test statistic
permDiff <- function(data, indices, print=F){
  if(print) print(indices)
  if(print) print(length(unique(indices)))
  indices <- factor(indices <= 25, labels=c("b","a"))
  if(print) print(indices)
  newdata <- cbind.data.frame(indices, values=data$values)
  return(diff = mean(newdata$values[newdata$indices=="b"]) - mean(newdata$values[newdata$indices=="a"]))
}

# examine indices
boot(data=dat, statistic=permDiff, R=2, sim="permutation", print=T)
# this calls permDiff() 3 times:
# once with ordered indices (to get the empirical test statistic),
# twice with indices sampled *without replacement* (i.e., shuffled) across *all* obs

# permute!
resamples <- 1000
results2 <- boot(data=dat, statistic=permDiff, R=resamples, sim="permutation")
plot(density(results2$t))
est <- mean(dat$values[dat$ind=="b"]) - mean(dat$values[dat$ind=="a"])
# two-tailed p-value: proportion of permuted differences beyond +/- the
# observed difference (est is positive here, so both tails get counted)
p.value <- mean(results2$t > est) + mean(results2$t < -1*est)
p.value
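
And as a quick sanity check (just a sketch), you can line the permutation p-value up against the classical t-test and the percentile bootstrap interval; in this simple setting they should tell roughly the same story:

Code:
# quick comparison: classical test, permutation p-value, bootstrap CI
t.test(values ~ ind, data=dat, var.equal=T)$p.value  # parametric p-value
p.value                                              # permutation p-value
boot.ci(results1, type="perc")$percent[4:5]          # 95% percentile CI endpoints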