High significance of correlation coefficient with small sample size

#1
Dear forum members! A few years ago Guillaume Rousselet posted the results of his helpful study titled "Small n correlations cannot be trusted" here: https://garstats.wordpress.com/2018/06/01/smallncorr/. It clearly shows the effect of small sample size on the reproducibility of correlation results. In my own analysis, I have noticed that with a sample size as small as 3 the correlation coefficient can come out very high (e.g. r = 0.999) and appear significantly different from zero (p-value = 0.014). I wonder if anyone can share their opinion on whether the results of significance tests for r with small sample sizes should be distrusted too? Thank you.
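For reference, here is a minimal R sketch of the situation described above. The three data points are purely hypothetical, chosen only so that r comes out near 0.999; the exact p-values depend on the data and on whether a one- or two-sided test is used:

Code:
# Hypothetical three-point example constructed so that r is about 0.999
x <- c(1, 2, 3)
y <- c(1, 2.05, 2.95)

cor(x, y)                                        # ~0.999
cor.test(x, y)$p.value                           # two-sided: ~0.03
cor.test(x, y, alternative = "greater")$p.value  # one-sided: ~0.014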
 

hlsmith

Not a robit
#2
Well, I am usually fairly biased toward small samples, but n = 3 seems pretty small even to me. I also wonder whether the assumptions can be met with such small samples. I would also throw into the mix that, for say an r = 0.999, a p-value of 0.014 isn't that small: a result that extreme, given that the null is true, would be seen about 1.4% of the time when repeating the study. IMHO, getting a significant p-value for a correlation is usually a pretty easy feat, especially as the sample size increases. I know that running correlations doesn't mean you are trying to get at causality, but trying to illustrate or articulate the underlying relationship between the variables is important in these circumstances. Also, I don't know your field, but an r = 0.999 is bizarrely high. Were these repeated measures, or proxies/surrogates of the same variable of interest? To better digest these sorts of things, perhaps making a scatter plot with a confidence cloud/band around it would help you work through the possibilities given the dataset at hand.
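A minimal base-R sketch of that kind of plot, using three made-up points and a 95% confidence band around the least-squares line (the data and the 95% level are only for illustration):

Code:
# Scatter plot with a 95% confidence band around the fitted line
x <- c(1, 2, 3)
y <- c(1, 2.05, 2.95)

fit  <- lm(y ~ x)
grid <- data.frame(x = seq(min(x), max(x), length.out = 100))
ci   <- predict(fit, newdata = grid, interval = "confidence")

plot(x, y, pch = 19, xlab = "x", ylab = "y")
lines(grid$x, ci[, "fit"])            # fitted line
lines(grid$x, ci[, "lwr"], lty = 2)   # lower 95% limit
lines(grid$x, ci[, "upr"], lty = 2)   # upper 95% limit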
 

spunky

Doesn't actually exist
#3
I wonder if anyone can share their opinion on whether the results of significance tests for r with small sample sizes should be distrusted too? Thank you.
Yes, they should be. In general, it is safe to say you should distrust **anything** that has such a small sample size.

A small simulation example to illustrate:

Code:
# Simulate 10,000 correlations between two independent N(0,1) samples of size 3
rr <- double(10000)

for (i in 1:10000) {
  a <- rnorm(3)
  b <- rnorm(3)
  rr[i] <- cor(a, b)
}

> sum(abs(rr) > .9) / 10000   # proportion of |r| greater than 0.9
[1] 0.2804
With a true population correlation of 0 and N = 3, across 10,000 replications almost 30% of the sample correlations exceed 0.9 in absolute value. Let's look at the empirical density plot. Can you see how you end up with a bimodal distribution whose peaks sit near the extreme ends of the correlation scale? And this is merely an artifact of the small sample, because the true population value is 0.
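For anyone who wants to reproduce the figure, a one-liner using the rr vector from the simulation above (the plot title is just a label I added):

Code:
# Empirical density of the 10,000 simulated correlations
plot(density(rr), main = "Sample correlations, n = 3, true correlation = 0")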

So, if anything, a large correlation with a small sample size should make you distrust said result (and associated p-value) MORE and not less.


[pic0001.jpg: empirical density plot of the 10,000 simulated correlations, bimodal with peaks toward -1 and +1]
 

Dason

Ambassador to the humans
#4
I agree with spunky but at the same time completely disagree with his argument. One should be skeptical of results from small sample sizes, but looking at the raw size of the correlation, arguing that you shouldn't trust it based solely on that, and then extending the argument to distrusting the p-value as well seems a bit much. *If* the assumptions of the test are true, we certainly can (and in a lot of cases will) see quite high values for the correlation. However, the p-value will still be uniform under the null hypothesis *if* the assumptions of the test are met. We can easily extend spunky's simulation to include the p-value and show that it is still uniform under the null hypothesis, so there isn't a reason to distrust it solely because the sample size is small (for the reasons spunky provided).

Code:
# Same simulation as spunky's, but also store the p-value from cor.test
n <- 10000   # number of replications
m <- 3       # sample size per replication
rr <- double(n)
p  <- double(n)

for (i in 1:n) {
  a <- rnorm(m)
  b <- rnorm(m)
  rr[i] <- cor(a, b)
  p[i]  <- cor.test(a, b)$p.value
}

hist(p)   # roughly flat: the p-values are uniform under the null
[Rplot.png: histogram of the 10,000 p-values, approximately uniform on (0, 1)]
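A quick numerical check of the same point, using the p vector from the code above: if the p-values are uniform under the null, the proportion below 0.05 should be close to 0.05 up to Monte Carlo error.

Code:
mean(p < 0.05)   # empirical rejection rate at the 5% level; should be near 0.05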

If we are going to distrust the results, it should be on the basis that with so few observations it is difficult to assess whether the assumptions of the test are met.
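To illustrate that difficulty with a small sketch of my own (not something from the posts above): a formal normality check such as shapiro.test will run on three observations, but with so few points it has essentially no power, so it tells you very little either way.

Code:
# Normality checks are nearly uninformative with n = 3:
# even clearly non-normal (exponential) data will usually pass.
set.seed(1)
x <- rexp(3)       # strongly skewed data
shapiro.test(x)    # with only 3 points the test has almost no power to reject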
 
#6
Dear Dason, El Spunky, and Hlsmith,
Thank you for your clear, helpful explanations. You are right, the r = 0.999 was unnaturally high because the correlation example was purely hypothetical (for educational purposes). In fact, when a chi-square test for association is applied to the same example as a conservative non-parametric alternative, the p-value is 0.2.