Population correlation coefficient sampling distribution

anonymous1234

New Member
Hello,

When constructing a CI for the population correlation coefficient (ρ), I am told you cannot use +/- 1.96 because the sampling distribution is non-normal. However, I am also told that the (X, Y) pairs are assumed to be independent observations from a bivariate normal distribution.

How come normality is an assumption, but it cannot be used to calculate the CI?

Many thanks for your time. I hope this question makes sense!

Kind regards,

SF

fed2

Active Member
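The normality of X and Y is not a sufficient condition for every function of the Xi and Yi, i = 1, ..., n, to be normal. For example, consider f(x, y) = e^(x_bar): x_bar is normal, so f is log-normal. Now the sample correlation r (the estimate of rho) is also a function of the Xi and Yi, i = 1, ..., n, so there is no reason for it to be normal either. The Fisher z-transform is usually used to build the CI.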
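To make the Fisher z route concrete, here is a minimal sketch of how such an interval is usually computed. It assumes bivariate normal data, uses the standard approximation that atanh(r) is roughly normal with standard error 1/sqrt(n - 3), and the function name fisher_ci and the example values r = 0.6, n = 30 are made up for illustration only.

```python
import math

def fisher_ci(r, n, z_crit=1.959964):
    """Approximate CI for rho from a sample correlation r and sample size n.

    Assumes the data are (roughly) bivariate normal, so that atanh(r)
    is approximately normal with standard error 1 / sqrt(n - 3).
    z_crit defaults to the two-sided 95% normal quantile.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = math.atanh(r)                 # Fisher z-transform of r
    se = 1.0 / math.sqrt(n - 3)       # approximate SE on the z scale
    lo_z, hi_z = z - z_crit * se, z + z_crit * se
    # Back-transform the endpoints to the correlation scale.
    return math.tanh(lo_z), math.tanh(hi_z)

# Hypothetical example values, not taken from any real data set.
low, high = fisher_ci(0.6, 30)
print(low, high)
```

Note that the +/- 1.96 is still there, but it is applied on the z scale, so after back-transforming the interval is not symmetric around r.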

anonymous1234

New Member
Hello and thank you for your response. Sorry I didn't quite understand your comment, but do you mean that while we can assume normality of the sampling distribution for the population correlation coefficient, it is not strong enough to feasibly create a confidence interval?

Many thanks,

SF

fed2

Active Member
I think you actually mean the sampling distribution of the sample correlation coefficient. The population correlation is a constant, in the usual way of thinking about the issue, just like a population mean or SD. The sampling distribution of the sample correlation r is not normal just because the underlying data are normal: r is bounded between -1 and 1, so its distribution is skewed whenever rho is not 0, especially in small samples. Fisher's z-transform makes it much closer to normal, at least when the data are roughly bivariate normal. It is also best to have large sample sizes, as this aids the convergence to normality. The exact sampling distribution of a correlation coefficient is hard to derive and awkward to work with, which is why it is rarely used in practice.
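If you want to see this for yourself, a quick simulation helps. The sketch below is only illustrative: the choices rho = 0.8, n = 10, 5000 replications and the helper names are my own assumptions. It draws bivariate normal samples, computes r each time, and compares the skewness of r with the skewness of atanh(r).

```python
import math
import random

def pearson_r(xs, ys):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def skewness(values):
    """Moment-based sample skewness (third standardized moment)."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def simulate_r(rho, n, reps, seed=0):
    """Draw `reps` samples of size n from a standard bivariate normal
    with correlation rho and return the sample correlations."""
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - rho ** 2)
    rs = []
    for _ in range(reps):
        xs, ys = [], []
        for _ in range(n):
            x = rng.gauss(0.0, 1.0)
            e = rng.gauss(0.0, 1.0)
            xs.append(x)
            ys.append(rho * x + scale * e)  # gives corr(X, Y) = rho
        rs.append(pearson_r(xs, ys))
    return rs

# Illustrative settings (assumptions, not from the thread).
rs = simulate_r(rho=0.8, n=10, reps=5000)
zs = [math.atanh(r) for r in rs]
skew_r, skew_z = skewness(rs), skewness(zs)
print("skewness of r:", skew_r)
print("skewness of atanh(r):", skew_z)
```

With settings like these, r should come out clearly left-skewed (it is squeezed against the upper bound of 1), while the transformed values should be much closer to symmetric.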