NOOB Interpreting non-significant findings

Hi all,

I'm very new to stats. I'm studying health sciences. I'm struggling to describe what's happening with one of my scatterplots. The findings are non-significant; however, the graph clearly shows a positive correlation. Can anyone tell me what's going on with that? Is it because of the outlier I've got? Thanks :)



TS Contributor
In your sample with n=23 you have a small linear relationship between the variables.

However, the test of significance deals with the question of whether there is
a correlation in the population from which the sample is drawn. The null hypothesis says:
"the sample is drawn from a population where the correlation is 0.00000000...", which implies
that the sample coefficient's difference from 0.0000... is completely due to chance.

The correlation in your sample is not tremendously large AND at the same time the
sample is small. So one would usually decide that there is not enough evidence
to reject the null hypothesis. The null hypothesis is rejected if a sample correlation at least
as large as the observed one would be unlikely (often, if p < 0.05) when sampling from a
population where the correlation is 0.00000.... If p > 0.05, then routinely the null hypothesis
is retained. Retaining it means it has not been ruled out, not that it has been shown to be true.
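
To make the "completely due to chance" idea concrete, here is a rough simulation sketch. It draws many samples from a population where the two variables are truly unrelated and counts how often the sample correlation comes out at least as far from zero as an observed one. The observed value of 0.30 is a made-up stand-in (the thread does not give the actual coefficient), and the function names and the normally distributed population are assumptions for illustration only.

```python
import math
import random


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def share_at_least(r_observed, n, draws=5000, seed=1):
    """Share of null samples of size n with |r| at least |r_observed|.

    Each null sample pairs two independent standard normal variables,
    so the population correlation is exactly zero by construction.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(draws):
        xs = [rng.gauss(0, 1) for _ in range(n)]
        ys = [rng.gauss(0, 1) for _ in range(n)]
        if abs(pearson_r(xs, ys)) >= abs(r_observed):
            hits += 1
    return hits / draws


# Hypothetical observed correlation of 0.30 (not taken from the thread).
p_small = share_at_least(0.30, n=23)   # sample size from the thread
p_large = share_at_least(0.30, n=200)  # same r, much larger sample
print(f"n=23:  simulated p is about {p_small:.3f}")
print(f"n=200: simulated p is about {p_large:.3f}")
```

With n=23 the simulated share should land well above 0.05, so a correlation of that size is not unusual under the null. With n=200 the same coefficient almost never arises by chance, which is why sample size matters as much as the size of r.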




Less is more. Stay pure. Stay poor.
Just to add a little more, look at the slope of the line. It is barely steeper than a flat line, which is what the null would look like. Given the small sample, you can't rule out that there is no relationship in the source population. The output also reports the r^2 value. It may be educational to provide 95% confidence intervals on both of these estimates, instead of solely looking at the estimates.

Hi both,

Thank you for taking the time to respond; you've both given me some really helpful suggestions. Hlsmith - could you explain what you mean by providing "95% confidence intervals on both estimates"? I don't quite understand.

Sorry - again - this is all completely new to me so apologies if I'm not understanding something blindingly obvious.


Less is more. Stay pure. Stay poor.
Whenever you are doing frequentist statistics (non-Bayesian statistics) you can provide confidence intervals on the generated estimate. Your slope and the R^2 value are both estimates of the corresponding values in the target population. Confidence intervals can be presented at whatever level of confidence you desire; the level sets how many standard errors the interval extends on either side of the estimate (about 1.96 for 95%, taken from the standard normal distribution). People traditionally use 95% CIs. The interpretation is that, if you repeatedly resampled from the target population and built a CI each time, 95% of those CIs would include the true population value. The width of the interval is informative too: with a small sample, the interval has to cover a wide range of values to capture the true value with that level of confidence. There are likely better descriptions online.
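
As a rough illustration of how wide such an interval gets with a small sample, here is a sketch of a 95% CI for a correlation coefficient. It uses the Fisher z-transformation, which is one common approach for a correlation and not the method for the slope or R^2 mentioned above. The value r = 0.30 is hypothetical (the thread does not state the actual coefficient), and `correlation_ci` is just an illustrative name.

```python
import math
from statistics import NormalDist


def correlation_ci(r, n, level=0.95):
    """Approximate confidence interval for a Pearson correlation.

    Uses the Fisher z-transformation: atanh(r) is roughly normal
    with standard error 1 / sqrt(n - 3).
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    z = math.atanh(r)
    se = 1 / math.sqrt(n - 3)
    # Build the interval on the z scale, then transform back to r.
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Hypothetical r = 0.30 with the thread's sample size of 23.
low, high = correlation_ci(0.30, 23)
print(f"n=23:  95% CI roughly ({low:.2f}, {high:.2f})")

# Same hypothetical r with a much larger sample, for comparison.
low_big, high_big = correlation_ci(0.30, 200)
print(f"n=200: 95% CI roughly ({low_big:.2f}, {high_big:.2f})")
```

Under these assumptions the n=23 interval stretches from slightly below zero to a fairly strong positive correlation, so it includes zero and matches the non-significant test. The n=200 interval is much narrower and stays above zero.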