Small difference but still significant?

Marvin85

Member
I have a sample size of 67 participants. They took pre- and post-knowledge surveys. The maximum score was 14 answers correct. I ran a paired t-test. These are my results:

pre: 9.3, SD 2.077207
post: 9.8, SD 2.157419

p-value: < 0.05

I have to present this to a non-technical audience. My supervisor asked me why the result is significant if the difference is so small. I tried to explain that it depends on the SD of the differences and the sample size, but I didn't have a clear answer. How can I explain this to her in simple terms? How should I interpret this small difference?

Thank you very much!
Marvin
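The paired t-test arithmetic behind a result like this can be sketched in a few lines. The SD of the per-person differences is not reported in the thread, so the value 1.5 below is purely an assumption for illustration:

```python
import math

# Sketch of the paired t-test arithmetic. The SD of the per-person
# differences is NOT reported in the thread; 1.5 is an assumed value.
n = 67
mean_diff = 9.8 - 9.3            # average gain of ~0.5 questions
sd_diff = 1.5                    # assumed SD of the paired differences

se = sd_diff / math.sqrt(n)      # standard error of the mean difference
t = mean_diff / se               # paired t-statistic

print(round(se, 3), round(t, 2))
```

With 66 degrees of freedom the two-sided 5% critical value is about 2.0, so a t near 2.7 gives p < 0.05 even though the gain is only half a question.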

BGM

TS Contributor
One possible way to explain your idea is to use visual aids: draw a normal pdf with mean different from 0, representing the sampling distribution of the difference. When the standard error is relatively small, the values are more concentrated around the mean, so the tail beyond 0 is smaller and we have stronger evidence that the mean difference is different from 0 (although close to it), and vice versa.
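This concentration idea can also be put in numbers: holding the same 0.5-question gain fixed, the standard error shrinks as the sample grows, and the p-value shrinks with it. In the sketch below the SD of the differences (1.5) is an assumption, and the normal tail is used as a stand-in for the t tail, which is close enough for df around 66:

```python
import math

# Approximate two-sided p-value from the normal tail (a stand-in for
# the t tail). sd_diff = 1.5 is an assumed SD of the per-person
# differences -- the thread never reports it.
def two_sided_p(t):
    return 2 * (1 - 0.5 * (1 + math.erf(abs(t) / math.sqrt(2))))

sd_diff = 1.5
p_at = {}
for n in (10, 30, 67, 200):
    se = sd_diff / math.sqrt(n)          # standard error shrinks with n
    p_at[n] = two_sided_p(0.5 / se)      # same 0.5 gain, different n
    print(n, round(p_at[n], 4))
```

The same half-question gain is nowhere near significant at n = 10 but clearly significant at n = 67: only the precision changed, not the effect.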

Marvin85

Member
Thank you very much for your explanation. But how would you answer if a non-technical person asked: how come it is significant if the score only increased by 0.5 questions (half a question) from pre to post?

Mean Joe

TS Contributor
One thing you could do is change from 9.8 vs 9.3 (out of 14) to 70% vs 66%: the difference between a C- and a D. Not a great idea, just an idea.

Another thing you can do is talk about how many participants answered (for example) 10+ questions correctly in pre vs post.

Dason

Well, you're doing a paired t-test, so it makes sense to compute the actual differences for each observation. What does the histogram of the differences look like?
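A minimal text histogram along these lines, using invented per-person changes (the thread never shows the raw data):

```python
from collections import Counter

# Hypothetical per-person score changes (post - pre); invented for
# illustration only -- the real data are not shown in the thread.
diffs = [0, 1, 0, 2, -1, 1, 0, 1, 1, -1, 2, 0, 1, 0, 1]

for change, count in sorted(Counter(diffs).items()):
    print(f"{change:+d} | {'#' * count}")
```

A picture like this makes the story concrete for a lay audience: most people stayed flat or gained a question, a few lost one, and the average lands near +0.5.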

PeterFlom

New Member
You haven't presented the right numbers for someone to judge properly. By giving the group means and SDs for a paired t-test, you don't show what you are actually testing, which is the change. Show the average difference and its SD.

You can also create a graph showing the change per person.

Marvin85

Member
Thank you very much to all. I will take all your suggestions and create a nice table to show this change and, most importantly, to answer the question "why is it significant if it is a small change?" for a non-technical audience. Explaining statistics to non-technical audiences can be challenging; it is a skill I am trying to develop. Thank you again!

bruin

Member
Correct me if I'm wrong, but it kind of sounds like you're looking for an explanation that will show your supervisor why the 0.5 difference is practically significant. If so, you won't find one, because statistical significance doesn't indicate practical significance.

I mean, a difference from 9.3 to 9.8 could very well have practical significance depending on the context, but the size of the p-value isn't what confers that practical significance.

Marvin85

Member
Hi Bruin,

Thanks for explaining this new perspective. You are right: one thing is statistical significance and another is practical significance. The questions arose because a staff member asked: why is it statistically significant if it only increased a very little? He also asked: just explain in simple words, what does significant mean? Why is this important for our social program? If you find a simple way to answer these questions in this context, I would appreciate it. Thank you!

noetsi

No cake for spunky
I missed bruin's comment, so this is a long-winded development of their point.

One thing that has been stressed more in the literature is the fundamental difference between statistical significance and substantive significance. The former involves the classical p-values and null hypothesis; what it really tests is whether the observed effect size is likely due to random error. For many years, I believe, there was a tendency to assume that if something was statistically significant, the effect size was substantively important (I hear this on my job all the time).

But that view has been challenged a lot recently. Just because an effect size is statistically significant does not mean it matters in any meaningful way. If your test is powerful enough, very small effect sizes, say differences in means before and after, can still produce very small p-values signifying statistical significance. Essentially, everything is statistically significant when you have enough power.

The key, then, is to decide: given that this effect size, the difference in means here, is real, does it really matter? And that is a decision for subject-matter experts, not statisticians, based on context, the nature of the phenomenon, etc.

bruin

Member
Well, if I were the one having to answer these questions, I'd probably do two things:

1. Explain that nothing in inferential statistics will help us decide whether the difference is important for your social program. Only you can decide whether it's a big enough difference to matter, or whether whatever you did to produce that difference is worth continuing to invest in.

2. Calculate a confidence interval estimating the population mean difference, because in my experience confidence intervals are far more intuitive than p-values. If you look at the actual definition of a p-value, it's kind of like the answer to a question that virtually no one would ever ask. Studies have found that misunderstandings about the meaning of a p-value abound, even among applied researchers who regularly use hypothesis tests in their research.
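The confidence-interval suggestion can be sketched on assumed numbers: a mean gain of 0.5, n = 67, an assumed SD of differences of 1.5 (not reported in the thread), and 2.0 as the approximate two-sided critical t for df = 66:

```python
import math

n, mean_diff, sd_diff = 67, 0.5, 1.5   # sd_diff is an assumed value
t_crit = 2.0                           # approx. two-sided 5% critical t, df = 66
se = sd_diff / math.sqrt(n)
lo, hi = mean_diff - t_crit * se, mean_diff + t_crit * se
print(f"95% CI for the mean gain: ({lo:.2f}, {hi:.2f})")
```

An interval like (0.13, 0.87) excludes zero, which is all that "significant" means here, while also showing the audience that the plausible average gain is at most about one question, which speaks directly to practical importance.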

SigmundFreud

New Member
My interpretation of the results would be that there is a significant difference, but the effect size (presuming you have calculated Cohen's d) may indicate that this difference is trivial. You could also run a G*Power analysis to determine whether the sample size (N = 67) was sufficient.
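For a paired design, Cohen's d is simply the mean difference divided by the SD of the differences. Again, the SD of 1.5 is an assumption, since the thread does not report it:

```python
# Cohen's d for a paired design: mean difference / SD of the differences.
mean_diff = 0.5
sd_diff = 1.5        # assumed SD of the paired differences (not in the thread)
d = mean_diff / sd_diff
print(round(d, 2))
```

With these assumed numbers d comes out around 0.33, conventionally a smallish effect; with a larger SD of differences it would be smaller still.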