I found the following statement in an undergraduate psychology textbook:
The way I would read those sentences, they are clearly incorrect. Look at the last sentence:
This sentence refers to 5% "of research conclusions." That sounds to me like mistaken research conclusions of all types -- Type I and Type II errors combined. (I guess you could argue that "flukes" specifically refers to Type I errors, but to think that an undergraduate would read it that way is really a stretch).
Since I believe any average undergraduate would read this sentence to mean "95% of all studies get their conclusion correct and 5% get it wrong," then by that reading, the 5% number is clearly inaccurate.
In principle, I think you could calculate the proportion of all research studies conducted that come to "mistaken" conclusions about H0, if you did the following:
Let alpha = type I error rate
Let beta = type II error rate
Let p0 = proportion of all studies for which H0 is true
Let p1 = proportion of all studies for which H0 is false
Then the proportion of research conclusions which are incorrect could be calculated as:
p0*alpha + p1*beta
Whereas, the authors of this book seem to simply be equating alpha itself with the percentage of mistaken research conclusions, which would only be true if EVERY study EVER conducted had a true H0 (i.e. a useless treatment). Let's hope and pray that is not true.
Obviously, to actually calculate the value of p0*alpha + p1*beta requires all kinds of things - knowing p0 and p1 (which we never would) and knowing beta with perfect accuracy (which would be contingent on a flawless power analysis with a perfectly accurate effect size estimate - unrealistic).
Lastly, is there any way this sentence can be saved? To me it seems wrong in so many ways that it can't be salvaged:
I am thinking of emailing the author of this textbook, so I'm wondering:
Usually researchers test their hypotheses at the .05 (or 5%) significance level. If the test is significant at this level, it means that researchers are 95% confident that the results from their studies indicate a real difference and not just a random fluke. Thus, only 5% of research conclusions should be 'flukes.'
only 5% of research conclusions should be 'flukes.'
Since I believe any average undergraduate would read this sentence to mean "95% of all studies get their conclusion correct and 5% get it wrong," then by that reading, the 5% number is clearly inaccurate.
In principle, I think you could calculate the proportion of all research studies conducted that come to "mistaken" conclusions about H0, if you did the following:
Let alpha = type I error rate
Let beta = type II error rate
Let p0 = proportion of all studies for which H0 is true
Let p1 = proportion of all studies for which H0 is false
Then the proportion of research conclusions which are incorrect could be calculated as:
p0*alpha + p1*beta
Whereas, the authors of this book seem to simply be equating alpha itself with the percentage of mistaken research conclusions, which would only be true if EVERY study EVER conducted had a true H0 (i.e. a useless treatment). Let's hope and pray that is not true.
Obviously, to actually calculate the value of p0*alpha + p1*beta requires all kinds of things - knowing p0 and p1 (which we never would) and knowing beta with perfect accuracy (which would be contingent on a flawless power analysis with a perfectly accurate effect size estimate - unrealistic).
Lastly, is there any way this sentence can be saved? To me it seems wrong in so many ways that it can't be salvaged:
If the test is significant at this level, it means that researchers are 95% confident that the results from their studies indicate a real difference and not just a random fluke.
- Do you agree with the way I am reading these sentences (through an undergraduate's eyes)?
- Do you agree with my math that the "proportion of mistaken studies" could in principle be calculated as p0*alpha + p1*beta? And that the author is simply substituting alpha itself for this value?
- Do you have any suggestions how this could be worded better, taking into account that the undergraduates reading it:
- may not have had a statistics class
- may not be familiar with Type I and Type II errors
- may only think in binary terms of "correct/incorrect" conclusions, rather than in terms of conditional probabilities like "correct/incorrect IF H0 TRUE" and "correct/incorrect IF H0 FALSE"
Last edited: