
Some possible differences:

- We are only interested in large effects that stand out from the background noise (no Andrew Gelman's kangaroo here). Changes cost money, so they have to pay for themselves.
- Results must be replicated or they are not implemented (too costly)
- We don't publish to gain publicity (results become proprietary trade secrets)
- There is no motivation to exaggerate or make outrageous claims (you will lose your job if you are wrong)

The problem with p values is that you can have a significant p value while your effect size is meaningless. I have seen people judge the value of something purely by which p value is lower. At work, people without a statistical background treat statistical significance as critical: if a result is significant they see it as important, and if it is not they don't, despite the fact that we are working with entire populations.

When you combine this with issues of statistical power (a major issue for some tests) and generalizability (something which it seems to me statisticians in the social sciences commonly pay too little attention to), I agree we should move away from a focus on p values. We should stress effect size and how likely it is that you can generalize from your data to the population.
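A toy calculation makes the point concrete. This is a minimal sketch (the means, SD, and sample size are invented) showing how a negligible effect becomes "significant" once n is large enough:

```python
import math

def z_test_and_cohens_d(mean1, mean2, sd, n):
    """Two-sided two-sample z-test p-value and Cohen's d,
    assuming equal SDs and equal group sizes of n."""
    d = (mean1 - mean2) / sd                 # standardized effect size
    se = sd * math.sqrt(2.0 / n)             # standard error of the difference
    z = (mean1 - mean2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided normal tail probability
    return p, d

# A trivially small difference (0.02 SDs) with a huge sample:
p, d = z_test_and_cohens_d(100.02, 100.00, sd=1.0, n=1_000_000)
print(f"p = {p:.1e}, Cohen's d = {d:.2f}")   # p is far below 0.05, yet d is negligible
```

The p value rewards sample size as much as it rewards effect, which is exactly why a significance star tells you nothing about whether the difference matters.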


It is the way things are commonly done, in my observation, and I doubt it will change. Do you really think many people decide what effect size is important before they run the numbers, even academics? I never did, and I never read anyone who suggested it. And I have read a lot of articles over the years, including in elite journals.

I don't know what "inherently bad" means. I think there should be more emphasis on effect size and less on p values. In much of the social science literature, p values are all that comes up; effect sizes are rarely discussed. It is simply: this is important and this is not, because it is below p or it is not. Whether you can generalize from your cases to the population is apparently not that important (although there is always a one-sentence warning about this at the end of the document, which I think is entirely ignored).

I emphasize planning the design in advance and thinking hard about the sample sizes required, because samples are expensive to obtain: you are interfering with production uptime and probably generating some scrap or rework to boot. I am just saying that some disciplines could learn from the practices of other disciplines. We don't have all the answers. For example, many of our engineers still hold to the One Factor at a Time approach, when DOE is much better. On the positive side, we don't have a replication crisis or a questioning of fundamentals.
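As a rough illustration of the OFAT-vs-DOE contrast (the factor names are hypothetical, levels coded -1/+1):

```python
from itertools import product

# Hypothetical process factors at two coded levels each.
factors = {"temp": (-1, +1), "pressure": (-1, +1), "speed": (-1, +1)}

# Full factorial DOE: every combination, so interactions can be estimated.
doe_runs = list(product(*factors.values()))

# One Factor at a Time: vary each factor from a baseline, one by one.
baseline = (-1, -1, -1)
ofat_runs = [baseline] + [
    tuple(+1 if j == i else baseline[j] for j in range(3)) for i in range(3)
]

print(len(doe_runs), "DOE runs vs", len(ofat_runs), "OFAT runs")
# The OFAT runs never set two factors high together, so interaction
# effects (e.g. temp x pressure) are invisible to that design.
```

OFAT uses fewer runs, but it can only see main effects; the factorial design is what lets you detect interactions at all.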

Your comment about generalization is on point. Most experiments are performed under tightly controlled conditions and will not scale up to the noise found in the real world. I cover the concept of narrow and broad inference space extensively in my classes.



I work in the world of vocational rehabilitation, where even why we spend money (and how) is a mystery. I do think that, in general, too much attention is given to statistics, because that is what most are taught in the social sciences (and only the basics), and far too little to DOE. Personally I think the latter is the gold mine that needs far more focus than it gets.

I agree about learning across disciplines, although there is a danger that those who adopt methods from elsewhere may not fully understand what they are adapting.

https://www.vox.com/science-and-hea...values-statistical-significance-redefine-0005

The argument confuses p values, alpha, and the meaning of the word "significant".

We can't adjust p values; they are what they are. We can adjust alpha to wherever we want it. Where we made a major mistake is with the word "significant" and our understanding of its definition.

In statland, significant means "p is closer to mu than alpha". In the real world, significant means "important".

The problem starts with statfolk confusing the two meanings, not understanding the meanings, and misinforming the non-statfolk.

If the readers of statworks do not understand the meanings, shame on them.

If the statfolk do not include alternate alphas, such as .01 and .10 in their statwork, shame on them.

If we, the statteachers, don't explain clearly, shame on us.

And, we should change "significance" to "pass/fail" or "accept/reject".
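The "alternate alphas" suggestion amounts to reporting the same p against several thresholds instead of one. A minimal sketch (the p value here is invented):

```python
# One p value, read against three conventional alphas.
p = 0.03  # hypothetical result
for alpha in (0.10, 0.05, 0.01):
    verdict = "reject Ho" if p < alpha else "fail to reject Ho"
    print(f"alpha = {alpha:.2f}: {verdict}")
# The decision flips between alpha = 0.05 and alpha = 0.01,
# which is exactly the information a single pass/fail hides.
```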

From my book:

"In normal English, "significant" means important, while in Statistics "significant" means probably true (not due to chance). A research finding may be true without being important. When statisticians say a result is "highly significant" they mean it is very probably true.

When the terms enter the non-statistical or real world, trouble begins. They have certain specific meanings to statisticians that are NOT the meanings that non-statistician English speakers understand."

While fixing poor choices of stattalk, we should fix/delete/change any "confidence" and ALL "margin of error" use; MOE is about the clueless newsfolk announcing to the uncaring.

"A p-value is calculated under the assumption that the null hypothesis is true." Not by me. I don't think this or any assumption affects the calculation of p.

"P-values do not quantify the probability of "truth" for any "result" or hypothesis." Not in the sense that p = xx.xx% sure that Ho is accepted; but certainly in the sense that p = .49 = 49% = 49/100 makes me more sure that Ho should be accepted than does p = .06 = 6% = 6/100.

Remember, it's statistics, we can be precise AND vague, can't prove anything.


Then you also claim you [don't assume Ho is true when calculating a p-value]; how are you calculating p-values? That assumption is literally embedded in the calculation. Either you're unaware of the assumption or you're not calculating a p-value. This is definitional: assume different null hypotheses and you get different p-values on the same data.

Your last statement that for any p1>p2 you should be [more sure of accepting Ho] is also not generally true. P-values have nothing to do with accepting Ho.

This is part of the issue embedded in "p-value controversy."
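The "definitional" point is easy to see in a one-sample z test, where the null value mu0 sits inside the test statistic (the data below are invented):

```python
import math

def one_sample_p(xbar, mu0, sd, n):
    """Two-sided one-sample z-test p-value. The null value mu0
    is part of the test statistic, so changing Ho changes p."""
    z = (xbar - mu0) / (sd / math.sqrt(n))
    return math.erfc(abs(z) / math.sqrt(2.0))

# Same data (xbar = 10.5, sd = 2, n = 25), two different nulls:
print(one_sample_p(10.5, mu0=10.0, sd=2.0, n=25))  # Ho: mu = 10.0 -> p ~ 0.21
print(one_sample_p(10.5, mu0=10.5, sd=2.0, n=25))  # Ho: mu = 10.5 -> p = 1.0
```

Nothing about the data changed between the two calls; only Ho did, and p moved from about 0.21 to exactly 1.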

Perhaps I'm wrong.

With two sets of numbers distributed Normal, calculate the mean and standard deviation of each. Estimate the distribution of sample means of one set. The mean of the other set lies somewhere on that distribution, a certain distance from its mean. I call that the p value.

We're done operating on the two sets of numbers.

We now write Ho: mu1 = mu2

and

Ha: mu1 > mu2, or

Ha: mu1 < mu2, or

Ha: mu1 ≠ mu2

The p value will sorta change, but sorta not, as we change Ha. p can be 10% or 90% or 20% depending on which Ha is chosen; but it's like saying Boston to Chicago = 1000 miles, Chicago to Los Angeles is 2000 miles, Boston to Los Angeles is 3000 miles. Three numbers describe Chicago's location, Chicago's location hasn't changed.
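That 10%/90%/20% pattern falls straight out of the tail areas. A sketch with a z statistic of 1.25, chosen so the numbers land near the ones in the post:

```python
import math

def phi(z):
    """Standard normal CDF."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

z = 1.25  # one test statistic from one data set

p_upper = 1.0 - phi(z)               # Ha: mu1 > mu2  -> ~0.11
p_lower = phi(z)                     # Ha: mu1 < mu2  -> ~0.89
p_two = 2.0 * (1.0 - phi(abs(z)))    # Ha: mu1 != mu2 -> ~0.21
print(p_upper, p_lower, p_two)
# Note p_upper + p_lower = 1 and p_two = 2 * min(p_upper, p_lower):
# the three p values are three readings of the same location,
# like the three city distances in the analogy.
```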

While Ha changes p, I don't see where Ho changes p.

Perhaps I'm wrong.

Statistics is about translating numbers into words, and is darn difficult. I'm talking here about the interpretation, the definitions, commonly used.

If x bar is closer to mu than alpha, the result of the test/look is significant.

If x bar is closer to mu than alpha, and close to alpha, the result of the test/look is less significant. p is small.

If x bar is closer to mu than alpha, and far from alpha, the result of the test/look is more significant. p is large.

"Your last statement that for any p1>p2 you should be [more sure of accepting Ho] is also not generally true. P-values have nothing to do with accepting Ho."

You are correct; my words are incorrect. If "x bar is closer to mu than alpha" is the accept condition, then accept/reject has nothing to do with p: reject when alpha = 5% and p = 3%, and reject when alpha = 5% and p = 4.99%.

However, the greater p/alpha, the more sure I am of the test result. A digital person lives in accept/reject land. I live in an analog world, where as p approaches 50%, my smile widens. Why am I wrong?



"In statland, significant means "p is closer to mu than alpha". In the real world, significant means "important"." Well, that is true. I think the key is that it means the effect size is substantively important, not statistically significant, which is a very different concept. This is the heart of the problem, and why I am not a great fan of p values. Most people are not statisticians, of course. I have problems with this every time a p value comes up.
