What do YOU mean by "statistically significant"?

jawon

New Member
#1
Here's a question more about communication than statistical methods.

In my experience, the term "statistically significant" is used often by the people I serve, who almost always aren't stats/data people. They just want to confirm that some sort of formal statistical analysis was done.

When I look in glossaries of various stats books, "statistically significant" usually refers to results where the p-value is less than 0.05. My sense is that this is not really what our customers are looking for.

For example, a chi-square test can be "statistically significant," but my understanding is that Cramer's V needs to be checked to make sure the significance isn't just due to a large sample size. So, technically, I can say yes, it is "statistically significant," but they probably want to know whether the results were "statistically repeatable" or something to that effect.
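To make the large-sample point concrete, here's a rough sketch (the 2x2 table is invented for illustration, and it assumes numpy and scipy are available):

# Sketch: a trivially weak association becomes "significant" at large n,
# but Cramer's V stays tiny. The table below is made up for illustration.
import numpy as np
from scipy.stats import chi2_contingency

table = np.array([[50_500, 49_500],      # n = 200,000 total
                  [49_500, 50_500]])

chi2, p, dof, expected = chi2_contingency(table)

n = table.sum()
k = min(table.shape) - 1                 # 1 for a 2x2 table
cramers_v = np.sqrt(chi2 / (n * k))

print(f"p-value    = {p:.2g}")           # tiny, so "statistically significant"
print(f"Cramer's V = {cramers_v:.3f}")   # ~0.01, a negligible association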

So I guess my question is how do others answer questions about "statistical significance" or more generally, how do you try to speak the same language as your customers?
 

DAV

New Member
#2
Darn good question!

Before I answer, know that I am not a practicing statistician although I am a member of the ASA.

I think most people who are looking for answers are really hunting for the probability of being right or wrong and they are usually in the best position to know the cost/benefit of each. So, instead of saying "it's statistically significant at the 95% level" (or similar language), it would be far better to say: "There's a 1 in 20 chance the <whatever> isn't real." That's something most people can grasp quite readily.

I guess it really depends on the goal. Most business people and gamblers are after something that will give an edge. The best thing is probably determining how important the answer is to your client before you begin. People in other fields have other goals, one of which, not to be cynical, is to look for an excuse to publish. (When one sees that a p-value cutoff of 0.20 was used and well hidden in a footnote, it's hard not to be cynical. Yes, that was in a real epidemiology paper and, no, I won't link to it.) OTOH, a venture capitalist might be happy with that threshold.

I note you didn't seem to question a p-value threshold of 0.05. Why not? What's so special about it except that a lot of people use it? Fisher's Law or something?
 
#3
For example, a chi-square test can be "statistically significant," but my understanding is that Cramer's V needs to be checked to make sure the significance isn't just due to a large sample size. So, technically, I can say yes, it is "statistically significant," but they probably want to know whether the results were "statistically repeatable" or something to that effect.

So I guess my question is how do others answer questions about "statistical significance" or more generally, how do you try to speak the same language as your customers?
I think it's good that we have different concepts for different things. Effect size is one thing, statistical significance another, power another, replicability yet another, etc. "Statistically significant" usually means one thing, namely p < alpha. But it (or rather "statistical significance") can also refer to the observed p-value, meaning the probability that you would observe a test statistic of the observed size or more extreme given that the null hypothesis is true (not necessarily the nil hypothesis).
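Written out as a formula (my notation, for a one-sided test with test statistic T and observed value t_obs):

p = \Pr\left( T \ge t_{\mathrm{obs}} \mid H_0 \right)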

If I talk to somebody who knows nothing about statistics I would not use "statistically significant", because, lacking any other connotations for the concept, they often think it means "BIG". That's perfectly understandable, because it is quite a weird concept to begin with: conditioning on unobserved regions of the data forces you to specify your subjective intentions about what you would do in hypothetical replications (to generate a sampling distribution), which in turn gives you even more weird concepts like stopping rules and corrections for looking at your data (do the data all of a sudden carry less evidence because you glanced at them for a few seconds? sounds like magic). But that's a long story. :)
 

jawon

New Member
#4
Thanks so much for the thoughts. This leads me to another question that might generate some more practical dialogue (tho I realize it's too general, but oh well)...

What are some layman phrasings for communicating stat findings so that you convey the appropriate level of importance (i.e., not overselling)?

For example, for a simple independent t-test, I might simply say "The difference in the averages does not appear to be random."
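For instance, a minimal sketch of producing a plain-English line from an independent t-test (the data and the exact wording are just placeholders, and it assumes scipy):

# Sketch: run an independent t-test, then report it in plain language.
# The two groups below are made-up numbers for illustration only.
from scipy.stats import ttest_ind

group_a = [12.1, 11.8, 13.0, 12.4, 12.9, 11.5, 12.7]
group_b = [13.2, 13.9, 12.8, 14.1, 13.5, 13.7, 14.0]

t_stat, p_value = ttest_ind(group_a, group_b)

if p_value < 0.05:
    print(f"The difference in the averages looks larger than chance "
          f"variation alone would produce (p = {p_value:.3f}).")
else:
    print(f"The difference in the averages is within what chance "
          f"variation alone could produce (p = {p_value:.3f}).")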

I presume that each stat method will probably have variations depending on what exactly you did with it.
 
#5
it would be far better to say: "There's a 1 in 20 chance the <whatever> isn't real." That's something most people can grasp quite readily.

Is that a correct interpretation? And that is the trouble when you start freestyling, and why (in my meager graduate school opinion) "statistically significant" is such an oft-used phrase. Language that has been vetted is good.

In general though the probability of observing something under the null hypothesis is not the probability that the null hypothesis is true. (ie it is not "1 in 20 chance that whatever is not real")

I've been meaning to tackle my advisor and make sure I am right about this, but the most common mistake I see in interpretation is taking the probability of the data under the framework of the null (or the alternative) and somehow turning it into a statement about the probability that the hypothesis itself is true or not true. But there is no probability/density framework that makes that even conceivable.

We do not know the probability that the null hypothesis is true. We would have to be Bayesians to even consider the concept (and we might be). In that sense it is analogous to "the probability that the true mean is in the confidence interval" being bad language (confidence interval procedures cover population means with given probabilities--the true mean is either in a particular confidence interval or it is not). But in an even deeper sense (not analogous to confidence intervals) there is no sense of how you would associate a probability with the null being true. That is a far different statement than the probability of observing the data assuming the null.

It is convention that we reject a null when the probability of observing something this extreme is less than .05. But that is not because there is only a .05 probability that the null is true.
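To illustrate the distinction, here's a rough simulation sketch (the 50% share of true nulls, the effect size, and the sample sizes are all assumptions I made up): among the "significant" results, the fraction coming from true nulls is generally not 5%.

# Sketch: "p < .05" is not "a 5% chance the null is true".
# Assume (arbitrarily) that half of all studied effects are truly null,
# and the rest have a modest real effect. Count how many "significant"
# results actually came from true nulls.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
n_experiments, n_per_group, effect = 20_000, 20, 0.5

false_rejects = true_rejects = 0
for _ in range(n_experiments):
    is_null = rng.random() < 0.5                   # assumed 50% true nulls
    a = rng.normal(0.0, 1.0, n_per_group)
    b = rng.normal(0.0 if is_null else effect, 1.0, n_per_group)
    _, p = ttest_ind(a, b)
    if p < 0.05:
        if is_null:
            false_rejects += 1
        else:
            true_rejects += 1

share = false_rejects / (false_rejects + true_rejects)
print(f"Share of 'significant' results where the null was true: {share:.2f}")
# Typically well above 0.05 here, because power is modest at n = 20 per group.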
 

JohnM

TS Contributor
#6
Thanks so much for the thoughts. This leads me to another question that might generate some more practical dialogue (tho I realize it's too general, but oh well)...
What are some layman phrasings for communicating stat findings so that you convey the appropriate level of importance (i.e., not overselling)?

For example, for a simple independent t-test, I might simply say "The difference in the averages does not appear to be random."

I presume that each stat method will probably have variations depending on what exactly you did with it.
Guys.....IMHO you're getting off-track and going down the wrong path. Assuming that you're trying to communicate findings to an audience of non-statisticians...they already assume you know a lot about statistics; you don't need to convince them of that.

What you need to do is put the findings in their "language" - use things like the size of the effect or something else that is meaningful to them.

Otherwise, if you try to mangle phrases that involve probabilities and so forth, they will perceive that you don't understand their issues and can't help them.......
 

DAV

New Member
#7
Is that a correct interpretation? And that is the trouble when you start freestyling, and why (in my meager graduate school opinion) "statistically significant" is such an oft-used phrase. Language that has been vetted is good.

In general though the probability of observing something under the null hypothesis is not the probability that the null hypothesis is true. (ie it is not "1 in 20 chance that whatever is not real").
Well, look at it from the client's perspective.

Say I'm the CEO of MegaPharms and I want to know if the experiment I just ran shows our new wonder drug is more effective than a placebo. The only course I've ever taken was "Intro to Stats 101" and I only took it so I could be next to Betty Bazongas. I did pay enough attention to get the impression that stats could help here so I hire YOU to give me something for the next board meeting.

You come back to me and say "We do not know the probability that the null hypothesis is true. Blah Blah Blah" and what I hear is "I DON'T KNOW". IOW: Hiring you was a waste of good money. The question wasn't "Are random occurrences real?"; the question was "Is it reasonable to assume our drug is more effective than doing nothing?" Answering the wrong question is wasting everyone's time and money, including yours.

Yes, the proper answer is best stated in betting-odds terms. That's something a business person can understand. Also, the LAST thing you want to do is sound like you're a whole lot smarter than the guy who signs the paychecks. It's best to keep the techno-talk in your pants and assume that the "English Spoked Hear" sign was taken for annual cleaning. Mr. Spock is seldom invited to lunch.
 

bugman

Super Moderator
#8
Dav, this might be of interest:

Warren, W. 1986. On the presentation of statistical analysis: reason or ritual.
Canadian Journal of Forest Research, 16:1185-1191.


Cheers
 

DAV

New Member
#9
Thanks! It is interesting. I've only skimmed it so far but he seems to agree with what I've been saying. How did you come across it?
 
#10
Well, look at it from the client's perspective.

Say I'm the CEO of MegaPharms and I want to know if the experiment I just ran shows our new wonder drug is more effective than a placebo. The only course I've ever taken was "Intro to Stats 101" and I only took it so I could be next to Betty Bazongas. I did pay enough attention to get the impression that stats could help here so I hire YOU to give me something for the next board meeting.

You come back to me and say "We do not know the probability that the null hypothesis is true. Blah Blah Blah" and what I hear is "I DON'T KNOW". IOW: Hiring you was a waste of good money. The question wasn't "Are random occurrences real?"; the question was "Is it reasonable to assume our drug is more effective than doing nothing?" Answering the wrong question is wasting everyone's time and money, including yours.

Yes, the proper answer is best stated in betting-odds terms. That's something a business person can understand. Also, the LAST thing you want to do is sound like you're a whole lot smarter than the guy who signs the paychecks. It's best to keep the techno-talk in your pants and assume that the "English Spoked Hear" sign was taken for annual cleaning. Mr. Spock is seldom invited to lunch.
I just had to say thanks, DAV, for making me laugh. Twice.

Karen
 
#11
I was actually asking whether what you wrote is really the same as the actual result. I.e.,

"A 1 in 20 chance the <whatever> isn't real."

versus

"There is a 1 in 20 chance of observing a result this extreme if the effect isn't real"

Let me be very clear about what the problem is, and it is not the terse verbiage. If you say "A 1 in 20 chance the whatever isn't real" you're implying the effect has a probability of existence. But for most of these problems it is real or it is not real. Probability 0 or 1.

Another complaint is that if you perform a slight twist of verbiage and say "there is a 1 in 20 chance that there is no effect," that can so easily be misinterpreted to mean there is a 1 in 20 chance that the procedure has no effect each and every time you apply it.

It is not so much that I am arguing that simple words are bad. I am arguing that misleading words are bad--even simple ones.

On a side note, that administrator is probably a senior statistician who is going to spot the change of wording when "1 in 20 chance" is applied to something that is fully realized and either true or not true.
 

bugman

Super Moderator
#12
Hi Dav,

It is an interesting paper. I came across it because I was/am interested in writing a review on the use of p-values and pretty much what you were discussing, i.e. why the arbitrary 0.05 level is chosen and the very common overuse of the term "significant" in reports - especially in my field.

Here are two more: I think you will find the Feinstein paper very interesting.

I have them as PDFs if you can't get hold of them - (However, I don't think I can attach PDFs)

## Feinstein, A.R. 1998. P-values and confidence intervals: Two sides of the same unsatisfactory coin. Journal of Clinical Epidemiology, 51(4):355-360.

## Goodman, S.N. 1999. Toward evidence-based medical statistics. 1: The P-value fallacy. Annals of Internal Medicine, 130:995-1004.
 

DAV

New Member
#13
I was actually asking whether what you wrote is really the same as the actual result. I.e.,

"A 1 in 20 chance the <whatever> isn't real."

versus

"There is a 1 in 20 chance of observing a result this extreme if the effect isn't real"
Not to be funny, but I really don't see any wording difference between the two except you replaced "the" with "of".
A 1 in 20 chance the <whatever> isn't real​
a 1 in 20 chance of <observing a result this extreme if the effect> isn't real​

Let me be very clear about what the problem is, and it is not the terse verbiage. If you say "A 1 in 20 chance the whatever isn't real" you're implying the effect has a probability of existence. But for most of these problems it is real or it is not real. Probability 0 or 1.
Well, maybe here is where we differ. Yes, I am very definitely implying the effect has a probability of existence. No test or even a series of tests is definitive. They can only suggest whether the value is closer to either 0 or 1. By test, I mean ANY test, statistical or physical.

As statisticians, we are attempting to verify the existence of some reality through the use of some mathematical model of the real world. It should never be forgotten that mathematics has no foundation in the real world. If you think the client is in danger of forgetting that (or, more likely, is totally unaware) it might be prudent to emphasize it without using language guaranteed to evoke a glassy-eyed condition. Mostly, though, it's YOUR job to verify the statistical model(s) used is(are) appropriate to the situation. You'll quickly discover that your client is assuming that you really did do this and isn't in the least interested in the details.

I suspect that you have spent your entire statistical career in academia and have never really had to interface directly with non-statistician clients. In my engineering training (one of my degrees is in EE), we were required to answer in English and were pounded mercilessly if the prose strayed too far into technical verbiage. The reason: at some point in an EE's career it will become necessary to convey an answer to someone who is NOT an EE. It's an invaluable viewpoint.

My advice: stop trying for language precision and start thinking about ways to convey the concepts in simple terms when dealing with non-statisticians, beginning students or the public in general.

Another complaint is that if you perform a slight twist of verbiage and say "there is a 1 in 20 chance that there is no effect," that can so easily be misinterpreted to mean there is a 1 in 20 chance that the procedure has no effect each and every time you apply it.
Quite correct, but that interpretation carries the assumption that the effect is real, and when that is really meant it is almost always stated along the lines of "the effectiveness is X". It's hard to see how this interpretation would occur if the impetus of the statistical testing was to verify the assumption of existence.

It is not so much that I am arguing that simple words are bad. I am arguing that misleading words are bad--even simple ones.
The places where wording is harmful are actually in places like press releases. All too often, we hear that the likelihood of contracting some condition X from some substance Y doubles, with 95% confidence, while leaving out, or perhaps obscuring, the fact that this doubling is from 1:10000000 to 2:10000000. You may be surprised to discover that most of the public reads this as a 95% probability of contracting X if exposed at any time to Y.
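Spelled out with the numbers from that example:

# The press-release "doubling", spelled out with the numbers above.
baseline_risk = 1 / 10_000_000          # 1 in 10,000,000
exposed_risk  = 2 / 10_000_000          # 2 in 10,000,000

relative_risk     = exposed_risk / baseline_risk   # 2.0 -- "doubles!"
absolute_increase = exposed_risk - baseline_risk   # 1 extra case per 10 million

print(f"Relative risk:     {relative_risk:.1f}x")
print(f"Absolute increase: {absolute_increase:.0e} "
      f"(about 1 extra case per 10,000,000 people)")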

It's a public perception like this that resulted in an entire block being evacuated and subsequently invaded by a team in bunny suits recently near me after someone spilled 10 ml of mercury in a school lab.

What most of the public doesn't understand -- and sometimes this extends even to statisticians -- is that if a statistical test is being applied it's usually because it's hard to see the effect because of the noise. I doubt that anyone has ever conducted a statistical analysis to verify that encephalitic interception of high velocity, copper plated, lead projectiles is hazardous to one's health.

Many times, it's not the reporters who are at fault but the originator of the press release. The reporters tend to use press-release wording verbatim. Of course, the originator is more likely to be a doctor or scientist playing statistician than a genuine statistician -- not that the two are necessarily exclusive; after all, there's R. A. Fisher.
 

DAV

New Member
#14
I just had to say thanks, DAV, for making me laugh. Twice.

Karen
Always happy to brighten someone's day :yup:

Do you also do analyses for clients? It isn't clear if you do. But, if you do, how do you approach giving the results to your clients?



A suggestion: the sidebar where it says "sign-up for a free subscription to StatWise" should have a link to sample articles from StatWise.
 
#16
Not to be funny, but I really don't see any wording difference between the two except you replaced "the" with "of".
A 1 in 20 chance the <whatever> isn't real​
a 1 in 20 chance of <observing a result this extreme if the effect> isn't real​
Maybe I misread your post, but surely you do see the difference between these two. They are radically different.
 
#17
Always happy to brighten someone's day :yup:

Do you also do analyses for clients? It isn't clear if you do. But, if you do, how do you approach giving the results to your clients?
I do (clearly I need to improve communication on my web site), but my clients are pretty much all academic researchers. Anyone who hires me to actually do the analysis for them has enough grant money that they've been doing research a long time. They pretty much get the idea.

However, I more often do true consulting, in that I am guiding researchers through their analysis. I've had this come up in consulting with newer (but not always) researchers.

This is the way I always explain it, as I think it gets to the heart of it, but is understandable. Often the verbiage has to be really simple.

First, I remind them of what a Type I error is--my clients have always had at least one stat class, but usually more--not always true of a CEO. Then I translate into English. "The p-value is the probability that the result you're reporting (rejecting null hypothesis) is wrong (you did it when the null hypothesis is true)."

I find the biggest problem is that researchers depend on "statistical significance" as an indication of "scientifically meaningful." I encourage them to consider both.

Having both taught statistics and consulted, I also find that people really "get" statistical thinking only as they analyze their own data. It's just too abstract for most people without a context.

A suggestion: the sidebar where it says "sign-up for a free subscription to StatWise" should have a link to sample articles from StatWise.
Thanks. Good idea. I appreciate feedback.

Karen
 

DAV

New Member
#18
Maybe I misread your post, but surely you do see the difference between these two. They are radically different.
In re:
A 1 in 20 chance the <whatever> isn't real
A 1 in 20 chance of <observing a result this extreme if the effect> isn't real
No, there really isn't a difference between the two. Think of <whatever> as a placeholder for, well, whatever. I think you are substituting something in your own mind and then arguing against it. In some circles that's called a strawman. Perhaps it's the "the" that's causing a problem. If it is, sorry about that. My fingers sometimes insert things when I'm not looking, which I don't always see during editing because of the way I read.

Would "A 1 in 20 chance <whatever> isn't real" have been clearer?

The important point was to emphasize the use of something like 1:20 and you don't seem to disagree with that.
 

DAV

New Member
#19
...
This is the way I always explain it, as I think it gets to the heart of it, but is understandable. Often the verbiage has to be really simple.

First, I remind them of what a Type I error is-- ... Then I translate into English. "The p-value is the probability that the result you're reporting (rejecting null hypothesis) is wrong (you did it when the null hypothesis is true)."
Yes, Karen, I think the key is to use simple language. My personal opinion is that an inability to explain in understandable terms is often a sign of a lack of true understanding. Though not always. Sometimes it's just a lack of proper focus. My brother designs digital circuitry but is often at a loss when communicating the function of a design, especially if the purpose is specifically for himself.
"What does it do, Terry?" "Well, this wire goes from this pin to this pin..." "OK, great! What does it do?" "That's what I'm trying to tell you."
... und so weiter​

Bottom-up explanations are hard on everybody.

You might also want to consider dropping terms like "TYPE I" and "TYPE II". If you think about it, they are meaningless jargon that adds nothing. "False positive" and "false negative" are much more easily grasped.

Every field has something like this. The computer world has "big-endian" and "little-endian" for the format of numbers. The problem arises in remembering which end (front or back) is actually meant. Believe it or not, these terms were supposed to eliminate the confusion between "most significant first" and "least significant first". Frankly, I think the terms they replaced were less ambiguous than the replacements.
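If it helps, here's what the two orderings actually do to the same number (plain Python, just the standard struct module):

# Same 32-bit integer, packed most-significant-byte first ("big-endian")
# versus least-significant-byte first ("little-endian").
import struct

value = 0x12345678
print(struct.pack(">I", value).hex())   # big-endian:    12345678
print(struct.pack("<I", value).hex())   # little-endian: 78563412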

FWIW, a bit of history: The replacements are the result of a not-so-vague comparison between a debate in Gulliver's Travels and a flame war that erupted many years ago over the very subject of whether numbers should be "Least significant first" in Internet protocols. Their later appearance in an IEEE standard (IEEE 754, IIRC) was a tongue-in-cheek legitimization of a legendary episode. Having to continuously explain them isn't as funny.

my clients have always had at least one stat class, but usually more--not always true of a CEO.
LOL. My CEO was a bit of an exaggeration. Most of them today likely have MBAs. MBA holders and economists have a far better understanding of statistics and the problems associated with using them than the average person; perhaps far better than researchers in many other fields.

I find the biggest problem is that researchers depend on "statistical significance" as an indication of "scientifically meaningful." I encourage them to consider both.
And that dependence is very sad. If anything it should only be one piece of information and the "statistically significant" outcome should only be a guidepost indicating the direction of subsequent research. Equally sad is that the journals tend to encourage this. This leads to a publication bias being built into just about every meta-study. {soapbox}Statisticians should be shouting in outrage over the abuse of statistics that is the inevitable outcome of these biased studies{/soapbox}

The very term "statistical significance" is what leads to a lot of confusion. To many, "significant" means far more than "locally meaningful" which is the real meaning of "statistical significance" and even then usually only in terms of a positive answer.

Having both taught statistics and consulted, I also find that people really "get" statistical thinking only as they analyze their own data. It's just too abstract for most people without a context.
I know it took that for me. I have a hard time learning anything unless I perceive an immediate use. There are a lot of things I've had to relearn because I glossed over them in the past. Guess what I need to do is learn to learn.
 

JohnM

TS Contributor
#20
MBA holders and economists have a far better understanding of statistics and the problems associated with using them than the average person; perhaps far better than researchers in many other fields.
Funny - I wonder where these "statistically knowledgeable" MBAs are. I've yet to meet one. :p