This is presently among the hottest topics / discussions / developments in statistics. Seriously. Just look at the abstract and dozens of distinguished authors of the paper below, which is forthcoming in one of the world's leading science outlets, Nature Human Behaviour.
Of course data mining, or overfitting, or whatever you want to call it, has always been a problem, warranting strong and healthy skepticism regarding alleged "new discoveries". But the whole point of examining p-values is to AVOID anchoring on arbitrary significance thresholds, whether the old magic .05 or the newly-proposed magic .005. Just report the p-value, and let people decide for themselves how they feel. Why obsess over asterisks, and whether/when to put them next to things?
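To make the point concrete, here is a minimal sketch (mine, not anything from the paper) using a made-up estimate and standard error and a simple normal approximation. The function names and the numbers are purely illustrative assumptions. The same result earns an asterisk under the old threshold and loses it under the new one, while the estimate, the interval, and the p-value itself do not change at all.

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a z statistic under a standard normal null."""
    return math.erfc(abs(z) / math.sqrt(2))


def report(estimate: float, std_error: float) -> dict:
    """Summarize a result without committing to any single threshold.

    Assumes the estimate is approximately normal, so a z statistic applies.
    """
    z = estimate / std_error
    p = two_sided_p(z)
    half_width = 1.959964 * std_error  # normal critical value for a 95% interval
    return {
        "estimate": estimate,
        "ci95": (estimate - half_width, estimate + half_width),
        "p_value": p,
        # The only things that depend on the choice of "magic" threshold:
        "significant_at_.05": p < 0.05,
        "significant_at_.005": p < 0.005,
    }


# Hypothetical numbers, chosen only for illustration (z = 2.5, p is about 0.012).
result = report(estimate=0.50, std_error=0.20)
for key, value in result.items():
    print(key, value)
```

With these hypothetical inputs the p-value is about 0.012, so the verdict flips between the two thresholds even though nothing about the evidence has changed. Reporting the estimate, the interval, and the p-value lets readers weigh it for themselves.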
Postscript:
Reading the paper, which I had not done before writing the paragraph above (there's largely no need, as the wonderfully concise abstract says it all), I see that it anticipates my objection at the end of a section entitled "potential objections":
"Changing the significance threshold is a distraction from the real solution, which is to replace null hypothesis significance testing (and bright-line thresholds) with more focus on effect sizes and confidence intervals, treating the P-value as a continuous measure, and/or a Bayesian method."
Hear, hear! Marvelously well put.
The paper offers only a feeble refutation of that "potential" objection:
"Many of us agree that there are better approaches to statistical analyses than null hypothesis significance testing, but as yet there is no consensus regarding the appropriate choice of replacement. ... Even after the significance threshold is changed, many of us will continue to advocate for alternatives to null hypothesis significance testing."
I'm all for advocating alternatives to significance testing. That's important and helpful. As for continuing to promulgate significance testing with magic significance thresholds, whether .05 or .005, well, you can decide for yourself.
Redefine Statistical Significance