Significant does not equal important: Why we need ‘The New Statistics’
Statistics and the way we understand and act upon them can be a matter of life and death – a fact sadly attested by some 50,000 needless cot deaths since 1970.
But the study of statistical methods and cognition is a highly specialised and contentious field.
Reform of the way statistics are taught and used by many scientists is long overdue. However, progress has been made, says La Trobe University Professor of Psychology Geoff Cumming.
It’s a subject in which he excels, and about which he is extremely passionate. So much so that he was commissioned by the American Psychological Association (APA) to help formulate its latest statistical guidelines.
The APA is the world’s largest association of psychologists, with around 150,000 members including scientists, educators, clinicians, consultants and students.
Professor Cumming, who heads La Trobe’s Statistical Cognition Laboratory, has also written a new book, ‘Understanding The New Statistics’, published by Routledge in the US, which was launched at the Annual Convention of the APA in Washington in August.
He says that psychology and other disciplines in the biomedical sciences have developed a tradition of relying on a statistical technique called ‘significance testing’ rather than the much better ‘estimation’, which is widely used in physical sciences and engineering.
‘Seductive illusion of certainty’
The conclusion of his research is that while significance testing gives a ‘seductive illusion of certainty’, it is actually extremely unreliable. ‘It also distorts the published research record. All round, it’s a terrible idea,’ he says.
He describes significance testing as a ‘strange ritual’ that uses ‘weird backward logic’ and ‘bamboozles countless students’ every year in their introduction to statistics.
‘Damning critiques of significance testing and how it hampers research progress have been published over many years, and rarely answered,’ says Professor Cumming. ‘Yet it is used in more than 90 per cent of research in psychology, and taught in every introductory textbook.’
‘There is also extensive evidence that students, researchers, and even statistics teachers often don’t understand it correctly, yet it is expected by most journal editors.’
Why? ‘I suspect one reason is that declaring a result ‘significant’ strongly suggests certainty, and that the result is large and important – even though statistical significance does not imply that.’
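That gap between ‘significant’ and ‘important’ can be shown with a few lines of arithmetic. The sketch below is a hypothetical two-sample z-test (not an example from Professor Cumming’s book): the same tiny effect, one twentieth of a standard deviation, is nowhere near ‘significant’ in a small study but highly ‘significant’ in a large one, even though it remains just as small and unimportant in both.

```python
import math

def z_test_p(mean_diff, sd, n_per_group):
    """Two-sided p-value for a two-sample z-test with equal SDs."""
    se = sd * math.sqrt(2.0 / n_per_group)   # standard error of the difference
    z = mean_diff / se
    # two-sided p-value from the standard normal distribution
    return math.erfc(abs(z) / math.sqrt(2))

# The same tiny effect (0.05 SD) tested at two sample sizes:
p_small_n = z_test_p(mean_diff=0.05, sd=1.0, n_per_group=100)
p_large_n = z_test_p(mean_diff=0.05, sd=1.0, n_per_group=20000)
print(f"n=100 per group:   p = {p_small_n:.3f}")    # well above .05
print(f"n=20000 per group: p = {p_large_n:.6f}")    # far below .001
```

With enough participants, any non-zero effect eventually passes a significance threshold, which is exactly why the ‘significant’ label says nothing about whether a result is large or important.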
A second reason, he says, is that ‘confidence intervals’, which are used in statistical estimation, can often be ‘embarrassingly long’.
However, says Professor Cumming, those long intervals might encourage us to combine results from multiple studies to get better estimates. Modern meta-analysis does exactly that.
Meta-analysis is based on estimation, and makes statistical significance virtually irrelevant. ‘It is becoming widely used, and can allow clear conclusions to be drawn from a messy research literature’, he says.
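The idea of combining studies to get a better estimate can be sketched with a simple fixed-effect meta-analysis, which weights each study’s estimate by the inverse of its variance. The numbers below are hypothetical, invented for illustration: five small studies, each with a wide confidence interval on its own, pool into a single estimate with a much narrower interval.

```python
import math

def fixed_effect_meta(estimates, ses):
    """Inverse-variance weighted fixed-effect meta-analysis.
    Returns the pooled estimate and its 95% confidence interval."""
    weights = [1.0 / se ** 2 for se in ses]          # precision of each study
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))        # smaller than any single SE
    ci = (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)
    return pooled, ci

# Hypothetical effect-size estimates and standard errors from five small studies
estimates = [0.30, 0.10, 0.45, 0.20, 0.25]
ses       = [0.20, 0.25, 0.30, 0.22, 0.18]

pooled, (lo, hi) = fixed_effect_meta(estimates, ses)
print(f"pooled estimate = {pooled:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```

No single study here would exclude an effect of zero on its own, but the pooled interval does: the estimation-based picture becomes clear once the studies are combined, regardless of which individual results happened to reach ‘significance’.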
‘Yet scientific journals are more likely to publish a significant result – so studies that fail to obtain “significance” tend to languish in researchers’ file drawers and therefore escape the attention of anyone conducting a meta-analysis.’
Breakthrough for reformers
Statistical reformers like Professor Cumming have long advocated a switch from significance testing to estimation, and a breakthrough has now occurred.
The influential Publication Manual of the American Psychological Association – for whose revision Professor Cumming acted as a consultant – now recommends that the interpretation of results should, wherever possible, be based on estimation.
‘The Manual is used by more than 1,000 journals across numerous disciplines, and by millions of students and researchers around the world, so is highly influential. Its support for estimation greatly improves the prospects for statistical reform.’
Professor Cumming’s book ‘Understanding The New Statistics: Effect sizes, confidence intervals, and meta-analysis’ explains estimation and meta-analysis, with examples from many disciplines.
The book supports statistical recommendations with the results of statistical cognition research – hence ‘it may well be the first evidence-based statistics textbook’.
‘These are hardly new techniques, but I label them “The New Statistics” because using them would, for many researchers, be quite new, as well as a highly beneficial change!’
Professor Cumming concludes with a classic example that is heart-rending but hopeful. He says the advice given to parents in the 1970s to prevent SIDS, or cot death, in apparently healthy infants was to put them to sleep face-down on a sheepskin.
A recent review applied meta-analysis to the evidence published in various years about the influence of sleeping position on the risk of SIDS. It found that by 1970 there was reasonably clear evidence favouring back sleeping, even though parenting books recommended front sleeping as late as 1988.
‘It estimated that, if meta-analysis had been available and used, and the resulting recommendation for back sleeping had been made in 1970, as many as 50,000 infant deaths in the developed world could have been prevented.’