IN PRACTICE

Cargo-cult statistics and scientific crisis

The mechanical, ritualistic application of statistics is contributing to a crisis in science. Education, software and peer review have encouraged poor practice - and it is time for statisticians to fight back. By Philip B. Stark and Andrea Saltelli

SIGNIFICANCE, August 2018. © 2018 The Royal Statistical Society

Philip B. Stark is professor in the Department of Statistics and associate dean in the Division of Mathematical and Physical Sciences at the University of California, Berkeley. Andrea Saltelli is an adjunct professor in the Centre for the Study of the Sciences and the Humanities at the University of Bergen, and a researcher with the Open Evidence Research group at the Universitat Oberta de Catalunya, Barcelona.

Poor practice is catching up with science, manifesting in part in the failure of results to be reproducible and replicable. Various causes have been posited, but we believe that poor statistical education and practice are symptoms of and contributors to problems in science as a whole. The problem is one of cargo-cult statistics - the ritualistic miming of statistics rather than conscientious practice. This has become the norm in many disciplines, reinforced and abetted by statistical education, statistical software, and editorial policies.

Some, such as historian and sociologist of science Steven Shapin, still argue that science survives thanks to the ethical commitment of scientists,1 but others, such as philosopher of science Jerome Ravetz, find this a charitable perspective.2 Much of what is currently called "science" may be viewed as mechanical application of particular technologies, including statistical calculations, rather than adherence to shared moral norms. We believe the root of the problem lies in the mid-twentieth century.

At the risk of oversimplifying a complex historical process, we think the strongest force pushing science (and statistics) in the wrong direction is existential: science has become a career, rather than a calling, while quality control mechanisms have not kept pace.

The bigger picture

After World War II, governments increased funding for science in response to the assessment that scientific progress is important for national security, prosperity, and quality of life. This increased the scale of science and the pool of scientific labour: science became "big science", conducted by career professionals. The resulting increase in scientific output required new approaches to managing science and scientists, including new government agencies and a focus on quantifying scientific productivity. The norms and self-regulating aspects of "little science" communities that valued questioning, craftsmanship, scepticism, self-doubt, critical appraisal of the quality of evidence, and the verifiable, and verifiably replicable, advancement of human knowledge gave way to current approaches centring on metrics, funding, publication, and prestige. Such approaches may invite and reward "gaming" of the system. When understanding, care, and honesty become valued less than novelty, visibility, scale, funding, and salary, science is at risk.
Elements of the present crisis were anticipated by scholars such as Price3 and Ravetz4 in the 1960s and 1970s; a more modern explanation of science's crisis, in terms of the prevailing economic model and of the commodification of science, is offered by Mirowski.5 Other scholars, such as John Ioannidis, Brian Nosek, and Marc Edwards, now study the perverse system of incentives and its consequences, and how bad science outcompetes better science.6-8 While some argue that there is no crisis (or at least no systemic problem), bad incentives, bad scientific practices, outdated methods of vetting and disseminating results, and techno-science appear to be producing misleading and incorrect results. This might produce a crisis of biblical proportions. As Edwards and Roy write: "If a critical mass of scientists become untrustworthy, a tipping point is possible in which the scientific enterprise itself becomes inherently corrupt and public trust is lost, risking a new dark age with devastating consequences to humanity."9

Scientists collectively risk losing credibility and authority in part because of prominent examples of poor practice, but also because many are guilty of ultracrepidation: acting as if their stature in one domain makes them authoritative in others. Science is "show me", not "trust me".

To proceed with a model or prior that is not chosen carefully and well grounded in disciplinary knowledge, to mix frequentist and Bayesian methods obliviously, to select the prior after looking at the data to get a result one likes, and to combine systematic and stochastic errors as if they were independent random errors - these are all forms of cargo-cult statistics. The calculations are as likely to produce valid inferences as cargo cults were to summon cargo planes.

Statistics education: contributory negligence

While statistical education has started a sea change for the better, in our experience many statistics courses - especially "service" courses for non-specialists - teach cargo-cult statistics: mechanical calculations with little attention to scientific context, experimental design, assumptions and limitations of methods, or the interpretation of results. This should not be surprising. These courses are often taught outside statistics departments by faculty whose own understanding of foundational issues is limited, having possibly taken similarly shallow courses that emphasise technique and calculation over understanding, evidence, and inference.

Service courses taught in statistics departments often have high enrolments, which help justify departmental budgets and staffing levels. Statistics departments may be under administrative, social, and financial pressure to cater to the disciplinary "consumers" of the courses. Consumers may not care whether methods are used appropriately, in part because, in their fields, the norm (including the expectations of editors and referees) is cargo-cult statistics. The bad incentives for individuals, departments, and disciplines are clear; negative consequences for science and society are expected.

Statistical software: power without wisdom

Statistical software enables and promotes cargo-cult statistics. Marketing and adoption of statistical software are driven by ease of use and the range of statistical routines the software implements. Offering complex and "modern" methods provides a competitive advantage. And some disciplines have in effect standardised on particular statistical software, often proprietary software.

Statistical software does not help you know what to compute, nor how to interpret the result. It does not offer to explain the assumptions behind methods, nor does it flag delicate or dubious assumptions. It does not warn you about multiplicity or p-hacking. It does not check whether you picked the hypothesis or analysis after looking at the data, nor track the number of analyses you tried before arriving at the one you sought to publish - another form of multiplicity. The more "powerful" and "user-friendly" the software is, the more it invites cargo-cult statistics.

This is hard to fix. Checks of residuals and similar tests cannot yield evidence that modelling assumptions are true - and running such checks makes the estimates conditional, which software generally does not take into account. In-built warnings could be used to remind the user of the assumptions, but these are unlikely to have much effect without serious changes to incentives. Indeed, if software offered such warnings, it might be seen as an irritant, and hence a competitive disadvantage to the vendor and the user, rather than an aid.

Scientific publishing and open science

Peer review can reinforce bad scientific and statistical practice. Indeed, journals may reject papers that use more reliable or more rigorous methods than the discipline is accustomed to, simply because the methods are unfamiliar. Conversely, some disciplines become enthralled with the methodology du jour without careful vetting. Even the increased volume of research suggests that quality must suffer.

There is structural moral hazard in the current scientific publishing system. Many turf battles are fought at the editorial level. Our own experience suggests that journals are reluctant to publish papers critical of work the journal published previously, or of work by scientists who are referees or editors for the journal. Editorial control of prestigious journals confers indirect but substantial control of employment, research directions, research funding, and professional recognition. Editors and referees can keep competitors from being heard, funded, and hired. Nobel biologist Randy Schekman reports: "Young people tell me all the time, 'If I don't publish in CNS [a common acronym for Cell/Nature/Science, the most prestigious journals in biology], I won't get a job' ... [Those journals] have a very big influence on where science goes."14

Editorial policies may preclude authors from providing enough information for a reviewer (or reader) to check whether the results are correct, or even to check whether the figures and tables accurately reflect the underlying data. As a result, the editorial process simply cannot perform its intended quality-control function.

Academic research is often funded, at least in part, by taxes. Yet many scientists try to become rent-seekers, keeping the data and code that result from public funding to themselves indefinitely, or until they feel they have exhausted their main value. This is morally murky. To "publish" the resulting research behind a paywall, inaccessible to the general public, is even more troubling. Scientific publishing is big business, and its interests are not those of science or scientists.14 Open data, open software, and open publication may provide better value for society and a better ethical foundation for science.

What can statisticians do?

Statisticians can help with important, controversial issues with immediate consequences for society. We can help fight power asymmetries in the use of evidence. We can stand up for the responsible use of statistics, even when that means taking personal risks. We should be vocally critical of cargo-cult statistics, including where study design is ignored, where p-values, confidence intervals, and posterior distributions are misused, and where probabilities are calculated under irrelevant, misleading assumptions. We should be critical even when the abuses involve politically charged issues, such as the social cost of climate change.
If an authority treats estimates based on an ad hoc collection of related numerical models with unknown, potentially large systematic errors as if they were a random sample from a distribution centred at the parameter, we should object whether or not we like the conclusion.

We can insist that "service" courses foster statistical thinking, deep understanding, and appropriate scepticism, rather than promulgating cargo-cult statistics. We can help empower individuals to appraise quantitative information critically, to be informed, effective citizens of the world. We can also help educate the media, which often reduces science to "infotainment" through inaccurate, sensationalised, truncated, and uncircumspect reporting. Journalists rarely ask basic questions about experimental design or data quality, report uncertainties, or check the scientific literature for conflicting results. We can address this by teaching courses for journalists and editors.

When we appraise each other's work in academia, we can ignore impact factors, citation counts, and the like: they do not measure importance, correctness, or quality. We can pay attention to the work itself, rather than the masthead of the journal in which it appeared, the press coverage it received, or the funding that supported it. We can insist on evidence that the work is correct - on reproducibility and replicability - rather than pretend that editors and referees can reliably vet research by proxy when the requisite evidence was not even submitted for scrutiny. We can decline to referee manuscripts that do not include enough information to tell whether they are correct. We can commit to working reproducibly, to publishing code and data, and generally to contributing to the intellectual commons. We can point out when studies change endpoints.
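The objection about ad hoc model ensembles can be illustrated with a toy simulation of ours (not the authors'; all numbers are invented for illustration). When every model in an ensemble inherits the same systematic error, the spread of the ensemble measures agreement, not accuracy, so an interval built from that spread can miss the true value essentially every time.

```python
import math
import random
import statistics

random.seed(2)
theta = 0.0          # true value of the parameter (unknown in practice)
shared_bias = 1.0    # systematic error common to every model in the ensemble
n_models = 12

# An "ad hoc collection of related models": all share the same bias,
# differing only by idiosyncratic noise.
estimates = [theta + shared_bias + random.gauss(0, 0.3) for _ in range(n_models)]

mean = statistics.fmean(estimates)
se = statistics.stdev(estimates) / math.sqrt(n_models)
lo, hi = mean - 2 * se, mean + 2 * se

print(f"ensemble interval: [{lo:.2f}, {hi:.2f}]")
print("covers true value:", lo <= theta <= hi)
```

The interval is narrow and confidently wrong: the ensemble members agree with one another, not with reality, which is exactly why the "random sample centred at the parameter" treatment deserves objection.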
We can decline to serve as an editor or referee for journals that profiteer or that enable scientists to be rent-seekers by publishing "results" without the underlying publicly funded evidence: data and code.

And we can be of service. Direct involvement of statisticians on the side of citizens in societal and environmental problems can help earn the justified trust of society. For instance, statisticians helped show that the erroneous use of zip codes to identify the geographic area of interest in the Flint, Michigan water pollution scandal made the water contamination problem disappear. Statistical election forensics has revealed electoral manipulation in countries such as Russia. Statistical "risk-limiting" audits, endorsed by the ASA, can provide assurance that election outcomes are correct. Such methods have been tested in California, Colorado, Ohio, and Denmark, and are required by law in Colorado and Rhode Island; other states have pending legislation. Statisticians and computer scientists developed methods and software; worked with election officials, legislators, and government agencies on logistics, laws, and regulations; and advocated with the public through op-eds and media appearances. Statisticians and mathematicians can help assess and combat gerrymandering, the practice of redrawing electoral districts to advantage a party unfairly. Statisticians are pointing out biases inherent in "big data" and machine-learning approaches to social issues, such as predictive policing. They could also work with economists to monitor new forms of exploitation of intellectual labour, now that new modes of working can be exploited in old ways.

We statisticians can support initiatives such as the Reproducibility Project, the Meta-research Innovation Center, the EQUATOR network, alltrials.net, retractionwatch.com, and others that aim to improve quality and ethics in science, and hold scientists accountable for sloppy, disingenuous, or fraudulent work.
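To give a flavour of how a risk-limiting audit works, here is a simplified sketch in the spirit of the BRAVO ballot-polling method (our illustration, not the production procedure, which also handles multiple candidates, invalid ballots, and sampling without replacement). Ballots are drawn at random, and a likelihood ratio compares the reported winner's share against an exactly tied race; the audit confirms the outcome once the risk limit is met, and otherwise escalates to a full hand count.

```python
import random

random.seed(7)  # reproducible sketch

def bravo_audit(reported_share, true_share, risk_limit=0.05, max_draws=10_000):
    """Simplified two-candidate ballot-polling audit, sampling with replacement.
    `reported_share` is the reported winner's vote share (must exceed 0.5);
    `true_share` is the winner's actual share in the ballot population."""
    t = 1.0  # likelihood ratio: reported result vs. an exactly tied race
    for draws in range(1, max_draws + 1):
        if random.random() < true_share:     # sampled ballot is for the winner
            t *= reported_share / 0.5
        else:
            t *= (1 - reported_share) / 0.5
        if t >= 1 / risk_limit:              # risk limit met: strong evidence
            return "outcome confirmed", draws
    return "escalate to full hand count", max_draws

result, n = bravo_audit(reported_share=0.55, true_share=0.55)
print(f"{result} after {n} sampled ballots")
```

If the reported result is wrong (say the race is actually tied), the likelihood ratio tends to drift downward and the audit escalates; the risk limit bounds the chance of wrongly confirming an incorrect outcome.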
And we can change how we work. We can recognise that software engineering is as important to modern data analysis as washing test tubes is to wet chemistry: we can develop better computational hygiene. And we can ensure that publicly funded research is public.

In the 1660s, radical philosophers sought to understand and master the world by creating science. In the 1970s, radical scientists sought to change the world by changing science. Perhaps that is now needed again.

Editor's note: A fully referenced version of this article is available online at significancemagazine.com/593.

References
1. Shapin, S. (2008) The Scientific Life: A Moral History of a Late Modern Vocation. Chicago: University of Chicago Press.
2. Ravetz, J. (2009) Morals and manners in modern science. Nature, 457(7230), 662-663.
3. Price, D. J. de S. (1963) Little Science, Big Science. New York: Columbia University Press.
4. Ravetz, J. R. (1971) Scientific Knowledge and its Social Problems. Oxford: Clarendon Press.
5. Mirowski, P. (2011) Science-Mart: Privatizing American Science. Cambridge, MA: Harvard University Press.
6. Ioannidis, J. P. A. (2005) Why most published research findings are false. PLoS Medicine, 2(8), e124.
7. Munafò, M. R., Nosek, B. A., Bishop, D. V. M., Button, K. S., Chambers, C. D., Percie du Sert, N., Simonsohn, U., Wagenmakers, E.-J., Ware, J. J. and Ioannidis, J. P. A. (2017) A manifesto for reproducible science. Nature Human Behaviour, 1(1), 0021.
8. Smaldino, P. E. and McElreath, R. (2016) The natural selection of bad science. Royal Society Open Science, 3, 160384.
9. Edwards, M. A. and Roy, S. (2017) Academic research in the 21st century: Maintaining scientific integrity in a climate of perverse incentives and hypercompetition. Environmental Engineering Science, 34(1), 51-61.
10. Chesterton, G. K. (2015) The Napoleon of Notting Hill. Open Road Media.
11. Feynman, R. P., Leighton, R. and Hutchings, E. (1985) Surely You're Joking, Mr. Feynman! New York: W. W. Norton.
12. Wasserstein, R. L. and Lazar, N. A. (2016) The ASA's statement on p-values: Context, process, and purpose. American Statistician, 70(2), 129-133.
13. Freedman, D. (1995) Some issues in the foundation of statistics. Foundations of Science, 1(1), 19-39.
14. Buranyi, S. (2017) Is the staggeringly profitable business of scientific publishing bad for science? The Guardian, 27 June.
