Is there a reproducibility crisis in science? - Matt Anticole

In 2011, a team of physicists reported
a startling discovery:

neutrinos appeared to travel faster
than the speed of light,

arriving 60 billionths of a second sooner than light would

in their 730 kilometer trip from Geneva
to a detector in Italy.
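
For a sense of scale, here's a minimal back-of-the-envelope sketch
(assuming the 730 kilometer baseline and 60-nanosecond figure quoted above,
treated as a simple straight-line trip) of how small that lead was
compared with light's own travel time:

```python
# A rough check of the scale of the reported anomaly, using only the
# figures quoted above: a 730 km baseline and a 60 ns early arrival.
# The straight-line distance is an assumption made for illustration.
c = 299_792_458           # speed of light in vacuum, m/s
distance_m = 730e3        # Geneva-to-Italy baseline, in meters
anomaly_s = 60e-9         # reported early arrival: 60 billionths of a second

light_time_s = distance_m / c        # time light needs for the trip (~2.4 ms)
fraction = anomaly_s / light_time_s  # relative size of the anomaly

print(f"Light travel time: {light_time_s * 1e3:.3f} ms")   # ~2.435 ms
print(f"Reported lead:     {anomaly_s * 1e9:.0f} ns")       # 60 ns
print(f"Relative size:     {fraction:.2e}")                 # ~2.5e-05
```

A discrepancy of only a few parts in 100,000, yet still far too large
to ignore.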

Despite six months of double-checking,
the bizarre discovery refused to yield.

But rather than celebrating
a physics revolution,

the researchers published a cautious paper

arguing for continued research in an
effort to explain the observed anomaly.

In time, the error was tracked to a single
incorrectly connected fiber optic cable.

This example reminds us that real
science is more than static textbooks.

Instead, researchers around the world
are continuously publishing

their latest discoveries

with each paper adding
to the scientific conversation.

Published studies
can motivate future research,

inspire new products,

and inform government policy.

So it’s important that we have confidence
in the published results.

If their conclusions are wrong,

we risk time,

resources,

and even our health in the pursuit
of false leads.

When findings are significant,

they are frequently double-checked
by other researchers,

either by reanalyzing the data

or by redoing the entire experiment.

For example, it took repeated
investigation of the CERN data

before the timing error was tracked down.

Unfortunately, there are currently neither
the resources nor professional incentives

to double check the more than 1 million
scientific papers published annually.

Even when papers are challenged,
the results are not reassuring.

Recent studies that examined dozens
of published pharmaceutical papers

managed to replicate the results of
less than 25% of them.

And similar results have been found
in other scientific disciplines.

There are a variety of sources
for irreproducible results.

Errors could hide in a study's original
design, execution, or analysis of the data.

Unknown factors,

such as patients' undisclosed condition
in a medical study,

can produce results that are
not repeatable in new test subjects.

And sometimes, the second research group
can’t reproduce the original results

simply because they don’t know
exactly what the original group did.

However, some problems might stem
from systematic decisions

in how we do science.

Researchers,

the institutions that employ them,

and the scientific journals
that publish findings

are expected to produce
big results frequently.

Important papers can advance careers,

generate media interest,

and secure essential funding,

so there’s slim motivation for researchers
to challenge their own exciting results.

In addition, little incentive exists

to publish results unsupportive
of the expected hypothesis.

That results in a deluge of agreement
between what was expected

and what was found.

On rare occasions, this can even lead
to deliberate fabrication,

such as in 2013, when a researcher
spiked rabbit blood with human blood

to give false evidence that
his HIV vaccine was working.

The publish-or-perish mindset

can also compromise academic journals'
traditional peer-review processes,

which are safety checks

where experts examine submitted papers
for potential shortcomings.

The current system,

which might involve only one
or two reviewers,

can be woefully ineffective.

That was demonstrated in a 1998 study

where eight weaknesses were deliberately
inserted into papers,

but only around 25%
were caught upon review.

Many scientists are working toward
improving reproducibility in their fields.

There’s a push to make researchers'
raw data,

experimental procedures,

and analytical techniques
more openly available

in order to ease replication efforts.

The peer review process can also
be strengthened

to more efficiently weed out weak papers
prior to publication.

And we could temper the pressure
to find big results

by publishing more papers that fail
to confirm the original hypothesis,

an event that happens far more than
current scientific literature suggests.

Science always has encountered, and always
will encounter, some false starts

as part of the collective acquisition
of new knowledge.

Finding ways to improve
the reproducibility of our results

can help us weed out those false starts
more effectively,

keeping us moving steadily toward
exciting new discoveries.
