Science: The dark night of academia, with fabricated papers exposed in alarming numbers

Posted by mimi on 2023-6-16 17:47:08


Fake scientific papers are alarmingly common

When neuropsychologist Bernhard Sabel put his new fake-paper detector to work, he was “shocked” by what it found. After screening some 5000 papers, he estimates up to 34% of neuroscience papers published in 2020 were likely made up or plagiarized; in medicine, the figure was 24%. Both numbers, which he and colleagues report in a medRxiv preprint posted on 8 May, are well above levels they calculated for 2010—and far larger than the 2% baseline estimated in a 2022 publishers’ group report.

“It is just too hard to believe” at first, says Sabel of Otto von Guericke University Magdeburg and editor-in-chief of Restorative Neurology and Neuroscience. It’s as if “somebody tells you 30% of what you eat is toxic.”

His findings underscore what was widely suspected: Journals are awash in a rising tide of scientific manuscripts from paper mills—secretive businesses that allow researchers to pad their publication records by paying for fake papers or undeserved authorship. “Paper mills have made a fortune by basically attacking a system that has had no idea how to cope with this stuff,” says Dorothy Bishop, a University of Oxford psychologist who studies fraudulent publishing practices. A 2 May announcement from the publisher Hindawi underlined the threat: It shut down four of its journals it found were “heavily compromised” by articles from paper mills.

Sabel’s tool relies on just two indicators—authors who use private, noninstitutional email addresses, and those who list an affiliation with a hospital. It isn’t a perfect solution, because of a high false-positive rate. Other developers of fake-paper detectors, who often reveal little about how their tools work, contend with similar issues.

Still, the detectors raise hopes for gaining the advantage over paper mills, which churn out bogus manuscripts containing text, data, and images partly or wholly plagiarized or fabricated, often massaged by ghost writers. Some papers are endorsed by unrigorous reviewers solicited by the authors. Such manuscripts threaten to corrupt the scientific literature, misleading readers and potentially distorting systematic reviews. The recent advent of artificial intelligence tools such as ChatGPT has amplified the concern.

To fight back, the International Association of Scientific, Technical, and Medical Publishers (STM), representing 120 publishers, is leading an effort called the Integrity Hub to develop new tools. STM is not revealing much about the detection methods, to avoid tipping off paper mills. “There is a bit of an arms race,” says Joris van Rossum, the Integrity Hub’s product director. He did say one reliable sign of a fake is referencing many retracted papers; another involves manuscripts and reviews emailed from internet addresses crafted to look like those of legitimate institutions.

Twenty publishers—including the largest, such as Elsevier, Springer Nature, and Wiley—are helping develop the Integrity Hub tools, and 10 of the publishers are expected to use a paper mill detector the group unveiled in April. STM also expects to pilot a separate tool this year that detects manuscripts simultaneously sent to more than one journal, a practice considered unethical and a sign they may have come from paper mills. Such large-scale cooperation is meant to improve on what publishers were doing individually and to share tools across the publishing industry, van Rossum says.

“It will never be a [fully] automated process,” he says. Rather, the tools are like “a spam filter … you still want to go through your spam filter every week” to check for erroneously flagged legitimate content.

STM hasn’t yet generated figures on accuracy or false-positive rates because the project is too new. But catching as many fakes as possible typically produces more false positives. Sabel’s tool correctly flagged nearly 90% of fraudulent or retracted papers in a test sample. For every 56 true fakes it detected, however, it erroneously flagged 44 genuine papers, so results still need to be confirmed by skilled reviewers. Other paper mill detectors typically have a similar trade-off, says Adam Day, founding director of a startup called Clear Skies who consulted with STM on the Integrity Hub. But without some reliance on automated methods, “You either have to spot check randomly, or you use your own human prejudice to choose what to check. And that’s not generally very fair.”

Scrutinizing suspect papers can be time-consuming: In 2021, Springer Nature’s postpublication review of about 3000 papers suspected of coming from paper mills required up to 10 part- and full-time staffers, said Chris Graf, the company’s director of research integrity, at a U.S. House of Representatives subcommittee hearing about paper mills in July 2022. (Springer Nature publishes about 400,000 papers annually.)

Newly updated guidelines for journals issued in April may help ease the workload. They may decide to reject or retract batches of papers suspected of having been produced by a paper mill, even if the evidence is circumstantial, says the nonprofit Committee on Publication Ethics, which is funded by publishers. Its previous guidelines encouraged journals to ask authors of each suspicious paper for more information, which can trigger a lengthy back and forth.

Some outsiders wonder whether journals will make good on promises to crack down. Publishers embracing gold open access—under which journals collect a fee from authors to make their papers immediately free to read when published—have a financial incentive to publish more, not fewer, papers. They have “a huge conflict of interest” regarding paper mills, says Jennifer Byrne of the University of Sydney, who has studied how paper mills have doctored cancer genetics data.

The “publish or perish” pressure that institutions put on scientists is also an obstacle. “We want to think about engaging with institutions on how to take away perhaps some of the [professional] incentives which can have these detrimental effects,” van Rossum says. Such pressures can push clinicians without research experience to turn to paper mills, Sabel adds, which is why hospital affiliations can be a red flag.

Publishers should also welcome help from outsiders to improve the technology supporting paper mill detectors, although this will require transparency about how they work, Byrne says. “When tools are developed behind closed doors, no one can criticize or investigate how they perform,” she says. A more public, broad collaboration would likely strengthen them faster than paper mills could keep up, she adds.

Day sees some hope: Flagging journals suspected of being targeted by paper mills can quickly deter additional fraudulent submissions. He points to his analysis of journals that the Chinese Academy of Sciences (CAS) put on a public list because of suspicions they contained paper mill papers. His company’s Papermill Alarm detector showed that before the CAS list came out, suspicious papers made up the majority of some journals’ content; afterward, the proportion dropped to nearly zero within months. (Papermill Alarm flags potentially fraudulent papers based on telltale patterns revealed when a paper mill repeatedly submits papers; the company does not publicly disclose what these signs are.) Journals could drive a similar crash by using automated detectors to flag suspicious manuscripts, nudging paper mills to take them elsewhere, Day says.

Some observers worry paper mill papers will merely migrate to lower impact journals with fewer resources to detect them. But if many journals act collectively, the viability of the entire paper mill industry could shrink.

It’s not necessary to catch every fake paper, Day says. “It’s about having practices which are resistant to their business model.”
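The article describes the detection logic only at a high level, but the two indicators behind Sabel's screen (private, non-institutional email addresses and hospital-only affiliations) and the reported 56-true-to-44-false flag ratio lend themselves to a small illustration. The sketch below is not Sabel's actual tool or any Integrity Hub software; the record fields, the free-mail domain list, and the flagging rule are assumptions invented for illustration, and the last lines simply work out the recall and precision implied by the figures quoted above.

```python
# Hypothetical two-indicator paper screen, loosely modeled on the signals
# described in the article. NOT Sabel's tool or the Integrity Hub code:
# the field names, domain list, and rule are illustrative assumptions.

# Common free-mail domains (illustrative; no real detector's list is public).
NON_INSTITUTIONAL_DOMAINS = {"gmail.com", "qq.com", "163.com", "yahoo.com", "hotmail.com"}

def flag_paper(record: dict) -> bool:
    """Flag a paper if the corresponding author's email is non-institutional
    or every listed affiliation is a hospital. A flag is only a prompt for
    human review, not a verdict of fraud."""
    domain = record["corresponding_email"].rsplit("@", 1)[-1].lower()
    private_email = domain in NON_INSTITUTIONAL_DOMAINS
    hospital_only = all("hospital" in a.lower() for a in record["affiliations"])
    return private_email or hospital_only

# Toy records, fabricated for illustration only.
papers = [
    {"corresponding_email": "author@example-university.edu",
     "affiliations": ["Example University"]},
    {"corresponding_email": "someone@gmail.com",
     "affiliations": ["Example City Hospital"]},
]
print([flag_paper(p) for p in papers])  # [False, True]

# Trade-off implied by the figures quoted in the article: ~90% of known fakes
# were caught, and for every 56 true fakes flagged, 44 genuine papers were
# flagged as well.
recall = 0.90
precision = 56 / (56 + 44)  # = 0.56, i.e. 44% of flags are false positives
print(f"recall ~ {recall:.0%}, precision ~ {precision:.0%}")
```

A real detector would combine many more signals (counts of retracted references, lookalike submission domains, duplicate submissions across journals) and, as the article stresses, keep skilled human reviewers in the loop to confirm anything the automated screen flags.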



Vocabulary:

  • detector /dɪˈtɛktər/ n. a device or tool that detects something
  • plagiarized /ˈpleɪdʒəraɪzd/ adj. copied from someone else's work without credit
  • manuscript /ˈmænəskrɪpt/ n. an author's text submitted for publication
  • massaged /ˈmæsɑːʒd/ v. (of text or data) manipulated or doctored
  • ghost writer /ˈɡoʊst ˈraɪtər/ n. a person who writes material credited to someone else
  • systematic review /sɪstəˈmætɪk rɪˈvjuː/ n. a structured synthesis of published studies on a question
  • artificial intelligence /ˌɑːrtɪˈfɪʃl ɪnˈtɛlədʒəns/ n. computer systems that perform tasks normally requiring human intelligence
  • unethical /ʌnˈeθɪkl/ adj. morally wrong; violating accepted standards of conduct
  • false positive /fɔːls ˈpɑːzətɪv/ n. a genuine item wrongly flagged as problematic
  • fraudulent /ˈfrɔːdjələnt/ adj. intended to deceive
  • back and forth /bæk ənd fɔːrθ/ repeated exchange between two parties
  • incentive /ɪnˈsentɪv/ n. something that motivates a person to act
  • red flag /rɛd flæg/ a warning sign
  • transparency /trænsˈpærənsi/ n. openness about how something is done
  • viability /vaɪəˈbɪlɪti/ n. the ability to survive or keep operating
  • lower-impact journal /loʊər ˈɪmpækt ˈdʒɜːrnəl/ a journal with lower influence or a lower impact factor
  • resistant /rɪˈzɪstənt/ adj. able to withstand something
  • business model /ˈbɪznɪs ˈmɑːdl/ n. the way a business creates and captures value



Main idea:
A preprint posted on medRxiv reports that a new fake-paper detector has turned up an alarming number of fraudulent scientific papers. Screening suggests that a strikingly large share of recent papers in neuroscience and medicine may have been fabricated or plagiarized, prompting calls for action. Other fake-paper detectors take different approaches, and most face a trade-off that leaves them with high false-positive rates. To tackle the flood of fake papers, the International Association of Scientific, Technical, and Medical Publishers (STM) is working with 20 publishers through its Integrity Hub to develop detection tools. The analyses suggest that if journals act collectively, the viability of the entire paper-mill industry could shrink. At the same time, publishers and academic institutions should scale back the excessive emphasis on publication output, reducing the incentives that drive researchers to paper mills.


Excerpted from Science.
