Nature Machine Intelligence (IF 18.8), Pub Date: 2024-10-04, DOI: 10.1038/s42256-024-00911-w
Cas Wognum, Jeremy R. Ash, Matteo Aldeghi, Raquel Rodríguez-Pérez, Cheng Fang, Alan C. Cheng, Daniel J. Price, Djork-Arné Clevert, Ola Engkvist, W. Patrick Walters
Machine learning (ML) is driving exciting innovations in drug discovery, but we need to be mindful of the circumstances that set this application apart. Unlike other fields with fit-for-purpose datasets consisting of millions of examples, published datasets in drug discovery are classically heterogeneous, imbalanced, noisy and expensive to generate [1]. Furthermore, the applications of ML in drug discovery are numerous, require familiarity with several scientific disciplines, and inform high-stakes decisions, such as expensive or time-consuming experiments. The absence of standardized, domain-appropriate datasets, guidelines and tools for the evaluation and comparison of methods has led to a growing gap between perceived progress and real-world impact, which is delaying the adoption of ML in drug discovery. To bridge this gap, we believe that the unique expertise of scientists in the industry, who operate in real-world contexts, will be essential in developing benchmarking protocols tailored to drug discovery. To that end, we have already formed a unique collaboration between representatives from ten biotech and pharmaceutical companies, but we believe that an open-science, cross-industry and interdisciplinary effort is needed to tackle such grand challenges.
Fit-for-purpose benchmarks are powerful instruments to direct the ML community towards more impactful research and to unlock breakthrough results. The gold standard for unbiased evaluation is a blind, prospective benchmark, in which different methods are evaluated on a newly generated test set that will only be disclosed after the results have been announced. A popular example in drug discovery is CASP (Critical Assessment of Structure Prediction) [2], which enabled a revolution in protein structure prediction by systematically identifying valuable innovations in the community [3]. However, data acquisition in drug discovery is expensive and time-consuming, limiting the accessibility and availability of blind benchmarks to the general research community.
Title (translated from Chinese): A call for an industry-led initiative to critically assess machine learning for real-world drug discovery