Nature Biotechnology (IF 33.1). Pub Date: 2024-11-04. DOI: 10.1038/s41587-024-02469-9. M. Frank Erasmus, Laura Spector, Fortunato Ferrara, Roberto DiNiro, Thomas J. Pohl, Katheryn Perea-Schmittle, Wei Wang, Peter M. Tessier, Crystal Richardson, Laure Turner, Sumit Kumar, Daniel Bedinger, Pietro Sormanni, Monica L. Fernández-Quintero, Andrew B. Ward, Johannes R. Loeffler, Olivia M. Swanson, Charlotte M. Deane, Matthew I. J. Raybould, Andreas Evers, Carolin Sellmann, Sharrol Bachas, Jeff Ruffolo, Horacio G. Nastri, Karthik Ramesh, Jesper Sørensen, Rebecca Croasdale-Wood, Oliver Hijano, Camila Leal-Lopes, Melody Shahsavarian, Yu Qiu, Paolo Marcatili, Erik Vernet, Rahmad Akbar, Simon Friedensohn, Rick Wagner, Vinodh babu Kurella, Shipra Malhotra, Satyendra Kumar, Patrick Kidger, Juan C. Almagro, Eric Furfine, Marty Stanton, Christilyn P. Graff, Santiago David Villalba, Florian Tomszak, Andre A. R. Teixeira, Elizabeth Hopkins, Molly Dovner, Sara D'Angelo, Andrew R. M. Bradbury
Science is frequently subject to the Gartner hype cycle1: emergent technologies spark intense initial enthusiasm and the recruitment of dedicated scientists. As limitations are recognized, disillusionment often sets in; some scientists turn away, disappointed by the inability of the new technology to deliver on its initial promise, while others persevere and develop the technology further. Although the value (or lack thereof) of a new technology usually becomes clear with time, appropriate benchmarks can be invaluable in highlighting strengths and areas for improvement, substantially accelerating technology maturation. A particular challenge in computational engineering and artificial intelligence (AI)/machine learning (ML) is that benchmarks and best practices are uncommon, making it especially hard for non-experts to assess the impact and performance of these methods. Although multiple papers have highlighted best practices and evaluation guidelines2,3,4, the true test for such methods is ultimately prospective performance, which requires experimental testing.
AIntibody: an experimentally validated in silico antibody discovery design challenge