Challenges and Prospects of DNA-Encoded Library Data Interpretation, Chemical Reviews


Challenges and Prospects of DNA-Encoded Library Data Interpretation
Chemical Reviews (IF 51.4), Pub Date: 2024-11-07, DOI: 10.1021/acs.chemrev.4c00284
Moreno Wichert, Laura Guasch, Raphael M. Franzini

DNA-encoded library (DEL) technology is a powerful platform for the efficient identification of novel chemical matter in the early drug discovery process enabled by parallel screening of vast libraries of encoded small molecules through affinity selection and deep sequencing. While DEL selections provide rich data sets for computational drug discovery, the underlying technical factors influencing DEL data remain incompletely understood. This review systematically examines the key parameters affecting the chemical information in DEL data and their impact on hit triaging and machine learning integration. The need for rigorous data handling and interpretation is emphasized, with standardized methods being critical for the success of DEL-based approaches. Major challenges include the relationship between sequence counts and binding affinities, frequent hitters, and the influence of factors such as inhomogeneous library composition, DNA damage, and linkers on binding modes. Experimental artifacts, such as those caused by protein immobilization and screening matrix effects, further complicate data interpretation. Recent advancements in using machine learning to denoise DEL data and predict drug candidates are highlighted. This review offers practical guidance on adopting best practices for integrating robust methodologies, comprehensive data analysis, and computational tools to improve the accuracy and efficacy of DEL-driven hit discovery.


Updated: 2024-11-07
