Domain adaptation in small-scale and heterogeneous biological datasets
Science Advances (IF 11.7) Pub Date: 2024-12-20, DOI: 10.1126/sciadv.adp6040
Seyedmehdi Orouji, Martin C. Liu, Tal Korem, Megan A. K. Peters

Machine-learning models are key to modern biology, yet models trained on one dataset often do not generalize to other datasets from different cohorts or laboratories because of both technical and biological differences. Domain adaptation, a type of transfer learning, alleviates this problem by aligning different datasets so that models can be applied across them. However, most state-of-the-art domain adaptation methods were designed for large-scale data such as images, whereas biological datasets are smaller, have more features, and are complex and heterogeneous. This Review discusses domain adaptation methods in the context of such biological data to inform biologists and guide future domain adaptation research. We describe the benefits and challenges of domain adaptation in biological research and critically explore some of its objectives, strengths, and weaknesses. We argue for incorporating domain adaptation techniques into the computational biologist’s toolkit, with further development of customized approaches.

Updated: 2024-12-20