Evo learns biological complexity from the molecular to genome scale
Nature Biotechnology (IF 33.1), Pub Date: 2024-12-11, DOI: 10.1038/s41587-024-02514-7
Iris Marchal

Generative artificial intelligence models of molecular biology are often restricted to individual molecules or DNA segments and are built in a way that makes them computationally demanding when applied to long sequences. The ability to capture broader genomic interactions will be crucial for both the understanding and engineering of complex biological processes. Writing in Science, Nguyen et al. introduce Evo, a genomic foundation model that can interpret and generate DNA sequences at whole-genome scale while maintaining single-nucleotide resolution.

Evo is built on a StripedHyena architecture and equipped with 7 billion parameters, with a context length of up to 131 kilobases. Trained on 2.7 million microbial genomes, Evo performed well on various tasks that were previously performed with domain-specific models. For example, Evo learned the effects of mutations on protein and noncoding RNA function, modeled the activity of regulatory elements, and also understood how small mutations affect organismal fitness by predicting gene essentiality.



Updated: 2024-12-12