Harmonizome 3.0: integrated knowledge about genes and proteins from diverse multi-omics resources
Nucleic Acids Research (IF 16.6), Pub Date: 2024-11-20, DOI: 10.1093/nar/gkae1080
Ido Diamant, Daniel J B Clarke, John Erol Evangelista, Nathania Lingam, Avi Ma’ayan
By processing and abstracting diverse omics datasets into associations between genes and their attributes, the Harmonizome database enables researchers to explore and integrate knowledge about human genes from many central omics resources. Here, we introduce Harmonizome 3.0, a significant upgrade to the original Harmonizome database. The upgrade adds 26 datasets that contribute nearly 12 million associations between genes and various attribute types such as cells and tissues, diseases, and pathways. The upgrade has a dataset crossing feature to identify gene modules that are shared across datasets. To further explain significantly high gene set overlap between dataset pairs, a large language model (LLM) composes a paragraph that speculates about the reasons behind the high overlap. The upgrade also adds more data formats and visualization options. Datasets are downloadable as knowledge graph (KG) assertions and visualized with Uniform Manifold Approximation and Projection (UMAP) plots. The KG assertions can be explored via a user interface that visualizes gene–attribute associations as ball-and-stick diagrams. Overall, Harmonizome 3.0 is a rich resource of processed omics datasets that are provided in several AI-ready formats. Harmonizome 3.0 is available at https://maayanlab.cloud/Harmonizome/.
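The idea of crossing two datasets to find shared gene modules can be illustrated with a minimal sketch. The abstract does not state how Harmonizome scores overlap, so the Jaccard index used here is an assumption, as are the function names (`jaccard`, `cross_datasets`), the `min_overlap` threshold, and the toy gene sets; none of this calls the Harmonizome API.

```python
from itertools import product


def jaccard(a, b):
    """Jaccard index of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cross_datasets(ds_a, ds_b, min_overlap=2):
    """Compare every gene set in ds_a with every gene set in ds_b.

    Each dataset is a dict mapping an attribute name to a set of gene symbols.
    Returns (attr_a, attr_b, shared_genes, jaccard) tuples for pairs sharing
    at least `min_overlap` genes, sorted by descending Jaccard index.
    The scoring and threshold are illustrative assumptions, not the
    method described in the paper.
    """
    hits = []
    for (attr_a, genes_a), (attr_b, genes_b) in product(ds_a.items(), ds_b.items()):
        shared = genes_a & genes_b
        if len(shared) >= min_overlap:
            hits.append((attr_a, attr_b, sorted(shared), jaccard(genes_a, genes_b)))
    return sorted(hits, key=lambda hit: hit[3], reverse=True)


# Hypothetical toy datasets: attribute -> associated genes.
pathways = {
    "DNA repair": {"BRCA1", "BRCA2", "ATM", "TP53"},
    "Glycolysis": {"HK1", "PKM", "GAPDH"},
}
diseases = {
    "Breast cancer": {"BRCA1", "BRCA2", "TP53", "ESR1"},
    "Anemia": {"HBB", "PKM"},
}

results = cross_datasets(pathways, diseases)
for attr_a, attr_b, shared, score in results:
    print(f"{attr_a} x {attr_b}: {shared} (Jaccard = {score:.2f})")
```

In this toy case only the "DNA repair" and "Breast cancer" sets share enough genes to be reported. A real analysis would more likely test overlap for statistical significance against a background gene universe rather than rely on a fixed count threshold.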
Updated: 2024-11-20