CATH v4.4: major expansion of CATH by experimental and predicted structural data
Nucleic Acids Research (IF 16.6), Pub Date: 2024-11-20, DOI: 10.1093/nar/gkae1087
Vaishali P Waman, Nicola Bordin, Andy Lau, Shaun Kandathil, Jude Wells, David Miller, Sameer Velankar, David T Jones, Ian Sillitoe, Christine Orengo

CATH (https://www.cathdb.info) is a structural classification database that assigns domains to the structures in the Protein Data Bank (PDB) and AlphaFold Protein Structure Database (AFDB) and adds layers of biological information, including homology and functional annotation. This article covers developments in the CATH classification since 2021. We report the significant expansion of structural information (180-fold) for CATH superfamilies through classification of PDB domains and predicted domain structures from the Encyclopedia of Domains (TED) resource. TED provides information on predicted domains in AFDB. CATH v4.4 represents an expansion of ∼64 844 experimentally determined domain structures from PDB. We also present a mapping of ∼90 million predicted domains from TED to CATH superfamilies. New PDB and TED data increases the number of superfamilies from 5841 to 6573, folds from 1349 to 2078 and architectures from 41 to 77. TED data comprises predicted structures, so these new folds and architectures remain hypothetical until experimentally confirmed. CATH also classifies domains into functional families (FunFams) within a superfamily. We have updated sequences in FunFams by scanning FunFam-HMMs against UniProt release 2024_02, giving a 276% increase in FunFams coverage. The mapping of TED structural domains has resulted in a 4-fold increase in FunFams with structural information.
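
The growth figures quoted in the abstract can be put on a common scale with a little arithmetic. The following is a minimal sketch, not part of the CATH tooling: the `counts` dictionary and the `growth` helper are hypothetical names introduced here for illustration, and the only inputs are the before/after counts stated above.

```python
# Before/after counts as quoted in the abstract for each CATH level.
# The dictionary layout and names are illustrative assumptions.
counts = {
    "superfamilies": (5841, 6573),
    "folds": (1349, 2078),
    "architectures": (41, 77),
}


def growth(before, after):
    """Return (absolute increase, percent increase) between two counts."""
    added = after - before
    return added, 100.0 * added / before


# Map each level to its (added, percent) pair.
summary = {level: growth(*pair) for level, pair in counts.items()}

for level, (added, pct) in summary.items():
    print(f"{level}: +{added} ({pct:.1f}%)")
```

On these numbers the relative growth is largest at the architecture level and smallest at the superfamily level, which fits the abstract's caution that the new folds and architectures come largely from predicted structures.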

Updated: 2024-11-20