Ultrasensitive allele inference from immune repertoire sequencing data with MiXCR
Genome Research (IF 6.2), Pub Date: 2024-10-21, DOI: 10.1101/gr.278775.123
Artem Mikelov, George Nefedev, Aleksandr Tashkeev, Oscar L Rodriguez, Diego A Ortmans, Valeriia Skatova, Mark Izraelson, Alexey N Davydov, Stanislav Poslavsky, Souad Rahmouni, Corey T Watson, Dmitriy M Chudakov, Scott D Boyd, Dmitry A Bolotin

Allelic variability in the adaptive immune receptor loci, which harbor the gene segments that encode B cell and T cell receptors (BCR/TCR), is of critical importance for immune responses to pathogens and vaccines. Adaptive immune receptor repertoire sequencing (AIRR-seq) has become widespread in immunology research, making it the most readily available source of information about allelic diversity in immunoglobulin (IG) and T cell receptor (TR) loci. Here we present a novel algorithm for extra-sensitive and specific variable (V) and joining (J) gene allele inference, allowing reconstruction of individual high-quality gene segment libraries. The approach can be applied to infer allelic variants from peripheral blood lymphocyte BCR and TCR repertoire sequencing data, including hypermutated isotype-switched BCR sequences, thus allowing high-throughput novel allele discovery from a wide variety of existing datasets. The algorithm is part of the MiXCR software. We demonstrate the accuracy of this approach using AIRR-seq paired with long-read genomic sequencing data, comparing it to a widely used algorithm, TIgGER. We applied the algorithm to a large set of IG heavy chain (IGH) AIRR-seq data from 450 donors of ancestrally diverse population groups, and to the largest reported full-length TCR alpha and beta chain (TRA; TRB) AIRR-seq dataset, representing 134 individuals. This allowed us to assess the genetic diversity within the IGH, TRA and TRB loci in different populations and to establish a freely accessible online database of V and J gene alleles inferred from AIRR-seq data, together with their population frequencies.

Updated: 2024-10-22