-
Modeling gene interactions in polygenic prediction via geometric deep learning Genome Res. (IF 6.2) Pub Date : 2024-11-19 Han Li, Jianyang Zeng, Michael P Snyder, Sai Zhang
Polygenic risk score (PRS) is a widely used approach for predicting individuals' genetic risk of complex diseases, playing a pivotal role in advancing precision medicine. Traditional PRS methods, predominantly following a linear structure, often fall short in capturing the intricate relationships between genotype and phenotype. In this study, we present PRS-Net, an interpretable geometric deep learning-based
-
Multisample motif discovery and visualization for tandem repeats Genome Res. (IF 6.2) Pub Date : 2024-11-13 Yaran Zhang, Marc Hulsman, Alex Salazar, Niccoló Tesi, Lydian Knoop, Sven van der Lee, Sanduni Wijesekera, Jana Krizova, Erik-Jan Kamsteeg, Henne Holstege
Tandem repeats (TRs) occupy a significant portion of the human genome and are a source of polymorphism due to variation in size and motif composition. Some of these variations have been associated with various neuropathological disorders, highlighting the clinical importance of assessing the motif structure of TRs. Moreover, assessing TR motif variation can offer valuable insights into evolutionary
-
Multiple paralogues and recombination mechanisms contribute to the high incidence of 22q11.2 Deletion Syndrome Genome Res. (IF 6.2) Pub Date : 2024-11-13 Lisanne Vervoort, Nicolas Dierckxsens, Marta Sousa Santos, Senne Meynants, Erika Souche, Ruben Cools, Tracy Heung, Koen Devriendt, Hilde Peeters, Donna McDonald-McGinn, Ann Swillen, Jeroen Breckpot, Beverly S. Emanuel, Hilde Van Esch, Anne S. Bassett, Joris R. Vermeesch
The 22q11.2 deletion syndrome (22q11.2DS) is the most common microdeletion disorder. Why the incidence of 22q11.2DS is much greater than that of other genomic disorders remains unknown. Short read sequencing cannot resolve the complex segmental duplicon structure to provide direct confirmation of the hypothesis that the rearrangements are caused by nonallelic homologous recombination between the low
-
High-quality sika deer omics data and integrative analysis reveal genic and cellular regulation of antler regeneration Genome Res. (IF 6.2) Pub Date : 2024-11-14 Zihe Li, Ziyu Xu, Lei Zhu, Tao Qin, Jinrui Ma, Zhanying Feng, Huishan Yue, Qing Guan, Botong Zhou, Ge Han, Guokun Zhang, Chunyi Li, Shuaijun Jia, Qiang Qiu, Dingjun Hao, Yong Wang, Wen Wang
Antler is the only organ that can fully regenerate annually in mammals. However, the regulatory pattern and mechanism of gene expression and cell differentiation during this process remain largely unknown. Here, we obtain comprehensive assembly and gene annotation of the sika deer (Cervus nippon) genome. Together with large-scale chromatin accessibility and gene expression data, we construct gene regulatory
-
ISWI1 complex proteins facilitate developmental genome editing in Paramecium Genome Res. (IF 6.2) Pub Date : 2024-11-14 Aditi Singh, Lilia Häußermann, Christiane Emmerich, Emily Nischwitz, Brandon KB Seah, Falk Butter, Mariusz Nowacki, Estienne C. Swart
One of the most extensive forms of natural genome editing occurs in ciliates, a group of microbial eukaryotes. Ciliate germline and somatic genomes are contained in distinct nuclei within the same cell. During the massive reorganization process of somatic genome development, ciliates eliminate tens of thousands of DNA sequences from a germline genome copy. Recently, we showed that the chromatin remodeler
-
Understanding isoform expression by pairing long-read sequencing with single-cell and spatial transcriptomics Genome Res. (IF 6.2) Pub Date : 2024-11-01 Natan Belchikov, Justine Hsu, Xiang Jennie Li, Julien Jarroux, Wen Hu, Anoushka Joglekar, Hagen U. Tilgner
RNA isoform diversity, produced via alternative splicing, and alternative usage of transcription start and poly(A) sites, results in varied transcripts being derived from the same gene. Distinct isoforms can play important biological roles, including by changing the sequences or expression levels of protein products. The first single-cell approaches to RNA sequencing—and later, spatial approaches—which
-
Challenges in identifying mRNA transcript starts and ends from long-read sequencing data Genome Res. (IF 6.2) Pub Date : 2024-11-01 Ezequiel Calvo-Roitberg, Rachel F. Daniels, Athma A. Pai
Long-read sequencing (LRS) technologies have the potential to revolutionize scientific discoveries in RNA biology through the comprehensive identification and quantification of full-length mRNA isoforms. Despite great promise, challenges remain in the widespread implementation of LRS technologies for RNA-based applications, including concerns about low coverage, high sequencing error, and robust computational
-
Leveraging the power of long reads for targeted sequencing Genome Res. (IF 6.2) Pub Date : 2024-11-01 Shruti V. Iyer, Sara Goodwin, William Richard McCombie
Long-read sequencing technologies have improved the contiguity and, as a result, the quality of genome assemblies by generating reads long enough to span and resolve complex or repetitive regions of the genome. Several groups have shown the power of long reads in detecting thousands of genomic and epigenomic features that were previously missed by short-read sequencing approaches. While these studies
-
Revolutionizing genomics and medicine—one long molecule at a time Genome Res. (IF 6.2) Pub Date : 2024-11-01 Ana Conesa, Alexander Hoischen, Fritz J. Sedlazeck
Long-read sequencing (LRS) has matured, and the dramatically increased accuracy, ever-increasing throughput, and access now allow new and advanced studies even at scale. This Special Issue of Genome Research on “Long-read DNA and RNA Sequencing Applications in Biology and Medicine” garnered a record number of submissions, reflecting both the intense and broad interest in the technologies and the next
-
Haplotype-resolved genome and population genomics of the threatened garden dormouse in Europe Genome Res. (IF 6.2) Pub Date : 2024-11-01 Paige A. Byerly, Alina von Thaden, Evgeny Leushkin, Leon Hilgers, Shenglin Liu, Sven Winter, Tilman Schell, Charlotte Gerheim, Alexander Ben Hamadou, Carola Greve, Christian Betz, Hanno J. Bolz, Sven Büchner, Johannes Lang, Holger Meinig, Evax Marie Famira-Parcsetich, Sarah P. Stubbe, Alice Mouton, Sandro Bertolino, Goedele Verbeylen, Thomas Briner, Lídia Freixas, Lorenzo Vinciguerra, Sarah A. Mueller
Genomic resources are important for evaluating genetic diversity and supporting conservation efforts. The garden dormouse (Eliomys quercinus) is a small rodent that has experienced one of the most severe modern population declines in Europe. We present a high-quality haplotype-resolved reference genome for the garden dormouse, and combine comprehensive short and long-read transcriptomics data sets
-
Construction and evaluation of a new rat reference genome assembly, GRCr8, from long reads and long-range scaffolding Genome Res. (IF 6.2) Pub Date : 2024-11-01 Kai Li, Melissa L. Smith, J. Chris Blazier, Kelli J. Kochan, Jonathan M.D. Wood, Kerstin Howe, Anne E. Kwitek, Melinda R. Dwinell, Hao Chen, Julia L. Ciosek, Patrick Masterson, Terence D. Murphy, Theodore S. Kalbfleisch, Peter A. Doris
We report the construction and analysis of a new reference genome assembly for Rattus norvegicus, the laboratory rat, a widely used experimental animal model organism. The assembly has been adopted as the rat reference assembly by the Genome Reference Consortium and is named GRCr8. The assembly has employed 40× Pacific Biosciences (PacBio) HiFi sequencing coverage and scaffolding using optical mapping
-
Gapless assembly of complete human and plant chromosomes using only nanopore sequencing Genome Res. (IF 6.2) Pub Date : 2024-11-01 Sergey Koren, Zhigui Bao, Andrea Guarracino, Shujun Ou, Sara Goodwin, Katharine M. Jenike, Julian Lucas, Brandy McNulty, Jimin Park, Mikko Rautiainen, Arang Rhie, Dick Roelofs, Harrie Schneiders, Ilse Vrijenhoek, Koen Nijbroek, Olle Nordesjo, Sergey Nurk, Mike Vella, Katherine R. Lawrence, Doreen Ware, Michael C. Schatz, Erik Garrison, Sanwen Huang, William Richard McCombie, Karen H. Miga, Alexander
The combination of ultra-long (UL) Oxford Nanopore Technologies (ONT) sequencing reads with long, accurate Pacific Biosciences (PacBio) High Fidelity (HiFi) reads has enabled the completion of a human genome and spurred similar efforts to complete the genomes of many other species. However, this approach for complete, “telomere-to-telomere” genome assembly relies on multiple sequencing platforms, limiting
-
Factors impacting target-enriched long-read sequencing of resistomes and mobilomes Genome Res. (IF 6.2) Pub Date : 2024-11-01 Ilya B. Slizovskiy, Nathalie Bonin, Jonathan E. Bravo, Peter M. Ferm, Jacob Singer, Christina Boucher, Noelle R. Noyes
We investigated the efficiency of target-enriched long-read sequencing (TELSeq) for detecting antimicrobial resistance genes (ARGs) and mobile genetic elements (MGEs) within complex matrices. We aimed to overcome limitations associated with traditional antimicrobial resistance (AMR) detection methods, including short-read shotgun metagenomics, which can lack sensitivity, specificity, and the ability
-
Leveraging the T2T assembly to resolve rare and pathogenic inversions in reference genome gaps Genome Res. (IF 6.2) Pub Date : 2024-11-01 Kristine Bilgrav Saether, Jesper Eisfeldt, Jesse D. Bengtsson, Ming Yin Lun, Christopher M. Grochowski, Medhat Mahmoud, Hsiao-Tuan Chao, Jill A. Rosenfeld, Pengfei Liu, Marlene Ek, Jakob Schuy, Adam Ameur, Hongzheng Dai, Undiagnosed Diseases Network, James Paul Hwang, Fritz J. Sedlazeck, Weimin Bi, Ronit Marom, Josephine Wincent, Ann Nordgren, Claudia M.B. Carvalho, Anna Lindstrand
Chromosomal inversions (INVs) are particularly challenging to detect due to their copy-number neutral state and association with repetitive regions. Inversions represent about 1/20 of all balanced structural chromosome aberrations and can lead to disease by gene disruption or altering regulatory regions of dosage-sensitive genes in cis. Short-read genome sequencing (srGS) can only resolve ∼70% of cytogenetically
-
Resolving complex duplication variants in autism spectrum disorder using long-read genome sequencing Genome Res. (IF 6.2) Pub Date : 2024-11-01 Jesper Eisfeldt, Edward J. Higginbotham, Felix Lenner, Jennifer Howe, Bridget A. Fernandez, Anna Lindstrand, Stephen W. Scherer, Lars Feuk
Rare or de novo structural variation, primarily in the form of copy number variants, is detected in 5%–10% of autism spectrum disorder (ASD) families. While complex structural variants involving duplications can generally be detected using microarray or short-read genome sequencing (GS), these methods frequently fail to characterize breakpoints at nucleotide resolution, requiring additional molecular
-
A national long-read sequencing study on chromosomal rearrangements uncovers hidden complexities Genome Res. (IF 6.2) Pub Date : 2024-11-01 Jesper Eisfeldt, Adam Ameur, Felix Lenner, Esmee Ten Berk de Boer, Marlene Ek, Josephine Wincent, Raquel Vaz, Jesper Ottosson, Tord Jonson, Sofie Ivarsson, Sofia Thunström, Alexandra Topa, Simon Stenberg, Anna Rohlin, Anna Sandestig, Margareta Nordling, Pia Palmebäck, Magnus Burstedt, Frida Nordin, Eva-Lena Stattin, Maria Sobol, Panagiotis Baliakas, Marie-Louise Bondeson, Ida Höijer, Kristine Bilgrav
Clinical genetic laboratories often require a comprehensive analysis of chromosomal rearrangements/structural variants (SVs), from large events like translocations and inversions to supernumerary ring/marker chromosomes and small deletions or duplications. Understanding the complexity of these events and their clinical consequences requires pinpointing breakpoint junctions and resolving the derivative
-
An optimized protocol for quality control of gene therapy vectors using nanopore direct RNA sequencing Genome Res. (IF 6.2) Pub Date : 2024-11-01 Kathleen Zeglinski, Christian Montellese, Matthew E. Ritchie, Monther Alhamdoosh, Cédric Vonarburg, Rory Bowden, Monika Jordi, Quentin Gouil, Florian Aeschimann, Arthur Hsu
Despite recent advances made toward improving the efficacy of lentiviral gene therapies, a sizeable proportion of produced vector contains an incomplete and thus potentially nonfunctional RNA genome. This can undermine gene delivery by the lentivirus as well as increase manufacturing costs and must be improved to facilitate the widespread clinical implementation of lentiviral gene therapies. Here,
-
Generation and analysis of a mouse multitissue genome annotation atlas Genome Res. (IF 6.2) Pub Date : 2024-11-01 Matthew Adams, Christopher Vollmers
Generating an accurate and complete genome annotation for an organism is complex because the cells within each tissue can express a unique set of transcript isoforms from a unique set of genes. A comprehensive genome annotation should contain information on what tissues express what transcript isoforms at what level. This tissue-level isoform information can then inform a wide range of research questions
-
Unraveling the architecture of major histocompatibility complex class II haplotypes in rhesus macaques Genome Res. (IF 6.2) Pub Date : 2024-11-01 Nanine de Groot, Marit van der Wiel, Ngoc Giang Le, Natasja G. de Groot, Jesse Bruijnesteijn, Ronald E. Bontrop
The regions in the genome that encode components of the immune system are often characterized by polymorphism, copy number variation, and segmental duplications. There is a need to thoroughly characterize these complex regions to gain insight into the impact of genomic diversity on health and disease. Here we resolve the organization of complete major histocompatibility complex (MHC) class II regions in
-
Accurate bacterial outbreak tracing with Oxford Nanopore sequencing and reduction of methylation-induced errors Genome Res. (IF 6.2) Pub Date : 2024-11-01 Mara Lohde, Gabriel E. Wagner, Johanna Dabernig-Heinz, Adrian Viehweger, Sascha D. Braun, Stefan Monecke, Celia Diezel, Claudia Stein, Mike Marquet, Ralf Ehricht, Mathias W. Pletz, Christian Brandt
Our study investigates the effectiveness of Oxford Nanopore Technologies for accurate outbreak tracing by resequencing 33 isolates of a 3-year-long Klebsiella pneumoniae outbreak with Illumina short-read sequencing data as the point of reference. We detect considerable base errors through cgMLST and phylogenetic analysis of genomes sequenced with Oxford Nanopore Technologies, leading to the false exclusion
-
Long-read genome assembly of the insect model organism Tribolium castaneum reveals spread of satellite DNA in gene-rich regions by recurrent burst events Genome Res. (IF 6.2) Pub Date : 2024-11-01 Marin Volarić, Evelin Despot-Slade, Damira Veseljak, Brankica Mravinac, Nevenka Meštrović
Eukaryotic genomes are replete with satellite DNAs (satDNAs), large stretches of tandemly repeated sequences that are mostly underrepresented in genome assemblies. Here we combined nanopore long-read sequencing with a reference-guided assembly approach to generate an improved, high-quality genome assembly, TcasONT, of the model beetle Tribolium castaneum. Enriched by 45 Mb in repetitive regions, the
-
Telomere-to-telomere assembly by preserving contained reads Genome Res. (IF 6.2) Pub Date : 2024-11-01 Sudhanva Shyam Kamath, Mehak Bindra, Debnath Pal, Chirag Jain
Automated telomere-to-telomere (T2T) de novo assembly of diploid and polyploid genomes remains a formidable task. The string graph is a commonly used assembly graph representation in assembly algorithms. The string graph formulation employs graph simplification heuristics, which drastically reduce the count of vertices and edges. One of these heuristics involves removing the reads contained in longer
-
Detecting m6A RNA modification from nanopore sequencing using a semisupervised learning framework Genome Res. (IF 6.2) Pub Date : 2024-11-01 Haotian Teng, Marcus Stoiber, Ziv Bar-Joseph, Carl Kingsford
Direct nanopore-based RNA sequencing can be used to detect posttranscriptional base modifications, such as N6-methyladenosine (m6A) methylation, based on the electric current signals produced by the distinct chemical structures of modified bases. A key challenge is the scarcity of adequate training data with known methylation modifications. We present Xron, a hybrid encoder–decoder framework that delivers
-
Characterizing tandem repeat complexities across long-read sequencing platforms with TREAT and otter Genome Res. (IF 6.2) Pub Date : 2024-11-01 Niccoló Tesi, Alex Salazar, Yaran Zhang, Sven van der Lee, Marc Hulsman, Lydian Knoop, Sanduni Wijesekera, Jana Krizova, Anne-Fleur Schneider, Maartje Pennings, Kristel Sleegers, Erik-Jan Kamsteeg, Marcel Reinders, Henne Holstege
Tandem repeats (TRs) play important roles in genomic variation and disease risk in humans. Long-read sequencing allows for the accurate characterization of TRs; however, the underlying bioinformatics perspectives remain challenging. We present otter and TREAT: otter is a fast targeted local assembler, cross-compatible across different sequencing platforms. It is integrated in TREAT, an end-to-end workflow
-
Genomic epidemiology of carbapenem-resistant Enterobacterales at a New York City hospital over a 10-year period reveals complex plasmid-clone dynamics and evidence for frequent horizontal transfer of blaKPC Genome Res. (IF 6.2) Pub Date : 2024-11-01 Angela Gomez-Simmonds, Medini K. Annavajhala, Dwayne Seeram, Todd W. Hokunson, Heekuk Park, Anne-Catrin Uhlemann
Transmission of carbapenem-resistant Enterobacterales (CRE) in hospitals has been shown to occur through complex, multifarious networks driven by both clonal spread and horizontal transfer mediated by plasmids and other mobile genetic elements. We performed nanopore long-read sequencing on CRE isolates from a large urban hospital system to determine the overall contribution of plasmids to CRE transmission
-
Nanopore strand-specific mismatch enables de novo detection of bacterial DNA modifications Genome Res. (IF 6.2) Pub Date : 2024-11-01 Xudong Liu, Ying Ni, Lianwei Ye, Zhihao Guo, Lu Tan, Jun Li, Mengsu Yang, Sheng Chen, Runsheng Li
DNA modifications in bacteria present diverse types and distributions, playing crucial functional roles. Current methods for detecting bacterial DNA modifications via nanopore sequencing typically involve comparing raw current signals to a methylation-free control. In this study, we found that bacterial DNA modification induces errors in nanopore reads, and that these errors are found only in one strand
-
High-coverage nanopore sequencing of samples from the 1000 Genomes Project to build a comprehensive catalog of human genetic variation Genome Res. (IF 6.2) Pub Date : 2024-11-01 Jonas A. Gustafson, Sophia B. Gibson, Nikhita Damaraju, Miranda P.G. Zalusky, Kendra Hoekzema, David Twesigomwe, Lei Yang, Anthony A. Snead, Phillip A. Richmond, Wouter De Coster, Nathan D. Olson, Andrea Guarracino, Qiuhui Li, Angela L. Miller, Joy Goffena, Zachary B. Anderson, Sophie H.R. Storz, Sydney A. Ward, Maisha Sinha, Claudia Gonzaga-Jauregui, Wayne E. Clarke, Anna O. Basile, André Corvelo
Fewer than half of individuals with a suspected Mendelian or monogenic condition receive a precise molecular diagnosis after comprehensive clinical genetic testing. Improvements in data quality and costs have heightened interest in using long-read sequencing (LRS) to streamline clinical genomic testing, but the absence of control data sets for variant filtering and prioritization has made tertiary
-
Long-read genome sequencing and variant reanalysis increase diagnostic yield in neurodevelopmental disorders Genome Res. (IF 6.2) Pub Date : 2024-11-01 Susan M. Hiatt, James M.J. Lawlor, Lori H. Handley, Donald R. Latner, Zachary T. Bonnstetter, Candice R. Finnila, Michelle L. Thompson, Lori Beth Boston, Melissa Williams, Ivan Rodriguez Nunez, Jerry Jenkins, Whitley V. Kelley, E. Martina Bebin, Michael A. Lopez, Anna C.E. Hurst, Bruce R. Korf, Jeremy Schmutz, Jane Grimwood, Gregory M. Cooper
Variant detection from long-read genome sequencing (lrGS) has proven to be more accurate and comprehensive than variant detection from short-read genome sequencing (srGS). However, the rate at which lrGS can increase molecular diagnostic yield for rare disease is not yet precisely characterized. We performed lrGS using Pacific Biosciences “HiFi” technology on 96 short-read-negative probands with rare
-
Chromosome-level subgenome-aware de novo assembly provides insight into Saccharomyces bayanus genome divergence after hybridization Genome Res. (IF 6.2) Pub Date : 2024-11-01 Cory Gardner, Junhao Chen, Christina Hadfield, Zhaolian Lu, David Debruin, Yu Zhan, Maureen J. Donlin, Tae-Hyuk Ahn, Zhenguo Lin
Interspecies hybridization is prevalent in various eukaryotic lineages and plays important roles in phenotypic diversification, adaptation, and speciation. To better understand the changes that occurred in the different subgenomes of a hybrid species and how they facilitate adaptation, we have completed chromosome-level de novo assemblies of all chromosomes for a recently formed hybrid yeast, Saccharomyces
-
Full-length RNA transcript sequencing traces brain isoform diversity in house mouse natural populations Genome Res. (IF 6.2) Pub Date : 2024-11-01 Wenyu Zhang, Anja Guenther, Yuanxiao Gao, Kristian Ullrich, Bruno Huettel, Aftab Ahmad, Lei Duan, Kaizong Wei, Diethard Tautz
The ability to generate multiple RNA transcript isoforms from the same gene is a general phenomenon in eukaryotes. However, the complexity and diversity of alternative isoforms in natural populations remain largely unexplored. Using a newly developed full-length transcript enrichment protocol with 5′ CAP selection, we sequenced full-length RNA transcripts of 48 individuals from outbred populations
-
Long-read RNA sequencing of archival tissues reveals novel genes and transcripts associated with clear cell renal cell carcinoma recurrence and immune evasion Genome Res. (IF 6.2) Pub Date : 2024-11-01 Joshua Lee, Elizabeth A. Snell, Joanne Brown, Charlotte E. Booth, Rosamonde E. Banks, Daniel J. Turner, Naveen S. Vasudev, Dimitris Lagos
The use of long-read direct RNA sequencing (DRS) and PCR cDNA sequencing (PCS) in clinical oncology remains limited, with no direct comparison between the two methods. We used DRS and PCS to study clear cell renal cell carcinoma (ccRCC), focusing on new transcript and gene discovery. Twelve primary ccRCC archival tumors, six from patients who went on to relapse, were analyzed. Results were validated
-
Measuring X-Chromosome inactivation skew for X-linked diseases with adaptive nanopore sequencing Genome Res. (IF 6.2) Pub Date : 2024-11-01 Sena A. Gocuk, James Lancaster, Shian Su, Jasleen K. Jolly, Thomas L. Edwards, Doron G. Hickey, Matthew E. Ritchie, Marnie E. Blewitt, Lauren N. Ayton, Quentin Gouil
X-linked genetic disorders typically affect females less severely than males owing to the presence of a second X Chromosome not carrying the deleterious variant. However, the phenotypic expression in females is highly variable, which may be explained by an allelic skew in X-Chromosome inactivation. Accurate measurement of X inactivation skew is crucial to understand and predict disease phenotype in
-
Enhanced detection of RNA modifications and read mapping with high-accuracy nanopore RNA basecalling models Genome Res. (IF 6.2) Pub Date : 2024-11-01 Gregor Diensthuber, Leszek P. Pryszcz, Laia Llovera, Morghan C. Lucas, Anna Delgado-Tejedor, Sonia Cruciani, Jean-Yves Roignant, Oguzhan Begik, Eva Maria Novoa
In recent years, nanopore direct RNA sequencing (DRS) has become a valuable tool for studying the epitranscriptome, owing to its ability to detect multiple modifications within the same full-length native RNA molecules. Although RNA modifications can be identified in the form of systematic basecalling “errors” in DRS data sets, N6-methyladenosine (m6A) modifications produce relatively low “errors” compared
-
Long-read transcriptome sequencing of CLL and MDS patients uncovers molecular effects of SF3B1 mutations Genome Res. (IF 6.2) Pub Date : 2024-11-01 Alicja Pacholewska, Matthias Lienhard, Mirko Brüggemann, Heike Hänel, Lorina Bilalli, Anja Königs, Felix Heß, Kerstin Becker, Karl Köhrer, Jesko Kaiser, Holger Gohlke, Norbert Gattermann, Michael Hallek, Carmen D. Herling, Julian König, Christina Grimm, Ralf Herwig, Kathi Zarnack, Michal R. Schweiger
Mutations in splicing factor 3B subunit 1 (SF3B1) frequently occur in patients with chronic lymphocytic leukemia (CLL) and myelodysplastic syndromes (MDSs). These mutations affect disease prognosis differently, with a beneficial effect in MDS and a worse prognosis in CLL patients. A full-length transcriptome approach can expand our knowledge of SF3B1 mutation effects on RNA splicing and its contribution
-
Long-read DNA and cDNA sequencing identify cancer-predisposing deep intronic variation in tumor-suppressor genes Genome Res. (IF 6.2) Pub Date : 2024-11-01 Suleyman Gulsuner, Amal AbuRayyan, Jessica B. Mandell, Ming K. Lee, Greta V. Bernier, Barbara M. Norquist, Sarah B. Pierce, Mary-Claire King, Tom Walsh
The vast majority of deeply intronic genomic variants are benign, but some extremely rare or private deep intronic variants lead to exonification of intronic sequence with abnormal transcriptional consequences. Damaging variants of this class are likely underreported as causes of disease for several reasons: Most clinical DNA and RNA testing does not include full intronic sequences; many of these variants
-
Visualization and analysis of medically relevant tandem repeats in nanopore sequencing of control cohorts with pathSTR Genome Res. (IF 6.2) Pub Date : 2024-11-01 Wouter De Coster, Ida Höijer, Inge Bruggeman, Svenn D'Hert, Malin Melin, Adam Ameur, Rosa Rademakers
The lack of population-scale databases hampers research and diagnostics for medically relevant tandem repeats and repeat expansions. We attempt to fill this gap using our pathSTR web tool, which leverages long-read sequencing of large cohorts to determine repeat length and sequence composition in a healthy population. The current version includes 1040 individuals of The 1000 Genomes Project cohort
-
Independent expansion, selection, and hypervariability of the TBC1D3 gene family in humans Genome Res. (IF 6.2) Pub Date : 2024-11-01 Xavi Guitart, David Porubsky, DongAhn Yoo, Max L. Dougherty, Philip C. Dishuck, Katherine M. Munson, Alexandra P. Lewis, Kendra Hoekzema, Jordan Knuth, Stephen Chang, Tomi Pastinen, Evan E. Eichler
TBC1D3 is a primate-specific gene family that has expanded in the human lineage and has been implicated in neuronal progenitor proliferation and expansion of the frontal cortex. The gene family and its expression have been challenging to investigate because it is embedded in high-identity and highly variable segmental duplications. We sequenced and assembled the gene family using long-read sequencing
-
Long-read Ribo-STAMP simultaneously measures transcription and translation with isoform resolution Genome Res. (IF 6.2) Pub Date : 2024-11-01 Pratibha Jagannatha, Alexandra T. Tankka, Daniel A. Lorenz, Tao Yu, Brian A. Yee, Kristopher W. Brannan, Cathy J. Zhou, Jason G. Underwood, Gene W. Yeo
Transcription and translation are intertwined processes in which mRNA isoforms are crucial intermediaries. However, methodological limitations in analyzing translation at the mRNA isoform level have left gaps in our understanding of critical biological processes. To address these gaps, we developed an integrated computational and experimental framework called long-read Ribo-STAMP (LR-Ribo-STAMP) that
-
DNA-m6A calling and integrated long-read epigenetic and genetic analysis with fibertools Genome Res. (IF 6.2) Pub Date : 2024-11-01 Anupama Jha, Stephanie C. Bohaczuk, Yizi Mao, Jane Ranchalis, Benjamin J. Mallory, Alan T. Min, Morgan O. Hamm, Elliott Swanson, Danilo Dubocanin, Connor Finkbeiner, Tony Li, Dale Whittington, William Stafford Noble, Andrew B. Stergachis, Mitchell R. Vollger
Long-read DNA sequencing has recently emerged as a powerful tool for studying both genetic and epigenetic architectures at single-molecule and single-nucleotide resolution. Long-read epigenetic studies encompass both the direct identification of native cytosine methylation and the identification of exogenously placed DNA N6-methyladenine (DNA-m6A). However, detecting DNA-m6A modifications using single-molecule
-
Full-resolution HLA and KIR gene annotations for human genome assemblies Genome Res. (IF 6.2) Pub Date : 2024-11-01 Ying Zhou, Li Song, Heng Li
The human leukocyte antigen (HLA) genes and the killer cell immunoglobulin-like receptor (KIR) genes are critical to immune responses and are associated with many immune-related diseases. Because these genes are located in highly polymorphic regions, they are difficult to study with traditional short-read alignment-based methods. Although modern long-read assemblers can often assemble these genes, using existing tools to
-
Long-read subcellular fractionation and sequencing reveals the translational fate of full-length mRNA isoforms during neuronal differentiation Genome Res. (IF 6.2) Pub Date : 2024-11-01 Alexander J. Ritter, Jolene M. Draper, Christopher Vollmers, Jeremy R. Sanford
Alternative splicing (AS) alters the cis-regulatory landscape of mRNA isoforms, leading to transcripts with distinct localization, stability, and translational efficiency. To rigorously investigate mRNA isoform-specific ribosome association, we generated subcellular fractionation and sequencing (Frac-seq) libraries using both conventional short reads and long reads from human embryonic stem cells (ESCs)
-
Geometric deep learning framework for de novo genome assembly Genome Res. (IF 6.2) Pub Date : 2024-10-29 Lovro Vrček, Xavier Bresson, Thomas Laurent, Martin Schmitz, Kenji Kawaguchi, Mile Šikić
The critical stage of every de novo genome assembler is identifying paths in assembly graphs that correspond to the reconstructed genomic sequences. The existing algorithmic methods struggle with this, primarily due to repetitive regions causing complex graph tangles, leading to fragmented assemblies. Here, we introduce GNNome, a framework for path identification based on geometric deep learning that
-
Long-read RNA sequencing reveals allele-specific N6-methyladenosine modifications Genome Res. (IF 6.2) Pub Date : 2024-10-29 Dayea Park, Can Cenik
Long-read sequencing technology enables highly accurate detection of allele-specific RNA expression, providing insights into the effects of genetic variation on splicing and RNA abundance. Furthermore, the ability to directly sequence RNA using the Oxford Nanopore technology promises the detection of RNA modifications in tandem with ascertaining the allelic origin of each molecule. Here, we leverage
-
Binding profiles for 961 Drosophila and C. elegans transcription factors reveal tissue-specific regulatory relationships Genome Res. (IF 6.2) Pub Date : 2024-10-22 Michelle Kudron, Louis Gewirtzman, Alec Victorsen, Bridget C Lear, Dionne Vafeados, Jiahao Gao, Jinrui Xu, Swapna Samanta, Emily Frink, Adri Tran-Pearson, Chau Hyunh, Ann Hammonds, William Fisher, Martha L Wall, Greg Wesseling, Vanessa Hernandez, Zhichun Lin, Mary Kasparian, Kevin P White, Ravi Allada, Mark Gerstein, LaDeana Hillier, Susan E Celniker, Valerie Reinke, Robert Waterston
A catalog of transcription factor (TF) binding sites in the genome is critical for deciphering regulatory relationships. Here we present the culmination of the efforts of the Model Organism ENCyclopedia Of DNA Elements (modENCODE) and the model organism Encyclopedia of Regulatory Networks (modERN) consortia to systematically assay TF binding events in vivo in two major model organisms, Drosophila melanogaster
-
Candida albicans isolates contain frequent heterozygous structural variants and transposable elements within genes and centromeres Genome Res. (IF 6.2) Pub Date : 2024-10-22 Ursula Oggenfuss, Robert T Todd, Natthapon Soisangwan, Bailey Kemp, Alison Guyer, Annette Beach, Anna Selmecki
The human fungal pathogen Candida albicans poses a significant burden on global health, causing high rates of mortality and antifungal drug resistance. C. albicans is a heterozygous diploid organism that reproduces asexually. Structural variants (SVs) are an important source of genomic rearrangement, particularly in species that lack sexual recombination. To comprehensively investigate SVs across clinical
-
Inferring ancestry with the hierarchical soft clustering approach tangleGen Genome Res. (IF 6.2) Pub Date : 2024-10-21 Klara Elisabeth Burger, Solveig Klepper, Ulrike von Luxburg, Franz Baumdicker
Understanding the genetic ancestry of populations is central to numerous scientific and societal fields. It contributes to a better understanding of human evolutionary history, advances personalized medicine, aids in forensic identification, and allows individuals to connect to their genealogical roots. Existing methods, such as ADMIXTURE, have significantly improved our ability to infer ancestries
-
Analyzing super-enhancer temporal dynamics reveals potential critical enhancers and their gene regulatory networks underlying skeletal muscle development Genome Res. (IF 6.2) Pub Date : 2024-10-21 Song Zhang, Chao Wang, Shenghua Qin, Choulin Chen, Yongzhou Bao, Yuanyuan Zhang, Lingna Xu, Qingyou Liu, Yunxiang Zhao, Kui Li, Zhonglin Tang, Yuwen Liu
Super-enhancers (SEs) govern the expression of genes defining cell identity. However, the dynamic landscape of SEs and their critical constituent enhancers involved in skeletal muscle development remains unclear. In this study, using pig as a model, we employed CUT&Tag to profile the enhancer-associated histone modification marker H3K27ac in skeletal muscle across two prenatal and three postnatal stages
-
Ultrasensitive allele inference from immune repertoire sequencing data with MiXCR Genome Res. (IF 6.2) Pub Date : 2024-10-21 Artem Mikelov, George Nefedev, Aleksandr Tashkeev, Oscar L Rodriguez, Diego A Ortmans, Valeriia Skatova, Mark Izraelson, Alexey N Davydov, Stanislav Poslavsky, Souad Rahmouni, Corey T Watson, Dmitriy M Chudakov, Scott D Boyd, Dmitry A Bolotin
Allelic variability in the adaptive immune receptor loci, which harbor the gene segments that encode B cell and T cell receptors (BCR/TCR), is of critical importance for immune responses to pathogens and vaccines. Adaptive immune receptor repertoire sequencing (AIRR-seq) has become widespread in immunology research, making it the most readily available source of information about allelic diversity in
-
The chromatin tapestry as a framework for neurodevelopment Genome Res. (IF 6.2) Pub Date : 2024-10-01 Ben Nolan, Timothy E. Reznicek, Christopher T. Cummings, M. Jordan Rowley
The neuronal nucleus houses a meticulously organized genome. Within this structure, genetic material is not simply compacted but arranged into a precise and functional 3D chromatin landscape essential for cellular regulation. This mini-review highlights the importance of this chromatin landscape in healthy neurodevelopment, as well as the diseases that occur with aberrant chromatin architecture. We
-
Chromatin interaction maps identify oncogenic targets of enhancer duplications in cancer Genome Res. (IF 6.2) Pub Date : 2024-10-01 Yueqiang Song, Fuyuan Li, Shangzi Wang, Yuntong Wang, Cong Lai, Lian Chen, Ning Jiang, Jin Li, Xingdong Chen, Swneke D. Bailey, Xiaoyang Zhang
As a major type of structural variant, tandem duplication plays a critical role in tumorigenesis by increasing oncogene dosage. Recent work has revealed that noncoding enhancers are also affected by duplications, leading to the activation of oncogenes that are inside or outside of the duplicated regions. However, the prevalence of enhancer duplication and the identity of their target genes remains
-
Dynamic dysregulation of retrotransposons in neurodegenerative diseases at the single-cell level Genome Res. (IF 6.2) Pub Date : 2024-10-01 Wankun Deng, Citu Citu, Andi Liu, Zhongming Zhao
Retrotransposable elements (RTEs) are common mobile genetic elements comprising ∼42% of the human genome. RTEs play critical roles in gene regulation and function, but how they are specifically involved in complex diseases is largely unknown. Here, we investigate the cellular heterogeneity of RTEs using 12 single-cell transcriptome profiles covering three neurodegenerative diseases, Alzheimer's disease
-
De novo genome assemblies of two cryptodiran turtles with ZZ/ZW and XX/XY sex chromosomes provide insights into patterns of genome reshuffling and uncover novel 3D genome folding in amniotes Genome Res. (IF 6.2) Pub Date : 2024-10-01 Basanta Bista, Laura González-Rodelas, Lucía Álvarez-González, Zhi-qiang Wu, Eugenia E. Montiel, Ling Sze Lee, Daleen B. Badenhorst, Srihari Radhakrishnan, Robert Literman, Beatriz Navarro-Dominguez, John B. Iverson, Simon Orozco-Arias, Josefa González, Aurora Ruiz-Herrera, Nicole Valenzuela
Understanding the evolution of chromatin conformation among species is fundamental to elucidate the architecture and plasticity of genomes. Nonrandom interactions of linearly distant loci regulate gene function in species-specific patterns, affecting genome function, evolution, and, ultimately, speciation. Yet, data from nonmodel organisms are scarce. To capture the macroevolutionary diversity of vertebrate
-
Seamless, rapid, and accurate analyses of outbreak genomic data using split k-mer analysis Genome Res. (IF 6.2) Pub Date : 2024-10-01 Romain Derelle, Johanna von Wachsmann, Tommi Mäklin, Joel Hellewell, Timothy Russell, Ajit Lalvani, Leonid Chindelevitch, Nicholas J. Croucher, Simon R. Harris, John A. Lees
Sequence variation observed in populations of pathogens can be used for important public health and evolutionary genomic analyses, especially outbreak analysis and transmission reconstruction. Identifying this variation is typically achieved by aligning sequence reads to a reference genome, but this approach is susceptible to reference biases and requires careful filtering of called genotypes. There
-
PWAS Hub for exploring gene-based associations of common complex diseases Genome Res. (IF 6.2) Pub Date : 2024-10-01 Guy Kelman, Roei Zucker, Nadav Brandes, Michal Linial
PWAS (proteome-wide association study) is an innovative genetic association approach that complements widely used methods like GWAS (genome-wide association study). The PWAS approach involves consecutive phases. Initially, machine learning modeling and probabilistic considerations quantify the impact of genetic variants on protein-coding genes’ biochemical functions. Secondly, for each individual,
-
Theoretical framework for the difference of two negative binomial distributions and its application in comparative analysis of sequencing data Genome Res. (IF 6.2) Pub Date : 2024-10-01 Alicia Petrany, Ruoyu Chen, Shaoqiang Zhang, Yong Chen
High-throughput sequencing (HTS) technologies have been instrumental in investigating biological questions at the bulk and single-cell levels. Comparative analysis of two HTS data sets often relies on testing the statistical significance for the difference of two negative binomial distributions (DOTNB). Although negative binomial distributions are well studied, the theoretical results for DOTNB remain
-
Complete genomes of Asgard archaea reveal diverse integrated and mobile genetic elements Genome Res. (IF 6.2) Pub Date : 2024-10-01 Luis E. Valentin-Alvarado, Ling-Dong Shi, Kathryn E. Appler, Alexander Crits-Christoph, Valerie De Anda, Benjamin A. Adler, Michael L. Cui, Lynn Ly, Pedro Leão, Richard J. Roberts, Rohan Sachdeva, Brett J. Baker, David F. Savage, Jillian F. Banfield
Asgard archaea are of great interest as the progenitors of Eukaryotes, but little is known about the mobile genetic elements (MGEs) that may shape their ongoing evolution. Here, we describe MGEs that replicate in Atabeyarchaeia, a wetland Asgard archaea lineage represented by two complete genomes. We used soil depth–resolved population metagenomic data sets to track 18 MGEs for which genome structures
-
Global characterization of somatic mutations and DNA methylation changes during vegetative propagation in strawberries Genome Res. (IF 6.2) Pub Date : 2024-10-01 Shaoqiang Hu, Xiangguo Zeng, Yuguo Liu, Yongping Li, Minghao Qu, Wen-Biao Jiao, Yongchao Han, Chunying Kang
Somatic mutations arise and accumulate during tissue culture and vegetative propagation, potentially affecting various traits in horticultural crops, but their characteristics are still unclear. Here, somatic mutations in regenerated woodland strawberry derived from tissue culture of shoot tips under different conditions and 12 cultivated strawberry individuals are analyzed by whole genome sequencing
-
Evolutionary dynamics of polyadenylation signals and their recognition strategies in protists Genome Res. (IF 6.2) Pub Date : 2024-10-01 Marcin P. Sajek, Danielle Y. Bilodeau, Michael A. Beer, Emma Horton, Yukiko Miyamoto, Katrina B. Velle, Lars Eckmann, Lillian Fritz-Laylin, Olivia S. Rissland, Neelanjan Mukherjee
The poly(A) signal, together with auxiliary elements, directs cleavage of a pre-mRNA and thus determines the 3′ end of the mature transcript. In many species, including humans, the poly(A) signal is an AAUAAA hexamer, but we recently found that the deeply branching eukaryote Giardia lamblia uses a distinct hexamer (AGURAA) and lacks any known auxiliary elements. Our discovery prompted us to explore
-
Targeted and complete genomic sequencing of the major histocompatibility complex in haplotypic form of individual heterozygous samples Genome Res. (IF 6.2) Pub Date : 2024-10-01 Taishan Hu, Timothy L. Mosbruger, Nikolaos G. Tairis, Amalia Dinou, Pushkala Jayaraman, Mahdi Sarmady, Kingham Brewster, Yang Li, Tristan J. Hayeck, Jamie L. Duke, Dimitri S. Monos
The human major histocompatibility complex (MHC) is a ∼4 Mb genomic segment on Chromosome 6 that plays a pivotal role in the immune response. Despite its importance in various traits and diseases, its complex nature makes it challenging to accurately characterize on a routine basis. We present a novel approach allowing targeted sequencing and de novo haplotypic assembly of the MHC region in heterozygous
-
Rapid SARS-CoV-2 surveillance using clinical, pooled, or wastewater sequence as a sensor for population change Genome Res. (IF 6.2) Pub Date : 2024-10-01 Apurva Narechania, Dean Bobo, Kevin Deitz, Rob DeSalle, Paul J. Planet, Barun Mathema
The COVID-19 pandemic has highlighted the critical role of genomic surveillance for guiding policy and control. Timeliness is key, but sequence alignment and phylogeny slow most surveillance techniques. Millions of SARS-CoV-2 genomes have been assembled. Phylogenetic methods are ill equipped to handle this sheer scale. We introduce a pangenomic measure that examines the information diversity of a k-mer