-
MCHelper automatically curates transposable element libraries across eukaryotic species Genome Res. (IF 6.2) Pub Date : 2024-12-09 Simon Orozco-Arias, Pío Sierra, Richard Durbin, Josefa González
The number of species with high-quality genome sequences continues to increase, in part due to the scaling up of multiple large-scale biodiversity sequencing projects. While the need to annotate genic sequences in these genomes is widely acknowledged, the parallel need to annotate transposable element (TE) sequences that have been shown to alter genome architecture, rewire gene regulatory networks
-
Resolving the chromatin impact of mosaic variants with targeted Fiber-seq Genome Res. (IF 6.2) Pub Date : 2024-12-09 Stephanie C. Bohaczuk, Zachary J. Amador, Chang Li, Benjamin J. Mallory, Elliott G. Swanson, Jane Ranchalis, Mitchell R. Vollger, Katherine M. Munson, Tom Walsh, Morgan O. Hamm, Yizi Mao, Andre Lieber, Andrew B. Stergachis
Accurately quantifying the functional consequences of noncoding mosaic variants requires the pairing of DNA sequences with both accessible and closed chromatin architectures along individual DNA molecules—a pairing that cannot be achieved using traditional fragmentation-based chromatin assays. We demonstrate that targeted single-molecule chromatin fiber sequencing (Fiber-seq) achieves this, permitting
-
An integrative TAD catalog in lymphoblastoid cell lines discloses the functional impact of deletions and insertions in human genomes Genome Res. (IF 6.2) Pub Date : 2024-12-05 Chong Li, Marc Jan Bonder, Sabriya Syed, Matthew Jensen, Human Genome Structural Variation Consortium (HGSVC), HGSVC Functional Analysis Working Group, Mark B. Gerstein, Michael C. Zody, Mark J.P. Chaisson, Michael E. Talkowski, Tobias Marschall, Jan O. Korbel, Evan E. Eichler, Charles Lee, Xinghua Shi
The human genome is packaged within a three-dimensional (3D) nucleus and organized into structural units known as compartments, topologically associating domains (TADs), and loops. TAD boundaries, separating adjacent TADs, have been found to be well conserved across mammalian species and more evolutionarily constrained than TADs themselves. Recent studies show that structural variants (SVs) can modify
-
Rearrangements of viral and human genomes at human papillomavirus integration events and their allele-specific impacts on cancer genome regulation Genome Res. (IF 6.2) Pub Date : 2024-12-05 Vanessa L. Porter, Michelle Ng, Kieran O'Neill, Signe MacLennan, Richard D. Corbett, Luka Culibrk, Zeid Hamadeh, Marissa Iden, Rachel Schmidt, Shirng-Wern Tsaih, Carolyn Nakisige, Martin Origa, Jackson Orem, Glenn Chang, Jeremy Fan, Ka Ming Nip, Vahid Akbari, Simon K. Chan, James Hopkins, Richard A. Moore, Eric Chuah, Karen L. Mungall, Andrew J. Mungall, Inanc Birol, Steven J.M. Jones, Janet S. Rader
Human papillomavirus (HPV) integration has been implicated in transforming HPV infection into cancer. To resolve genome dysregulation associated with HPV integration, we performed Oxford Nanopore long-read sequencing on 72 cervical cancer genomes from a Ugandan dataset that was previously characterized using short-read sequencing. We found recurrent structural rearrangement patterns at HPV integration
-
Hydra has mammal-like mutation rates facilitating fast adaptation despite its nonaging phenotype Genome Res. (IF 6.2) Pub Date : 2024-12-04 Arne Sahm, Konstantin Riege, Marco Groth, Martin Bens, Johann Kraus, Martin Fischer, Hans Kestler, Christoph Englert, Ralf Schaible, Matthias Platzer, Steve Hoffmann
Growing evidence suggests that somatic mutations may be a major cause of the aging process. However, it remains to be tested whether the predictions of the theory also apply to species with longer life spans than humans. Hydra is a genus of freshwater polyps with remarkable regeneration abilities and a potentially unlimited life span under laboratory conditions. By genome sequencing of single cells
-
Characterization of DNA methylation reader proteins of Arabidopsis thaliana Genome Res. (IF 6.2) Pub Date : 2024-12-04 Jonathan Cahn, James P.B. Lloyd, Ino D. Karemaker, Pascal W.T.C. Jansen, Jahnvi Pflueger, Owen Duncan, Jakob Petereit, Ozren Bogdanovic, A. Harvey Millar, Michiel Vermeulen, Ryan Lister
In plants, cytosine DNA methylation (mC) is largely associated with transcriptional repression of transposable elements, but it can also be found in the body of expressed genes, referred to as gene body methylation (gbM). gbM is correlated with ubiquitously expressed genes; however, its function, or absence thereof, is highly debated. The different outputs that mC can have raise questions as to how
-
Structure-optimized sgRNA selection with PlatinumCRISPr for efficient Cas9 generation of knockouts Genome Res. (IF 6.2) Pub Date : 2024-12-03 Irmgard U. Haussmann, Thomas C. Dix, David W.J. McQuarrie, Veronica Dezi, Abdullah I. Hans, Roland Arnold, Matthias Soller
A single guide RNA (sgRNA) directs Cas9 nuclease for gene-specific scission of double-stranded DNA. High Cas9 activity is essential for efficient gene editing to generate gene deletions and gene replacements by homologous recombination. However, cleavage efficiency is below 50% for more than half of randomly selected sgRNA sequences in human cell culture screens or model organisms. We used in vitro
-
A low-abundance class of Dicer-dependent siRNAs produced from a variety of features in C. elegans Genome Res. (IF 6.2) Pub Date : 2024-12-02 Thiago L. Knittel, Brooke E. Montgomery, Alex J. Tate, Ennis W. Deihl, Anastasia S. Nawrocki, Frederic J. Hoerndli, Taiowa A. Montgomery
Canonical small interfering RNAs (siRNAs) are processed from double-stranded RNA (dsRNA) by Dicer and associate with Argonautes to direct RNA silencing. In Caenorhabditis elegans, 22G-RNAs and 26G-RNAs are often referred to as siRNAs but display distinct characteristics. For example, 22G-RNAs do not originate from dsRNA and do not depend on Dicer, whereas 26G-RNAs require Dicer but derive from an atypical
-
Inferring disease progressive stages in single-cell transcriptomics using a weakly-supervised deep learning approach Genome Res. (IF 6.2) Pub Date : 2024-12-02 Fabien Wehbe, Levi Adams, Jordan Babadoudou, Samantha Yuen, Yoon-Seong Kim, Yoshiaki Tanaka
Application of single-cell/nucleus genomic sequencing to patient-derived tissues offers potential solutions to delineate disease mechanisms in humans. However, individual cells in patient-derived tissues are in different pathological stages, and hence such cellular variability impedes subsequent differential gene expression analyses. To overcome this heterogeneity, we present a novel deep learning
-
The rate and spectrum of new mutations in mice inferred by long-read sequencing Genome Res. (IF 6.2) Pub Date : 2024-12-02 Eugenio López-Cortegano, Jobran Chebib, Anika Jonas, Anastasia Vock, Sven Künzel, Peter D. Keightley, Diethard Tautz
All forms of genetic variation originate from new mutations, making it crucial to understand their rates and mechanisms. Here, we use long-read PacBio sequencing to investigate de novo mutations that accumulated in 12 inbred mouse lines derived from three commonly used inbred strains (C3H, C57BL/6, and FVB) maintained for 8-15 generations in a mutation accumulation (MA) experiment. We built chromosome-level
-
Single-nucleus CUT&RUN elucidates the function of intrinsic and genomics-driven epigenetic heterogeneity in head and neck cancer progression Genome Res. (IF 6.2) Pub Date : 2024-12-02 Howard Womersley, Daniel Muliaditan, Ramanuj DasGupta, Lih Feng Cheow
Interrogating regulatory epigenetic alterations during tumor progression at the resolution of single cells has remained an understudied area of research. Here we developed a highly sensitive single-nucleus CUT&RUN (snCUT&RUN) assay to profile histone modifications in isogenic primary, metastatic, and cisplatin-resistant head and neck squamous cell carcinoma (HNSCC) patient-derived tumor cell lines
-
Analysis of a cell-free DNA-based cancer screening cohort links fragmentomic profiles, nuclease levels, and plasma DNA concentrations Genome Res. (IF 6.2) Pub Date : 2024-11-27 Yasine Malki, Guannan Kang, Wai Kei Jacky Lam, Qing Zhou, Suk Hang Cheng, Peter Pak Hang Cheung, Jinyue Bai, Ming Lok Chan, Chui Ting Lee, Wenlei Peng, Yiqiong Zhang, Wanxia Gai, Wing Sum Winsome Wong, Mary Jane Lizhen Ma, Wenshuo Li, Xinzhou Xu, Zhuoran Gao, Irene Oi Lin Tse, Huimin Shang, Lok Yee Lois Choy, Peiyong Jiang, Kwan Chee Allen Chan, Yuk Ming Dennis Lo
The concentration of circulating cell-free DNA (cfDNA) in plasma is an important determinant of the robustness of liquid biopsies. However, biological mechanisms that lead to inter-individual differences in cfDNA concentrations remain unexplored. The concentration of plasma cfDNA is governed by an interplay between its release and clearance. We hypothesize that cfDNA clearance by nucleases might be
-
Chimeric mitochondrial RNA transcripts predict mitochondrial genome deletion mutations in mitochondrial genetic diseases and aging Genome Res. (IF 6.2) Pub Date : 2024-11-27 Amy R Vandiver, Allen Herbst, Paul Stothard, Jonathan Wanagat
While it is well understood that mitochondrial DNA (mtDNA) deletion mutations cause incurable diseases and contribute to aging, little is known about the transcriptional products that arise from these DNA structural variants. We hypothesized that mitochondrial genomes containing deletion mutations express chimeric mitochondrial RNAs. To test this, we analyzed human and rat RNA sequencing data to identify
-
A deconvolution framework that uses single-cell sequencing plus a small benchmark dataset for accurate analysis of cell type ratios in complex tissue samples Genome Res. (IF 6.2) Pub Date : 2024-11-25 Shuai Guo, Xiaoqian Liu, Xuesen Cheng, Yujie Jiang, Shuangxi Ji, Qingnan Liang, Andrew Koval, Yumei Li, Leah A. Owen, Ivana K. Kim, Ana Aparicio, Sanghoon Lee, Anil K. Sood, Scott Kopetz, John Paul Shen, John N. Weinstein, Margaret M. DeAngelis, Rui Chen, Wenyi Wang
Bulk deconvolution with single-cell/nucleus RNA-seq data is critical for understanding heterogeneity in complex biological samples, yet the technological discrepancy across sequencing platforms limits deconvolution accuracy. To address this, we utilize an experimental design to match inter-platform biological signals, hence revealing the technological discrepancy, and then develop a deconvolution framework
-
Global identification of mammalian host and nested gene pairs reveal tissue-specific transcriptional interplay Genome Res. (IF 6.2) Pub Date : 2024-11-22 Bertille Montibus, James A. Cain, Rocio T. Martinez-Nunez, Rebecca J. Oakey
Nucleotide sequences along a gene provide instructions to transcriptional and cotranscriptional machinery allowing genome expansion into the transcriptome. Nucleotide sequence can often be shared between two genes and in some occurrences, a gene is located completely within a different gene; these are known as host/nested gene pairs. In these instances, if both genes are transcribed, overlap can result
-
Convergent relaxation of molecular constraint in herbivores reveals the changing role of liver and kidney functions across mammalian diets Genome Res. (IF 6.2) Pub Date : 2024-11-22 Matthew D. Pollard, Wynn K. Meyer, Emily E. Puckett
Mammalia comprises a great diversity of diet types and associated adaptations. An understanding of the genomic mechanisms underlying these adaptations may offer insights for improving human health. Comparative genomic studies of diet that employ taxonomically restricted analyses or simplified diet classifications may suffer reduced power to detect molecular convergence associated with diet evolution
-
Advancements in prospective single-cell lineage barcoding and their applications in research Genome Res. (IF 6.2) Pub Date : 2024-11-21 Xiaoli Zhang, Yirui Huang, Yajing Yang, Qi-En Wang, Lang Li
Single-cell lineage tracing (scLT) has emerged as a powerful tool, providing unparalleled resolution to investigate cellular dynamics, fate determination, and the underlying molecular mechanisms. This review thoroughly examines the latest prospective lineage DNA barcode tracing technologies. It further highlights pivotal studies that leverage single-cell lentiviral integration barcoding technology
-
The chromatin landscape of the histone-possessing Bacteriovorax bacteria Genome Res. (IF 6.2) Pub Date : 2024-11-21 Georgi K. Marinov, Benjamin Doughty, Anshul Kundaje, William J Greenleaf
Histone proteins have traditionally been thought to be restricted to eukaryotes and most archaea, with eukaryotic nucleosomal histones deriving from their archaeal ancestors. In contrast, bacteria lack histones as a rule. However, histone proteins have recently been identified in a few bacterial clades, most notably the phylum Bdellovibrionota, and these histones have been proposed to exhibit a range
-
KAS-ATAC reveals the genome-wide single-stranded accessible chromatin landscape of the human genome Genome Res. (IF 6.2) Pub Date : 2024-11-21 Samuel H Kim, Georgi K. Marinov, William Greenleaf
Gene regulation in most eukaryotes involves two fundamental physical processes -- alterations in the packaging of the genome by nucleosomes, with active cis-regulatory elements (CREs) generally characterized by an open-chromatin configuration, and the activation of transcription. Mapping these physical properties and biochemical activities genome-wide -- through profiling chromatin accessibility and
-
Modeling gene interactions in polygenic prediction via geometric deep learning Genome Res. (IF 6.2) Pub Date : 2024-11-19 Han Li, Jianyang Zeng, Michael P Snyder, Sai Zhang
Polygenic risk score (PRS) is a widely-used approach for predicting individuals' genetic risk of complex diseases, playing a pivotal role in advancing precision medicine. Traditional PRS methods, predominantly following a linear structure, often fall short in capturing the intricate relationships between genotype and phenotype. In this study, we present PRS-Net, an interpretable geometric deep learning-based
-
Multisample motif discovery and visualization for tandem repeats Genome Res. (IF 6.2) Pub Date : 2024-11-13 Yaran Zhang, Marc Hulsman, Alex Salazar, Niccoló Tesi, Lydian Knoop, Sven van der Lee, Sanduni Wijesekera, Jana Krizova, Erik-Jan Kamsteeg, Henne Holstege
Tandem repeats (TRs) occupy a significant portion of the human genome and are a source of polymorphism due to variations in size and motif composition. Some of these variations have been associated with various neuropathological disorders, highlighting the clinical importance of assessing the motif structure of TRs. Moreover, assessing the TR motif variation can offer valuable insights into evolutionary
-
Multiple paralogues and recombination mechanisms contribute to the high incidence of 22q11.2 Deletion Syndrome Genome Res. (IF 6.2) Pub Date : 2024-11-13 Lisanne Vervoort, Nicolas Dierckxsens, Marta Sousa Santos, Senne Meynants, Erika Souche, Ruben Cools, Tracy Heung, Koen Devriendt, Hilde Peeters, Donna McDonald-McGinn, Ann Swillen, Jeroen Breckpot, Beverly S. Emanuel, Hilde Van Esch, Anne S. Bassett, Joris R. Vermeesch
The 22q11.2 deletion syndrome (22q11.2DS) is the most common microdeletion disorder. Why the incidence of 22q11.2DS is much greater than that of other genomic disorders remains unknown. Short read sequencing cannot resolve the complex segmental duplicon structure to provide direct confirmation of the hypothesis that the rearrangements are caused by nonallelic homologous recombination between the low
-
High-quality sika deer omics data and integrative analysis reveal genic and cellular regulation of antler regeneration Genome Res. (IF 6.2) Pub Date : 2024-11-14 Zihe Li, Ziyu Xu, Lei Zhu, Tao Qin, Jinrui Ma, Zhanying Feng, Huishan Yue, Qing Guan, Botong Zhou, Ge Han, Guokun Zhang, Chunyi Li, Shuaijun Jia, Qiang Qiu, Dingjun Hao, Yong Wang, Wen Wang
Antler is the only organ that can fully regenerate annually in mammals. However, the regulatory pattern and mechanism of gene expression and cell differentiation during this process remain largely unknown. Here, we obtain comprehensive assembly and gene annotation of the sika deer (Cervus nippon) genome. Together with large-scale chromatin accessibility and gene expression data, we construct gene regulatory
-
ISWI1 complex proteins facilitate developmental genome editing in Paramecium Genome Res. (IF 6.2) Pub Date : 2024-11-14 Aditi Singh, Lilia Häußermann, Christiane Emmerich, Emily Nischwitz, Brandon KB Seah, Falk Butter, Mariusz Nowacki, Estienne C. Swart
One of the most extensive forms of natural genome editing occurs in ciliates, a group of microbial eukaryotes. Ciliate germline and somatic genomes are contained in distinct nuclei within the same cell. During the massive reorganization process of somatic genome development, ciliates eliminate tens of thousands of DNA sequences from a germline genome copy. Recently, we showed that the chromatin remodeler
-
Understanding isoform expression by pairing long-read sequencing with single-cell and spatial transcriptomics Genome Res. (IF 6.2) Pub Date : 2024-11-01 Natan Belchikov, Justine Hsu, Xiang Jennie Li, Julien Jarroux, Wen Hu, Anoushka Joglekar, Hagen U. Tilgner
RNA isoform diversity, produced via alternative splicing, and alternative usage of transcription start and poly(A) sites, results in varied transcripts being derived from the same gene. Distinct isoforms can play important biological roles, including by changing the sequences or expression levels of protein products. The first single-cell approaches to RNA sequencing—and later, spatial approaches—which
-
Challenges in identifying mRNA transcript starts and ends from long-read sequencing data Genome Res. (IF 6.2) Pub Date : 2024-11-01 Ezequiel Calvo-Roitberg, Rachel F. Daniels, Athma A. Pai
Long-read sequencing (LRS) technologies have the potential to revolutionize scientific discoveries in RNA biology through the comprehensive identification and quantification of full-length mRNA isoforms. Despite great promise, challenges remain in the widespread implementation of LRS technologies for RNA-based applications, including concerns about low coverage, high sequencing error, and robust computational
-
Leveraging the power of long reads for targeted sequencing Genome Res. (IF 6.2) Pub Date : 2024-11-01 Shruti V. Iyer, Sara Goodwin, William Richard McCombie
Long-read sequencing technologies have improved the contiguity and, as a result, the quality of genome assemblies by generating reads long enough to span and resolve complex or repetitive regions of the genome. Several groups have shown the power of long reads in detecting thousands of genomic and epigenomic features that were previously missed by short-read sequencing approaches. While these studies
-
Revolutionizing genomics and medicine—one long molecule at a time Genome Res. (IF 6.2) Pub Date : 2024-11-01 Ana Conesa, Alexander Hoischen, Fritz J. Sedlazeck
Long-read sequencing (LRS) has matured, and the dramatically increased accuracy, ever-increasing throughput, and access now allow new and advanced studies even at scale. This Special Issue of Genome Research on “Long-read DNA and RNA Sequencing Applications in Biology and Medicine” garnered a record number of submissions, reflecting both the intense and broad interest in the technologies and the next
-
Haplotype-resolved genome and population genomics of the threatened garden dormouse in Europe Genome Res. (IF 6.2) Pub Date : 2024-11-01 Paige A. Byerly, Alina von Thaden, Evgeny Leushkin, Leon Hilgers, Shenglin Liu, Sven Winter, Tilman Schell, Charlotte Gerheim, Alexander Ben Hamadou, Carola Greve, Christian Betz, Hanno J. Bolz, Sven Büchner, Johannes Lang, Holger Meinig, Evax Marie Famira-Parcsetich, Sarah P. Stubbe, Alice Mouton, Sandro Bertolino, Goedele Verbeylen, Thomas Briner, Lídia Freixas, Lorenzo Vinciguerra, Sarah A. Mueller
Genomic resources are important for evaluating genetic diversity and supporting conservation efforts. The garden dormouse (Eliomys quercinus) is a small rodent that has experienced one of the most severe modern population declines in Europe. We present a high-quality haplotype-resolved reference genome for the garden dormouse, and combine comprehensive short and long-read transcriptomics data sets
-
Construction and evaluation of a new rat reference genome assembly, GRCr8, from long reads and long-range scaffolding Genome Res. (IF 6.2) Pub Date : 2024-11-01 Kai Li, Melissa L. Smith, J. Chris Blazier, Kelli J. Kochan, Jonathan M.D. Wood, Kerstin Howe, Anne E. Kwitek, Melinda R. Dwinell, Hao Chen, Julia L. Ciosek, Patrick Masterson, Terence D. Murphy, Theodore S. Kalbfleisch, Peter A. Doris
We report the construction and analysis of a new reference genome assembly for Rattus norvegicus, the laboratory rat, a widely used experimental animal model organism. The assembly has been adopted as the rat reference assembly by the Genome Reference Consortium and is named GRCr8. The assembly has employed 40× Pacific Biosciences (PacBio) HiFi sequencing coverage and scaffolding using optical mapping
-
Gapless assembly of complete human and plant chromosomes using only nanopore sequencing Genome Res. (IF 6.2) Pub Date : 2024-11-01 Sergey Koren, Zhigui Bao, Andrea Guarracino, Shujun Ou, Sara Goodwin, Katharine M. Jenike, Julian Lucas, Brandy McNulty, Jimin Park, Mikko Rautiainen, Arang Rhie, Dick Roelofs, Harrie Schneiders, Ilse Vrijenhoek, Koen Nijbroek, Olle Nordesjo, Sergey Nurk, Mike Vella, Katherine R. Lawrence, Doreen Ware, Michael C. Schatz, Erik Garrison, Sanwen Huang, William Richard McCombie, Karen H. Miga, Alexander
The combination of ultra-long (UL) Oxford Nanopore Technologies (ONT) sequencing reads with long, accurate Pacific Biosciences (PacBio) High Fidelity (HiFi) reads has enabled the completion of a human genome and spurred similar efforts to complete the genomes of many other species. However, this approach for complete, “telomere-to-telomere” genome assembly relies on multiple sequencing platforms, limiting
-
Factors impacting target-enriched long-read sequencing of resistomes and mobilomes Genome Res. (IF 6.2) Pub Date : 2024-11-01 Ilya B. Slizovskiy, Nathalie Bonin, Jonathan E. Bravo, Peter M. Ferm, Jacob Singer, Christina Boucher, Noelle R. Noyes
We investigated the efficiency of target-enriched long-read sequencing (TELSeq) for detecting antimicrobial resistance genes (ARGs) and mobile genetic elements (MGEs) within complex matrices. We aimed to overcome limitations associated with traditional antimicrobial resistance (AMR) detection methods, including short-read shotgun metagenomics, which can lack sensitivity, specificity, and the ability
-
Leveraging the T2T assembly to resolve rare and pathogenic inversions in reference genome gaps Genome Res. (IF 6.2) Pub Date : 2024-11-01 Kristine Bilgrav Saether, Jesper Eisfeldt, Jesse D. Bengtsson, Ming Yin Lun, Christopher M. Grochowski, Medhat Mahmoud, Hsiao-Tuan Chao, Jill A. Rosenfeld, Pengfei Liu, Marlene Ek, Jakob Schuy, Adam Ameur, Hongzheng Dai, Undiagnosed Diseases Network, James Paul Hwang, Fritz J. Sedlazeck, Weimin Bi, Ronit Marom, Josephine Wincent, Ann Nordgren, Claudia M.B. Carvalho, Anna Lindstrand
Chromosomal inversions (INVs) are particularly challenging to detect due to their copy-number neutral state and association with repetitive regions. Inversions represent about 1/20 of all balanced structural chromosome aberrations and can lead to disease by gene disruption or altering regulatory regions of dosage-sensitive genes in cis. Short-read genome sequencing (srGS) can only resolve ∼70% of cytogenetically
-
Resolving complex duplication variants in autism spectrum disorder using long-read genome sequencing Genome Res. (IF 6.2) Pub Date : 2024-11-01 Jesper Eisfeldt, Edward J. Higginbotham, Felix Lenner, Jennifer Howe, Bridget A. Fernandez, Anna Lindstrand, Stephen W. Scherer, Lars Feuk
Rare or de novo structural variation, primarily in the form of copy number variants, is detected in 5%–10% of autism spectrum disorder (ASD) families. While complex structural variants involving duplications can generally be detected using microarray or short-read genome sequencing (GS), these methods frequently fail to characterize breakpoints at nucleotide resolution, requiring additional molecular
-
A national long-read sequencing study on chromosomal rearrangements uncovers hidden complexities Genome Res. (IF 6.2) Pub Date : 2024-11-01 Jesper Eisfeldt, Adam Ameur, Felix Lenner, Esmee Ten Berk de Boer, Marlene Ek, Josephine Wincent, Raquel Vaz, Jesper Ottosson, Tord Jonson, Sofie Ivarsson, Sofia Thunström, Alexandra Topa, Simon Stenberg, Anna Rohlin, Anna Sandestig, Margareta Nordling, Pia Palmebäck, Magnus Burstedt, Frida Nordin, Eva-Lena Stattin, Maria Sobol, Panagiotis Baliakas, Marie-Louise Bondeson, Ida Höijer, Kristine Bilgrav
Clinical genetic laboratories often require a comprehensive analysis of chromosomal rearrangements/structural variants (SVs), from large events like translocations and inversions to supernumerary ring/marker chromosomes and small deletions or duplications. Understanding the complexity of these events and their clinical consequences requires pinpointing breakpoint junctions and resolving the derivative
-
An optimized protocol for quality control of gene therapy vectors using nanopore direct RNA sequencing Genome Res. (IF 6.2) Pub Date : 2024-11-01 Kathleen Zeglinski, Christian Montellese, Matthew E. Ritchie, Monther Alhamdoosh, Cédric Vonarburg, Rory Bowden, Monika Jordi, Quentin Gouil, Florian Aeschimann, Arthur Hsu
Despite recent advances made toward improving the efficacy of lentiviral gene therapies, a sizeable proportion of produced vector contains an incomplete and thus potentially nonfunctional RNA genome. This can undermine gene delivery by the lentivirus as well as increase manufacturing costs and must be improved to facilitate the widespread clinical implementation of lentiviral gene therapies. Here,
-
Generation and analysis of a mouse multitissue genome annotation atlas Genome Res. (IF 6.2) Pub Date : 2024-11-01 Matthew Adams, Christopher Vollmers
Generating an accurate and complete genome annotation for an organism is complex because the cells within each tissue can express a unique set of transcript isoforms from a unique set of genes. A comprehensive genome annotation should contain information on what tissues express what transcript isoforms at what level. This tissue-level isoform information can then inform a wide range of research questions
-
Unraveling the architecture of major histocompatibility complex class II haplotypes in rhesus macaques Genome Res. (IF 6.2) Pub Date : 2024-11-01 Nanine de Groot, Marit van der Wiel, Ngoc Giang Le, Natasja G. de Groot, Jesse Bruijnesteijn, Ronald E. Bontrop
The regions in the genome that encode components of the immune system are often characterized by polymorphism, copy number variation, and segmental duplications. There is a need to thoroughly characterize these complex regions to gain insight into the impact of genomic diversity on health and disease. Here we resolve the organization of complete major histocompatibility complex (MHC) class II regions in
-
Accurate bacterial outbreak tracing with Oxford Nanopore sequencing and reduction of methylation-induced errors Genome Res. (IF 6.2) Pub Date : 2024-11-01 Mara Lohde, Gabriel E. Wagner, Johanna Dabernig-Heinz, Adrian Viehweger, Sascha D. Braun, Stefan Monecke, Celia Diezel, Claudia Stein, Mike Marquet, Ralf Ehricht, Mathias W. Pletz, Christian Brandt
Our study investigates the effectiveness of Oxford Nanopore Technologies for accurate outbreak tracing by resequencing 33 isolates of a 3-year-long Klebsiella pneumoniae outbreak with Illumina short-read sequencing data as the point of reference. We detect considerable base errors through cgMLST and phylogenetic analysis of genomes sequenced with Oxford Nanopore Technologies, leading to the false exclusion
-
Long-read genome assembly of the insect model organism Tribolium castaneum reveals spread of satellite DNA in gene-rich regions by recurrent burst events Genome Res. (IF 6.2) Pub Date : 2024-11-01 Marin Volarić, Evelin Despot-Slade, Damira Veseljak, Brankica Mravinac, Nevenka Meštrović
Eukaryotic genomes are replete with satellite DNAs (satDNAs), large stretches of tandemly repeated sequences that are mostly underrepresented in genome assemblies. Here we combined nanopore long-read sequencing with a reference-guided assembly approach to generate an improved, high-quality genome assembly, TcasONT, of the model beetle Tribolium castaneum. Enriched by 45 Mb in repetitive regions, the
-
Telomere-to-telomere assembly by preserving contained reads Genome Res. (IF 6.2) Pub Date : 2024-11-01 Sudhanva Shyam Kamath, Mehak Bindra, Debnath Pal, Chirag Jain
Automated telomere-to-telomere (T2T) de novo assembly of diploid and polyploid genomes remains a formidable task. A string graph is a commonly used assembly graph representation in the assembly algorithms. The string graph formulation employs graph simplification heuristics, which drastically reduce the count of vertices and edges. One of these heuristics involves removing the reads contained in longer
-
Detecting m6A RNA modification from nanopore sequencing using a semisupervised learning framework Genome Res. (IF 6.2) Pub Date : 2024-11-01 Haotian Teng, Marcus Stoiber, Ziv Bar-Joseph, Carl Kingsford
Direct nanopore-based RNA sequencing can be used to detect posttranscriptional base modifications, such as N6-methyladenosine (m6A) methylation, based on the electric current signals produced by the distinct chemical structures of modified bases. A key challenge is the scarcity of adequate training data with known methylation modifications. We present Xron, a hybrid encoder–decoder framework that delivers
-
Characterizing tandem repeat complexities across long-read sequencing platforms with TREAT and otter Genome Res. (IF 6.2) Pub Date : 2024-11-01 Niccoló Tesi, Alex Salazar, Yaran Zhang, Sven van der Lee, Marc Hulsman, Lydian Knoop, Sanduni Wijesekera, Jana Krizova, Anne-Fleur Schneider, Maartje Pennings, Kristel Sleegers, Erik-Jan Kamsteeg, Marcel Reinders, Henne Holstege
Tandem repeats (TRs) play important roles in genomic variation and disease risk in humans. Long-read sequencing allows for the accurate characterization of TRs; however, the underlying bioinformatics perspectives remain challenging. We present otter and TREAT: otter is a fast targeted local assembler, cross-compatible across different sequencing platforms. It is integrated in TREAT, an end-to-end workflow
-
Genomic epidemiology of carbapenem-resistant Enterobacterales at a New York City hospital over a 10-year period reveals complex plasmid-clone dynamics and evidence for frequent horizontal transfer of blaKPC Genome Res. (IF 6.2) Pub Date : 2024-11-01 Angela Gomez-Simmonds, Medini K. Annavajhala, Dwayne Seeram, Todd W. Hokunson, Heekuk Park, Anne-Catrin Uhlemann
Transmission of carbapenem-resistant Enterobacterales (CRE) in hospitals has been shown to occur through complex, multifarious networks driven by both clonal spread and horizontal transfer mediated by plasmids and other mobile genetic elements. We performed nanopore long-read sequencing on CRE isolates from a large urban hospital system to determine the overall contribution of plasmids to CRE transmission
-
Nanopore strand-specific mismatch enables de novo detection of bacterial DNA modifications Genome Res. (IF 6.2) Pub Date : 2024-11-01 Xudong Liu, Ying Ni, Lianwei Ye, Zhihao Guo, Lu Tan, Jun Li, Mengsu Yang, Sheng Chen, Runsheng Li
DNA modifications in bacteria present diverse types and distributions, playing crucial functional roles. Current methods for detecting bacterial DNA modifications via nanopore sequencing typically involve comparing raw current signals to a methylation-free control. In this study, we found that bacterial DNA modification induces errors in nanopore reads. And these errors are found only in one strand
-
High-coverage nanopore sequencing of samples from the 1000 Genomes Project to build a comprehensive catalog of human genetic variation Genome Res. (IF 6.2) Pub Date : 2024-11-01 Jonas A. Gustafson, Sophia B. Gibson, Nikhita Damaraju, Miranda P.G. Zalusky, Kendra Hoekzema, David Twesigomwe, Lei Yang, Anthony A. Snead, Phillip A. Richmond, Wouter De Coster, Nathan D. Olson, Andrea Guarracino, Qiuhui Li, Angela L. Miller, Joy Goffena, Zachary B. Anderson, Sophie H.R. Storz, Sydney A. Ward, Maisha Sinha, Claudia Gonzaga-Jauregui, Wayne E. Clarke, Anna O. Basile, André Corvelo
Fewer than half of individuals with a suspected Mendelian or monogenic condition receive a precise molecular diagnosis after comprehensive clinical genetic testing. Improvements in data quality and costs have heightened interest in using long-read sequencing (LRS) to streamline clinical genomic testing, but the absence of control data sets for variant filtering and prioritization has made tertiary
-
Long-read genome sequencing and variant reanalysis increase diagnostic yield in neurodevelopmental disorders Genome Res. (IF 6.2) Pub Date : 2024-11-01 Susan M. Hiatt, James M.J. Lawlor, Lori H. Handley, Donald R. Latner, Zachary T. Bonnstetter, Candice R. Finnila, Michelle L. Thompson, Lori Beth Boston, Melissa Williams, Ivan Rodriguez Nunez, Jerry Jenkins, Whitley V. Kelley, E. Martina Bebin, Michael A. Lopez, Anna C.E. Hurst, Bruce R. Korf, Jeremy Schmutz, Jane Grimwood, Gregory M. Cooper
Variant detection from long-read genome sequencing (lrGS) has proven to be more accurate and comprehensive than variant detection from short-read genome sequencing (srGS). However, the rate at which lrGS can increase molecular diagnostic yield for rare disease is not yet precisely characterized. We performed lrGS using Pacific Biosciences “HiFi” technology on 96 short-read-negative probands with rare
-
Chromosome-level subgenome-aware de novo assembly provides insight into Saccharomyces bayanus genome divergence after hybridization Genome Res. (IF 6.2) Pub Date : 2024-11-01 Cory Gardner, Junhao Chen, Christina Hadfield, Zhaolian Lu, David Debruin, Yu Zhan, Maureen J. Donlin, Tae-Hyuk Ahn, Zhenguo Lin
Interspecies hybridization is prevalent in various eukaryotic lineages and plays important roles in phenotypic diversification, adaptation, and speciation. To better understand the changes that occurred in the different subgenomes of a hybrid species and how they facilitate adaptation, we have completed chromosome-level de novo assemblies of all chromosomes for a recently formed hybrid yeast, Saccharomyces
-
Full-length RNA transcript sequencing traces brain isoform diversity in house mouse natural populations Genome Res. (IF 6.2) Pub Date : 2024-11-01 Wenyu Zhang, Anja Guenther, Yuanxiao Gao, Kristian Ullrich, Bruno Huettel, Aftab Ahmad, Lei Duan, Kaizong Wei, Diethard Tautz
The ability to generate multiple RNA transcript isoforms from the same gene is a general phenomenon in eukaryotes. However, the complexity and diversity of alternative isoforms in natural populations remain largely unexplored. Using a newly developed full-length transcript enrichment protocol with 5′ CAP selection, we sequenced full-length RNA transcripts of 48 individuals from outbred populations
-
Long-read RNA sequencing of archival tissues reveals novel genes and transcripts associated with clear cell renal cell carcinoma recurrence and immune evasion Genome Res. (IF 6.2) Pub Date : 2024-11-01 Joshua Lee, Elizabeth A. Snell, Joanne Brown, Charlotte E. Booth, Rosamonde E. Banks, Daniel J. Turner, Naveen S. Vasudev, Dimitris Lagos
The use of long-read direct RNA sequencing (DRS) and PCR cDNA sequencing (PCS) in clinical oncology remains limited, with no direct comparison between the two methods. We used DRS and PCS to study clear cell renal cell carcinoma (ccRCC), focusing on new transcript and gene discovery. Twelve primary ccRCC archival tumors, six from patients who went on to relapse, were analyzed. Results were validated
-
Measuring X-Chromosome inactivation skew for X-linked diseases with adaptive nanopore sequencing Genome Res. (IF 6.2) Pub Date : 2024-11-01 Sena A. Gocuk, James Lancaster, Shian Su, Jasleen K. Jolly, Thomas L. Edwards, Doron G. Hickey, Matthew E. Ritchie, Marnie E. Blewitt, Lauren N. Ayton, Quentin Gouil
X-linked genetic disorders typically affect females less severely than males owing to the presence of a second X Chromosome not carrying the deleterious variant. However, the phenotypic expression in females is highly variable, which may be explained by an allelic skew in X-Chromosome inactivation. Accurate measurement of X inactivation skew is crucial to understand and predict disease phenotype in
-
Enhanced detection of RNA modifications and read mapping with high-accuracy nanopore RNA basecalling models Genome Res. (IF 6.2) Pub Date : 2024-11-01 Gregor Diensthuber, Leszek P. Pryszcz, Laia Llovera, Morghan C. Lucas, Anna Delgado-Tejedor, Sonia Cruciani, Jean-Yves Roignant, Oguzhan Begik, Eva Maria Novoa
In recent years, nanopore direct RNA sequencing (DRS) became a valuable tool for studying the epitranscriptome, owing to its ability to detect multiple modifications within the same full-length native RNA molecules. Although RNA modifications can be identified in the form of systematic basecalling “errors” in DRS data sets, N6-methyladenosine (m6A) modifications produce relatively low “errors” compared
-
Long-read transcriptome sequencing of CLL and MDS patients uncovers molecular effects of SF3B1 mutations Genome Res. (IF 6.2) Pub Date : 2024-11-01 Alicja Pacholewska, Matthias Lienhard, Mirko Brüggemann, Heike Hänel, Lorina Bilalli, Anja Königs, Felix Heß, Kerstin Becker, Karl Köhrer, Jesko Kaiser, Holger Gohlke, Norbert Gattermann, Michael Hallek, Carmen D. Herling, Julian König, Christina Grimm, Ralf Herwig, Kathi Zarnack, Michal R. Schweiger
Mutations in splicing factor 3B subunit 1 (SF3B1) frequently occur in patients with chronic lymphocytic leukemia (CLL) and myelodysplastic syndromes (MDSs). These mutations have different effects on disease prognosis, with a beneficial effect in MDS and a worse prognosis in CLL patients. A full-length transcriptome approach can expand our knowledge on SF3B1 mutation effects on RNA splicing and its contribution
-
Long-read DNA and cDNA sequencing identify cancer-predisposing deep intronic variation in tumor-suppressor genes Genome Res. (IF 6.2) Pub Date : 2024-11-01 Suleyman Gulsuner, Amal AbuRayyan, Jessica B. Mandell, Ming K. Lee, Greta V. Bernier, Barbara M. Norquist, Sarah B. Pierce, Mary-Claire King, Tom Walsh
The vast majority of deeply intronic genomic variants are benign, but some extremely rare or private deep intronic variants lead to exonification of intronic sequence with abnormal transcriptional consequences. Damaging variants of this class are likely underreported as causes of disease for several reasons: Most clinical DNA and RNA testing does not include full intronic sequences; many of these variants
-
Visualization and analysis of medically relevant tandem repeats in nanopore sequencing of control cohorts with pathSTR Genome Res. (IF 6.2) Pub Date : 2024-11-01 Wouter De Coster, Ida Höijer, Inge Bruggeman, Svenn D'Hert, Malin Melin, Adam Ameur, Rosa Rademakers
The lack of population-scale databases hampers research and diagnostics for medically relevant tandem repeats and repeat expansions. We attempt to fill this gap using our pathSTR web tool, which leverages long-read sequencing of large cohorts to determine repeat length and sequence composition in a healthy population. The current version includes 1040 individuals of The 1000 Genomes Project cohort
-
Independent expansion, selection, and hypervariability of the TBC1D3 gene family in humans Genome Res. (IF 6.2) Pub Date : 2024-11-01 Xavi Guitart, David Porubsky, DongAhn Yoo, Max L. Dougherty, Philip C. Dishuck, Katherine M. Munson, Alexandra P. Lewis, Kendra Hoekzema, Jordan Knuth, Stephen Chang, Tomi Pastinen, Evan E. Eichler
TBC1D3 is a primate-specific gene family that has expanded in the human lineage and has been implicated in neuronal progenitor proliferation and expansion of the frontal cortex. The gene family and its expression have been challenging to investigate because it is embedded in high-identity and highly variable segmental duplications. We sequenced and assembled the gene family using long-read sequencing
-
Long-read Ribo-STAMP simultaneously measures transcription and translation with isoform resolution Genome Res. (IF 6.2) Pub Date : 2024-11-01 Pratibha Jagannatha, Alexandra T. Tankka, Daniel A. Lorenz, Tao Yu, Brian A. Yee, Kristopher W. Brannan, Cathy J. Zhou, Jason G. Underwood, Gene W. Yeo
Transcription and translation are intertwined processes in which mRNA isoforms are crucial intermediaries. However, methodological limitations in analyzing translation at the mRNA isoform level have left gaps in our understanding of critical biological processes. To address these gaps, we developed an integrated computational and experimental framework called long-read Ribo-STAMP (LR-Ribo-STAMP) that
-
DNA-m6A calling and integrated long-read epigenetic and genetic analysis with fibertools Genome Res. (IF 6.2) Pub Date : 2024-11-01 Anupama Jha, Stephanie C. Bohaczuk, Yizi Mao, Jane Ranchalis, Benjamin J. Mallory, Alan T. Min, Morgan O. Hamm, Elliott Swanson, Danilo Dubocanin, Connor Finkbeiner, Tony Li, Dale Whittington, William Stafford Noble, Andrew B. Stergachis, Mitchell R. Vollger
Long-read DNA sequencing has recently emerged as a powerful tool for studying both genetic and epigenetic architectures at single-molecule and single-nucleotide resolution. Long-read epigenetic studies encompass both the direct identification of native cytosine methylation and the identification of exogenously placed DNA N6-methyladenine (DNA-m6A). However, detecting DNA-m6A modifications using single-molecule
-
Full-resolution HLA and KIR gene annotations for human genome assemblies Genome Res. (IF 6.2) Pub Date : 2024-11-01 Ying Zhou, Li Song, Heng Li
The human leukocyte antigen (HLA) genes and the killer cell immunoglobulin-like receptor (KIR) genes are critical to immune responses and are associated with many immune-related diseases. Because they are located in highly polymorphic regions, they are difficult to study with traditional short-read alignment-based methods. Although modern long-read assemblers can often assemble these genes, using existing tools to
-
Long-read subcellular fractionation and sequencing reveals the translational fate of full-length mRNA isoforms during neuronal differentiation Genome Res. (IF 6.2) Pub Date : 2024-11-01 Alexander J. Ritter, Jolene M. Draper, Christopher Vollmers, Jeremy R. Sanford
Alternative splicing (AS) alters the cis-regulatory landscape of mRNA isoforms, leading to transcripts with distinct localization, stability, and translational efficiency. To rigorously investigate mRNA isoform-specific ribosome association, we generated subcellular fractionation and sequencing (Frac-seq) libraries using both conventional short reads and long reads from human embryonic stem cells (ESCs)