The correlation between CpG methylation and gene expression is driven by sequence variants
Nature Genetics (IF 31.7), Pub Date: 2024-07-24, DOI: 10.1038/s41588-024-01851-2
Olafur Andri Stefansson 1, Brynja Dogg Sigurpalsdottir 1,2, Solvi Rognvaldsson 1, Gisli Hreinn Halldorsson 1,3, Kristinn Juliusson 1, Gardar Sveinbjornsson 1, Bjarni Gunnarsson 1, Doruk Beyter 1, Hakon Jonsson 1, Sigurjon Axel Gudjonsson 1, Thorunn Asta Olafsdottir 1,4, Saedis Saevarsdottir 1,4, Magnus Karl Magnusson 1,4, Sigrun Helga Lund 1,3, Vinicius Tragante 1, Asmundur Oddsson 1, Marteinn Thor Hardarson 1,2, Hannes Petur Eggertsson 1, Reynir L Gudmundsson 1, Sverrir Sverrisson 1, Michael L Frigge 1, Florian Zink 1, Hilma Holm 1, Hreinn Stefansson 1, Thorunn Rafnar 1, Ingileif Jonsdottir 1,4, Patrick Sulem 1, Agnar Helgason 1,5, Daniel F Gudbjartsson 1,3, Bjarni V Halldorsson 1,2, Unnur Thorsteinsdottir 1,4, Kari Stefansson 1,4

Gene promoter and enhancer sequences are bound by transcription factors and are depleted of methylated CpG sites (cytosines preceding guanines in DNA). The absence of methylated CpGs in these sequences typically correlates with increased gene expression, indicating a regulatory role for methylation. We used nanopore sequencing to determine haplotype-specific methylation rates of 15.3 million CpG units in 7,179 whole-blood genomes. We identified 189,178 methylation-depleted sequences where three or more proximal CpGs were unmethylated on at least one haplotype. A total of 77,789 methylation-depleted sequences (~41%) associated with 80,503 cis-acting sequence variants, which we termed allele-specific methylation quantitative trait loci (ASM-QTLs). RNA sequencing of 896 samples from the same blood draws used to perform nanopore sequencing showed that the ASM-QTL, that is, DNA sequence variability, drives most of the correlation found between gene expression and CpG methylation. ASM-QTLs were enriched 40.2-fold (95% confidence interval 32.2 to 49.9) among sequence variants associating with hematological traits, demonstrating that ASM-QTLs are important functional units in the noncoding genome.
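
The definition of a methylation-depleted sequence given above (three or more proximal CpGs unmethylated on at least one haplotype) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The abstract does not state what counts as "unmethylated" or "proximal", so the cutoffs `max_rate` and `max_gap`, the function name `find_depleted_runs`, and the toy data are all assumptions made for illustration only.

```python
from typing import List, Tuple


def find_depleted_runs(
    positions: List[int],
    meth_rates: List[float],
    max_rate: float = 0.2,  # ASSUMPTION: rate at or below this counts as unmethylated
    max_gap: int = 100,     # ASSUMPTION: CpGs within this many bp count as proximal
    min_cpgs: int = 3,      # from the abstract: three or more CpGs
) -> List[Tuple[int, int, int]]:
    """Return (start, end, n_cpgs) for each run of unmethylated, proximal CpGs.

    `positions` must be sorted CpG coordinates on a single haplotype and
    `meth_rates` the matching methylation rates in [0, 1].
    """
    runs: List[Tuple[int, int, int]] = []
    current: List[int] = []

    def flush() -> None:
        # Record the current run if it is long enough, then reset it.
        if len(current) >= min_cpgs:
            runs.append((current[0], current[-1], len(current)))
        current.clear()

    for pos, rate in zip(positions, meth_rates):
        if rate > max_rate:
            flush()  # a methylated CpG ends the run
            continue
        if current and pos - current[-1] > max_gap:
            flush()  # too far from the previous CpG to be proximal
        current.append(pos)
    flush()
    return runs


# Toy data for one haplotype (hypothetical coordinates and rates).
positions = [100, 120, 150, 400, 420, 430, 445, 900]
rates = [0.05, 0.10, 0.02, 0.90, 0.10, 0.00, 0.15, 0.05]
print(find_depleted_runs(positions, rates))
# [(100, 150, 3), (420, 445, 3)]
```

To mirror the "at least one haplotype" condition, the function would be run once per haplotype and a region kept if either call reports a run there. The reported share of methylation-depleted sequences with an associated variant follows directly from the counts in the abstract: 77,789 / 189,178 ≈ 41.1%.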


