GWABLUP: genome-wide association assisted best linear unbiased prediction of genetic values
Genetics Selection Evolution (IF 3.6), Pub Date: 2024-03-01, DOI: 10.1186/s12711-024-00881-y
Theo Meuwissen, Leiv Sigbjorn Eikje, Arne B Gjuvsland
Since the very beginning of genomic selection, researchers have investigated methods that improve upon SNP-BLUP (single nucleotide polymorphism best linear unbiased prediction). SNP-BLUP gives equal weight to all SNPs, although many SNPs are expected not to lie near causal variants and thus not to have substantial effects. A recent approach to remedy this is to use genome-wide association study (GWAS) findings and increase the weights of the top GWAS SNPs in genomic predictions. Here, we employ a genome-wide approach to integrate GWAS results into genomic prediction, called GWABLUP.

GWABLUP consists of the following steps: (1) performing a GWAS in the training data, which results in likelihood ratios; (2) smoothing the likelihood ratios over the SNPs; (3) combining the smoothed likelihood ratio with the prior probability of SNPs having non-zero effects, which yields the posterior probability of each SNP having a non-zero effect; (4) calculating a weighted genomic relationship matrix using the posterior probabilities as weights; and (5) performing genomic prediction using the weighted genomic relationship matrix. Using high-density genotypes and milk, fat, protein and somatic cell count phenotypes on dairy cows, GWABLUP was compared to GBLUP, to GBLUP (topSNPs), which gives extra weight to GWAS top-SNPs, and to BayesGC, a Bayesian variable selection model.

The GWAS resulted in six, five, four, and three genome-wide significant peaks for milk, fat and protein yield and somatic cell count (SCC), respectively. GWABLUP genomic predictions were 10, 6, 7 and 1% more reliable than those of GBLUP for milk, fat and protein yield and SCC, respectively. GWABLUP was also more reliable than GBLUP (topSNPs) for all four traits, and more reliable than BayesGC for three of the traits. Although GWABLUP showed a tendency towards inflation bias for three of the traits, this was not statistically significant. In a multitrait analysis, GWABLUP yielded the highest accuracy for two of the traits. However, for SCC, which was relatively unrelated to the yield traits, including the GWAS results of the yield traits reduced the reliability compared to a single-trait analysis.

GWABLUP uses GWAS results to differentially weight all the SNPs in a weighted GBLUP genomic prediction analysis. GWABLUP yielded up to 10% and 13% more reliable genomic predictions than GBLUP for single-trait and multitrait analyses, respectively. Extension of GWABLUP to single-step analyses is straightforward.
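
The five steps listed in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the formulas, so everything specific here is an assumption made for illustration: a single-SNP regression likelihood ratio, a simple moving-average smoother, Bayes' rule with a prior `pi`, a VanRaden-type weighted relationship matrix, and a fixed heritability `h2`. The function names and the default values (`pi=0.01`, `window=5`, `h2=0.3`) are hypothetical and not taken from the paper.

```python
import numpy as np

def gwas_log_lr(M, y):
    """Step 1 (assumed form): single-SNP regression, log likelihood ratio per SNP."""
    n = len(y)
    yc = y - y.mean()
    Mc = M - M.mean(axis=0)
    sxx = np.maximum((Mc ** 2).sum(axis=0), 1e-12)  # guard monomorphic SNPs
    sxy = Mc.T @ yc
    ss0 = yc @ yc                # residual sum of squares without the SNP
    ss1 = ss0 - sxy ** 2 / sxx   # residual sum of squares with the SNP fitted
    return 0.5 * n * np.log(ss0 / ss1)

def smooth_lr(lr, window=5):
    """Step 2 (assumed form): edge-corrected moving average over neighbouring SNPs."""
    kernel = np.ones(window)
    num = np.convolve(lr, kernel, mode="same")
    den = np.convolve(np.ones_like(lr), kernel, mode="same")
    return num / den

def posterior_prob(lr_smooth, pi=0.01):
    """Step 3: Bayes' rule with prior probability pi of a non-zero effect."""
    return pi * lr_smooth / (pi * lr_smooth + 1.0 - pi)

def weighted_grm(M, w):
    """Step 4 (assumed form): VanRaden-type relationship matrix with SNP weights w."""
    p = M.mean(axis=0) / 2.0     # allele frequencies from 0/1/2 genotypes
    Z = M - 2.0 * p
    return (Z * w) @ Z.T / np.sum(2.0 * p * (1.0 - p) * w)

def gblup_predict(G, y_train, train, valid, h2=0.3):
    """Step 5: GBLUP of validation animals, with an assumed heritability h2."""
    lam = (1.0 - h2) / h2
    mu = y_train.mean()
    G_tt = G[np.ix_(train, train)]
    alpha = np.linalg.solve(G_tt + lam * np.eye(len(train)), y_train - mu)
    return mu + G[np.ix_(valid, train)] @ alpha

def gwablup(M, y_train, train, valid, pi=0.01, window=5, h2=0.3):
    """Chain the five steps; the GWAS uses the training animals only."""
    log_lr = gwas_log_lr(M[train], y_train)
    lr = np.exp(np.minimum(log_lr, 50.0))  # cap to avoid numerical overflow
    pp = posterior_prob(smooth_lr(lr, window), pi)
    G = weighted_grm(M, pp)
    return gblup_predict(G, y_train, train, valid, h2), pp
```

Here `M` is an animals-by-SNPs matrix of 0/1/2 genotype codes with SNPs in map order, and `train` and `valid` are index arrays into its rows. Setting all weights to 1 in `weighted_grm` gives an unweighted relationship matrix of the kind used in ordinary GBLUP, which makes the role of the posterior probabilities as SNP weights easy to compare.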
Updated: 2024-03-01