A k-mer-based pangenome approach for cataloging seed-storage-protein genes in wheat to facilitate genotype-to-phenotype prediction and improvement of end-use quality


Molecular Plant (IF 17.1), Pub Date: 2024-05-24, DOI: 10.1016/j.molp.2024.05.006
Zhaoheng Zhang₁, Dan Liu₁, Binyong Li₁, Wenxi Wang₁, Jize Zhang₁, Mingming Xin₁, Zhaorong Hu₁, Jie Liu₁, Jinkun Du₁, Huiru Peng₁, Chenyang Hao₂, Xueyong Zhang₂, Zhongfu Ni₁, Qixin Sun₁, Weilong Guo₁, Yingyin Yao₁


Wheat is a staple food for more than 35% of the world’s population, with wheat flour used to make hundreds of baked goods. Superior end-use quality is a major breeding target; however, improving it is especially time-consuming and expensive. Furthermore, genes encoding seed-storage proteins (SSPs) form multi-gene families and are repetitive, with gaps commonplace in several genome assemblies. To overcome these barriers and efficiently identify superior wheat SSP alleles, we developed “PanSK” (Pan-SSP k-mer) for genotype-to-phenotype prediction based on an SSP-based pangenome resource. PanSK uses 29-mer sequences that represent each SSP gene at the pangenomic level to reveal untapped diversity across landraces and modern cultivars. Genome-wide association studies with k-mers identified 23 SSP genes associated with end-use quality that represent novel targets for improvement. We evaluated the effect of rye secalin genes on end-use quality and found that removal of ω-secalins from 1BL/1RS wheat translocation lines is associated with enhanced end-use quality. Finally, using machine-learning-based prediction inspired by PanSK, we predicted the quality phenotypes with high accuracy from genotypes alone. This study provides an effective approach for genome design based on SSP genes, enabling the breeding of wheat varieties with superior processing capabilities and improved end-use quality.
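The abstract does not describe how PanSK is implemented, so the following is only a minimal sketch of the general idea of representing genes by 29-mers: it collects the k-mers of each gene, keeps those found in exactly one gene, and scores how many of a gene's marker k-mers appear in a sample. The function names (`kmers`, `gene_specific_kmers`, `presence_fraction`), the use of canonical (strand-independent) k-mers, and the skipping of k-mers with ambiguous bases are all illustrative assumptions, not the authors' method.

```python
from collections import defaultdict

K = 29  # k-mer length reported in the abstract

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement.

    Assumption: strand-independent k-mers; the abstract does not say this is used.
    """
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def kmers(seq, k=K):
    """Return the set of canonical k-mers in seq, skipping any with non-ACGT bases."""
    seq = seq.upper()
    found = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            found.add(canonical(kmer))
    return found


def gene_specific_kmers(genes, k=K):
    """Map each gene name to the k-mers that occur in that gene and no other.

    `genes` is a hypothetical dict of {gene_name: nucleotide_sequence}.
    """
    owners = defaultdict(set)
    for name, seq in genes.items():
        for kmer in kmers(seq, k):
            owners[kmer].add(name)

    specific = defaultdict(set)
    for kmer, names in owners.items():
        if len(names) == 1:
            specific[next(iter(names))].add(kmer)
    return dict(specific)


def presence_fraction(marker_kmers, sample_kmers):
    """Fraction of a gene's marker k-mers that are present in a sample's k-mer set."""
    if not marker_kmers:
        return 0.0
    return len(marker_kmers & sample_kmers) / len(marker_kmers)
```

In a sketch like this, a per-gene presence fraction computed for each accession could serve as a genotype feature for association testing or for a prediction model; how PanSK actually derives its genotypes is not stated here.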


Updated: 2024-05-24
