Generating, modeling and evaluating a large-scale set of CRISPR/Cas9 off-target sites with bulges
Nucleic Acids Research (IF 16.6), Pub Date: 2024-05-30, DOI: 10.1093/nar/gkae428
Ofir Yaish, Yaron Orenstein

The CRISPR/Cas9 system is a highly accurate gene-editing technique, but it can also act on unintended off-target sites (OTS). Consequently, many high-throughput assays have been developed to measure OTS genome-wide, and their data have been used to train machine-learning models to predict OTS. However, these models are inaccurate on OTS with bulges, because far less data is available for them than for OTS without bulges. Recently, CHANGE-seq, a new in vitro technique for detecting OTS, was used to produce a dataset of unprecedented scale and quality. The same study also produced in cellula GUIDE-seq experiments, but none of these GUIDE-seq experiments included bulges. Here, we generated the most comprehensive GUIDE-seq dataset with bulges, and trained and evaluated state-of-the-art machine-learning models that consider OTS with bulges. We first reprocessed the publicly available raw experimental data of the CHANGE-seq study to generate 20 new GUIDE-seq experiments and to identify hundreds of OTS with bulges among the original and new GUIDE-seq experiments. We then trained multiple machine-learning models and demonstrated their state-of-the-art performance both in vitro and in cellula, over all OTS and when focusing on OTS with bulges. Last, we visualized the key features learned by our models on OTS with bulges in a unique representation.

Updated: 2024-05-30