Scientific Reports (IF 3.8), Pub Date: 2024-04-08, DOI: 10.1038/s41598-024-58573-y
Binsheng Gong, Samir Lababidi, Rebecca Kusko, Khaled Bouri, Sarah Prezek, Vishal Thovarai, Anish Prasanna, Ezekiel J Maier, Mahdi Golkaram, Xingqiang Sun, Konstantinos Kyriakidis, João Paulo Kitajima, Sayed Mohammad Ebrahim Sahraeian, Yunfei Guo, Elaine Johanson, Wendell Jones, Weida Tong, Joshua Xu
Accurate indel calling from next-generation sequencing (NGS) data is critical for clinical applications. The precisionFDA team collaborated with the U.S. Food and Drug Administration’s (FDA’s) National Center for Toxicological Research (NCTR) and successfully completed the NCTR Indel Calling from Oncopanel Sequencing Data Challenge, which evaluated the performance of indel calling pipelines. Top performers were selected based on precision, recall, and F1-score. Many other pipelines performed close to the top performers, forming a top cluster. Performance was significantly higher in high confidence regions and coding regions, and significantly lower in low complexity regions. Oncopanel capture and other issues may have affected the recall rate. Indels with higher variant allele frequency (VAF) may generally be called with higher confidence. Many of the indel calling pipelines performed well: some did well across all three oncopanels, while others were better for a specific oncopanel. Indel calling performance can be further improved by restricting calls to high confidence intervals (HCIs) and coding regions, and by excluding low complexity regions (LCRs). VAF cut-offs could also be applied, chosen according to the application.
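To make the evaluation criteria concrete, the following is a minimal Python sketch of how precision, recall, and F1-score could be computed for a set of indel calls, and how calls might be restricted to high confidence intervals, kept out of low complexity regions, and filtered by a VAF cut-off. It is an illustration under stated assumptions, not the challenge's actual scoring code: the names `IndelCall`, `in_regions`, `filter_calls`, and `precision_recall_f1` are hypothetical, an indel is assumed to match the truth set only on exact chromosome, position, reference allele, and alternate allele, and regions are assumed to be 1-based inclusive intervals.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple

# Hypothetical representations, chosen for this sketch only.
IndelKey = Tuple[str, int, str, str]   # (chrom, pos, ref, alt)
Region = Tuple[str, int, int]          # (chrom, start, end), 1-based inclusive


@dataclass(frozen=True)
class IndelCall:
    chrom: str
    pos: int
    ref: str
    alt: str
    vaf: float  # variant allele frequency in [0, 1]

    @property
    def key(self) -> IndelKey:
        return (self.chrom, self.pos, self.ref, self.alt)


def in_regions(call: IndelCall, regions: Iterable[Region]) -> bool:
    """True if the call's position falls inside any of the given regions."""
    return any(c == call.chrom and s <= call.pos <= e for c, s, e in regions)


def filter_calls(
    calls: Iterable[IndelCall],
    include: List[Region],
    exclude: List[Region],
    min_vaf: float = 0.0,
) -> List[IndelCall]:
    """Keep calls inside `include` (e.g. HCIs), outside `exclude` (e.g. LCRs),
    and with VAF at or above `min_vaf`."""
    return [
        c for c in calls
        if c.vaf >= min_vaf
        and in_regions(c, include)
        and not in_regions(c, exclude)
    ]


def precision_recall_f1(
    called: Set[IndelKey], truth: Set[IndelKey]
) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and F1; each is 0.0 when undefined."""
    tp = len(called & truth)   # called and in the truth set
    fp = len(called - truth)   # called but not in the truth set
    fn = len(truth - called)   # in the truth set but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


if __name__ == "__main__":
    # Toy data invented for illustration; not taken from the challenge.
    truth = [IndelCall("chr1", 100, "A", "AT", 0.30),
             IndelCall("chr1", 250, "GTT", "G", 0.10)]
    calls = [IndelCall("chr1", 100, "A", "AT", 0.28),
             IndelCall("chr1", 300, "T", "TA", 0.02)]
    hci = [("chr1", 1, 1000)]
    kept = filter_calls(calls, include=hci, exclude=[], min_vaf=0.05)
    print(precision_recall_f1({c.key for c in kept}, {t.key for t in truth}))
```

When calls are restricted to particular regions, the truth set should be restricted to the same regions before scoring; otherwise true indels outside those regions would be counted as false negatives and recall would be understated.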
Title: Towards accurate indel calling for oncopanel sequencing through an international pipeline competition at precisionFDA