Nature Biotechnology (IF 33.1), Pub Date: 2023-08-03, DOI: 10.1038/s41587-023-01881-x
Florian V De Rop, Gert Hulselmans, Chris Flerin, Paula Soler-Vila, Albert Rafels, Valerie Christiaens, Carmen Bravo González-Blas, Domenica Marchese, Ginevra Caratù, Suresh Poovathingal, Orit Rozenblatt-Rosen, Michael Slyper, Wendy Luo, Christoph Muus, Fabiana Duarte, Rojesh Shrestha, S Tansu Bagdatli, M Ryan Corces, Lira Mamanova, Andrew Knights, Kerstin B Meyer, Ryan Mulqueen, Akram Taherinasab, Patrick Maschmeyer, Jörn Pezoldt, Camille Lucie Germaine Lambert, Marta Iglesias, Sebastián R Najle, Zain Y Dossani, Luciano G Martelotto, Zach Burkett, Ronald Lebofsky, José Ignacio Martin-Subero, Satish Pillai, Arnau Sebé-Pedrós, Bart Deplancke, Sarah A Teichmann, Leif S Ludwig, Theodore P Braun, Andrew C Adey, William J Greenleaf, Jason D Buenrostro, Aviv Regev, Stein Aerts, Holger Heyn
Single-cell assay for transposase-accessible chromatin by sequencing (scATAC-seq) has emerged as a powerful tool for dissecting regulatory landscapes and cellular heterogeneity. However, an exploration of systemic biases among scATAC-seq technologies has been lacking. In this study, we benchmark the performance of eight scATAC-seq methods across 47 experiments using human peripheral blood mononuclear cells (PBMCs) as a reference sample and develop PUMATAC, a universal preprocessing pipeline, to handle the various sequencing data formats. Our analyses reveal significant differences in sequencing library complexity and tagmentation specificity, which impact cell-type annotation, genotype demultiplexing, peak calling, differential region accessibility and transcription factor motif enrichment. Our findings underscore the importance of sample extraction, method selection, data processing and total cost of experiments, offering valuable guidance for future research. Finally, our data and analysis pipeline encompass 169,000 PBMC scATAC-seq profiles and a best-practices code repository for scATAC-seq data analysis, which are freely available to extend this benchmarking effort to future protocols.
Title: Systematic benchmarking of single-cell ATAC-sequencing protocols