GPU Coroutines for Flexible Splitting and Scheduling of Rendering Tasks
ACM Transactions on Graphics (IF 7.8) Pub Date: 2024-11-19, DOI: 10.1145/3687766
Shaokun Zheng, Xin Chen, Zhong Shi, Ling-Qi Yan, Kun Xu

We introduce coroutines into GPU kernel programming, providing an automated solution for flexible splitting and scheduling of rendering tasks. This approach addresses a prevalent challenge in harnessing the power of modern GPUs for complex, imbalanced graphics workloads such as path tracing. To accommodate the SIMT execution model and latency-hiding architecture, developers usually have to decompose a monolithic mega-kernel into smaller sub-tasks for improved thread coherence and reduced register pressure. However, because this decomposition involves handling intricate nested control flow and numerous interdependent program states, it can be exceedingly tedious and error-prone when performed manually. Coroutines, a building block for asynchronous programming in many high-level CPU languages, exhibit untapped potential for restructuring GPU kernels due to their versatility in control representation. By extending Luisa [Zheng et al. 2022], we implement an asymmetric, stackless coroutine model with programming language support and multiple built-in schedulers for modern GPUs. To showcase the effectiveness of our model and implementation, we examine them in different application scenarios, including path tracing, SDF rendering, and incorporation with custom passes.
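
The idea of splitting a mega-kernel at suspension points and scheduling the pieces in coherent batches can be illustrated with a small CPU-side sketch. This is not the Luisa API or the authors' implementation: the names `Frame`, `resume`, `wavefront_schedule`, the tokens `GENERATE`/`INTERSECT`/`SHADE`/`DONE`, and the toy "hit" rule are all hypothetical, chosen only to show how a stackless coroutine frame (a resume token plus the state that lives across suspensions) might be driven by a wavefront-style scheduler.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical suspension-point tokens; each marks where a frame resumes next.
GENERATE, INTERSECT, SHADE, DONE = range(4)


@dataclass
class Frame:
    """Per-thread coroutine frame: resume token plus state live across suspensions."""
    pixel: int
    token: int = GENERATE
    depth: int = 0
    radiance: float = 0.0
    throughput: float = 1.0


def resume(frame: Frame, max_depth: int) -> None:
    """Run one frame from its current token up to the next suspension point."""
    if frame.token == GENERATE:
        frame.token = INTERSECT
    elif frame.token == INTERSECT:
        # Toy stand-in for ray traversal: path length varies per pixel,
        # which mimics the imbalance of real path tracing workloads.
        hit = frame.depth < (frame.pixel % max_depth) + 1
        frame.token = SHADE if hit else DONE
    elif frame.token == SHADE:
        frame.throughput *= 0.5
        frame.radiance += frame.throughput
        frame.depth += 1
        frame.token = INTERSECT


def wavefront_schedule(num_pixels: int, max_depth: int = 4):
    """Group live frames by token and resume each group as one simulated launch."""
    frames = [Frame(pixel=i) for i in range(num_pixels)]
    launches = []  # (token, batch size) for every simulated sub-kernel launch
    while True:
        queues = defaultdict(list)
        for f in frames:
            if f.token != DONE:
                queues[f.token].append(f)
        if not queues:
            break
        for token in sorted(queues):
            batch = queues[token]
            launches.append((token, len(batch)))
            for f in batch:
                resume(f, max_depth)
    return frames, launches


if __name__ == "__main__":
    frames, launches = wavefront_schedule(8)
    print(launches)
```

In this sketch every simulated launch processes only frames waiting at the same suspension point, which is the coherence benefit the abstract attributes to kernel splitting, and each frame carries only the few fields that survive a suspension. A real GPU scheduler would additionally have to compact queues and manage frame storage in device memory, which is omitted here.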

Updated: 2024-11-19