Advances in Computational Mathematics (IF 1.7), Pub Date: 2024-08-09, DOI: 10.1007/s10444-024-10176-x. Anna Yesypenko, Per-Gunnar Martinsson
The paper describes a sparse direct solver for the linear systems that arise from the discretization of an elliptic PDE on a two-dimensional domain. The scheme decomposes the domain into thin subdomains, or “slabs,” and uses a two-level approach designed with parallelization in mind. It takes advantage of the \(\mathcal{H}^2\)-matrix structure that emerges during factorization and uses randomized algorithms to recover this structure efficiently. In contrast to multi-level nested dissection schemes, which use \(\mathcal{H}\)- or \(\mathcal{H}^2\)-matrices for a hierarchy of front sizes, SlabLU is a two-level scheme that uses \(\mathcal{H}^2\)-matrix algebra only for fronts of roughly the same size. This simplicity allows the scheme to be easily tuned for performance on modern architectures and GPUs. The solver is compatible with a range of local discretizations, and numerical experiments demonstrate its performance for regular discretizations of rectangular and curved geometries. The technique becomes particularly efficient when combined with very high-order accurate multidomain spectral collocation schemes. With this discretization, a Helmholtz problem on a domain of size \(1000\lambda \times 1000\lambda\) (for which \(N = 100\mathrm{M}\)) is solved in 15 min to 6 correct digits on a high-powered desktop with GPU acceleration.
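The two-level idea in the abstract can be illustrated with a small dense sketch. This is not the authors' implementation: it assumes a 5-point Laplacian instead of a Helmholtz operator, uses plain dense NumPy solves where SlabLU uses \(\mathcal{H}^2\)-matrix algebra and randomized compression, and the names `laplacian_2d` and `slab_solve` are hypothetical. It only shows the structure: slab interiors are eliminated independently (level 1), a reduced system on the slab interfaces is solved (level 2), and the interiors are recovered by back-substitution.

```python
import numpy as np

def laplacian_2d(nx, ny):
    """Dense 5-point Laplacian on an nx-by-ny grid; unknown (i, j) has index i*ny + j."""
    Tx = 2 * np.eye(nx) - np.eye(nx, k=1) - np.eye(nx, k=-1)
    Ty = 2 * np.eye(ny) - np.eye(ny, k=1) - np.eye(ny, k=-1)
    return np.kron(Tx, np.eye(ny)) + np.kron(np.eye(nx), Ty)

def slab_solve(A, f, nx, ny, b):
    """Two-level solve: eliminate slab interiors, then solve on the interfaces.

    Assumption for this sketch: slabs are b grid columns wide and are
    separated by single interface columns at x-indices b, 2b+1, 3b+2, ...
    """
    node_col = np.repeat(np.arange(nx), ny)      # x-index of every unknown
    on_interface = (node_col % (b + 1)) == b     # True for interface unknowns
    slab_id = node_col // (b + 1)                # slab each unknown belongs to
    B = np.flatnonzero(on_interface)

    S = A[np.ix_(B, B)]                          # becomes the Schur complement
    g = f[B].astype(float)                       # reduced right-hand side
    slabs = []
    for s in np.unique(slab_id[~on_interface]):
        I = np.flatnonzero((slab_id == s) & ~on_interface)
        # Level 1: slab interiors are decoupled, so each solve is independent
        # (this loop is the part that parallelizes across slabs).
        sol = np.linalg.solve(A[np.ix_(I, I)],
                              np.column_stack([A[np.ix_(I, B)], f[I]]))
        S = S - A[np.ix_(B, I)] @ sol[:, :-1]
        g = g - A[np.ix_(B, I)] @ sol[:, -1]
        slabs.append((I, sol))

    x = np.zeros(f.shape)
    # Level 2: reduced system on the interfaces (dense here; block
    # tridiagonal with compressible blocks in the setting of the paper).
    x[B] = np.linalg.solve(S, g)
    # Back-substitution: x_I = A_II^{-1} (f_I - A_IB x_B) for each slab.
    for I, sol in slabs:
        x[I] = sol[:, -1] - sol[:, :-1] @ x[B]
    return x

nx, ny, b = 14, 10, 4                            # three slabs, two interfaces
A = laplacian_2d(nx, ny)
f = np.random.default_rng(0).standard_normal(nx * ny)
x = slab_solve(A, f, nx, ny, b)
print(np.linalg.norm(A @ x - f))                 # residual of the slab-based solve
```

In this toy setting the reduced system is small enough to solve densely. At the problem sizes quoted in the abstract that would not be feasible, and compressing the interface blocks is the step this sketch leaves out.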
SlabLU: a two-level sparse direct solver for elliptic PDEs