TPP: Transparent Page Placement for CXL-Enabled Tiered Memory, arXiv - CS - Operating Systems

arXiv - CS - Operating Systems. Pub Date: 2022-06-06, DOI: arxiv-2206.02878
Hasan Al Maruf, Hao Wang, Abhishek Dhanotia, Johannes Weiner, Niket Agarwal, Pallab Bhattacharya, Chris Petersen, Mosharaf Chowdhury, Shobhit Kanaujia, Prakash Chauhan

With increasing memory demands for datacenter applications and the emergence of coherent interfaces like CXL that enable main memory expansion, we are about to observe a wide adoption of tiered-memory subsystems in hyperscalers. In such systems, main memory can comprise different memory technologies with varied performance characteristics. In this paper, we characterize the memory usage of a wide range of datacenter applications across the server fleet of a hyperscaler (Meta) to gain insight into applications' memory access patterns and their performance on a tiered memory system. Our characterizations show that datacenter applications can benefit from tiered memory systems, as there are opportunities to offload colder pages to slower memory tiers. Without efficient memory management, however, such systems can significantly degrade performance. We propose a novel OS-level, application-transparent page placement mechanism (TPP) for efficient memory management. TPP employs a lightweight mechanism to identify hot and cold pages and place them in appropriate memory tiers. It enables page allocation to work independently of the page reclamation logic, to which it is otherwise tightly coupled in today's Linux kernel. As a result, the local memory tier has memory headroom for new allocations. At the same time, TPP can promptly promote performance-critical hot pages trapped in the slow memory tiers to the fast tier node. Both promotion and demotion mechanisms work transparently, without any prior knowledge of an application's memory access behavior. We evaluate TPP with diverse workloads that consume significant portions of DRAM on Meta's server fleet and are sensitive to memory subsystem performance. TPP's efficient page placement improves Linux's performance by up to 18%. TPP outperforms NUMA balancing and AutoTiering, state-of-the-art solutions for tiered memory, by 10-17%.
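The placement idea in the abstract can be illustrated with a toy user-space model. This is only a sketch under stated assumptions, not the kernel mechanism from the paper: the class name `TwoTierMemory`, the LRU ordering used to pick cold pages, the `headroom` parameter, and the access-count `promote_threshold` are all hypothetical choices made for illustration. It shows a fast tier that demotes its coldest pages to keep free headroom for new allocations, and a slow tier whose repeatedly accessed pages are promoted back.

```python
from collections import OrderedDict


class TwoTierMemory:
    """Toy model of a fast tier (local DRAM) and a slow tier (e.g. CXL memory).

    Assumptions for illustration only: coldness is approximated by LRU order,
    and hotness by a simple access counter on slow-tier pages.
    """

    def __init__(self, fast_capacity, headroom=1, promote_threshold=2):
        if not 0 <= headroom < fast_capacity:
            raise ValueError("headroom must be in [0, fast_capacity)")
        self.fast_capacity = fast_capacity
        self.headroom = headroom                  # free fast-tier pages to keep available
        self.promote_threshold = promote_threshold
        self.fast = OrderedDict()                 # page ids, coldest (least recently used) first
        self.slow = {}                            # page id -> accesses seen while in the slow tier

    def _demote_until(self, free_target):
        """Move the coldest fast-tier pages to the slow tier until enough pages are free."""
        while self.fast and self.fast_capacity - len(self.fast) < free_target:
            page, _ = self.fast.popitem(last=False)
            self.slow[page] = 0

    def _place_in_fast(self, page):
        self._demote_until(1)                     # guarantee room for this page
        self.fast[page] = None
        self._demote_until(self.headroom)         # restore headroom for future allocations

    def allocate(self, page):
        """New pages always land in the fast tier; cold pages make way for them."""
        self._place_in_fast(page)

    def access(self, page):
        """Return the tier that served the access; promote slow pages that turn hot."""
        if page in self.fast:
            self.fast.move_to_end(page)           # mark as most recently used
            return "fast"
        if page not in self.slow:
            self.allocate(page)                   # first touch counts as an allocation
            return "fast"
        self.slow[page] += 1
        if self.slow[page] >= self.promote_threshold:
            del self.slow[page]
            self._place_in_fast(page)             # promotion may demote a colder page
        return "slow"
```

In this model, allocation never waits on the slow tier filling up: the demotion step runs right after each placement, which loosely mirrors the abstract's point that allocation should not be tied to reclamation. A real system would do this work asynchronously and with hardware or page-table access information rather than an explicit counter.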

Updated: 2022-06-08
