Identifying runtime libraries in statically linked linux binaries, Future Generation Computer Systems


Future Generation Computer Systems (IF 6.2) Pub Date: 2024-11-13, DOI: 10.1016/j.future.2024.107602
Javier Carrillo-Mondéjar, Ricardo J. Rodríguez

Vulnerabilities in unpatched applications can originate from third-party dependencies in statically linked applications, because such applications must be relinked each time a library is updated to fix a vulnerability. Despite this, malware binaries are often statically linked to ensure they run on target platforms and to complicate malware analysis. Identifying libraries is therefore crucial in malware analysis, as it helps filter out library functions so that the analysis can focus on the malware's own functions. In this paper, we introduce MANTILLA, a system for identifying runtime libraries in statically linked Linux-based binaries. Our system uses radare2 to identify functions and extract their features (independent of the underlying architecture of the binary) through static binary analysis, and uses the K-nearest neighbors supervised machine learning model together with a majority rule to predict final values. MANTILLA is evaluated on a dataset consisting of binaries built for different architectures (MIPSeb, ARMel, Intel x86, and Intel x86-64) and different runtime libraries (uClibc, glibc, and musl), achieving very high accuracy. We also evaluate it in two case studies: first, using a dataset of binary files belonging to the binutils collection, and second, using an IoT malware dataset. In both cases, good accuracy results are obtained both in terms of runtime library detection (94.4% and 95.5%, respectively) and architecture identification (100% and 98.6%, respectively).
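
The combination of per-function K-nearest neighbors prediction and a majority rule described in the abstract can be illustrated with a minimal sketch. This is not the MANTILLA implementation: the two-dimensional feature vectors, the training set `TRAIN`, the choice of Euclidean distance, and the helper names `knn_predict` and `classify_binary` are all hypothetical and chosen only for illustration. The abstract does not specify which features are extracted with radare2 or how the classifier is configured.

```python
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """Label one feature vector by majority among its k nearest training vectors."""
    # Euclidean distance is an assumption; the abstract does not name a metric.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def classify_binary(train, function_vectors, k=3):
    """Predict a library per function, then apply a majority rule over the binary."""
    votes = Counter(knn_predict(train, vec, k) for vec in function_vectors)
    return votes.most_common(1)[0][0], votes

# Hypothetical toy training set: (feature vector, runtime library label).
TRAIN = [
    ((10.0, 2.0), "glibc"), ((11.0, 3.0), "glibc"), ((12.0, 2.5), "glibc"),
    ((3.0, 8.0), "musl"), ((2.5, 9.0), "musl"), ((3.5, 7.5), "musl"),
    ((7.0, 15.0), "uClibc"), ((6.5, 14.0), "uClibc"), ((7.5, 16.0), "uClibc"),
]

# Hypothetical feature vectors for the functions found in one unknown binary.
functions = [(10.5, 2.2), (11.5, 2.8), (3.0, 8.5), (10.8, 2.4)]

label, votes = classify_binary(TRAIN, functions)
print(label, dict(votes))
```

In this sketch a single misclassified function does not change the verdict for the whole binary, which is the practical appeal of voting over many functions rather than trusting any one prediction.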


Updated: 2024-11-13
