A Systematic Literature Review on Automated Software Vulnerability Detection Using Machine Learning



ACM Computing Surveys (IF 23.8), Pub Date: 2024-10-11, DOI: 10.1145/3699711
Nima Shiri Harzevili, Alvine Boaye Belle, Junjie Wang, Song Wang, Zhen Ming (Jack) Jiang, Nachiappan Nagappan

In recent years, numerous Machine Learning (ML) models, including Deep Learning (DL) and classic ML models, have been developed to detect software vulnerabilities. However, there is a notable lack of comprehensive and systematic surveys that summarize, classify, and analyze the applications of these ML models in software vulnerability detection. This absence may lead to critical research areas being overlooked or under-represented, resulting in a skewed understanding of the current state of the art. To close this gap, we present a comprehensive and systematic literature review that characterizes the properties of ML-based software vulnerability detection systems through six major research questions (RQs). Using a custom web scraper, we extracted a set of studies from four widely used online digital libraries: ACM Digital Library, IEEE Xplore, ScienceDirect, and Google Scholar. We then manually analyzed the extracted studies to filter out work unrelated to software vulnerability detection, created taxonomies, and addressed the research questions.

Our analysis indicates a significant upward trend in applying ML techniques to software vulnerability detection over the past few years. Prominent conference venues include the International Conference on Software Engineering (ICSE), the International Symposium on Software Reliability Engineering (ISSRE), the Mining Software Repositories (MSR) conference, and the ACM International Conference on the Foundations of Software Engineering (FSE), while Information and Software Technology (IST), Computers & Security (C&S), and the Journal of Systems and Software (JSS) are the leading journal venues. Our results reveal that 39.1% of the subject studies use hybrid sources, while 37.6% use benchmark data for software vulnerability detection. Code-based data are the most commonly used data type among the subject studies, with source code being the predominant subtype. Graph-based and token-based input representations are the most popular, accounting for 57.2% and 24.6% of the subject studies, respectively. Among input embedding techniques, graph embedding and token vector embedding are the most frequently used, accounting for 32.6% and 29.7% of the subject studies, respectively. Additionally, 88.4% of the subject studies use DL models, with Recurrent Neural Networks (RNNs) and Graph Neural Networks (GNNs) being the most popular subcategories, while only 7.2% use classic ML models. Among the vulnerability types covered by the subject studies, CWE-119, CWE-20, and CWE-190 are the most frequent. Keras with a TensorFlow backend and PyTorch are the most frequently used model-building tools, each appearing in 42 studies, and Joern is the most popular code representation tool, used in 24 studies. Finally, we summarize the challenges and future directions in software vulnerability detection, providing valuable insights for researchers and practitioners in the field.


Updated: 2024-10-11
