Deep Learning for Code Intelligence: Survey, Benchmark and Toolkit, ACM Computing Surveys

ACM Computing Surveys (IF 23.8), Pub Date: 2024-05-18, DOI: 10.1145/3664597
Yao Wan₁, Zhangqian Bi₁, Yang He₂, Jianguo Zhang₃, Hongyu Zhang₄, Yulei Sui₅, Guandong Xu₆, Hai Jin₁, Philip Yu₇

Code intelligence leverages machine learning techniques to extract knowledge from extensive code corpora, with the aim of developing intelligent tools to improve the quality and productivity of computer programming. A thriving research community now focuses on code intelligence, with efforts spanning software engineering, machine learning, data mining, natural language processing, and programming languages. In this paper, we conduct a comprehensive literature review on deep learning for code intelligence, covering code representation learning, deep learning techniques, and application tasks. We also benchmark several state-of-the-art neural models for code intelligence, and provide an open-source toolkit tailored for the rapid prototyping of deep-learning-based code intelligence models. In particular, we examine existing code intelligence models from the perspective of code representation learning, and provide a comprehensive overview of the present state of code intelligence. Furthermore, we publicly release the source code and data resources to provide the community with a ready-to-use benchmark, which can facilitate the evaluation and comparison of existing and future code intelligence models (https://xcodemind.github.io). Finally, we point out several challenging and promising directions for future research.

Updated: 2024-05-18
