A systematic review on detection and adaptation of concept drift in streaming data using machine learning techniques
WIREs Data Mining and Knowledge Discovery (IF 6.4), Pub Date: 2024-03-19, DOI: 10.1002/widm.1536
Shruti Arora, Rinkle Rani, Nitin Saxena
The last decade has seen massive growth in organizational data, which keeps multiplying as millions of records are updated every second. Handling such vast, continuous data is challenging and opens up many research areas. Data that flows continuously and in real time from various sources is termed streaming data. When deriving valuable statistics from data streams, a change in the underlying data distribution over time is called concept drift. Such drifts matter in a variety of disciplines, including data mining, machine learning, ubiquitous knowledge discovery, and quantitative decision theory. As a result, a substantial amount of research has studied methodologies and approaches for dealing with drift. However, the available material is scattered and lacks guidelines for selecting an effective technique for a particular application. The primary objective of this survey is to present an understanding of concept drift challenges and allied studies. It also helps researchers from diverse domains adopt detection and adaptation algorithms for concept drift in their applications. Overall, this study aims to provide deeper insight into the classification of drift types and into methods for detection and adaptation, along with their key features and limitations. It also highlights the performance metrics used to evaluate concept drift detection methods for streaming data, and it outlines future research scope by identifying gaps in the existing literature on techniques for handling concept drift.
This article is categorized under:
Algorithmic Development > Ensemble Methods
Application Areas > Data Mining Software Tools
Fundamental Concepts of Data and Knowledge > Big Data Mining
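To make the notion of concept drift concrete, the following is a minimal Python sketch of a window-based check on a synthetic stream. It is not a method taken from the survey: the function name `detect_drift`, the window size, the threshold, and the synthetic data are all assumptions chosen for illustration. The idea is simply to compare the mean of recent values against a fixed reference window.

```python
from collections import deque
from statistics import mean


def detect_drift(stream, window=50, threshold=0.5):
    """Return the index at which drift is first flagged, or None.

    Hypothetical detector (illustration only): the first `window` values are
    frozen as a reference, and drift is flagged when the mean of the most
    recent `window` values differs from the reference mean by more than
    `threshold`.
    """
    reference = []
    recent = deque(maxlen=window)
    for i, x in enumerate(stream):
        if len(reference) < window:
            # Still collecting the reference window.
            reference.append(x)
            continue
        recent.append(x)
        if len(recent) == window and abs(mean(recent) - mean(reference)) > threshold:
            return i
    return None


# Synthetic stream (assumed data): values jump from 0.0 to 1.0 at index 200,
# which models an abrupt drift in the distribution.
stream = [0.0] * 200 + [1.0] * 100
drift_at = detect_drift(stream)
print(drift_at)  # 225: flagged once more than half of the recent window has shifted
```

In this sketch the change at index 200 is flagged only at index 225, which illustrates the detection delay that a sliding-window approach introduces. A stream with no change, such as `[0.0] * 300`, yields `None`.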
Updated: 2024-03-19