Computers & Security (IF 4.8), Pub Date: 2022-04-10, DOI: 10.1016/j.cose.2022.102719
Bin Li, Yijie Wang, Kele Xu, Li Cheng, Zhiquan Qin
We study the problem of active intrusion detection over network traffic streams. Existing works create clusters for known classes and manually label instances that fall outside the clusters in order to detect novel attack classes and concept drift, yet several challenges remain. First, these methods assume that different classes of network traffic lie far apart in feature space, an assumption that similar attack classes can violate. When it fails, true novel classes and concept drift go undetected and performance degrades. Second, prior works that depend on heavy computation and retraining cannot achieve efficient incremental updates over unbounded, high-speed streams of network traffic. Last, related methods rarely leverage domain knowledge in intrusion detection. To address these issues, we propose DFAID, a Density-aware and Feature-deviated Active Intrusion Detection framework over network traffic streams. We first design the mask density score and the feature deviation score to maximize the effectiveness of labeled instances, so that novel attack classes and concept drift are detected even when similar classes exist. Then, DFAID leverages robust incremental clustering structures to group instances in local regions, which eases the burden on processing speed and reduces the effect of noisy instances. Last, we present DFAID-DK, which incorporates the Domain Knowledge of temporal correlations between network attacks to correct wrong predictions. Extensive experiments on two well-known benchmarks, CIC-IDS2017 and ISCX-2012, demonstrate that DFAID and its variant DFAID-DK both achieve significant improvements over related methods in terms of F1-score (21.7% and 22.7% on average, respectively), and run an order of magnitude faster.
Title: DFAID: Density-aware and Feature-deviated Active Intrusion Detection over Network Traffic Streams
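The cluster-and-query idea that the abstract attributes to existing works (keep clusters for known classes, update them incrementally, and ask a human to label instances that fall outside every cluster) can be illustrated with a minimal Python sketch. This is not DFAID itself: the abstract does not define the mask density score or the feature deviation score, so neither is reproduced here. The names `MicroCluster`, `StreamingActiveDetector`, `radius`, and `oracle` are hypothetical, and the fixed-radius rule is an assumption chosen only for simplicity.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class MicroCluster:
    """Hypothetical per-class summary: a label, a count, and a running mean."""
    label: str
    n: int
    mean: List[float]

    def add(self, x: Sequence[float]) -> None:
        # Incremental mean update: O(d) per instance, no retraining.
        self.n += 1
        for i, v in enumerate(x):
            self.mean[i] += (v - self.mean[i]) / self.n

    def distance(self, x: Sequence[float]) -> float:
        # Euclidean distance from x to the cluster mean.
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, self.mean)))


class StreamingActiveDetector:
    """Assumed baseline: query a label only when x lies outside all clusters."""

    def __init__(self, radius: float, oracle: Callable[[Sequence[float]], str]):
        self.radius = radius      # assumption: one fixed radius for every cluster
        self.oracle = oracle      # stands in for the human analyst
        self.clusters: List[MicroCluster] = []
        self.queries = 0          # number of labels requested so far

    def process(self, x: Sequence[float]) -> str:
        nearest = min(self.clusters, key=lambda c: c.distance(x), default=None)
        if nearest is not None and nearest.distance(x) <= self.radius:
            nearest.add(x)        # absorb the instance and predict its label
            return nearest.label
        # Outside every cluster: possible novel class or drift, so ask for a label.
        label = self.oracle(x)
        self.queries += 1
        self.clusters.append(MicroCluster(label, 1, list(x)))
        return label
```

With a fixed radius, a novel attack class that lies close to a known class is absorbed into the existing cluster and never queried, which is the failure mode the abstract raises against distance-based assumptions.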