Towards blind detection of steganography in low-bit-rate speech streams
International Journal of Intelligent Systems (IF 5.0), Pub Date: 2022-09-16, DOI: 10.1002/int.23077. Authors: Congcong Sun, Hui Tian, Wojciech Mazurczyk, Chin-Chen Chang, Yiqiao Cai, Yonghong Chen
To prevent the abuse of low-bit-rate speech-based steganography from threatening cyberspace security, corresponding steganalysis approaches have been developed and have received significant attention from the research community. However, most existing steganalysis methods assume that the steganography method is known in advance, which is unrealistic in practice. In this paper, we therefore present three blind detection schemes for steganography in low-bit-rate speech streams. The first is based on mixed-sample data augmentation: it randomly selects a certain proportion of steganographic samples from the sample set of each steganographic method and combines them with the original carrier samples to form the training set, which enhances the robustness of the model. The second relies on decision fusion: a dedicated classification model is first trained for each steganography method, and a majority-voting mechanism then fuses the outputs of these models in the detection stage to give the final detection result. Unlike the other two schemes, the third designs its detection model on a self-paced ensemble, according to the distribution characteristics of speech samples. Its main idea is to fully train multiple base classifiers through multiple iterations of under-sampling and to fuse them into a powerful ensemble classifier. The steganography set is composed of samples from multiple steganography methods. In each iteration, unlike the traditional ensemble solution, the under-sampling of this set pays more attention to the steganographic samples at the decision boundary rather than selecting steganographic samples at random. These boundary samples are found using the classification hardness given by the ensemble classifier trained in the previous iteration; they are more informative and more conducive to improving the performance of the base classifiers.
The experimental results show that all three proposed schemes achieve efficient blind detection of low-bit-rate speech-based steganography, and that the scheme based on the self-paced ensemble performs best. Specifically, at an embedding rate of 30%, the accuracy of the self-paced ensemble scheme exceeds 85%, while the accuracy of the other two schemes is below 80%. Additionally, the self-paced ensemble scheme even outperforms dedicated detectors for specific steganographic methods in terms of recall for steganographic sample detection.
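Two of the mechanisms described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `majority_vote` and `select_boundary_samples` are hypothetical, the tie-breaking rule is an assumption, and "closeness of the ensemble's stego probability to 0.5" is used here only as a simple stand-in for the classification hardness that the paper uses to locate boundary samples.

```python
from typing import List, Sequence


def majority_vote(votes: Sequence[int]) -> int:
    """Fuse binary decisions (1 = stego, 0 = cover) from per-method detectors.

    Assumption: a tie is resolved in favour of "stego"; the abstract does
    not state a tie-breaking rule.
    """
    stego_votes = sum(votes)
    return 1 if 2 * stego_votes >= len(votes) else 0


def select_boundary_samples(stego_probs: Sequence[float], k: int) -> List[int]:
    """Pick the k stego samples nearest the decision boundary.

    stego_probs[i] is the probability of "stego" that the previous
    iteration's ensemble assigned to stego sample i. Assumption: distance
    from 0.5 approximates how close a sample lies to the boundary.
    """
    ranked = sorted(range(len(stego_probs)),
                    key=lambda i: abs(stego_probs[i] - 0.5))
    return sorted(ranked[:k])


if __name__ == "__main__":
    # Three dedicated detectors vote on one speech frame sequence.
    print(majority_vote([1, 0, 1]))
    # Under-sample five stego samples down to the two nearest the boundary.
    print(select_boundary_samples([0.95, 0.55, 0.10, 0.48, 0.70], 2))
```

In a full self-paced ensemble loop, the selected indices would be combined with the carrier samples to train the next base classifier, and the probabilities would be recomputed from the updated ensemble before the next iteration.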
Updated: 2022-09-16