Complex & Intelligent Systems (IF 5.0), Pub Date: 2024-11-14, DOI: 10.1007/s40747-024-01617-7
Jun Liu, Jianxun Zhang, Ting Tang, Shengyuan Wu
The rapid development of drone technology has made drones an essential tool for acquiring aerial information. Detecting and localizing text from drone imagery greatly improves a drone's understanding of its environment and enables important tasks such as community commercial planning and autonomous navigation in intelligent environments. However, the unique viewpoint and complex scenes of drone photography pose several challenges for text detection, including diverse text shapes, large variations in scale, and background interference, which traditional methods handle poorly. To address this issue, we propose a boundary-adaptive text detection method for drone imagery. We first conduct an in-depth analysis of text characteristics from a drone's perspective. Using ResNet50 as the backbone network, we introduce the proposed Hybrid Text Attention Mechanism into the backbone to enhance the perception of text regions in the feature extraction module. We also propose a Spatial Feature Fusion Module that adaptively fuses text features of different scales, improving the model's adaptability. Furthermore, we introduce a text detail transformer by incorporating a local feature extractor into the transformer of the text detail boundary iteration optimization module. This reduces interference from complex backgrounds and allows text boundaries to be optimized and localized precisely, without complex post-processing. Extensive experiments on challenging text detection datasets and drone-based text detection datasets validate the robustness and state-of-the-art performance of the proposed method, laying a solid foundation for practical applications.
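The abstract does not give the internals of the Spatial Feature Fusion Module, so the following is only a minimal NumPy sketch of the general idea of adaptively fusing feature maps of different scales. It is not the authors' implementation. The function names `upsample_nearest` and `adaptive_fuse`, the per-scale projection `score_weights`, the softmax weighting, and the random inputs are all assumptions made for illustration.

```python
import numpy as np


def upsample_nearest(feat, out_h, out_w):
    """Nearest-neighbour resize of a (C, H, W) feature map to (C, out_h, out_w)."""
    _, h, w = feat.shape
    rows = np.arange(out_h) * h // out_h  # source row for each output row
    cols = np.arange(out_w) * w // out_w  # source column for each output column
    return feat[:, rows[:, None], cols[None, :]]


def adaptive_fuse(features, score_weights):
    """Fuse multi-scale maps with per-pixel softmax weights (hypothetical scheme).

    features:      list of S arrays shaped (C, H_i, W_i), finest scale first.
    score_weights: (S, C) array; row s projects scale s to a scalar score per pixel.
                   In a trained model this would be learned; here it is an assumption.
    """
    out_h, out_w = features[0].shape[1:]
    # Bring every scale to the resolution of the finest map: (S, C, H, W).
    resized = np.stack([upsample_nearest(f, out_h, out_w) for f in features])
    # One score per scale and pixel: (S, H, W).
    scores = np.einsum("schw,sc->shw", resized, score_weights)
    # Numerically stable softmax across the scale axis.
    scores -= scores.max(axis=0, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=0, keepdims=True)
    # Weighted sum over scales gives the fused map: (C, H, W).
    fused = (weights[:, None] * resized).sum(axis=0)
    return fused, weights


# Illustrative inputs only: three random feature maps at decreasing resolution.
rng = np.random.default_rng(0)
features = [
    rng.standard_normal((8, 32, 32)),
    rng.standard_normal((8, 16, 16)),
    rng.standard_normal((8, 8, 8)),
]
score_weights = rng.standard_normal((3, 8))
fused, weights = adaptive_fuse(features, score_weights)
print(fused.shape)
```

Because the weights are computed per pixel, each location can favour whichever scale responds most strongly there, which is one plausible way to cope with text whose size varies widely across an aerial image.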
DADNet: Arbitrary-Shaped Text Detection from a Drone's Perspective Based on Boundary Adaptation