Damage‐level classification considering both correlation between image and text data and confidence of attention map
Computer-Aided Civil and Infrastructure Engineering (IF 8.5), Pub Date: 2024-11-08, DOI: 10.1111/mice.13366
Keisuke Maeda, Naoki Ogawa, Takahiro Ogawa, Miki Haseyama
In damage‐level classification, deep learning models tend to focus on regions unrelated to the classification target because of the complexities inherent in real data, such as the diversity of damage types (e.g., crack, efflorescence, and corrosion), and this degrades performance. Solving the problem requires handling both the complexity and the uncertainty of the data. This study proposes a multimodal deep learning model that focuses on damaged regions by using text data related to the damage shown in the images, such as materials and components. Furthermore, the method adjusts the effect of the attention maps on damage‐level classification according to the confidence calculated when these maps are estimated, which yields accurate damage‐level classification. The contribution is a model with an end‐to‐end multimodal attention mechanism that simultaneously considers the text data, the image data, and the confidence of the attention map. Experiments using real images validate the effectiveness of the proposed method.
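The abstract gives no formulas, so the following is only a minimal sketch of the general idea of a text-guided attention map whose influence is scaled by a confidence value. It is not the authors' architecture. Every name here (`text_guided_attention`, `attention_confidence`, `pooled_feature`), the dot-product scoring, the entropy-based confidence, and the blending rule are assumptions chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def text_guided_attention(region_feats, text_vec):
    """Attention over image regions from dot-product similarity to a text embedding.

    Assumption: the text (e.g. material or component) is already embedded in
    the same space as the image region features.
    """
    scores = [sum(r * t for r, t in zip(region, text_vec)) for region in region_feats]
    return softmax(scores)


def attention_confidence(attn):
    """Hypothetical confidence: 1 minus the normalized entropy of the map.

    A peaked map gives a value near 1, a uniform map gives a value near 0.
    """
    n = len(attn)
    if n < 2:
        return 1.0
    entropy = -sum(p * math.log(p) for p in attn if p > 0.0)
    return 1.0 - entropy / math.log(n)


def pooled_feature(region_feats, text_vec):
    """Pool region features, trusting the attention map only as far as its confidence.

    Assumed blending rule: weights = conf * attention + (1 - conf) * uniform.
    """
    attn = text_guided_attention(region_feats, text_vec)
    conf = attention_confidence(attn)
    n = len(region_feats)
    dim = len(region_feats[0])
    weights = [conf * a + (1.0 - conf) / n for a in attn]
    pooled = [
        sum(w * region[d] for w, region in zip(weights, region_feats))
        for d in range(dim)
    ]
    return pooled, attn, conf
```

In this sketch, a low-confidence map falls back toward plain average pooling, so an unreliable attention estimate has little effect on the feature passed to a classifier. A trained model would learn the embeddings and could learn the confidence estimate as well; here both are fixed for brevity.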
Updated: 2024-11-08