Language Integration in Remote Sensing: Tasks, datasets, and future directions
IEEE Geoscience and Remote Sensing Magazine (IF 16.2), Pub Date: 2023-10-11, DOI: 10.1109/mgrs.2023.3316438
Laila Bashmal, Yakoub Bazi, Farid Melgani, Mohamad M. Al Rahhal, Mansour Abdulaziz Al Zuair

The emerging field of vision–language models, which combines computer vision and natural language processing (NLP), has gained significant interest and exploration. This integration has opened up new research opportunities, particularly in remote sensing (RS), where it has the potential to enhance RS systems’ capabilities. In this context, this article presents a comprehensive review of more than 100 articles focusing on the integration of NLP techniques into RS understanding research. The review covers various vision–language modeling tasks, including but not limited to RS image captioning, RS text-to-image retrieval, RS visual question answering (VQA), and RS image generation. For each task, the review provides a summary of the state-of-the-art developments, including methods, evaluation metrics, datasets, and experimental results on benchmark datasets. The review is concluded by discussing the key challenges and highlighting potential research directions for future development, with the aim of inspiring further research in this important field.

Updated: 2023-10-11