Word2Scene: Efficient remote sensing image scene generation with only one word via hybrid intelligence and low-rank representation
ISPRS Journal of Photogrammetry and Remote Sensing (IF 10.6). Pub Date: 2024-11-06. DOI: 10.1016/j.isprsjprs.2024.11.002. Jiaxin Ren, Wanzeng Liu, Jun Chen, Shunxi Yin, Yuan Tao
Current remote sensing scene generation methods face numerous challenges, such as capturing the complex interrelations among geographical features and integrating implicit expert knowledge into generative models. To address these, this paper proposes Word2Scene, an efficient method for generating remote sensing scenes using hybrid intelligence and low-rank representation, which can generate complex scenes from a single word. The approach incorporates geographic expert knowledge to optimize remote sensing scene descriptions, improving the accuracy and interpretability of the input descriptions. By employing a diffusion model built on hybrid intelligence and low-rank representation techniques, the method endows the diffusion model with an understanding of remote sensing scene concepts and significantly improves its training efficiency. This study also introduces the geographic scene holistic perceptual similarity (GSHPS), a novel evaluation metric that assesses the performance of generative models from a holistic, global perspective. Experimental results demonstrate that the proposed method outperforms existing state-of-the-art models in remote sensing scene generation quality, efficiency, and realism. Compared with the original diffusion models, LPIPS decreased by 18.52% (from 0.81 to 0.66) and GSHPS increased by 28.57% (from 0.70 to 0.90), validating the effectiveness and advancement of the method. Moreover, Word2Scene can generate remote sensing scenes not present in the training set, demonstrating strong zero-shot capability. This provides a new perspective and solution for remote sensing image scene generation, with the potential to advance the development of remote sensing, geographic information systems, and related fields. The code will be released at https://github.com/jaycecd/Word2Scene .
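The abstract attributes the training-efficiency gains to low-rank representation. The paper's own implementation is not shown here, but the general idea of low-rank adaptation (a frozen weight matrix updated through a small rank-r factorization) can be sketched as follows; all variable names and the scaling convention are illustrative assumptions, not the authors' code.

```python
import numpy as np

# Hypothetical sketch of a low-rank (LoRA-style) weight update, not the
# authors' implementation. A frozen weight matrix W is adapted as
# W + (alpha / r) * B @ A, where A and B are small trainable matrices of
# rank r << min(d_out, d_in), so far fewer parameters are trained.

rng = np.random.default_rng(0)
d_out, d_in, r = 64, 64, 4

W = rng.standard_normal((d_out, d_in))     # frozen pretrained weights
A = rng.standard_normal((r, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, r))                   # trainable up-projection (zero init)
alpha = 8.0                                # scaling factor (assumed convention)

W_adapted = W + (alpha / r) * B @ A        # equals W at init since B is zero

x = rng.standard_normal(d_in)
assert np.allclose(W @ x, W_adapted @ x)   # adapter is a no-op before training

# Trainable parameter count vs. full fine-tuning of this one matrix:
full_params = W.size
lora_params = A.size + B.size
print(full_params, lora_params)            # 4096 512
```

With rank 4 on a 64x64 matrix, only 512 of 4096 parameters are trained, which illustrates why such adapters can improve training efficiency.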
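The reported improvements are relative changes over the baseline values. A minimal check confirms the quoted percentages follow from the raw numbers in the abstract:

```python
def relative_change(before: float, after: float) -> float:
    """Relative change of a metric, as a fraction of the original value."""
    return (after - before) / before

lpips = relative_change(0.81, 0.66)   # LPIPS: lower is better
gshps = relative_change(0.70, 0.90)   # GSHPS: higher is better
print(f"LPIPS: {lpips:+.2%}")         # LPIPS: -18.52%
print(f"GSHPS: {gshps:+.2%}")         # GSHPS: +28.57%
```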
Updated: 2024-11-06