Multi-level urban street representation with street-view imagery and hybrid semantic graph
ISPRS Journal of Photogrammetry and Remote Sensing (IF 10.6), Pub Date: 2024-10-18, DOI: 10.1016/j.isprsjprs.2024.09.032. Yan Zhang, Yong Li, Fan Zhang
Street-view imagery now densely covers cities. It provides a close-up perspective of the urban physical environment, enabling comprehensive perception and understanding of cities. Significant effort has gone into representing the urban physical environment from street-view imagery, and such representations have been used to study the relationships between the physical environment, human dynamics, and socioeconomic conditions. However, two key challenges remain in representing the urban physical environment of streets from street-view images for downstream tasks. First, current research focuses mainly on the proportions of visual elements within a scene, neglecting the spatial adjacency between them. Second, the spatial dependency and spatial interaction between streets have not been adequately accounted for. These limitations hinder the effective representation and understanding of urban streets. To address these challenges, we propose a dynamic graph representation framework based on dual spatial semantics. At the intra-street level, we consider the spatial adjacency relationships of visual elements: our method dynamically parses visual elements within the scene, achieving context-specific representations. At the inter-street level, we construct two spatial weight matrices by integrating spatial dependency and spatial interaction relationships. These matrices comprehensively capture the hybrid spatial relationships between streets, enhancing the model's ability to represent human dynamics and socioeconomic status. Beyond these two modules, we also provide a spatial interpretability analysis tool for downstream tasks. A case study of our research framework shows that our method improves vehicle speed and flow estimation by 2.4% and 6.4%, respectively.
This not only indicates that street-view imagery provides rich information about urban transportation but also offers a more accurate and reliable data-driven framework for urban studies. The code is available at https://github.com/yemanzhongting/HybridGraph.
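The two levels described above can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that); it assumes the intra-street graph is built by counting pixel-neighbour co-occurrences of classes in a semantic segmentation map, and that the inter-street spatial-dependency matrix uses Gaussian distance decay while the interaction matrix comes from normalized flow counts. Function names, the decay bandwidth `sigma`, and the mixing weight `alpha` are all illustrative assumptions.

```python
import numpy as np

def semantic_adjacency(seg: np.ndarray, num_classes: int) -> np.ndarray:
    """Intra-street level: adjacency counts between visual-element
    classes in a per-pixel semantic segmentation map `seg`."""
    adj = np.zeros((num_classes, num_classes), dtype=np.int64)
    # Horizontal pixel neighbours
    a, b = seg[:, :-1].ravel(), seg[:, 1:].ravel()
    np.add.at(adj, (a, b), 1)
    np.add.at(adj, (b, a), 1)
    # Vertical pixel neighbours
    a, b = seg[:-1, :].ravel(), seg[1:, :].ravel()
    np.add.at(adj, (a, b), 1)
    np.add.at(adj, (b, a), 1)
    np.fill_diagonal(adj, 0)  # drop same-class self-adjacency
    return adj

def hybrid_weights(coords: np.ndarray, flows: np.ndarray,
                   alpha: float = 0.5, sigma: float = 500.0) -> np.ndarray:
    """Inter-street level: blend a distance-based spatial-dependency
    matrix with an interaction (flow) matrix into one hybrid,
    row-normalized spatial weight matrix."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    w_dep = np.exp(-(d ** 2) / (2 * sigma ** 2))  # Gaussian distance decay
    w_int = flows / max(flows.sum(), 1)           # normalized interactions
    w = alpha * w_dep + (1 - alpha) * w_int
    np.fill_diagonal(w, 0)
    # Each street's neighbour weights sum to 1
    return w / w.sum(axis=1, keepdims=True).clip(min=1e-12)
```

The row-normalized hybrid matrix can then serve as the message-passing weights of a street-level graph model for downstream estimation tasks such as vehicle speed and flow.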
Updated: 2024-10-18