Semantic Segmentation for Remote Sensing Images Based on Joint Global Feature Modeling and Local Feature Compensation

张建昌; 徐伟铭; 余大文; 梁昌钰; 陈枭

doi:10.65658/j.hndk.2025101501


Abstract

Abstract: High-resolution remote sensing image semantic segmentation commonly suffers from insufficient representation of local detail and inefficient global modeling. To address both problems, this paper proposes RMUNet, a segmentation network that combines a convolutional neural network with a state space model. The model uses a lightweight ResNet18 as the encoder and builds the decoder from Visual State Space Blocks (VSSBlock) to model global context efficiently. A Local Feature Compensation Module (LFCM) is designed to strengthen the perception of fine-grained semantic information. To mitigate the semantic bias that tends to arise when shallow and deep features are fused, a Cross-level Fusion Attention Module (CFAM) is further proposed so that spatial and semantic information work together effectively. Experimental results show that RMUNet achieves mean Intersection-over-Union (mIoU) scores of 83.78%, 87.09%, and 52.85% on the Vaihingen, Potsdam, and LoveDA datasets, respectively, outperforming existing mainstream methods on all three. While maintaining low computational complexity, the model significantly improves feature representation and segmentation accuracy, offering a practical solution that balances performance and efficiency for the interpretation of high-resolution remote sensing imagery.
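For readers unfamiliar with the reported metric, the following is a minimal sketch of how mean Intersection-over-Union is commonly computed from per-pixel class labels. It is an illustration only, not the authors' evaluation code: the function name `mean_iou`, the flat label lists, and the choice to average only over classes that appear in the prediction or the ground truth are all assumptions, and the abstract does not state which classes each dataset's evaluation includes or excludes.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Illustrative mIoU over flattened per-pixel labels (hypothetical helper)."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of pixels")

    # confusion[t][p] counts pixels whose true class is t and predicted class is p
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fp = sum(confusion[t][c] for t in range(num_classes)) - tp
        fn = sum(confusion[c]) - tp
        union = tp + fp + fn
        # Assumption: classes absent from both prediction and ground truth are skipped
        if union > 0:
            ious.append(tp / union)

    return sum(ious) / len(ious) if ious else 0.0
```

In this sketch, a class's IoU is the number of correctly labeled pixels divided by the number of pixels assigned to that class by either the prediction or the ground truth, and mIoU is the unweighted mean of those per-class values.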
