Abstract
Pixel-level segmentation of corrosion-induced damage in underground reinforced concrete (RC) structures is crucial for structural health monitoring but remains challenging because of complex backgrounds and the need for real-time models suitable for on-site inspection. To address these challenges, this paper introduces PH-UNet-Mask, a lightweight multi-task deep learning architecture designed for robust damage segmentation in underground environments. The proposed model integrates Haar wavelet downsampling (HWD) within its encoder to preserve edge details and employs a Parallel Channel-Spatial Attention (PCSA) module to adaptively refine feature representations by explicitly modelling spatial-channel dependencies. A key innovation is a multi-task joint learning framework featuring a dedicated decoder branch that generates a structural mask; the primary decoder uses this mask to suppress background noise before performing fine-grained, multi-class damage segmentation. The model was trained and validated on a newly developed RDS dataset comprising 3336 annotated images captured in real underground parking structures. Experimental results demonstrate that PH-UNet-Mask outperforms baseline and state-of-the-art segmentation models, achieving an mF1-score of 89.75% and an mIoU of 82.30%. Ablation studies validated the individual contributions of the HWD, PCSA, and background-masking components to the overall performance gain. Furthermore, the model remained robust to variations in training data size and achieved real-time inference speeds (10.5-13 FPS). This work provides an effective and computationally efficient solution for automated, pixel-level detection and classification of critical RC corrosion damage types, enabling accurate quantitative assessment and supporting proactive structural maintenance in demanding underground settings.
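For readers unfamiliar with the two headline metrics, the following is a minimal sketch of how mIoU and mF1 are commonly computed from a pixel-level confusion matrix. It assumes macro-averaging over classes, which the abstract does not confirm; the function names and the two-class example matrix are hypothetical and are not taken from the paper or the RDS dataset.

```python
def per_class_scores(confusion):
    """Return per-class IoU and F1 lists.

    confusion[i][j] = number of pixels whose true class is i and whose
    predicted class is j (hypothetical layout, assumed for this sketch).
    """
    n = len(confusion)
    ious, f1s = [], []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                          # true k, predicted other
        fp = sum(confusion[i][k] for i in range(n)) - tp     # predicted k, true other
        if tp + fp + fn == 0:
            continue  # class absent from labels and predictions: skip it
        ious.append(tp / (tp + fp + fn))
        f1s.append(2 * tp / (2 * tp + fp + fn))
    return ious, f1s


def mean_scores(confusion):
    """Macro-averaged (mIoU, mF1) over the classes that occur."""
    ious, f1s = per_class_scores(confusion)
    if not ious:
        raise ValueError("confusion matrix contains no pixels")
    return sum(ious) / len(ious), sum(f1s) / len(f1s)


# Illustrative two-class matrix (made-up counts, not results from the paper).
example = [[8, 2],
           [1, 9]]
miou, mf1 = mean_scores(example)
print(f"mIoU={miou:.4f}, mF1={mf1:.4f}")
```

For any single class, F1 = 2·IoU / (1 + IoU), so the F1 score is never lower than the IoU; this is consistent with the reported mF1 exceeding the reported mIoU.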
Keywords
RC structures, damage segmentation, deep learning, lightweight model, multi-task learning.
DOI
10.5703/1288284318102
Recommended Citation
Miao, Xu and Zhao, Yuxi, "An Architecture PH-UNet-Mask based on Multi-task Joint Learning: Exploration in Real-time Segmentation of Concrete Corrosion" (2025). International Conference on Durability of Concrete Structures. 2.
https://docs.lib.purdue.edu/icdcs/2025/slm/2