Uncertainty, Edge, and Reverse-Attention Guided Generative Adversarial Network for Automatic Building Detection in Remotely Sensed Images

Somrita Chattopadhyay, Purdue University

Abstract

Despite recent advances in deep-learning-based semantic segmentation, automatic building detection from remotely sensed imagery remains a challenging problem owing to the large variability in the appearance of buildings across the globe. Errors occur mostly around the boundaries of building footprints, in shadow areas, and when detecting buildings whose exterior surfaces have reflectivity properties very similar to those of the surrounding regions. To overcome these problems, we propose a generative adversarial network based segmentation framework with an uncertainty attention unit and a refinement module embedded in the generator. The refinement module, composed of edge and reverse attention units, is designed to refine the predicted building map. The edge attention enhances the boundary features to estimate building boundaries with greater precision, and the reverse attention allows the network to explore the features missing in the previously estimated regions. The uncertainty attention unit assists the network in resolving uncertainties in classification. As a measure of the power of our approach, as of January 5, 2022, it ranks second on DeepGlobe’s public leaderboard, even though the main focus of our approach (refinement of the building edges) does not align exactly with the metrics used for leaderboard rankings. Our overall F1-score on DeepGlobe’s challenging dataset is 0.745. We also report improvements on the previous-best results for the challenging INRIA Validation Dataset, for which our network achieves an overall IoU of 81.28% and an overall accuracy of 97.03%. Along the same lines, for the official INRIA Test Dataset, our network scores 77.86% in overall IoU and 96.41% in overall accuracy. We have also improved upon the previous best results on two other datasets. For the WHU Building Dataset, our network achieves 92.27% IoU, 96.73% precision, 95.24% recall, and 95.98% F1-score. Finally, for the Massachusetts Buildings Dataset, our network achieves a 96.19% relaxed IoU score and a 98.03% relaxed F1-score, compared with the previous best scores of 91.55% and 96.78% respectively; in terms of non-relaxed F1 and IoU scores, it outperforms the previous best scores by 2.77% and 3.89% respectively.
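
The abstract reports precision, recall, F1-score, and IoU for the WHU Building Dataset. As a reading aid, the short Python sketch below shows how these four metrics relate to one another. It assumes that all four numbers are derived from the same pooled pixel counts, which the abstract does not state. Under that assumption, F1 is the harmonic mean of precision and recall, and IoU follows from F1 as IoU = F1 / (2 - F1). The function names are illustrative and are not taken from the dissertation.

```python
# Illustrative sketch only: relates the WHU metrics quoted in the abstract.
# Assumption: precision, recall, F1, and IoU all come from the same pooled
# true-positive / false-positive / false-negative pixel counts.


def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return F1, the harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


def iou_from_f1(f1: float) -> float:
    """Return IoU (Jaccard index) from F1 (Dice score): IoU = F1 / (2 - F1)."""
    return f1 / (2.0 - f1)


# Precision and recall reported for the WHU Building Dataset.
precision, recall = 0.9673, 0.9524

f1 = f1_from_precision_recall(precision, recall)
iou = iou_from_f1(f1)

print(f"F1  = {100 * f1:.2f}%")
print(f"IoU = {100 * iou:.2f}%")
```

With the reported precision and recall as inputs, the computed values round to the 95.98% F1-score and 92.27% IoU quoted above. This is a consistency check on the reported numbers, not a description of how the dissertation evaluates its network.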

Degree

Ph.D.

Advisors

Prakash, Purdue University.

Subject Area

Artificial intelligence; Logic

