A deep learning method for building height estimation using high-resolution multi-view imagery over urban areas: A case study of 42 Chinese cities

Knowledge of building height is critical for understanding the urban development process. High-resolution optical satellite images provide fine spatial detail within urban areas, but they have not been applied to building height estimation across multiple cities, and the feasibility of mapping building height at a fine scale (< 5 m) remains understudied. Multi-view satellite images can capture vertical information about buildings, because buildings respond differently (e.g., in their spectral and structural characteristics) to different viewing angles, yet they have not been used for deep learning-based building height estimation. In this context, we introduce high-resolution ZY-3 multi-view images to estimate building height at a spatial resolution of 2.5 m. We propose a multi-spectral, multi-view, and multi-task deep network (called M3Net) for building height estimation, in which ZY-3 multi-spectral and multi-view images are fused in a multi-task learning framework. A random forest (RF) method using multi-source features is also implemented for comparison. We select 42 Chinese cities with diverse building types to test the proposed method. Results show that the M3Net obtains a lower root mean square error (RMSE) than the RF, and that including ZY-3 multi-view images significantly lowers the uncertainty of building height prediction. Comparison with two existing state-of-the-art studies further confirms the superiority of our method, especially the efficacy of the M3Net in alleviating the saturation effect in high-rise building height estimation. The M3Net also achieves a lower RMSE than the vanilla single-task and multi-task models. Moreover, the spatial-temporal transferability test indicates that the M3Net is robust to imaging conditions and building styles. A test of our method on a relatively large area (covering about 14,120 km²) further validates its scalability from the perspectives of both efficacy and quality.
The source code will be made available at https://github.com/lauraset/BuildingHeightModel.
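
The abstract compares methods by root mean square error (RMSE). As a minimal sketch of how that metric is computed, and not code taken from the authors' repository, the snippet below evaluates RMSE between predicted and reference building heights. The function name `rmse` and the sample heights are hypothetical and chosen only for illustration.

```python
import math


def rmse(predicted, reference):
    """Root mean square error between two equal-length height sequences (metres)."""
    if len(predicted) != len(reference):
        raise ValueError("sequences must have the same length")
    if not predicted:
        raise ValueError("sequences must be non-empty")
    # Square each per-sample error, average the squares, then take the root.
    squared_errors = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical heights in metres, invented for illustration only.
reference_heights = [10.0, 20.0, 30.0, 100.0]
predicted_heights = [12.0, 18.0, 33.0, 90.0]

error = rmse(predicted_heights, reference_heights)
print(f"RMSE: {error:.2f} m")
```

Because the errors are squared before averaging, a single large miss on a tall building (here the 100 m sample) dominates the result, which is why underestimating high-rise heights weighs heavily on this metric.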

Graphical Abstract of proposed framework

How to cite

Cao, Y., Huang, X., 2021. A deep learning method for building height estimation using high-resolution multi-view imagery over urban areas: A case study of 42 Chinese cities. Remote Sens. Environ. 264, 112590.
