Generalized Sparse Convolutional Neural Networks for Semantic Segmentation of Point Clouds Derived from Tri-Stereo Satellite Imagery

Bachhofner, Stefan ORCID: https://orcid.org/0000-0001-7785-2090 and Loghin, Ana-Maria ORCID: https://orcid.org/0000-0001-8995-289X and Otepka, Johannes ORCID: https://orcid.org/0000-0003-4203-8376 and Pfeifer, Norbert ORCID: https://orcid.org/0000-0002-2348-7929 and Hornacek, Michael and Siposova, Andrea ORCID: https://orcid.org/0000-0003-2908-5813 and Schmidinger, Niklas and Hornik, Kurt ORCID: https://orcid.org/0000-0003-4198-9911 and Schiller, Nikolaus and Kähler, Olaf and Hochreiter, Ronald ORCID: https://orcid.org/0000-0002-7120-8939 (2020) Generalized Sparse Convolutional Neural Networks for Semantic Segmentation of Point Clouds Derived from Tri-Stereo Satellite Imagery. Remote Sensing, 12 (8). p. 1289. ISSN 2072-4292

remotesensing-12-01289-v2.pdf
Available under License Creative Commons: Attribution 4.0 International (CC BY 4.0).

Abstract

We studied the applicability of point clouds derived from tri-stereo satellite imagery to semantic segmentation with generalized sparse convolutional neural networks, using an Austrian study area as an example. In particular, we examined whether the distorted geometric information, in addition to color, influences the performance of segmenting clutter, roads, buildings, trees, and vehicles. To this end, we trained a fully convolutional neural network that uses generalized sparse convolution once solely on 3D geometric information (i.e., a 3D point cloud derived by dense image matching), and twice on 3D geometric as well as color information. In the first of these two experiments, we did not use class weights, whereas in the second we did. We compared the results with a fully convolutional neural network that was trained on a 2D orthophoto, and with a decision tree that was trained once on hand-crafted 3D geometric features and once on hand-crafted 3D geometric as well as color features. The decision tree using hand-crafted features has been successfully applied to aerial laser scanning data in the literature. Hence, we compared our main subject of study, a representation learning technique, with another representation learning technique and with a non-representation learning technique. Our study area is located in Waldviertel, a region in Lower Austria. The territory is a hilly region covered mainly by forests, agriculture, and grasslands. Our classes of interest are heavily unbalanced. However, we did not use any data augmentation techniques to counter overfitting. For our study area, we report that combining geometric and color information improves the performance of the Generalized Sparse Convolutional Neural Network (GSCNN) only on the dominant class, which leads to a higher overall performance in our case. We also found that training the network with median class weighting partially reverts the effects of adding color. The network also started to learn the classes with lower occurrences. The fully convolutional neural network that was trained on the 2D orthophoto generally outperforms the other two approaches, with a kappa score of over 90% and an average per class accuracy of 61%. However, the decision tree trained on colors and hand-crafted geometric features has a 2% higher accuracy for roads.
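
The abstract refers to median class weighting, a kappa score, and an average per class accuracy without giving formulas. The following is a minimal sketch of how such quantities are commonly computed, not the authors' implementation. It assumes that "median class weighting" means median frequency balancing (weight = median class frequency / class frequency) and that "average per class accuracy" means the mean of per-class recall. All function names and the toy labels are hypothetical.

```python
from collections import Counter
from statistics import median


def median_frequency_weights(labels):
    """Assumed definition: weight_c = median(class frequencies) / frequency_c."""
    counts = Counter(labels)
    total = len(labels)
    freqs = {c: n / total for c, n in counts.items()}
    med = median(freqs.values())
    return {c: med / f for c, f in freqs.items()}


def confusion_matrix(y_true, y_pred, classes):
    """Rows are reference classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    cm = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        cm[index[t]][index[p]] += 1
    return cm


def cohen_kappa(cm):
    """Cohen's kappa: (observed - chance agreement) / (1 - chance agreement).

    Undefined (division by zero) if chance agreement equals 1.
    """
    n = sum(sum(row) for row in cm)
    k = len(cm)
    observed = sum(cm[i][i] for i in range(k)) / n
    expected = sum(
        sum(cm[i]) * sum(row[i] for row in cm) for i in range(k)
    ) / (n * n)
    return (observed - expected) / (1 - expected)


def average_per_class_accuracy(cm):
    """Assumed definition: mean recall over classes that occur in the reference."""
    recalls = [cm[i][i] / sum(cm[i]) for i in range(len(cm)) if sum(cm[i]) > 0]
    return sum(recalls) / len(recalls)


# Toy, made-up labels with one dominant class (not data from the study).
classes = ["clutter", "road", "building"]
y_true = ["clutter"] * 6 + ["road"] * 2 + ["building"] * 2
y_pred = ["clutter"] * 5 + ["road"] + ["road", "clutter"] + ["building"] * 2

weights = median_frequency_weights(y_true)
cm = confusion_matrix(y_true, y_pred, classes)
print(weights)
print(cohen_kappa(cm), average_per_class_accuracy(cm))
```

In this toy example the dominant class receives a weight below 1 and the rarer classes a weight of 1, which illustrates why weighting of this kind can push a network toward classes with lower occurrences.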

Item Type: Article
Keywords: 3D segmentation; deep learning; derived point clouds; tri-stereo; Very High Resolution (VHR) Satellite Imagery; 2.5D segmentation; image segmentation; semantic segmentation; machine learning
Version of the Document: Published
Depositing User: ePub Administrator
Date Deposited: 11 Sep 2020 14:26
Last Modified: 21 Sep 2020 13:26
Related URLs:
FIDES Link: https://bach.wu.ac.at/d/research/results/96624/
URI: https://epub.wu.ac.at/id/eprint/7738
