Single-stream CNN with Learnable Architecture for Multi-source Remote Sensing Data

Yang, Yi; Zhu, Daoye; Qu, Tengteng; Wang, Qiangyu; Ren, Fuhu; Cheng, Chengqi

doi:10.1109/TGRS.2022.3169163


arXiv:2109.06094 (cs)

[Submitted on 13 Sep 2021 (v1), last revised 7 Feb 2022 (this version, v2)]

Abstract:In this paper, we propose an efficient and generalizable framework based on a deep convolutional neural network (CNN) for joint classification of multi-source remote sensing data. While recent methods are mostly based on multi-stream architectures, we use group convolution to construct equivalent network architectures efficiently within a single-stream network. We further adopt and improve dynamic grouping convolution (DGConv) to make the group convolution hyperparameters, and thus the overall network architecture, learnable during network training. The proposed method can therefore, in theory, adapt any modern CNN model to any multi-source remote sensing data set, and can potentially avoid sub-optimal solutions caused by manually chosen architecture hyperparameters. In the experiments, the proposed method is applied to ResNet and UNet, and the adjusted networks are verified on three very diverse benchmark data sets (i.e., Houston2018 data, Berlin data, and MUUFL data). Experimental results demonstrate the effectiveness of the proposed single-stream CNNs; in particular, ResNet18-DGConv improves the state-of-the-art classification overall accuracy (OA) on the HS-SAR Berlin data set from $62.23\%$ to $68.21\%$. The experiments yield two interesting findings. First, using DGConv generally reduces test OA variance. Second, a multi-stream architecture is harmful to model performance if imposed on the first few layers, but becomes beneficial if applied to deeper layers. Altogether, the findings imply that the multi-stream architecture, instead of being a strictly necessary component in deep learning models for multi-source remote sensing data, essentially plays the role of a model regularizer. Our code is publicly available at this https URL. We hope our work can inspire novel research in the future.
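The claim that group convolution can reproduce a multi-stream architecture inside a single-stream network can be illustrated with a toy example. The sketch below is not the authors' implementation: it uses a pointwise (1x1) convolution at a single pixel in pure Python, and every name and value in it (`pointwise_conv`, `block_diagonal_mask`, `masked_conv`, the channel counts, the weights) is a hypothetical choice made for illustration. It shows that masking a dense weight matrix with a block-diagonal binary mask gives the same output as running one separate stream per source. Treating that mask as something that could change during training is one way to read "learnable architecture", though how DGConv actually parameterizes the grouping is not described in the abstract.

```python
# Illustrative sketch only: group convolution as a masked dense convolution.
# Pure Python, one pixel, 1x1 kernels. All names and values are hypothetical.

def pointwise_conv(weights, x):
    """Dense 1x1 convolution at one pixel: y[o] = sum_i weights[o][i] * x[i]."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def block_diagonal_mask(out_sizes, in_sizes):
    """Binary mask in which output group g is connected only to input group g."""
    mask = []
    for g, n_out in enumerate(out_sizes):
        for _ in range(n_out):
            row = []
            for h, n_in in enumerate(in_sizes):
                row.extend([1 if g == h else 0] * n_in)
            mask.append(row)
    return mask

def masked_conv(weights, mask, x):
    """Apply the mask elementwise to the weights, then convolve."""
    masked = [[w * m for w, m in zip(w_row, m_row)]
              for w_row, m_row in zip(weights, mask)]
    return pointwise_conv(masked, x)

# Toy inputs at one pixel: 3 channels from one source, 2 from another.
hs = [1.0, 2.0, 3.0]
sar = [4.0, 5.0]

# Multi-stream view: each source has its own small convolution.
w_hs = [[1.0, 0.0, -1.0],
        [0.5, 0.5, 0.5]]
w_sar = [[2.0, -1.0]]
two_stream = pointwise_conv(w_hs, hs) + pointwise_conv(w_sar, sar)

# Single-stream view: one dense layer over the concatenated channels.
# The 9.0 entries are cross-source weights that the mask switches off.
dense = [[1.0, 0.0, -1.0, 9.0, 9.0],
         [0.5, 0.5, 0.5, 9.0, 9.0],
         [9.0, 9.0, 9.0, 2.0, -1.0]]
mask = block_diagonal_mask([2, 1], [3, 2])
single_stream = masked_conv(dense, mask, hs + sar)

# An all-ones mask removes the grouping and fuses the sources in this layer.
fused = masked_conv(dense, [[1] * 5 for _ in range(3)], hs + sar)
```

With the block-diagonal mask the layer behaves as two independent streams; with the all-ones mask the same layer mixes both sources. A per-layer choice between such masks corresponds to choosing where in the network the sources are kept separate and where they are fused.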

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
Cite as:	arXiv:2109.06094 [cs.CV]
	(or arXiv:2109.06094v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2109.06094
Related DOI:	https://doi.org/10.1109/TGRS.2022.3169163

Submission history

From: Yi Yang
[v1] Mon, 13 Sep 2021 16:10:41 UTC (15,344 KB)
[v2] Mon, 7 Feb 2022 04:49:08 UTC (21,698 KB)
