GARNet: Global-Aware Multi-View 3D Reconstruction Network and the Cost-Performance Tradeoff

Zhu, Zhenwei; Yang, Liying; Lin, Xuxin; Jiang, Chaohao; Li, Ning; Yang, Lin; Liang, Yanyan

doi:10.1016/j.patcog.2023.109674

arXiv:2211.02299 (cs)

[Submitted on 4 Nov 2022]

Abstract: Deep learning has made great progress in multi-view 3D reconstruction. Most mainstream solutions map views to an object's shape with a basic structure of a 2D encoder followed by a 3D decoder, and differ in how they aggregate features from several views. Among them, methods using attention-based fusion perform better and more stably than the others, but they still have an obvious shortcoming: each view predicts its merging weights largely independently, so the weights do not adapt to the global state. In this paper, we propose a global-aware attention-based fusion approach that builds the correlation between each branch and the global state to provide a comprehensive foundation for weight inference. To enhance the ability of the network, we introduce a novel loss function that supervises the overall shape and propose a dynamic two-stage training strategy that can effectively adapt to all reconstructors with attention-based fusion. Experiments on ShapeNet verify that our method outperforms existing SOTA methods while using far fewer parameters than Pix2Vox++, an algorithm of the same type. Furthermore, we propose a view-reduction method based on maximizing diversity and discuss the cost-performance tradeoff of our model, to achieve better performance when facing a large number of input views and a limited computational budget.
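
The abstract names two mechanisms without giving their details: fusion weights that depend on the global state, and view reduction by maximizing diversity. The following is a minimal sketch of both ideas, not the authors' implementation. It assumes each view is already encoded as a flat feature vector, that the global state is the mean of the branch features, that each branch is scored by its scaled dot product with that mean, and that diversity is measured by Euclidean distance with a greedy farthest-point selection. The function names `global_aware_fusion` and `select_diverse_views` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def global_aware_fusion(view_feats):
    """Fuse per-view feature vectors with weights tied to a global summary.

    Assumption (not from the paper): the global state is the mean feature,
    and each branch is scored by its scaled dot product with that mean.
    Returns (fused_feature, weights).
    """
    n, d = len(view_feats), len(view_feats[0])
    global_feat = [sum(f[k] for f in view_feats) / n for k in range(d)]
    # Score each branch by its correlation with the global state.
    scores = [
        sum(x * g for x, g in zip(f, global_feat)) / math.sqrt(d)
        for f in view_feats
    ]
    weights = softmax(scores)  # normalised across views
    fused = [sum(w * f[k] for w, f in zip(weights, view_feats)) for k in range(d)]
    return fused, weights


def select_diverse_views(view_feats, k):
    """Greedily pick up to k view indices that are far apart in feature space.

    Assumption: diversity is the Euclidean distance between feature vectors,
    and selection starts from view 0 (farthest-point sampling).
    """
    k = min(k, len(view_feats))
    if k <= 0:
        return []
    selected = [0]
    while len(selected) < k:
        remaining = [i for i in range(len(view_feats)) if i not in selected]
        # Pick the view whose nearest already-selected view is farthest away.
        best = max(
            remaining,
            key=lambda i: min(math.dist(view_feats[i], view_feats[j]) for j in selected),
        )
        selected.append(best)
    return selected


if __name__ == "__main__":
    feats = [[0.0, 0.0], [0.1, 0.0], [10.0, 0.0], [5.0, 0.0]]
    kept = select_diverse_views(feats, 3)
    fused, weights = global_aware_fusion([feats[i] for i in kept])
    print(kept, weights, fused)
```

In this sketch the near-duplicate view is dropped first, which reflects the general motivation for diversity-based reduction: redundant views add computational cost while contributing little new information. A trained model would replace the fixed dot-product score with learned layers.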

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2211.02299 [cs.CV]
	(or arXiv:2211.02299v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2211.02299
Related DOI:	https://doi.org/10.1016/j.patcog.2023.109674

Submission history

From: Zhenwei Zhu
[v1] Fri, 4 Nov 2022 07:45:19 UTC (5,918 KB)
