Hyperbolic Chamfer Distance for Point Clouds
Fangzhou Lin^{1,2}  Yun Yue^1  Songlin Hou^{1,3}  Xuechu Yu^1  Yajun Xu^4  Kazunori D Yamada^2  Ziming Zhang^1*
^1 Worcester Polytechnic Institute, USA  ^2 Tohoku University, Japan  ^3 Dell Technologies, USA  ^4 Hokkaido University, Japan
{flin2, yyue, xyu4, shou, zzhang15}@wpi.edu, [email protected],
* Corresponding author
Abstract

Chamfer distance (CD) is a standard metric to measure the shape dissimilarity between point clouds in point cloud completion, as well as a loss function for (deep) learning. However, it is well known that CD is vulnerable to outliers, leading to a drift towards suboptimal models. In contrast to the literature, where most works address such issues in Euclidean space, we propose an extremely simple yet powerful metric for point cloud completion, namely Hyperbolic Chamfer Distance (HyperCD), that computes CD in hyperbolic space. In backpropagation, HyperCD consistently assigns higher weights to the matched point pairs with smaller Euclidean distances. In this way, good point matches are likely to be preserved while bad matches can be updated gradually, leading to better completion results. We demonstrate state-of-the-art performance on the benchmark datasets, i.e. PCN, ShapeNet-55, and ShapeNet-34, and show from visualizations that HyperCD can significantly improve surface smoothness. Code is available at: https://github.com/Zhang-VISLab.

Figure 1. Illustration of point matching in the (a) Euclidean space and (b) hyperbolic space. With the position-aware embeddings in hyperbolic space, the mismatched point pairs in Euclidean space may be corrected, leading to better completion performance.

1. Introduction

Point clouds, one of the most important data representations that can be easily acquired, play a key role in modern robotics and automation applications [54, 33, 44]. However, the raw point cloud data captured by existing 3D sensors are usually incomplete and sparse due to occlusion, limited sensor resolution and light reflection [68, 23, 31, 24, 75], which can negatively impact the performance of downstream tasks that require high-quality representations, such as point cloud segmentation and detection. In this paper, we address this issue by inferring the complete shape of an object or scene from incomplete raw point clouds. This task is referred to as point cloud completion [2].

Point cloud completion is usually non-trivial due to the unordered and unstructured characteristics of point clouds (especially those obtained from real-world environments). Recently, many (deep) learning-based approaches have been introduced for point cloud completion, ranging from supervised learning and self-supervised learning to unsupervised learning [70, 55, 35, 6, 11, 43]. Amongst them, supervised learning with a general encoder-decoder structure serves as the dominant architectural paradigm for many researchers and achieves state-of-the-art results on nearly all mainstream benchmarks. These works largely focus on the design of different structures in the encoder and decoder for more informative feature extraction and better point cloud generation [69, 62, 75, 54, 12], in Euclidean space.

Unequal Point Importance in Point Clouds. Humans often perceive the visual quality of point clouds in a non-homogeneous way by putting higher emphasis on points with certain geometric structures such as planes, edges, corners, etc. For example, point clouds with smooth surfaces and sharp edges tend to be more visually appealing than their counterparts [64, 26]. Surprisingly, this simple yet nontrivial fact about point clouds is hardly explored in the literature on point cloud completion.
For instance, Chamfer distance (CD) is a widely used metric in point cloud completion, e.g. [15, 61], to measure the shape dissimilarity between any pair of point clouds, calculated as the average distance from each point in one set to its nearest neighbor in the other. While CD can faithfully reflect the global dissimilarity between the prediction and the ground truth, the distances of all nearest-neighbor pairs between both sets are treated with equal importance (or even higher weights for outliers). Thus, CD is sensitive to outliers.

Density-aware Chamfer Distance (DCD) in Euclidean Space. To address such an equal-weighting problem in CD, Wu et al. [61] recently proposed the DCD metric by exploring the disparity of density distributions in point clouds. As illustrated in Fig. 1 (a), due to the varying point density in point clouds, denser points may easily have multiple matches, while sparser points may not. This phenomenon is exploited in DCD as a weighting mechanism (inverse to the number of matches, for balance) so that sparser points receive higher weights. Meanwhile, DCD also proposes using an exponential approximation (the first-order approximation of the Taylor expansion) of CD to overcome the sensitivity to outliers, as illustrated in Fig. 2.

Though empirically DCD seems to work better than CD for point cloud completion, it may have some serious issues:
• The density-aware mechanism in DCD may assign higher weights to sparser points. This not only tends to obtain good matches at edges and corners, but also favors matches with outliers, which leads to inferior completion.
• CD approximate functions hardly preserve good matches. From Fig. 2, CD is sensitive to outliers because its gradient assigns higher (or equal) weights to the points with larger distances. DCD can mitigate this problem, but the weights either decrease too fast (exponentially) with the ℓ1 distance or are small for good matches (even zero for perfect matches) with the ℓ2 distance.

Hyperbolic Chamfer Distance (HyperCD). To mitigate the aforementioned problems of CD for point cloud completion, in contrast to the literature, we believe that preserving good matches while gradually improving bad matches during training is the key to success in point cloud completion. We call this matching property position-aware, as it only depends on the point positions. Recently, hyperbolic space has been demonstrated as a means to represent the inherent compositional nature of point clouds using position-aware embeddings within tree-like geometric structures, e.g. [36]. Such works highly motivate us to explore CD in hyperbolic space.

As illustrated in Fig. 1 (b), hyperbolic space provides more flexibility than the Euclidean distance as a measure between points, and thus it may be possible to correct matching errors in CD. Besides, as illustrated in Fig. 2 (left), with different power functions, the hyperbolic distances (defined by arcosh) can better approximate CD than DCD. Meanwhile, as we can see, the gradient curves of y = arcosh(1 + x) and y = arcosh(1 + x^3/3) are quite similar to those of DCD, while the gradient of y = arcosh(1 + x^2/2) produces a nice curve that exactly follows what we expect from a good weighting mechanism in point cloud completion. This observation provides us with new insights for defining HyperCD.

Specifically, by matching the points with their nearest neighbors in Euclidean space (represented by the x-axis in Fig. 2), we first obtain the matches between the prediction and the ground truth, and vice versa. We then plug these Euclidean distances into arcosh to represent them in hyperbolic space. Empirically, we demonstrate that models trained with such a simple metric can significantly outperform the counterparts trained with CD as well as DCD.
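The gradient curves in Fig. 2 (right) can be reproduced numerically; the following is a small sketch of ours (not from the paper's code) that uses autograd to evaluate ∂y/∂x for each candidate metric:

```python
import torch

# Evaluate dy/dx for each candidate distance metric over x in (0, 7],
# mirroring Fig. 2 (right).
x = torch.linspace(0.01, 7.0, 200, requires_grad=True)
curves = {
    "l1-CD":            x,
    "l2-CD":            x ** 2 / 2,
    "DCD (l1)":         1 - torch.exp(-x),
    "DCD (l2)":         1 - torch.exp(-x ** 2 / 2),
    "arcosh(1+x)":      torch.acosh(1 + x),
    "arcosh(1+x^2/2)":  torch.acosh(1 + x ** 2 / 2),
    "arcosh(1+x^3/3)":  torch.acosh(1 + x ** 3 / 3),
}
grads = {name: torch.autograd.grad(y.sum(), x, retain_graph=True)[0]
         for name, y in curves.items()}  # each entry is that curve's dy/dx
```

Plotting `grads` against `x` recovers the qualitative picture: only arcosh(1 + x^2/2) yields a weight that is largest for near-zero distances and decays smoothly for larger ones.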
Contributions. We list our main contributions as follows:
• We propose an extremely simple yet powerful distance metric, HyperCD, for point cloud completion. To the best of our knowledge, we are the first to explore hyperbolic space for point cloud completion.
• We demonstrate state-of-the-art performance on several benchmark datasets based on popular networks trained with HyperCD.

2. Related Work

3D Shape Completion. Early methods in 3D shape completion generally focus on voxel grids, with network architectures similar to 2D image networks [34, 8, 22]. However, information loss inevitably occurs when such intermediate representations are involved, and voxelization incurs high computational cost with regard to voxel resolution [56]. Therefore, recent state-of-the-art models are designed to consume raw point cloud data directly. The pioneering work PointNet [42] independently applies MLPs to each point and subsequently aggregates features through a max-pooling operation to achieve permutation invariance. Following this design, a clear-cut way is to employ permutation-invariant neural networks as a tool to design an encoder for partial-input feature extraction and a decoder to complete the point cloud. As the first learning-based point cloud completion network, PCN [70] extracts a global feature in a similar way to PointNet and generates points through folding operations [66]. In order to obtain local structures among points, Zhang et al. [74] extract multi-scale features from different layers in the feature extraction part to enhance the performance. CRN [57] uses a cascaded refinement network to bridge the local details of the partial input and the global shape information together.

Lyu et al. [32] treat point cloud completion as a conditional generation problem in the framework of denoising diffusion probabilistic models (DDPM) [46, 16, 76, 30]. They also mention the problem that the CD loss is not sensitive to the overall density distribution.
[Figure 2: left panel plots y = x, y = x^2/2, y = 1 − exp(−x), y = 1 − exp(−x^2/2), y = arcosh(1 + x), y = arcosh(1 + x^2/2), and y = arcosh(1 + x^3/3) against x ∈ [0, 7]; right panel plots the corresponding gradients ∂y/∂x.]
Figure 2. Illustration of (left) some distance metrics and (right) their corresponding gradients, where the dotted curves are used in ℓ1- and ℓ2-CD, the dashed ones are used in density-aware Chamfer distance (DCD) [61], and the solid curves are special cases of our HyperCD.
Their solution uses DDPM to define a one-to-one point-wise mapping between two consecutive point clouds in the diffusion process, trained with a simple mean squared error loss function [4]. However, it is computationally intensive and only works at the coarse point cloud generation stage.

Attention mechanisms like the Transformer [53] demonstrate their superiority in capturing long-range interactions compared to CNNs' constrained receptive fields. For instance, to preserve more detailed geometric information for point cloud generation in the decoder, SA-Net [58] uses a skip-attention mechanism to merge local region information from the encoder with the point features of the decoder. SnowflakeNet [62] and PoinTr [69] pay extra attention to the decoder with Transformer-like designs. PointAttN [54] further proposes an architecture design based solely on Transformers. These works have demonstrated the ability of Transformers in point cloud completion tasks.

Point Cloud Distance. A distance on point clouds is a non-negative function that measures the dissimilarity between them. Since point clouds are inherently unordered, the shape-level distance is typically derived from statistics of pair-wise point-level distances based on a particular assignment strategy [61]. With their relatively low computational cost and fair design, CD and its variants are extensively used in learning-based methods for point cloud completion tasks [9, 32, 73, 47]. Earth Mover's Distance (EMD), another widely used metric, relies on finding the optimal mapping from one set to the other by solving an optimization problem. In some cases it is considered more reliable than CD, but it suffers from high computational overhead and is only suitable for sets with equal numbers of points [27, 1]. Recently, Wu et al. [61] proposed the Density-aware Chamfer Distance (DCD) as a new metric for point cloud completion, which balances the behavior of CD and the computational cost of EMD to a certain level.

Hyperbolic Learning. Euclidean space has been widely used in machine learning because it is a natural generalization of our intuition-friendly, visual three-dimensional space, and it is easy to measure distances and inner products in it [13, 28, 20, 40]. However, Euclidean embeddings are not a suitable choice for complex tree-like data in fields such as biology, network science, computer graphics, or computer vision, which exhibit a highly non-Euclidean latent anatomy [13, 5]. This encourages the research community to develop deep neural networks in non-Euclidean spaces such as hyperbolic space, a Riemannian manifold of constant negative curvature. Recently, the gap between hyperbolic embeddings and Euclidean embeddings has been narrowed by deriving the essential components of deep neural networks in hyperbolic geometry [13, 45] (e.g. multinomial logistic regression, fully-connected layers, recurrent neural networks, etc.).

Unlike Euclidean space, whose volume grows polynomially w.r.t. the radius, hyperbolic space H^n has exponential growth, which is suitable for tree-like structured data. The representation power of hyperbolic space has been demonstrated in NLP [37, 38], image segmentation [60, 3], few-shot [18] and zero-shot learning [29], as well as metric learning equipped with vision transformers [10]. For point clouds of 3D objects, the data naturally exhibit a hierarchy property, where simple parts can be assembled into progressively more complex shapes to form whole objects. Recently, the work of [36] has shown that the features from a point cloud classifier can be embedded into hyperbolic space, which leads to state-of-the-art supervised models for point cloud classification. Intuitively, points near the boundary of hyperbolic space are sparser compared with points at the center.
While the hierarchy property between part and whole could provide useful clues for classifying objects, it cannot be directly applied to generation tasks like point cloud completion. However, the property of hyperbolic embeddings with exponential growth offers a clue for designing a new loss for point cloud completion tasks that focuses more on the surface. To the best of our knowledge, we are the first to introduce hyperbolic space to point cloud completion.

3. Method

In this section, we introduce HyperCD, its fast calculation, and its weighting mechanism in backpropagation.

3.1. Chamfer Distance

Notations. We denote (x_i, y_i) as the i-th point cloud pair in the training data, x_i = {x_{ij}} as the incomplete input point cloud with 3D points x_{ij}, ∀j, and y_i = {y_{ik}} as the ground-truth point cloud with points y_{ik}, ∀k. We denote d(·, ·) as a certain distance metric, and f as a neural network, parametrized by ω, that generates a new point cloud from an incomplete input point cloud.

Definition. Based on the notations above, a Chamfer distance for point clouds can in general be defined as follows:

D(x_i, y_i) = \frac{1}{|x_i|}\sum_j \min_k d(x_{ij}, y_{ik}) + \frac{1}{|y_i|}\sum_k \min_j d(x_{ij}, y_{ik}),   (1)

where | · | denotes the cardinality of a set. Based on this definition, we can instantiate the distance metric with different geometric spaces, such as:
• Euclidean distance: For point cloud completion, the function d is usually defined in Euclidean space, referring to

d(x_{ij}, y_{ik}) = \begin{cases} \|x_{ij} - y_{ik}\| & \text{as L1-distance} \\ \|x_{ij} - y_{ik}\|^2 & \text{as L2-distance} \end{cases}   (2)

• Hyperbolic distance: In the Poincaré ball model, for instance,

d(x_{ij}, y_{ik}) = \operatorname{arcosh}\left(1 + \frac{2\|x_{ij} - y_{ik}\|^2}{(1 - \|x_{ij}\|^2)(1 - \|y_{ik}\|^2)}\right).   (3)

Note that the hyperbolic distance can always be defined based on arcosh, no matter what model is used to represent the hyperbolic space.
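For concreteness, Eqs. 1 and 2 amount to a few tensor operations; the following is a minimal PyTorch sketch of ours (not the authors' released code), where `pred` and `gt` are assumed to be (N, 3) and (M, 3) tensors of a single point cloud pair:

```python
import torch

def chamfer_distance(pred: torch.Tensor, gt: torch.Tensor, squared: bool = True) -> torch.Tensor:
    # Pairwise Euclidean distances between all points of the two clouds: (N, M).
    dist = torch.cdist(pred, gt, p=2)
    if squared:  # L2-distance of Eq. 2; otherwise the L1-distance ||x - y||
        dist = dist ** 2
    # Eq. 1: average nearest-neighbor distance in both directions.
    return dist.min(dim=1).values.mean() + dist.min(dim=0).values.mean()
```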
Learning Objective for Point Cloud Completion. Based on the definition of CD in Eq. 1, a simple learning objective can be written as follows:

\min_{\omega \in \Omega} \sum_i F_i(\omega) \stackrel{def}{=} \min_{\omega \in \Omega} \sum_i D(f(x_i; \omega), y_i),   (4)

where Ω denotes the feasible solution space for ω defined by some constraints such as regularization.

3.2. Hyperbolic Chamfer Distance

Challenges. There are several challenges that prevent us from directly substituting Eq. 3 into Eq. 1, as listed below:
• Domain constraint: The norm of each 3D point should be strictly smaller than 1. Some implementation tricks, such as clipping [72], can be applied here to mitigate the issue.
• Computational burden: The calculation in Eq. 3 is much more complex than the Euclidean distance, leading to a significantly higher computational burden, especially in large-scale settings, since the matching complexity per point cloud pair is O(|x_i||y_i|), e.g. with point clouds of more than 10K points. To mitigate this issue, the hyperbolic distance is often computed in Gyrovector space instead [50, 51, 52, 71], a generalization of Euclidean vector spaces based on the Möbius transformations [21]. However, such operations still require too much computation to be efficient in large-scale settings.

So far, we have discovered that (1) computing Euclidean distances is much faster than computing hyperbolic distances, and (2) hyperbolic distances are defined based on arcosh. So, how shall we define a hyperbolic Chamfer distance and accelerate its computation?
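To make the contrast concrete, here is a minimal sketch of ours (not from the paper): the exact Poincaré-ball distance of Eq. 3 versus the arcosh-of-squared-Euclidean-distance form that Algorithm 1 below adopts. `u` and `v` are assumed to be tensors of 3D points, and the constraint ‖u‖, ‖v‖ < 1 applies only to the former:

```python
import torch

def poincare_dist(u: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    # Exact distance in the Poincare ball (Eq. 3); valid only for ||u||, ||v|| < 1,
    # and needs two extra norms per point pair.
    sq = (u - v).pow(2).sum(dim=-1)
    denom = (1 - u.pow(2).sum(dim=-1)) * (1 - v.pow(2).sum(dim=-1))
    return torch.acosh(1 + 2 * sq / denom)

def hypercd_pair(u: torch.Tensor, v: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    # HyperCD's per-pair form (cf. Algorithm 1): arcosh applied directly to the
    # scaled squared Euclidean distance, with no domain constraint on u, v.
    return torch.acosh(1 + alpha * (u - v).pow(2).sum(dim=-1))
```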
3.2.2 Learning with HyperCD as Loss Function

Now let us discuss the learning process in backpropagation for updating network weights.
Algorithm 1 HyperCD
  Initialize a matrix M, D1 ← 0, D2 ← 0;
  foreach j, k do
    M_{jk} ← ‖x_{ij} − y_{ik}‖²;
  end
  foreach j do
    D1 ← D1 + arcosh(1 + α min_k M_{jk});
  end
  foreach k do
    D2 ← D2 + arcosh(1 + α min_j M_{jk});
  end
  return D(x_i, y_i) ← D1/|x_i| + D2/|y_i|;

[Figure 3: normalized gradient weight vs. Euclidean distance, one curve per α ∈ {0.1, 0.2, 0.5, 1, 2, 5, 10, 20}.]
Figure 3. Illustration of the gradient weights using our HyperCD. All the numbers are normalized by \frac{1}{\sqrt{2\alpha}}.
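Algorithm 1 likewise reduces to a few tensor operations. Below is a minimal PyTorch sketch of ours (not the released code), under the same (N, 3)/(M, 3) single-pair assumption as before; the dense (N, M) distance matrix keeps it simple but is only practical for moderate cloud sizes, and a nearest-neighbor op would replace it at scale:

```python
import torch

def hyper_cd(pred: torch.Tensor, gt: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    # M_jk = ||x_ij - y_ik||^2 for all pairs (squared Euclidean distances).
    M = torch.cdist(pred, gt, p=2) ** 2                       # (N, M)
    # D1/|x_i| + D2/|y_i| with d = arcosh(1 + alpha * min squared distance).
    d1 = torch.acosh(1 + alpha * M.min(dim=1).values).mean()  # pred -> gt
    d2 = torch.acosh(1 + alpha * M.min(dim=0).values).mean()  # gt -> pred
    return d1 + d2
```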
Considering the learning objective in Eq. 4, for a generated point x̃_{ij} matched to its nearest ground-truth point y_{im(j)}, the gradient of the HyperCD distance w.r.t. the network weights ω can be written as

\frac{\partial d(\tilde{x}_{ij}, y_{im(j)})}{\partial \omega} = z_{ij} \cdot \frac{\partial \|\tilde{x}_{ij} - y_{im(j)}\|}{\partial \omega},   (7)

where

z_{ij} = \frac{2\alpha \|\tilde{x}_{ij} - y_{im(j)}\|}{\sqrt{\left(1 + \alpha \|\tilde{x}_{ij} - y_{im(j)}\|^2\right)^2 - 1}} \in \mathbb{R}

denotes the weight for the gradient feature in backpropagation. This implicit weighting mechanism only depends on the Euclidean distances, and thus it is position-aware. The same gradient feature, \partial \|\tilde{x}_{ij} - y_{im(j)}\| / \partial \omega, is also used in CD and DCD; the only difference among such distance metrics in learning is the weighting mechanism.
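As a quick sanity check (ours, not from the paper), autograd applied to the per-pair distance arcosh(1 + αu²) reproduces this analytical weight:

```python
import torch

alpha = 1.0
u = torch.linspace(0.1, 3.0, 5, requires_grad=True)   # Euclidean distances
d = torch.acosh(1.0 + alpha * u ** 2)                 # HyperCD per-pair distance
d.sum().backward()                                    # autograd: dd/du for each u

# Analytical weight z from Eq. 7, as a function of the distance u.
z_analytic = 2 * alpha * u / ((1 + alpha * u ** 2) ** 2 - 1).sqrt()
print(torch.allclose(u.grad, z_analytic.detach(), atol=1e-6))  # True
```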
3.3. Analysis on HyperCD

Proposition 1. Consider d(x_{ij}, y_{ik}) = g(\|x_{ij} - y_{ik}\|) in Eq. 1, where the function g is strictly increasing. It holds that

\min_k d(x_{ij}, y_{ik}) = g\left(\min_k \|x_{ij} - y_{ik}\|\right).   (8)

This proposition states that g and min are switchable for a strictly increasing function g. In this way, the complexity of g(\min_k \|x_{ij} - y_{ik}\|) is just slightly higher than that of computing Euclidean distances (with extra calculation for g).

Proposition 2. Consider a function h(x) = arcosh(1 + αx^β), ∀x ≥ 0. h is strictly increasing iff α > 0, β > 0.

When x approaches 0, the function h behaves as a power function, whose gradient is again a power function, with the single special case β = 2 returning a constant gradient. This analysis coincides with Fig. 2 (right), and it seems that only β = 2 can provide a reasonable weighting mechanism to preserve good point matches. When x approaches infinity, all the curves with the same β converge to a power function as well.
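The small-x claim can be checked directly from the expansion arcosh(1 + t) = \sqrt{2t}\,(1 + O(t)) as t → 0; the short derivation below is ours, not reproduced from the original analysis:

h(x) = \operatorname{arcosh}(1 + \alpha x^{\beta}) \approx \sqrt{2\alpha}\, x^{\beta/2}, \qquad h'(x) \approx \frac{\beta}{2}\sqrt{2\alpha}\, x^{\beta/2 - 1} \quad (x \to 0),

so the gradient vanishes as x → 0 for β > 2, diverges for β < 2, and tends to the nonzero constant \sqrt{2\alpha} exactly when β = 2.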
Proposition 4. The weight z_{ij} in Eq. 7 is strictly decreasing w.r.t. \|\tilde{x}_{ij} - y_{im(j)}\| for an arbitrary α > 0.

Fig. 3 illustrates the change of the weights w.r.t. the distances using different α's. When α is small, e.g. α ≤ 2, the curves decrease gradually, which can potentially better preserve good matches through backpropagation.

4. Experiments

Datasets. We verify and analyze our HyperCD for point cloud completion on the following benchmark datasets.
• ShapeNet-Part: The benchmark ShapeNet-Part [67] is a part segmentation subset of the ShapeNetCore [7] 3D meshes. It contains 17,775 different 3D meshes belonging to 16 categories. The ground-truth point cloud data was created by sampling 2,048 points uniformly on each mesh. The partial point cloud data were generated by randomly selecting a viewpoint as a center among multiple viewpoints and removing points within a certain radius from the complete data, similar to the generation of PoinTr's ShapeNet-55/34 [69] benchmark. The number of points we remove from each point cloud is 512.
Table 1. Completion results on PCN in terms of per-point L1 Chamfer distance ×1000 (lower is better).
Methods Average Plane Cabinet Car Chair Lamp Couch Table Boat
FoldingNet [66] 14.31 9.49 15.80 12.61 15.55 16.41 15.97 13.65 14.99
TopNet [49] 12.15 7.61 13.31 10.90 13.82 14.44 14.78 11.22 11.12
AtlasNet [14] 10.85 6.37 11.94 10.10 12.06 12.37 12.99 10.33 10.61
GRNet [63] 8.83 6.45 10.37 9.45 9.41 7.96 10.51 8.44 8.04
CRN [57] 8.51 4.79 9.97 8.31 9.49 8.94 10.69 7.81 8.05
NSFA [74] 8.06 4.76 10.18 8.63 8.53 7.03 10.53 7.35 7.48
FBNet [65] 6.94 3.99 9.05 7.90 7.38 5.82 8.85 6.35 6.18
PCN [70] 11.27 5.50 22.70 10.63 8.70 11.00 11.34 11.68 8.59
HyperCD + PCN 10.59 5.95 11.62 9.33 12.45 12.58 13.10 9.82 9.85
FoldingNet [66] 14.31 9.49 15.80 12.61 15.55 16.41 15.97 13.65 14.99
HyperCD + FoldingNet 12.09 7.89 12.90 10.67 14.55 13.87 14.09 11.86 10.89
PMP-Net [59] 8.73 5.65 11.24 9.64 9.51 6.95 10.83 8.72 7.25
HyperCD + PMP-Net 8.40 5.06 10.67 9.30 9.11 6.83 11.01 8.18 7.03
PoinTr [69] 8.38 4.75 10.47 8.68 9.39 7.75 10.93 7.78 7.29
HyperCD + PoinTr 7.56 4.42 9.77 8.22 8.22 6.62 9.62 6.97 6.67
SnowflakeNet [62] 7.21 4.29 9.16 8.08 7.89 6.07 9.23 6.55 6.40
HyperCD + SnowflakeNet 6.91 3.95 9.01 7.88 7.37 5.75 8.94 6.19 6.17
PointAttN [54] 6.86 3.87 9.00 7.63 7.43 5.90 8.68 6.32 6.09
DCD + PointAttN 7.54 4.47 9.65 8.14 8.12 6.75 9.60 6.92 6.67
HyperCD + PointAttN 6.68 3.76 8.93 7.49 7.06 5.61 8.48 6.25 5.92
SeedFormer [75] 6.74 3.85 9.05 8.06 7.06 5.21 8.85 6.05 5.85
DCD + SeedFormer 24.52 16.42 26.23 21.08 20.06 18.30 26.51 18.23 18.22
HyperCD + SeedFormer 6.54 3.72 8.71 7.79 6.83 5.11 8.61 5.82 5.76
Table 2. Completion results on ShapeNet-55 based on L2 Chamfer distance ×1000 (lower is better) and F-Score@1% (higher is better).
Methods Table Chair Plane Car Sofa CD-S CD-M CD-H CD-Avg F1
PFNet [17] 3.95 4.24 1.81 2.53 3.34 3.83 3.87 7.97 5.22 0.339
FoldingNet [66] 2.53 2.81 1.43 1.98 2.48 2.67 2.66 4.05 3.12 0.082
TopNet [49] 2.21 2.53 1.14 2.18 2.36 2.26 2.16 4.3 2.91 0.126
PCN [70] 2.13 2.29 1.02 1.85 2.06 1.94 1.96 4.08 2.66 0.133
GRNet [63] 1.63 1.88 1.02 1.64 1.72 1.35 1.71 2.85 1.97 0.238
PoinTr [69] 0.81 0.95 0.44 0.91 0.79 0.58 0.88 1.79 1.09 0.464
SeedFormer [75] 0.72 0.81 0.40 0.89 0.71 0.50 0.77 1.49 0.92 0.472
HyperCD + SeedFormer 0.66 0.74 0.35 0.83 0.64 0.47 0.72 1.40 0.86 0.482
• PCN: One of the most popular benchmark datasets for point cloud completion is the PCN dataset [70]. It is a subset of ShapeNet [7] with shapes from 8 categories. The incomplete point clouds are generated by back-projecting 2.5D depth images from 8 viewpoints in order to simulate real-world sensor data. For each shape, 16,384 points are uniformly sampled from the mesh surfaces as complete ground truth, and 2,048 points are sampled as partial input [70, 75].
• ShapeNet-55/34: The ShapeNet-55 and ShapeNet-34 datasets are also generated from the synthetic ShapeNet [7] dataset, while they contain more object categories and incomplete patterns. All 55 categories in ShapeNet are included in ShapeNet-55, with 41,952 shapes for training and 10,518 shapes for testing. ShapeNet-34 uses a subset of 34 categories for training and leaves 21 unseen categories for testing, where 46,765 object shapes are used for training, 3,400 for testing on seen categories, and 2,305 for testing on novel (unseen) categories. In both datasets, 2,048 points are sampled as input and 8,192 points as ground truth. Following the same evaluation strategy as [69], 8 fixed viewpoints are selected and the number of points in the partial point cloud is set to 2,048, 4,096 or 6,144 (25%, 50% or 75% of the complete point cloud), which corresponds to three difficulty levels (simple, moderate and hard) at test time.
Figure 4. Visual comparison of point cloud completion results on PCN. Row-1: Inputs of incomplete point clouds. Row-2: Outputs of
PointAttN with CD. Row-3: Outputs of PointAttN with DCD. Row-4: Outputs of PointAttN with HyperCD. Row-5: Ground truth.
Table 3. Completion results on ShapeNet-34 based on L2 Chamfer distance ×1000 (lower is better) and F-Score@1% (higher is better).
Methods | 34 seen categories: CD-S CD-M CD-H CD-Avg F1 | 21 unseen categories: CD-S CD-M CD-H CD-Avg F1
PFNet [17] 3.16 3.19 7.71 4.68 0.347 5.29 5.87 13.33 8.16 0.322
FoldingNet [66] 1.86 1.81 3.38 2.35 0.139 2.76 2.74 5.36 3.62 0.095
TopNet [49] 1.77 1.61 3.54 2.31 0.171 2.62 2.43 5.44 3.50 0.121
PCN [70] 1.87 1.81 2.97 2.22 0.154 3.17 3.08 5.29 3.85 0.101
GRNet [63] 1.26 1.39 2.57 1.74 0.251 1.85 2.25 4.87 2.99 0.216
PoinTr [69] 0.76 1.05 1.88 1.23 0.421 1.04 1.67 3.44 2.05 0.384
SeedFormer [75] 0.48 0.70 1.30 0.83 0.452 0.61 1.08 2.37 1.35 0.402
HyperCD + SeedFormer 0.46 0.67 1.24 0.79 0.459 0.58 1.03 2.24 1.31 0.428
Implementation. We take three state-of-the-art networks, i.e. CP-Net [25], PointAttN [54] and SeedFormer [75], as our backbone networks for comparison and analysis. We train all these networks with or without HyperCD from scratch using PyTorch [39] with the Adam optimizer [19].
All three networks are trained solely using the dissimilarity between the generated and ground-truth point clouds as the supervision signal. For the lightweight network CP-Net, the original dissimilarity metric between prediction and ground truth is only calculated at the first stage; we replace the dissimilarity metric at this stage with HyperCD in our experiments. For the multi-stage networks, i.e. PointAttN and SeedFormer, loss functions are introduced at every stage, from coarse to fine point clouds; we replace the loss functions at all stages with HyperCD so that it participates in the whole training process. To ensure fairness in comparison, all the other loss functions compared with HyperCD in this paper are applied in the same way as HyperCD. All the hyperparameters, such as learning rate, batch size and number of training epochs, are kept consistent with the settings of the baselines for a fair comparison. Hyperparameter α in HyperCD is tuned with grid search, by default. We conduct our experiments on a server with 10 NVIDIA RTX 2080Ti 11G GPUs for CP-Net with ShapeNet-Part, on a server with 4 NVIDIA A100 80G GPUs for PointAttN with PCN, and on a server with 4 NVIDIA V100 16G GPUs for SeedFormer with PCN and ShapeNet-55/34.
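As a sketch of how this replacement might look in training code (a hypothetical model with two stage outputs, `coarse` and `fine`; the real pipelines follow each baseline's released training scripts), reusing the `hyper_cd` sketch from Sec. 3.2:

```python
import torch

def training_step(model, partial, gt_coarse, gt_fine, optimizer, alpha=1.0):
    # Hypothetical two-stage completion step: the CD loss at every stage is
    # replaced by HyperCD and summed into one objective.
    coarse, fine = model(partial)            # multi-stage predictions
    loss = hyper_cd(coarse, gt_coarse, alpha) + hyper_cd(fine, gt_fine, alpha)
    optimizer.zero_grad()
    loss.backward()   # each matched pair's gradient is scaled by z_ij from Eq. 7
    optimizer.step()
    return loss.item()
```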
Evaluation Metrics. To make a fair comparison, we evaluate the performance of all the methods using CD. F1-Score@1% [48] is also used to evaluate ShapeNet-55/34, with the same experimental setting as in the literature. For better comparison, we also list the original results of some other methods on PCN and ShapeNet-55/34.
4.1. State-of-the-art Comparison

PCN. Following the literature, we report CD with L1-distance in Table 1, with numbers per category. We also include results trained with DCD. As we can see, replacing the loss with HyperCD enables the two baselines to outperform their previous state-of-the-art results by a certain amount, while the performance gets slightly worse when DCD is used. As we discussed earlier, a numerical metric (i.e. CD) may not faithfully reflect visual quality, so we also provide qualitative evaluation results in Fig. 4, compared with results generated from the baseline model with the CD and DCD loss functions. As we can see, both models can reconstruct point clouds in general outline to some extent, but the reconstructed results with CD are more likely to suffer from distortion in several areas, with a high noise level on the surface. With the introduction of the HyperCD loss during training, the baseline network can further produce demonstrably well-reconstructed point clouds in general outline while maintaining the realistic details of the original ground truth with a significantly reduced noise level. Although DCD exhibits better capacity in controlling the noise level in the generated point clouds, it fails to preserve fine details and also suffers from distortion.

ShapeNet-55/34. We also test on the ShapeNet-55 dataset to evaluate the adaptability of HyperCD on tasks with higher diversity. Table 2 summarizes the average L2 Chamfer distances on three difficulty levels as well as the overall CDs. Following convention, we show results for 5 categories (Table, Chair, Plane, Car and Sofa) with more than 2,500 samples in the training set. Complete results for all 55 categories are available in the supplemental material. We also provide results under the F-Score@1% metric. As we can see from Table 2, the introduction of HyperCD improves the baseline performance to some extent. To have an intuitive evaluation of the reconstructed results, we also provide qualitative evaluations in the supplemental material, compared with results generated from the baselines. We can clearly see that, beyond the better numerical performance, the model trained with HyperCD works better in reconstructing the surface areas and preserving the details with less noise. The improvement in both numerical and qualitative evaluations indicates that HyperCD is capable of adapting to point cloud completion tasks with high diversity.

On ShapeNet-34, we evaluate performance on the 34 seen categories (same as training) as well as the 21 unseen categories (not used in training). In Table 3, we can observe that HyperCD is capable of improving baseline model performance in terms of achieving higher scores. This improvement indicates that our loss function is highly generalizable for point cloud completion tasks with both seen and unseen categories.

4.2. Analysis

We choose ShapeNet-Part as the dataset to analyze and compare different loss functions. As introduced previously, ShapeNet-Part is a relatively small dataset comprising objects from 16 categories, which is sufficient for analysis in our case. For the model part, we choose a lightweight network called CP-Net [25].

Hyperparameters. We show the relationship between α and the learning rate (lr) in terms of their effect on training performance in Fig. 5. Complementary to our aforementioned discussion on arcosh, we train and test the performance of the different arcosh-based variants listed in Table 4.

Table 4. Completion results of CP-Net with different losses on ShapeNet-Part in terms of per-point L2 Chamfer distance ×1000.
Loss function  CD-Avg
L1-CD  4.16
L2-CD  4.82
DCD  5.74
y = arcosh(1 + x)  4.43
y = arcosh(1 + x^3)  4.22
Hyperbolic Distance  4.09
HyperCD  4.03
[Figure 5: L1-CD (color scale 4.05–4.40) as a function of the learning rate (0 to 3e-4, y-axis) and α (0.25 to 2.00, x-axis).]
Figure 5. The L1-CD with different α and lr.

[Figure 6: point correspondences for HyperCD vs. CD at epochs 10, 70 and 130.]
Figure 6. Illustration of point correspondence change over epochs.
References

[7] … Manolis Savva, Shuran Song, Hao Su, et al. Shapenet: An information-rich 3d model repository. arXiv preprint arXiv:1512.03012, 2015. 5, 6
[8] Angela Dai, Charles Ruizhongtai Qi, and Matthias Nießner. Shape completion using 3d-encoder-predictor cnns and shape synthesis. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 5868–5877, 2017. 2
[9] Haowen Deng, Tolga Birdal, and Slobodan Ilic. 3d local features for direct pairwise registration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3244–3253, 2019. 3
[10] Aleksandr Ermolov, Leyla Mirvakhabova, Valentin Khrulkov, Nicu Sebe, and Ivan Oseledets. Hyperbolic vision transformers: Combining improvements in metric learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7409–7419, 2022. 3
[11] Zhaoxin Fan, Yulin He, Zhicheng Wang, Kejian Wu, Hongyan Liu, and Jun He. Reconstruction-aware prior distillation for semi-supervised point cloud completion. arXiv preprint arXiv:2204.09186, 2022. 1
[12] Ben Fei, Weidong Yang, Wen-Ming Chen, Zhijun Li, Yikang Li, Tao Ma, Xing Hu, and Lipeng Ma. Comprehensive review of deep learning-based 3d point cloud completion processing and analysis. IEEE Transactions on Intelligent Transportation Systems, 2022. 1
[13] Octavian Ganea, Gary Bécigneul, and Thomas Hofmann. Hyperbolic neural networks. Advances in neural information processing systems, 31, 2018. 3
[14] Thibault Groueix, Matthew Fisher, Vladimir G Kim, Bryan C Russell, and Mathieu Aubry. A papier-mâché approach to learning 3d surface generation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 216–224, 2018. 6
[15] Yulan Guo, Hanyun Wang, Qingyong Hu, Hao Liu, Li Liu, and Mohammed Bennamoun. Deep learning for 3d point clouds: A survey. IEEE transactions on pattern analysis and machine intelligence, 43(12):4338–4364, 2020. 2
[16] Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 33:6840–6851, 2020. 2
[17] Zitian Huang, Yikuan Yu, Jiawen Xu, Feng Ni, and Xinyi Le. Pf-net: Point fractal network for 3d point cloud completion. In CVPR, 2020. 6, 7
[18] Valentin Khrulkov, Leyla Mirvakhabova, Evgeniya Ustinova, Ivan Oseledets, and Victor Lempitsky. Hyperbolic image embeddings. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6418–6428, 2020. 3
[19] Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014. 7
[20] Anna Klimovskaia, David Lopez-Paz, Léon Bottou, and Maximilian Nickel. Poincaré maps for analyzing complex hierarchies in single-cell data. Nature communications, 11(1):2966, 2020. 3
[21] Max Kochurov, Rasul Karimov, and Serge Kozlukov. Geoopt: Riemannian optimization in pytorch, 2020. 4
[22] Truc Le and Ye Duan. Pointgrid: A deep network for 3d shape understanding. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 9204–9214, 2018. 2
[23] Ruihui Li, Xianzhi Li, Pheng-Ann Heng, and Chi-Wing Fu. Point cloud upsampling via disentangled refinement. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 344–353, 2021. 1
[24] Ren-Wu Li, Bo Wang, Chun-Peng Li, Ling-Xiao Zhang, and Lin Gao. High-fidelity point cloud completion with low-resolution recovery and noise-aware upsampling. arXiv preprint arXiv:2112.11271, 2021. 1
[25] Fangzhou Lin, Yajun Xu, Ziming Zhang, Chenyang Gao, and Kazunori D Yamada. Cosmos propagation network: Deep learning model for point cloud completion. Neurocomputing, 507:221–234, 2022. 7, 8
[26] Kangcheng Liu. An integrated lidar-slam system for complex environment with noisy point clouds. arXiv preprint arXiv:2212.05705, 2022. 1
[27] Minghua Liu, Lu Sheng, Sheng Yang, Jing Shao, and Shi-Min Hu. Morphing and sampling network for dense point cloud completion. In Proceedings of the AAAI conference on artificial intelligence, volume 34, pages 11596–11603, 2020. 3
[28] Qi Liu, Maximilian Nickel, and Douwe Kiela. Hyperbolic graph neural networks. Advances in neural information processing systems, 32, 2019. 3
[29] Shaoteng Liu, Jingjing Chen, Liangming Pan, Chong-Wah Ngo, Tat-Seng Chua, and Yu-Gang Jiang. Hyperbolic visual embedding learning for zero-shot recognition. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 9273–9281, 2020. 3
[30] Shitong Luo and Wei Hu. Diffusion probabilistic models for 3d point cloud generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2837–2845, 2021. 2
[31] Shitong Luo and Wei Hu. Score-based point cloud denoising. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 4583–4592, 2021. 1
[32] Zhaoyang Lyu, Zhifeng Kong, Xudong Xu, Liang Pan, and Dahua Lin. A conditional point diffusion-refinement paradigm for 3d point cloud completion. arXiv preprint arXiv:2112.03530, 2021. 2, 3
[33] Changfeng Ma, Yang Yang, Jie Guo, Chongjun Wang, and Yanwen Guo. Completing partial point clouds with outliers by collaborative completion and segmentation. arXiv preprint arXiv:2203.09772, 2022. 1
[34] Daniel Maturana and Sebastian Scherer. Voxnet: A 3d convolutional neural network for real-time object recognition. In 2015 IEEE/RSJ international conference on intelligent robots and systems (IROS), pages 922–928. IEEE, 2015. 2
[35] Himangi Mittal, Brian Okorn, Arpit Jangid, and David Held. Self-supervised point cloud completion via inpainting. arXiv preprint arXiv:2111.10701, 2021. 1
[36] Antonio Montanaro, Diego Valsesia, and Enrico Magli. Rethinking the compositionality of point clouds through regularization in the hyperbolic space. In Alice H. Oh, Alekh Agarwal, Danielle Belgrave, and Kyunghyun Cho, editors, Advances in Neural Information Processing Systems, 2022. 2, 3
[37] Maximillian Nickel and Douwe Kiela. Poincaré embeddings for learning hierarchical representations. Advances in neural information processing systems, 30, 2017. 3
[38] Maximillian Nickel and Douwe Kiela. Learning continuous hierarchies in the lorentz model of hyperbolic geometry. In International Conference on Machine Learning, pages 3779–3788. PMLR, 2018. 3
[39] Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al. Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems, 32, 2019. 7
[40] Wei Peng, Tuomas Varanka, Abdelrahman Mostafa, Henglin Shi, and Guoying Zhao. Hyperbolic deep neural networks: A survey. arXiv preprint arXiv:2101.04562, 2021. 3
[41] Wei Peng, Tuomas Varanka, Abdelrahman Mostafa, Henglin Shi, and Guoying Zhao. Hyperbolic deep neural networks: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(12):10023–10044, 2022. 4
[42] Charles R Qi, Hao Su, Kaichun Mo, and Leonidas J Guibas. Pointnet: Deep learning on point sets for 3d classification and segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 652–660, 2017. 2
[43] Yiming Ren, Peishan Cong, Xinge Zhu, and Yuexin Ma. Self-supervised point cloud completion on real traffic scenes via scene-concerned bottom-up mechanism. In 2022 IEEE International Conference on Multimedia and Expo (ICME), pages 1–6. IEEE, 2022. 1
[44] Jieqi Shi, Lingyun Xu, Peiliang Li, Xiaozhi Chen, and Shaojie Shen. Temporal point cloud completion with pose disturbance. IEEE Robotics and Automation Letters, 7(2):4165–4172, 2022. 1
[45] Ryohei Shimizu, Yusuke Mukuta, and Tatsuya Harada. Hyperbolic neural networks++. arXiv preprint arXiv:2006.08210, 2020. 3
[46] Jascha Sohl-Dickstein, Eric Weiss, Niru Maheswaranathan, and Surya Ganguli. Deep unsupervised learning using nonequilibrium thermodynamics. In International Conference on Machine Learning, pages 2256–2265. PMLR, 2015. 2
[47] Junshu Tang, Zhijun Gong, Ran Yi, Yuan Xie, and Lizhuang Ma. Lake-net: topology-aware point cloud completion by localizing aligned keypoints. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 1726–1735, 2022. 3
[48] Maxim Tatarchenko, Stephan R Richter, René Ranftl, Zhuwen Li, Vladlen Koltun, and Thomas Brox. What do single-view 3d reconstruction networks learn? In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3405–3414, 2019. 8
[49] Lyne P. Tchapmi, Vineet Kosaraju, Hamid Rezatofighi, Ian Reid, and Silvio Savarese. Topnet: Structural point cloud decoder. In CVPR, 2019. 6, 7
[50] Abraham A Ungar. Hyperbolic trigonometry and its application in the poincaré ball model of hyperbolic geometry. Computers & Mathematics with Applications, 41(1-2):135–147, 2001. 4
[51] Abraham Albert Ungar. Analytic hyperbolic geometry and Albert Einstein's special theory of relativity. World Scientific, 2008. 4
[52] Abraham Albert Ungar. A gyrovector space approach to hyperbolic geometry. Synthesis Lectures on Mathematics and Statistics, 1(1):1–194, 2008. 4
[53] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in neural information processing systems, 30, 2017. 3
[54] Jun Wang, Ying Cui, Dongyan Guo, Junxia Li, Qingshan Liu, and Chunhua Shen. Pointattn: You only need attention for point cloud completion. arXiv preprint arXiv:2203.08485, 2022. 1, 3, 6, 7
[55] Xiaogang Wang, Marcelo H. Ang Jr., and Gim Hee Lee. Cascaded refinement network for point cloud completion. In CVPR, 2020. 1
[56] Xiaogang Wang, Marcelo H Ang, and Gim Hee Lee. Voxel-based network for shape completion by leveraging edge generation. In Proceedings of the IEEE/CVF international conference on computer vision, pages 13189–13198, 2021. 2
[57] Xiaogang Wang, Marcelo H Ang Jr, and Gim Hee Lee. Cascaded refinement network for point cloud completion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 790–799, 2020. 2, 6
[58] Xin Wen, Tianyang Li, Zhizhong Han, and Yu-Shen Liu. Point cloud completion by skip-attention network with hierarchical folding. In CVPR, 2020. 3
[59] Xin Wen, Peng Xiang, Zhizhong Han, Yan-Pei Cao, Pengfei Wan, Wen Zheng, and Yu-Shen Liu. Pmp-net: Point cloud completion by learning multi-step point moving paths. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7443–7452, 2021. 6
[60] Zhenzhen Weng, Mehmet Giray Ogut, Shai Limonchik, and Serena Yeung. Unsupervised discovery of the long-tail in instance segmentation using hierarchical self-supervision. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2603–2612, 2021. 3
[61] Tong Wu, Liang Pan, Junzhe Zhang, Tai Wang, Ziwei Liu, and Dahua Lin. Density-aware chamfer distance as a comprehensive metric for point cloud completion. In Advances in Neural Information Processing Systems, volume 34, pages 29088–29100, 2021. 2, 3
[62] Peng Xiang, Xin Wen, Yu-Shen Liu, Yan-Pei Cao, Pengfei Wan, Wen Zheng, and Zhizhong Han. Snowflakenet: Point cloud completion by snowflake point deconvolution with skip-transformer. In ICCV, 2021. 1, 3, 6
[63] Haozhe Xie, Hongxun Yao, Shangchen Zhou, Jiageng Mao, Shengping Zhang, and Wenxiu Sun. Grnet: Gridding residual network for dense point cloud completion. In ECCV, 2020. 6, 7
[64] Yajun Xu, Shogo Arai, Diyi Liu, Fangzhou Lin, and Kazuhiro Kosuge. Fpcc: Fast point cloud clustering-based instance segmentation for industrial bin-picking. Neurocomputing, 494:255–268, 2022. 1
[65] Xuejun Yan, Hongyu Yan, Jingjing Wang, Hang Du, Zhihong Wu, Di Xie, Shiliang Pu, and Li Lu. Fbnet: Feedback network for point cloud completion. In European Conference on Computer Vision, pages 676–693. Springer, 2022. 6
[66] Yaoqing Yang, Chen Feng, Yiru Shen, and Dong Tian. Foldingnet: Point cloud auto-encoder via deep grid deformation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 206–215, 2018. 2, 6, 7
[67] Li Yi, Vladimir G Kim, Duygu Ceylan, I-Chao Shen, Mengyan Yan, Hao Su, Cewu Lu, Qixing Huang, Alla Sheffer, and Leonidas Guibas. A scalable active framework for region annotation in 3d shape collections. ACM Transactions on Graphics (ToG), 35(6):1–12, 2016. 5
[68] Lequan Yu, Xianzhi Li, Chi-Wing Fu, Daniel Cohen-Or, and Pheng-Ann Heng. Pu-net: Point cloud upsampling network. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 2790–2799, 2018. 1
[69] Xumin Yu, Yongming Rao, Ziyi Wang, Zuyan Liu, Jiwen Lu, and Jie Zhou. Pointr: Diverse point cloud completion with geometry-aware transformers. In ICCV, 2021. 1, 3, 6, 7
[70] Wentao Yuan, Tejas Khot, David Held, Christoph Mertz, and Martial Hebert. Pcn: point completion network. In 3DV, 2018. 1, 2, 6, 7
[71] Yun Yue, Fangzhou Lin, Kazunori D Yamada, and Ziming Zhang. Hyperbolic contrastive learning. arXiv preprint arXiv:2302.01409, 2023. 4
[72] Jingzhao Zhang, Tianxing He, Suvrit Sra, and Ali Jadbabaie. Why gradient clipping accelerates training: A theoretical justification for adaptivity. arXiv preprint arXiv:1905.11881, 2019. 4
[73] Kaiyi Zhang, Ximing Yang, Yuan Wu, and Cheng Jin. Attention-based transformation from latent features to point clouds. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 36, pages 3291–3299, 2022. 3
[74] Wenxiao Zhang, Qingan Yan, and Chunxia Xiao. Detail preserved point cloud completion via separated feature aggregation. In European Conference on Computer Vision, pages 512–528. Springer, 2020. 2, 6
[75] Haoran Zhou, Yun Cao, Wenqing Chu, Junwei Zhu, Tong Lu, Ying Tai, and Chengjie Wang. Seedformer: Patch seeds based point cloud completion with upsample transformer. arXiv preprint arXiv:2207.10315, 2022. 1, 6, 7
[76] Linqi Zhou, Yilun Du, and Jiajun Wu. 3d shape generation and completion through point-voxel diffusion. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 5826–5835, 2021. 2