HypLL: The Hyperbolic Learning Library

ABSTRACT
Recent research has shown that hyperbolic geometry provides a viable alternative foundation for deep learning, especially when data is hierarchical in nature and when working with few embedding dimensions. Currently however, no accessible open-source library exists to build hyperbolic network modules akin to well-known deep learning libraries. We present HypLL, the Hyperbolic Learning Library, to bring the progress on hyperbolic deep learning together. HypLL is built on top of PyTorch, with an emphasis in its design on ease-of-use, in order to attract a broad audience towards this new and open-ended research direction. The code is available at: [Link]hyperbolic_learning_library. The compressed archive is available at: [Link]

CCS CONCEPTS
• Computing methodologies → Machine learning; • Software and its engineering → Software libraries and repositories.

KEYWORDS
hyperbolic geometry, deep learning, software library

ACM Reference Format:
Max van Spengler, Philipp Wirth, and Pascal Mettes. 2023. HypLL: The Hyperbolic Learning Library. In Proceedings of XXX (ACM MM’23). ACM, New York, NY, USA, 4 pages. [Link]

1 INTRODUCTION
Deep learning plays a central role in Artificial Intelligence on any data type and modality. Advances in deep learning are distilled in well-known and broadly used software libraries, such as PyTorch [21], TensorFlow [2], and MXNet [7] amongst others. Contemporary deep learning libraries are implicitly or explicitly designed for Euclidean operators, with optimized matrix and vector operators and corresponding automatic differentiation. The underlying assumption is that data is best represented on regular grids. In practice however, this assumption might not be sufficient, desirable, or even workable [6].

In hierarchies such as trees, the number of nodes generally grows exponentially with depth. On the other hand, the volume of a ball in Euclidean space grows polynomially with diameter, leading to distortion when embedding hierarchies [4]. For embedding hierarchies, foundational work in hyperbolic learning has shown that hyperbolic space is superior, leading to minimal distortion while requiring only few embedding dimensions [10, 19].

The advances in hyperbolic embeddings of hierarchies and subsequent theory on hyperbolic network layers [11, 23] have led to rapid developments towards hyperbolic deep learning across many modalities. Hyperbolic learning algorithms have been proposed for graphs [15], text [24], computer vision [8, 12, 13], recommender systems [26] and more. The growing body of literature has already been captured in multiple surveys [17, 22, 27]; we kindly refer to these works for detailed descriptions of ongoing literature. Across these works, hyperbolic learning has shown great potential, with improvements when data is hierarchical or scarce, with more robustness to out-of-distribution and adversarial samples, and better performance with few embedding dimensions.

An open challenge in hyperbolic learning is the lack of shared open-source implementation and development. With current deep learning libraries centered around Euclidean operators, building upon existing frameworks is not directly feasible for hyperbolic learning. Multiple leading repositories have already been made for optimization on hyperbolic and other non-Euclidean spaces [3, 14, 18]. Such approaches do, however, not keep track of underlying manifolds or have implementations of common network layers. To fill the gap in current literature, we present HypLL: The Hyperbolic Learning Library. Our repository is built upon PyTorch and contains implementations of well-known network layers. We take two core principles to heart: (i) our functionality should closely resemble PyTorch and (ii) it should be easy to use and debug, even in the presence of different manifolds. The library is targeted both at researchers in hyperbolic deep learning and at AI practitioners broadly, without needing all the mathematical knowledge before getting started in this research direction.
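The growth argument above can be made concrete with a short, self-contained Python illustration (standard formulas for the unit-curvature hyperbolic plane; this is not HypLL code): a binary tree widens as 2^d with depth, a Euclidean circle's circumference grows only linearly in its radius, while a hyperbolic circle's circumference 2π sinh(r) grows exponentially, which is why tree-like data fits with little distortion.

```python
import math

# Nodes at depth d of a binary tree: 2^d (exponential in depth).
def tree_width(depth: int) -> int:
    return 2 ** depth

# Circumference of a circle of radius r in the Euclidean plane.
def euclidean_circumference(r: float) -> float:
    return 2 * math.pi * r

# Circumference of a circle of radius r in the hyperbolic plane
# with curvature -1: grows like e^r, keeping pace with tree growth.
def hyperbolic_circumference(r: float) -> float:
    return 2 * math.pi * math.sinh(r)

# At radius 10 the hyperbolic circle already offers over a thousand
# times more "room" than its Euclidean counterpart.
ratio = hyperbolic_circumference(10.0) / euclidean_circumference(10.0)
print(f"ratio at r=10: {ratio:.0f}")
```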
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@[Link].
ACM MM’23, 2023, Ottawa, Canada
© 2023 Association for Computing Machinery.
ACM ISBN 978-1-4503-XXXX-X/18/06…$15.00
[Link]

2 HYPERBOLIC LEARNING LIBRARY DESIGN
Central in the design of HypLL is to make the step towards hyperbolic learning easy for PyTorch users and to keep the library easy to use, even when dealing with many manifolds. In pure PyTorch (and any other library), it can become difficult and tedious during computations to keep track of the manifold and other metadata underlying the data within tensors.
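As a deliberately simplified sketch of this design principle (the class and function names here are hypothetical, not HypLL's actual API), consider a tensor wrapper that carries its manifold along and refuses to combine data from different manifolds:

```python
class Manifold:
    """Stand-in for a manifold object (e.g. a Poincare ball)."""
    def __init__(self, name: str):
        self.name = name

class TrackedTensor:
    """A tensor-like container that remembers which manifold its
    values live on, so operations can validate their inputs."""
    def __init__(self, data: list, manifold: Manifold):
        self.data = data
        self.manifold = manifold

def elementwise_add(a: TrackedTensor, b: TrackedTensor) -> TrackedTensor:
    # The check is the point: mixing manifolds fails loudly here,
    # instead of silently producing meaningless numbers downstream.
    if a.manifold is not b.manifold:
        raise ValueError(
            f"manifold mismatch: {a.manifold.name} vs {b.manifold.name}")
    return TrackedTensor([x + y for x, y in zip(a.data, b.data)], a.manifold)
```

With this kind of bookkeeping, an accidental mix of, say, Euclidean and Poincaré ball activations raises an error at the offending operation rather than corrupting the computation further downstream.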
Especially in deep learning, when the number of tensors in the computational graph increases rapidly, this problem becomes challenging. As a result, mistakes happen frequently and tend to be difficult to spot and correct. We have built our design around keeping track of manifolds, to make the network design transparent and easy to debug. In contrast, while Hyperlib also provides hyperbolic learning functionalities [1], it is only available for TensorFlow, does not keep track of manifolds, and does not contain important layers such as convolutions and batch normalization.

In this section, we highlight the design of the core modules in HypLL. The overall structure of the library is shown in Figure 1. The library is centered around four modules: (i) the tensors module, (ii) the manifolds module, (iii) the nn module, and (iv) the optim module. The modules are discussed sequentially below.

Figure 1: The structure of HypLL, centered around the tensors, manifolds, nn, and optim modules.

2.1 The tensors module
The tensors module forms the foundation and contains three important components: 1) the manifold tensor, 2) the manifold parameter, and 3) the tangent tensor. The first and third of these take over a part of the role of the original tensor class from the PyTorch library. The manifold parameter takes over the role of the PyTorch parameter class. As such, these classes form the basic objects for storing data throughout any computations with the other modules.

The manifold tensor is a class with three important properties: 1) a PyTorch tensor, 2) a manifold M from the manifolds module, which will be discussed in Subsection 2.2, and 3) a manifold dimension 𝑑 ∈ Z. The manifold indicates where the data — stored in the tensor object — lives. The manifold dimension indicates the dimension that stores the points on the manifold. For example, if we have a 2-dimensional manifold tensor with a Poincaré ball manifold and manifold dimension 1, then each row in our tensor contains a point on the Poincaré ball. By storing this additional data on the manifold tensor, we can later ensure that any operation applied to the manifold tensor is indeed allowed. For example, if an operation assumes data to be Euclidean, while the data is actually hyperbolic, we can easily point out the mistake.

Not all data lives on a manifold. For example, a tensor with label indices does not have an underlying manifold. In such cases we revert to the tensor class from PyTorch. Hence in HypLL, this class bears a slightly different interpretation: a tensor containing values which do not form vectors or points on a manifold.

The manifold parameter is simply a manifold tensor which subclasses the parameter class from PyTorch. This allows creating layers with points on a manifold as their parameters, which will prove important when discussing the nn module.

The tangent tensor is similar to the manifold tensor in that it stores metadata for a collection of data stored in its tensor attribute. However, here the data consists of vectors living in the tangent space of the manifold M that is within the manifold attribute of the tangent tensor. A tangent space, written as T𝑥 M, is defined by a manifold M and a point 𝑥 ∈ M. When working in hyperbolic space, it is convenient to have tangent vectors from various tangent spaces {T𝑥𝑖 M}𝑖 stored in the same tangent tensor. Therefore, the tangent tensor also contains a manifold tensor which has the same manifold and for which the tensor attribute is broadcastable with the tensor of tangent vectors. Allowing broadcastable tensors instead of tensors of the same shape makes these tangent tensors more flexible while reducing memory requirements. If this manifold tensor is set to None, every tangent vector is located at the origin of the manifold. Lastly, the tangent tensor contains a manifold dimension, which is again an integer indicating which dimension contains the vectors.

To summarize, the tangent tensor contains: a tensor attribute containing tangent vectors; a manifold attribute indicating the manifold to which the vectors are tangent; a manifold tensor attribute containing the points on the manifold where the tangent spaces are located, which is broadcastable with the tangent vectors; and a manifold dimension indicating the dimension of the vectors.

2.2 The manifolds module
The manifolds module contains the different manifolds that the library currently supports. These classes contain all of the usual operations that are defined on these manifolds and which are required for the operations defined in the nn module. Each different manifold subclasses the base manifold class, which is a metaclass containing the methods that each manifold should have. In the current implementation, we have focused on the Euclidean manifold and the Poincaré ball, the most commonly used model of hyperbolic space. The library is designed to be able to incorporate any other manifold as well in future updates, such as the hyperboloid and Klein models. The inclusion of the Euclidean manifold within this module is required for optimally providing flexible manifold-agnostic layers in the nn module.

Each manifold submodule contains the mathematical functions that define its operations, such as exponential maps, logarithmic maps, the Fréchet mean, vector addition and more. These operations apply checks to see if their inputs live on the correct manifold and, when there are multiple inputs, if the manifold dimensions of the inputs align properly. This is the largest contributor to our second design principle, as these operations significantly reduce the difficulty of debugging by explicitly disallowing mathematically nonsensical operations. Without such checks, operations can easily lead to silent failures and tricky bugs.
For a complete overview of the different operations defined on these manifolds, with implementations based on [11, 14, 23, 25], we refer to the source code.

Another important component of this module is the curvature of the Poincaré ball manifold. The curvature is a module containing a single parameter, which is used to compute the absolute value of the curvature and can be made learnable. The reason to use the absolute value of the curvature instead of the true negative value is to avoid having to add a minus sign throughout, which increases ease-of-use. This does not lead to down-stream issues as we only support non-positive curvature manifolds. Such a curvature object is supplied as input to the Poincaré ball class during initialization to define the curvature of the manifold.

Figure 2: Comparison of the implementation of a small convnet in HypLL (left) versus in PyTorch (right).

2.3 The nn module
The nn module is where all of the neural network methodology is implemented. It is structured similarly to the nn module from PyTorch and contains the currently available learning tools for Euclidean space and the Poincaré ball model. Similar to the classes in the PyTorch nn module, each of the classes in our nn module subclasses the Module class from PyTorch, which ensures that they will be properly registered for optimization. This module will be expanded whenever new methodology becomes available in the literature. An overview of the available layers is shown in Figure 1. The implementations are based on [11, 16, 19, 23].

Each of the layers in the nn module is designed to be manifold-agnostic. In practice, each layer is supplied with a manifold object and uses the operations defined on this manifold object to define its forward pass. So, when the supplied manifold is Euclidean space, the layers are equivalent to their PyTorch counterparts. Due to the usage of these manifold operations, all of the compatibility checks on the inputs are automatically built-in, which increases ease-of-use. Following our first design principle, this manifold argument that is supplied to each layer is the only difference between the signature of the original PyTorch layers and our layers. As a result, the only difference between building a neural network with HypLL compared to with PyTorch is having to define a manifold and supplying this manifold to the layers of the network.

2.4 The optim module
The optim module implements the Riemannian SGD and Riemannian Adam optimizers as introduced in [5], based on the implementations from [14]. These implementations work both with manifold parameters and PyTorch's parameters. When optimizing manifold parameters, the optimizers use the manifold of the manifold parameter for each of the operations that is performed during optimization. As a result, the checks are again built-in automatically through the manifold objects. When training with manifold parameters on the Euclidean manifold or with PyTorch parameters, the optimizers are equivalent to their PyTorch counterparts. Following our first design principle, initialization of these optimizers is identical to the initialization of the PyTorch optimizers. Moreover, these optimizers inherit from the base PyTorch optimizer, which makes them compatible with learning rate schedulers and other such tools.

3 EXAMPLE USAGE
To showcase how easy it becomes to define and train a neural network with HypLL, we will describe the similarities and differences with the usage of PyTorch. The major differences that come with using our library are 1) defining a manifold on which our data will live and on which our model will act; and 2) moving our input data onto this manifold as a pre-processing step.
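The second of these steps can be sketched in plain Python (illustrative helper names and the standard origin exponential map, not HypLL's actual tangent-tensor API): each Euclidean input is treated as a tangent vector at the origin and mapped onto the ball before it is fed to the model.

```python
import math

def expmap0(v, c=1.0):
    """Map a tangent vector at the origin onto the Poincare ball with
    curvature -c: exp_0(v) = tanh(sqrt(c)*|v|) * v / (sqrt(c)*|v|)."""
    n = math.sqrt(sum(a * a for a in v))
    if n == 0:
        return list(v)
    t = math.tanh(math.sqrt(c) * n) / (math.sqrt(c) * n)
    return [t * a for a in v]

# Step 1: choose the manifold (here: a unit-curvature Poincare ball).
c = 1.0
# Step 2: move a batch of Euclidean inputs onto the ball as a
# pre-processing step; every mapped point lands strictly inside it.
batch = [[0.5, -1.2], [3.0, 4.0]]
batch_on_ball = [expmap0(x, c) for x in batch]
assert all(sum(a * a for a in x) < 1.0 for x in batch_on_ball)
```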
For this example, we will use the CIFAR-10 tutorial from the PyTorch documentation¹, which we have also adapted as a tutorial for our library. We will show several steps involved in training this small convolutional network and compare it to the PyTorch implementation.

Creating a network. We start by defining the manifold and then using this manifold to define a small convolutional network. We will use a Poincaré ball with curvature -1 for this example. The implementations of this network in HypLL and in PyTorch are shown side-by-side in Figure 2. The only true difference is that we define a manifold and supply it to the model layers in the HypLL code. Adding hyperbolic geometry to a network is as simple as that with this library.

Feeding data to our model. Second, we show part of the training loop, where we only show the part in which our implementation differs from PyTorch for brevity. Mapping Euclidean vectors to hyperbolic space is usually performed by assuming the vectors to be tangent vectors at the origin and then mapping these to the hyperbolic space using the exponential map. We will use this approach here as well. The example is shown in Figure 3.

Figure 3: Feeding data to a model written with HypLL.

So, when using HypLL, a little bit of logic has to be added to move the inputs to the manifold on which the network acts. Namely, we first wrap the inputs in a tangent tensor with no manifold tensor argument and then map them using the exponential map of the Poincaré ball. This operation is left to the user so they have full control over how to move their inputs to hyperbolic space. Aside from that, nothing else changes, making hyperbolic deep learning a tool that can be used by a broad audience.

4 CONCLUSIONS AND OUTLOOK
This paper presents the Hyperbolic Learning Library, enabling researchers and practitioners to perform deep learning in hyperbolic space without hassle, a new and open-ended research direction. HypLL is designed to make the step from PyTorch minimal and to keep debugging easy by tracking manifolds. The library is a continual effort, where the latest advances in the field are continually integrated, and it forms a central point to work on challenges in the field, such as increasing stability in optimization and performing learning at large scale. The main structure follows the ideology of PyTorch [21] with corresponding modules. In the future, we strive to build a geometric framework on top of the library for graph-based data, in the spirit of PyTorch Geometric (PyG) [9].

¹ [Link] (07-06-2023)

REFERENCES
[1] 2022. HyperLib: Deep learning in the Hyperbolic space. [Link]nalexai/hyperlib.
[2] Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dandelion Mané, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Mike Schuster, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2015. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems.
[3] Seth D. Axen, Mateusz Baran, Ronny Bergmann, and Krzysztof Rzecki. 2021. [Link]: An Extensible Julia Framework for Data Analysis on Manifolds. arXiv:2106.08777.
[4] Gregor Bachmann, Gary Bécigneul, and Octavian Ganea. 2020. Constant curvature graph convolutional networks. In ICML.
[5] Gary Bécigneul and Octavian-Eugen Ganea. 2018. Riemannian adaptive optimization methods. arXiv:1810.00760 (2018).
[6] Michael M Bronstein, Joan Bruna, Yann LeCun, Arthur Szlam, and Pierre Vandergheynst. 2017. Geometric deep learning: going beyond Euclidean data. IEEE Signal Processing Magazine (2017).
[7] Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, and Zheng Zhang. 2015. MXNet: A flexible and efficient machine learning library for heterogeneous distributed systems. arXiv:1512.01274 (2015).
[8] Aleksandr Ermolov, Leyla Mirvakhabova, Valentin Khrulkov, Nicu Sebe, and Ivan Oseledets. 2022. Hyperbolic Vision Transformers: Combining Improvements in Metric Learning. In CVPR.
[9] Matthias Fey and Jan E. Lenssen. 2019. Fast Graph Representation Learning with PyTorch Geometric. In ICLRw.
[10] Octavian Ganea, Gary Bécigneul, and Thomas Hofmann. 2018. Hyperbolic entailment cones for learning hierarchical embeddings. In ICML.
[11] Octavian Ganea, Gary Bécigneul, and Thomas Hofmann. 2018. Hyperbolic neural networks. In NeurIPS.
[12] Mina Ghadimi Atigh, Julian Schoep, Erman Acar, Nanne van Noord, and Pascal Mettes. 2022. Hyperbolic Image Segmentation. In CVPR.
[13] Valentin Khrulkov, Leyla Mirvakhabova, Evgeniya Ustinova, Ivan Oseledets, and Victor Lempitsky. 2020. Hyperbolic image embeddings. In CVPR.
[14] Max Kochurov, Rasul Karimov, and Serge Kozlukov. 2020. Geoopt: Riemannian Optimization in PyTorch. arXiv:2005.02819 (2020).
[15] Qi Liu, Maximilian Nickel, and Douwe Kiela. 2019. Hyperbolic graph neural networks. In NeurIPS.
[16] Aaron Lou, Isay Katsman, Qingxuan Jiang, Serge Belongie, Ser-Nam Lim, and Christopher De Sa. 2020. Differentiating through the Fréchet mean. In ICML.
[17] Pascal Mettes, Mina Ghadimi Atigh, Martin Keller-Ressel, Jeffrey Gu, and Serena Yeung. 2023. Hyperbolic Deep Learning in Computer Vision: A Survey. arXiv:2305.06611 (2023).
[18] Nina Miolane, Nicolas Guigui, Alice Le Brigant, Johan Mathe, Benjamin Hou, Yann Thanwerdas, Stefan Heyder, Olivier Peltre, Niklas Koep, Hadi Zaatiti, Hatem Hajri, Yann Cabanes, Thomas Gerald, Paul Chauchat, Christian Shewmake, Daniel Brooks, Bernhard Kainz, Claire Donnat, Susan Holmes, and Xavier Pennec. 2020. Geomstats: A Python Package for Riemannian Geometry in Machine Learning. JMLR (2020).
[19] Maximillian Nickel and Douwe Kiela. 2017. Poincaré embeddings for learning hierarchical representations. In NeurIPS.
[20] Natalya Fridman Noy and Carole D Hafner. 1997. The state of the art in ontology design: A survey and comparative review. AI Magazine (1997).
[21] Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. 2019. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In NeurIPS.
[22] Wei Peng, Tuomas Varanka, Abdelrahman Mostafa, Henglin Shi, and Guoying Zhao. 2021. Hyperbolic deep neural networks: A survey. IEEE TPAMI (2021).
[23] Ryohei Shimizu, Yusuke Mukuta, and Tatsuya Harada. 2021. Hyperbolic neural networks++. In ICLR.
[24] Alexandru Tifrea, Gary Bécigneul, and Octavian-Eugen Ganea. 2019. Poincaré Glove: Hyperbolic word embeddings. In ICLR.
[25] Abraham Albert Ungar. 2008. A gyrovector space approach to hyperbolic geometry. Synthesis Lectures on Mathematics and Statistics (2008).
[26] Liping Wang, Fenyu Hu, Shu Wu, and Liang Wang. 2021. Fully hyperbolic graph convolution network for recommendation. In CIKM.
[27] Menglin Yang, Min Zhou, Zhihao Li, Jiahong Liu, Lujia Pan, Hui Xiong, and Irwin King. 2022. Hyperbolic graph neural networks: A review of methods and applications. arXiv:2202.13852 (2022).