
THE ERLANGEN PROGRAMME OF ML

Geometric foundations of Deep Learning


Geometric Deep Learning is an attempt at a geometric unification of a broad class of ML problems from the perspectives of symmetry and invariance. These principles not only underlie the breakthrough performance of convolutional neural networks and the recent success of graph neural networks but also provide a principled way to construct new types of problem-specific inductive biases.

Michael Bronstein
Published in Towards Data Science
13 min read · Apr 28, 2021

This blog post was co-authored with Joan Bruna, Taco Cohen, and Petar Veličković and is
based on the new “proto-book” M. M. Bronstein, J. Bruna, T. Cohen, and P. Veličković,
Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges (2021), Petar’s
talk at Cambridge and Michael’s keynote talk at ICLR 2021.

In October 1872, the philosophy faculty of a small university in the Bavarian city of Erlangen appointed a new young professor. As customary, he was
requested to deliver an inaugural research programme, which he published under
the somewhat long and boring title Vergleichende Betrachtungen über neuere
geometrische Forschungen (“A comparative review of recent researches in geometry”).
The professor was Felix Klein, only 23 years of age at that time, and his inaugural
work has entered the annals of mathematics as the “Erlangen Programme” [1].


Felix Klein and his Erlangen Programme. Image: Wikipedia/University of Michigan Historical Math
Collections.

The nineteenth century had been remarkably fruitful for geometry. For the first
time in nearly two thousand years after Euclid, the construction of projective
geometry by Poncelet, hyperbolic geometry by Gauss, Bolyai, and Lobachevsky, and
elliptic geometry by Riemann showed that an entire zoo of diverse geometries was
possible. However, these constructions had quickly diverged into independent and
unrelated fields, with many mathematicians of that period questioning how the
different geometries are related to each other and what actually defines a geometry.

The breakthrough insight of Klein was to approach the definition of geometry as the
study of invariants, or in other words, structures that are preserved under a certain
type of transformations (symmetries). Klein used the formalism of group theory to
define such transformations and used the hierarchy of groups and their subgroups in
order to classify different geometries arising from them. Thus, the group of rigid
motions leads to the traditional Euclidean geometry, while affine or projective
transformations produce, respectively, the affine and projective geometries.
Importantly, the Erlangen Programme was limited to homogeneous spaces [2] and initially excluded Riemannian geometry.

Klein’s Erlangen Programme approached geometry as the study of properties remaining invariant under
certain types of transformations. 2D Euclidean geometry is defined by rigid transformations (modeled as the
isometry group) that preserve areas, distances, and angles, and thus also parallelism. Affine transformations
preserve parallelism, but neither distances nor areas. Finally, projective transformations have the weakest
invariance, with only intersections and cross-ratios preserved, and correspond to the largest group among
the three. Klein thus argued that projective geometry is the most general one.

The impact of the Erlangen Program on geometry and mathematics broadly was
very profound. It also spilled over to other fields, especially physics, where symmetry considerations allowed conservation laws to be derived from first principles — an astonishing result known as Noether’s Theorem [3]. It took several decades until this
fundamental principle — through the notion of gauge invariance (in its generalised
form developed by Yang and Mills in 1954) — proved successful in unifying all the
fundamental forces of nature with the exception of gravity. This is what is called the
Standard Model and it describes all the physics we currently know. We can only
repeat the words of a Nobel-winning physicist, Philip Anderson [4], that

“it is only slightly overstating the case to say that physics is the study of symmetry.”

We believe that the current state of affairs in the field of deep (representation) learning is reminiscent of the situation of geometry in the nineteenth century: on the one hand, in the past decade, deep learning has brought a
revolution in data science and made possible many tasks previously thought to be
beyond reach — whether computer vision, speech recognition, natural language
translation, or playing Go. On the other hand, we now have a zoo of different neural
network architectures for different kinds of data, but few unifying principles. As a
consequence, it is difficult to understand the relations between different methods,
which inevitably leads to the reinvention and re-branding of the same concepts.

Deep learning today: a zoo of architectures, few unifying principles. Animal images: ShutterStock.

Geometric Deep Learning is an umbrella term we introduced in [5], referring to recent attempts to come up with a geometric unification of ML similar to Klein’s Erlangen Programme. It serves two purposes: first, to provide a common mathematical framework to derive the most successful neural network architectures, and second, to give a constructive procedure to build future architectures in a principled way.


Supervised machine learning in its simplest setting is essentially a function estimation problem: given the outputs of some unknown function on a training set (e.g. labelled dog and cat images), one tries to find a function f from some hypothesis class that fits the training data well and allows one to predict the outputs on previously unseen inputs. In the past decade, the availability of large, high-quality datasets such as ImageNet coincided with growing computational resources (GPUs), allowing the design of rich function classes that have the capacity to interpolate such large datasets.

Neural networks appear to be a suitable choice to represent functions, because even the simplest architecture like the Perceptron can produce a dense class of functions when using just two layers, allowing one to approximate any continuous function to any desired accuracy — a property known as Universal Approximation [6].

Multilayer perceptrons are universal approximators: with just one hidden layer, they can represent combinations of step functions, allowing them to approximate any continuous function with arbitrary precision.
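As a toy illustration of this property, here is a minimal NumPy sketch (our own, not from the original post): random hidden-layer weights are sampled and only the output layer is fit by least squares, and the error of approximating a smooth target shrinks as the hidden width grows. The target function, width, and random-feature fitting scheme are illustrative choices only.

```python
import numpy as np

rng = np.random.default_rng(0)

def target(x):
    # any continuous function would do for the illustration
    return np.sin(3 * x) + 0.5 * np.cos(7 * x)

x = np.linspace(-1, 1, 500)[:, None]     # training inputs, shape (500, 1)
y = target(x).ravel()

width = 200                              # number of hidden units
W = rng.normal(size=(1, width)) * 5.0    # random hidden-layer weights
b = rng.uniform(-5, 5, size=width)       # random hidden-layer biases
H = np.tanh(x @ W + b)                   # hidden activations, shape (500, width)

# fit only the output layer by least squares: f(x) ≈ H(x) @ a
a, *_ = np.linalg.lstsq(H, y, rcond=None)
print("max approximation error:", np.abs(H @ a - y).max())
```

Increasing `width` typically drives the error further down, which illustrates, rather than proves, the approximation property.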

The setting of this problem in low dimensions is a classical problem in approximation theory that has been studied extensively, with precise mathematical control of estimation errors. But the situation is entirely different in high dimensions: one can quickly see that in order to approximate even a simple class of, e.g., Lipschitz-continuous functions, the number of samples grows exponentially with
the dimension — a phenomenon known colloquially as the “curse of
dimensionality”. Since modern machine learning methods need to operate with data
in thousands or even millions of dimensions, the curse of dimensionality is always
there behind the scenes making such a naive approach to learning impossible.

Illustration of the curse of dimensionality: in order to approximate a Lipschitz-continuous function composed of Gaussian kernels placed in the quadrants of a d-dimensional unit hypercube (blue) with error ε, one requires 𝒪(1/εᵈ) samples (red points).

This is perhaps best seen in computer vision problems like image classification.
Even tiny images tend to be very high-dimensional, but intuitively they have a lot of
structure that is broken and thrown away when one parses the image into a vector to
feed it into the Perceptron. If the image is now shifted by just one pixel, the
vectorised input will be very different, and the neural network will need to be
shown a lot of examples in order to learn that shifted inputs must be classified in the
same way [7].
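A toy NumPy check makes the point concrete (a random array stands in for an image here, so the numbers are purely illustrative): shifting by a single pixel leaves the image semantically identical, yet its vectorised representation moves a long way.

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((28, 28))               # stand-in for a tiny grayscale image
shifted = np.roll(img, shift=1, axis=1)  # shift every row by one pixel

v, v_shifted = img.ravel(), shifted.ravel()
rel_change = np.linalg.norm(v - v_shifted) / np.linalg.norm(v)
print(f"relative change of the vectorised input after a 1-pixel shift: {rel_change:.2f}")
```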

Fortunately, in many cases of high-dimensional ML problems we have an additional structure that comes from the geometry of the input signal. We call this structure a “symmetry prior” and it is a general powerful principle that gives us optimism in dimensionality-cursed problems. In our example of image classification, the input image x is not just a d-dimensional vector, but a signal defined on some domain Ω, which in this case is a two-dimensional grid. The structure of the domain is captured by a symmetry group 𝔊 — the group of 2D translations in our example — which acts on the points on the domain. In the space of signals 𝒳(Ω), the group actions (elements of the group, 𝔤∈𝔊) on the underlying domain are manifested through what is called the group representation ρ(𝔤) — in our case, it is simply the shift operator, a d×d matrix that acts on a d-dimensional vector [8].

Illustration of geometric priors: the input signal (image x∈𝒳(Ω)) is defined on the domain (grid Ω), whose symmetry (translation group 𝔊) acts in the signal space through the group representation ρ(𝔤) (shift operator). Making an assumption on how the function f (e.g. an image classifier) interacts with the group restricts the hypothesis class.
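For readers who prefer code, here is a small sketch of the shift operator as a group representation, assuming a 1D cyclic grid for simplicity (the 2D case works the same way, just with a larger permutation matrix). The helper `shift_operator` is our own illustrative construction, not code from the post or the proto-book.

```python
import numpy as np

def shift_operator(n, g):
    """Return the n×n permutation matrix representing a cyclic shift by g positions."""
    rho = np.zeros((n, n))
    rho[np.arange(n), (np.arange(n) - g) % n] = 1.0
    return rho

n, g = 8, 3
x = np.arange(n, dtype=float)            # a toy signal on the 1D grid

# acting with ρ(g) on the vectorised signal is the same as shifting the signal
assert np.allclose(shift_operator(n, g) @ x, np.roll(x, g))

# ρ(g)ρ(h) = ρ(g + h): composing shifts corresponds to multiplying their matrices
assert np.allclose(shift_operator(n, 2) @ shift_operator(n, 3), shift_operator(n, 5))
print("the shift operator is a valid representation of the translation group")
```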

The geometric structure of the domain underlying the input signal imposes structure on the class of functions f that we are trying to learn. One can have invariant functions that are unaffected by the action of the group, i.e., f(ρ(𝔤)x)=f(x) for any 𝔤∈𝔊 and x. On the other hand, one may have a case where the function has the same input and output structure and is transformed in the same way as the input — such functions are called equivariant and satisfy f(ρ(𝔤)x)=ρ(𝔤)f(x) [9]. In the realm of computer vision, image classification is a good illustration of a setting where one would desire an invariant function (e.g. no matter where a cat is located in the image, we still want to classify it as a cat), while image segmentation, where the output is a pixel-wise label mask, is an example of an equivariant function (the segmentation mask should follow the transformation of the input image).
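The two definitions are easy to verify numerically. Below is a toy check of invariance and equivariance under cyclic shifts (our own example: global sum pooling plays the role of an invariant function, and a small circular convolution plays the role of an equivariant one).

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(16)                        # a signal on a cyclic 1D grid
rho = lambda s: np.roll(s, 4)             # group action: shift by 4 pixels

f_inv = lambda s: s.sum()                 # global pooling: an invariant function
f_eqv = lambda s: 0.25 * np.roll(s, 1) + 0.5 * s + 0.25 * np.roll(s, -1)  # circular conv: equivariant

assert np.isclose(f_inv(rho(x)), f_inv(x))          # f(ρ(g)x) = f(x)
assert np.allclose(f_eqv(rho(x)), rho(f_eqv(x)))    # f(ρ(g)x) = ρ(g)f(x)
```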

Another powerful geometric prior is “scale separation”. In some cases, we can construct a multiscale hierarchy of domains (Ω and Ω’ in the figure below) by “assimilating” nearby points and producing also a hierarchy of signal spaces that are related by a coarse-graining operator P. On these coarse scales, we can apply coarse-scale functions. We say that a function f is locally stable if it can be approximated as the composition of the coarse-graining operator P and the coarse-scale function, f≈f’∘P. While f might depend on long-range dependencies, if it is locally stable, these can be separated into local interactions that are then propagated towards the coarse scales [10].

Illustration of scale separation, where we can approximate a fine-level function f as the composition f≈f’∘P of
a coarse-level function f’ and a coarse-graining operator P.
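A minimal sketch of this idea on a 1D grid (all choices here, the pairwise-averaging P, the smooth test signal, and the particular f, are our own illustrative assumptions):

```python
import numpy as np

x = np.sin(np.linspace(0, 2 * np.pi, 64))   # a smooth fine-scale signal on Ω

def P(s):
    """Coarse-graining operator Ω -> Ω': average neighbouring pairs of points."""
    return s.reshape(-1, 2).mean(axis=1)

f = lambda s: np.mean(s ** 2)               # fine-scale function (here: mean energy)
f_coarse = lambda s: np.mean(s ** 2)        # corresponding coarse-scale function f'

# for a locally stable f, the fine-scale value is well approximated through the coarse scale
print("f(x)     =", round(f(x), 4))
print("f'(P(x)) =", round(f_coarse(P(x)), 4))   # close to f(x): f ≈ f'∘P
```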

These two principles give us a very general blueprint of Geometric Deep Learning that can be recognised in the majority of popular deep neural architectures used for representation learning: a typical design consists of a sequence of equivariant layers (e.g. convolutional layers in CNNs), possibly followed by an invariant global pooling layer aggregating everything into a single output. In some cases, it is also possible to create a hierarchy of domains by some coarsening procedure that takes the form of local pooling.


Geometric Deep Learning blueprint.
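To make the blueprint concrete, here is a schematic NumPy sketch of such a design: a few shift-equivariant layers (circular convolutions with a pointwise nonlinearity), local pooling for coarsening, and an invariant global pooling at the end. It is a caricature of a CNN for illustration, not an implementation from the proto-book.

```python
import numpy as np

rng = np.random.default_rng(0)

def equivariant_layer(x, kernel):
    """Circular convolution + ReLU: commutes with cyclic shifts of the grid."""
    out = sum(k * np.roll(x, i - len(kernel) // 2) for i, k in enumerate(kernel))
    return np.maximum(out, 0.0)

def local_pool(x):
    """Coarsen the grid by averaging neighbouring pairs of points."""
    return x.reshape(-1, 2).mean(axis=1)

def model(x, kernels):
    for k in kernels:                      # a sequence of equivariant layers
        x = equivariant_layer(x, k)
        x = local_pool(x)                  # optional coarsening between layers
    return x.mean()                        # invariant global pooling -> single output

x = rng.random(32)                         # an input signal on a 1D grid
kernels = [rng.normal(size=3) for _ in range(3)]
print("model output:", model(x, kernels))
```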

This is a very general design that can be applied to different types of geometric
structures, such as grids, homogeneous spaces with global transformation groups,
graphs (and sets, as a particular case), and manifolds, where we have global isometry
invariance and local gauge symmetries. The implementation of these principles
leads to some of the most popular architectures that exist today in deep learning:
Convolutional Networks (CNNs), emerging from translational symmetry, Graph
Neural Networks, DeepSets [11], and Transformers [12], implementing permutation
invariance, gated RNNs (such as LSTM networks) that are invariant to time warping
[13], and Intrinsic Mesh CNNs [14] used in computer graphics and vision, that can be
derived from gauge symmetry.
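For instance, the permutation-invariant case mentioned above can be written in a few lines following the general DeepSets recipe of summing per-element features and applying a readout; the particular feature map and readout below are arbitrary illustrative choices, not the architecture from [11].

```python
import numpy as np

rng = np.random.default_rng(0)

phi = lambda x: np.stack([x, x ** 2, np.sin(x)], axis=-1)   # per-element feature map
readout = lambda z: z.sum()                                 # readout applied to the aggregate

def deep_set(xs):
    # summing over set elements makes the output independent of their ordering
    return readout(phi(xs).sum(axis=0))

xs = rng.random(10)
assert np.isclose(deep_set(xs), deep_set(rng.permutation(xs)))
print("permutation-invariant output:", deep_set(xs))
```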

The “5G” of Geometric Deep Learning: Grids, Groups (homogeneous spaces with global symmetries), Graphs (and sets as a particular case), and Manifolds, where geometric priors are manifested through global isometry invariance (which can be expressed using Geodesics) and local Gauge symmetries.

In future posts, we will be exploring in further detail the instances of the Geometric Deep Learning blueprint on the “5G” [15]. As a final note, we should emphasize that symmetry has historically been a central concept in many fields of science, of which physics, as already mentioned in the beginning, is key. In the machine learning community, the importance of symmetry has long been recognised in particular in the applications to pattern recognition and computer vision, with early works on equivariant feature detection dating back to Shun’ichi Amari [16] and Reiner Lenz [17]. In the neural networks literature, the Group Invariance Theorem for Perceptrons by Marvin Minsky and Seymour Papert [18] put fundamental limitations on the capabilities of (single-layer) perceptrons to learn invariants. This was one of the primary motivations for studying multi-layer architectures [19–20], which ultimately led to deep learning.

[1] According to a popular belief, repeated in many sources including Wikipedia, the Erlangen Programme was delivered in Klein’s inaugural address in October 1872. Klein indeed gave such a talk (though on December 7, 1872), but it was for a non-mathematical audience and concerned primarily his ideas of mathematical education. What is now called the “Erlangen Programme” was actually the aforementioned brochure Vergleichende Betrachtungen, subtitled Programm zum Eintritt in die philosophische Fakultät und den Senat der k. Friedrich-Alexanders-Universität zu Erlangen (“Programme for entry into the Philosophical Faculty and the Senate of the Emperor Friedrich-Alexander University of Erlangen”, see an English translation). While Erlangen claims the credit, Klein stayed there for only three years, moving in 1875 to the Technical University of Munich (then called Technische Hochschule), followed by Leipzig (1880), and finally settling down in Göttingen from 1886 until his retirement. See R. Tobies, Felix Klein — Mathematician, Academic Organizer, Educational Reformer (2019), in: H. G. Weigand et al. (eds), The Legacy of Felix Klein, Springer.

[2] A homogeneous space is a space where “all points are the same” and any point
can be transformed into another by means of group action. This is the case for all
geometries proposed before Riemann, including Euclidean, affine, and projective,
as well as the first non-Euclidean geometries on spaces of constant curvature such
as the sphere or hyperbolic space. It took substantial effort and nearly 50 years,
notably by Élie Cartan and the French geometry school, to extend Klein’s ideas to
manifolds.

[3] Klein himself has probably anticipated the potential of his ideas in physics,
complaining of “how persistently the mathematical physicist disregards the
advantages afforded him in many cases by only a moderate cultivation of the
projective view”. By that time, it was already common to think of physical systems
through the perspective of the calculus of variations, deriving the differential equations governing such systems from the “least action principle”, i.e. as the
minimiser of some functional (action). In a paper published in 1918, Emmy Noether
showed that every (differentiable) symmetry of the action of a physical system has a
corresponding conservation law. This, by all means, was a stunning result:
beforehand, meticulous experimental observation was required to discover
fundamental laws such as the conservation of energy, and even then, it was an
empirical result not coming from anywhere. For historical notes, see C. Quigg,
Colloquium: A Century of Noether’s Theorem (2019), arXiv:1902.01989.

[4] P. W. Anderson, More is different (1972), Science 177(4047):393–396.

[5] M. M. Bronstein et al. Geometric deep learning: going beyond Euclidean data
(2017), IEEE Signal Processing Magazine 34(4):18–42 attempted to unify learning on
grids, graphs, and manifolds from the spectral analysis perspective. The term
“geometric deep learning” was actually coined earlier, in Michael’s ERC grant
proposal.

[6] There are multiple versions of the Universal Approximation theorem. It is usually
credited to G. Cybenko, Approximation by superpositions of a sigmoidal function
(1989) Mathematics of Control, Signals, and Systems 2(4):303–314 and K. Hornik,
Approximation capabilities of multilayer feedforward networks (1991), Neural
Networks 4(2):251–257.

[7] The remedy for this problem in computer vision came from classical works in
neuroscience by Hubel and Wiesel, the winners of the Nobel prize in medicine for
the study of the visual cortex. They showed that brain neurons are organised into
local receptive fields, which served as an inspiration for a new class of neural
architectures with local shared weights, first the Neocognitron in K. Fukushima, A
self-organizing neural network model for a mechanism of pattern recognition
unaffected by shift in position (1980), Biological Cybernetics 36(4):193–202, and then
the Convolutional Neural Networks, the seminal work of Y. LeCun et al., Gradient-
based learning applied to document recognition (1998), Proc. IEEE 86(11):2278–2324,
where weight sharing across the image effectively solved the curse of
dimensionality.


[8] Note that a group is defined as an abstract object, without saying what the group
elements are (e.g. transformations of some domain), only how they compose. Hence,
very different kinds of objects may have the same symmetry group.

[9] These results can be generalised for the case of approximately invariant and
equivariant functions, see e.g. J. Bruna and S. Mallat, Invariant scattering
convolution networks (2013), Trans. PAMI 35(8):1872–1886.

[10] Scale separation is a powerful principle exploited in physics, e.g. in the Fast
Multipole Method (FMM), a numerical technique originally developed to speed up
the calculation of long-range forces in n-body problems. FMM groups sources that
lie close together and treats them as a single source.

[11] M. Zaheer et al., Deep Sets (2017), NIPS. In the computer graphics community, a
similar architecture was proposed in C. R. Qi et al., PointNet: Deep Learning on
Point Sets for 3D Classification and Segmentation (2017), CVPR.

[12] A. Vaswani et al., Attention is all you need (2017), NIPS, introduced the now
popular Transformer architecture. It can be considered as a graph neural network
with a complete graph.

[13] C. Tallec and Y. Ollivier, Can recurrent neural networks warp time? (2018),
arXiv:1804.11188.

[14] J. Masci et al., Geodesic convolutional neural networks on Riemannian manifolds (2015), arXiv:1501.06297 was the first convolutional-like neural network
architecture with filters applied in local coordinate charts on meshes. It is a
particular case of T. Cohen et al., Gauge Equivariant Convolutional Networks and
the Icosahedral CNN (2019), arXiv:1902.04615.

[15] M. M. Bronstein, J. Bruna, T. Cohen, and P. Veličković, Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges (2021).

[16] S.-I. Amari, Feature spaces which admit and detect invariant signal transformations (1978), Joint Conf. Pattern Recognition. Amari is also famous as the pioneer of the field of information geometry, which studies statistical manifolds of probability distributions using tools of differential geometry.

[17] R. Lenz, Group theoretical methods in image processing (1990), Springer.

[18] M. Minsky and S. A. Papert, Perceptrons: An introduction to computational geometry (1987), MIT Press. This is the second edition of the (in)famous book
blamed for the first “AI winter”, which includes additional results and responds to
some of the criticisms of the earlier 1969 version.

[19] T. J. Sejnowski, P. K. Kienker, and G. E. Hinton, Learning symmetry groups with hidden units: Beyond the perceptron (1986), Physica D: Nonlinear Phenomena 22(1–3):260–275.

[20] J. Shawe-Taylor, Building symmetries into feedforward networks (1989), ICANN. The first work that can be credited with taking a representation-theoretical view on invariant and equivariant neural networks is J. Wood and J. Shawe-Taylor, Representation theory and invariant neural networks (1996), Discrete Applied Mathematics 69(1–2):33–60. In the “modern era” of deep learning, building symmetries into neural networks was done by R. Gens and P. M. Domingos, Deep symmetry networks (2014), NIPS (see also Pedro Domingos’ invited talk at ICLR 2014).

We are grateful to Ben Chamberlain for proofreading this post and to Yoshua Bengio,
Charles Blundell, Andreea Deac, Fabian Fuchs, Francesco di Giovanni, Marco Gori, Raia
Hadsell, Will Hamilton, Maksym Korablyov, Christian Merkwirth, Razvan Pascanu,
Bruno Ribeiro, Anna Scaife, Jürgen Schmidhuber, Marwin Segler, Corentin Tallec, Ngân
Vu, Peter Wirnsberger, and David Wong for their feedback on different parts of the text on
which this post is based. We also thank Xiaowen Dong and Pietro Liò for helping us break
the “stage fright” and present early versions of our work.

See additional information on the project webpage, Towards Data Science Medium posts, subscribe to Michael’s posts and YouTube channel, get Medium membership, or follow Michael, Joan, Taco, and Petar on Twitter.

