TP5-1.ipynb - Colab
The aim of this notebook is to describe the dynamics of a non-linear dynamical system by
means of the Koopman theory.
Introduction
We consider a quantity $x \in \mathbb{R}^n$ (a vector) which evolves with time, following a dynamical system. Think for example of the joint location of the planets in our solar system, which follows the law of gravitation.

$$\dot{x}(t) := \frac{\mathrm{d}x(t)}{\mathrm{d}t} = f(x(t)) \qquad (1)$$

where $\dot{x}(t)$ is the temporal derivative, and $f : \mathbb{R}^n \to \mathbb{R}^n$ is a given map describing the dynamics.
For a given f, it is not always possible to solve the differential equation (1) analytically. For this reason, numerical schemes are usually employed instead, to integrate equation (1) in time t, so as to propagate the initial condition x(0) up to a desired time T; think of $x(T) = x(0) + \int_{t=0}^{T} f(x(t))\,\mathrm{d}t$. The discretization in time of eq. (1) or of the integral introduces numerical approximations, and yields estimates of x(T) of varying quality depending on the discretization scheme.
In the field of numerical simulations, discretization schemes have been studied for a long time, and numerical solvers already exist to provide good estimates of integrals (far better than with the naive discretization $x_{k+\delta} = x_k + \delta f(x_k)$ for a discrete time increment δ, which induces an $O(\delta^2)$ error at each time step).
The goal of this practical session is to make use of such numerical solvers to improve the
learning of dynamical systems with neural networks.
import numpy as np
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt
import sys
from tqdm import tqdm
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.autograd as autograd
from torch.utils.data import TensorDataset, DataLoader
from sklearn.model_selection import train_test_split
We consider the (undamped, unforced) Duffing oscillator:

$$\dot{x}_1 = x_2, \qquad \dot{x}_2 = x_1 - x_1^3$$
To integrate the ODEs in time, a 4th-order Runge-Kutta scheme can be used.
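The integration loop below assumes that the dynamics, the initial conditions and the time grid have already been defined. A minimal sketch of such a setup (the function name duffing and the array names match the loop; the number of trajectories, the horizon t_max, the number of time steps and the sampling range of the initial conditions are assumptions):

def duffing(array_x):
    # undamped, unforced Duffing dynamics: x1' = x2, x2' = x1 - x1^3
    x1, x2 = array_x
    return np.array([x2, x1 - x1**3])

n_trajectories = 100   # assumption
t_max = 10.0           # assumption
n_steps = 500          # assumption
array_t = np.linspace(0.0, t_max, n_steps)
matrix_x0 = np.random.uniform(-2.0, 2.0, size=(n_trajectories, 2))  # random initial conditions in [-2, 2]^2
array3d_xt = np.zeros((n_trajectories, 2, n_steps))                 # (trajectory, state dimension, time)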
dim_system = 2

for i in tqdm(range(matrix_x0.shape[0])):
    # Lambda function is used as solve_ivp requires a function of the form f(t, x)
    ode_result = solve_ivp(lambda _t, array_x: duffing(array_x),
                           [0, t_max],
                           matrix_x0[i],
                           method='RK45',
                           t_eval=array_t)
    array3d_xt[i, :] = ode_result.y
cm = plt.cm.tab10   # colormap for the plotted trajectories (assumed; any colormap with 10 distinct colors works)
print(cm)

fig, ax = plt.subplots(figsize=(6, 6))   # figure/axes creation (assumed)
for i in range(10):
    ax.plot(array3d_xt[i, 0, :], array3d_xt[i, 1, :], lw=0.5, color=cm(i))
    ax.plot(array3d_xt[i, 0, 0], array3d_xt[i, 1, 0], 'o', lw=1.5, color=cm(i))  # initial condition
ax.set_xlabel('$x_1$', fontsize=20)
ax.set_ylabel('$x_2$', fontsize=20)
plt.show()
Consider now the discrete-time dynamics

$$x_{k+1} = F(x_k)$$

where F might be the δ-discretised flow map of the continuous dynamical system in eq. (1), given by $F(x_k) = x_k + \int_{t=k}^{k+\delta} f(x(t))\,\mathrm{d}t$, and $X = (x_k)_{k=0}^{N}$ the discrete time series of the system state.
The Koopman theory states that there exists an infinite-dimensional linear operator K that advances in time all observable functions $(g_i)_{i=1}^{m}$, given by $g_i : \mathbb{R}^n \to \mathbb{R}$.
This way, the non-linear dynamics of x, described by F, can be turned into a linear dynamical system, described by K, acting on another representation space, formed by the observable quantities $g_i(x)$.

Indeed, let $g_i$ be an observable function and denote $g_i^k := g_i(x_k)$. Using the previous equation, the time evolution of the observables is given by $g_i^{k+1} = g_i(x_{k+1}) = (g_i \circ F)(x_k)$; the linearised dynamics of the observables is then given by the following equation:

$$g_i^{k+1} = K g_i^{k}$$
To project back the dynamics from the Koopman space ($\mathbb{R}^m$, where g(x) lives) to the phase space ($\mathbb{R}^n$, where x lives), a supplementary function $\varphi : \mathbb{R}^m \to \mathbb{R}^n$ is needed. Going from x to the Koopman space and back yields $\varphi \circ g = \mathrm{Id}$.
Under this condition, the functions g, φ and K can be parametrized as $g_\theta$, $\varphi_\rho$ and $K_\phi$, and the parameters θ, ρ and ϕ can be learned by minimizing suitable loss functions.
For this purpose, given a time series $X = \{x_k \mid k = 1 \ldots N\}$, the following conditions should hold:

1. Reconstruction error: $\|\varphi_\rho(g_\theta(x_k)) - x_k\| = 0$
2. Prediction error in the Koopman space: $\|K_\phi\, g_\theta(x_k) - g_\theta(x_{k+1})\| = 0$
3. Prediction error in the phase space: $\|\varphi_\rho(K_\phi\, g_\theta(x_k)) - x_{k+1}\| = 0$
These three errors can be used as loss functions to train three different neural networks. Together, these networks compose our architecture, which can be summarized in the following sketch:
[architecture sketch]
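The snapshot pairs used below can be assembled from the simulated trajectories; one possible construction (a sketch: matrix_x_data stacks every state x_k and matrix_x_next_data the corresponding successor x_{k+1}):

matrix_x_data = array3d_xt[:, :, :-1].transpose(0, 2, 1).reshape(-1, dim_system)
matrix_x_next_data = array3d_xt[:, :, 1:].transpose(0, 2, 1).reshape(-1, dim_system)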
(matrix_x_data_train,
 matrix_x_data_test,
 matrix_x_next_data_train,
 matrix_x_next_data_test) = train_test_split(matrix_x_data,
                                              matrix_x_next_data,
                                              test_size=0.2)
print(matrix_x_data_train.shape,
matrix_x_data_test.shape,
matrix_x_next_data_train.shape,
matrix_x_next_data_test.shape)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')   # assumed definition; `device` is used below
print(device)

cuda
torch.set_default_dtype(torch.float32)
# torch.set_default_tensor_type('torch.DoubleTensor')
# cast to float32 so the data matches the default dtype of the network parameters
# (assumed; the arrays produced by solve_ivp are float64)
tensor2d_x_data_train = torch.from_numpy(matrix_x_data_train).float().to(device)
tensor2d_x_next_data_train = torch.from_numpy(matrix_x_next_data_train).float().to(device)
tensor2d_x_data_test = torch.from_numpy(matrix_x_data_test).float().to(device)
tensor2d_x_next_data_test = torch.from_numpy(matrix_x_next_data_test).float().to(device)
torch_dataset_train = TensorDataset(tensor2d_x_data_train,
tensor2d_x_next_data_train)
torch_dataset_test = TensorDataset(tensor2d_x_data_test,
tensor2d_x_next_data_test)
train_dataloader = DataLoader(torch_dataset_train,
batch_size=batch_size,
shuffle=True)
test_dataloader = DataLoader(torch_dataset_test,
batch_size=batch_size,
                              shuffle=True)
class Encoder(nn.Module):
    def __init__(self, list_layer_dim: list):
        super().__init__()
        self.list_layer_dim = list_layer_dim
        self.list_FC = nn.ModuleList()
        for i in range(len(self.list_layer_dim) - 1):
            input_dim = self.list_layer_dim[i]
            output_dim = self.list_layer_dim[i + 1]
            self.list_FC.append(nn.Linear(input_dim, output_dim))

    def forward(self, x):
        # hidden layers with a non-linearity, linear output layer (the activation choice is an assumption)
        for fc in self.list_FC[:-1]:
            x = torch.tanh(fc(x))
        return self.list_FC[-1](x)
class Decoder(nn.Module):
    def __init__(self, list_layer_dim: list):
        super().__init__()
        self.list_layer_dim = list_layer_dim
        self.list_FC = nn.ModuleList()
        for i in range(len(self.list_layer_dim) - 1, 0, -1):
            input_dim = self.list_layer_dim[i]
            output_dim = self.list_layer_dim[i - 1]
            self.list_FC.append(nn.Linear(input_dim, output_dim))

    def forward(self, x):
        # mirror of the encoder (the activation choice is an assumption)
        for fc in self.list_FC[:-1]:
            x = torch.tanh(fc(x))
        return self.list_FC[-1](x)
class Autoencoder(nn.Module):
    def __init__(self, feature_dim: int, hidden_layer: int, output_dim: int):
        super().__init__()
        # layer widths grow linearly from feature_dim to output_dim over `hidden_layer` steps
        list_layer_dim = \
            [output_dim if i == hidden_layer
             else feature_dim + i * (output_dim - feature_dim) // hidden_layer
             for i in range(hidden_layer + 1)]
        self.encoder = Encoder(list_layer_dim)
        self.decoder = Decoder(list_layer_dim)
The Koopman operator K (which is linear, and thus a matrix) must have a spectral radius ρ(K) ≤ 1. Such a condition provides a stable (or at least marginally stable) Koopman operator. To fulfill this requirement, we can leverage the Perron-Frobenius theorem.
The Perron-Frobenius theorem states: if K is an m × m positive matrix, i.e. $k_{ij} > 0$ for $1 \le i, j \le m$, then the following inequality holds:

$$\min_i \sum_{j=1}^{m} k_{ij} \;\le\; \rho(K) \;\le\; \max_i \sum_{j=1}^{m} k_{ij}.$$
Question 1: Complete the KoopmanOperator class below to enforce ρ(K) ≤ 1, using the Perron-Frobenius theorem.
class KoopmanOperator(nn.Module):
    def __init__(self, koopman_operator_dim: int):
        super().__init__()
        self.koopman_operator_dim = koopman_operator_dim

    def check_spectral_radius(self):
        # Compute eigenvalues of K
        eigenvalues = torch.linalg.eigvals(self.K.data)
        # Compute the spectral radius (maximum absolute value of eigenvalues)
        spectral_radius = torch.max(torch.abs(eigenvalues)).item()
        print(f"Spectral Radius: {spectral_radius}")
        return spectral_radius
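A minimal sketch of one possible completion (not the official solution): the raw parameters are mapped to an elementwise-positive matrix, which is then rescaled by its largest row sum, so that the Perron-Frobenius bound guarantees ρ(K) ≤ 1. The forward pass applies K to a batch of observables stored as row vectors, and check_spectral_radius from the cell above carries over unchanged.

class KoopmanOperator(nn.Module):
    def __init__(self, koopman_operator_dim: int):
        super().__init__()
        self.koopman_operator_dim = koopman_operator_dim
        # unconstrained parameters; the constrained operator K is built below
        self.W = nn.Parameter(torch.randn(koopman_operator_dim, koopman_operator_dim))

    @property
    def K(self):
        matrix_positive = torch.exp(self.W)             # k_ij > 0
        max_row_sum = matrix_positive.sum(dim=1).max()  # Perron-Frobenius: rho(K) <= max_i sum_j k_ij
        return matrix_positive / max_row_sum            # hence rho(K) <= 1

    def forward(self, tensor2d_observable):
        # advance a batch of observables (row vectors) one step: g_{k+1} = K g_k
        return tensor2d_observable @ self.K.T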
dim_observable = 10
koopman_operator = KoopmanOperator(dim_observable)
feature_dim = 2     # state dimension (values consistent with the printed architecture below; assumed)
hidden_layer = 5
output_dim = 30     # dimension of the Koopman/observable space

autoencoder = Autoencoder(feature_dim, hidden_layer, output_dim).to(device)
koopman_operator = KoopmanOperator(output_dim).to(device)
print(autoencoder)
Autoencoder(
(encoder): Encoder(
(list_FC): ModuleList(
(0): Linear(in_features=2, out_features=7, bias=True)
(1): Linear(in_features=7, out_features=13, bias=True)
(2): Linear(in_features=13, out_features=18, bias=True)
(3): Linear(in_features=18, out_features=24, bias=True)
(4): Linear(in_features=24, out_features=30, bias=True)
)
)
(decoder): Decoder(
(list_FC): ModuleList(
(0): Linear(in_features=30, out_features=24, bias=True)
(1): Linear(in_features=24, out_features=18, bias=True)
(2): Linear(in_features=18, out_features=13, bias=True)
(3): Linear(in_features=13, out_features=7, bias=True)
(4): Linear(in_features=7, out_features=2, bias=True)
)
)
)
learning_rate_autoencoder = 0.0001
learning_rate_koopman = 0.00001
optimiser_autoencoder = torch.optim.Adam(autoencoder.parameters(),
lr=learning_rate_autoencoder)
optimiser_koopman = torch.optim.Adam(koopman_operator.parameters(),
lr=learning_rate_koopman)
Question 2: Define a function to compute the loss to be minimized. It should include at least the three terms listed above:
Reconstruction error
Prediction error in the Koopman space
Prediction error in the phase space
Because the different objectives outlined by these losses may compete, training can be difficult. You may try different variations on these losses and comment on your findings. To improve the training process, one can for instance:

Add a multiplicative factor in front of each loss component, to balance their importance; how are the scales of the different losses related?
We can refine the loss acting on the latent space by using a variational-autoencoder approach. This is similar to the Gaussian likelihood used in the first practical (TD1). We want the prediction in the latent space (i.e. the Koopman space) to follow a normal distribution N(0, 1). Add a corresponding loss for the latent space: the deviation from zero mean and unit standard deviation must thus be included in the loss;
Freeze the gradients of one part of the network, for instance the encoder, for one specific
objective, using the requires_grad property. For instance:
criterion = nn.MSELoss()
...
# Compute one part loss_l of the total loss
# First deactivate gradient computation for irrelevant parts of the architecture
for p in autoencoder.encoder.parameters():
    p.requires_grad = False
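Once the corresponding part of the loss has been backpropagated, the encoder gradients can be re-enabled for the remaining terms:

for p in autoencoder.encoder.parameters():
    p.requires_grad = True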
# To Be Implemented
return total_loss
return kl_divergence
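One possible implementation of the complete loss (a sketch, not the official solution): the argument names follow the commented-out call in the training loop further below, lamda weights the VAE-style regularization term (consistent with the lamda = 0.0 ablation later on), the equal unit weights on the other terms are an assumption, and the individual terms are returned alongside the total so they can be logged. The MSE criterion defined above is reused.

def loss_vae(tensor2d_observable):
    # VAE-inspired moment matching: push the observables toward zero mean and unit standard deviation
    mu = tensor2d_observable.mean(dim=1)
    sigma = tensor2d_observable.std(dim=1)
    return (mu ** 2).mean() + ((sigma - 1.0) ** 2).mean()

def loss_koopman(tensor2d_x, tensor2d_x_next, tensor2d_decoded_x,
                 tensor2d_observable_next, tensor2d_koopman_observable_next,
                 tensor2d_predict_x_next, lamda=1.0):
    loss_recon = criterion(tensor2d_decoded_x, tensor2d_x)                                  # reconstruction error
    loss_koop_pred = criterion(tensor2d_koopman_observable_next, tensor2d_observable_next)  # prediction error in the Koopman space
    loss_phase = criterion(tensor2d_predict_x_next, tensor2d_x_next)                        # prediction error in the phase space
    loss_kl = loss_vae(tensor2d_koopman_observable_next)                                    # latent regularization
    total_loss = loss_recon + loss_koop_pred + loss_phase + lamda * loss_kl
    return total_loss, loss_recon, loss_koop_pred, loss_phase, loss_kl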
1. Reconstruction error: $\|\varphi_\rho(g_\theta(x_k)) - x_k\| = 0$
Question 3: The following cell executes the training loop. You can modify it in order to display the different intermediate losses computed in the LOSS function above. How do they evolve during training? Justify your final choice.
n_batch = len(train_dataloader)
n_batch_test = len(test_dataloader)
n_epoch = 100 # To be tuned
n_epochs_for_eval = 10
plt.figure(figsize=(8, 4))
plt.plot(epochs_train, train_losses[loss_name], label='Train', marker='.', linestyle='-')     # linestyle assumed
plt.plot(epochs_val, val_losses[loss_name], label='Validation', marker='o', linestyle='--')   # linestyle assumed
plt.title(f'{loss_name.capitalize()} Loss Evolution (Weight: {weight})')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.grid(True)
plt.show()

plt.figure(figsize=(8, 4))
plt.plot(epochs_train, total_train_loss, label='Train Total Loss', marker='.', linestyle='-')  # linestyle assumed
optimiser_autoencoder.zero_grad()
optimiser_koopman.zero_grad()
tensor2d_observable = autoencoder.encoder(tensor2d_batch_x)
tensor2d_observable_next = autoencoder.encoder(tensor2d_batch_x_next)
tensor2d_decoded_x = autoencoder.decoder(tensor2d_observable)
tensor2d_koopman_observable_next = koopman_operator(tensor2d_observable)
tensor2d_predict_x_next = autoencoder.decoder(tensor2d_koopman_observable_next)
# tensor_loss_val = loss_koopman(tensor2d_batch_x,
# tensor2d_batch_x_next,
# tensor2d_decoded_x,
# tensor2d_observable_next,
# tensor2d_koopman_observable_next,
# tensor2d_predict_x_next)
total_loss.backward()
optimiser_autoencoder.step()
optimiser_koopman.step()
epoch_losses['reconstruction'] += loss_recon.item()
epoch_losses['koopman_pred'] += loss_koop_pred.item()
epoch_losses['phase_pred'] += loss_phase.item()
epoch_losses['kl_divergence'] += loss_kl.item()
tensor2d_observable = autoencoder.encoder(tensor2d_batch_x)
tensor2d_observable_next = autoencoder.encoder(tensor2d_batch_x_next)
tensor2d_decoded_x = autoencoder.decoder(tensor2d_observable)
tensor2d_koopman_observable_next = koopman_operator(tensor2d_observable)
tensor2d_predict_x_next = autoencoder.decoder(tensor2d_koopman_observable_next)
# the loss arguments below follow the same pattern as in the training loop (assumed)
val_epoch_losses['reconstruction'] += loss_reconstruction(tensor2d_decoded_x, tensor2d_batch_x).item()
val_epoch_losses['koopman_pred'] += loss_koopman_pred(tensor2d_koopman_observable_next, tensor2d_observable_next).item()
val_epoch_losses['phase_pred'] += loss_phase_pred(tensor2d_predict_x_next, tensor2d_batch_x_next).item()
val_epoch_losses['kl_divergence'] += loss_vae(tensor2d_observable_next).item()
[Figures: training and validation curves for the reconstruction, Koopman-prediction, phase-prediction and KL-divergence losses]
By investigating the above curves, we can see that the close match between training and validation losses across all four terms suggests that the model generalizes well, with no signs of overfitting. However, we will also test the performance for the case where we set the KL-divergence weight to 0.
lamda = 0.0
n_epoch = 100
[Figures: the same training and validation loss curves with the KL-divergence weight set to 0]
As we can notice, in both cases we do not see overfitting on the training set, since the validation loss and the training loss remain close. However, when we remove the KL-divergence term from the loss we get slightly better performance: the validation losses move from Reconstruction: 0.0002, Koopman Pred: 0.0101, Phase Pred: 0.0002 to Reconstruction: 0.0001, Koopman Pred: 0.0002, Phase Pred: 0.0001. The losses without the KL divergence are thus slightly better, which is expected since optimizing three losses is easier than optimizing four. Furthermore, by removing the regularization factor we no longer enforce a latent structure close to that of a normal distribution. We can therefore say that using the regularization is a better guarantee of good generalization than not using it.
Verification
Question 4: We want to ensure the Koopman operator is stable. This can be verified by checking whether its spectral radius satisfies ρ(K) ≤ 1. Plot the eigenvalues of the Koopman operator in order to verify the bound on its spectral radius. You can use the numpy.linalg.eig function to retrieve the eigenvalues of a matrix.
# TODO: Check Koopman stability and plot the eigenvalues of the Koopman operator
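A possible completion (sketch): it assumes the constrained Koopman matrix is exposed as koopman_operator.K, consistent with the check_spectral_radius method above.

matrix_K = koopman_operator.K.detach().cpu().numpy()
eigenvalues, _ = np.linalg.eig(matrix_K)
print('Spectral radius:', np.abs(eigenvalues).max())

fig, ax = plt.subplots(figsize=(5, 5))
theta = np.linspace(0.0, 2.0 * np.pi, 200)
ax.plot(np.cos(theta), np.sin(theta), 'k--', lw=0.8, label='unit circle')
ax.scatter(eigenvalues.real, eigenvalues.imag, color='tab:red', label='eigenvalues of $K$')
ax.set_xlabel(r'Re($\lambda$)')
ax.set_ylabel(r'Im($\lambda$)')
ax.axis('equal')
ax.legend()
plt.show()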
n_grid = 30
x1_min, x1_max = -2, 2
x2_min, x2_max = -2, 2
for i in range(n_grid):
    for j in range(n_grid):
        x1 = matrix_grid_x1[i, j]
        x2 = matrix_grid_x2[i, j]
        array3d_dynamics[i, j, :] = duffing(np.array([x1, x2]))
for i in range(n_grid):
    for j in range(n_grid):
        x1 = matrix_grid_x1[i, j]
        x2 = matrix_grid_x2[i, j]
        tensor2d_x = torch.tensor([[x1, x2]], dtype=torch.float32).to(device)
        tensor2d_observable = autoencoder.encoder(tensor2d_x)
        tensor2d_koopman_observable_next = koopman_operator(tensor2d_observable)
        tensor2d_predict_x_next = autoencoder.decoder(tensor2d_koopman_observable_next)
        array_x_next = tensor2d_predict_x_next.cpu().detach().numpy().ravel()
ax = fig.add_subplot(132)
ax.quiver(matrix_grid_x1,
matrix_grid_x2,
array3d_dynamics_pred[:, :, 0],
array3d_dynamics_pred[:, :, 1], scale=10)
ax.set_title('Prediction')
ax.set_xlabel('$x_1$')
ax.set_ylabel('$x_2$')
ax = fig.add_subplot(133)
cp = ax.contourf(matrix_grid_x1,
matrix_grid_x2,
matrix_error_log)
fig.colorbar(cp)
ax.set_title('Error in log scale')
ax.set_xlabel('$x_1$')
ax.set_ylabel('$x_2$')
plt.show()
Considering xk as the observation of a state at time t = kδ , and xk+1 the state at time t + δ,
for δ → 0 it is also possible to define the continuous-time infinitesimal generator of the
Koopman operator family as
$$L g(x_k) = \lim_{\delta \to 0} \frac{K g(x_k) - g(x_k)}{\delta} = \lim_{\delta \to 0} \frac{(g \circ F)(x_k) - g(x_k)}{\delta}$$

The previous expression defines the Lie derivative, and for this reason L is known as the Lie operator. L describes the continuous dynamics of the observables in the Koopman space: $\dot{g}(x) = L g(x)$.
As in the discrete case, three error terms are considered:

1. Reconstruction error: $\|\varphi_\rho(g_\theta(x)) - x\| = 0$
2. Prediction error in the Koopman space: $\|L\, g_\theta(x) - \nabla_x g_\theta(x)\, f(x)\| = 0$
3. Prediction error in the phase space: $\|\varphi_\rho(L\, g_\theta(x)) - f(x)\| = 0$ (as implemented in the training loop below)

Important Remark: As long as the system f is known, the three errors can be computed without data belonging to trajectories.
(matrix_x_data_train,
matrix_x_data_test,
matrix_x_next_data_train,
matrix_x_next_data_test) = train_test_split(matrix_x0,
matrix_system_derivative_data,
test_size=0.2)
print(matrix_x_data_train.shape,
matrix_x_data_test.shape,
matrix_x_next_data_train.shape,
matrix_x_next_data_test.shape)
torch_dataset_train = TensorDataset(torch.from_numpy(matrix_x_data_train),
torch.from_numpy(matrix_x_next_data_train))
torch_dataset_test = TensorDataset(torch.from_numpy(matrix_x_data_test),
torch.from_numpy(matrix_x_next_data_test))
train_dataloader = DataLoader(torch_dataset_train,
batch_size=batch_size,
shuffle=True)
test_dataloader = DataLoader(torch_dataset_test,
batch_size=batch_size,
shuffle=True)
class Encoder(nn.Module):
    def __init__(self, list_layer_dim):
        super().__init__()
        self.list_layer_dim = list_layer_dim
        self.list_FC = nn.ModuleList()
        for i in range(len(self.list_layer_dim) - 1):
            dim_input = self.list_layer_dim[i]
            dim_output = self.list_layer_dim[i + 1]
            self.list_FC.append(nn.Linear(dim_input, dim_output))

    def forward(self, x):
        # hidden layers with a non-linearity, linear output layer (the activation choice is an assumption)
        for fc in self.list_FC[:-1]:
            x = torch.tanh(fc(x))
        return self.list_FC[-1](x)
class Decoder(nn.Module):
    def __init__(self, list_layer_dim):
        super().__init__()
        self.list_layer_dim = list_layer_dim
        self.list_FC = nn.ModuleList()
        for i in range(len(self.list_layer_dim) - 1, 0, -1):
            dim_input = self.list_layer_dim[i]
            dim_output = self.list_layer_dim[i - 1]
            self.list_FC.append(nn.Linear(dim_input, dim_output))

    def forward(self, x):
        # mirror of the encoder (the activation choice is an assumption)
        for fc in self.list_FC[:-1]:
            x = torch.tanh(fc(x))
        return self.list_FC[-1](x)
class Autoencoder(nn.Module):
    def __init__(self, feature_dim, hidden_layer, output_dim):
        super().__init__()
        # layer widths grow linearly from feature_dim to output_dim over `hidden_layer` steps
        list_layer_dim = \
            [output_dim if i == hidden_layer
             else feature_dim + i * (output_dim - feature_dim) // hidden_layer
             for i in range(hidden_layer + 1)]
        self.encoder = Encoder(list_layer_dim)
        self.decoder = Decoder(list_layer_dim)
The Lie operator must be defined such that it is always stable by construction. To do that, we consider a matrix of parameters $\Psi \in \mathbb{R}^{m \times m}$ and a vector of parameters $\Gamma \in \mathbb{R}^{m}$. The resulting Lie operator will be of the form:

$$L = (\Psi - \Psi^{T}) - \mathrm{diag}(|\Gamma|)$$
Question 4: As you did for the discrete case, you now have to implement the LieModule class. It should have the form indicated above to guarantee $\mathrm{Re}(\lambda) \le 0$ for every eigenvalue λ of L. Check that the initialization fulfills this property.
class LieModule(nn.Module):
    def __init__(self, lie_operator_dim: int):
        super().__init__()
        self.lie_operator_dim = lie_operator_dim
        # TODO: Complete function
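A minimal sketch of one possible completion (an assumption, not the official solution): the skew-symmetric part Ψ − Ψᵀ only contributes purely imaginary components to the eigenvalues, and subtracting the positive diagonal diag(|Γ|) pushes every real part below zero, so Re(λ) ≤ 0 holds by construction, already at initialization.

class LieModule(nn.Module):
    def __init__(self, lie_operator_dim: int):
        super().__init__()
        self.lie_operator_dim = lie_operator_dim
        self.Psi = nn.Parameter(torch.randn(lie_operator_dim, lie_operator_dim))
        self.Gamma = nn.Parameter(torch.randn(lie_operator_dim))

    @property
    def L(self):
        # L = (Psi - Psi^T) - diag(|Gamma|)
        return (self.Psi - self.Psi.T) - torch.diag(torch.abs(self.Gamma))

    def forward(self, tensor2d_observable):
        # apply L to a batch of observables stored as row vectors
        return tensor2d_observable @ self.L.T

# quick check at initialization: the largest real part should be non-positive
print(torch.linalg.eigvals(LieModule(10).L).real.max().item())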
autoencoder = Autoencoder(feature_dim, hidden_layer, output_dim).to(device)
lie_operator = LieModule(output_dim).to(device)
print(autoencoder)
Some tricks are needed for training. If the autoencoder and the Lie model are learned at the same speed, the training turns out to be highly unstable, since the three loss functions have moving targets. For this reason, the Lie learning rate is chosen smaller than the autoencoder one.
learning_rate_autoencoder = 0.0001
learning_rate_lie = 0.00001
optimiser_autoencoder = torch.optim.Adam(autoencoder.parameters(),
lr=learning_rate_autoencoder,
weight_decay=1e-3)
optimiser_lie = torch.optim.Adam(lie_operator.parameters(),
lr=learning_rate_lie,
weight_decay=1e-3)
A further loss is considered to stabilize the learning stage. The state x belongs to a compact set, since it is the solution of a dissipative dynamical system. This is not true for g(x) (we would need to choose appropriate activation functions to have appropriate Lipschitz guarantees). To avoid discrepancies in the magnitudes of the $g_i(x)$, a regularization loss is added, enforcing

$$\mu = \frac{1}{m}\sum_{i=1}^{m} g_i(x) = 0 \quad \text{and} \quad \sigma = \left(\frac{1}{m}\sum_{i=1}^{m} \big(g_i(x) - \mu\big)^2\right)^{1/2} = 1,$$

inspired by VAEs.
For the training to be smooth, the encoder parameters are not affected by the prediction loss in
phase space. This is based on an empirical observation and is motivated by the fact that the
encoder appears in the three losses and plays a competitive role against the decoder and the
Lie model. This should not affect the results since the encoder remains coupled with the
decoder in the reconstruction loss and with the Lie operator in the prediction loss in Koopman
space.
Question 5: Implement the loss function similarly to what you did for Question 2. Note that here you should use the dynamics f and its values for a set of points belonging to the domain $[-2, 2]^2$, while no data from actual trajectories are needed.
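A possible loss for the continuous case (a sketch; the first five keyword names follow the call in the training loop below, while the remaining two arguments and the equal weighting are assumptions, and criterion and loss_vae are reused from the earlier sketches). Here tensor2d_x_next holds the derivative f(x) and tensor2d_jvp the directional derivative ∇ₓg(x) f(x) returned by autograd.functional.jvp; the encoder-freezing trick described above is omitted for brevity.

def loss(tensor2d_x, tensor2d_x_next, tensor2d_decoded_x, tensor2d_observable,
         tensor2d_lie_observable_next, tensor2d_jvp, tensor2d_predict_x_next):
    loss_recon = criterion(tensor2d_decoded_x, tensor2d_x)                 # reconstruction in phase space
    loss_lie_pred = criterion(tensor2d_lie_observable_next, tensor2d_jvp)  # L g(x) vs d/dt g(x) = grad_x g(x) f(x)
    loss_phase = criterion(tensor2d_predict_x_next, tensor2d_x_next)       # decoded prediction vs f(x)
    loss_reg = loss_vae(tensor2d_observable)                               # keep the observables close to N(0, 1)
    return loss_recon + loss_lie_pred + loss_phase + loss_reg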
Since trajectories are not needed, random states can be sampled from the system manifold $x_1 \in [-2, 2]$, $x_2 \in [-2, 2]$.
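A sketch of how such samples and the corresponding derivatives can be generated (the variable names match the train_test_split call in this part; the number of samples is an assumption):

n_samples = 10000  # assumption
matrix_x0 = np.random.uniform(-2.0, 2.0, size=(n_samples, dim_system))
matrix_system_derivative_data = np.array([duffing(array_x) for array_x in matrix_x0])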
n_batch = len(train_dataloader)
n_epoch = 5 # To be tuned
optimiser_autoencoder.zero_grad()
optimiser_lie.zero_grad()
# dgX = lie_operator * gX
# jvp = \nabla_x g (x) * f(x) (jvp: jacobian vector product)
(tensor2d_observable, tensor2d_jvp) = \
autograd.functional.jvp(autoencoder.encoder,
tensor2d_batch_x,
tensor2d_batch_x_next,
create_graph=True)
tensor2d_decoded_x = autoencoder.decoder(tensor2d_observable)
tensor2d_lie_observable_next = lie_operator(tensor2d_observable)
tensor2d_predict_x_next = autoencoder.decoder(tensor2d_lie_observable_next)
tensor_loss_val = \
loss(tensor2d_x=tensor2d_batch_x,
tensor2d_x_next=tensor2d_batch_x_next,
tensor2d_decoded_x=tensor2d_decoded_x,
tensor2d_observable=tensor2d_observable,
tensor2d_lie_observable_next=tensor2d_lie_observable_next,