processes

Review
Hybrid Modeling for On-Line Fermentation Optimization and
Scale-Up: A Review
Mariana Albino 1 , Carina L. Gargalo 1 , Gisela Nadal-Rey 2 , Mads O. Albæk 2 , Ulrich Krühne 1
and Krist V. Gernaey 1, *

1 Process and Systems Engineering Center (PROSYS), Department of Chemical and Biochemical Engineering,
Technical University of Denmark, Building 228A, 2800 Kongens Lyngby, Denmark; marial@[Link] (M.A.);
carlour@[Link] (C.L.G.); ulkr@[Link] (U.K.)
2 Novonesis A/S, Fermentation Pilot Plant, Krogshoejvej 36, 2880 Bagsvaerd, Denmark;
gsnr@[Link] (G.N.-R.); maoa@[Link] (M.O.A.)
* Correspondence: kvg@[Link]

Abstract: Modeling is a crucial tool in the biomanufacturing industry, particularly in fermentation
processes. This work discusses both mechanistic and data-driven models, each with unique benefits and
application potential. It also discusses semi-parametric hybrid modeling, a growing field that combines
these two types of models to obtain more accurate results that are easier to extrapolate. The
characteristics and structure of such hybrid models are examined. Moreover, their versatility is
highlighted, showing their usefulness in various stages of process development, including real-time
monitoring and optimization. Scale-up remains one of the most relevant topics in fermentation processes,
as it is important to have reproducible critical quality attributes, such as titer and yield, on larger
scales. Furthermore, scale-up still relies on empirical correlations and iterative optimization. For
these reasons, it is important to improve scale-up predictions through, e.g., the use of digital tools.
Perspectives are presented on the potential of hybrid modeling for predicting performance
across different process scales. This could provide more efficient and reliable biomanufacturing
processes that consume fewer resources through experimentation.

Keywords: scale-up; hybrid modeling; industrial fermentation; mechanistic modeling; data-driven
modeling

Citation: Albino, M.; Gargalo, C.L.; Nadal-Rey, G.; Albæk, M.O.; Krühne, U.; Gernaey, K.V. Hybrid
Modeling for On-Line Fermentation Optimization and Scale-Up: A Review. Processes 2024, 12, 1635.
https://[Link]/10.3390/pr12081635

Academic Editors: Ali Demirci, Irfan Turhan and Ehsan Mahdinia

Received: 14 June 2024
Revised: 10 July 2024
Accepted: 19 July 2024
Published: 3 August 2024

Copyright: © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution (CC BY)
license (https://[Link]/licenses/by/4.0/).

1. Introduction
Industrial fermentation processes for producing different compounds, such as pharmaceuticals
and food ingredients, have long been used in the biotechnology industry. Various
microorganisms can be used for these processes, from different species of bacteria to yeast
and fungi [1]. Furthermore, cell culture processes with mammalian cells are relevant in the
biopharmaceutical context, particularly for the production of monoclonal antibodies. Working
with each organism has advantages and challenges, but some challenges are common to
all upstream bioprocesses. Essentially, these organisms are extremely complex biologically,
as different metabolic networks are present in each cell, which are differentially activated
under different circumstances. Furthermore, the interaction with the external environment
affects the biological phenomena at the cellular level [2]. For example, one of the major
challenges when implementing/deploying a new fermentation process is its scale-up, since
changes in geometry and physical conditions on a large scale can affect process performance
through the formation of gradients (e.g., of substrate concentration) [3,4].
Models are essential tools in the field of bioprocesses [5]. They can be used to transform
process knowledge and data into relevant predicted variables. Ultimately, they can be
used in real-time for monitoring and process optimization [6,7]. Regarding the topic of this
review article, fermentation processes, models can also be useful tools to predict differences
in performance once the scale changes [8]. In this way, the scaling process can be made more
seamless, thus reducing the time spent deploying a new process to full-scale production.
These models can essentially be divided into two categories: mechanistic and data-driven.
Mechanistic models, also commonly referred to as “white-box” models, are mathematical
representations of the knowledge of the process. They are described by first-principles
equations, whose parameters have physical meaning [9,10]. On the other hand, data-driven
models (“black box”) do not require any previous knowledge of the process but rather
predict process outputs solely based on available process data [11,12].
Both of these modeling approaches have their strengths and weaknesses, and because
of that, their application also differs. This will be further discussed in specific sections
in this article, Sections 2 and 3, for mechanistic and data-driven modeling, respectively.
Combining both approaches is a strategy to overcome their limitations [11]. This is broadly
defined as hybrid or gray-box modeling [12]; this is the terminology adopted in this work.
Hybrid modeling creates the possibility of taking advantage of the process knowledge
available to build a model that can be easily adapted to different cases (within a specific
range) [13,14] and simultaneously use data-driven approaches to explain parts of the
process for which there is no mechanistic knowledge available [15,16].
This review aims to describe each of the above-mentioned types of models, their
advantages and disadvantages, how they have been applied in the context of fermentation
processes in the past, as well as up-to-date examples. Furthermore, we will discuss how
their combination—hybrid modeling—has been used and the opportunities it opens in
topics that can be aided by modeling. Finally, we will present perspectives on how such
models can enhance scale-up, a topic of broad interest in this research area.
The article is divided into six sections, the first being the Introduction section.
Sections 2 and 3 refer to mechanistic and data-driven models, respectively. The following
Section 4 focuses on hybrid modeling, its structure, and potential applications in fermentation
processes. Section 5 highlights some of the issues concerning scale-up and the role that
models can play in solving them. Finally, Section 6 presents the conclusion of the review,
summarizing the main take-home messages and future perspectives.

2. Mechanistic Modeling
Mechanistic models are based on the fundamental laws of natural science and aim
to describe systems and their mechanisms using mathematical equations derived from
process knowledge. Equation parameters can have a biological, chemical, or physical
meaning. They can be based on mass, heat, and momentum balances as well as kinetic
rate equations. Figure 1 illustrates the process of developing such models. In fermentation
processes, the most critical aspects to predict are biomass growth, substrate consumption,
and product formation [10]. These models can have different levels of complexity regarding
the assumptions made about cell heterogeneity (segregated vs. unsegregated models)
and the detail considered when calculating cell growth (structured vs. unstructured
models) [10]. Segregated models focus on studying heterogeneity in cell populations. For the
purpose of bioprocess optimization, their high complexity makes them challenging and
time-consuming to use, and therefore, an average description of the cell is more commonly
applied [17]. This review therefore focuses on unsegregated models and does not further discuss
the differences between segregated and unsegregated models. A review of the methods
utilized for cell population modeling and its applications can be found in the publication
of Waldherr [18]. Table 1 highlights the differences between structured and unstructured
mechanistic models and their advantages and disadvantages.
Unstructured kinetic models are widespread because they usually do not contain
many parameters, and as a consequence, they are not computationally expensive. They
consider biomass as a black box that converts a substrate into the product of interest and
do not detail the chemical reactions occurring inside the cells. Therefore, this type of model
focuses on studying the impact that external parameters, such as temperature and pH, have
on the biological component of the process, or that others, such as agitation power and
aeration, have on the physical component of the system. The lower resolution of the biological
component allows for more detail on the physical characterization of the process [10]. Some
successful applications of these types of models are (a) modeling of overflow metabolism in
Escherichia coli [19]; (b) modeling of enzyme production in Aspergillus oryzae under different
aeration and agitation conditions [20]; and, (c) kinetic modeling of glucose and xylose
co-fermentation for the production of lignocellulosic ethanol [21].
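
As a minimal sketch of what an unstructured model can look like, the Python snippet below integrates the biomass and substrate mass balances of a batch culture, treating biomass as a black box that grows according to Monod kinetics. The function name, the parameter values, and the explicit Euler scheme are illustrative assumptions and are not taken from any of the cited studies.

```python
def simulate_batch(x0, s0, mu_max, k_s, y_xs, dt=0.01, t_end=20.0):
    """Explicit Euler integration of dX/dt = mu*X and dS/dt = -mu*X/Y_xs.

    x0, s0 : initial biomass and substrate concentrations (g/L)
    mu_max : maximum specific growth rate (1/h)
    k_s    : saturation constant (g/L)
    y_xs   : biomass yield on substrate (g/g)
    All numerical values used below are hypothetical.
    """
    x, s = x0, s0
    for _ in range(int(round(t_end / dt))):
        mu = mu_max * s / (k_s + s)        # Monod specific growth rate
        ds = min(mu * x * dt / y_xs, s)    # substrate consumed, never more than is left
        x += y_xs * ds                     # biomass formed from that substrate
        s -= ds
    return x, s


final_x, final_s = simulate_batch(x0=0.1, s0=10.0, mu_max=0.5, k_s=0.2, y_xs=0.5)
```

Only three kinetic parameters appear here, which illustrates why such models are cheap to evaluate. A stiff or larger system would normally be handed to a dedicated ODE solver rather than a fixed-step Euler loop.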

Figure 1. Schematic of mechanistic model development.

Table 1. Summary of mechanistic model characteristics.

| Type of Model | Characteristics | Advantages | Disadvantages |
|---|---|---|---|
| Unstructured | Biomass as black-box; balanced growth approximation; mass balances and kinetic equations | Description of the physical aspects of the process | Potential over-simplification of biomass-product dynamics |
| Structured | Biomass as a multi-component organism; cell growth calculated based on interaction of intracellular components; metabolic flux equations | Suitable to model complex systems (e.g., metabolic networks) | Extensive parameter identification (e.g., metabolomics analysis) |

On the other hand, structured models consider biomass as a multi-component organism
with internal structure [22]. They can be useful, e.g., for improving the metabolic
efficiency (yield) of a target product. Tang et al. [23] used a pooled metabolic
model to predict the metabolic impact in P. chrysogenum of a feast–famine cycle, both
on the hour and minute time scales, making it relevant for studying the influence of
large-scale mixing. Jahan et al. [24] developed a model to estimate specific growth rates
based on reaction kinetics for wild-type and genetic mutants of Escherichia coli. In another
study, Çelik et al. [25] developed a structured kinetic model for Pichia pastoris growth and
recombinant protein production for optimizing feeding strategy.
Unstructured models are simpler but can still provide a useful description of the
process and, therefore, can be easily used for design purposes [10]. Special attention
must be paid since they cannot be easily extrapolated. On the other hand, dynamic
metabolic modeling provides a more accurate description of growth metabolism, increasing
extrapolation capabilities [26]. However, it requires many equations, and a significant
number of parameters must be experimentally identified. Ultimately, what defines a good
model depends on the desired application context, and thus, model complexity should be
decided according to the question to be answered.
In Table 2, more examples are summarized, both for structured and unstructured
approaches. Most applications concern the prediction of growth or product formation
under different process conditions. Nevertheless, these models have also been used to
determine optimal process parameters or to understand the cell’s metabolic responses.
In summary, mechanistic models are based on process understanding, which means that
they can be extrapolated, to some extent, outside the specific context in which they are
developed by tuning some of the model parameters [10]. On the downside, they require
intensive study/knowledge of the process and thus are time- and resource-consuming to
develop and maintain [9].

Table 2. Examples of applications of mechanistic models of fermentation processes.

| Microorganism | Type of Model | Studied Parameter | Main Findings | Reference |
|---|---|---|---|---|
| Aspergillus oryzae | Unstructured | Different agitation and aeration conditions | Prediction of several fermentation parameters, e.g., rheological behavior at different process conditions | [20] |
| Bacillus subtilis | Unstructured | Oxygen supply | Optimize aeration rate for higher protein production, using low-cost substrates | [27] |
| CHO cell | Unstructured | Temperature shift | Optimization of temperature profiles for optimal cell growth and productivity | [28] |
| Escherichia coli | Unstructured | Overflow metabolism | Prediction of growth and acetate-induced dynamics | [19] |
| Escherichia coli | Structured kinetic | Specific growth rate | Prediction of growth rate based on reaction kinetics for wild-type and genetic mutants | [24] |
| Penicillium chrysogenum | Pooled metabolic model | Dynamic feeding conditions | Prediction of metabolic response induced by feast-famine feeding cycles | [23] |
| Penicillium chrysogenum | Structured kinetic | On- and off-line process measurements | Prediction of process measurements including e.g., off-gas analysis | [29] |
| Pichia pastoris | Unstructured | Protein production | Strategy to improve product formation based on growth kinetics | [30] |
| Pichia pastoris | Structured kinetic | Growth and recombinant protein production | Prediction of optimal feed strategy | [25] |
| Saccharomyces cerevisiae | Unstructured | Ethanol production | Prediction of ethanol production under non-sterile conditions in biofilm reactor | [31] |
| Zymomonas mobilis | Unstructured | Glucose and xylose co-fermentation | Prediction of ethanol yield across a wide range of initial process conditions | [21] |
| Not disclosed | Unstructured | On-line monitoring | On-line prediction of product concentration | [32] |

3. Data-Driven Modeling
Unlike mechanistic models, data-driven approaches ignore relationships that originate
from process knowledge, and the parameters of the mathematical equations will not
have a physical meaning. Figure 2 illustrates how historical process data are used to
develop these models. In a fermentation context, these models aim to predict critical
quality attributes (CQA), such as titer, productivity, and carbon efficiency, based on critical
process parameters, without accounting for the mechanistic causalities that describe the
relationships [11].
Processes 2024, 12, 1635 5 of 18

Figure 2. Schematic of data-driven model development.

Machine learning is a widely used data-driven modeling method and can be divided
into two categories: supervised and unsupervised learning. Some authors also introduce
the classification of reinforcement learning, though others argue that it should not be
considered a specific class of learning methods but rather as a paradigm where an agent
learns how to behave in an environment based on a reward signal [33,34]. Nonetheless,
the three types (supervised, unsupervised, and reinforcement) of learning will be discussed
and examples will be given on their application in the modeling of fermentation processes
(Table 3).
In supervised learning, the data are labeled, which means that in addition to inputs,
there are also predetermined output attributes that are considered in the modeling process [33].
In the case of fermentation processes, the output attributes would be, e.g., biomass
and product concentration, and the input attributes, online process data. Therefore, the
algorithm will identify the relationship between input and output variables and use it to
predict the target values based on new input values. Supervised learning includes the
use of artificial neural networks (ANN). These have been successfully implemented for
fermentation processes. For example, Tavasoli et al. [35] used neural networks to develop
a µ-stat approach to control methanol feeding in an E. coli fermentation for recombinant
protein production. The results showed significant improvements compared to previously
used approaches for methanol feeding. Furthermore, Nagy [36] used dynamic
neural networks to develop a model predictive controller of temperature for continuous
yeast fermentation.
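
The idea of learning an input-output relationship from labeled data can be sketched in a few lines of Python. For brevity, the example swaps the neural networks discussed above for an ordinary least-squares straight line; the principle of fitting on labeled pairs and then predicting for a new input is the same. The function names and the data values are hypothetical.

```python
def fit_line(inputs, targets):
    """Fit targets ~ slope * inputs + intercept by ordinary least squares."""
    n = len(inputs)
    mean_x = sum(inputs) / n
    mean_y = sum(targets) / n
    sxx = sum((x - mean_x) ** 2 for x in inputs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(inputs, targets))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, x):
    """Predict the output attribute for a new input value."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical labeled data: an online signal (input) and off-line biomass (output).
online_signal = [0.5, 1.0, 1.5, 2.0]
biomass = [1.1, 2.0, 3.1, 3.9]
model = fit_line(online_signal, biomass)
estimate = predict(model, 1.2)  # prediction for an input not seen during fitting
```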
On the other hand, unsupervised learning focuses on identifying hidden patterns in
the data without considering a target attribute. All variables in the dataset are used as
inputs. Thus, these methods are suitable for clustering and association techniques [33].
Some examples are PCA (principal component analysis) and PLS (partial least squares)
regression, although PLS is fitted against a response variable and is therefore often classed
as supervised. Andersen et al. [37] applied a partitioned PLS model to predict the yield of a
batch fermentation based on selected process variables. Another example is the use of a
data-driven Gaussian process regression model by Barton et al. [38] on a batch fermentation
model. The model was used to increase productivity from batch to batch by manipulating
process variables, e.g., batch cycle time. Nucci et al. [39] used a PCA algorithm to detect when
the process is not progressing as planned, providing decision-making support. A sub-category
of unsupervised learning methods relies on maximum-likelihood principles. These methods
take into account the measurement error variance information, making them suitable for
processes with limited and noisy data, such as fermentation or cell culture. In the works of
Dewasme et al. [40] and Pimentel et al. [41], they are applied to PCA and nonnegative matrix
decomposition (NMD), respectively, to reduce data dimensionality and identify relevant
process models in hybridoma cell culture for the production of monoclonal antibodies.
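
To illustrate how PCA can flag a process that is not progressing as planned, the following sketch extracts the leading principal direction from reference data and scores new samples by their squared distance to that direction. It is a simplified, pure-Python illustration (a single component, found by power iteration) with hypothetical function names and data; it is not the algorithm of Nucci et al. [39], and practical work would normally rely on a numerical library.

```python
import math


def center(rows):
    """Subtract the column means; return the centered rows and the means."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    return [[r[j] - means[j] for j in range(m)] for r in rows], means


def first_component(centered, iterations=200):
    """Leading principal direction of centered data via power iteration."""
    n, m = len(centered), len(centered[0])
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(m)]
           for i in range(m)]
    v = [1.0] * m  # generic start vector (assumed not orthogonal to the solution)
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v


def residual(sample, means, direction):
    """Squared distance between a sample and its projection on the component."""
    c = [a - b for a, b in zip(sample, means)]
    score = sum(a * b for a, b in zip(c, direction))
    return sum((a - score * b) ** 2 for a, b in zip(c, direction))


# Hypothetical reference data in which two measured variables move together.
reference = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
centered, means = center(reference)
direction = first_component(centered)
normal_score = residual([2.5, 2.5], means, direction)  # follows the usual pattern
faulty_score = residual([4.0, 1.0], means, direction)  # breaks the correlation
```

A sample whose residual is large compared with those of the reference batches does not follow the correlation structure seen so far, which is the kind of signal that can support operator decisions.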
Positioned between supervised and unsupervised learning, reinforcement learning
is a type of algorithm in which the agent (in this case, the model) learns how to behave
in a dynamic environment through trial-and-error interactions, and the only feedback
is a scalar reward signal [34,42]. Two main strategies are used for solving this type of
problem: (a) search the space of behaviors for one that performs well in the environment;
this is achieved, for example, using genetic algorithms; (b) use statistical and dynamic
programming methods for estimating the effect that taking different actions has on the
different states of the system [42]. These types of models find applications in fermentation
processes, particularly in the development of feed control strategies. Treloar et al. [43]
applied a deep reinforcement learning method to control substrate feeding rates to maintain
the desired population levels (in a co-culture) to optimize product formation. In another
example, Kim et al. [44] used model-based reinforcement learning to develop a feed
rate control that led to an increase in yield and productivity in an in silico penicillin
production plant.
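
The trial-and-error principle can be sketched with a deliberately reduced example: an epsilon-greedy agent that picks one of a few candidate feed rates, receives a scalar reward, and keeps a running estimate of the value of each action. This is a stateless bandit, a strong simplification of the deep and model-based methods cited above, and the reward function is a hypothetical stand-in for a simulated fermentation.

```python
import random


def learn_feed_rate(reward_fn, actions, episodes=200, epsilon=0.1, seed=0):
    """Return the action with the highest estimated reward after trial and error."""
    rng = random.Random(seed)
    value = {a: 0.0 for a in actions}  # running mean reward per action
    count = {a: 0 for a in actions}
    for step in range(episodes):
        if step < len(actions):
            action = actions[step]                 # try every action once first
        elif rng.random() < epsilon:
            action = rng.choice(actions)           # explore
        else:
            action = max(actions, key=value.get)   # exploit the current estimate
        reward = reward_fn(action)                 # the only feedback the agent gets
        count[action] += 1
        value[action] += (reward - value[action]) / count[action]
    return max(actions, key=value.get)


# Hypothetical reward: productivity peaks at a feed rate of 0.3 (arbitrary units).
best_rate = learn_feed_rate(lambda f: -(f - 0.3) ** 2, [0.1, 0.2, 0.3, 0.4, 0.5])
```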
Table 3 summarizes the examples given in the text above, in addition to more relevant
examples of the application of data-driven models. It highlights the capabilities of these
approaches. To conclude, the main advantages of data-driven models are the automatic
assembly of the models and the low computational burden, which make them suitable
for real-time monitoring and control [11]. However, unlike those of mechanistic models, their
predictive capabilities are limited to the space where they were validated, restricting their use
for bioprocess control and optimization to very specific cases [12].

Table 3. Examples of applications of data-driven modeling of fermentation processes.

| Microorganism | Type of Model | Studied Parameter | Main Findings | Reference |
|---|---|---|---|---|
| Bacillus megaterium | PCA | Fault detection | On-line fault detection providing decision-making support | [39] |
| Escherichia coli | Neural networks | µ-stat feeding control | Increased cell growth and target protein production | [35] |
| Escherichia coli | Reinforcement learning | Feed rate control | Product formation optimization in a simulated chemostat with co-cultures | [43] |
| Hybridoma cells | Maximum-likelihood PCA | Macroscopic reactions | Determine minimum number of reactions and parameters for process model | [40] |
| Hybridoma cells | Maximum-likelihood NMD | Prediction of relevant parameters | Identified model with good prediction results | [41] |
| Penicillium chrysogenum | Reinforcement learning | Feed rate control | Optimized yield and productivity of in silico penicillin production plant | [44] |
| Penicillium chrysogenum | Reinforcement learning | Feed rate control | Outperformed other feed control strategies for a digital industrial penicillin plant | [45] |
| Saccharomyces cerevisiae | Neural networks | Temperature model predictive controller | More robust temperature control, compared to linear model predictive controller | [36] |
| Saccharomyces cerevisiae | Neural networks | Monitoring of relevant parameters | Prediction results with on-line fluorescence spectroscopy and a process model were equivalent to those of the model where offline calibration data were used | [46] |
| Saccharomyces cerevisiae | Gaussian process regression | Manipulation of cycle time | Increased productivity from each batch to the following one | [38] |
| Streptomycetes sp. | PLS | API production | Identification of process variables responsible for variation in API production | [47] |
| Not disclosed | PLS | Yield prediction | Similar performance to more complex genetic algorithm | [37] |

4. Hybrid Modeling
As stated in the preceding sections, despite their advantages, both mechanistic and
data-driven models have their shortcomings, as summarized in Table 4. Semi-parametric
hybrid modeling, here named hybrid modeling, can combine these
approaches. It results either in a more accurate mechanistic model, by incorporating historical
process data, or in a data-driven model that can be extrapolated outside the specific context in
which it has been trained [48]. This is particularly relevant for modeling complex systems
where only partial process understanding exists. An example would be a fermentation
process in which the mass and energy balances are well defined, but the parameters of
the kinetic rate equations are complicated to determine [49]. Process data are used by
data-driven models to fill in knowledge gaps. An advantage of this type of model is
that it integrates existing knowledge into a structured data-driven framework, allowing
predictions to be improved as more experimental data on the process are collected and
added to the model [50]. For monitoring purposes, the models should predict the relevant
parameters based on online measurements. In fermentation processes, data-driven techniques
can do this straightforwardly for, e.g., biomass concentration (from oxygen and carbon
evolution rates, for example) [51]; however, it is more
challenging for product concentration due to, for example, relatively low product titers [52].
By integrating the two types of knowledge, mechanistic and data-driven, the limitations
presented in Table 4 can be reduced [52]. Hybrid modeling is a relevant approach for
model predictive control since the model needs to remain robust in untested regions of the
process [50]. Hybrid models are characterized by their extrapolation capabilities, making
them suitable for process control outside the tested process conditions [53]. For this application,
the extrapolation capabilities of mechanistic models are an important complement
to data-driven approaches. Furthermore, the underlying mechanistic structure of these
models makes them more transparent and easier to scrutinize than their purely data-driven
counterparts [53].

Table 4. Advantages and limitations of mechanistic and data-driven models.

| Type of Model | Advantages | Limitations |
|---|---|---|
| Mechanistic | Increased process understanding; process control and optimization; model-based DOE | Time-consuming development; extensive experimental work for validation; intensive process knowledge required |
| Data-driven | Automatic model assembly; real-time monitoring and fault detection; low computational burden | Poor extrapolation capabilities; requires representative and reliable data; limited for control and optimization |

The work of Narayanan et al. [13] highlights the benefits of hybrid modeling by comparing
the performance metrics of a process model with varying degrees of hybridization.
This study evaluates seven process models in which 0% (equal to a fully data-driven
model) to 100% (a fully mechanistic model) of process knowledge is included. The fully
data-driven model utilized was PLS since it was the best performing among the tested
structures (e.g., neural networks, NN); when incorporating process knowledge, NN were chosen
as the data-driven components. As for the mechanistic component, the knowledge added to each
of the five hybrid models was (1) the rate of accumulation, (2) mass balances, (3) specific
rate, (4) specific growth and death rate, and (5) kinetic terms. The fully mechanistic model
further included Monod equations for the metabolites.
Table 5 summarizes the results obtained. Essentially, the models were tested in two
contexts: interpolation and extrapolation, i.e., within process conditions present in the
training dataset and conditions not observed in the training data, namely the feed profiles.
For both cases, hybrid approaches showed superior performance. Most interestingly,
the data requirements for each level of knowledge incorporation differed. In the interpolation
scenario, it is possible to observe that, by adding an equation for the accumulation
rates, the same performance as the data-driven model was achieved with 20 fewer training
runs. Hybrid Model 3, which contained a variable for the specific formation/consumption
rate of each variable, was the best performer, having the lowest MSE (mean squared error)
and the least training data. As more knowledge was added to the model, more training
data were necessary to achieve equal performance. In these cases, the additional knowledge
of the process means that a larger number of outputs need to be predicted using
the NN, thus having more parameters and requiring a larger quantity of data. As for the
mechanistic model, the MSE obtained is the highest observed; however, only 10 runs are
necessary. With the same number of runs, the data-driven model exhibits significantly
worse performance (MSE of 0.15).
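
For reference, the two error metrics reported in this comparison can be written as short Python functions. This is a generic sketch of the standard definitions, with hypothetical function names and data; it does not reproduce the scaling or validation procedure used in [13].

```python
import math


def mse(observed, predicted):
    """Mean squared error: average of the squared prediction errors."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean squared error: same units as the predicted variable."""
    return math.sqrt(mse(observed, predicted))


# Hypothetical measurements and model predictions.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]
error_mse = mse(observed, predicted)
error_rmse = rmse(observed, predicted)
```

Because the RMSE is the square root of the MSE, the two metrics rank models identically, but their numerical values are not directly comparable.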

Table 5. Model performance across hybridization levels [13].

| Degree of Hybridization | Interpolation: Optimal Run Number (1) | Interpolation: Best MSE (2) | Extrapolation: Optimal Run Number | Extrapolation: Best RMSE |
|---|---|---|---|---|
| Data-driven | 50 | 0.039 | 50 | 0.32 |
| Hybrid 1-rAcc (3) | 30 | 0.039 | 50 | 0.20 |
| Hybrid 2-MB (4) | 30 | 0.030 | 50 | 0.10 |
| Hybrid 3-rSp (5) | 30 | 0.025 | 30 | 0.05 |
| Hybrid 4-rXv (6) | 50 | 0.025 | 50 | 0.05 |
| Hybrid 5-kin (7) | 50 | 0.025 | 50 | 0.05 |
| Mechanistic | 10 | 0.060 | 30 | 0.10 |

(1) Number of training runs necessary to achieve the lowest MSE. (2) Lowest mean squared error. (3) Rate of
accumulation. (4) Mass balance. (5) Specific rates. (6) Specific growth and death rates. (7) Kinetic terms.

The differences became more striking in the extrapolation test scenario. The data-driven
model performed poorly even with 50 training runs. By adding some knowledge, the performance
of hybrid model 1 improved significantly, although its performance was still not
satisfactory. Similarly to the interpolation case, hybrid model 3 was the best performer. It
exhibited the lowest MSE while also requiring the least training data. The models with a larger
mechanistic component once again required more training data and had an equally low MSE.
Finally, the fully mechanistic model presented an MSE higher than that of the best hybrid
models, although it needed the least amount of data. Considering only 30 training runs, it was
only outperformed by hybrid model 3. Furthermore, when comparing the two extremes, data-
driven and mechanistic, it is clear that the latter is superior when extrapolation is necessary.
Overall, with an adequate selection of the mechanistic component, hybrid models present
several advantages, resulting in more accurate models with good extrapolation properties
and with lower data requirements than data-driven counterparts.
Depending on how the different types of models are combined, two hybrid model
structures can be defined, parallel and serial (Figure 3). The parallel structure (Figure 3a) is
suitable when the parametric (mechanistic) model exists independently, but its prediction
capabilities are limited due to, e.g., unmodeled effects and nonlinearities [12]. As such,
the parametric model can be used by itself, and the non-parametric component only improves
the quality of the predictions [48]. The downside of this approach is that the model’s prediction
will remain poor for the input space in which the data-driven model has not been trained. As
for the serial structure (Figure 3b,c), the “white-box” model will be composed of first-principles
equations, such as mass and energy balances, for example, and the “black-box” model
component will be used to represent, for example, kinetic terms, since these are harder to
validate [12]. The serial structure is particularly suitable when there is insufficient knowledge
of the underlying process mechanisms to build a fully mechanistic model, but sufficient
process data are available to calibrate the data-driven component. On the other hand, a serial
structure can also be applied in the case where the predictions of the mechanistic model
are used as input to the data-driven model, establishing relationships between the process
parameters or the inputs [12].
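
The difference between the two structures can be made concrete with a small Python sketch. In the serial function, a first-principles mass balance is advanced in time while the kinetic term is supplied by an arbitrary data-driven model passed in as a callable. In the parallel functions, a stand-alone mechanistic prediction is corrected by a data-driven model fitted to its residuals. All function names, the straight-line residual model, and the numbers are illustrative assumptions.

```python
def serial_hybrid_step(x, s, rate_model, y_xs, dt):
    """Serial structure: mass balances (white box) use a rate from a black-box model."""
    mu = rate_model(s)                   # data-driven kinetic term
    ds = min(mu * x * dt / y_xs, s)      # substrate balance, capped at what is left
    return x + y_xs * ds, s - ds         # biomass balance, substrate balance


def fit_residual_model(inputs, observed, mechanistic_model):
    """Fit a straight line to what the mechanistic model fails to explain."""
    residuals = [y - mechanistic_model(u) for u, y in zip(inputs, observed)]
    n = len(inputs)
    mean_u = sum(inputs) / n
    mean_r = sum(residuals) / n
    slope = (sum((u - mean_u) * (r - mean_r) for u, r in zip(inputs, residuals))
             / sum((u - mean_u) ** 2 for u in inputs))
    intercept = mean_r - slope * mean_u
    return lambda u: slope * u + intercept


def parallel_hybrid(mechanistic_model, residual_model):
    """Parallel structure: mechanistic prediction plus data-driven correction."""
    return lambda u: mechanistic_model(u) + residual_model(u)


# Hypothetical example: the mechanistic model misses a linear effect.
mechanistic = lambda u: 2.0 * u
inputs = [0.0, 1.0, 2.0, 3.0]
observed = [1.0, 3.5, 6.0, 8.5]
hybrid = parallel_hybrid(mechanistic, fit_residual_model(inputs, observed, mechanistic))
x_next, s_next = serial_hybrid_step(1.0, 10.0, lambda s: 0.5, y_xs=0.5, dt=0.1)
```

In the parallel case the mechanistic model remains usable on its own, whereas in the serial case the balances cannot be evaluated without the data-driven rate.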
The main determinant of the best structure to adopt is the structure of the mecha-
nistic model, as the assumptions made in that model constrain the solution space [54].
As such, when the mechanistic model cannot correctly represent some aspects of the process,
e.g., complex nonlinear kinetics, a parallel structure is preferred. It can perform better
than the serial arrangement since the data-driven model can partially compensate for the
structural weakness in the mechanistic model. When the structure of the mechanistic
model is accurate, the serial model gives better predictions compared to the parallel model.
In addition, the extrapolation properties will be significantly better.

Figure 3. Schematic of the three ways to combine the two types of models. (a) Parallel configuration.
(b,c) Serial configurations.

Hybrid modeling is a relatively recent field. Despite significant efforts in this area,
as evidenced by applications in fermentation processes (refer to Table 6), there are still
challenges to overcome. These challenges should be addressed to allow their widespread
application in the field of biochemical engineering. A detailed discussion of current
challenges can be found in the review of Schweidtmann et al. [48]. Some examples that are
particularly relevant for fermentation processes are: (a) the complexity of parameter
estimation in dynamic hybrid models, since this could lead to an increase in computational
demand [55,56]; (b) the lack of well-documented methods for incremental learning,
i.e., being able to train the model on new data without requiring access to the original data,
which could be essential to improve the model’s predictions as more experimental data
are collected; this is relevant since the most common approach at the moment is batch
incremental learning, which requires the model to be re-trained using the whole dataset
(original data and new data) and can become computationally expensive for increasing
quantities of data [57]; and (c) the use in adaptive and evolving systems, which
fermentation processes can be, e.g., processes with distinct phases for growth
and product synthesis [58]. Such phases involve a metabolic shift, which can be represented
by keeping the same structure (e.g., the equation used to describe the growth rate) and
changing the value of certain parameters—adaptive—or by constructing a different model
for each phase (e.g., choosing a new equation that better describes the growth rate in the
new conditions)—evolving.
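The distinction between the adaptive and the evolving representation of a metabolic shift can be made concrete with a small sketch. The phase names, the parameter values, and the choice of a Haldane-type (substrate inhibition) expression for the second phase are assumptions made purely for illustration.

```python
def monod(s, mu_max, k_s):
    """Monod growth rate."""
    return mu_max * s / (k_s + s)


def haldane(s, mu_max, k_s, k_i):
    """Monod growth rate with substrate inhibition (Haldane form)."""
    return mu_max * s / (k_s + s + s * s / k_i)


# Adaptive: one equation, phase-specific parameter values (hypothetical numbers).
ADAPTIVE_PARAMS = {
    "growth": {"mu_max": 0.40, "k_s": 0.5},
    "production": {"mu_max": 0.10, "k_s": 0.5},
}


def mu_adaptive(s, phase):
    return monod(s, **ADAPTIVE_PARAMS[phase])


# Evolving: a different equation is selected for each phase.
EVOLVING_MODELS = {
    "growth": lambda s: monod(s, mu_max=0.40, k_s=0.5),
    "production": lambda s: haldane(s, mu_max=0.10, k_s=0.5, k_i=10.0),
}


def mu_evolving(s, phase):
    return EVOLVING_MODELS[phase](s)
```

In both cases a phase label is required as an input, so either representation depends on some form of phase detection.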

Table 6. Applications of hybrid models of fermentation processes.

| Microorganism | Type of Model | Application | Studied Parameter | Main Findings | Reference |
|---|---|---|---|---|---|
| Aspergillus niger | Unstructured mechanistic model + LBGM | Prediction | Glucose, glycerol and biomass concentration | Soft-sensor for glucose concentration coupled with a kinetic model for glycerol and biomass prediction | [59] |
| Bordetella pertussis | Unstructured mechanistic model + PLS | Monitoring | Biomass, glutamate, and lactate concentration | Real-time monitoring of the fermentation with improved prediction compared to the PLS model on its own | [60] |
| Candida rugosa | Unstructured mechanistic model + NN | Monitoring | Lipolytic activity | Decreased prediction error of lipase activity with on-line implementation | [61] |
| Cunninghamella echinulata | Unstructured mechanistic model + NN | Prediction | Kinetic parameter estimation | The data-driven model is directly built as the kinetic parameters are estimated | [58] |
| Escherichia coli | Unstructured mechanistic model + NN | Optimization | Induction conditions | Significant reduction in required DOE to determine optimal parameters | [49] |
| Escherichia coli | Unstructured mechanistic model + NN | Control | Feed rate | Improved batch-to-batch reproducibility by introducing a model-based feed rate control | [62] |
| Mammalian cell culture | Unstructured mechanistic model + NN | Control | Harvesting point | Real-time monitoring of the fermentation and model predictive feed control | [63] |
| Mammalian cell culture | Unstructured mechanistic model + NN | Prediction | Degree of hybridization | Determination of the ideal level of mechanistic knowledge to be included for optimal performance | [13] |
| Pichia pastoris | Unstructured mechanistic model + NN | Prediction | Prediction of dynamic variables | Increased depth of the neural network led to a decrease in prediction errors | [64] |
| Pichia pastoris | Carbon balance + Multiple linear regression | Monitoring | Biomass concentration | Prediction of biomass concentration, based on online data of three different fermentation phases | [51] |
| Saccharomyces cerevisiae | Unstructured mechanistic model + Neural ODEs | Prediction | Unknown kinetic dynamics | Improved model accuracy from the incorporation of neural ODEs | [16] |
| Saccharomyces cerevisiae | Unstructured mechanistic model + PLS | Monitoring | Substrate uptake and ethanol production | Real-time monitoring of the fermentation using advanced spectroscopy data | [65] |
| Xanthophyllomyces dendrorhous | Unstructured mechanistic model + Gaussian process model | Prediction | Kinetic parameter estimation in mixed-sugar conditions | Embedding of the Gaussian process model reduces model uncertainty and prediction error | [56] |
| Not disclosed | Unstructured mechanistic model + NN | Prediction | Uncertain parameters, e.g., biomass, product and substrate | Superior performance versus the kinetic model | [15] |

Applications in Fermentation Processes


This section will focus on applications of hybrid modeling in the field of fermentation.
Hybrid modeling can be used as a prediction tool in the process development stage [16,49],
and its applications extend to monitoring, control, and optimization. Table 6 summarizes
some examples of the application of this type of model.
As an early development stage example, von Stosch et al. [49] applied a hybrid model
to reduce the DOE required for an E. coli fermentation process. The selected structure for the
model was a serial structure (Figure 3c), where the data-driven component (ANN) is used
to predict the rates and correlation parameters included in the mechanistic component
(material balances). Essentially, the mechanistic model uses ODEs to describe the variation
over time of volume, biomass, and product concentrations, by establishing the relationships
between these variables and rates (biomass and product formation) and the added base
and feed solution. The data-driven model is an artificial neural network with three layers.
The inputs for this model are the process parameters X (biomass concentration), P/X
(specific productivity), T (temperature), pH, and the carbon feed rate. The structure of the
neural network (number of nodes and hidden layers) as well as the most relevant process
parameters to include were decided on the basis of the performance of the model on the
validation set. The hybrid model could predict the impact of different induction conditions
(e.g., temperature and pH) on biomass growth and recombinant protein formation, allowing
for better process understanding without added experiments.
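The serial arrangement described above, in which a neural network supplies the rates used by the material balances, can be sketched as follows. This is a minimal illustration and not the model of von Stosch et al. [49]: the network weights are random and untrained, the input scaling constants are arbitrary, base addition is neglected, and the helper names `ann_rates` and `simulate` are hypothetical.

```python
import math
import random

random.seed(0)
N_IN, N_HID, N_OUT = 5, 4, 2
# Untrained weights; in practice these would be fitted to fermentation data.
W1 = [[random.uniform(-0.5, 0.5) for _ in range(N_IN)] for _ in range(N_HID)]
W2 = [[random.uniform(-0.5, 0.5) for _ in range(N_HID)] for _ in range(N_OUT)]


def ann_rates(x, p, temp, ph, feed):
    """Three-layer network: scaled inputs -> tanh hidden layer -> (mu, q_p)."""
    inputs = [x / 50.0, p / max(x, 1e-9), temp / 37.0, ph / 7.0, feed / 0.1]
    hidden = [math.tanh(sum(w * u for w, u in zip(row, inputs))) for row in W1]
    raw = [sum(w * h for w, h in zip(row, hidden)) for row in W2]
    # Softplus keeps both specific rates non-negative.
    return [0.1 * math.log1p(math.exp(r)) for r in raw]


def simulate(x0, p0, v0, temp, ph, feed, t_end=10.0, dt=0.01):
    """Explicit Euler integration of fed-batch balances for X, P and V."""
    x, p, v = x0, p0, v0
    for _ in range(int(round(t_end / dt))):
        mu, q_p = ann_rates(x, p, temp, ph, feed)
        dilution = feed / v
        dx = (mu - dilution) * x        # dX/dt = mu*X - (F/V)*X
        dp = q_p * x - dilution * p     # dP/dt = q_p*X - (F/V)*P
        x, p, v = x + dt * dx, p + dt * dp, v + dt * feed  # dV/dt = F
    return x, p, v
```

Training such a model requires propagating the prediction error through the ODE solution, which is one source of the parameter estimation complexity mentioned earlier.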
For monitoring purposes, the models can be used as a soft sensor, where online measurements
are taken by the model and turned into relevant predictions. For example,
Brunner et al. [51] used online CO2 data, measured in the off-gas, for real-time prediction
of biomass concentration throughout the different stages of a fermentation process
(batch, transition phase, and fed-batch), using a simple carbon balance model coupled
with multiple linear regression. In this case, a serial structure is adopted as well; however,
unlike the previous example, the mechanistic model’s predictions are fed into the data-
driven component (Figure 3b). The rate of carbon production is calculated mechanistically
based on mass balances. This value, along with the volume of the base solution that has
been fed into the reactor, is input into the multiple linear regression model, which
calculates the biomass concentration. Furthermore, a phase detection algorithm is used
to determine the current process stage and automatically adapt the values of the model’s
parameters, based on the concentration of CO2 measured online in the off-gas.
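The structure of such a soft sensor can be sketched in a few steps: a mechanistic integration of the carbon evolved, a phase detector acting on the off-gas signal, and a phase-specific linear regression. The sketch below is only an assumption-laden illustration; the threshold rule in `detect_phases` and the coefficients in `MLR_COEFFS` are hypothetical and do not reproduce the algorithm or values of Brunner et al. [51].

```python
def carbon_evolved(cer_series, dt):
    """Mechanistic step: integrate the CO2 evolution rate (mol/h) with the trapezoid rule."""
    return sum(0.5 * (a + b) * dt for a, b in zip(cer_series, cer_series[1:]))


def detect_phases(co2_series, drop_fraction=0.5):
    """Toy phase detector: 'batch' until CO2 falls below a fraction of its running
    maximum, 'transition' until CO2 rises again, then 'fed-batch'."""
    phases, phase, running_max, previous = [], "batch", 0.0, None
    for co2 in co2_series:
        running_max = max(running_max, co2)
        if phase == "batch" and co2 < drop_fraction * running_max:
            phase = "transition"
        elif phase == "transition" and previous is not None and co2 > previous:
            phase = "fed-batch"
        phases.append(phase)
        previous = co2
    return phases


# Hypothetical coefficients per phase: (intercept, per mol carbon, per litre of base).
MLR_COEFFS = {
    "batch": (0.5, 20.0, 100.0),
    "transition": (0.5, 18.0, 110.0),
    "fed-batch": (0.5, 15.0, 120.0),
}


def biomass_soft_sensor(carbon_mol, base_volume_l, phase):
    """Data-driven step: phase-specific multiple linear regression for biomass (g/L)."""
    b0, b_carbon, b_base = MLR_COEFFS[phase]
    return b0 + b_carbon * carbon_mol + b_base * base_volume_l
```

Switching the coefficient set with the detected phase corresponds to the adaptive representation discussed among the challenges of hybrid modeling.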
In another approach, Boareto et al. [61] used NNs to improve a previously developed
mechanistic model of the produced lipolytic enzyme titer. The model utilized CO2 and sub-
strate feed rate measurements to predict, in real-time, the enzyme titer, as well as substrate
and biomass concentrations. In this approach, a parallel structure is adopted to combat
the structural mismatch in the original mechanistic model. The mechanistic component
of the model was adapted from the literature and reduced so that it would include ODEs
only for the biomass and substrate concentration, using the well-known Monod equation
to predict the growth rate. The equations describing the evolution in enzyme activity were
removed due to their inaccurate predictions. The data-driven component of the model
(ANN) is used to calculate the enzyme’s titer based on the carbon evolution rate, substrate
feed rate (online measurements), and biomass concentration (calculated by the mechanistic
component). The selected neural network model had three layers and its structure was
determined by cross-validation. The final model significantly improved the accuracy of
the enzyme titer predictions while maintaining the same performance for the prediction of
biomass concentration.
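For orientation, a reduced mechanistic component of this kind, combined with a data-driven enzyme titer estimate, can be written in the generic textbook fed-batch form below. These are not necessarily the exact equations used in [61]; the symbols are assumptions introduced here: X and S are the biomass and substrate concentrations, V the volume, F the feed rate, S_F the substrate concentration in the feed, Y_{X/S} the biomass yield on substrate, CER the carbon evolution rate, and \hat{E} the predicted enzyme titer.

```latex
\[
\begin{aligned}
\mu &= \mu_{\max}\,\frac{S}{K_S + S},\\
\frac{dX}{dt} &= \mu X - \frac{F}{V}\,X,\\
\frac{dS}{dt} &= -\frac{\mu X}{Y_{X/S}} + \frac{F}{V}\,(S_F - S),\\
\frac{dV}{dt} &= F,\\
\hat{E} &= f_{\mathrm{ANN}}\left(\mathrm{CER},\, F,\, X\right).
\end{aligned}
\]
```

The last line makes the division of labor explicit: the balances provide X, while the enzyme titer is left entirely to the neural network.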
Cabaneros Lopez et al. [65] used mid-infrared spectroscopy data to feed a PLS model.
Combined with a kinetic model, this allowed them to predict glucose, biomass, and ethanol
concentrations in a lignocellulosic fermentation. The model presents a parallel structure, and the
predictions of both components are fused by a continuous-discrete extended Kalman filter
(CD-EKF). The mechanistic component is a kinetic model composed of eight ODEs de-
scribing all the variables of interest, and the parameter estimation was performed by the
non-linear least-squares method. For the data-driven component, PLS models were used to
predict glucose, xylose, and ethanol concentrations from the spectral data. The predictions
of the hybrid model were compared to the predictions of the mechanistic and data-driven
models on their own. In all but one case, the hybrid model presented lower RMSE values
than the other models. In one of the test fermentations, the RMSE for the prediction of
ethanol concentration was lower for the mechanistic model than for the hybrid model.
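The fusion of a mechanistic prediction with a data-driven measurement can be illustrated with a scalar Kalman filter. This is a simplified sketch and not the eight-state CD-EKF of [65]: the first-order kinetics in `predict`, the rate constant `k`, and the noise variances `q` and `r` are hypothetical.

```python
def predict(x, p, dt, k=0.1, q=0.01, steps=10):
    """Continuous-time prediction between measurements for dx/dt = -k*x.
    The variance follows dP/dt = 2*A*P + q with A = d(-k*x)/dx = -k (Euler steps)."""
    h = dt / steps
    for _ in range(steps):
        x, p = x + h * (-k * x), p + h * (2.0 * (-k) * p + q)
    return x, p


def update(x_pred, p_pred, z, r):
    """Discrete measurement update: fuse the model prediction (variance p_pred)
    with a measurement z, e.g., a PLS estimate, of variance r."""
    gain = p_pred / (p_pred + r)
    x_post = x_pred + gain * (z - x_pred)
    p_post = (1.0 - gain) * p_pred
    return x_post, p_post
```

The gain shifts the fused estimate toward whichever source has the lower variance, so a noisy spectroscopic estimate (large `r`) leaves the mechanistic prediction almost unchanged.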
For control purposes, robust models with good extrapolation capabilities are required [52].
Although this is characteristic of hybrid models, not many applications are reported.
Some examples are the work of Dors et al. [63] and Jenzsch et al. [62],
in which the predictions of the hybrid model are used as input to control the fermentation
feed profile, leading to a more stable process and improved batch-to-batch reproducibility,
respectively. In the first case, a parallel structure is adopted. The mechanistic component
consists of ODEs that describe the mass balances for all relevant process variables and uses
Monod relationships to describe the kinetics of the process. The data-driven component
(ANN) is used to partially calculate the consumption and production rates. The predictions
of both components are weighted according to the process data available to train the neural
network in the region corresponding to the current process state; i.e., if sufficient historical
data for the current state exist, the prediction of the neural network is given a higher
weight than that of the mechanistic component. Finally, the hybrid model is used to
calculate an optimal feed rate. The use of hybrid modeling for online applications, namely
monitoring, is already significant. Real-time predictions of key process variables open
the possibility of detecting if the process is running as expected and can aid the decision-
making of operators. This would push the industry towards a more digital operation, being
less dependent on variations influenced by human interaction.
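The data-dependent weighting of the two components described for the parallel control model can be sketched as follows. The saturation rule in `nn_weight`, the neighborhood definition in `count_neighbors`, and the parameter `n_half` are assumptions made for illustration and are not the weighting scheme published by Dors et al. [63].

```python
def count_neighbors(state, history, radius):
    """Number of historical states within `radius` of the current state in every coordinate."""
    return sum(
        1 for past in history
        if all(abs(a - b) <= radius for a, b in zip(state, past))
    )


def nn_weight(n_local, n_half=20):
    """Weight of the neural network prediction; rises from 0 toward 1 with local data density.
    n_half is the (hypothetical) number of neighbors giving equal weights."""
    return n_local / (n_local + n_half)


def blended_rate(rate_mech, rate_nn, n_local, n_half=20):
    """Weighted combination of the mechanistic and neural network rate predictions."""
    w = nn_weight(n_local, n_half)
    return w * rate_nn + (1.0 - w) * rate_mech
```

With no historical data near the current state the blend falls back on the mechanistic rate, which is the behavior one would want when the controller drives the process into unexplored operating regions.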

5. Model Aided Scale-Up


The topic of scale-up has been central in the research activities concerning fermentation
for years. This entails taking a newly developed process from lab to full production scale.
The aim is to produce large product quantities while maintaining the CQAs observed at the
small scale. The challenge is that as the reactor size increases, the conditions for favorable
growth may be harder to attain, e.g., due to less efficient mixing. This, in turn, can lead
to lower process reproducibility, yields, and product quality [66]. Process scale-up is still,
to this day, a major challenge in the fermentation industry, as it is usually not based on
mathematical process models but on empirical correlations.
The ideal way to tackle the challenges found on large scales would be to perform
experiments on the actual production site. However, this is not economically feasible, not
only due to the large amount of resources consumed but also due to the loss of production
capacity [67]. The alternative is the use of scale-down approaches, in which a laboratory or
pilot scale reactor is used to replicate the conditions experienced at an industrial scale, so the
results are relevant for production process optimization. Achieving a successful scale-down
is also a challenge since some conditions might be difficult to replicate at smaller scales,
such as oxygen transfer, shear rate, and flow patterns. Several platforms can be used for
this purpose, including pilot scale reactors, microtiter plates [68], shake flasks, microbioreactors, or
milliliter scale stirred reactors [69]. Another interesting approach is the use of two connected
STRs or an STR and a PFR [70]. This approach allows gradients to be studied by having
each small-scale reactor represent a specific zone of the large-scale reactor, for example,
a zone of substrate or oxygen depletion.

5.1. Use of CFD-Coupled Kinetic Models


A relevant problem in industrial-scale reactors that should be taken into account
when planning a scale-up is the formation of gradients. Significant gradients of, e.g.,
substrate or dissolved oxygen concentration may arise and impact fermentation process
performance. These gradients will occur when the local rate of consumption is higher
than the rate of transport [3]. The consequence will be the occurrence of different zones
inside the bioreactor with a surplus or a deficiency of nutrients or oxygen. For example,
if the substrate is dosed at the top, then the cells located there will experience high nutrient
concentrations which can lead to, for example, overflow metabolism in E. coli with the
production of inhibitory by-products like acetate as a consequence.
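A common first check for whether such gradients are to be expected is a comparison of characteristic times: if the substrate is consumed faster than the reactor is mixed, gradients are likely. The sketch below illustrates this arithmetic; the function names and the example values are hypothetical.

```python
def consumption_time_s(s, q_s, x):
    """Characteristic substrate consumption time in seconds.
    s: substrate concentration (g/L), q_s: specific uptake rate (g/g/h), x: biomass (g/L)."""
    return 3600.0 * s / (q_s * x)


def gradients_expected(mixing_time_s, s, q_s, x):
    """True when consumption is faster than mixing."""
    return consumption_time_s(s, q_s, x) < mixing_time_s
```

For example, with s = 0.05 g/L, q_s = 1 g/g/h and x = 30 g/L the consumption time is about 6 s, so a vessel with a mixing time of 100 s would be expected to show substrate gradients, whereas one with a mixing time of 2 s would not.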
A useful tool to study gradients is the combination of computational fluid dynamics
(CFD) simulations and biological models. By combining the fluid flow information gained
from the CFD model, with the cell metabolism information from the biological model, it is
possible to predict the cell response to environmental factors and ultimately the impact this
has on CQAs [71].
Several studies have been conducted, with the biological model’s complexity rang-
ing from simpler unstructured approaches to complex structured metabolic models. Al-
though unstructured approaches are useful for a better understanding of gradients, they
cannot capture the response of the organisms to the different conditions, as metabolic
models can. Table 7 summarizes examples of models developed. To highlight a few, Pigou
and Morchain [4] were able to predict the areas of the bioreactor where acetate would
be consumed or produced by E. coli due to the formation of a glucose gradient. They
used a population balance model in combination with a compartment model, reducing the
computational burden compared to a CFD simulation. Sibler et al. [72]
could predict the impact of the CO gradient on the cell population. Based on their work,
the scale-up of this syngas fermentation can take into account the need
to improve CO mass transfer or to engineer strains that better cope with this limitation.
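The idea behind a compartment model can be shown with the smallest possible case: two well-mixed zones that exchange liquid, with substrate fed only to the top zone and Monod uptake in both. This is a deliberately simplified sketch with hypothetical parameter values; it omits the population balance and metabolic model used by Pigou and Morchain [4] and neglects volume changes.

```python
def two_compartment(feed_kg_h=500.0, exchange_m3_h=200.0, v_top=20.0, v_bottom=80.0,
                    x=20.0, q_max=1.0, k_s=0.05, t_end=5.0, dt=1e-3):
    """Return substrate concentrations (kg/m3) in the top and bottom zones after t_end hours.
    x: biomass (kg/m3), q_max: maximum specific uptake (kg/kg/h), k_s: affinity (kg/m3)."""
    s_top, s_bottom = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        uptake_top = q_max * s_top / (k_s + s_top) * x
        uptake_bottom = q_max * s_bottom / (k_s + s_bottom) * x
        # Feed enters the top zone; the exchange flow carries substrate between zones.
        ds_top = (feed_kg_h + exchange_m3_h * (s_bottom - s_top)) / v_top - uptake_top
        ds_bottom = exchange_m3_h * (s_top - s_bottom) / v_bottom - uptake_bottom
        s_top = max(s_top + dt * ds_top, 0.0)
        s_bottom = max(s_bottom + dt * ds_bottom, 0.0)
    return s_top, s_bottom
```

With these values the top zone settles at a substrate concentration far above that of the bottom zone, i.e., cells near the feed point see excess substrate while those further away are limited. Increasing `exchange_m3_h` reduces the difference.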

Table 7. Applications of CFD-coupled kinetic models of fermentation processes.

| Microorganism | Approach | Main Findings | Reference |
|---|---|---|---|
| Clostridium ljungdahlii | Unstructured kinetic model | Need to improve CO mass transfer and/or to engineer strains that cope with the conditions | [72] |
| Escherichia coli | Metabolic model | Glucose gradients induce production/consumption of acetate in different parts of the reactor | [4] |
| Penicillium chrysogenum | Dynamic gene regulation model | Statistical assessment of the substrate fluctuations experienced by organisms in industrial-scale fermentation | [73] |
| Penicillium chrysogenum | Pooled metabolic model | Identified targets for metabolic and reactor optimization of large-scale fermentation | [26] |
| Pseudomonas putida | Cell cycle model | Insights into the intracellular mechanisms that determine growth phenotypes | [74] |
| Saccharomyces cerevisiae | Unstructured kinetic model | The approach provides a simulation strategy for the design and operation of bioreactors, particularly when single cell behavior is relevant | [75] |

5.2. Use of Hybrid Modeling in Scale-Up


So far, this section has illustrated how modeling can be used to predict a challenge
that can be encountered when scaling up: the formation of gradients. However, models that
can predict the fermentation performance to some extent on a large scale, namely what
the CQAs would look like under different conditions, would be extremely beneficial when
planning the scale transition. Although there are numerous instances of hybrid models
being applied to fermentation processes, their utilization for scale-up is not commonly
mentioned. We believe that this modeling approach offers a significant opportunity to
accelerate the scale-up stage of fermentation process development.
As mentioned in Section 4, hybrid models can have good extrapolation properties
(which are determined by the mechanistic component of the model) while being less time-
consuming to develop than purely mechanistic models. This highlights their suitability as
an approach for scale-up since the model could be developed and calibrated at a smaller
scale and then extrapolated to a larger scale. One strategy is to develop a mechanistic
model on a small scale and complement it with data-driven approaches to represent
scale-specific parts of the model. The data-driven component will account for scale-up
factors and other assumptions made in the mechanistic model that are not valid on larger
scales [12,76]. Another option is to use mainly small-scale experimental data to develop
the model but include a few validation experiments at a larger scale. This will be sufficient
to significantly improve the model’s prediction capabilities on a large scale, while at the
same time not requiring a big increase in resource consumption [50].
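The second strategy, re-calibrating a small-scale model with a few large-scale runs, can be illustrated in its simplest form as fitting a single multiplicative correction by least squares. This is only a sketch of the general idea and not the re-calibration procedure of any cited work; the function names and the single-factor correction are hypothetical.

```python
def fit_scale_correction(pred_small, obs_large):
    """Least-squares factor k minimizing sum((obs - k * pred)**2).
    pred_small: small-scale model predictions; obs_large: matching large-scale observations."""
    numerator = sum(p * o for p, o in zip(pred_small, obs_large))
    denominator = sum(p * p for p in pred_small)
    return numerator / denominator


def predict_large_scale(pred_small, k):
    """Apply the fitted scale correction to new small-scale model predictions."""
    return [k * p for p in pred_small]
```

Because only one parameter is estimated, a handful of large-scale observations suffices. A real application would more likely re-estimate selected model parameters or re-train part of the data-driven component.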
Here, two examples are described where a hybrid model is used to predict larger-scale
performance. In the work of Bayer et al. [50], a hybrid model was developed from a DOE
with 300 mL shake flasks and re-calibrated with only three batches at the 15 L lab-scale
reactor. These few experimental runs on a larger scale were sufficient for the model to
accurately describe cell behavior and product formation on the 15 L scale under different
process conditions. These included shifts in temperature and substrate concentration in the
feed, throughout the fermentation time, while the model had only been trained in static
conditions. In the work of Rogers et al. [14], three hybrid models were developed on
the 1 L scale, each incorporating a different amount of kinetic knowledge,
i.e., number of parameters. The models were then used to predict the performance of
fermentations in 5 L reactors under a temperature change that was not present in the
training data. They found that the model with intermediate knowledge incorporation
performed the best in this case and is suitable for model-based bioreactor optimization
and scale-up. Although the model with the most kinetic information had more confident
predictions, it showed lower accuracy. This highlights that incorporating information
that is not fully understood can lead to incorrect bias and poorer model performance,
i.e., overfitting the model.
No examples of applications of similar strategies to industrial scales have been found
in the literature. However, an adaptation of the described strategies to larger scales could
be of interest.

6. Conclusions
The development and application of process models continue to be crucial research
areas in biotechnology. These models can be utilized at various stages of process develop-
ment, from initial design to optimization, for scale-up, and ultimately as a more detailed
way of monitoring and controlling processes. The main takeaways of this review are:
• Both mechanistic and data-driven models continue to be relevant strategies in the
development of fermentation processes, each with specific use cases. Data-driven
models are particularly relevant for online process models and are frequently
used in the development of soft-sensors. On the other hand, the interpretability
and extrapolation capabilities of mechanistic models make them suitable for process
optimization and understanding the impact of different parameters on the cell’s
metabolic responses.
• Hybrid modeling is a rapidly evolving field and offers substantial benefits in the
context of fermentation processes. It enables the exploitation of the strengths of both
types of aforementioned models while combatting their weaknesses, ideally leading
to a more agile development process.
• The level of mechanistic knowledge included in hybrid models must be carefully
selected to avoid overparametrizing or biasing the model. If performed adequately,
the result will be a more accurate and extrapolative model, with lower data require-
ments than a data-driven counterpart.
• Most use cases of hybrid models still focus on the prediction and monitoring of relevant
process variables, but these models present great potential for model predictive control
applications. Furthermore, they appear to be an interesting tool for aiding in process
upscaling due to good extrapolation capabilities across scales.
• The technology readiness level of hybrid modeling is still considered low. Some chal-
lenges, like the expansion of models as more data becomes available or the complexity
in parameter estimation, need to be overcome for their successful implementation as
relevant tools for industrial bioprocesses.

Author Contributions: Conceptualization, M.A. and C.L.G.; writing (original draft preparation), M.A.
and C.L.G.; writing (review and editing), G.N.-R., K.V.G., M.O.A. and U.K. All authors have read
and agreed to the published version of the manuscript.
Funding: This research was funded by Novo Nordisk Foundation: Sustain4.0: Real-time sustainability
analysis for Industry 4.0 (NNF0080136).
Acknowledgments: This project received support from the Technical University of Denmark and
Novonesis A/S.
Conflicts of Interest: The authors Gisela Nadal-Rey and Mads O. Albæk were employed by the
company Novonesis A/S. The remaining authors declare that the research was conducted in the
absence of any commercial or financial relationships that could be construed as a potential conflict of
interest. The authors declare that this study received funding from the Novo Nordisk Foundation.
The funder was not involved in the study design, collection, analysis, interpretation of data, the
writing of this article or the decision to submit it for publication.

Abbreviations
The following abbreviations are used in this manuscript:

ANN artificial neural networks
API active pharmaceutical ingredient
CFD computational fluid dynamics
CHO Chinese hamster ovary
CQA critical quality attribute
DOE design of experiments
NN neural networks
ODE ordinary differential equation
PCA principal component analysis
PFR plug flow reactor
PLS partial least squares
STR stirred tank reactor

References
1. Behera, S.S.; Ray, R.C.; Das, U.; Panda, S.K.; Saranraj, P. Microorganisms in Fermentation. In Essentials in Fermentation Technology;
Berenjian, A., Ed.; Springer International Publishing: Cham, Switzerland, 2019; pp. 1–39. [CrossRef]
2. González-Figueredo, C.; Flores-Estrella, R.A.; Rojas-Rejón, O.A. Fermentation: Metabolism, kinetic models, and bioprocessing. In
Current Topics in Biochemical Engineering; IntechOpen: Rijeka, Croatia, 2018; Volume 1.
3. Nadal-Rey, G.; McClure, D.D.; Kavanagh, J.M.; Cornelissen, S.; Fletcher, D.F.; Gernaey, K.V. Understanding gradients in industrial
bioreactors. Biotechnol. Adv. 2021, 46, 107660. [CrossRef] [PubMed]
4. Pigou, M.; Morchain, J. Investigating the interactions between physical and biological heterogeneities in bioreactors using
compartment, population balance and metabolic models. Chem. Eng. Sci. 2015, 126, 267–282. [CrossRef]
5. Gargalo, C.L.; de las Heras, S.C.; Jones, M.N.; Udugama, I.; Mansouri, S.S.; Krühne, U.; Gernaey, K.V. Towards the Development
of Digital Twins for the Bio-manufacturing Industry. In Digital Twins: Tools and Concepts for Smart Biomanufacturing; Herwig, C.,
Pörtner, R., Möller, J., Eds.; Springer International Publishing: Cham, Switzerland, 2021; pp. 1–34. [CrossRef]
6. Becker, T.; Enders, T.; Delgado, A. Dynamic neural networks as a tool for the online optimization of industrial fermentation.
Bioprocess Biosyst. Eng. 2002, 24, 347–354. [CrossRef]
7. Lourenço, N.D.; Lopes, J.A.; Almeida, C.F.; Sarraguça, M.C.; Pinheiro, H.M. Bioreactor monitoring with spectroscopy and
Chemometrics: A Review. Anal. Bioanal. Chem. 2012, 404, 1211–1237. [CrossRef] [PubMed]
8. Janoska, A.; Buijs, J.; van Gulik, W.M. Predicting the influence of combined oxygen and glucose gradients based on scale-down
and modelling approaches for the scale-up of penicillin fermentations. Process Biochem. 2023, 124, 100–112. [CrossRef]
9. Mears, L.; Stocks, S.M.; Albaek, M.O.; Sin, G.; Gernaey, K.V. Mechanistic fermentation models for process design, monitoring,
and Control. Trends Biotechnol. 2017, 35, 914–924. [CrossRef] [PubMed]
10. Gernaey, K.V.; Lantz, A.E.; Tufvesson, P.; Woodley, J.M.; Sin, G. Application of mechanistic models to fermentation and biocatalysis
for next-generation processes. Trends Biotechnol. 2010, 28, 346–354. [CrossRef] [PubMed]
11. Tsopanoglou, A.; Jiménez del Val, I. Moving towards an era of hybrid modelling: Advantages and challenges of coupling
mechanistic and data-driven models for upstream pharmaceutical bioprocesses. Curr. Opin. Chem. Eng. 2021, 32, 100691.
[CrossRef]
12. von Stosch, M.; Oliveira, R.; Peres, J.; Feyo de Azevedo, S. Hybrid semi-parametric modeling in process systems engineering:
Past, present and future. Comput. Chem. Eng. 2014, 60, 86–101. [CrossRef]
13. Narayanan, H.; Luna, M.; Sokolov, M.; Butté, A.; Morbidelli, M. Hybrid Models Based on Machine Learning and an Increasing
Degree of Process Knowledge: Application to Cell Culture Processes. Ind. Eng. Chem. Res. 2022, 61, 8658–8672. [CrossRef]
14. Rogers, A.W.; Song, Z.; Ramon, F.V.; Jing, K.; Zhang, D. Investigating ‘greyness’ of hybrid model for bioprocess predictive
modelling. Biochem. Eng. J. 2023, 190, 108761. [CrossRef]
15. Shah, P.; Sheriff, M.Z.; Bangi, M.S.F.; Kravaris, C.; Kwon, J.S.I.; Botre, C.; Hirota, J. Deep neural network-based hybrid modeling
and experimental validation for an industry-scale fermentation process: Identification of time-varying dependencies among
parameters. Chem. Eng. J. 2022, 441, 135643. [CrossRef]
16. Bangi, M.S.F.; Kao, K.; Kwon, J.S.I. Physics-informed neural networks for hybrid modeling of lab-scale batch fermentation for
β-carotene production using Saccharomyces cerevisiae. Chem. Eng. Res. Des. 2022, 179, 415–423. [CrossRef]
17. Moser, A.; Appl, C.; Brüning, S.; Hass, V.C. Mechanistic Mathematical Models as a Basis for Digital Twins. In Digital Twins: Tools
and Concepts for Smart Biomanufacturing; Springer: Berlin/Heidelberg, Germany, 2021. [CrossRef]
18. Waldherr, S. Estimation methods for heterogeneous cell population models in systems biology. J. R. Soc. Interface 2018,
15, 20180530. [CrossRef]
19. Anane, E.; López C, D.C.; Neubauer, P.; Cruz Bournazou, M.N. Modelling overflow metabolism in Escherichia coli by acetate
cycling. Biochem. Eng. J. 2017, 125, 23–30. [CrossRef]
20. Albaek, M.O.; Gernaey, K.V.; Hansen, M.S.; Stocks, S.M. Modeling enzyme production with Aspergillus oryzae in pilot scale
vessels with different agitation, aeration, and agitator types. Biotechnol. Bioeng. 2011, 108, 1828–1840. [CrossRef] [PubMed]
21. Grisales Díaz, V.H.; Willis, M.J. Ethanol production using Zymomonas mobilis: Development of a kinetic model describing
glucose and xylose co-fermentation. Biomass Bioenergy 2019, 123, 41–50. [CrossRef]
22. Du, Y.H.; Wang, M.Y.; Yang, L.H.; Tong, L.L.; Guo, D.S.; Ji, X.J. Optimization and Scale-Up of Fermentation Processes Driven by
Models. Bioengineering 2022, 9, 473. [CrossRef] [PubMed]
23. Tang, W.; Deshmukh, A.T.; Haringa, C.; Wang, G.; van Gulik, W.; van Winden, W.; Reuss, M.; Heijnen, J.J.; Xia, J.; Chu, J.; et al. A
9-pool metabolic structured kinetic model describing days to seconds dynamics of growth and product formation by Penicillium
chrysogenum. Biotechnol. Bioeng. 2017, 114, 1733–1743. [CrossRef] [PubMed]
24. Jahan, N.; Maeda, K.; Matsuoka, Y.; Sugimoto, Y.; Kurata, H. Development of an accurate kinetic model for the central carbon
metabolism of Escherichia coli. Microb. Cell Factories 2016, 15, 112. [CrossRef] [PubMed]
25. Çelik, E.; Çalık, P.; Oliver, S.G. A structured kinetic model for recombinant protein production by Mut+ strain of Pichia pastoris.
Chem. Eng. Sci. 2009, 64, 5028–5035. [CrossRef]
26. Haringa, C.; Tang, W.; Wang, G.; Deshmukh, A.T.; van Winden, W.A.; Chu, J.; van Gulik, W.M.; Heijnen, J.J.; Mudde, R.F.;
Noorman, H.J. Computational fluid dynamics simulation of an industrial P. chrysogenum fermentation with a coupled 9-pool
metabolic model: Towards rational scale-down and design optimization. Chem. Eng. Sci. 2018, 175, 12–24. [CrossRef]
27. Pan, S.; Chen, G.; Zeng, J.; Cao, X.; Zheng, X.; Zeng, W.; Liang, Z. Fibrinolytic enzyme production from low-cost substrates by
marine Bacillus subtilis: Process optimization and kinetic modeling. Biochem. Eng. J. 2019, 141, 268–277. [CrossRef]
28. Xu, J.; Tang, P.; Yongky, A.; Drew, B.; Borys, M.C.; Liu, S.; Li, Z.J. Systematic development of temperature shift strategies for
Chinese hamster ovary cells based on short duration cultures and kinetic modeling. mAbs 2019, 11, 191–204. [CrossRef] [PubMed]
29. Goldrick, S.; Ştefan, A.; Lovett, D.; Montague, G.; Lennox, B. The development of an industrial-scale fed-batch fermentation
simulation. J. Biotechnol. 2015, 193, 70–82. [CrossRef] [PubMed]
30. Barrigon, J.M.; Valero, F.; Montesinos, J.L. A macrokinetic model-based comparative meta-analysis of recombinant protein
production by Pichia pastoris under AOX1 promoter. Biotechnol. Bioeng. 2015, 112, 1132–1145. [CrossRef] [PubMed]
31. Germec, M.; Karhan, M.; Demirci, A.; Turhan, I. Kinetic modeling, sensitivity analysis, and techno-economic feasibility of ethanol
fermentation from non-sterile carob extract-based media in Saccharomyces cerevisiae biofilm reactor under a repeated-batch
fermentation process. Fuel 2022, 324, 124729. [CrossRef]
32. Mears, L.; Stocks, S.M.; Albaek, M.O.; Sin, G.; Gernaey, K.V. Application of a mechanistic model as a tool for on-line monitoring
of pilot scale filamentous fungal fermentation processes—The importance of evaporation effects. Biotechnol. Bioeng. 2017, 114,
589–599. [CrossRef] [PubMed]
33. Alloghani, M.; Al-Jumeily, D.; Mustafina, J.; Hussain, A.; Aljaaf, A.J. A Systematic Review on Supervised and Unsupervised Ma-
chine Learning Algorithms for Data Science. In Supervised and Unsupervised Learning for Data Science; Springer: Berlin/Heidelberg,
Germany, 2020. [CrossRef]
34. Wiering, M.A.; Van Otterlo, M. Reinforcement learning. Adapt. Learn. Optim. 2012, 12, 739.
35. Tavasoli, T.; Arjmand, S.; Ranaei Siadat, S.O.; Shojaosadati, S.A.; Sahebghadam Lotfi, A. A robust feeding control strategy
adjusted and optimized by a neural network for enhancing of alpha 1-antitrypsin production in Pichia pastoris. Biochem. Eng. J.
2019, 144, 18–27. [CrossRef]
36. Nagy, Z.K. Model based control of a yeast fermentation bioreactor using optimally designed artificial neural networks. Chem.
Eng. J. 2007, 127, 95–109. [CrossRef]
37. Andersen, S.W.; Runger, G.C. Partitioned partial least squares regression with application to a batch fermentation process. J.
Chemom. 2011, 25, 159–168. [CrossRef]
38. Barton, M.; Duran-Villalobos, C.A.; Lennox, B. Multivariate batch to batch optimisation of fermentation processes to improve
productivity. J. Process. Control 2021, 108, 148–156. [CrossRef]
39. Nucci, E.R.; Cruz, A.J.; Giordano, R.C. Monitoring bioreactors using Principal Component Analysis: Production of penicillin G
acylase as a case study. Bioprocess Biosyst. Eng. 2009, 33, 557–564. [CrossRef] [PubMed]
40. Dewasme, L.; Cote, F.; Filee, P.; Hantson, A.L.; Vande Wouwer, A. Dynamic modeling of hybridoma cell cultures using maximum
likelihood principal component analysis. IFAC-PapersOnLine 2017, 50, 12143–12148. [CrossRef]
41. Pimentel, G.A.; Dewasme, L.; Wouwer, A.V. Data-driven Linear Predictor based on Maximum Likelihood Nonnegative Matrix
Decomposition for Batch Cultures of Hybridoma Cells. IFAC-PapersOnLine 2022, 55, 903–908. [CrossRef]
42. Kaelbling, L.P.; Littman, M.L.; Moore, A.W. Reinforcement learning: A survey. J. Artif. Intell. Res. 1996, 4, 237–285. [CrossRef]
43. Treloar, N.J.; Fedorec, A.J.H.; Ingalls, B.; Barnes, C.P. Deep reinforcement learning for the control of microbial co-cultures in
bioreactors. PLoS Comput. Biol. 2020, 16, e1007783. [CrossRef] [PubMed]
44. Kim, J.W.; Park, B.J.; Oh, T.H.; Lee, J.M. Model-based reinforcement learning and predictive control for two-stage optimal control
of fed-batch bioreactor. Comput. Chem. Eng. 2021, 154, 107465. [CrossRef]
45. Oh, T.H.; Park, H.M.; Kim, J.W.; Lee, J.M. Integration of reinforcement learning and model predictive control to optimize
semi-batch bioreactor. AIChE J. 2022, 68, e17658. [CrossRef]
46. Paquet-Durand, O.; Assawarajuwan, S.; Hitzmann, B. Artificial neural network for bioprocess monitoring based on fluorescence
measurements: Training without offline measurements. Eng. Life Sci. 2017, 17, 874–880. [CrossRef] [PubMed]
47. Lopes, J.; Menezes, J.; Westerhuis, J.; Smilde, A. Multiblock PLS analysis of an industrial pharmaceutical process. Biotechnol.
Bioeng. 2002, 80, 419–427. [CrossRef] [PubMed]
48. Schweidtmann, A.M.; Zhang, D.; von Stosch, M. A review and perspective on hybrid modeling methodologies. Digit. Chem. Eng.
2024, 10, 100136. [CrossRef]
49. von Stosch, M.; Hamelink, J.M.; Oliveira, R. Hybrid modeling as a QBD/PAT tool in process development: An industrial E. coli
case study. Bioprocess Biosyst. Eng. 2016, 39, 773–784. [CrossRef] [PubMed]
50. Bayer, B.; Duerkop, M.; Striedner, G.; Sissolak, B. Model Transferability and Reduced Experimental Burden in Cell Culture
Process Development Facilitated by Hybrid Modeling and Intensified Design of Experiments. Front. Bioeng. Biotechnol. 2021,
9, 740215. [CrossRef] [PubMed]
51. Brunner, V.; Siegl, M.; Geier, D.; Becker, T. Biomass soft sensor for a Pichia pastoris fed-batch process based on phase detection
and hybrid modeling. Biotechnol. Bioeng. 2020, 117, 2749–2759. [CrossRef] [PubMed]
52. von Stosch, M.; Davy, S.; Francois, K.; Galvanauskas, V.; Hamelink, J.M.; Luebbert, A.; Mayer, M.; Oliveira, R.; O’Kennedy, R.;
Rice, P.; et al. Hybrid modeling for quality by design and PAT-benefits and challenges of applications in biopharmaceutical
industry. Biotechnol. J. 2014, 9, 719–726. [CrossRef] [PubMed]
53. von Stosch, M.; Oliveira, R.; Peres, J.; Feyo de Azevedo, S. A general hybrid semi-parametric process control framework. J.
Process. Control 2012, 22, 1171–1181. [CrossRef]
54. Psichogios, D.C.; Ungar, L.H. A hybrid neural network-first principles approach to process modeling. AIChE J. 1992, 38,
1499–1511. [CrossRef]
55. Cruz-Bournazou, M.N.; Narayanan, H.; Fagnani, A.; Butté, A. Hybrid Gaussian Process Models for continuous time series in
bolus fed-batch cultures. IFAC-PapersOnLine 2022, 55, 204–209. [CrossRef]
56. Vega-Ramon, F.; Zhu, X.; Savage, T.R.; Petsagkourakis, P.; Jing, K.; Zhang, D. Kinetic and hybrid modeling for yeast astaxanthin
production under uncertainty. Biotechnol. Bioeng. 2021, 118, 4854–4866. [CrossRef] [PubMed]
57. Read, J.; Bifet, A.; Pfahringer, B.; Holmes, G. Batch-Incremental versus Instance-Incremental Learning in Dynamic and Evolving
Data. In Proceedings of the Advances in Intelligent Data Analysis XI, Helsinki, Finland, 25–27 October 2012; Springer:
Berlin/Heidelberg, Germany, 2012; pp. 313–323.
58. Rogers, A.W.; Cardenas, I.O.S.; Del Rio-Chanona, E.A.; Zhang, D. Investigating physics-informed neural networks for bioprocess
hybrid model construction. Comput. Aided Chem. Eng. 2023, 52, 83–88. [CrossRef]
59. Rydal, T.; Frandsen, J.; Nadal-Rey, G.; Albæk, M.O.; Ramin, P. Bringing a scalable adaptive hybrid modeling framework closer to
industrial use: Application on a multiscale fungal fermentation. Biotechnol. Bioeng. 2024, 121, 1609–1625. [CrossRef] [PubMed]
60. von Stosch, M.; Oliveira, R.; Peres, J.; de Azevedo, S.F. Hybrid modeling framework for process analytical technology: Application
to Bordetella pertussis cultures. Biotechnol. Prog. 2012, 28, 284–291. [CrossRef] [PubMed]
61. Boareto, Á.J.M.; De Souza, M.B., Jr.; Valero, F.; Valdman, B. A hybrid neural model (HNM) for the on-line monitoring of lipase
production by Candida rugosa. J. Chem. Technol. Biotechnol. 2007, 82, 319–327. [CrossRef]
62. Jenzsch, M.; Gnoth, S.; Kleinschmidt, M.; Simutis, R.; Lübbert, A. Improving the batch-to-batch reproducibility of microbial
cultures during recombinant protein production by regulation of the total carbon dioxide production. J. Biotechnol. 2007, 128,
858–867. [CrossRef] [PubMed]
63. Dors, M.; Simutis, R.; Lübbert, A. Hybrid Process Modeling for Advanced Process State Estimation, Prediction, and Control
Exemplified in a Production-Scale Mammalian Cell Culture. In Biosensor and Chemical Sensor Technology; American Chemical
Society: Washington, DC, USA, 1996. [CrossRef]
64. Pinto, J.; Mestre, M.; Ramos, J.; Costa, R.S.; Striedner, G.; Oliveira, R. A general deep hybrid model for bioreactor systems:
Combining first principles with deep neural networks. Comput. Chem. Eng. 2022, 165, 107952. [CrossRef]
65. Cabaneros Lopez, P.; Udugama, I.A.; Thomsen, S.T.; Roslander, C.; Junicke, H.; Iglesias, M.M.; Gernaey, K.V. Transforming data to
information: A parallel hybrid model for real-time state estimation in lignocellulosic ethanol fermentation. Biotechnol. Bioeng.
2021, 118, 579–591. [CrossRef] [PubMed]
66. Schmidt, F.R. Optimization and scale up of industrial fermentation processes. Appl. Microbiol. Biotechnol. 2005, 68, 425–435.
[CrossRef] [PubMed]
67. Formenti, L.R.; Nørregaard, A.; Bolic, A.; Hernandez, D.Q.; Hagemann, T.; Heins, A.L.; Larsson, H.; Mears, L.; Mauricio-Iglesias,
M.; Krühne, U.; et al. Challenges in industrial fermentation technology research. Biotechnol. J. 2014, 9, 727–738. [CrossRef]
[PubMed]
68. Funke, M.; Buchenauer, A.; Schnakenberg, U.; Mokwa, W.; Diederichs, S.; Mertens, A.; Müller, C.; Kensy, F.; Büchs, J. Microfluidic
biolector—microfluidic bioprocess control in microtiter plates. Biotechnol. Bioeng. 2010, 107, 497–505. [CrossRef]
69. Tajsoleiman, T.; Mears, L.; Krühne, U.; Gernaey, K.V.; Cornelissen, S. An industrial perspective on scale-down challenges using
miniaturized bioreactors. Trends Biotechnol. 2019, 37, 697–706. [CrossRef] [PubMed]
70. Junne, S.; Klingner, A.; Kabisch, J.; Schweder, T.; Neubauer, P. A two-compartment bioreactor system made of commercial
parts for bioprocess scale-down studies: Impact of oscillations on Bacillus subtilis fed-batch cultivations. Biotechnol. J. 2011, 6,
1009–1017. [CrossRef] [PubMed]
71. Wang, G.; Haringa, C.; Noorman, H.; Chu, J.; Zhuang, Y. Developing a Computational Framework To Advance Bioprocess
Scale-Up. Trends Biotechnol. 2020, 38, 846–856. [CrossRef] [PubMed]
72. Siebler, F.; Lapin, A.; Hermann, M.; Takors, R. The impact of CO gradients on C. ljungdahlii in a 125 m³ bubble column: Mass
transfer, circulation time and lifeline analysis. Chem. Eng. Sci. 2019, 207, 410–423. [CrossRef]
73. Haringa, C.; Tang, W.; Deshmukh, A.T.; Xia, J.; Reuss, M.; Heijnen, J.J.; Mudde, R.F.; Noorman, H.J. Euler-Lagrange computational
fluid dynamics for (bio)reactor scale down: An analysis of organism lifelines. Eng. Life Sci. 2016, 16, 652–663. [CrossRef]
[PubMed]
74. Kuschel, M.; Siebler, F.; Takors, R. Lagrangian Trajectories to Predict the Formation of Population Heterogeneity in Large-Scale
Bioreactors. Bioengineering 2017, 4, 27. [CrossRef] [PubMed]
75. Lapin, A.; Müller, D.; Reuss, M. Dynamic behavior of microbial populations in stirred bioreactors simulated with Euler-Lagrange
methods: Traveling along the lifelines of single cells. Ind. Eng. Chem. Res. 2004, 43, 4647–4656. [CrossRef]
76. Simon, L.L.; Fischer, U.; Hungerbühler, K. Modeling of a Three-Phase Industrial Batch Reactor Using a Hybrid First-Principles
Neural-Network Model. Ind. Eng. Chem. Res. 2006, 45, 7336–7343. [CrossRef]

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
