[c]asp-L + atp + gln-L + h2o --> amp + asn-L + glu-L + h + ppi Matrix Creator
Palsson
[Figure: Output (mmol/g DW-hr), axis from -10 to 40, for EX_co2(e), EX_h(e), EX_h2o(e), EX_pi(e), EX_nh4(e) and Biomass]
[Figure: Relative Growth Rate (0 to 1.2) vs. Proton secretion flux (-10 to 6) for the carbon sources Acetate, Akg, Glucose-D, L-lactate, D-Lactate, Malate, Pyruvate, Succinate and Glycerol]
Naringenin
Reactions added
At 2 seconds/ calculation…
Primary Knockouts < 3 hours
Secondary Knockouts ~ 1 day
Tertiary Knockouts ~ 12 days
Quaternary Knockouts ~ 230 days
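A rough sketch of why exhaustive search blows up with knockout depth. The 2 seconds per calculation comes from the slide; the candidate count of 1000 is a hypothetical placeholder, so the printed times illustrate the growth rate only and do not reproduce the slide's figures, which depend on how the original search was restricted.

```python
from math import comb

SECONDS_PER_CALC = 2  # from the slide: "At 2 seconds/calculation"
SECONDS_PER_DAY = 86400


def exhaustive_days(n_candidates: int, k: int) -> float:
    """Days needed to evaluate every k-gene knockout set one by one."""
    return comb(n_candidates, k) * SECONDS_PER_CALC / SECONDS_PER_DAY


# Hypothetical number of candidate genes; not a value from the slides.
N_CANDIDATES = 1000

for k in range(1, 5):
    n_sets = comb(N_CANDIDATES, k)
    print(f"{k}-gene knockouts: {n_sets} sets, "
          f"{exhaustive_days(N_CANDIDATES, k):.2f} days")
```

Each extra knockout level multiplies the number of sets by roughly the number of remaining candidates divided by the level, which is why some form of guided search becomes attractive.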
Combinatorial Explosion
Limited search space
Problem of large search space
Time taken
Not all of the search space is covered
Other methods possible?
Genetic Algorithm
Crossover - Recombination
Crossover combines genetic material from two parents,
in order to produce superior offspring.
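A minimal sketch of one crossover variant, assuming individuals are equal-length bit lists. The parameter slide later names a "scattered (random mix)" crossover; the function name and the example parents here are illustrative and not taken from the original implementation.

```python
import random


def scattered_crossover(parent_a, parent_b, rng):
    """Build one child by taking each locus at random from either parent."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    return [a if rng.random() < 0.5 else b
            for a, b in zip(parent_a, parent_b)]


rng = random.Random(0)  # fixed seed so the example is repeatable
parent_a = [0, 0, 1, 0, 0, 1, 1, 1]  # illustrative parents
parent_b = [1, 0, 0, 0, 1, 0, 1, 0]
child = scattered_crossover(parent_a, parent_b, rng)
print(child)
```

Every position in the child matches one of the two parents, so loci where the parents agree are always preserved.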
Mutation
• Mutation introduces randomness into the population.
• The idea of mutation is to reintroduce divergence into a converging population.
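A small sketch of per-locus mutation, assuming a bit-list individual. The rate of 1/string length follows the parameter slide; the bit-flip operator itself is an assumption, since the slides do not say how a mutated locus is changed.

```python
import random


def mutate(individual, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in individual]


rng = random.Random(1)  # fixed seed so the example is repeatable
individual = [0, 0, 1, 0, 0, 1, 1, 1]  # illustrative individual
rate = 1 / len(individual)  # "1 / string length", as on the parameter slide
mutant = mutate(individual, rate, rng)
print(mutant)
```

With this rate, about one locus per individual changes per generation on average, which is enough to keep some diversity without erasing good solutions.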
Fitness Function
The Fitness function determines what
solutions are better than others.
Fitness is computed for each individual.
Fitness = flavonoid production
Example population
fitness(A) = 3, share of selection 3/6 = 50%
fitness(B) = 1, share of selection 1/6 ≈ 17%
fitness(C) = 2, share of selection 2/6 = 33%
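The fractions on the example slide (3/6 and 2/6) are consistent with fitness-proportionate selection. The sketch below assumes that reading; the slides do not name the selection scheme, and the function names are illustrative.

```python
import random

# Fitness values from the example population slide.
fitness = {"A": 3, "B": 1, "C": 2}


def selection_probabilities(fitness):
    """Each individual's chance of selection is its share of total fitness."""
    total = sum(fitness.values())
    return {name: value / total for name, value in fitness.items()}


def roulette_select(fitness, rng):
    """Draw one parent with probability proportional to its fitness."""
    names = list(fitness)
    weights = [fitness[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


probs = selection_probabilities(fitness)
print(probs)
print(roulette_select(fitness, random.Random(0)))
```

Fitter individuals are chosen more often, but weaker ones such as B still get an occasional chance to contribute.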
Stopping Criteria
The final problem is deciding when to stop the algorithm.
There are two possible approaches:
First approach: stop after a fixed number of generations.
Second approach: stop when the improvement in average fitness stalls.
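A sketch combining both stopping rules. The second criterion is cut short on the slide, so this assumes it means "stop once average fitness has stopped improving over a window of generations", in line with the stall-generations parameter listed later. The function name, the tolerance, and the history list are all hypothetical.

```python
def should_stop(avg_fitness_history, max_generations,
                stall_generations, tolerance=1e-6):
    """Return True if either stopping rule fires.

    avg_fitness_history holds one average-fitness value per generation.
    """
    generation = len(avg_fitness_history)

    # First approach: a fixed number of generations has been produced.
    if generation >= max_generations:
        return True

    # Second approach (assumed reading): average fitness has improved by
    # less than `tolerance` over the last `stall_generations` generations.
    if generation > stall_generations:
        recent = avg_fitness_history[-1]
        earlier = avg_fitness_history[-1 - stall_generations]
        if recent - earlier < tolerance:
            return True

    return False


print(should_stop([0.1, 0.2, 0.3], max_generations=100, stall_generations=2))
print(should_stop([0.1, 0.2, 0.2, 0.2], max_generations=100, stall_generations=2))
```

The first call keeps going because fitness is still rising; the second stops because the last two generations brought no improvement.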
Early phase:
quasi-random population distribution
Mid-phase:
population arranged around/on hills
Late phase:
population concentrated on high hills
Advantages of GAs
Search space not limited to the top 10 knockouts
Supports multi-objective optimization
Can return a family of solutions with similar fluxes
Easy to exploit previous or alternate solutions
May find synergistic knockouts overlooked by standard search
Parameters of the GA
Representation scheme: Integer
The binary string [00100111] is stored as the positions of its 1 bits, [3 6 7 8]
Mutation rate: 1/string length per locus (restricted)
Crossover type: scattered (random mix)
Elite children: 2
Stall generations: 50
Population size: 1000
Mutation probability: Simulated Annealing
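The two bracketed strings on this slide look like the same individual in two encodings: a bit string and the 1-based positions of its set bits. A short sketch of the conversion under that assumption, with illustrative function names:

```python
def bits_to_indices(bits: str) -> list[int]:
    """Return the 1-based positions of the '1' characters in a bit string."""
    return [i for i, bit in enumerate(bits, start=1) if bit == "1"]


def indices_to_bits(indices, length: int) -> str:
    """Rebuild the bit string of the given length from 1-based positions."""
    chosen = set(indices)
    return "".join("1" if i in chosen else "0" for i in range(1, length + 1))


print(bits_to_indices("00100111"))      # [3, 6, 7, 8]
print(indices_to_bits([3, 6, 7, 8], 8))  # 00100111
```

The integer form stays short when only a few positions are set, which suits knockout sets of one to four genes drawn from a large gene list.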
Simulated Annealing
Change in Mutation Rate
[Figure: Mutation rate (0 to 0.6) vs. Generation % (0 to 100)]
Results:
Results: Summary
Over 10,000 KO results were stored by the
algorithms, out of about 900,000 MOMA
calculations performed
Results: Hill Climber VS GA
Which is better?
Results for both methods are in agreement
Exhaustive combination of the top 10 most frequently suggested KOs yielded no better results
Implication: the search space is not as chaotic as originally assumed
Results: Effect of Gene Mapping
More accurate prediction on reactions
affected by disruption of genes
For example, the top yielding candidate for a
primary level knockout predicted the loss of
two reactions
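A toy sketch of why gene mapping matters: knocking out one gene can remove more than one reaction. The gene and reaction names below are placeholders, not entries from the model used in the talk, and the sketch ignores isozymes and enzyme complexes, which a real gene-protein-reaction mapping would have to handle.

```python
# Hypothetical gene-to-reaction map; names are placeholders only.
gene_to_reactions = {
    "geneX": ["RXN_1", "RXN_2"],  # one gene associated with two reactions
    "geneY": ["RXN_3"],
}


def reactions_lost(knocked_out_genes, gene_to_reactions):
    """Collect every reaction associated with any knocked-out gene."""
    lost = set()
    for gene in knocked_out_genes:
        lost.update(gene_to_reactions.get(gene, []))
    return sorted(lost)


print(reactions_lost(["geneX"], gene_to_reactions))
print(reactions_lost(["geneX", "geneY"], gene_to_reactions))
```

Searching over genes rather than single reactions means each candidate knockout is simulated with its full set of lost reactions.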
Results: Primary Level
The top result predicted an increase in naringenin flux from zero (no knockouts performed) to 0.6078 mmol/g-DW/hr
Gene: sdhC
Reaction:
Results: Top 3 in each level
[Figure: Top 3 Simulated KO in each level. x-axis: KO Genes (top three knockout sets at the primary, secondary, tertiary and quaternary levels, plus GA quaternary sets); y-axis: unlabeled, 0 to 1.6. Legible labels include Primary('sdhC'), Primary('tpiA') and Primary('gnd').]
Results: Increase over Wild type
[Figure: % increase over predicted naringenin wildtype flux (0.0002 mmol/g-DW/hr). x-axis: top three knockout sets at each level; y-axis: % increase of naringenin flux, 0 to 800000.]
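A worked example of the percentage scale on this chart, using the two values stated in the slides: the predicted wild-type naringenin flux of 0.0002 mmol/g-DW/hr and the top primary knockout flux of 0.6078 mmol/g-DW/hr. The function name is illustrative.

```python
def percent_increase(new_flux: float, baseline_flux: float) -> float:
    """Percentage increase of new_flux relative to a non-zero baseline."""
    if baseline_flux == 0:
        raise ValueError("baseline flux must be non-zero")
    return (new_flux - baseline_flux) / baseline_flux * 100.0


WILD_TYPE_FLUX = 0.0002    # mmol/g-DW/hr, from the chart title
TOP_PRIMARY_FLUX = 0.6078  # mmol/g-DW/hr, from the primary-level slide

print(round(percent_increase(TOP_PRIMARY_FLUX, WILD_TYPE_FLUX)))
```

Because the wild-type baseline is so close to zero, even a modest absolute flux shows up as an increase of several hundred thousand percent.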
Results: Targets
TCA cycle, the pentose phosphate pathway,
and other biosynthetic pathways
Results: Rationalization
Precursor Availability
Acetyl-CoA carboxylase: produces malonyl-CoA
[Figure: y-axis label not recoverable (ticks 0 to 1.5); x-axis: Naringenin Flux (mmol/g-DW/hr), 0 to 1.6]
Naringenin/ Biomass Relationship
Competition for precursors
[Figure: Biomass flux (mmol/g-DW/hr), 0 to 0.9, vs. Naringenin output (mmol/g-DW/hr), 0 to 1.6]
Results: Diminishing Returns
[Figure: % increases of top 3 KO's over previous levels. x-axis: KO Genes; y-axis: % increase of naringenin flux, 0 to 300.]
Results: Diminishing Returns
[Figure: Biomass Flux of top 3 KO's in each level, with the wild type and a Biomass Threshold marked. x-axis: KO Genes; y-axis: Biomass Flux (mmol/g-DW/hr), 0 to 1.]
In Conclusion
Will all knockouts identified show increased productivity?
In vivo results could provide an opportunity to improve the model.
The approaches used justify some optimism regarding gene targeting for strain improvement.
They also provide a clearer understanding of the nature of the optimization goal.
Questions?