
Petri Nets with Fuzzy Logic (PNFL): Reverse Engineering
and Parametrization
Robert Küffner*, Tobias Petri, Lukas Windhager, Ralf Zimmer
Institut für Informatik, Ludwig-Maximilians-Universität, München, Germany

Abstract
Background: The recent DREAM4 blind assessment provided a particularly realistic and challenging setting for network
reverse engineering methods. The in silico part of DREAM4 solicited the inference of cycle-rich gene regulatory networks
from heterogeneous, noisy expression data including time courses as well as knockout, knockdown and multifactorial
perturbations.

Methodology and Principal Findings: We inferred and parametrized simulation models based on Petri Nets with Fuzzy
Logic (PNFL). This completely automated approach correctly reconstructed networks with cycles as well as oscillating
network motifs. PNFL was evaluated as the best performer on DREAM4 in silico networks of size 10 with an area under the
precision-recall curve (AUPR) of 81%. Besides topology, we inferred a range of additional mechanistic details with good
reliability, e.g. distinguishing activation from inhibition as well as dependent from independent regulation. Our models also
performed well on new experimental conditions such as double knockout mutations that were not included in the provided
datasets.

Conclusions: The inference of biological networks substantially benefits from methods that are expressive enough to deal
with diverse datasets in a unified way. At the same time, overly complex approaches could generate multiple different
models that explain the data equally well. PNFL appears to strike the balance between expressive power and complexity.
This also applies to the intuitive representation of PNFL models combining a straightforward graphical notation with
colloquial fuzzy parameters.

Citation: Küffner R, Petri T, Windhager L, Zimmer R (2010) Petri Nets with Fuzzy Logic (PNFL): Reverse Engineering and Parametrization. PLoS ONE 5(9): e12807.
doi:10.1371/journal.pone.0012807
Editor: Mark Isalan, Center for Genomic Regulation, Spain
Received March 31, 2010; Accepted June 18, 2010; Published September 20, 2010
Copyright: © 2010 Küffner et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits
unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This work was supported by Helmholtz Alliance on Systems Biology, Project CoReNe (http://www.helmholtz.de/en/joint_initiative_for_innovation_and_research/initiating_and_networking/helmholtz_alliances/systembiologie/helmholtz_alliance_on_systems_biology/networks_in_detail/corene_control_of_regulatory_networks_with_focus_on_non_coding_rna/). The funders had no role in study design, data collection and analysis, decision to publish, or
preparation of the manuscript.
Competing Interests: The authors have declared that no competing interests exist.
* E-mail: [email protected]

Introduction

The inference of biological networks based on gene expression measurements is a complex task. A range of approaches have been developed for that purpose, which is in turn reflected by a range of corresponding reviews [1–7]. Basic principles to derive relationships between genes or proteins include ordinary differential equations (ODE) [8–10], mutual information [11] and Bayesian networks [12].

Predictions from the available methods are currently quite unreliable as shown in several comparative studies on in silico networks [13–16]. For instance, precisions of less than 30% have been observed in [14] for all approaches investigated. This might be due to the fact that most methods were developed to exploit either (static) interventional datasets such as knockout experiments or dynamic datasets such as time courses, but not both [4]. Whether the incorporation of a broad range of datasets can increase the reliability of network reconstruction is explored by the DREAM competitions that conduct blind assessments of network reverse-engineering approaches [17]. The in silico part of DREAM4 (2009) provided time course datasets together with complex knockout, knockdown and multifactorial perturbation datasets.

We present a network inference approach based on Petri Nets with Fuzzy Logic (PNFL) [18]. Similar to ODEs but in contrast to Bayesian or mutual information networks, PNFL enables a simulation of the models. In contrast to the more detailed ODEs, PNFL employs a simpler rule based discrete modeling system. The simulation is important for the investigation and refinement of mechanistic network models in order to capture the dynamic behavior of systems in addition to their topology. In case of DREAM4, we simulate to re-generate the provided datasets. The objective of our inference approach is the reconstruction of models by optimizing the agreement between all of the datasets provided in the challenge and those generated by PNFL. Heterogeneous datasets can thus be exploited and scored in a unified way.

In the following, we briefly summarize the DREAM4 setting, introduce Petri Nets and PNFL and outline our approach to simulate and reconstruct PNFL models. Subsequently, we describe the results we obtained in the DREAM4 in silico size ten challenge.

PLoS ONE | www.plosone.org 1 September 2010 | Volume 5 | Issue 9 | e12807


PNFL: Reverse Engineering

Methods

Setting of the DREAM4 in silico size ten challenge
Problem statement. The in silico part of DREAM4 aims at the reconstruction of gene regulatory networks where effects are propagated via directed transcription factor (TF, i.e. the effector protein) → target gene relationships. TFs are synthesized from their corresponding genes and can thus be themselves the targets of other TFs. Other kinds of relationships (e.g. alternative splicing, protein modification, transport, metabolic reactions) were not considered.

The task is the automated reverse engineering of the directed topology of five different networks with ten nodes per network. The topology to be predicted merges genes and their products (i.e. the TFs) into single nodes. All networks contain cycles, but no self loops. No direct information on the edges is given. Instead, networks are to be inferred from the provided gene expression datasets (see below) alone. In a bonus round, participants used their reconstructed networks to simulate dual knockout perturbations. The problem statement, evaluation and datasets are described in more detail on the DREAM website (http://wiki.c2b2.columbia.edu/dream/index.php/D4c2).

Evaluation. After the challenge, submissions were evaluated against the true topology based on the area under the precision-recall curve (AUPR) and the area under the receiver-operator characteristics curve (AUROC). We will focus our discussion on the AUPR. Roughly speaking, an AUPR of 50% means that for each correctly predicted edge an erroneous edge is predicted as well. The sign of the edges (activation vs. inhibition) is not considered in the DREAM4 evaluation. Dual knockout predictions were compared against the true equilibrium values via the mean squared error (MSE). The evaluation is described in more detail in [19].

Gene expression datasets. The approach for dataset generation was developed by Marbach et al. [20,17]. Five time course (TC) datasets were provided. At the beginning of the TC, strong perturbations were applied to the basal transcription levels of about a third of all genes. Halfway through the TC the perturbation was removed so that the network relaxed to the wild type (WT) equilibrium state (5 TC * 20 measurements * 10 genes = 1000 values). All other datasets contained equilibrium gene levels only. Ten single gene knockout (KO), knockdown (KD) and multifactorial (MF) perturbations (3 * 10 perturbations * 10 genes = 300 values) were provided. Compared to the wild type (WT), basal transcription levels of KO and KD target genes were reduced to 0% and 50%, respectively. MF datasets were generated by applying moderate perturbations to the basal activation levels of all genes in the network. Thus, MF datasets could be regarded as transcriptional variations between different individuals. Given gene levels were scaled to be in the range [0, 1].

Petri Nets
The application of Petri net theory for modeling and analysis of biological networks is well established in the field of systems biology [21–24]. Petri nets are graph representations of networks consisting of two types of nodes: places, representing entities like proteins, genes, metabolites etc, and transitions, representing reactions or, in general, state changes of entities. The state of an entity is defined by the tokens that represent the marking of the according place and the overall system state by the marking of the Petri net. Directed edges (arcs) connect places to transitions (input arcs) or transitions to places (output arcs). These arcs not only depict which entities influence reactions or are influenced by them, but they also exactly define the effects of a reaction, e.g. by specifying the amount of substrate consumed and the amount of product produced during a reaction (the firing of a transition). For a detailed description of classical Petri nets see [25]. In addition, there exists a wide variety of extensions of Petri nets [26]. A Petri Net with Fuzzy Logic (PNFL) can be defined as an instance of a hybrid functional Petri net (HFPN) [27].

In PNFL models of gene regulatory networks, the activity of a target gene t is controlled by a single transition that discharges into a single output place (see Fig. 1). The marking of this place represents t's numerical gene level l_t. The relationship between an effector place and a target place is called an effect. It is mediated by an effect arc connecting an effector-gene place to the transition that controls the target place.

Transitions are always enabled as each place always contains a valid (real-valued) token. Firing a transition removes the old marking on the target place via a target place-transition arc (Fig. 1). After quantifying the effect strength based on the effector gene level (via function c, see eq. 1–3), a new marking is assigned to the target place by the output function o (eq. 4–6). The marking on effector places remains unchanged (test arcs).

A transition, its output place and their connecting arcs can be replaced by hexagonal nodes (Fig. 1) to simplify the representation. In this reduced form, only hexagons and their connecting effect arcs remain as all transitions and places are replaced. Effect arcs will be attached to or detached from transitions during the network reconstruction process (section Reconstruction) thus connecting different hexagonal nodes in the reduced Petri net representation.

Modeling of gene regulatory relationships with PNFL
The evaluation of effects using fuzzy logic involves a three-step procedure that consists of fuzzification, the application of effector rules and defuzzification.

Figure 1. Petri Nets with Fuzzy Logic (PNFL). In Petri nets, states such as effector (e) or target (t) gene levels are represented by places and are depicted as circles. State changes are represented by transitions and are depicted as boxes. Effect arcs (i.e. effector place-transition arcs) define the effectors influencing a target gene via the transition. Firing transitions leaves the marking of the effector places unchanged (test arcs, dashed). After the application of rule tables r_{e,t} to effector gene levels l_e (function c, eq. 1–3), the target gene levels l_t are updated by the output function o (eq. 4–6). In Fig. 6, Fig. 8 and Fig. 10, we represent a transition and its output place as a simplified hexagonal node. The reconstruction determines the topology (= effect arcs) and the parametrization (= rule tables and combination operators) of PNFL models.
doi:10.1371/journal.pone.0012807.g001
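As a reading aid, the reduced (hexagon) form of a PNFL model can be written down as a small data structure: one node per gene, a list of incoming effect arcs with one rule table each, and one combination operator per target. The Python sketch below is illustrative only. The class and field names (`Effect`, `GeneNode`, `PNFLModel`) are assumptions made for this example and are not taken from the original implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical names throughout; this only mirrors the structure described in the text.
RULE_TABLES = ("+++", "++", "+", "-", "--", "---")  # sign and strength of an effect
OPERATORS = ("AND", "OR", "MEAN")                   # one operator per target gene


@dataclass
class Effect:
    """Effect arc: effector place -> transition of the target gene."""
    effector: str
    rule_table: str  # one of RULE_TABLES

    def __post_init__(self) -> None:
        if self.rule_table not in RULE_TABLES:
            raise ValueError(f"unknown rule table: {self.rule_table}")


@dataclass
class GeneNode:
    """A transition merged with its output place (hexagon in Fig. 1)."""
    name: str
    level: float = 0.0        # marking l_t, a value in [0, 1]
    operator: str = "MEAN"    # combination logic used for several effectors
    effects: List[Effect] = field(default_factory=list)


@dataclass
class PNFLModel:
    genes: Dict[str, GeneNode] = field(default_factory=dict)

    def add_effect(self, effector: str, target: str, rule_table: str) -> None:
        if effector == target:
            raise ValueError("self loops do not occur in the DREAM4 networks")
        self.genes[target].effects.append(Effect(effector, rule_table))

    def edges(self) -> List[Tuple[str, str]]:
        """Topology as (effector, target) pairs."""
        return [(e.effector, g.name) for g in self.genes.values() for e in g.effects]


model = PNFLModel({n: GeneNode(n) for n in ("g1", "g2", "g3")})
model.add_effect("g1", "g2", "+++")
model.add_effect("g3", "g2", "-")
model.genes["g2"].operator = "AND"
```

In this representation the topology is the set of `Effect` objects and the parametrization is the pair (rule table per effect, operator per target), which are the two things the reconstruction is said to determine.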


Fuzzification. In a first step, the continuous gene level l_e ∈ [0,1] of an effector e is transformed (fuzzified) into the fuzzy value ⟨L(low,l_e), L(med,l_e), L(high,l_e)⟩ by triangular membership functions (eq. 1, Fig. 2):

$$
\begin{aligned}
L(\mathrm{low}, l_e) &= \begin{cases} 0, & \text{if } l_e > 0.5 \\ 1 - 2\,l_e, & \text{otherwise} \end{cases} \\
L(\mathrm{high}, l_e) &= \begin{cases} 2\,l_e - 1, & \text{if } l_e > 0.5 \\ 0, & \text{otherwise} \end{cases} \\
L(\mathrm{med}, l_e) &= 1 - L(\mathrm{low}, l_e) - L(\mathrm{high}, l_e)
\end{aligned}
\tag{1}
$$

Such membership functions are called fuzzy sets [28]. Contrary to a classical set, where an object is either contained in the set or not (two-valued logic, {0,1}), a fuzzy set assigns a degree of membership from the interval [0,1] to each object. Thus, the fuzzy value resulting from the fuzzification of a gene level l_e with respect to three fuzzy sets S ∈ {low, med, high} can be interpreted as a fuzzy discretization.

Figure 2. Fuzzification and defuzzification. We use triangular membership functions to fuzzify the continuous gene levels of an effector e into fuzzy sets. As shown by the magenta arrow, a continuous gene level of l_e = 0.25 is fuzzified into the fuzzy value ⟨L(low,l_e) = 0.5, L(med,l_e) = 0.5, L(high,l_e) = 0.0⟩. This can be reversed by defuzzification without loss of information.
doi:10.1371/journal.pone.0012807.g002

Application of effector rules. Based on the discretization defined by fuzzy sets, the properties of regulatory relationships are modeled by rule tables r_{e,t} (Fig. 3) in analogy to Boolean network models. Rule tables (as used in DREAM4) define three levels of effect strength for both activation (+++, ++, +) and inhibition (−−−, −−, −). A rule table r_{e,t}: S → S maps each effector set E ∈ S to a corresponding target set T ∈ S. The application of a rule by the sum-product logic [29] results in a fuzzy rule consequent C:

$$
C(T, l_e, r_{e,t}) = \sum_{E \in S} \begin{cases} L(E, l_e), & \text{if } r_{e,t}(E) = T \\ 0, & \text{otherwise} \end{cases}
\tag{2}
$$

Applying eq. 2 to all sets T ∈ S results in a fuzzy value ⟨C(low,l_e,r_{e,t}), C(med,l_e,r_{e,t}), C(high,l_e,r_{e,t})⟩ describing e's effect on the target gene t, i.e. it is a fuzzy discretization of the proposed effect.

Defuzzification. By center of gravity defuzzification we obtain a continuous rule consequent c:

$$
c(l_e, r_{e,t}) = \frac{0 \cdot C(\mathrm{low}, l_e, r_{e,t}) + 0.5 \cdot C(\mathrm{med}, l_e, r_{e,t}) + 1 \cdot C(\mathrm{high}, l_e, r_{e,t})}{C(\mathrm{low}, l_e, r_{e,t}) + C(\mathrm{med}, l_e, r_{e,t}) + C(\mathrm{high}, l_e, r_{e,t})}
\tag{3}
$$

with the centers of gravity at 0, 0.5 and 1. Note that due to our choice of fuzzy sets and rule tables the value of the denominator always equals one. An example calculation involving eq. 1–3 is shown in Fig. 4.

Combination of effects. If several effectors regulate a target gene, their combined effect on the target can be modeled by logical operations [30]. We model two kinds of dependent regulation by the minimum of the effects (AND operator) or the maximum of the effects (OR operator). The average (MEAN) models the independent regulation of a target by its effectors. Dependent and independent regulation are described in Fig. 5. The combination logic currently used in PNFL allows only a single operator (either AND, OR or MEAN) to be selected per target gene regardless of the number of effectors, see eq. 4 and eq. 6.

PNFL simulation
Before simulation, gene levels are initialized to their wild type levels as provided by DREAM4. Let t be a gene targeted by n effectors e_1,…,e_n. In each simulation step, updates u of the levels of all genes are computed from the continuous rule consequents c_j = c(l_{e_j}, r_{e_j,t}):

$$
u(c_1, \ldots, c_n) = \begin{cases} 1, & \text{if } n = 0 \\ c_1, & \text{if } n = 1 \\ \mathrm{op}(c_1, \ldots, c_n), & \text{otherwise} \end{cases}
\tag{4}
$$

with op ∈ {AND, OR, MEAN}. Subsequently, u is applied to the gene levels l_t of all target genes via the output function o (eq. 5) at once.

$$
l_t = o(l_t, c_1, \ldots, c_n) = \begin{cases} 0, & \text{if } \beta_t = 0 \\ \alpha \cdot \beta_t \cdot u(c_1, \ldots, c_n) + (1 - \alpha) \cdot l_t, & \text{otherwise} \end{cases}
\tag{5}
$$
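The update of eq. 4 and eq. 5 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`update`, `output`, `step`) and the dictionary-based model layout are hypothetical, each effect is passed in as a plain function that stands in for the defuzzified rule consequent c_j, and the default α = 0.5 is simply one of the values listed in Table 1. Here α is the update ratio and β_t the transcription rate of gene t (β_t = 1 for the wild type).

```python
from statistics import mean

# Combination operators from the text: AND = minimum, OR = maximum, MEAN = average.
OPERATORS = {"AND": min, "OR": max, "MEAN": mean}


def update(consequents, op):
    """Eq. 4: combine the continuous rule consequents c_1..c_n of one target."""
    if len(consequents) == 0:
        return 1.0
    if len(consequents) == 1:
        return consequents[0]
    return OPERATORS[op](consequents)


def output(level, consequents, op, alpha=0.5, beta=1.0):
    """Eq. 5: new level of a target gene; beta = 0 switches the gene off."""
    if beta == 0:
        return 0.0
    return alpha * beta * update(consequents, op) + (1 - alpha) * level


def step(levels, effects, ops, alpha=0.5, betas=None):
    """Update all genes at once, i.e. from the levels of the previous step.

    levels:  {gene: level}
    effects: {target: [function(levels) -> consequent c_j, ...]}  (hypothetical layout)
    ops:     {target: "AND" | "OR" | "MEAN"}
    """
    betas = betas or {}
    return {
        t: output(levels[t], [c(levels) for c in effects.get(t, [])],
                  ops.get(t, "MEAN"), alpha, betas.get(t, 1.0))
        for t in levels
    }
```

Building the new dictionary from the old `levels` only is what makes the update synchronous: no gene sees a level that was already changed in the same step.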

Figure 3. Rule tables. Given fuzzy effector gene levels, we describe the behavior of the targets by rule tables. Rule tables define sign and strength of effects. Fully active strong (−−−, A) or medium (−−, B) inhibitors result in low target activity, which is in contrast to weak inhibitors (−, C). The corresponding strong (+++), medium (++) and weak (+) activator rule tables are constructed by exchanging high by low and low by high in the target column.
doi:10.1371/journal.pone.0012807.g003


Figure 4. Fuzzy effect calculation example. In this example, the gene level of effector e is l_e = 0.125. It is transformed (fuzzified, panel A) into the fuzzy gene level L by application of eq. 1. In panel B, the rule table r_{e,t} (Fig. 3C) is applied to describe the influence of e onto its target gene t by the rule consequent C. C is derived by eq. 2, yielding the fuzzy value ⟨0, 0.25, 0.75⟩ (panel B). The real valued influence of e onto t, c(l_e, r_{e,t}) = 0.875, is calculated by defuzzification (panel C). Such a calculation is performed for all effectors of the target gene t individually. The influences are combined by eq. 4 or eq. 6 (not shown here, see text).
doi:10.1371/journal.pone.0012807.g004
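The three-step calculation of Fig. 4 (eq. 1–3) can be retraced with a short Python sketch. The function names are hypothetical. The weak-inhibitor rule table is partly an assumption: the rows low → high and med → med follow from the numbers quoted in the caption, whereas the row high → med is a guess, because the table itself is only available as a figure. That guessed row does not influence the example, since L(high, 0.125) = 0.

```python
LOW, MED, HIGH = "low", "med", "high"
CENTERS = {LOW: 0.0, MED: 0.5, HIGH: 1.0}  # centers of gravity used in eq. 3


def fuzzify(level):
    """Eq. 1: triangular membership of a gene level in [0, 1]."""
    low = 0.0 if level > 0.5 else 1.0 - 2.0 * level
    high = 2.0 * level - 1.0 if level > 0.5 else 0.0
    return {LOW: low, MED: 1.0 - low - high, HIGH: high}


def apply_rules(fuzzy_level, rule_table):
    """Eq. 2: sum the memberships of all effector sets mapped to each target set."""
    consequent = {LOW: 0.0, MED: 0.0, HIGH: 0.0}
    for effector_set, target_set in rule_table.items():
        consequent[target_set] += fuzzy_level[effector_set]
    return consequent


def defuzzify(consequent):
    """Eq. 3: center-of-gravity defuzzification."""
    total = sum(consequent.values())
    return sum(CENTERS[s] * v for s, v in consequent.items()) / total


# ASSUMPTION: low -> high and med -> med are implied by Fig. 4; high -> med is a guess.
WEAK_INHIBITOR = {LOW: HIGH, MED: MED, HIGH: MED}

fuzzy = fuzzify(0.125)                            # low 0.75, med 0.25, high 0.0
consequent = apply_rules(fuzzy, WEAK_INHIBITOR)   # low 0.0, med 0.25, high 0.75
effect = defuzzify(consequent)                    # 0.875, as stated in Fig. 4
```

Defuzzifying the unmodified fuzzy value returns the original level, e.g. `defuzzify(fuzzify(0.25))` gives 0.25, which is the lossless round trip mentioned in the caption of Fig. 2.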

Figure 5. Combinatorial gene regulation. The regulatory logic of different transcription factors (TFs) regulating a target gene used in DREAM4 was disclosed after the challenge. TFs are assumed to bind to cis-regulatory modules (CRMs) to regulate the expression of target genes. Individual CRMs act as enhancers (red) or repressors (blue) of gene regulation. The bound states of different CRMs (e.g. by TFs 4 and 10) are mutually independent. A complex of TFs regulating a given CRM can be represented as AND operator. TFs 1 and 7 are mutually dependent to form the complex and regulate the gene. In turn, a complex of TFs controlling a repressing CRM can be implemented by the OR operator (not shown). The effects of several CRMs on the activity of the target gene are averaged (MEAN operator). In contrast to the arbitrary combination of operators in the DREAM4 approach, PNFL selects only a single operator (AND, OR or MEAN) per target gene (see Methods and Results). The depicted regulation of gene 3 was taken from network 5 (see Fig. 10A).
doi:10.1371/journal.pone.0012807.g005

The scaling parameter α (Table 1) aligns the PNFL generated time courses to the provided time courses. The transcription rate parameter β_t tunes the transcription rate of gene t, with β_t = 1 for the wild type transcription rate.

Knockout, knockdown and double knockout data. Gene perturbations are simulated by reducing the transcription rate β_t. In case of a knockout or knockdown simulation, β_t of the perturbed gene t is set to 0 or 0.5, respectively. Similarly, double knockout simulations can be performed.

Time course data. Time course datasets were provided by DREAM4 to show the impact of strong gene perturbations (about a third of all genes) on a network as well as the relaxation to the wild type equilibrium state after removing the perturbations. A perturbation is represented as an additional (hidden, i.e. unobserved) node in the network. During reconstruction, we infer perturbation targets together with effector targets, as both were not disclosed in the challenge. Initially, we use eq. 6 instead of eq. 4 for all genes t directly affected by the perturbation. For a time course i, eq. 6 includes the perturbation term c(l_{p_i}, r_{p_i,t}) with l_{p_i} ≡ 1, where p_i is an additional perturbation effector with corresponding rules r_{p_i,t}. The perturbation is disabled halfway through the time course via switching back to eq. 4 thereby allowing the network to return to its wild type state. The reconstructed networks thus consist of 15 variables: 10 genes and 5 perturbation variables for the 5 different time courses.

$$
u_i(c_1, \ldots, c_n, c(l_{p_i}, r_{p_i,t})) = \begin{cases} c(l_{p_i}, r_{p_i,t}), & \text{if } n = 0 \\ \mathrm{op}(c_1, \ldots, c_n, c(l_{p_i}, r_{p_i,t})), & \text{otherwise} \end{cases}
\tag{6}
$$

Multifactorial data. DREAM4 also provided equilibrium values for multifactorial (MF) perturbations. Here, the basal transcription levels of all genes in the network were perturbed, but to a lesser degree compared to the time course perturbations. In contrast to the time courses, we do not compute additional rule consequents for the MF data. Instead, we test how well MF target gene levels can be generated if the PNFL wild type rules are applied to the provided MF effector gene levels. The MF target gene levels are thus approximated, as the basal activation changes are not reflected in the PNFL rules.

Differences between the PNFL and DREAM MF gene levels can be due to three reasons: (1) the inferred effects or their parametrization are inadequate: this should be corrected by the reconstruction method, (2) noise and (3) the MF changes to the basal transcription levels. Reasons (1) and (2) apply equally to all of the datasets. We did not account for reason (3), so deviations will be somewhat larger than for the other datasets. Therefore, we use lower weights for MF data in the objective function (Table 1).
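The time-course treatment described in this section (eq. 6 during the first half, eq. 4 afterwards) can be illustrated for a single perturbed target gene. The Python sketch below is a simplification under explicit assumptions: all function names are hypothetical, the consequents of the regular effectors are held constant instead of being recomputed from a full network, β_t is fixed to 1, and using 20 simulation steps to mirror the 20 measurements per time course is a choice made for this example only. The argument `perturbation` stands for the defuzzified term c(l_{p_i}, r_{p_i,t}) evaluated at l_{p_i} = 1.

```python
from statistics import mean

OPERATORS = {"AND": min, "OR": max, "MEAN": mean}


def update_perturbed(consequents, perturbation, op):
    """Eq. 6: like eq. 4, but with the perturbation term as an extra effect."""
    if len(consequents) == 0:
        return perturbation
    return OPERATORS[op](list(consequents) + [perturbation])


def update_wild_type(consequents, op):
    """Eq. 4, used again once the perturbation has been removed."""
    if len(consequents) == 0:
        return 1.0
    if len(consequents) == 1:
        return consequents[0]
    return OPERATORS[op](consequents)


def simulate_target(consequents, perturbation, op, steps=20, alpha=0.5, level=1.0):
    """Time course of one perturbed target with constant effector consequents.

    The perturbation acts during the first half of the steps only.
    Constant consequents are a simplification for illustration.
    """
    course = []
    for i in range(steps):
        if i < steps // 2:
            u = update_perturbed(consequents, perturbation, op)
        else:
            u = update_wild_type(consequents, op)
        level = alpha * u + (1 - alpha) * level  # eq. 5 with beta_t = 1
        course.append(level)
    return course


# A gene without regular effectors that is fully repressed by the perturbation:
course = simulate_target(consequents=[], perturbation=0.0, op="MEAN")
```

In this toy run the level decays towards 0 while the perturbation is active and relaxes back towards 1 after it is switched off, which is the qualitative shape the text describes for the DREAM4 time courses.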


Table 1. Parameters used for PNFL based reconstruction.

Parameter description             Equation   Values / lists of values (a)
Rule tables r: effect strength    2          (M), (W, S), (W, M, S) (b)
Combination operators             4, 6       (MEAN), (OR, AND), (OR, AND, MEAN)
Update ratio α                    5          0.4, 0.5, 0.6
Regularization parameter reg      7          0.005, 0.002, 0.001, 0.0005, 0.0002
Weight time course w_TC           7          1
Weight knockout w_KO              7          8
Weight knockdown w_KD             7          6
Weight multifactorial w_MF        7          4
Simulated annealing parameter k   8          0.02

(a) Parameters from lists are randomly selected for ensemble predictions (see Submission).
(b) Degrees of effect strength, W = weak = (+, −), M = medium = (++, −−) and S = strong = (+++, −−−).
doi:10.1371/journal.pone.0012807.t001

Reconstruction
Overview. We construct PNFL models by inferring and parametrizing relationships between genes via appropriate rule tables. Starting from a randomly initialized PNFL model our reconstruction approach (Fig. 6) proceeds via four steps: (1) The topology and parametrization of the initial network are modified by the application of moves. (2) After each move, data is simulated by PNFL and (3) compared to the original data via an objective function. Finally, (4) we use a simulated annealing protocol to decide if a given move should be accepted or rejected. The network optimization thus aims at the best possible agreement between the DREAM4 provided and PNFL simulated datasets.

Note that the networks discussed here always include 10 genes and the PNFL models always contain one place and one transition for each gene. Topological changes of the PNFL models only involve attachment or detachment of input arcs.

Figure 6. Overview network reconstruction. To reconstruct the original network (A) we mimic the DREAM4 data generation process (A→B). The knockout (KO) of gene 1 is depicted as an example data set in the lower panels. Our reconstruction starts from a randomly initialized population (C) and proceeds through network changing moves. After each move, data is generated by PNFL (D) and compared against the DREAM data (B). We implemented moves on single networks in the population and crossover moves that copy features between pairs of networks. Thereby, favourable features are propagated throughout the population, which eventually leads to improved networks (E) and corresponding datasets (F). Note that, in contrast to the PNFL simulation (D,F), only equilibrium values were given for knockout experiments in DREAM4 (B). Edges denote effect strength (thickness) and sign (activation = red, inhibition = blue).
doi:10.1371/journal.pone.0012807.g006

Move set and move probabilities. Starting from a population of randomly initialized networks, the reconstruction proceeds one network modifying move at a time. Each move modifies a single target gene. After a move, data is generated by PNFL and compared against the DREAM data (Fig. 6). We implemented moves on individual networks that add or remove effects (i.e. effect arcs), switch the effect combination logic op ∈ {AND, OR, MEAN} and increase or decrease the effector strength via selecting the corresponding rule tables r ∈ {+++, ++, +, −, −−, −−−}. Each network in the population evolves both independently, by the moves mentioned before, and by a set of crossover moves. The crossover moves copy effect strength, combination logic or effects between two individuals.

During reconstruction, particular moves are selected from the move set with a move probability that is proportional to the past move acceptance probability for that move.

Objective function. The quality of the reconstructed networks is evaluated by an objective function dist. It is based on the Pearson correlation coefficients r_t of the target genes t and a regularization term. Lower values of dist indicate a better agreement between the DREAM dataset vectors x_t and the PNFL dataset vectors y_t and thus better PNFL models. The vectors x_t and y_t are formed by the concatenation of all four kinds of datasets (10 knockout, 10 knockdown, 100 time course, 10 multifactorial values per gene). An additional vector w = (w_KO,…,w_KO, w_KD,…,w_KD, w_TC,…,w_TC, w_MF,…,w_MF) weights the data points with dataset specific weights (Table 1). All three vectors


are of length 130. The weighted r_t is calculated based on the weighted covariance and the weighted mean (not shown). A model coefficient r is calculated as the average of the gene coefficients r_t. In addition, we introduced a regularization parameter reg. It allows us to control |Network|, i.e. the number of edges (= effect arcs) in the models.

$$
\mathit{dist} = [1 - r(x, y, w)]^2 + \mathit{reg} \cdot |\mathit{Network}|
\tag{7}
$$

Note that DREAM4 provided only equilibrium values for the knockout, knockdown and multifactorial datasets. Only for the time course datasets gene levels for different measurements were available and are used for the calculation of dist.

Figure 7. Evaluation of the in silico challenge comprising five networks of ten genes. Panel A shows the prediction performance of the directed unsigned topology as the area under the precision recall curve (AUPR). In a bonus challenge, steady-state level predictions of dual knockout experiments were evaluated by the mean squared error (MSE, panel B). Our performance is shown in green.
doi:10.1371/journal.pone.0012807.g007

Figure 8. PNFL reconstruction of network 5 (AUPR = 76%). DREAM4 evaluated our predictions (panel A) in terms of correct (colored solid), missed (black) and surplus (dotted) edges. For simulation, we also infer three levels of effect strength (edge thickness) for both activation (red) and inhibition (blue). Targets regulated by multiple effectors are parametrized by the kind of regulation, i.e. dependent (AND, OR) vs. independent (MEAN). Incorrect predictions are more frequent when effector gene levels are low in the wild type (e.g. genes 4, 5, 6 and 9). In panels B and C we compare the provided DREAM data to the PNFL simulation for the knockout of gene 8.
doi:10.1371/journal.pone.0012807.g008

Simulated annealing. We employ simulated annealing to decide if a network changing move is accepted or rejected. That is, we always accept moves that improve the network with respect to the objective function dist. We accept inferior networks with a probability p calculated from the Boltzmann distribution parametrized by k (Table 1). Essentially, moves that only slightly increase dist are accepted more frequently, especially if the temperature T is high. T decreases linearly during the


reconstruction run from one to zero.

$$
p = e^{-k \cdot \Delta f / T}, \quad T \in [1..0]
\tag{8}
$$

Submission. The DREAM4 submission format required effects to be ranked by their prediction confidence. We therefore chose a consensus approach to predict an ensemble of networks. Consensus prediction approaches have been successfully applied to network reconstruction before [31]. We carry out 100 reconstruction runs with different parameter settings (Table 1) and random seeds. Ranking is based on the effect prediction confidence calculated as the fraction of networks that included the given effect. This automated approach was the same for each of the submitted nets. No manual processing was performed. For the visualization and description of effects or networks, we assume an effect to be predicted if the prediction confidence is 50% or above.

Results

Size of the model space
The number of possible models m depends on the number of genes g = 10, the number of time courses tc = 5, the number of rule tables r = 6 (Fig. 3) and the number of combination operators |op| = 3 (eq. 4). The number of models m depends on the number of time courses tc because each time course introduces an additional perturbation variable. As we do not restrict the number of effectors, a gene can be affected by zero, one or up to n = g − 1 + tc = 14 variables, i.e. 9 other genes (as self interactions were not allowed) and 5 perturbation variables (eq. 6).

$$
m = \left[ 1 + n \cdot r + |\mathrm{op}| \cdot \sum_{k=2}^{n} \binom{n}{k} \cdot r^k \right]^g
\tag{9}
$$

In the given setting, the size of the model space is 1.2 × 10^123. Thus, a heuristic search strategy is necessary to detect high scoring networks.

Reconstruction run time
The most time expensive steps are the simulations needed to calculate the objective function after each move. Each move requires 35 simulations, i.e. 5 time courses, 10 knockouts, 10 knockdowns and 10 multifactorials. A typical reconstruction run consists of 2500 moves (1 ms per move and network) on a population of 25 individual networks and takes about a minute on a single processor core. During the run, the individuals in the population usually converge to a single network with only minor variations (data not shown).

Relative contribution of the different datasets
The contribution of the different datasets to the prediction of networks is given by dataset specific weights. The weights were derived manually based on randomly generated PNFL models. The relative contributions of the individual datasets amount to KO = 29% (w_KO * 10 data points = 80; compare eq. 7 and Table 1), KD = 21% (w_KD * 10 = 60), TC = 36% (w_TC * 100 = 100) and MF = 14% (w_MF * 10 = 40). While the combination of KO+KD accounts for half of the total dataset weights, the largest individual portion stems from the TC data.

DREAM4 evaluation results
The overall results of the in silico size ten challenge as reported by the DREAM organizers are depicted in Fig. 7. The network topology was predicted by 29 different teams. In terms of the AUPR, our PNFL based reconstruction approach (81% AUPR averaged over 5 networks) outperformed the second best team by 20 percentage points. Our approach performed best on four of the five networks and second best on the remaining network. In an additional challenge, steady state gene levels in response to double knockout mutations were predicted. This evaluated the ability to predict the behavior of networks under previously unseen experimental conditions. Only 7 teams participated in the double knockout predictions (Fig. 7B) where PNFL also was the top performer.

Reconstruction of network 5
Our reconstruction of network 5 (Fig. 8A) achieved an AUPR of 76%. The panels B and C in Fig. 8 compare the provided data (DREAM) to the PNFL simulation for the knockout of gene 8. Genes up (e.g. gene 7) or down regulated (e.g. gene 1) are captured correctly in the PNFL simulation. To simplify the representation of networks, transitions and the corresponding output places (compare Fig. 1) are merged into single nodes depicted as hexagons.

Network 5 demonstrates the utility of the multifactorial data (Fig. 9) for network reconstruction. According to personal communication at the joint RECOMB/DREAM conference 2009, several participants neglected to utilize this kind of data. The four-gene cycle (genes 5→6→8→7→5, Fig. 8A) in network 5 is an example for a difficult network motif that our approach predicts correctly only if the multifactorial data is included.

Figure 9. Generation of multifactorial (MF) data for an effect in network 5. In network 5, gene 6 is the only effector for gene 8 (see Fig. 8).
Effectors are initialized by the provided MF gene levels (A). Subsequently, individual PNFL transitions are applied to compute the MF gene levels for
the targets (C). The objective function compares the target gene levels of the provided MF data (B) to the PNFL outputs (C).
doi:10.1371/journal.pone.0012807.g009

PLoS ONE | www.plosone.org 7 September 2010 | Volume 5 | Issue 9 | e12807



Incorrect predictions were more likely when the effector gene levels were low in the wild type (e.g. genes 4, 5, 6 and 9 in network 5). Here, predictions frequently contain shortcuts with respect to the true topology (Fig. 8, correct: 5→6→9, predicted: 6←5→9; correct: 9→1→2, predicted: 1←9→2; see also Fig. 10, correct: 9→10→3, predicted: 10←9→3). This leads to two errors: As the effect 6→9 cannot be directly observed in the given data, it is missed as it is already ‘explained’ by an incorrectly predicted surplus effect (here: 5→9). Such a missing observation can for instance be due to knockouts or knockdowns exhibiting no substantial effect because of low wild type gene levels.

Reconstruction of network 1
The reconstruction of network 1 (Fig. 10A) achieved a very high AUPR of 92%. Here, we predicted 14 out of 15 effects correctly. For a correct reproduction of time course data (e.g. time course 2 in Fig. 10, compare panels B and C) we also infer perturbation target genes. According to our reconstruction, the perturbation p2 in time course 2 affects genes 3 and 7.
Network 1 was selected to demonstrate the capability of PNFL
to represent oscillating network motifs. Oscillations require cycles
that seem to pose no particular difficulty for the PNFL based
reconstruction. Each of the three nested cycles contained in
network 1 (genes 3↔4, 3↔7, 3→7→4→3) was resolved correctly.
In addition, genes 3, 4 and 7 were recognized as an oscillation
generating network motif. The removal of the perturbation
triggers oscillations for instance in gene 7, which was picked up
clearly in the PNFL simulation (Fig. 10, panels B and C).
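
The AUPR values reported for the individual networks are derived from a list of effects ranked by prediction confidence, i.e. the fraction of reconstruction runs that contain a given effect. The Python sketch below illustrates this under simplifying assumptions: the helper names and the toy networks are hypothetical, and the area is approximated by average precision, which may differ in detail from the scoring used by the DREAM organizers.

```python
from collections import Counter

def effect_confidence(networks):
    """Fraction of reconstructed networks that contain each effect."""
    counts = Counter(edge for net in networks for edge in set(net))
    return {edge: c / len(networks) for edge, c in counts.items()}

def aupr(ranked_edges, true_edges):
    """Area under the precision-recall curve, approximated as average precision.

    True effects that never appear in the ranking contribute zero.
    """
    hits, area = 0, 0.0
    for rank, edge in enumerate(ranked_edges, start=1):
        if edge in true_edges:
            hits += 1
            area += hits / rank  # precision at the rank of each recovered effect
    return area / len(true_edges)

# Hypothetical toy data: three runs, true effects 1->2 and 2->3
runs = [{(1, 2), (2, 3)}, {(1, 2), (1, 3)}, {(1, 2), (2, 3)}]
conf = effect_confidence(runs)
ranking = sorted(conf, key=conf.get, reverse=True)
predicted = [edge for edge in ranking if conf[edge] >= 0.5]  # 50% threshold

print(ranking)                          # [(1, 2), (2, 3), (1, 3)]
print(predicted)                        # [(1, 2), (2, 3)]
print(aupr(ranking, {(1, 2), (2, 3)}))  # 1.0
```

In this toy example both true effects are ranked above the single false one, so the area is 1.0; a false effect ranked first would lower the precision at every subsequent true effect.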

Validation of effect signs


The validation described in this and the following subsections is
based on supplementary material posted after the completion
of DREAM4 (http://gnw.sourceforge.net/resources/DREAM4%20in%20silico%20challenge.zip).
For instance, it enables the validation of the signs of the effects
in the models, i.e. whether a target is activated or inhibited by a
given effector. Effect signs are
determined by the effector rule tables (see Application of effector
rules) selected during PNFL reconstruction. Sign predictions can
only be evaluated for correct effector-target predictions. Here, the
signs were predicted correctly in 100% of the cases.
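
The restriction to correct effector-target predictions can be made explicit in a few lines. This is a minimal Python sketch with hypothetical names and made-up example effects, not DREAM4 data; it assumes that effects are stored as (effector, target) pairs with a sign of +1 for activation and -1 for inhibition.

```python
def sign_accuracy(predicted, truth):
    """Fraction of correctly predicted effects whose sign also matches.

    Both arguments map (effector, target) -> +1 (activation) or -1 (inhibition).
    """
    shared = predicted.keys() & truth.keys()  # correct effector-target pairs only
    if not shared:
        return None  # no correct effect, so no sign can be evaluated
    return sum(predicted[e] == truth[e] for e in shared) / len(shared)

# Hypothetical example
truth = {(1, 2): +1, (2, 3): -1, (3, 1): +1}
predicted = {(1, 2): +1, (2, 3): -1, (1, 3): -1}
print(sign_accuracy(predicted, truth))  # 1.0
```

The surplus effect 1→3 and the missed effect 3→1 do not enter the ratio, since neither has a counterpart whose sign could be compared.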

Validation of the regulatory logic


Logical operations are used by both PNFL and DREAM4 to
combine the effects of multiple effectors on a given target gene.
Thereby, dependent (AND, OR) and independent (MEAN) kinds of
regulation are distinguished (see Fig. 5). In the DREAM4 setting,
arbitrary combinations of these operators are possible. For
instance, the activation state of gene 3 in network 1 (Fig. 5) is
described by a combination of AND and MEAN operators. Such a
combination cannot be expressed in PNFL as it currently allows only one operator per
target gene. Note that this does not apply to the effect signs, which
are assigned to each effect separately by PNFL as well as
DREAM4.
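
The difference between dependent and independent regulation can be illustrated with a small Python sketch. It assumes the common fuzzy logic choices of minimum for AND and maximum for OR; the exact operator definitions used by PNFL are those of Fig. 5 and eq. 4 and may differ. The function name and the example levels are hypothetical.

```python
def combine(levels, op):
    """Combine effector contributions (fuzzy values in [0, 1]) with one operator."""
    if op == "AND":
        return min(levels)  # assumed: limited by the weakest contribution
    if op == "OR":
        return max(levels)  # assumed: driven by the strongest contribution
    if op == "MEAN":
        return sum(levels) / len(levels)  # every contribution counts equally
    raise ValueError(f"unknown operator: {op}")

levels = [0.2, 0.8]
for op in ("AND", "OR", "MEAN"):
    print(op, combine(levels, op))
# AND 0.2
# OR 0.8
# MEAN 0.5
```

Under these assumptions, the result of AND or OR is determined by a single effector at a time, whereas MEAN lets each effector shift the target level regardless of the others.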
A rigorous comparison between PNFL and DREAM4 logic is not
possible if more than one operator is involved as in Fig. 5. We then
consider PNFL and DREAM4 regulatory logics as approximately
equal if the operator that combines the majority of terms is
predicted by PNFL. In the example of Fig. 5, MEAN (combining
three terms, i.e. the three CRMs) would be correct whereas AND
(combining two terms, i.e. TF1 and TF7) and OR (not used here)
would be incorrect. According to this, our predictions are correct in
13 out of the 18 targets (72%) regulated by multiple effectors. Three
of the five mismatches are explained by topological errors. Here, the
corresponding target genes are connected to single effectors in the
PNFL models and to multiple effectors in the DREAM4 network. Thus, if the predicted topology permits the inference of the regulatory logic it is correct in 87% (= 13/15) of the cases.

Figure 10. PNFL reconstruction of network 1 (AUPR = 92%). Shown is our reconstruction of network 1 (A) and the data of time course 2 as provided by DREAM (B) or simulated by PNFL (C). Time course data shows how the network responds to the application and removal of perturbations. In addition to effector targets (eq. 4), we also predict perturbation targets (eq. 6). According to our reconstruction, perturbation p2 in time course 2 affects genes 3 and 7.
doi:10.1371/journal.pone.0012807.g010

Validation of time course perturbation targets
The time courses emerge from perturbations that affect a specific subset of target genes. The PNFL based simulation of the
time courses thus required the prediction of the targets of a perturbation (see Time course data). The evaluation of our prediction performance on inferring the time course perturbation targets resulted in an AUPR of 73%. The performance difference to the prediction of effector targets (AUPR of 81%) is due to the fact that each perturbation corresponds to a single time course. The remaining four time courses (and all of the KO, KD and MF datasets) do not provide any information with regard to the targets of a selected perturbation.

Discussion

We presented a method for network reconstruction that uses Petri Nets with Fuzzy Logic (PNFL) for modeling and simulation. This approach was the best performer (Fig. 7) in the in silico size ten challenge of the 2009 DREAM4 assessment of reverse engineering methods. Why did it work so well?
Our approach optimizes models to achieve the best possible agreement between PNFL generated datasets and the datasets provided in the DREAM4 challenges. To get the most out of the data, we employ specific simulation approaches for each of the available datasets. This allows us to exploit and score heterogeneous datasets in a unified way. We further reduce the model complexity severely to avoid the risk of overfitting. Ideally, only a single network should be able to reproduce the data. The model space is still huge, requiring a heuristic, population based search strategy. It avoids local minima traps and thus improves the convergence of networks and also the agreement between PNFL and DREAM datasets. The resulting PNFL models accurately predict the network behavior even under new experimental conditions not seen during model building. This was demonstrated in the double knockout challenge (Fig. 7B).
Incorrect predictions might result when the effector gene levels are low in the wild type. Knockout experiments, for instance, provide only little topological information in such cases. This is particularly frequent in network 2 (data not shown) where different network topologies generate similar data and our reconstruction does not converge to a single network. Indeed, no team achieved a good prediction performance for network 2. In case of genes with low wild type expression, over-expression instead of knockout experiments should be performed.
Several of the participating teams focused on the knockout (KO) datasets and neglected to exploit the time course (TC), multifactorial (MF) and knockdown (KD) data in their reconstruction (personal communication at the joint RECOMB/DREAM conference 2009). We found that only the combination of all provided datasets enabled us to predict particularly difficult network motifs. An example is the unusual four-node cycle in Fig. 8A that is predicted correctly only when using the MF data (Fig. 9). In general, cycles and nested cycles pose no particular difficulty to our PNFL based approach (e.g. Fig. 10A). The time course shown in Fig. 10 demonstrates that our reconstruction also resolves and recognizes oscillating network motifs.
The reconstruction of PNFL models reliably determines a range of mechanistic details that go beyond the graph topology evaluated in DREAM. Our models distinguish activation from inhibition, dependent from independent regulation as well as strong, medium and weak degrees of effect strength. Such intuitive assertions are sufficient to specify, visualize and thus comprehend executable models and their parameters. This is a characteristic feature of fuzzy logic modeling [18]. Similarly intuitive notions are more difficult if not impossible to obtain from ODE, mutual information or Bayesian models. Nevertheless, both PNFL and ODE enable the detailed simulation of models. Simulation models can facilitate an iterative cycle of model improvements based on the comparison between in silico and laboratory experiments.

Acknowledgments
We acknowledge Florian Erhard for the stimulating discussions on the PNFL system. We thank all reviewers for their suggestions, which substantially improved the manuscript.

Author Contributions
Conceived and designed the experiments: RK TP LW RZ. Performed the experiments: RK. Analyzed the data: RK TP LW RZ. Wrote the paper: RK. Implemented the software: RK.

References
1. de Jong H (2002) Modeling and simulation of genetic regulatory systems: a literature review. J Comput Biol 9(1): 67–103.
2. Brazhnik P, de la Fuente A, Mendes P (2002) Gene networks: how to put the function in genomics. Trends Biotechnol 20(11): 467–72.
3. Bansal M, Belcastro V, Ambesi-Impiombato A, di Bernardo D (2007) How to infer gene networks from expression profiles. Mol Syst Biol 3: 78.
4. Markowetz F, Spang R (2007) Inferring cellular networks–a review. BMC Bioinformatics 8 Suppl 6: S5.
5. Li H, Xuan J, Wang Y, Zhan M (2008) Inferring regulatory networks. Front Biosci 13: 263–75.
6. Hecker M, Lambeck S, Toepfer S, van Someren E, Guthke R (2008) Gene regulatory network inference: data integration in dynamic models-a review. Biosystems 96(1): 86–103.
7. Sima C, Hua J, Jung S (2009) Inference of gene regulatory networks using time-series data: a survey. Curr Genomics 10(6): 416–29.
8. de la Fuente A, Brazhnik P, Mendes P (2002) Linking the genes: inferring quantitative gene networks from microarray data. Trends Genet 18(8): 395–8.
9. Gardner TS, di Bernardo D, Lorenz D, Collins JJ (2003) Inferring genetic networks and identifying compound mode of action via expression profiling. Science 301(5629): 102–5.
10. Nachman I, Regev A, Friedman N (2004) Inferring quantitative models of regulatory networks from expression data. Bioinformatics 20 Suppl 1: i248–56.
11. Margolin A, Nemenman I, Basso K, Wiggins C, Stolovitzky G, et al. (2006) ARACNE: An Algorithm for the Reconstruction of Gene Regulatory Networks in a Mammalian Cellular Context. BMC Bioinformatics 7(Suppl 1): S7.
12. Friedman N, Linial M, Nachman I, Pe’er D (2000) Using Bayesian networks to analyze expression data. Journal of Computational Biology 7(3): 601–620.
13. Soranzo N, Bianconi G, Altafini C (2007) Comparing association network algorithms for reverse engineering of large-scale gene regulatory networks: synthetic versus real data. Bioinformatics 23(13): 1640–7.
14. Hache H, Lehrach H, Herwig R (2009) Reverse Engineering of Gene Regulatory Networks: A Comparative Study. EURASIP J Bioinform Syst Biol 2009: 617281.
15. Michoel T, De Smet R, Joshi A, Van de Peer Y, Marchal K (2009) Comparative analysis of module-based versus direct methods for reverse-engineering transcriptional regulatory networks. BMC Syst Biol 3: 49.
16. Zou C, Denby KJ, Feng J (2009) Granger causality vs. dynamic Bayesian network inference: a comparative study. BMC Bioinformatics 10: 122.
17. Marbach D, Prill RJ, Schaffter T, Mattiussi C, Floreano D, et al. (2010) Revealing strengths and weaknesses of methods for gene network inference. Proc Natl Acad Sci U S A 107(14): 6286–91.
18. Windhager L, Zimmer R (2008) Intuitive Modeling of Dynamic Systems with Petri Nets and Fuzzy Logic. German Conference on Bioinformatics, Lecture Notes in Informatics P-136: 106–115.
19. Prill RJ, Marbach D, Saez-Rodriguez J, Sorger PK, Alexopoulos LG, et al. (2010) Towards a rigorous assessment of systems biology models: the DREAM3 challenges. PLoS One 5(2): e9202.
20. Marbach D, Schaffter T, Mattiussi C, Floreano D (2009) Generating realistic in silico gene networks for performance assessment of reverse engineering methods. J Comput Biol 16(2): 229–39.
21. Sackmann A, Heiner M, Koch I (2006) Application of Petri net based analysis techniques to signal transduction pathways. BMC Bioinformatics 7: 482.
22. Heiner M, Koch I, Will J (2004) Model validation of biological pathways using Petri nets–demonstrated for apoptosis. Biosystems 75(1–3): 15–28.
23. Chen L, Qi-Wei G, Nakata M, Matsuno H, Miyano S (2007) Modelling and simulation of signal transductions in an apoptosis pathway by using timed Petri nets. Journal of Biosciences 32(1): 113–127.
24. Marwan W, Wagler A, Weismantel R (2008) A mathematical approach to solve the network reconstruction problem. Mathematical Methods of Operations Research 67(1): 117–132.
25. Murata T (1989) Petri nets: Properties, analysis and applications. Proceedings of the IEEE 77(4): 541–580.
26. Chaouiya C (2007) Petri net modelling of biological networks. Brief Bioinformatics 8(4): 210–219.
27. Matsuno H, Tanaka Y, Aoshima H, Doi A, Matsui M, et al. (2003) Biopathways representation and simulation on hybrid functional Petri net. In Silico Biol 3(3): 389–404.
28. Zadeh LA (1965) Fuzzy sets. Information and Control 8: 338–353.
29. Mendel LM (1995) Fuzzy logic systems for engineering: a tutorial. Proceedings of the IEEE 83(3): 345–377.
30. Istrail S, Davidson EH (2005) Logic functions of the genomic cis-regulatory code. Proc Natl Acad Sci U S A 102(14): 4954–9.
31. Marbach D, Mattiussi C, Floreano D (2009) Combining multiple results of a reverse-engineering algorithm: application to the DREAM five-gene network challenge. Ann N Y Acad Sci 1158: 102–13.
