Modelling provenance using
                                         Structured Occurrence Networks




                          Paolo Missier, Brian Randell, Maciej Koutny
                                     Newcastle University, UK


                                           IPAW’12
                                 Santa Barbara, CA, June 2012




Thursday, June 21, 2012
Petri Nets and Occurrence Nets

                               Historically, Occurrence Nets have been used to define
                               process semantics of Petri Nets [2]
   IPAW 2012 - P.Missier




                           [2] Best, Eike, and Raymond Devillers. “Sequential and concurrent behaviour in Petri net theory.” Theoretical Computer
                                Science 55, no. 1 (1987): 87-136. http://www.sciencedirect.com/science/article/pii/0304397587900909.
    2

Thursday, June 21, 2012
Occurrence Nets




                                ON is about capturing causal relationships
                                between events and conditions


                           ON = (C, E, F ) with nodes C E and a flow relation F ⊆ (C ×E)∪(E × C) satisfying the following:
                           (i) for every condition c there is at most one event e such that (e,c) ∈ F, and at most one event f such that (c,f) ∈ F;
                           (ii) for every event e there is at least one condition c such that (c, e) ∈ F , and at least one condition d such that (e,
                             d) ∈ F ;
                           (iii) ON forms an acyclic graph and F + is a partial order relation (the relation PrecON = (F ◦ F)|C×C is acyclic)
   IPAW 2012 - P.Missier




    3

Thursday, June 21, 2012
Provenance and Occurrence Nets
                           • "Provenance is defined as a record that • An Occurrence Net is an abstract record of
                             describes the people, institutions,       a single execution of some computing
                             entities, and activities involved in      system
                             producing, influencing, or delivering a   • Only information about causality and
                             piece of data or a thing." [PROV]            concurrency between events and visited
                                                                          local states is represented
   IPAW 2012 - P.Missier




    4

Thursday, June 21, 2012
Provenance and Occurrence Nets
                           • "Provenance is defined as a record that • An Occurrence Net is an abstract record of
                             describes the people, institutions,       a single execution of some computing
                             entities, and activities involved in      system
                             producing, influencing, or delivering a   • Only information about causality and
                             piece of data or a thing." [PROV]            concurrency between events and visited
                                                                          local states is represented


                           Data: The evolution of variable A as a ON:
   IPAW 2012 - P.Missier




    4

Thursday, June 21, 2012
Provenance and Occurrence Nets
                           • "Provenance is defined as a record that • An Occurrence Net is an abstract record of
                             describes the people, institutions,       a single execution of some computing
                             entities, and activities involved in      system
                             producing, influencing, or delivering a   • Only information about causality and
                             piece of data or a thing." [PROV]            concurrency between events and visited
                                                                          local states is represented


                           Data: The evolution of variable A as a ON:




                           Agents: The evolution of Bob, the document editor:
                                         read                                         drafted    ready
                                                        performed       verified
                                  ptd    paper                                        internal     to
                                                           exp.         results                   draft
                                          p1                                           memo
   IPAW 2012 - P.Missier




                                                                        read
                                                                        paper
                                                                         p2




    4

Thursday, June 21, 2012
From ON to Structured ON
                             ON is an adequate starting point, but extensions are needed to
                             represent the activity of complex systems

                           • Structured Occurrence Nets provide these extensions as new
                             relationships amongst multiple ONs [2]
   IPAW 2012 - P.Missier




                           [2] Koutny, M., and B Randell. “Structured Occurrence Nets: A Formalism for Aiding System Failure Prevention and
                               Analysis Techniques.” Fundamenta Informaticae 97 (2009).
    5

Thursday, June 21, 2012
From ON to Structured ON
                             ON is an adequate starting point, but extensions are needed to
                             represent the activity of complex systems

                           • Structured Occurrence Nets provide these extensions as new
                             relationships amongst multiple ONs [2]
   IPAW 2012 - P.Missier




                           [2] Koutny, M., and B Randell. “Structured Occurrence Nets: A Formalism for Aiding System Failure Prevention and
                               Analysis Techniques.” Fundamenta Informaticae 97 (2009).
    5

Thursday, June 21, 2012
Provenance modelling patterns using SON

                               Main goal of this work:
                                       to explore the use of Structured Occurrence Nets
                                                as a formal model of provenance


                           • Data is viewed as an evolving system

                           • Agents are also evolving systems, thus their evolution is also captured

                           • Uniform representation of Agents / Activity / Data interplay

                           • Representation of multi-layered Provenance
                              – eg:the provenance of a workflow run is underpinned by system-level provenance
                              – SONs are suitable for modelling at multiple levels of abstraction
   IPAW 2012 - P.Missier




                             Expected by-product of the investigation:
                              suggest enhancements to the current SON model
    6

Thursday, June 21, 2012
Talk outline
                           • C-SON: ON + communication relation
                             – a provenance pattern to represent workflow patterns

                           • T-SON: C-SON + temporal abstraction
                           • (more abstraction relations omitted in the talk)
                           • Modelling multi-layered provenance
                           • Agents as evolving systems
                             – provenance of agents
   IPAW 2012 - P.Missier




    7

Thursday, June 21, 2012
Communication SONs
                           • Goal: to capture communication between ONs that otherwise proceed
                             concurrently
                           • by introducing a Communication relation amongst activities in two ONs
                              – induces a partial order on the states and conditions of the two nets
                              – must result in an acyclic net



                                                                   e1 cannot have happened after f1

                                                                   e2, f2 must have happened simultaneously
   IPAW 2012 - P.Missier




    8

Thursday, June 21, 2012
Communication SONs
                           • Goal: to capture communication between ONs that otherwise proceed
                             concurrently
                           • by introducing a Communication relation amongst activities in two ONs
                              – induces a partial order on the states and conditions of the two nets
                              – must result in an acyclic net



                                                                   e1 cannot have happened after f1

                                                                   e2, f2 must have happened simultaneously
   IPAW 2012 - P.Missier




    8

Thursday, June 21, 2012
C-SON at work - a first provenance pattern




                           • Each ON models one variable as a system.
                           • The composed activity “A:=A+1; A:=A+B; B:=A+B” is represented as a SON
                           composed of a pair of communicating ONs
   IPAW 2012 - P.Missier




                           • The resulting SON captures a (partial) order of possible reads and writes
                           leading to the final result

    9

Thursday, June 21, 2012
Workflow execution traces using C-SON

                                                                                                                                           X= 10

                                                                               Y:= f(X)
                                                             (a)                                                    (b)
                                                          Program                                                Execution
                                                        specification                                            instance                   f()


                                                                            <X,Z> := g(X,Y)                                                   Y = "200"

                                                                                                                                            g()


                                                                                                                                  X = 20          Z = "20"



                                                       10               r                10         r       10                               w               20

                                                   X


                                   (c)
                           SON representation of
                             Execution trace                                     f                                            g

                                                   Program
                                                   execution
   IPAW 2012 - P.Missier




                                                                                                                                                             w    20
                                                                                     w        200       r               200
                                                                                                                                                                       Z
                                                                                                                              Y



  10

Thursday, June 21, 2012
Temporal SON (T-SON)
                           • Goal: to replace part of an ON with new “atomic” actions
                             – a form of temporal abstraction
                             – these new actions only appear atomic at one level of abstraction
   IPAW 2012 - P.Missier




   11

Thursday, June 21, 2012
T-SON in action: multi-layered provenance
                           • Provenance of computed data naturally comes in layers:
                             – program execution (incl. workflow controller)
                                • system-level primitives, network protocols


                                           10    r                   10


                                                     t                         t                                   10               r       10
                                     t
                                                                                                               X

                             10          fetch                   send               10


                             X
                                                                                                                                        f

                                                                                                                        Program
                                                                                                                        execution
                                                                              get                compute




                                                                          t                  t             t


                                                                                         f
   IPAW 2012 - P.Missier




                                                         Program
                                                         execution




                                  Temporal abstraction used to hide/reveal lower-level details of both
  12                              systems and their interaction (i.e., reading from storage)
Thursday, June 21, 2012
T-SON: multi-layered provenance
                           Additional unfolding:
                           - assume f is a service call
                           - reveal details of communication with an underlying service implementation



                                                         10               r       10

                                                     X




                                                                              f

                                                              Program
                                                              execution




                                                          Service
                                                         execution
   IPAW 2012 - P.Missier




  13

Thursday, June 21, 2012
T-SON: multi-layered provenance



                                      10      r             10


                                                  t                t
                                t

                                                                                                     msg                 msg
                                                                                                   receive               send
                           10       fetch               send           10


                                                                                    Service
                           X
                                                                                   execution




                                                                            msg                              msg
                                                             get                                                    receive
                                                                            prep                             send




                                                        t                                 t
                                                                                                                         t

                                                                                               f
   IPAW 2012 - P.Missier




                                            Program
                                            execution




  14

Thursday, June 21, 2012
Agents as evolving systems
                           Alice and Bob collaborate on document editing




                            An agent performs activities that account for changes in the state of the data
                           • F is data (a file)
                           • Data and agents are modelled as systems
                             • Each system is a SON
                           • Communication abstraction used to synchronize events across different
   IPAW 2012 - P.Missier




                           systems
                             • Bob.draft → F.write, F.read → Alice.”read draft” etc.
                           • For example, Bob in the state b2 is unaware of Alice’s feedback
                           • Bob in state b3 has read Alice’s feedback
  15

Thursday, June 21, 2012
Agents as evolving systems
                           • Tracking the state of agents is important
                              – we know that it is the “version” of Bob that is aware of Alice’s comment that is
                                responsible for the latest edit




                             Below: Bob’s edits to the document do not take account of Alice’s comments

                                            ptd   draft        b2                                          edit   b4



                                     Bob




                                                   w      f1          r      f1      w       f2   r   f2    w     f3
   IPAW 2012 - P.Missier




                                     F




                                                                    read            leave
                                                          a1                a2
                                                                    draft         comments

  16                                Alice


Thursday, June 21, 2012
Summary
                           • SONs extend well-known Occurrence Nets

                           • Simple graphical notation with formal grounding
                             – amenable to various types of analysis

                           • Appealing for capturing system evolution and inter-system
                             communication

                           • Accommodates multiple levels of abstraction

                           • Evolution traces of Data, Agents and Activities provenance queried
                             seamlessly

                           • Some software tool support (in progress), more work needed here

                           • Feedback for customizations of the SON model itself!
   IPAW 2012 - P.Missier




  17

Thursday, June 21, 2012
Selected References
                                •    V. Khomenko, M. Koutny, A. Yakovlev: Logic Synthesis for Asynchronous
                                     Circuits Based on Petri Net Unfoldings and Incremental SAT. Fundamenta
                                     Informaticae 70, 2006.
                                       •    http://www.cs.ncl.ac.uk/publications/inproceedings/papers/749.pdf
                                •    M. Koutny, B. Randell: Structured Occurrence Nets: A Formalism for
                                     Aiding System Failure Prevention and Analysis Techniques. Fundamenta
                                     Informaticae 97, 2009.
                                       •    http://www.cs.ncl.ac.uk/publications/trs/papers/1162.pdf
                                •    B. Li, M. Koutny: Verification and Simulation Tool for Communication Structured
                                     Occurrence Nets. CS-TR, Newcastle University, (to appear).
                                •    P.M. Merlin and B. Randell: State Restoration in Distributed Systems. Proc.
                                     FTCS-8, 1978.
                                       •    http://www.cs.ncl.ac.uk/publications/inproceedings/papers/347.pdf
                                •    I. Poliakov, V. Khomenko, A. Yakovlev: Workcraft - A Framework for
                                     Interpreted Graph Models. LNCS 5606, 2009.
                                •    B. Randell, M. Koutny: Failures: Their Definition, Modelling and Analysis. LNCS
                                     4711, 2007.
                                       •    http://www.cs.ncl.ac.uk/publications/trs/papers/994.pdf
   IPAW 2012 - P.Missier




                                •    B. Randell, M. Koutny: Structured Occurrence Nets: Incomplete,
                                     Contradictory and Uncertain Failure Evidence. CS-TR 1170, Newcastle
                                     University, 2009.
                                       •    http://www.cs.ncl.ac.uk/publications/trs/papers/1170.pdf

  18                       Martinique, Jan. 2012
                                                                                                                       24

Thursday, June 21, 2012
IPAW 2012 - P.Missier   EXTRA material




  19

Thursday, June 21, 2012
SON provenance and PROV

                                                    read
                                 b2                                  b3             edit           b4
                                                  feedback

                           Bob                                                                               (a)



                                 f2               r          f2      r         f2          w            f3

                           F




                                                         used                    wasGeneratedBy
                                             f2                          edit      wa                 f3     (b)
                                                                                      sG
                                                                                        en
                                                                                           era
                                      used                         wasAssociatedWith           ted
                                                                                                   By
                                        read           wasGeneratedBy
                                      feedback                         Bob_b3 wasDerivedFrom Bob_b4
   IPAW 2012 - P.Missier




                                 wasAssociatedWith
                                                        wasDerivedFrom

                                        Bob_b2
  20

Thursday, June 21, 2012
T-SON and finite-duration activities

                           In Occurrence Nets, events (activities) are instantaneous.
                           Using temporal abstraction one can add a time duration to activities


                                                                                        [s,e]
                                               Activity f                           f
                                          with duration [s,e]


                                                        t         t                t            t   t       t



                                                            s                                           e



                                                        c              duration interval of f                   c



                                               t1                 t2                                                tn


                                        Time
   IPAW 2012 - P.Missier




  21

Thursday, June 21, 2012
SON -- behavioural abstraction
                           • Insight: ‘system’ and ‘state’ are not separate concepts
                           • The same graph node may be used to represent either a state or a
                           system, at different levels of abstraction




                           Key idea of Structured Occurrence Nets:
                           • multiple ONs, each portraying a system/state at some level of abstraction
   IPAW 2012 - P.Missier




                           • associations amongst the ONs are then established by means of new
                           relation types

  22

Thursday, June 21, 2012
B-abstraction in use
                           – An application of behavioural abstraction -- state/system duality
                           – Bob’s state b1 expands into a system’s activities (b1)
                            – the expansion provides additional insight into Bob’s “knowledge level” /
                              “preparedness” prior to drafting the document
   IPAW 2012 - P.Missier




  23                                                   Bob’s state as a system
Thursday, June 21, 2012
Finding good abstractions - cuts in ON and SON

                           (a)


                                                       (d)




                           (b)




                                                       (e)
   IPAW 2012 - P.Missier




                           (c)

  24

Thursday, June 21, 2012

More Related Content

PDF
Key Issues in eDiscovery
PPT
Paper presentations: UK e-science AHM meeting, 2005
PPT
Paper talk (presented by Prof. Ludaescher), WORKS workshop, 2010
PDF
Ipaw12 datalog paper talk
KEY
SWPM12 report on the dagstuhl seminar on Semantic Data Management
PDF
Invited talk at the GeoClouds Workshop, Indianapolis, 2009
PDF
Invited talk @ Cardiff University, 2008: Approximate entity reconciliation fo...
PDF
Paper talk @ EDBT'10: Fine-grained and efficient lineage querying of collecti...
Key Issues in eDiscovery
Paper presentations: UK e-science AHM meeting, 2005
Paper talk (presented by Prof. Ludaescher), WORKS workshop, 2010
Ipaw12 datalog paper talk
SWPM12 report on the dagstuhl seminar on Semantic Data Management
Invited talk at the GeoClouds Workshop, Indianapolis, 2009
Invited talk @ Cardiff University, 2008: Approximate entity reconciliation fo...
Paper talk @ EDBT'10: Fine-grained and efficient lineage querying of collecti...

Viewers also liked (10)

PPTX
C4Bio paper talk
PDF
Invited talk @ DCC09 workshop
PDF
Session talk @ AGU09
PPTX
PDT: Personal Data from Things, and its provenance
PPTX
ProvAbs: model, policy, and tooling for abstracting PROV graphs
PPTX
Big Data Quality Panel : Diachron Workshop @EDBT
PPTX
Your data won’t stay smart forever: exploring the temporal dimension of (big ...
PPTX
The lifecycle of reproducible science data and what provenance has got to do ...
PPTX
Cloud e-Genome: NGS Workflows on the Cloud Using e-Science Central
PPT
Paper presentation @DILS'07
C4Bio paper talk
Invited talk @ DCC09 workshop
Session talk @ AGU09
PDT: Personal Data from Things, and its provenance
ProvAbs: model, policy, and tooling for abstracting PROV graphs
Big Data Quality Panel : Diachron Workshop @EDBT
Your data won’t stay smart forever: exploring the temporal dimension of (big ...
The lifecycle of reproducible science data and what provenance has got to do ...
Cloud e-Genome: NGS Workflows on the Cloud Using e-Science Central
Paper presentation @DILS'07
Ad

Similar to Structured Occurrence Network for provenance: talk for ipaw12 paper (20)

PDF
Supporting Inter-Organizational Situation Assessment in Crisis Management
PDF
Fois2010 final
PDF
Rule-based Information Extraction from Disease Outbreak Reports
PPT
Mythbusters: Event Stream Processing v. Complex Event Processing
PDF
Object-Centric Processes - from cases to objects and relations… and beyond
PDF
Innovation ecosystems value co-creation
PPTX
Financial Networks and Cartography
PDF
From Representation to Mediation: A New Agenda for Conceptual Modeling Resear...
PDF
eventdemo2016
PPTX
Inter-Organizational Crisis Management Infrastructures for Electrical Power B...
PPTX
Situation based analysis and control for supporting Event-web applications
PDF
The Symbiotic Nature of Provenance and Workflow
PDF
A data driven approach for monitoring network events
PPTX
Scio12 sem web_final
PDF
A Resource Oriented Framework for Context-Aware Enterprise Applications
PDF
Mining Development Repositories to Study the Impact of Collaboration on Softw...
PPTX
Managing Complexity in a World of Surprise David L. Alderson, PhD
PDF
Understanding Collaboration in Fluid Organizations, a Proximity Approach
PDF
Supporting Inter-Organizational Situation Assessment in Crisis Management
Fois2010 final
Rule-based Information Extraction from Disease Outbreak Reports
Mythbusters: Event Stream Processing v. Complex Event Processing
Object-Centric Processes - from cases to objects and relations… and beyond
Innovation ecosystems value co-creation
Financial Networks and Cartography
From Representation to Mediation: A New Agenda for Conceptual Modeling Resear...
eventdemo2016
Inter-Organizational Crisis Management Infrastructures for Electrical Power B...
Situation based analysis and control for supporting Event-web applications
The Symbiotic Nature of Provenance and Workflow
A data driven approach for monitoring network events
Scio12 sem web_final
A Resource Oriented Framework for Context-Aware Enterprise Applications
Mining Development Repositories to Study the Impact of Collaboration on Softw...
Managing Complexity in a World of Surprise David L. Alderson, PhD
Understanding Collaboration in Fluid Organizations, a Proximity Approach
Ad

More from Paolo Missier (20)

PPTX
From XAI to XEE through Influence and Provenance.Controlling model fairness o...
PPTX
Data and end-to-end Explainability (XAI,XEE)
PPTX
A simple Introduction to Explainability in Machine Learning and AI (XAI)
PPTX
A simple Introduction to Algorithmic Fairness
PDF
(Explainable) Data-Centric AI: what are you explaininhg, and to whom?
PDF
Design and Development of a Provenance Capture Platform for Data Science
PDF
Towards explanations for Data-Centric AI using provenance records
PPTX
Interpretable and robust hospital readmission predictions from Electronic Hea...
PPTX
Data-centric AI and the convergence of data and model engineering: opportunit...
PPTX
Realising the potential of Health Data Science: opportunities and challenges ...
PPTX
Provenance Week 2023 talk on DP4DS (Data Provenance for Data Science)
PDF
A Data-centric perspective on Data-driven healthcare: a short overview
PPTX
Capturing and querying fine-grained provenance of preprocessing pipelines in ...
PPTX
Tracking trajectories of multiple long-term conditions using dynamic patient...
PPTX
Delivering on the promise of data-driven healthcare: trade-offs, challenges, ...
PPTX
Digital biomarkers for preventive personalised healthcare
PPTX
Digital biomarkers for preventive personalised healthcare
PPTX
Data Provenance for Data Science
PPTX
Capturing and querying fine-grained provenance of preprocessing pipelines in ...
PPTX
Quo vadis, provenancer?  Cui prodest?  our own trajectory: provenance of data...
From XAI to XEE through Influence and Provenance.Controlling model fairness o...
Data and end-to-end Explainability (XAI,XEE)
A simple Introduction to Explainability in Machine Learning and AI (XAI)
A simple Introduction to Algorithmic Fairness
(Explainable) Data-Centric AI: what are you explaininhg, and to whom?
Design and Development of a Provenance Capture Platform for Data Science
Towards explanations for Data-Centric AI using provenance records
Interpretable and robust hospital readmission predictions from Electronic Hea...
Data-centric AI and the convergence of data and model engineering: opportunit...
Realising the potential of Health Data Science: opportunities and challenges ...
Provenance Week 2023 talk on DP4DS (Data Provenance for Data Science)
A Data-centric perspective on Data-driven healthcare: a short overview
Capturing and querying fine-grained provenance of preprocessing pipelines in ...
Tracking trajectories of multiple long-term conditions using dynamic patient...
Delivering on the promise of data-driven healthcare: trade-offs, challenges, ...
Digital biomarkers for preventive personalised healthcare
Digital biomarkers for preventive personalised healthcare
Data Provenance for Data Science
Capturing and querying fine-grained provenance of preprocessing pipelines in ...
Quo vadis, provenancer?  Cui prodest?  our own trajectory: provenance of data...

Recently uploaded (20)

PDF
Ensemble model-based arrhythmia classification with local interpretable model...
PPTX
SGT Report The Beast Plan and Cyberphysical Systems of Control
PDF
Human Computer Interaction Miterm Lesson
PDF
INTERSPEECH 2025 「Recent Advances and Future Directions in Voice Conversion」
PPTX
AI-driven Assurance Across Your End-to-end Network With ThousandEyes
PDF
Introduction to MCP and A2A Protocols: Enabling Agent Communication
PDF
zbrain.ai-Scope Key Metrics Configuration and Best Practices.pdf
PDF
Transform-Quality-Engineering-with-AI-A-60-Day-Blueprint-for-Digital-Success.pdf
PDF
The-Future-of-Automotive-Quality-is-Here-AI-Driven-Engineering.pdf
PDF
4 layer Arch & Reference Arch of IoT.pdf
PPTX
Microsoft User Copilot Training Slide Deck
PDF
EIS-Webinar-Regulated-Industries-2025-08.pdf
PDF
A symptom-driven medical diagnosis support model based on machine learning te...
PPTX
Build automations faster and more reliably with UiPath ScreenPlay
PDF
A hybrid framework for wild animal classification using fine-tuned DenseNet12...
PDF
Accessing-Finance-in-Jordan-MENA 2024 2025.pdf
PDF
Transform-Your-Factory-with-AI-Driven-Quality-Engineering.pdf
PDF
MENA-ECEONOMIC-CONTEXT-VC MENA-ECEONOMIC
PDF
Early detection and classification of bone marrow changes in lumbar vertebrae...
PPTX
MuleSoft-Compete-Deck for midddleware integrations
Ensemble model-based arrhythmia classification with local interpretable model...
SGT Report The Beast Plan and Cyberphysical Systems of Control
Human Computer Interaction Miterm Lesson
INTERSPEECH 2025 「Recent Advances and Future Directions in Voice Conversion」
AI-driven Assurance Across Your End-to-end Network With ThousandEyes
Introduction to MCP and A2A Protocols: Enabling Agent Communication
zbrain.ai-Scope Key Metrics Configuration and Best Practices.pdf
Transform-Quality-Engineering-with-AI-A-60-Day-Blueprint-for-Digital-Success.pdf
The-Future-of-Automotive-Quality-is-Here-AI-Driven-Engineering.pdf
4 layer Arch & Reference Arch of IoT.pdf
Microsoft User Copilot Training Slide Deck
EIS-Webinar-Regulated-Industries-2025-08.pdf
A symptom-driven medical diagnosis support model based on machine learning te...
Build automations faster and more reliably with UiPath ScreenPlay
A hybrid framework for wild animal classification using fine-tuned DenseNet12...
Accessing-Finance-in-Jordan-MENA 2024 2025.pdf
Transform-Your-Factory-with-AI-Driven-Quality-Engineering.pdf
MENA-ECEONOMIC-CONTEXT-VC MENA-ECEONOMIC
Early detection and classification of bone marrow changes in lumbar vertebrae...
MuleSoft-Compete-Deck for midddleware integrations

Structured Occurrence Network for provenance: talk for ipaw12 paper

  • 1. Modelling provenance using Structured Occurrence Networks Paolo Missier, Brian Randell, Maciej Koutny Newcastle University, UK IPAW’12 Santa Barbara, CA, June 2012 Thursday, June 21, 2012
  • 2. Petri Nets and Occurrence Nets Historically, Occurrence Nets have been used to define process semantics of Petri Nets [2] IPAW 2012 - P.Missier [2] Best, Eike, and Raymond Devillers. “Sequential and concurrent behaviour in Petri net theory.” Theoretical Computer Science 55, no. 1 (1987): 87-136. http://www.sciencedirect.com/science/article/pii/0304397587900909. 2 Thursday, June 21, 2012
  • 3. Occurrence Nets ON is about capturing causal relationships between events and conditions ON = (C, E, F ) with nodes C E and a flow relation F ⊆ (C ×E)∪(E × C) satisfying the following: (i) for every condition c there is at most one event e such that (e,c) ∈ F, and at most one event f such that (c,f) ∈ F; (ii) for every event e there is at least one condition c such that (c, e) ∈ F , and at least one condition d such that (e, d) ∈ F ; (iii) ON forms an acyclic graph and F + is a partial order relation (the relation PrecON = (F ◦ F)|C×C is acyclic) IPAW 2012 - P.Missier 3 Thursday, June 21, 2012
  • 4. Provenance and Occurrence Nets • "Provenance is defined as a record that • An Occurrence Net is an abstract record of describes the people, institutions, a single execution of some computing entities, and activities involved in system producing, influencing, or delivering a • Only information about causality and piece of data or a thing." [PROV] concurrency between events and visited local states is represented IPAW 2012 - P.Missier 4 Thursday, June 21, 2012
  • 5. Provenance and Occurrence Nets • "Provenance is defined as a record that • An Occurrence Net is an abstract record of describes the people, institutions, a single execution of some computing entities, and activities involved in system producing, influencing, or delivering a • Only information about causality and piece of data or a thing." [PROV] concurrency between events and visited local states is represented Data: The evolution of variable A as a ON: IPAW 2012 - P.Missier 4 Thursday, June 21, 2012
  • 6. Provenance and Occurrence Nets • "Provenance is defined as a record that • An Occurrence Net is an abstract record of describes the people, institutions, a single execution of some computing entities, and activities involved in system producing, influencing, or delivering a • Only information about causality and piece of data or a thing." [PROV] concurrency between events and visited local states is represented Data: The evolution of variable A as a ON: Agents: The evolution of Bob, the document editor: read drafted ready performed verified ptd paper internal to exp. results draft p1 memo IPAW 2012 - P.Missier read paper p2 4 Thursday, June 21, 2012
  • 7. From ON to Structured ON ON is an adequate starting point, but extensions are needed to represent the activity of complex systems • Structured Occurrence Nets provide these extensions as new relationships amongst multiple ONs [2] IPAW 2012 - P.Missier [2] Koutny, M., and B Randell. “Structured Occurrence Nets: A Formalism for Aiding System Failure Prevention and Analysis Techniques.” Fundamenta Informaticae 97 (2009). 5 Thursday, June 21, 2012
  • 8. From ON to Structured ON ON is an adequate starting point, but extensions are needed to represent the activity of complex systems • Structured Occurrence Nets provide these extensions as new relationships amongst multiple ONs [2] IPAW 2012 - P.Missier [2] Koutny, M., and B Randell. “Structured Occurrence Nets: A Formalism for Aiding System Failure Prevention and Analysis Techniques.” Fundamenta Informaticae 97 (2009). 5 Thursday, June 21, 2012
  • 9. Provenance modelling patterns using SON Main goal of this work: to explore the use of Structured Occurrence Nets as a formal model of provenance • Data is viewed as an evolving system • Agents are also evolving systems, thus their evolution is also captured • Uniform representation of Agents / Activity / Data interplay • Representation of multi-layered Provenance – eg:the provenance of a workflow run is underpinned by system-level provenance – SONs are suitable for modelling at multiple levels of abstraction IPAW 2012 - P.Missier Expected by-product of the investigation: suggest enhancements to the current SON model 6 Thursday, June 21, 2012
  • 10. Talk outline • C-SON: ON + communication relation – a provenance pattern to represent workflow patterns • T-SON: C-SON + temporal abstraction • (more abstraction relations omitted in the talk) • Modelling multi-layered provenance • Agents as evolving systems – provenance of agents IPAW 2012 - P.Missier 7 Thursday, June 21, 2012
  • 11. Communication SONs • Goal: to capture communication between ONs that otherwise proceed concurrently • by introducing a Communication relation amongst activities in two ONs – induces a partial order on the states and conditions of the two nets – must result in an acyclic net e1 cannot have happened after f1 e2, f2 must have happened simultaneously IPAW 2012 - P.Missier 8 Thursday, June 21, 2012
  • 12. Communication SONs • Goal: to capture communication between ONs that otherwise proceed concurrently • by introducing a Communication relation amongst activities in two ONs – induces a partial order on the states and conditions of the two nets – must result in an acyclic net e1 cannot have happened after f1 e2, f2 must have happened simultaneously IPAW 2012 - P.Missier 8 Thursday, June 21, 2012
  • 13. C-SON at work - a first provenance pattern • Each ON models one variable as a system. • The composed activity “A:=A+1; A:=A+B; B:=A+B” is represented as a SON composed of a pair of communicating ONs IPAW 2012 - P.Missier • The resulting SON captures a (partial) order of possible reads and writes leading to the final result 9 Thursday, June 21, 2012
  • 14. Workflow execution traces using C-SON X= 10 Y:= f(X) (a) (b) Program Execution specification instance f() <X,Z> := g(X,Y) Y = "200" g() X = 20 Z = "20" 10 r 10 r 10 w 20 X (c) SON representation of Execution trace f g Program execution IPAW 2012 - P.Missier w 20 w 200 r 200 Z Y 10 Thursday, June 21, 2012
  • 15. Temporal SON (T-SON) • Goal: to replace part of an ON with new “atomic” actions – a form of temporal abstraction – these new actions only appear atomic at one level of abstraction IPAW 2012 - P.Missier 11 Thursday, June 21, 2012
  • 16. T-SON in action: multi-layered provenance • Provenance of computed data naturally comes in layers: – program execution (incl. workflow controller) • system-level primitives, network protocols 10 r 10 t t 10 r 10 t X 10 fetch send 10 X f Program execution get compute t t t f IPAW 2012 - P.Missier Program execution Temporal abstraction used to hide/reveal lower-level details of both 12 systems and their interaction (i.e., reading from storage) Thursday, June 21, 2012
  • 17. T-SON: multi-layered provenance Additional unfolding: - assume f is a service call - reveal details of communication with an underlying service implementation 10 r 10 X f Program execution Service execution IPAW 2012 - P.Missier 13 Thursday, June 21, 2012
  • 18. T-SON: multi-layered provenance 10 r 10 t t t msg msg receive send 10 fetch send 10 Service X execution msg msg get receive prep send t t t f IPAW 2012 - P.Missier Program execution 14 Thursday, June 21, 2012
  • 19. Agents as evolving systems Alice and Bob collaborate on document editing An agent performs activities that account for changes in the state of the data • F is data (a file) • Data and agents are modelled as systems • Each system is a SON • Communication abstraction used to synchronize events across different IPAW 2012 - P.Missier systems • Bob.draft → F.write, F.read → Alice.”read draft” etc. • For example, Bob in the state b2 is unaware of Alice’s feedback • Bob in state b3 has read Alice’s feedback 15 Thursday, June 21, 2012
  • 20. Agents as evolving systems • Tracking the state of agents is important – we know that it is the “version” of Bob that is aware of Alice’s comment that is responsible for the latest edit Below: Bob’s edits to the document do not take account of Alice’s comments ptd draft b2 edit b4 Bob w f1 r f1 w f2 r f2 w f3 IPAW 2012 - P.Missier F read leave a1 a2 draft comments 16 Alice Thursday, June 21, 2012
  • 21. Summary • SONs extend well-known Occurrence Nets • Simple graphical notation with formal grounding – amenable to various types of analysis • Appealing for capturing system evolution and inter-system communication • Accommodates multiple levels of abstraction • Evolution traces of Data, Agents and Activities provenance queried seamlessly • Some software tool support (in progress), more work needed here • Feedback for customizations of the SON model itself! IPAW 2012 - P.Missier 17 Thursday, June 21, 2012
  • 22. Selected References •  V. Khomenko, M. Koutny, A. Yakovlev: Logic Synthesis for Asynchronous Circuits Based on Petri Net Unfoldings and Incremental SAT. Fundamenta Informaticae 70, 2006. •  http://www.cs.ncl.ac.uk/publications/inproceedings/papers/749.pdf •  M. Koutny, B. Randell: Structured Occurrence Nets: A Formalism for Aiding System Failure Prevention and Analysis Techniques. Fundamenta Informaticae 97, 2009. •  http://www.cs.ncl.ac.uk/publications/trs/papers/1162.pdf •  B. Li, M. Koutny: Verification and Simulation Tool for Communication Structured Occurrence Nets. CS-TR, Newcastle University, (to appear). •  P.M. Merlin and B. Randell: State Restoration in Distributed Systems. Proc. FTCS-8, 1978. •  http://www.cs.ncl.ac.uk/publications/inproceedings/papers/347.pdf •  I. Poliakov, V. Khomenko, A. Yakovlev: Workcraft - A Framework for Interpreted Graph Models. LNCS 5606, 2009. •  B. Randell, M. Koutny: Failures: Their Definition, Modelling and Analysis. LNCS 4711, 2007. •  http://www.cs.ncl.ac.uk/publications/trs/papers/994.pdf IPAW 2012 - P.Missier •  B. Randell, M. Koutny: Structured Occurrence Nets: Incomplete, Contradictory and Uncertain Failure Evidence. CS-TR 1170, Newcastle University, 2009. •  http://www.cs.ncl.ac.uk/publications/trs/papers/1170.pdf 18 Martinique, Jan. 2012 24 Thursday, June 21, 2012
  • 23. IPAW 2012 - P.Missier EXTRA material 19 Thursday, June 21, 2012
  • 24. SON provenance and PROV read b2 b3 edit b4 feedback Bob (a) f2 r f2 r f2 w f3 F used wasGeneratedBy f2 edit wa f3 (b) sG en era used wasAssociatedWith ted By read wasGeneratedBy feedback Bob_b3 wasDerivedFrom Bob_b4 IPAW 2012 - P.Missier wasAssociatedWith wasDerivedFrom Bob_b2 20 Thursday, June 21, 2012
  • 25. T-SON and finite-duration activities In Occurrence Nets, events (activities) are instantaneous. Using temporal abstraction one can add a time duration to activities [s,e] Activity f f with duration [s,e] t t t t t t s e c duration interval of f c t1 t2 tn Time IPAW 2012 - P.Missier 21 Thursday, June 21, 2012
  • 26. SON -- behavioural abstraction • Insight: ‘system’ and ‘state’ are not separate concepts • The same graph node may be used to represent either a state or a system, at different levels of abstraction Key idea of Structured Occurrence Nets: • multiple ONs, each portraying a system/state at some level of abstraction IPAW 2012 - P.Missier • associations amongst the ONs are then established by means of new relation types 22 Thursday, June 21, 2012
  • 27. B-abstraction in use – An application of behavioural abstraction -- state/system duality – Bob’s state b1 expands into a system’s activities (b1) – the expansion provides additional insight into Bob’s “knowledge level” / “preparedness” prior to drafting the document IPAW 2012 - P.Missier 23 Bob’s state as a system Thursday, June 21, 2012
  • 28. Finding good abstractions - cuts in ON and SON (a) (d) (b) (e) IPAW 2012 - P.Missier (c) 24 Thursday, June 21, 2012