Designing Reliable Distributed Systems
A Formal Methods Approach Based on Executable Modeling in Maude
Undergraduate Topics in Computer Science
Series editor
Ian Mackie
Advisory Board
Samson Abramsky, University of Oxford, Oxford, UK
Karin Breitman, Pontifical Catholic University of Rio de Janeiro, Rio de Janeiro,
Brazil
Chris Hankin, Imperial College London, London, UK
Dexter C. Kozen, Cornell University, Ithaca, USA
Andrew Pitts, University of Cambridge, Cambridge, UK
Hanne Riis Nielson, Technical University of Denmark, Kongens Lyngby, Denmark
Steven S. Skiena, Stony Brook University, Stony Brook, USA
Iain Stewart, University of Durham, Durham, UK
Undergraduate Topics in Computer Science (UTiCS) delivers high-quality
instructional content for undergraduates studying in all areas of computing and
information science. From core foundational and theoretical material to final-year
topics and applications, UTiCS books take a fresh, concise, and modern approach
and are ideal for self-study or for a one- or two-semester course. The texts are all
authored by established experts in their fields, reviewed by an international advisory
board, and contain numerous examples and problems. Many include fully worked
solutions.
Designing Reliable Distributed Systems
A Formal Methods Approach Based on Executable Modeling in Maude
Peter Csaba Ölveczky
University of Oslo
Oslo
Norway
Foreword
De facto, both individually and socially, all of us rely more and more on
software-mediated systems and devices. However, as software disasters and
successful cyber-attacks keep piling up, the crucial importance of software quality and
reliability, and the sobering realization of how vulnerable our systems are, loom
larger and larger. In areas such as avionics, railway systems, microprocessor design,
and security protocols, the obvious consequence, namely, the need for mathematical
methods providing high assurance beyond the insufficient assurance made
possible by testing alone, is well understood, so that formal methods are applied in
practice in such areas. But this is far from being the case in general. In particular,
since most systems nowadays are distributed systems, which are very hard to test
and can have very subtle bugs, the necessary but insufficient role of testing is
painfully felt; but the obvious need for stronger verification methods beyond testing
is still not fully understood or appreciated in practice.
An important question is why this highly problematic state of affairs remains
largely unresolved. It is certainly true that, although big advances in both scalability
and automation of formal methods have been made and very important successful
formal verification efforts have been carried out, scalability is still an important
challenge. However, in my view two closely related problems, quite orthogonal to
scalability, present a serious obstacle, namely: (1) verifying designs, as opposed to
verifying code, is hindered in practice by the lack of suitable mathematical models
for system designs; and (2) there is considerable ignorance about the mathematical
modeling nature of programming made possible by declarative languages. The
importance of solving problem (1) is one of effectiveness: design errors can be
orders of magnitude more expensive than coding errors and in fact account for most
of the critical errors in system development. This does not mean that verifying code
is unimportant; however, correct-by-construction code generation from verified
designs is a promising alternative to standard code verification and can be a
considerably more cost-effective way of achieving code correctness. Problem (2) is
quite serious and is self-inflicted. In many prestigious universities worldwide most
Since equational logic is a sublogic of rewriting logic, which is a natural and simple
logic in which to specify distributed systems, the book then moves in a natural and
seamless way from its first part focused on deterministic systems into its second and
main part, focused on the executable specification of distributed systems as rewrite
theories in Maude. Properties of distributed systems and their specification and
verification are then explained. The same gentle and gradual approach is followed
in this second part. This is achieved so well and with such a wealth of examples,
that the book can also be used as a first introduction to distributed systems, their
modeling, and their verification at the undergraduate level. The same gradual
method of approach is also followed for the specification and verification of
properties. First, the simplest of such properties, namely, invariants, are introduced,
and explicit-state reachability analysis supported by Maude’s search command is
used to automatically verify such invariants, or to do so up to a given depth bound if
the system is infinite-state. After this, a gentle, yet quite thorough, introduction to
linear-time temporal logic (LTL) and its semantics is given, and many examples are
given showing how Maude’s LTL model checker can be used to automatically
verify LTL properties of a distributed system formally specified as a rewrite
theory in Maude. Finally, broader perspectives are opened up by explaining how
additional topics such as the specification and verification of real-time and of
probabilistic systems can be treated by corresponding extensions of rewriting logic
by means of real-time rewrite theories and probabilistic rewrite theories; and at the
property level by suitable real-time and probabilistic extensions of temporal logic.
Each notion is again illustrated by means of well-chosen examples and exercises.
In summary, this book addresses an important and serious need in undergraduate
CS education and, at the same time, the broader need of training a next generation
of computer scientists who are well acquainted with both distributed systems and
with the mathematical modeling and verification of such systems. Given the present
state of affairs, both in the vulnerability of our systems and the serious gaps in
mathematical modeling abilities in undergraduate CS education, the appearance of
this book could not be more timely. I have been using earlier drafts of this book in a
program verification course at the University of Illinois at Urbana-Champaign and
plan to recommend the present book to my students as reading material for such a
course in the years to come. I am sure that it will be of great help to many other
persons teaching programming languages, formal methods, and distributed systems
at the undergraduate level and, above all, to the students themselves.
Preface
As mentioned above, one main goal of this book is to gently introduce students to a
wide range of concepts in formal methods, including:
• verifying properties about programs and (models of) systems; e.g., proving that a
specification/program terminates for all possible inputs, and using equational
logic to prove semantic properties;
• logics and inference systems; and
• automated model checking techniques to analyze properties for some—but not
all—possible inputs/system configurations.
This book is divided into two parts. The first part deals with specifying the data
types needed to model complex distributed systems. This part introduces classical
algebraic specification and term rewriting theory, including reasoning about
termination, confluence, and inductive equational properties.
The second part deals with formally modeling and analyzing distributed systems
in rewriting logic using Maude. This part introduces rewriting logic and
object-oriented modeling of distributed systems. It also introduces temporal logic to
specify requirements that a system should satisfy. Such models are analyzed using
Maude simulations, reachability analysis, and temporal logic model checking,
thereby also giving the students a hands-on experience of the state-space explosion
problem for distributed systems. As mentioned above, the second main goal of this
book is to introduce the students to the problems of designing and analyzing
distributed systems. Instead of giving theoretical explanations of these issues, the
book tries to convey intuition about distributed systems and their design challenges
through a range of examples/case studies in different domains, including: the dining
philosophers problem, transport protocols like the alternating bit protocol and the
sliding window protocol, classic distributed algorithms such as the two-phase
commit protocol for distributed database systems, distributed mutual exclusion
and leader election algorithms, and the NSPK cryptographic protocol. Finally, the
book briefly introduces two extensions of standard distributed systems: real-time
systems and probabilistic systems.
The book is based on a course that has been given at the University of Oslo for
more than 10 years, which implies that the book contains a wealth of exercises, both
smaller ones and larger ones suitable for course projects, etc. Most of the executable
code presented in this book, as well as other supplementary material, can be found
at http://peterol.at.ifi.uio.no/BOOK.
I would like to thank José Meseguer, Dorel Lucanu, Narciso Martí-Oliet, and
Ralf Sasse for many insightful and very helpful comments on earlier versions of this
book, Indranil Gupta for discussions on distributed systems, Jon Grov for providing
the figures used in this book, Si Liu for performing the statistical model checking
experiments, Lars Kristiansen for discussions on logic, and Shiji Bijo, Antonio
Gonzalez Burgueño, Benjamin Oliver, and Olaf Owe for pointing out mistakes in
those earlier drafts. I also thank Hanne Riis Nielson and Ian Mackie for encouraging
me to publish this book with Springer, and Simon Rees and Wayne Wheeler for
their patience in waiting for it to be finished.
Contents
1 Introduction
1.1 Modeling
1.1.1 Models of Distributed Systems
1.1.2 From Model to System
1.2 The Maude Modeling Language and Analysis Tool
1.3 Why Maude?
1.4 Contents of the Book
1.4.1 Part I: Algebraic Specification and Term Rewriting
1.4.2 Part II: Dynamic Systems
1.4.3 Appendix: Mathematical Background
2.5.1 Examples of Order-Sorted Equational Specifications
2.6 Membership Equational Logic Specifications
2.7 Built-in Data Types
2.7.1 Booleans
2.7.2 Natural Numbers
2.7.3 Integers
2.7.4 Rational Numbers
2.7.5 Floating-Point Numbers
2.7.6 Strings
2.7.7 Random Numbers
2.8 Associativity and Commutativity: Lists and Multisets
2.8.1 Commutativity, Associativity, and Identity
2.8.2 Associativity and Identity: Lists
2.8.3 Associativity, Commutativity, and Identity: Multisets and Sets
2.9 Examples
2.9.1 Two Sorting Algorithms
2.9.2 Some NP-Complete Problems
2.10 * Some Other Maude Features
2.10.1 Parameterized Modules
2.10.2 Telling Maude how to Evaluate an Expression
2.10.3 Other Features
3 Operational Semantics of Equational Specifications
3.1 The Reduction Relation
3.1.1 Basic Definitions
3.1.2 The Reduction Relation
3.1.3 Some Derived Relations
3.2 Operational Properties
3.3 Conditional Equations and Matching with assoc/comm
3.3.1 Conditional Equations
3.3.2 * A-, C-, and AC-matching is NP-hard
4 Termination
4.1 Undecidability of Termination
4.2 Nontermination
4.3 Proving Termination Using “Weight Functions”
4.4 Simplification Orders
4.4.1 The Lexicographic Path Order
4.4.2 The Multiset Path Order and Other Variations of lpo
4.4.3 Comparing Weight Functions and Simplification Orders
5 Confluence
5.1 Unification
5.2 Checking Local Confluence
6 Equational Logic
6.1 Equational Logic
6.1.1 * Knuth-Bendix Completion
6.2 Inductive Theorems
6.2.1 Proving Inductive Theorems for Nat
6.2.2 Inductive Theorems for Other Data Types
7 Models of Equational Specifications
7.1 Many-Sorted Σ-Algebras
7.1.1 Homomorphisms and Isomorphisms
7.1.2 Term Algebras
7.2 (Σ, E)-Models: (Σ, E)-Algebras
7.2.1 Quotient Algebras
7.2.2 The Algebra T_{Σ,E}
7.2.3 The Normal Form Algebra
7.3 Soundness and Completeness of Equational Logic
7.4 Intended Models: Initial Algebras
7.5 Empty Sorts and Many-Sorted Equational Logic
1 Introduction
Our society increasingly depends on large and complex computer systems. Our cars,
airplanes, banks, power plants, social interactions, shopping activities, etc., are all
controlled and/or mediated to a large extent by computer systems. Most computer
systems these days are distributed systems, consisting of multiple computers, or
processors, of various kinds that collaborate to achieve some goal.
Unfortunately, distributed systems are quite complex and significantly harder to
get right than single-threaded sequential programs, because:
• any component in the system may perform an action at any time,
• it may be hard to know whether, or when, a message will be delivered, and
• it may be hard to predict the behavior of other components in the system.
Example 1.1. A prerequisite for banking is mutual authentication: (i) you know that
you are communicating with your bank and not with some impostor, and (ii) the bank
knows that the person pretending to be you actually is you. In a physical bank, you
know that you are in a bank by the imposing building, and the bank clerk asks you to
show some photo identification to be sure that you are who you claim to be. In online
banking and commerce, authentication protocols (“programs for distributed systems”)
are used to ensure mutual authentication. One of the most well-known authentication
protocols is the Needham-Schroeder public key protocol (NSPK) [88] that
was published in 1978 by leading experts in the field. It is typically written as follows:
Message 1. $A \to B : A.B.\{N_a.A\}_{PK(B)}$
Message 2. $B \to A : B.A.\{N_a.N_b\}_{PK(A)}$
Message 3. $A \to B : A.B.\{N_b\}_{PK(B)}$
Chapter 14 explains what all this means; essentially, A and B are the agents that
want to establish mutual authentication (e.g., you and the bank), and the protocol
consists of sending three encrypted messages: first one message ($A.B.\{N_a.A\}_{PK(B)}$)
is sent from A to B; then B responds by sending a message ($B.A.\{N_a.N_b\}_{PK(A)}$) back
This example shows that even a three-line distributed “program” can be really hard
to get right. However, our lives and economy depend crucially on the correctness
of considerably more complex distributed systems. How can we develop correct
distributed systems and ensure that they indeed are correct?
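To make the three messages of Example 1.1 concrete, the following is a minimal Python sketch of one honest run of the protocol. It is only an illustration under stated assumptions: encryption is modeled symbolically (a ciphertext is a record that only the named key owner may open), and the names `Enc`, `encrypt`, `decrypt`, and `honest_run` are hypothetical helpers invented for this sketch, not part of the book's Maude model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Enc:
    """Symbolic ciphertext {payload}_PK(key_owner) (an assumption of this sketch)."""
    key_owner: str
    payload: tuple


def encrypt(receiver: str, *payload) -> Enc:
    # Anyone may encrypt with the receiver's public key.
    return Enc(receiver, payload)


def decrypt(agent: str, msg: Enc) -> tuple:
    # Only the agent holding the matching private key can open the message.
    if msg.key_owner != agent:
        raise PermissionError(f"{agent} cannot decrypt a message for {msg.key_owner}")
    return msg.payload


def honest_run(a: str = "A", b: str = "B", na: str = "Na", nb: str = "Nb"):
    """Return the three messages exchanged when both agents follow the protocol."""
    # Message 1. A -> B : A.B.{Na.A}_PK(B)
    m1 = (a, b, encrypt(b, na, a))
    nonce_seen_by_b, claimed_sender = decrypt(b, m1[2])
    # Message 2. B -> A : B.A.{Na.Nb}_PK(A)
    m2 = (b, claimed_sender, encrypt(claimed_sender, nonce_seen_by_b, nb))
    echoed_na, nonce_seen_by_a = decrypt(a, m2[2])
    assert echoed_na == na  # A checks that its own nonce came back
    # Message 3. A -> B : A.B.{Nb}_PK(B)
    m3 = (a, b, encrypt(b, nonce_seen_by_a))
    (echoed_nb,) = decrypt(b, m3[2])
    assert echoed_nb == nb  # B checks that its own nonce came back
    return [m1, m2, m3]
```

The sketch executes a single behavior in which every agent is honest. It says nothing about what other agents may do with the messages they see, which is exactly the kind of question that requires analyzing all possible behaviors rather than one run.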
1.1 Modeling
Let us consider an analogy. Thousands of years ago, building a hut for yourself
was pretty easy and could be done right away without much elaboration. If the hut
collapsed, you could rebuild it in a few hours. Just like you could start coding the
programs in your introductory programming course without further ado. However,
buildings have become much more complex in the last 1000 years. How are buildings
constructed these days? You typically do not start building a large building with only
a faint idea of what you want. You first build (or draw) a model of your building. A
first model may be quite rough, but can be developed quickly and allows the architect
and the person commissioning the building to get an idea of whether this is what they
want. Once the main design is agreed upon, a more detailed model should be used to
infer properties of the model: will the bridge collapse? can the proposed skyscraper
withstand strong winds, floods, and earthquakes? The point is that:
1. such models are developed reasonably cheaply and quickly before starting to
build the building; and
2. one should be able to use the model and the laws of physics to predict quite
accurately whether the building to be built will satisfy certain desired properties.
It may be hard to compute by hand whether your skyscraper will withstand the
winds/earthquakes/floods in the region. Computers should do that!
When advanced models have been developed and analyzed, impressive modern-
day engineering technology can “easily” construct the building from the models. It
may not be a coincidence that we know Gustave Eiffel, Oscar Niemeyer, and Frank
Gehry, but have absolutely no clue about who actually built the Eiffel Tower, the
Museum of Contemporary Art in Niteroi, and the Guggenheim Museum in Bilbao.
1.1.1 Models of Distributed Systems
In the same way, we need models of distributed systems before implementing them:
you do not want to implement your new avionics system directly on an Airbus A380
and have one plane crash for each mistake in your code, or to deploy some new
e-commerce algorithm before you are really confident that your design is correct.
The model should be reasonably quick to develop and should focus on the “essence”
of the design and should abstract away inessential details. For example, a model of
a distributed algorithm could focus on what happens when a message is successfully
received or is lost in transmission, but can often abstract away details about how a
packet is sent from one computer to another.
A model you can only look at is not very useful. We would like to both simulate
the model and infer properties from it: can the flight control system deadlock? can
your authentication protocol be broken by malicious agents? does the e-commerce
protocol also work well if a crucial server goes down? Just like the architect should
be able to use the laws of physics to predict properties about the building to be built,
so should a system designer be able to analyze her model of a distributed system.
To reason about consequences of a design, its model must have a clear and precise
meaning, and there must be some laws/rules that allow us to infer consequences of
the model. Therefore, the model must be a mathematical object with precise
mathematical rules for how one can infer properties from the design. Such a mathematical
model of a computer system is called a formal model.
Specifications can be divided into system specifications (or models) and requirement
specifications. System models specify the system, which means the computations
performed by the system, whereas a requirement specification specifies the
requirements or properties that a system should satisfy. For example, in the NSPK
protocol, the three lines in Example 1.1 define the system model, which specifies the
computation steps that the participants should perform (namely, sending encrypted
messages, either to start a session or as a response to receiving a message). The
corresponding requirement specification states the requirement(s) that the system
should satisfy: “when an agent A thinks that it has established a connection with an
agent B, then it indeed has a connection with B and not with some other agent.”
The main goal is to prove that all possible behaviors of the system (model) satisfy
the system’s requirement specification. Furthermore, it would be great if computers
could do this analysis, just like the architect wants to use computers to analyze
consequences of her design. This is only possible if both the system model and the
requirement specification are mathematical objects, and there are explicit mathematical
rules that allow us to analyze whether or not a system satisfies its requirements.
The formal system model should preferably be executable; that is, the model can
directly be executed. This would allow for a range of automated computer analyses,
for example by simulating single behaviors of the system being modeled, or by model
checking analyses that analyze many, or all, possible behaviors of the system.
This book focuses on developing and analyzing—by computer and by hand—
executable formal models of distributed computer systems. It also deals with
formalizing requirements of distributed systems using temporal logic.
1.1.2 From Model to System
The ultimate goal is not to have a nice model for its own sake, but to build a correct
system. However, just like modern engineering technology and companies are
very good at constructing even very large buildings from correct models, modern
programmers and programming environments and methodologies are quite good at
implementing systems from correct specifications. There are also commercial code
generation tools that can automatically generate code from high-level models.
Developing correct models is therefore a crucial task in the system development
process. When the task at hand is well understood, the actual implementation is “just”
programming and hardware engineering. In an early example illustrating the importance
of developing correct system models, it turned out that only three of the 197
critical defects identified during integration and testing of the Voyager and Galileo
spacecraft were due to coding errors [74,99]. Most faults arose in requirements
and difficult design problems related to distribution [99]. Furthermore, not only are
defects more likely to be introduced in the early stage of system development; it is
also much cheaper to catch errors early in the development process, since design
errors can be orders of magnitude more expensive to fix than coding errors.
1.2 The Maude Modeling Language and Analysis Tool
This book uses the Maude [21] modeling language to define executable formal models
of distributed systems, and uses the Maude analysis tool to analyze the models. In
Maude, a distributed system is formalized as a theory in rewriting logic [16,80].
Maude and rewriting logic were both developed by José Meseguer and his research
group at the Computer Science Laboratory at SRI International. (Meseguer now
works at the University of Illinois at Urbana-Champaign.)
In rewriting logic, the data types of the system are defined algebraically by equations.
In essence, defining data types amounts to defining functions in a functional
programming style. The dynamic behavior of a distributed system is defined by
rewrite rules, which describe how a part of the state can change in one step. Maude
supports object-oriented programming, including multiple inheritance, and asynchronous
communication through message passing, in a natural way.
The Maude interpreter evaluates an expression in an equational Maude program
by applying the equations “from left to right” until no equation can be applied,
thereby computing the normal form (or “value”) of the expression.
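The idea of applying equations from left to right until none applies can be sketched in Python. This is a hedged illustration, not Maude's actual evaluation strategy: terms are encoded as nested tuples, the two equations define addition on Peano numbers (0 + N = N and s(M) + N = s(M + N)), and the names `ZERO`, `s`, `plus`, `step`, and `normal_form` are hypothetical choices made for this sketch.

```python
ZERO = ("0",)


def s(t):
    """Successor constructor: s(t)."""
    return ("s", t)


def plus(a, b):
    """The defined function symbol: a + b."""
    return ("plus", a, b)


def rewrite_at_root(t):
    """Apply one equation left to right at the root, or return None."""
    if t[0] == "plus":
        _, m, n = t
        if m == ZERO:        # eq 0 + N = N
            return n
        if m[0] == "s":      # eq s(M) + N = s(M + N)
            return s(plus(m[1], n))
    return None


def step(t):
    """One reduction step somewhere in t (root first, then leftmost subterm), or None."""
    reduced = rewrite_at_root(t)
    if reduced is not None:
        return reduced
    for i in range(1, len(t)):
        sub = step(t[i])
        if sub is not None:
            return t[:i] + (sub,) + t[i + 1:]
    return None


def normal_form(t):
    """Keep reducing until no equation can be applied."""
    while True:
        nxt = step(t)
        if nxt is None:
            return t
        t = nxt
```

For instance, `normal_form(plus(s(s(ZERO)), s(ZERO)))` reduces 2 + 1 step by step to `s(s(s(ZERO)))`, a term built only from constructors, which plays the role of the "value" of the expression.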
Since rewriting logic theories model distributed systems, they are typically
nondeterministic, meaning that there may be many different behaviors from the same
initial state of the system. A first form of analysis provided by Maude is to simulate
one of those behaviors by rewriting, which applies rewrite rules to the state, either
until no rule can be applied or until a user-given upper bound on the number of
rewrites has been reached. (The equations are applied to reduce each intermediate
state to its normal form before a rewrite rule is applied.) To analyze all possible
behaviors from a given initial state one can use Maude’s search capabilities to check
whether certain (un)desired states can be reached from the initial state.
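The difference between simulating one behavior and searching all behaviors can be sketched in Python. This is not Maude's search command, only a hedged illustration of breadth-first reachability analysis. The toy system is an assumption of this sketch: two processes each need locks a and b, the first takes a then b, the second takes b then a, and a state `(pc1, pc2)` records how many locks each process holds. The names `successors`, `is_deadlock`, and `search` are hypothetical.

```python
from collections import deque


def successors(state):
    """All states reachable in one step; either process may move at any time."""
    pc1, pc2 = state                  # 0: holds nothing, 1: first lock, 2: both locks
    a_free = pc1 == 0 and pc2 != 2    # lock a is P1's first and P2's second lock
    b_free = pc2 == 0 and pc1 != 2    # lock b is P2's first and P1's second lock
    result = []
    if pc1 == 0 and a_free:
        result.append((1, pc2))       # P1 takes a
    if pc1 == 1 and b_free:
        result.append((2, pc2))       # P1 takes b
    if pc1 == 2:
        result.append((0, pc2))       # P1 releases both locks
    if pc2 == 0 and b_free:
        result.append((pc1, 1))       # P2 takes b
    if pc2 == 1 and a_free:
        result.append((pc1, 2))       # P2 takes a
    if pc2 == 2:
        result.append((pc1, 0))       # P2 releases both locks
    return result


def is_deadlock(state):
    """A state is deadlocked if no step is possible from it."""
    return not successors(state)


def search(initial, successors, goal, max_depth=None):
    """Breadth-first search; returns a shortest path to a goal state, or None."""
    parent = {initial: None}
    depth = {initial: 0}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if goal(state):
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        if max_depth is not None and depth[state] >= max_depth:
            continue                  # do not explore beyond the given bound
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                depth[nxt] = depth[state] + 1
                queue.append(nxt)
    return None
```

Calling `search((0, 0), successors, is_deadlock)` returns a path to the state in which each process holds one lock and waits for the other. A single simulation might never exhibit this behavior, whereas exhaustive search over the finite reachable state space is guaranteed to find it. The optional `max_depth` bound mirrors the idea of limiting the analysis when the state space is too large or infinite.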
Not only can we specify the system in Maude; we can also define the requirements
the system should satisfy in Maude as linear temporal logic formulas. Maude’s
high-performance model checker can then be used to decide whether all possible
behaviors from a given initial state satisfy the requirements, provided that the set of
states reachable from the initial state is a finite set.
The Maude system, including a user manual, the source code, etc., is available free
of charge at http://maude.cs.illinois.edu for various Unix/Linux platforms. Maude
can also be compiled and run on Windows under Cygwin.
1.3 Why Maude?
There are a number of reasons why I think that Maude is a good choice for an
introduction to formal modeling and analysis of distributed systems:
Simple and intuitive formalism. Maude models basically consist of equations that
define functions recursively, and rewrite rules that specify how the states evolve
dynamically. That’s all! There are no tricky constructs for concurrency or
communication. This functional programming style tends to appeal to students.
Expressive formalism. The modeling formalism is fairly expressive, which makes
it easy to define models of complex systems. This is in contrast to simpler, e.g.,
automaton-based, approaches which either require a significant amount of work to
specify larger systems, or cannot model such a system at all due to the system’s
infinite-state nature. Maude also provides a natural model of concurrent objects,
which is ideal for modeling distributed systems. Together, this means that we can
easily model a wide range of distributed systems, as illustrated in this book.
Active area of research. A number of leading research groups perform research
on rewriting logic and apply Maude to state-of-the-art systems. A recent
bibliography [76] lists about 1000 published scientific papers involving rewriting logic and
Maude. Some applications of Maude include:
• Researchers at Microsoft and the University of Illinois at Urbana-Champaign
(UIUC) modeled aspects of web browsers and their interface in Maude, and used
Maude search to discover many previously unknown address bar and status bar
spoofing attacks in web browsers [19]. Maude has also been used to formally
specify and analyze a new secure web browser developed at UIUC [100].
• Modeling and analysis of a number of complex security and network communication
protocols, including 50-page multicast protocols, protocols developed by
the IETF, etc. (see, e.g., [51,69,94,95]).
• Most modeling and programming languages do not have a well-defined precise
meaning (or semantics); the meaning of a model may be unclear or ambiguous,
and the meaning of a program may depend on the compiler being used. This is of
course unacceptable for safety-critical systems. Furthermore, the lack of a formal
meaning makes it impossible to deduce properties about such models, and hence
to build tools for their analysis. Due to its expressiveness and simplicity, Maude is
well suited to define the mathematical meaning of a model or a program, and has
been used to define the semantics of a wide range of modeling and programming
languages [86,87], including subsets of the avionics (aircraft software) industrial
modeling standard AADL [91], the PLEXIL language developed at NASA for
spacecraft operations [31], the most complete formal semantics of the C and
Java languages [14,39], and so on. Having a Maude semantics also means that
models/programs in such a language can be analyzed using Maude. There is also
an efficient tool for analyzing multi-threaded Java programs [43].
• Finding several bugs in embedded software used by major car makers.
• Programs developed at NASA to determine the position of objects in space.
• Formalization, analysis, and development of cloud computing systems [53,71].
• Modeling of cell biology to simulate and analyze biological reactions [37,38].
The survey paper [84] gives an overview of some applications of Maude.
Mature and efficient. Maude is a fairly mature, robust, and high-performance tool,
publicly released in 1998, and is still under active development. It is also open-source
and easy to install.
1.4 Contents of the Book
A model of a distributed system consists of (at least) two parts: (i) the definition of
the data types (integers, Booleans, lists, sets, and so on) needed to define the states;
and (ii) the definition of the dynamic behavior of the system. This is reflected in the
structure of this book, which is divided into two parts.
Part I deals with defining data types by equational specifications, and analyzing
both the meaning and the operational properties of such equational specifications.
Part II deals with defining the dynamics of a distributed system using rewriting logic,
and with manually and automatically analyzing such models. Since a closely related
objective of this book is to introduce distributed systems, Part II also introduces
examples of such systems from different domains, including communication protocols,
distributed algorithms, and cryptographic (or “security”) protocols.
1.4.1 Part I: Algebraic Specification and Term Rewriting
This part covers classic topics in algebraic specification and term rewriting.
Chapter 2 introduces equational specification in Maude; we define in Maude the
usual data types: natural numbers, integers, lists, binary trees, and multisets. We
define the usual functions on these data types, including the quicksort and mergesort
algorithms on lists, as well as some classical NP-complete problems.
Chapter 3 introduces some operational properties that equational specifications
should satisfy. To exemplify how to formally reason about specifications, I focus on
reasoning about termination. Chapter 4 provides some intuition and more concrete
techniques to prove that your specification does not contain an infinite loop for any
input. We study the theoretical basis for the concept of simplification orders, and
use the standard path orders to prove termination. Chapter 5 shows how to verify
that specifications are confluent; that is, that the result of evaluating an expression is
independent of the order in which Maude chooses to apply the equations.
Chapter 6 shows how to use equational logic to reason about the “meaning” of
a specification. In particular, we focus on how induction techniques can be used to
prove that certain desired properties “follow logically” from a specification.
In formal modeling, the precise meaning of a specification/program is given by the
mathematical object defined by the program. Chapter 7 explains how an equational
Maude program defines a mathematical object, namely, an algebra. Chapter 7 also
proves Birkhoff’s Completeness Theorem: an equality holds in all models satisfying
a set of equations E if and only if that equality can be proved in equational logic.
Chapter 8 introduces rewriting logic and explains how rewrite rules can be used to
specify the possible concurrent behaviors of a system.
Chapter 9 explains how rewriting logic models can be analyzed in Maude by
simulating one possible behavior of the system and by searching for (un)desired
states. Chapter 10 then introduces Maude’s model of concurrent objects; all the
larger examples in this book are modeled in an object-oriented style. Chapters 8
to 10 illustrate the concepts on simple examples, such as various small “games” and
modeling the “lives” of persons, and end with the well-known dining philosophers
problem and with randomized simulations to evaluate different blackjack strategies.
Chapter 11 shows how different forms of communication can be modeled at a high
level of abstraction in Maude. These techniques are used in Chapter 12 to model a
TCP-like transport protocol that uses sequence numbers to achieve reliable and or-
dered message communication when the network infrastructure is unreliable and only
supports unordered message delivery. We then modify this protocol to the alternating
bit protocol when we can assume ordered but unreliable links in the network. These
two protocols are then generalized to two versions of the sliding window protocol,
which is supposedly the best-known algorithm in computer networking [96].
We are then ready for some larger examples. Chapter 13 deals with modeling
and analyzing a number of classic distributed algorithms, including the two-phase
commit protocol for distributed database transactions, distributed mutual exclusion
algorithms, and distributed leader election and consensus algorithms.
Chapter 14 shows how Maude can be used to model and analyze the aforementioned
Needham-Schroeder security protocol, whose goal is to let Alice and Bob
establish a communication between them so that Alice can be sure she’s communi-
cating with Bob and not with the malicious intruder Walker. Is the security protocol
up to this task, or can Maude show that Walker can impersonate Bob?
Chapter 15 introduces invariants and other kinds of requirements that our systems
may have to satisfy, and discusses both how Maude can be used to analyze such
system properties, and how they may be analyzed “by hand.”
These requirements are then formalized using temporal logic in Chapter 16, which
also explains how Maude’s model checker can be used to check whether a system
model satisfies its requirements.
Finally, Chapter 17 briefly discusses how the following kinds of systems can be
modeled and analyzed in (extensions of) Maude:
1. Real-time systems, where the amount of time of/between events plays a crucial
role and must be taken into account in the model.
2. Probabilistic systems, where certain events/values are chosen probabilistically.
This chapter describes how data types can be defined in Maude as equational
specifications. Section 2.1 introduces specification and execution in Maude with
some simple “Hello World!” examples specifying the natural numbers and the
Boolean values. Section 2.2 defines many-sorted equational specifications and
explains how Maude computes with equations. Section 2.3 describes important
requirements that an equational specification should satisfy. Section 2.4 shows
the Maude specifications of other data types, including lists, multisets, and binary
trees, and discusses the expressiveness of many-sorted equational specifications.
Data types are often related; for example, the natural numbers are a subset of the
integers. Such subset relationships are captured in equational specifications by sub-
sorts, which are treated in Section 2.5, and by sort memberships (Section 2.6). For
convenience and performance, efficient versions of basic data types (natural num-
bers, Booleans, integers, rationals, floating-point numbers, and strings) are built-in
in Maude as explained in Section 2.7. Section 2.8 introduces functional attributes
that can be used to define lists and multisets elegantly in Maude. Section 2.9 shows
Maude specifications of the well-known sorting algorithms quicksort and merge-
sort, and of solutions to some classic NP-complete problems. Finally, Section 2.10
briefly discusses other Maude features, including parameterized programming.
Maude specifications are declarative programs, which specify what to compute,
whereas imperative programs, such as Java programs, give a step-by-step descrip-
tion of how to compute something. Declarative languages have some attractive
features, including the following:
• Declarative languages do not have pointers, aliasing, and side effects, which
make imperative programs very hard to understand and reason about.
• Declarative programs are easier to specify and modify. The constructs are more
“powerful,” making it easier to specify complicated tasks, and to modify
programs, as there are no side effects.
© Springer-Verlag London 2017
P.C. Ölveczky, Designing Reliable Distributed Systems, Undergraduate Topics
in Computer Science, DOI 10.1007/978-1-4471-6687-0_2
12 2 Equational Specification in Maude
In this section we write and execute our first Maude specifications, defining the
natural numbers and the Boolean values. Such data types are defined as many-sorted
equational specifications, which consist of a set of sorts, where each sort roughly
corresponds to a data type, a set of function symbols (also called operators)—some
of which are used to construct the “values” of the data types, and others which are
ordinary functions on those values—, and equations defining the functions.
In Maude, an equational specification is called a functional module, and is intro-
duced with the following syntax:
fmod MODULENAME is
BODY
endfm
where MODULENAME is the name of the module being introduced, and BODY is
a set of declarations of sorts, function symbols, mathematical variables, and
equations. The order of the declarations does not matter, since BODY is a set of
declarations. A comment starts with *** or --- and goes until the end of the line, or it
starts with ***( or ---( and lasts until the first matching occurrence of ‘)’.
2.1 Hello World: Our First Maude Specifications 13
The following Maude module NAT-ADD specifies the natural numbers and a function
‘+’ on the natural numbers:
fmod NAT-ADD is
sort Nat .
op 0 : -> Nat [ctor] .
op s : Nat -> Nat [ctor] .
op _+_ : Nat Nat -> Nat .
vars M N : Nat .
eq 0 + M = M .
eq s(M) + N = s(M + N) .
endfm
This module declares a sort Nat and three function symbols (or operators): 0, which
does not take any arguments (such function symbols are called constants) and gives
an element of sort Nat; s, which takes an element of sort Nat as argument and gives
an element of Nat; and +, which takes two elements of sort Nat as arguments and
“returns” a Nat-value. The underscore (‘_’) tells where the arguments should be
placed in “mix-fix” notation. If there are no underscores (as is the case for s), then
the function symbol must be written using standard “prefix” notation.
The function symbols define the expressions, or ground terms, in our system;
some of the terms of sort Nat are 0, s(0), s(s(0)), . . . , 0 + 0, s(0) + s(0), . . . .
The function symbols 0 and s are declared to be data constructors (ctor). The
ground terms built up by the constructors, 0, s(0), s(s(0)), s(s(s(0))), . . . ,
denote the data values of Nat, and intuitively represent the numbers 0, 1, 2, 3, . . .
After declaring two variables M and N of sort Nat, the module defines the function
+ recursively by two equations. The variables M and N are mathematical variables as
we know from equations such as $(x + y)^2 = x^2 + 2xy + y^2$; they are not “program
variables” in the imperative programming sense that can be assigned values. Just as
an equation $(x + y)^2 = x^2 + 2xy + y^2$ is usually applied from left to right to simplify
an expression, Maude also applies the equations from left to right to simplify an
expression until it cannot be further simplified. The variables in the equations say
that the equations hold for all possible values for M and N. The equations define a
recursive function for computing the sum m + n of two numbers m and n: if m is 0,
apply the first equation and we are done; if m has the form s(m'), i.e., is greater than
0, the second equation recursively computes m' + n and adds one to this sum.
Assuming that you have installed Maude according to the instructions given at
http://maude.cs.illinois.edu/, you can start Maude, and should then
get a greeting from Maude that looks like
\||||||||||||||||||/
--- Welcome to Maude ---
/||||||||||||||||||\
Maude 2.7 built: Mar 3 2014 18:07:27
You now need to enter the module NAT-ADD into Maude. This can be done either by
typing the specification directly on Maude’s command line (not recommended) or
by writing the module in some file, say nat-add.maude, and then letting Maude read
this file by using the in command:1
Maude> in nat-add.maude
If you get some error message(s) you should be aware of the following:
• Maude is case-sensitive. The sorts Nat and nat are not the same.
• Each declaration should end with a space followed by a period (‘.’). However,
there should not be a period after endfm.
• For infix symbols such as + there should be a space before and after +. The
equation should be written eq 0 + M = M ., not eq 0+M = M .
• There should be no space between ‘_’ and ‘+’ in the declaration of +.
To exit Maude, give the command q (or quit).
Maude’s red (or reduce) command computes the “value” of a given expression,
such as 2 + 3, by using the equations “from left to right” to “replace equals for
equals” until no equation can be applied:
Maude> red s(s(0)) + s(s(s(0))) .
result Nat: s(s(s(s(s(0)))))
The last line gives the result s(s(s(s(s(0))))) (representing the number 5) and
states that this result has sort Nat.
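To make the left-to-right use of the equations concrete, the following minimal sketch lists the expected simplification steps for 2 + 3 as comments and then lets Maude perform the same computation. The module name NAT-ADD-TRACE and the file name in the first comment are assumptions made here, chosen only so that the file can be loaded independently of NAT-ADD; the two equations are the ones for + used throughout this chapter. The last command uses Maude's built-in equality test _==_ and should reduce to true.

```maude
*** Hypothetical file nat-add-trace.maude; NAT-ADD-TRACE is an illustrative name.
fmod NAT-ADD-TRACE is
  sort Nat .
  op 0 : -> Nat [ctor] .
  op s : Nat -> Nat [ctor] .
  op _+_ : Nat Nat -> Nat .
  vars M N : Nat .
  eq 0 + M = M .
  eq s(M) + N = s(M + N) .
endfm

*** Expected simplification of 2 + 3, one equation application per step:
***   s(s(0)) + s(s(s(0)))
***   = s(s(0) + s(s(s(0))))      by the second equation
***   = s(s(0 + s(s(s(0)))))      by the second equation
***   = s(s(s(s(s(0)))))          by the first equation
red s(s(0)) + s(s(s(0))) .

*** Compare the computed value with the constructor term representing 5.
red s(s(0)) + s(s(s(0))) == s(s(s(s(s(0))))) .
```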
The following module BOOLEAN defines a data type for the Boolean values.
The “values” in this data type are “true” and “false,” which we represent by two
constructor constants true and false. We also declare the Boolean functions not
(negation), and (conjunction), and or (logical disjunction) as follows:
1 The command load nat-add does the same thing, but does not print the list of modules.
fmod BOOLEAN is
sort Boolean .
ops true false : -> Boolean [ctor] .
op not_ : Boolean -> Boolean [prec 53] .
op _and_ : Boolean Boolean -> Boolean [prec 55] .
op _or_ : Boolean Boolean -> Boolean [prec 59] .
var B : Boolean .
eq not false = true . eq not true = false .
eq true and B = B . eq false and B = false .
eq true or B = true . eq false or B = B .
endfm
The actual names of sorts and operators do not matter; we can equally well use the
sort name Bool or TruthValues instead of Boolean, and the constructors 1 and 0
(or T and F) instead of true and false.
In first-order logic there is a precedence between the function symbols, where
e.g. negation binds tighter than conjunction, so that ¬x ∧ y is read (¬x) ∧ y. We can
tell the Maude parser to impose a similar precedence on the function symbols by
adding an attribute prec n to the function symbol declaration, where n is a natural
number. The lower the number of an operator, the tighter its binding. What matters
is the relationship between the numbers: instead of 53, 55, and 59 we could have
chosen 1, 2, and 3 with the same effect. A term true and not true or false
is understood as (true and (not true)) or false.
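The effect of the prec attribute can be tried out with a small sketch. To keep the example independent of Maude's predefined Booleans, it uses a hypothetical sort Truth with constants tt and ff and operators neg_, _conj_, and _disj_; these names are assumptions made for this illustration only, while the precedence values 53, 55, and 59 are the ones used in BOOLEAN. The parentheses around the arguments of _==_ are needed because that built-in operator binds tighter than _disj_.

```maude
*** Hypothetical module; all names below are illustrative.
fmod PREC-SKETCH is
  sort Truth .
  ops tt ff : -> Truth [ctor] .
  op neg_ : Truth -> Truth [prec 53] .
  op _conj_ : Truth Truth -> Truth [prec 55] .
  op _disj_ : Truth Truth -> Truth [prec 59] .
  var B : Truth .
  eq neg ff = tt .      eq neg tt = ff .
  eq tt conj B = B .    eq ff conj B = ff .
  eq tt disj B = tt .   eq ff disj B = B .
endfm

*** Read as (tt conj (neg tt)) disj ff, which simplifies to ff.
red tt conj neg tt disj ff .

*** The unparenthesized and the fully parenthesized term should agree.
red (tt conj neg tt disj ff) == ((tt conj (neg tt)) disj ff) .
```

Replacing 53, 55, and 59 by 1, 2, and 3 in this sketch should give the same results, since only the relative order of the numbers matters.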
A module may import another module that has already been entered into Maude
using the keyword protecting or including.2 The following module imports both
our previous modules to define the “less than” function on natural numbers:
fmod NAT< is
protecting NAT-ADD . protecting BOOLEAN .
op _<_ : Nat Nat -> Boolean .
vars M N : Nat .
eq 0 < s(M) = true .
eq M < 0 = false .
eq s(M) < s(N) = M < N .
endfm
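Module importation can be tried out in isolation with the following sketch. The module names NAT-BASE-SKETCH and NAT-LT-SKETCH are hypothetical, and, as a simplifying assumption, the value sort of < is Maude's predefined sort Bool rather than the sort Boolean defined above, so that the file is self-contained.

```maude
*** Hypothetical modules; the three equations for < are those of NAT<.
fmod NAT-BASE-SKETCH is
  sort Nat .
  op 0 : -> Nat [ctor] .
  op s : Nat -> Nat [ctor] .
endfm

fmod NAT-LT-SKETCH is
  protecting NAT-BASE-SKETCH .     *** import the sort Nat and its constructors
  op _<_ : Nat Nat -> Bool .       *** assumption: predefined Bool, not Boolean
  vars M N : Nat .
  eq 0 < s(M) = true .
  eq M < 0 = false .
  eq s(M) < s(N) = M < N .
endfm

red s(0) < s(s(s(0))) .      *** 1 < 3, expected to reduce to true
red s(s(0)) < s(s(0)) .      *** 2 < 2, expected to reduce to false
```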
Exercise 1 Write the module NAT-ADD in a file, let Maude read the file with the
specification, and use Maude’s red command to compute 2 + 4 and (2 + 3) + 4.
2 Although protecting and including have different mathematical meaning (see [21] for
details), the Maude system treats them in the same way.
The sorts are just names and do not contain a priori any associated values. Instead,
we use function symbols (also called operator symbols) to define the “elements”
or “values” of each sort, and to define functions on their domains of values. A
declaration of a function symbol has the form
op f : s1 ... sn -> s .
for $n \geq 0$, where $f$ is the introduced function symbol, and $s_1, \ldots, s_n$, and $s$ are sorts.
The list $s_1 \ldots s_n$ is the arity of $f$, and $s$ is its value sort. Multiple function symbols
with the same arity and value sort can be declared in one declaration:
ops f g h : s1 ... sn -> s .
We will use the terms “function symbol”, “function”, “operator symbol”, “opera-
tor”, and “operation” interchangeably.
Example 2.1. In the module NAT-ADD, the function symbol 0 has the empty list as
its arity and Nat as its value sort, the function s has arity Nat and value sort Nat,
and the symbol + has arity Nat Nat and value sort Nat. ♦
A function symbol whose arity is the empty list (i.e., n = 0) is called a constant.
A many-sorted signature consists of a set of sorts and a set of function symbol
declarations (where an element $w \in S^*$ is a finite sequence of $S$-elements):
The ground terms define the “expressions” we can talk about. A ground term is
built by constants and other function symbols in a “sort-correct” way:
Definition 2.2 (Ground terms) Given a many-sorted signature $(S, \Sigma)$, the $S$-sorted
set $T_\Sigma = \{T_{\Sigma,s} \mid s \in S\}$ of ground terms is defined inductively as follows:
2.2 Many-Sorted Equational Specifications 17
Notation. I sometimes use typewriter font for ‘,’, ‘(’, and ‘)’ instead of the
corresponding mathematical symbols, so that a term $f(a, b)$ will also be written f(a,b).
Example 2.3. The set $T_{\Sigma_{\texttt{NAT-ADD}},\texttt{Nat}}$ of ground terms of sort Nat contains the ground
terms 0, s(0), s(s(0)), 0 + 0, s(0) + 0, s(0) + (s(0) + 0), ... ♦
When a definition mentions “all terms of the form $f(t_1, \ldots, t_n)$ for $n \geq 0$,” then this
also includes all the constants (i.e., when $n = 0$).
As already mentioned, constructor functions (such as 0 and s) define the ele-
ments of the data type: the data elements of a sort are the ground terms consisting
only of constructor functions. The other functions (such as +), called defined func-
tions, are ordinary functions on those elements, and are defined by equations.
Mathematical variables of different sorts are needed to define equations:
In Maude, the keywords var and vars are used to declare variables. However, vari-
ables of the form var:sort can also be used on-the-fly without explicit declaration,
so that the following two specification fragments are equivalent:
vars M N : Nat . eq 0 + M = M . eq s(M) + N = s(M + N) .
and
eq 0 + M:Nat = M:Nat . eq s(M:Nat) + N:Nat = s(M:Nat + N:Nat) .
(“Non-ground”) terms can contain variables: the set $T_\Sigma(X)$ of terms in a signature
$(S, \Sigma)$ w.r.t. a set of variables $X$ consists of all the “things” that can be built in a
sort-consistent way from constants, variables, and the application of functions:
Definition 2.4 (Terms) Given a many-sorted signature $(S, \Sigma)$ and a variable set
$X = \{X_s \mid s \in S\}$, the $S$-sorted set of terms $T_\Sigma(X) = \{T_{\Sigma,s}(X) \mid s \in S\}$ is defined
inductively by the following conditions:
1. $X_s \subseteq T_{\Sigma,s}(X)$ for $s \in S$; that is, a variable of sort $s$ is also a term of sort $s$.
2. $\Sigma_{\varepsilon,s} \subseteq T_{\Sigma,s}(X)$ for $s \in S$; that is, a constant of sort $s$ is also a term of sort $s$.
3. $f(t_1, \ldots, t_n) \in T_{\Sigma,s}(X)$ if $f \in \Sigma_{s_1 \ldots s_n, s}$ and $t_i \in T_{\Sigma,s_i}(X)$ for each $1 \leq i \leq n$.
4. $T_\Sigma(X)$ is the smallest $S$-sorted set satisfying the above conditions.
The operational meaning describes how Maude’s red command computes with
equations. For example, if we ask Maude to compute the “value” of a ground term
such as s(s(0 + s(0))) + 0, then the following happens:
1. Maude checks whether some equation can be applied somewhere in the term.
That is, it checks whether the left-hand side of an equation “matches” the term
somewhere. It then applies the equation by “replacing equal by equal.” For example,
the equation 0 + M = M could be applied to the term s(s(0 + s(0))) + 0,
Exercise 2 Overloading a function symbol means that the same function symbol
can have different arities and/or value sorts. This can be quite convenient, since a
constant ‘0’ could be a bit value, a Boolean value, and a natural number:
sorts Bit Boolean Nat .
ops 0 1 : -> Bit . ops 0 1 : -> Boolean . op 0 : -> Nat .
A data type consists of a set of elements (the domain) and a set of functions on those
elements. Examples of domains are the set $\mathbb{N}$ of natural numbers, the set of all lists
of natural numbers, the set of all binary trees of a certain kind, and so on.
In Maude, the elements in a data type are represented by the ground terms built
by the constructor function symbols. For this to make sense: (i) each element in the
domain we want to model must be represented by a constructor ground term; (ii)
each element is only represented by one constructor ground term, or by a single
equivalence class of such terms when there are equivalences on constructor ground
terms (such as in the case of sets); and (iii) there are no “junk” constructor ground
terms that do not represent elements in our domain.
For the natural numbers and their Maude representation in the module
NAT-ADD we have the desired one-to-one correspondence: each number $n \in \mathbb{N}$ is
represented by the constructor ground term s(s(...(s(0))...)) with n occurrences of s;
and a constructor ground term of sort Nat is either 0 (representing the number 0) or
has the form s(s(...(s(0))...)) with m occurrences of s, for $m \geq 1$, which represents
the number m.
Maude would “simplify” a to b using the first equation, and then b would be simpli-
fied to a using the second equation, and then a would again be simplified to b using
the first equation, and so on, giving an infinite computation
a → b → a → b → ···
3 A “decrease” typically means that the number of function symbol occurrences in a constructor
ground term must decrease.
2.3 Requirements of Equational Specifications 21
is terminating, since the first argument of f decreases in each recursive call. How-
ever, if the second equation is replaced by
eq f(s(M), N) = f(M, M + N) + f(N, M) .
We want to compute the value (i.e., a constructor ground term) of a functional ex-
pression (i.e., a ground term). Each expression should therefore be reducible to a
constructor ground term. For example, if we “forget” the equation 0 + M = M in
NAT-ADD, then s(s(0 + s(0))) + 0 reduces to s(s((0 + s(0)) + 0)), which can-
not be further reduced, and which is not the result we really wanted.
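The effect of the “forgotten” equation can be reproduced with the following sketch. The module name NAT-ADD-INCOMPLETE is hypothetical; the module is NAT-ADD with the equation 0 + M = M commented out on purpose.

```maude
*** Hypothetical module: NAT-ADD without its first equation.
fmod NAT-ADD-INCOMPLETE is
  sort Nat .
  op 0 : -> Nat [ctor] .
  op s : Nat -> Nat [ctor] .
  op _+_ : Nat Nat -> Nat .
  vars M N : Nat .
  *** eq 0 + M = M .          (deliberately left out)
  eq s(M) + N = s(M + N) .
endfm

*** The result still contains +, so it is not a constructor ground term.
red s(s(0 + s(0))) + 0 .

*** The stuck term should be the one given in the text.
red s(s(0 + s(0))) + 0 == s(s((0 + s(0)) + 0)) .
```

Maude reports no error in this situation; it simply returns a term that cannot be simplified further, so it is up to the specifier to notice that the result is not a value.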
This is the same as requiring that a non-constructor function is “defined” on all
constructor ground terms. For instance, for natural numbers, n1 + n2 is defined for all
values/constructor ground terms n1 and n2 , since n1 (and n2 as well for that matter)
should have the form 0 or s(n) for some n. In the first case, the equation 0 + M = M
will apply, and in the second case s(M) + N = s(M + N) can be applied.
Functions are often defined by one equation for each constructor, although some-
times we need fewer, and sometimes more, equations:
op double : Nat -> Nat . var N : Nat . eq double(N) = N + N .
The above equation covers all arguments of double. A function minusTwo which
decreases any number greater than one by two can be defined by three equations:
op minusTwo : Nat -> Nat . var N : Nat .
eq minusTwo(0) = 0 . eq minusTwo(s(0)) = 0 .
eq minusTwo(s(s(N))) = N .
For any constructor ground term n, some equation can be applied on minusTwo(n).
The function < in Section 2.1.3 is defined for all pairs (m, n) of constructor ground
terms m and n; this can be checked by considering all possible values for this pair:
(0, 0), (0, s(n)), (s(m), 0), and (s(m), s(n)). In each of these cases, an equa-
tion defining < can be applied.
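The functions double and minusTwo can be tried out together in a minimal sketch. The module name NAT-FUNS-SKETCH is an assumption made here, and the declarations of Nat and + are repeated only to make the file self-contained.

```maude
*** Hypothetical module collecting double and minusTwo from the text.
fmod NAT-FUNS-SKETCH is
  sort Nat .
  op 0 : -> Nat [ctor] .
  op s : Nat -> Nat [ctor] .
  op _+_ : Nat Nat -> Nat .
  ops double minusTwo : Nat -> Nat .
  vars M N : Nat .
  eq 0 + M = M .
  eq s(M) + N = s(M + N) .
  eq double(N) = N + N .             *** one equation covers all arguments
  eq minusTwo(0) = 0 .
  eq minusTwo(s(0)) = 0 .
  eq minusTwo(s(s(N))) = N .         *** three equations cover 0, 1, and n + 2
endfm

red double(s(s(0))) .                *** 2 + 2
red minusTwo(s(0)) .                 *** 1 is mapped to 0
red minusTwo(s(s(s(0)))) .           *** 3 - 2
red minusTwo(double(s(0))) == 0 .    *** expected to reduce to true
```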
A more precise name for the definedness property is sufficient completeness: the
result of simplifying a ground term should be a constructor ground term.
Maude does not check whether your specification satisfies these requirements. The
first one obviously cannot be checked, since Maude cannot know what domain you
are trying to represent. The other three requirements are in general undecidable:
there is no algorithm that can look at any user module and tell whether the module
satisfies the requirements or not. However, Maude has (external) termination check-
ers [32], confluence checkers [33], and sufficient completeness checkers [57] that
can often be used to check the corresponding requirements.
You must make sure that the above requirements are satisfied independently of
how Maude is implemented. Since we have no control over the application of equa-
tions, it would be unsatisfactory if the result of computing a term would depend on
how the Maude system chooses which equations to apply.
Exercise 5 Explain why there are no infinite computations in NAT-ADD and NAT<.
This section explains how data types can be defined as many-sorted equational
specifications.
2.4 Many-Sorted Specification of Data Types 23
Although there is no automatic way to define functions, one hint to help get you
quickly started is to define a function op f : S -> S by one (or more) equation(s)
for each constructor for S. For example, if the constructors for the sort S are two
constants a and b, one unary operator g (i.e., a function taking one argument), and
one binary operator h (i.e., a function taking two arguments), then one could first
try to define f by four equations of the form
eq f (a) = ... eq f (b) = ... eq f (g(X)) = ... eq f (h(X,Y)) = ...
for variables X and Y of appropriate sorts. For the sort Nat, we can follow this scheme
to define the function double, which doubles its argument, also without using +:
eq double(0) = 0 . eq double(s(N)) = s(s(double(N))) .
If the function f takes two arguments, you can define f by “cases” on the construc-
tors for one of the arguments, or for both. NAT-ADD defines addition by “cases” on
the first argument, but it could equally well have used the second argument. We can
use this technique to define multiplication by “cases” on the first argument:
fmod NAT-MULT is protecting NAT-ADD .
op _*_ : Nat Nat -> Nat .
vars M N : Nat .
eq 0 * N = 0 .
eq s(M) * N = N + (M * N) .
endfm
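The remark that addition could equally well be defined by “cases” on its second argument can be sketched as follows. The module name NAT-ADD-SND-SKETCH is hypothetical; the two equations for + are a variant written for this illustration and are not the ones in NAT-ADD, whereas the equations for * are those of NAT-MULT.

```maude
*** Hypothetical module: + by cases on the second argument, * as in NAT-MULT.
fmod NAT-ADD-SND-SKETCH is
  sort Nat .
  op 0 : -> Nat [ctor] .
  op s : Nat -> Nat [ctor] .
  op _+_ : Nat Nat -> Nat .
  op _*_ : Nat Nat -> Nat .
  vars M N : Nat .
  eq M + 0 = M .                 *** second argument is 0
  eq M + s(N) = s(M + N) .       *** second argument has the form s(N)
  eq 0 * N = 0 .
  eq s(M) * N = N + (M * N) .
endfm

*** 2 * 3 = 6; expected to reduce to true.
red s(s(0)) * s(s(s(0))) == s(s(s(s(s(s(0)))))) .
```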
For binary (or, more generally, n-ary) functions, sometimes such case
definitions only work for one of the arguments (like list concatenation in
Section 2.4.3.1). Sometimes we may need to do a “case” on both arguments. For
less-than on natural numbers, we need to consider both arguments: the first argument is
0 or has the form s(m), and the second argument is either 0 or has the form s(n):
eq 0 < 0 = false . eq s(M) < 0 = false .
eq 0 < s(N) = true . eq s(M) < s(N) = M < N .
Again, this is just to help get you started; once you have defined your function, you
should make its definition more elegant: the upper two equations can be combined
into the single equation M < 0 = false, yielding the definition in Section 2.1.3.
While this is a useful starting point, sometimes you need more elaborate definitions,
such as for the function minusTwo above.
An important thing discussed next is that it is often convenient, or even necessary,
to introduce auxiliary functions in order to define a given function.
Bergstra and Tucker show in [12] that it is impossible to define the square function
on natural numbers in Maude without using other functions than 0 and s. And try
to define exponentiation without using other functions than addition! However, both
the square function and exponentiation are easily defined if you introduce (addition
and) multiplication as auxiliary functions:
fmod NAT-EXP is protecting NAT-MULT .
op square : Nat -> Nat .
op _^_ : Nat Nat -> Nat .
vars M N : Nat .
eq square(N) = N * N .
eq M ^ 0 = s(0) .
eq M ^ s(N) = M * (M ^ N) .
endfm
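A self-contained sketch for experimenting with square and exponentiation is given below. The module name NAT-EXP-SKETCH is hypothetical, and the equations for + and * are repeated from NAT-ADD and NAT-MULT so that the file does not depend on other modules.

```maude
*** Hypothetical module: NAT-ADD, NAT-MULT, and NAT-EXP collected in one place.
fmod NAT-EXP-SKETCH is
  sort Nat .
  op 0 : -> Nat [ctor] .
  op s : Nat -> Nat [ctor] .
  ops _+_ _*_ _^_ : Nat Nat -> Nat .
  op square : Nat -> Nat .
  vars M N : Nat .
  eq 0 + N = N .
  eq s(M) + N = s(M + N) .
  eq 0 * N = 0 .
  eq s(M) * N = N + (M * N) .
  eq square(N) = N * N .
  eq M ^ 0 = s(0) .
  eq M ^ s(N) = M * (M ^ N) .
endfm

red square(s(s(s(0)))) .      *** 3 * 3

*** 2^3 = 4 + 4; expected to reduce to true.
red s(s(0)) ^ s(s(s(0))) == square(s(s(0))) + square(s(s(0))) .
```

Since no precedences are declared for the three binary operators, nested applications such as M * (M ^ N) have to be parenthesized explicitly.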
What does this difficulty of defining simple functions without introducing auxil-
iary functions say about the expressive power of terminating and confluent
finitary4 many-sorted equational specifications? It turns out that by adding auxil-
iary functions, you can define whatever you want in this way. (The expressiveness
of equational specifications is also indicated in Section 4.1, which shows that Turing
machines can be simulated by equational specifications. However, the correspond-
ing specifications are not necessarily terminating and/or confluent).
Formally, any recursive (i.e., computable) function on finite products of natural
numbers can be defined by a terminating and confluent finitary many-sorted equa-
tional specification (see, e.g., [105, Section 3.2]). Furthermore, Bergstra and Tucker
prove the following remarkable result in [11, 12] (see also the discussion in [85]):
This means that anything you can do in your favorite programming language, you
can also do in Maude! Just add auxiliary functions (new sorts are not needed).
This section shows the Maude specification of some well-known data types.
How can lists of, say, natural numbers, be represented in a many-sorted equational
specification? A constructor for the empty list is obviously needed:
sort List .
op nil : -> List [ctor] .
4 That is, using only a finite number of sorts, functions, and equations.
5 A computable algebra is one whose domains are recursive sets (i.e., we can decide whether an
element is a member of the set) and whose functions are recursive (i.e., computable) functions.
A more appealing way of representing lists is to let the append function instead be
denoted by a mix-fix function symbol:
op _ _ : List Nat -> List [ctor] .
The list “1 2 3” is now represented by the term nil s(0) s(s(0)) s(s(s(0))).
The following module defines lists of natural numbers and some functions on them:6
fmod LIST-NAT1 is protecting NAT1 . protecting BOOLEAN1 .
sort List .
op nil : -> List [ctor] .
op _ _ : List Nat -> List [ctor] .
op length : List -> Nat . *** # of elements in a list
op concat : List List -> List . *** Concatenate two lists
op insertFront : Nat List -> List . *** Insert element first
ops first last : List -> Nat . *** First/last element
op empty? : List -> Boolean . *** Is the list empty?
op rest : List -> List . *** Remove first element.
op reverse : List -> List . *** Reverse list
op _occursIn_ : Nat List -> Boolean .
op remove : Nat List -> List . *** Remove element(s)
op max : List -> Nat . *** Largest element in list
op isSorted : List -> Boolean . *** Is the list sorted?
The length function, giving the number of elements in the list, can be defined
using the techniques suggested above; i.e., by recursion on the argument w.r.t. the
constructors nil and _ _:
eq length(nil) = 0 .
eq length(L N) = s(length(L)) .
To define the list concatenation function concat, it turns out that doing the recursion
on the second argument works:
eq concat(L, nil) = L .
eq concat(L, L' N) = concat(L, L') N .
The function first gives the value of the first element in the list. But what is the first
element in an empty list? The function first is a partial function that is not defined
on all lists, but only on non-empty lists. Partial functions are treated in Sections 2.5
and 2.6; in the meantime we just define that the first element in an empty list is 0:
eq first(nil) = 0 . *** Default/error value
eq first(nil N) = N .
eq first(L N N') = first(L N) .
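The equations for length, concat, and first can be tested with the following sketch. The module name LIST-NAT-SKETCH is hypothetical; it declares only the sorts and functions needed here instead of importing NAT1 and BOOLEAN1, and it writes the juxtaposition constructor as __ (two underscores without a space).

```maude
*** Hypothetical module: a cut-down version of LIST-NAT1.
fmod LIST-NAT-SKETCH is
  sorts Nat List .
  op 0 : -> Nat [ctor] .
  op s : Nat -> Nat [ctor] .
  op nil : -> List [ctor] .
  op __ : List Nat -> List [ctor] .     *** juxtaposition constructor
  op length : List -> Nat .
  op concat : List List -> List .
  op first : List -> Nat .
  vars L L' : List .
  vars N N' : Nat .
  eq length(nil) = 0 .
  eq length(L N) = s(length(L)) .
  eq concat(L, nil) = L .
  eq concat(L, L' N) = concat(L, L') N .
  eq first(nil) = 0 .                   *** default/error value
  eq first(nil N) = N .
  eq first(L N N') = first(L N) .
endfm

red length(nil s(0) s(s(0)) s(s(s(0)))) .                    *** list "1 2 3"
red concat(nil 0, nil s(0) s(s(0))) == nil 0 s(0) s(s(0)) .  *** expected: true
red first(nil s(s(0)) 0 s(0)) .                              *** list "2 0 1"
```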
A binary tree whose nodes are (labeled with) natural numbers can be represented by
the following constructors:
sort BinTree .
op empty : -> BinTree [ctor] .
op bintree : BinTree Nat BinTree -> BinTree [ctor] .
where bintree(t, n, t') represents the tree with root labeled n which has t as its left
subtree and t' as its right subtree. For example, the tree in Fig. 2.1 is represented by
the term
bintree(empty, s(s(s(s(0)))),
bintree(empty, s(s(s(s(s(s(s(0))))))), empty))
It is easy to see that each binary tree can be represented by a unique constructor
ground term of sort BinTree, and that each such term represents a binary tree.
The following module defines a data type for binary trees:
fmod BINTREE-NAT1 is protecting LIST-NAT1 .
sort BinTree .
op empty : -> BinTree [ctor] .
op bintree : BinTree Nat BinTree -> BinTree [ctor] .
ops preorder inorder postorder : BinTree -> List .
ops size weight : BinTree -> Nat .
op isSearchTree : BinTree -> Boolean .
op reverse : BinTree -> BinTree .
eq preorder(empty) = nil .
eq preorder(bintree(BT, N, BT'))
The functions preorder, inorder, and postorder list the elements in a tree in the
order they are encountered in, respectively, a preorder, an inorder, and a postorder
traversal of the tree. weight gives the sum of the elements in the tree, size gives
the number of elements, and isSearchTree returns true if and only if the tree is
a binary search tree; that is, an inorder traversal (“from left to right”) encounters
the elements in increasing (or at least non-decreasing) order. The function reverse
reverses the tree; i.e. “flips it” around its vertical axis, and then does the same recur-
sively for each subtree.
Sets and multisets (which are essentially “sets,” but where the number of
occurrences of each element matters) are important data types. However, since the sets
{a, b} and {b, a} are the same set, it is hard to define a one-to-one constructor basis.
For example, using constructors
op empty : -> Set [ctor] . op _;_ : Set Nat -> Set [ctor] .
the same set {0, 1} = {1, 0} could be represented by the two different constructor
ground terms empty ; 0 ; s(0) and empty ; s(0) ; 0. Section 2.8.3 defines sets so
that each set is represented by one equivalence class of constructor ground terms.
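The problem with this representation can be observed directly in Maude. In the following sketch, whose module name SET-NAT-SKETCH is hypothetical, no equation relates the two constructor ground terms, so the built-in equality test _==_ is expected to reduce to false even though both terms are meant to denote the set {0, 1}.

```maude
*** Hypothetical module: sets built with the constructors empty and _;_.
fmod SET-NAT-SKETCH is
  sorts Nat Set .
  op 0 : -> Nat [ctor] .
  op s : Nat -> Nat [ctor] .
  op empty : -> Set [ctor] .
  op _;_ : Set Nat -> Set [ctor] .
endfm

*** Two different terms for the same intended set; expected result: false.
red empty ; 0 ; s(0) == empty ; s(0) ; 0 .
```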
Exercise 7 Define a function square : Nat -> Nat that computes the square of
a number, without using any other function except s, 0, +, and square itself.
Exercise 8 Explain why parentheses are not needed when using the constructors
nil and _ _ for lists. That is, show that expressions such as nil s(0) s(s(0))
s(s(s(0))) can only be parsed in one way.
Exercise 9 1. Define a module NAT1 that extends NAT< with the functions
op half : Nat -> Nat .
ops _monus_ diff min : Nat Nat -> Nat .
ops odd even : Nat -> Boolean .
ops _<=_ _>_ _>=_ _==_ : Nat Nat -> Boolean .
half is “integer division by 2,” m monus n is “minus down to 0,” i.e., max(m − n, 0),
diff is the difference between two numbers, min computes the smaller
of two numbers, and odd and even return true if their argument is an odd, resp.
even, number. The other functions are the usual comparison operators.
2. Define a module BOOLEAN1 that extends BOOLEAN with the following functions:
op _implies_ : Boolean Boolean -> Boolean [prec 61] .
op if_then_else_fi : Boolean Boolean Boolean -> Boolean .
which compares two lists lexicographically, and test your definition in Maude.
Exercise 12 Represent the following binary tree as a term of sort BinTree.
4
2 7
3 6 9
Exercise 13 Define the remaining functions in the module BINTREE-NAT1 in Maude.
Exercise 14 1. Define a sort Bits of lists of bits 0 and 1.
2. Define a function neg : Bits -> Bits that “flips” each bit in the list.
3. Define a function _+_ : Bits Bits -> Bits that adds two binary numbers
(represented as Bits). For example, (nil 1 0 1 1) + (nil 1 1 0) (11+6)
should return nil 1 0 0 0 1 (17).
2.5 Order-Sorted Equational Specifications
Different sorts are not related in the many-sorted world. This hardly seems practical.
For example, it is natural to have a sort Nat for the natural numbers and a sort Int for
the integers. Using only the sort Int and forgetting about Nat is not very elegant,
since some functions, such as the factorial function, are partial functions on the
integers that do not take negative numbers as arguments. We have seen other partial
functions, such as first, last, and rest on lists, which should only be defined on
non-empty lists. To have two unrelated sorts Int and Nat is unsatisfactory as well,
since it requires functions used both on natural numbers and integers to be defined
twice, and does not allow the use of a natural number in place of an integer.
Maude supports order-sorted specifications (see e.g. [50, 82]), in which a sort
may have subsorts. Intuitively, a subsort declaration
subsort s’ < s .
means that the sort s’ is “included” in the sort s, in the sense that each element of
s’ is also an element of s. For example, since the natural numbers are a subset of
the integers, it is natural to have Nat < Int. Multiple subsort declarations can be
combined into a single one: subsorts Nat Neg < Int ., which states that both
Nat and Neg are subsorts of Int. (A subsort declaration does not declare the sorts,
so the above sorts must also be declared as usual).
Formally, the set of sorts is equipped with a partial order ≤ (see Appendix A).
The subsort relation ≤ induces a subsort relation ≤ on lists of sorts of the same
length, where $s_1 \ldots s_n \leq s'_1 \ldots s'_n$ holds if and only if $s_i \leq s'_i$ for each $1 \leq i \leq n$.
If Nat is a subsort of Int, a function which takes Int arguments will also accept
Nat arguments, since any Nat value is also an Int value. For example, a function
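op _+_ : Int Int -> Int .
can also be applied to two terms of sort Nat. We can additionally declare
op _+_ : Nat Nat -> Nat .
to tell Maude that the value of m + n has sort Nat if both m and n have sort Nat.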
As explained in Section 2.6, such declarations of subsort overloaded functions are
only needed for constructors, to ensure that each (sub)sort has the desired domain.
An order-sorted signature is a many-sorted signature with a partial order ≤ on
the sorts:
Definition 2.7 (Order-sorted signature) An order-sorted signature $(S, \leq, \Sigma)$ consists
of a set $S$ (of sorts), a partial order $\leq$ on $S$, and an $S^* \times S$-sorted family
$\Sigma = \{\Sigma_{w,s} \mid w \in S^*, s \in S\}$ of “function symbol declarations.”
Terms are defined as expected: if $s' \leq s$, then a term of sort $s'$ is also a term of sort $s$.
Formally, the rule
0. $T_{\Sigma,s'}(X) \subseteq T_{\Sigma,s}(X)$ if $s' \leq s$; that is, a term of a subsort $s'$ is also a term of the
supersort $s$
is added to Definition 2.4, which defines the terms in a many-sorted signature.
The set of ground terms is defined as expected:
$T_\Sigma = \{T_{\Sigma,s} \mid T_{\Sigma,s} = T_{\Sigma,s}(\emptyset),\ s \in S\}$.
The following example shows that the sort of a term could be ambiguous in the
sense of the term having completely unrelated sorts, which is of course undesired:
sorts s1 s2 s12 u1 u2 .
subsorts s12 < s1 s2 .
op a : -> s1 . op b : -> s2 . op c : -> s12 .
op f : s1 -> u1 . op f : s2 -> u2 . op h : u1 -> u1 .
What is the sort of the term f(c)? Since c is an element of sort s1, the term f(c)
should have sort u1, but since c is also an element of sort s2, the term f(c) should
have sort u2. Such ambiguity is undesirable since u1 and u2 are unrelated (is, e.g.,
h(f(c)) a term?). Maude therefore requires that each non-constant term has a
unique least sort as explained below. The specification would be OK if we added
sort u12 . subsorts u12 < u1 u2 . op f : s12 -> u12 .
7 A connected component of (S, ≤) is an equivalence class in the transitive and symmetric closure
of (S, ≤).
2.5.1.1 Partiality
We have not defined division on the natural numbers. The reason is that division is a
partial function on the natural numbers, since n/0 is undefined for any n. The point
is that we can define a subsort NzNat, for the nonzero natural numbers, of Nat, so
that division is well-defined on (the domain defined by) the subsort.
sorts NzNat Nat . subsort NzNat < Nat .
The constructors must be declared so that the constructor ground terms of sort NzNat
are exactly the nonzero natural numbers:
op 0 : -> Nat [ctor] . op s : Nat -> NzNat [ctor] .
The division operator can then be declared to have only nonzero denominators:
op _/_ : Nat NzNat -> Nat .
A subsort NeList of List for non-empty lists can be defined in the same way, so
that first, last, rest, and max become total functions on that subdomain:
sorts List NeList . subsort NeList < List .
op nil : -> List [ctor] . op _ _ : List Nat -> NeList [ctor] .
The first three of the above functions can then be defined as follows:
ops first last : NeList -> Nat .
op rest : NeList -> List .
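One way the three functions could be defined on non-empty lists is sketched below. The module name NELIST-SKETCH and the import of the built-in NAT are assumptions made to keep the example self-contained, and the equations are one possible choice rather than necessarily the ones intended here.

```maude
--- Sketch only: NELIST-SKETCH and the import of built-in NAT are assumptions.
fmod NELIST-SKETCH is protecting NAT .
  sorts List NeList .  subsort NeList < List .
  op nil : -> List [ctor] .
  op __ : List Nat -> NeList [ctor] .
  ops first last : NeList -> Nat .
  op rest : NeList -> List .

  var L : List .  var NEL : NeList .  var N : Nat .

  --- the last element is the one added most recently
  eq last(L N) = N .

  --- the first element is found by walking down to the one-element list
  eq first(nil N) = N .
  eq first(NEL N) = first(NEL) .

  --- rest removes the first element and keeps all the others
  eq rest(nil N) = nil .
  eq rest(NEL N) = rest(NEL) N .
endfm

red first(nil 1 2 3) .   --- reduces to 1
red last(nil 1 2 3) .    --- reduces to 3
red rest(nil 1 2 3) .    --- reduces to nil 2 3
```

Since every argument position of first, last, and rest has sort NeList, no equation is needed for the empty list.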
Without subsorts it is fairly tricky to represent the integers so that each integer
corresponds to exactly one constructor ground term, and vice versa. However, it is easy
to have this desired one-to-one correspondence using the sort hierarchy
sorts Zero NzNat NzNeg Nat Neg Int .
subsorts Zero < Nat Neg < Int .
subsort NzNat < Nat .
subsort NzNeg < Neg .
Zero is the sort for 0; NzNat and NzNeg denote the nonzero natural and negative
numbers, respectively; Nat and Neg all natural, respectively negative, numbers,
including 0; and Int denotes all integers. The sort NzInt for nonzero integers is added
to deal with division:
There are two intuitive ways of constructing the negative numbers. One is to negate
a natural number to get a negative number (so that - s(s(0)) represents −2):
op -_ : NzNat -> NzNeg [ctor prec 15] .
The other option is to use a “predecessor” function p, where p(x) is the predecessor
of x (that is, x − 1), just as s(n) is the successor of n. Such a constructor is declared
op p : Neg -> NzNeg [ctor] .
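The sort hierarchy and the first of the two constructor choices can be put together as in the following sketch. The module name INT-SORTS-SKETCH and the exact subsort declarations for NzInt are assumptions on my part; only the constructors 0, s, and -_ are taken from the text.

```maude
--- Sketch only: INT-SORTS-SKETCH and the placement of NzInt are assumptions.
fmod INT-SORTS-SKETCH is
  sorts Zero NzNat NzNeg Nat Neg NzInt Int .
  subsorts Zero < Nat Neg < Int .
  subsort NzNat < Nat .
  subsort NzNeg < Neg .
  subsorts NzNat NzNeg < NzInt < Int .   --- assumed placement of NzInt

  op 0 : -> Zero [ctor] .
  op s : Nat -> NzNat [ctor] .
  op -_ : NzNat -> NzNeg [ctor prec 15] .
endfm

red 0 .            --- least sort Zero
red s(s(0)) .      --- least sort NzNat
red - s(s(0)) .    --- least sort NzNeg
```

With these constructors, - 0 is not a constructor term of any sort, so 0 is the only representation of zero.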
Our lists have the form nil n1 . . . nk . It is possible to get rid of nil from this list by
saying that a natural number is also a (non-empty) list:
sorts Nat NeList List . subsort Nat < NeList < List .
op nil : -> List [ctor] . op _ _ : NeList Nat -> NeList [ctor] .
8The extra parentheses in the following equations are not needed, due to the precedence on the
operators. They are just added for readability.
2.5.1.4 “Undefined” Values
Exercise 16 Define the integer division function /, the multiplication function, and
the functions in NAT1 (see Exercise 9) on the integers.
Exercise 17 Define the integers and the above functions when the predecessor
function is used as the constructor for the nonzero negative numbers.
Explain why these equations do not define <= for all pairs of integers. Then add the
“missing” equation(s).
9These are binary trees where, for each subtree, the root element of the (sub)tree is greater than or
equal to all elements in its left subtree and is less than or equal to all elements in its right subtree.
2. The subsort NzInt for non-zero integers was defined to avoid problems with
division by 0, so that s(s(0)) / 0 is not a (well-formed) term. A side effect is
that an expression like s(s(0)) / (s(0) - 0) (i.e., 2/(1 − 0)), which denotes
a well-defined mathematical expression, is not a term, since the least sort of
s(0) - 0 is Int. Likewise, we use a subsort NeList for non-empty lists to avoid
problems with first and last of an empty list. However, this means that a
sensible expression like first(rest(nil s(0) s(s(0)))) is not a well-formed
term, since rest(nil s(0) s(s(0))) is not a term of sort NeList.
Membership equational logic [82] is an elegant generalization of order-sorted
specifications that solves these problems by allowing us to define membership axioms
mb t : s . and cmb t : s if cond .
stating that the term t (of some supersort of the sort s) is also a term of sort s
(provided that the condition cond, consisting of a conjunction of memberships t : s and
equalities u = u′, holds in the case of conditional membership axioms). The subsort
SortedList of List can then be defined as follows:
fmod SORTED-LIST-NAT1 is protecting LIST-NAT1 .
sort SortedList . subsort SortedList < List .
var L : List .
cmb L : SortedList if isSorted(L) = true .
endfm
A term nil 0 s(0) is also a term of sort SortedList, whereas nil s(0) 0 is not.
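The effect of the conditional membership axiom can be tried out with the following self-contained sketch. The module name SORTED-LIST-SKETCH, the import of the built-in NAT, and the particular equations for isSorted are assumptions standing in for LIST-NAT1.

```maude
--- Sketch only: SORTED-LIST-SKETCH and these isSorted equations are
--- assumptions; they stand in for the definitions in LIST-NAT1.
fmod SORTED-LIST-SKETCH is protecting NAT .
  sorts List SortedList .  subsort SortedList < List .
  op nil : -> List [ctor] .
  op __ : List Nat -> List [ctor] .
  op isSorted : List -> Bool .

  var L : List .  vars M N : Nat .

  eq isSorted(nil) = true .
  eq isSorted(nil N) = true .
  --- the last two elements must be in order, and so must the rest of the list
  eq isSorted(L M N) = M <= N and isSorted(L M) .

  --- a list has sort SortedList whenever isSorted holds for it
  cmb L : SortedList if isSorted(L) .
endfm

red nil 0 1 .                   --- least sort SortedList
red nil 1 0 .                   --- least sort List
red (nil 1 0) :: SortedList .   --- reduces to false
```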
Considering our second problem, membership equational logic allows expres-
sions like s(s(0)) / (s(0) - 0) and gives them “the benefit of doubt.” Such an
expression does not have a sort like Int but an “error sort” [Int]. The term
s(s(0)) / (s(0) - 0) is evaluated by computing wherever possible, and is reduced
to s(s(0)) / s(0) using the equations for -. This latter term is a well-formed term
of sort Int and the computation can proceed to give the expected result:
Maude> red s(s(0)) / (s(0) - 0) .
result NzNat: s(s(0))
The term s(0) / (s(0) - s(0)) is also given the benefit of doubt and is reduced to
s(0) / 0, which does not have a sort and cannot be further reduced, and is therefore
a term of “error sort” [Int]:
Maude> red s(0) / (s(0) - s(0)) .
result [Int]: s(0) / 0
The formal explanation of this possibility of giving a term “the benefit of doubt”
is that each connected component of the partially ordered set (S, ≤) of sorts has a
kind in membership equational logic. We write [s] for the kind of the connected
component of sort s. Terms which do not have sorts and only have a kind can be
seen as “error terms.” Maude automatically adds a declaration
op f : [s1 ] ... [sn ] -> [s] .
for each declaration op f : s1 ... sn -> s . in the specification. This means that
our Maude specification of the integers (implicitly) also contains a declaration
op _/_ : [Int] [NzInt] -> [Int] .
Since s(0) - s(0) is a term of sort Int, and therefore also of kind [Int], the term
s(0) / (s(0) - s(0)) has kind [Int], due to the implicit declaration above and the
fact that [NzInt] = [Int]. Since s(0) / (s(0) - s(0)) is a “well-kinded” term, it
can be further reduced to the term s(0) / 0 of kind [Int]. This term cannot be
reduced any further, and, although well-kinded, has no sort.
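The same behavior can be observed with Maude's built-in natural numbers. In the sketch below, the module name KIND-SKETCH and the partial function pred are hypothetical; the function sd is the built-in symmetric difference on NAT.

```maude
--- Sketch only: KIND-SKETCH and pred are hypothetical names.
fmod KIND-SKETCH is protecting NAT .
  op pred : NzNat -> Nat .     --- defined only on nonzero numbers
  var N : Nat .
  eq pred(s N) = N .
endfm

--- sd(5, 2) has least sort Nat, so pred(sd(5, 2)) only has the kind [Nat];
--- it is nevertheless reduced, first to pred(3) and then to 2
red pred(sd(5, 2)) .

--- sd(2, 2) reduces to 0, and pred(0) cannot be reduced any further;
--- the result is a term of kind [Nat] without a sort
red pred(sd(2, 2)) .
```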
The representation of the natural numbers and integers we have seen so far is not
very convenient for computing with large numbers. Maude therefore provides built-in
versions of the natural numbers, the integers, the rational numbers, and the IEEE-754
double-precision floating-point numbers, in addition to strings and Boolean values.
These built-in modules provide the standard notation for numbers and strings,
such as 2017, -273, 22/7, and "Maude", and the expected operations on numbers
and strings, efficiently implemented in C++. In contrast to many programming
languages, Maude provides an efficient implementation of unbounded natural numbers,
integers, and rational numbers, instead of only 32-bit or 64-bit numbers.
These built-in data types are defined in the file prelude.maude which is read
when you start Maude. You can modify this file if you feel like redefining the
built-in modules or giving commands which should be executed when Maude starts.
Only the built-in Booleans are included automatically into any user module; to
import Maude’s natural numbers, you need to explicitly import the module NAT into
your module in the usual way. To automatically include NAT into all your modules,
just add the Maude command set include NAT on . to the file prelude.maude.
This section briefly introduces some of Maude’s built-in modules; see the file
prelude.maude for more details about these and other built-in modules.
2.7.1 Booleans
The module BOOL defines the Boolean values and some useful functions:
fmod TRUTH-VALUE is
sort Bool .
op true : -> Bool [ctor special (id-hook SystemTrue)] .
op false : -> Bool [ctor special (id-hook SystemFalse)] .
endfm
The special attribute says that the function is a built-in operator/function
implemented in C++. The attributes assoc and comm mean that the function is,
respectively, associative and commutative; these attributes are explained in Section 2.8.
We ignore the gather attribute (see [21] for an explanation of this parsing issue).
The poly attribute states that the corresponding arguments (of sort Universal) may
have any sort. The operator if_then_else_fi behaves as expected, x == y equals
true if and only if x and y are equal (that is, reduce to the same term), and conversely
for the inequality operator.
A condition b = true in an equation can be written just b:
ceq M monus N = 0 if M <= N .
Finally, t :: s is a term of sort Bool which is true if and only if the term t has sort s.
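Both the abbreviated condition and the sort test t :: s can be tried out with the built-in natural numbers. The module name MONUS-SKETCH and the second equation are assumptions added so that the example is complete; sd is the built-in difference function on NAT.

```maude
--- Sketch only: MONUS-SKETCH and the second equation are assumptions.
fmod MONUS-SKETCH is protecting NAT .
  op _monus_ : Nat Nat -> Nat .
  vars M N : Nat .
  ceq M monus N = 0 if M <= N .           --- short for: if M <= N = true
  ceq M monus N = sd(M, N) if M > N .
endfm

red 3 monus 5 .                --- reduces to 0
red 7 monus 2 .                --- reduces to 5
red (7 monus 2) :: NzNat .     --- true: the result 5 has sort NzNat
red (3 monus 5) :: NzNat .     --- false: the result 0 does not
```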
Maude provides the following module for arbitrarily large natural numbers, whose
implementation uses the GNU GMP library [48].11
11 Multiple declarations of the same non-constructor function are usually not needed, since
equations will reduce a term to a constructor term of the right sort. However, in built-in modules,
operators such as + have multiple declarations, since they are built-in functions not defined by equations.
2.7 Built-in Data Types 37
The constructors for Nat are 0 and s, so the natural numbers are represented by
the terms 0, s 0, s s 0, . . . . For convenience, we can also write 0, 1, 2, . . . :
Maude> red s s 0 + s s s 0 .
result NzNat: 5
Maude> red 1234567 * 89 .
result NzNat: 109876463
There is no subtraction function on the natural numbers (why?). Instead, the
function sd denotes the (symmetric) difference between two numbers.
Example 2.7. The factorial function can be defined by induction on the constructors:
fmod FACTORIAL is protecting NAT .
op _! : Nat -> Nat .
var N : Nat .
eq 0 ! = 1 . eq (s N) ! = s N * (N !) .
endfm
or using the “standard” natural numbers and replacing the above equations with
eq N ! = if N == 0 then 1 else N * (sd(N, 1) !) fi . ♦
The function quo defines division, rem the remainder function, ^ exponentiation
(m ^ n = $m^n$), gcd denotes the greatest common divisor, lcm the least common
multiple, and <, <=, >, and >= are the usual comparison operators. The module NAT also
has bit manipulating functions such as bitwise and (&), bitwise or (|), bitwise xor
(xor), right shift (>>), and left shift (<<).
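A few reductions illustrate these functions. The wrapper module NAT-OPS-SKETCH is a hypothetical name and only serves to make sure that the commands are executed in a module that imports NAT.

```maude
--- Sketch only: NAT-OPS-SKETCH is a hypothetical wrapper module.
fmod NAT-OPS-SKETCH is protecting NAT .
endfm

red 17 quo 5 .      --- 3
red 17 rem 5 .      --- 2
red 2 ^ 10 .        --- 1024
red gcd(12, 18) .   --- 6
red lcm(4, 6) .     --- 12
red sd(3, 10) .     --- 7, the same as sd(10, 3)
red 5 & 3 .         --- 1  (101 and 011 is 001)
red 5 | 3 .         --- 7  (101 or 011 is 111)
red 5 xor 3 .       --- 6  (101 xor 011 is 110)
red 1 << 4 .        --- 16
red 16 >> 2 .       --- 4
```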
Subsort overloaded operators must have the same attributes, except for ctor. The
attribute ditto stands for all attributes except ctor in previous declarations of the
same (subsort overloaded) function symbol.
2.7.3 Integers
The integers are constructed from the natural numbers using the constructor -_, so
that negative numbers can be written as - s 0, - 2009, . . . , and also as -1, -2009,
. . . . The built-in efficient implementation of (unbounded) integers is given in the
following module (where many functions are not shown):
fmod INT is protecting NAT .
sorts NzInt Int . subsorts NzNat < NzInt Nat < Int .
op -_ : NzNat -> NzInt [ctor special (...)] .
op -_ : NzInt -> NzInt [ditto] .
op -_ : Int -> Int [ditto] .
op _+_ : Int Int -> Int [assoc comm prec 33 special (...)] .
op _-_ : Int Int -> Int [prec 33 gather (E e) special (...)] .
op abs : Int -> Nat [...] .
...
endfm
(abs gives the absolute value of a number.) The function -_ is a constructor only on
NzNat, and is a non-constructor on NzInt and Int.
The rational numbers are defined in the module RAT, which defines the sorts NzRat
(non-zero rational numbers), PosRat (non-zero positive rational numbers), and Rat
(all rational numbers), with all the expected functions:
fmod RAT is protecting INT .
sorts PosRat NzRat Rat .
subsorts NzInt < NzRat Int < Rat .
subsorts NzNat < PosRat < NzRat .
op _/_ : NzInt NzNat -> NzRat [ctor prec 31 ... special (...)] .
op _/_ : NzNat NzNat -> PosRat [ctor ditto] .
op _/_ : PosRat PosRat -> PosRat [ditto] .
op _/_ : NzRat NzRat -> NzRat [ditto] .
op _/_ : Rat NzRat -> Rat [ditto] .
...
ops trunc floor : PosRat -> Nat .
ops trunc floor ceiling : Rat -> Int .
op ceiling : PosRat -> NzNat .
op frac : Rat -> Rat .
The built-in module FLOAT implements 64-bit IEEE-754 double-precision floating-point
numbers with all the expected functions such as sqrt (for square root), the
trigonometric functions, the logarithm function, and so on.
fmod FLOAT is protecting BOOL .
sorts FiniteFloat Float . subsort FiniteFloat < Float .
op <Floats> : -> FiniteFloat [special (id-hook FloatSymbol)] .
op <Floats> : -> Float [ditto] .
...
op sqrt : Float ~> Float [...] .
op log : Float ~> Float [...] .
op sin : Float -> Float [...] . op cos : Float -> Float [...] .
op asin : Float ~> Float [...] . op acos : Float ~> Float [...] .
...
endfm
The syntax <Floats> means that the constructors are built-in as a set of constants
such as 1.0, -9.87654321, and -1.23e+14 (for $-1.23 \cdot 10^{14}$). The sort Float also
contains two constants Infinity and -Infinity that denote out-of-range values:
Maude> red 3.45e+223 * 2.99e+210 .
result Float: Infinity
2.7.6 Strings
The built-in Maude module STRING defines the sort String of strings of the form
"this is a string". Strings of length 1 are constants of a subsort Char.
fmod STRING is protecting NAT .
sorts String Char FindResult .
subsort Char < String . subsort Nat < FindResult .
op <Strings> : -> Char [special (id-hook StringSymbol)] .
op <Strings> : -> String [ditto] .
op notFound : -> FindResult [ctor] .
op ascii : Char -> Nat [...] . op char : Nat ~> Char [...] .
op _+_ : String String -> String [...] .
op length : String -> Nat [...] .
op substr : String Nat Nat -> String [...] .
op find : String String Nat -> FindResult [...] .
op rfind : String String Nat -> FindResult [...] .
op _<=_ : String String -> Bool [...] .
...
endfm
The function ascii gives the ASCII value of a character, char does the opposite,
+ denotes string concatenation, and length returns the length of a string.
substr(s, p, l) returns the substring of s which starts at character p + 1 and is l
characters long. find(s1 , s2 , p) finds the starting position (minus 1) of the substring
s2 in s1 , starting at character number p + 1 in s1 (and returns notFound if s2 is not
such a substring of s1 ). rfind does the same, but starts looking “from the right.”
The comparison operators <, <=, >, and >= compare strings lexicographically.
The module CONVERSION defines functions for converting between numbers and
strings, and between rational numbers and floating-point numbers. For example,
string(r, n) takes a rational number r and a base n (between 2 and 36), and displays
the number as a String in the given base. That is, string(123,10) equals "123"
and string(5,2) equals "101". The function rat does the opposite.
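The string functions described above can be explored with a handful of reductions. The wrapper module STRING-SKETCH is a hypothetical name; it imports CONVERSION so that string and rat are available along with the functions from STRING.

```maude
--- Sketch only: STRING-SKETCH is a hypothetical wrapper module.
fmod STRING-SKETCH is protecting CONVERSION .
endfm

red "ab" + "cd" .               --- "abcd"
red length("Maude") .           --- 5
red substr("Maude", 1, 3) .     --- "aud": starts at character 2, 3 characters long
red find("Maude", "ud", 0) .    --- 2: "ud" starts at character 3
red find("Maude", "x", 0) .     --- notFound
red ascii("A") .                --- 65
red char(97) .                  --- "a"
red "abc" < "abd" .             --- true
red string(5, 2) .              --- "101"
red rat("101", 2) .             --- 5
```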
The Maude module RANDOM provides a function random, where random(k) gives the
k-th “pseudo-random” number as a number between 0 and $2^{32} - 1$. Since random is
a function, random(k) gives the same result for the same k.
fmod RANDOM is protecting NAT .
op random : Nat -> Nat [special (...)] .
endfm
To restrict the range of the “random” number, e.g., to a number between 1 and
100, we can use the expression (random(k) rem 100) + 1:
Maude> red random(1) .
result NzNat: 2546248239
Maude> red (random(2) rem 100) + 1 .
result NzNat: 34
Exercise 20 Define a function isPrime : NzNat -> Bool which returns true
if and only if its argument is a prime number (that is, a number which is not divisible
by any number except 1 and itself). Test your specification on 14091 (not a prime),
2 (prime), 31 (prime), and 135727 (?).
Exercise 21 Explain what the functions trunc, floor, ceiling, and frac in the
module RAT are supposed to compute.
Exercise 22 American sports scores have the form "49ers 39 Giants 38", while
Europeans prefer the notation "49ers - Giants 39-38". Define a function
europify : String -> String which transforms a score from American format
to European format. You may assume that there are no blanks in the name of a team.
Exercise 23 Define a function binary : Nat -> Nat which gives the “binary”
value of a natural number, so that e.g. binary(7) equals the number 111.
Exercise 24 Define a sort for Roman numerals (lists of I, V, X, L, C, D, and M), and
functions roman and decimal that convert between Roman and decimal numbers
smaller than 3500.
This section defines some equational attributes, such as associativity and
commutativity, that enable us to define lists and (multi-)sets in a nice way, and that can make
the definition of certain functions more elegant.
The (multi)sets {a, b} and {b, a} are the same, and therefore their representations
should be equivalent. More generally, it is sometimes needed or useful to define a
function f (such as, e.g., set union) to be commutative:
eq f(X, Y) = f(Y, X) .
However, this equation leads to infinite loops $f(x, y) \rightsquigarrow f(y, x) \rightsquigarrow f(x, y) \rightsquigarrow \cdots$. The
Maude solution to having both commutativity and termination is to declare that “ f
is commutative,” so that Maude always “keeps in mind” that f is commutative. We
can declare that a function f is commutative by giving it an attribute comm:
fmod COMM1 is
sort s . op f : s s -> s [comm] . ops a b c : -> s .
eq f(a,b) = b .
endfm
$T_{\Sigma,C} = \{[t]_C \mid t \in T_\Sigma\}$
Notation: To avoid too many symbols, I most often write $t$ for $[t]_C$.
Example 2.8. A function minimum which returns the smallest of two integers can
be elegantly defined by a single equation:
fmod MIN1 is protecting INT .
op minimum : Int Int -> Int [comm] .
vars I J : Int .
ceq minimum(I, J) = I if I <= J .
endfm ♦
That is, any term u of sort s is considered to be identical to f(u,t) and f(t,u). For
example, in
sort s .
ops a b e : -> s [ctor] . op f : s s -> s [id: e] .
vars X Y : s . eq f(X,Y) = a .
the term b reduces to a, since b is the same as f(b,e). However, be careful with
termination; even the seemingly terminating equation above is nonterminating, since
it has an infinite computation $[a]_I = [f(a,e)]_I \rightsquigarrow [a]_I = [f(a,e)]_I \rightsquigarrow [a]_I = \cdots$.
Section 2.4.3.1 defines lists using a constructor _ _ : List Nat -> List and a
constant nil. All lists have the form $(\ldots(((\texttt{nil}\ n_1)\ n_2)\ n_3)\ldots)\ n_k$ (even though the
parentheses may be omitted since there is only one way to parse a term).
However, it is more natural to view lists as “flat” structures; this suggests the following
representation of lists, in which an integer is also a list (of one element):
sort List . subsort Int < List .
op nil : -> List [ctor] .
op _ _ : List List -> List [ctor assoc] .
Both 4 and 7 are terms of sort List, since Int is a subsort of List. These two
lists can be concatenated using the concatenation operator _ _, so that 4 7 is also
a term of sort List. This list can be concatenated with the list 11, which gives a term
(4 7) 11, which can be concatenated with the list 99 to get the list ((4 7) 11) 99.
Or, the two lists 4 7 and 11 99 can be concatenated into (4 7) (11 99). Since
_ _ is declared to be associative, these two lists are the same list, and we can ignore
parentheses: 4 7 11 99.
Unfortunately, since nil is a term of sort List, also nil 4 and 7 nil are Lists,
and so is their concatenation nil 4 7 nil. The good thing is that we can
“eliminate” these nils by declaring _ _ to have nil as its identity element:
op _ _ : List List -> List [ctor assoc id: nil] .
nil 4 and 4 are now exactly the same list (i.e., $[\texttt{nil 4}]_{AI} = [\texttt{4}]_{AI}$), and so are
therefore nil 4 7 nil and 4 7. This gives the desired one-to-one correspondence
between (equivalence classes of) constructor ground terms modulo associativity and
identity of the list concatenation constructor and the set of all lists of integers.
A list is now either the empty list nil or has the form i l, for i an integer and
l a list (since the one-element list i is identical to i nil) or, equivalently, the form
l i. This is reflected in the definitions below, which are much simpler than the
corresponding ones in Section 2.4.3.1:
fmod LIST-INT is protecting INT .
sorts List NeList . subsorts Int < NeList < List .
op length : List -> Nat . ops first last : NeList -> Int .
op empty? : List -> Bool . op rest : NeList -> List .
op reverse : List -> List . op _occursIn_ : Int List -> Bool .
op max : NeList -> Int . op isSorted : List -> Bool .
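To show how the “i l” and “l i” forms of a list are used in equations, here is a self-contained sketch covering four of the declared functions. The module name LIST-INT-SKETCH, the constructor declarations for NeList (modeled on the style of Maude's predefined LIST module), and the equations are assumptions and not necessarily the definitions of LIST-INT.

```maude
--- Sketch only: LIST-INT-SKETCH, the NeList constructor declarations, and
--- the equations are assumptions, not necessarily those of LIST-INT.
fmod LIST-INT-SKETCH is protecting INT .
  sorts List NeList .  subsorts Int < NeList < List .
  op nil : -> List [ctor] .
  op __ : List List -> List [ctor assoc id: nil] .
  op __ : NeList List -> NeList [ctor ditto] .
  op __ : List NeList -> NeList [ctor ditto] .

  op length : List -> Nat .
  ops first last : NeList -> Int .
  op _occursIn_ : Int List -> Bool .

  vars I J : Int .  var L : List .

  eq length(nil) = 0 .
  eq length(I L) = 1 + length(L) .    --- also covers the one-element list I

  eq first(I L) = I .                 --- a non-empty list seen as "i l"
  eq last(L I) = I .                  --- the same list seen as "l i"

  eq I occursIn nil = false .
  eq I occursIn (J L) = (I == J) or (I occursIn L) .
endfm

red nil 4 7 nil .                 --- reduces to 4 7
red length(4 7 11 99) .           --- reduces to 4
red first(4 7 11 99) .            --- reduces to 4
red last(4 7 11 99) .             --- reduces to 99
red 11 occursIn (4 7 11 99) .     --- reduces to true
```

Because nil is the identity element, the equations for the form “i l” apply to a one-element list i as well, with l matched to nil.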
A set is essentially a multiset where the multiplicity of elements does not matter.
Sets of integers can therefore be defined as multisets of integers with the extra axiom
eq I I = I . (for I a variable of sort Int) which removes duplicates.13
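A minimal sketch of such a set data type is given below. The module name SET-INT-SKETCH and the names Set and none are assumptions; the union operator is declared associative and commutative with identity none, and the extra equation removes duplicates.

```maude
--- Sketch only: SET-INT-SKETCH, Set, and none are assumed names.
fmod SET-INT-SKETCH is protecting INT .
  sort Set .  subsort Int < Set .
  op none : -> Set [ctor] .
  op __ : Set Set -> Set [ctor assoc comm id: none] .
  var I : Int .
  eq I I = I .       --- two occurrences of the same element collapse into one
endfm

red 0 1 2 1 .                   --- a set with the three elements 0, 1, and 2
red (0 1 2 1) == (2 0 1) .      --- true: order and multiplicity do not matter
```

The equation I I = I is applied modulo associativity and commutativity, so the two occurrences of an element do not have to be adjacent in the term.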
Exercise 25 For each of the (equivalence classes of the) terms f(f(b,a),a) and
f(b,b) and f(f(a,b),f(b,a)) and f(c,a), compute its normal form in COMM1
“by hand” and using Maude’s red command.
Exercise 26 Complete the module LIST-INT by defining the functions empty?,
rest, reverse, max, and isSorted.
such that comesBeforeIn(i, j, l) is true if and only if there are elements i and j
in the list l, and where the first occurrence of i comes before the first occurrence of
j in l; and where l1 >lex l2 is true if l1 is lexicographically greater than l2 (see
Exercise 11 for the definition of lexicographic comparison).
Exercise 28 1. Define a sort String for lists of characters a, b, . . . , z.
2. Define a function isPal : String -> Bool so that isPal(s) returns true if
and only if s is a palindrome, that is, reads the same backwards and forwards.
For example, a n n a and b o b are palindromes, whereas p e t e r is not.
3. Define a function _prefixOf_ : String String -> Bool that checks
whether the first argument is a prefix of the second argument.
4. Define a function _substringOf_ : String String -> Bool that checks
whether the first argument is a substring of the second argument.
13 Maude has an idempotency attribute, but currently it cannot be used with the assoc attribute.
5. Define a supersort Pattern of String for strings that may contain the symbol
‘?’, which is a “wild card” that matches any single character.
6. Define functions _prefixOf_ : Pattern String -> Bool and _substringOf_
: Pattern String -> Bool that check whether the first argument “matches”
a prefix, respectively, a substring, of the second argument. For example,
b ? d e ? g substringOf a b c d e f g h should return true.
7. (Trickier?) Repeat the exercises above for patterns that may contain the symbol
‘*’ that can stand for any sequence of characters.
Exercise 30 Define the functions size, mult, max, empty?, and the multiset
comparison operator >mul in the module MSET-INT.
Exercise 31 Show that for any multiset m0 over the natural numbers, there is no
infinite sequence
m0 > m1 > m2 > m3 > . . .
of multisets m0 , m1 , m2 , m3 , . . . such that each mi is greater than mi+1 .
Exercise 32 Assume that we have already defined two sorts Obj and Msg. Define a
sort Mset-ObjMsg whose elements are multisets of Obj and Msg elements (that is, a
multiset may contain both Obj and Msg elements).
Exercise 33 Define a data type of sets of integers with functions in (does the
given number belong to the set?), delete (remove an element from a set), card
(the cardinality (number of distinct elements) of a set), setMinus (set difference),
and intersect (the intersection of two sets). Make sure that your specification is
confluent. delete(1, 0 1 2 1) should give 0 2 no matter how the equations are
applied. Similarly, the cardinality of the set 0 1 2 1 is 3.
2.9 Examples
This section shows how the sorting algorithms quicksort and mergesort, as well as
solutions to classic NP-complete problems like subset sum and Hamiltonian circuit,
can be formally specified in Maude. Such a specification has a number of benefits:
• In contrast to prose and pseudo-code (and even an imperative program), a Maude
specification gives a precise, unambiguous specification of the algorithm.
• The specification is also at the same time a program, defined in a simpler and
less error-prone way than, e.g., a Java implementation.14
• It is possible to reason mathematically about the Maude specification; it is also
much easier to reason informally about the correctness of the Maude program
than about the Java program, since we can focus on checking the correctness of
single equations, instead of having to reason about the entire program.
2.9.1.1 Quicksort
14 Is it i=0 or i=1? j=i or j=i+1? i++ or ++i? Until j>k or j>=k? A -1 or +1 missing
somewhere?
eq smallerElements(nil, N) = nil .
eq smallerElements(N L, M) = if N < M then
(N smallerElements(L, M))
else smallerElements(L, M) fi .
eq equalElements(nil, N) = nil .
eq equalElements(N L, M) = if N == M then (N equalElements(L, M))
else equalElements(L, M) fi .
eq greaterElements(nil, N) = nil .
eq greaterElements(N L, M) = if N > M then
(N greaterElements(L, M))
else greaterElements(L, M) fi .
endfm
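For experimentation, the three helper functions above can be embedded in a complete module as in the following sketch. The module name QUICKSORT-SKETCH, the sort and operator declarations, and the two quicksort equations (which use the first element as the pivot) are assumptions; only the helper equations are taken from the text.

```maude
--- Sketch only: the module name, the declarations, and the two quicksort
--- equations are assumptions; the helper equations are those shown above.
fmod QUICKSORT-SKETCH is protecting NAT .
  sort List .  subsort Nat < List .
  op nil : -> List [ctor] .
  op __ : List List -> List [ctor assoc id: nil] .
  op quicksort : List -> List .
  ops smallerElements equalElements greaterElements : List Nat -> List .

  var L : List .  vars M N : Nat .

  --- the first element N is used as the pivot
  eq quicksort(nil) = nil .
  eq quicksort(N L)
   = quicksort(smallerElements(L, N)) N equalElements(L, N)
     quicksort(greaterElements(L, N)) .

  eq smallerElements(nil, N) = nil .
  eq smallerElements(N L, M)
   = if N < M then (N smallerElements(L, M)) else smallerElements(L, M) fi .
  eq equalElements(nil, N) = nil .
  eq equalElements(N L, M)
   = if N == M then (N equalElements(L, M)) else equalElements(L, M) fi .
  eq greaterElements(nil, N) = nil .
  eq greaterElements(N L, M)
   = if N > M then (N greaterElements(L, M)) else greaterElements(L, M) fi .
endfm

red quicksort(4 1 3 1 2) .     --- reduces to 1 1 2 3 4
```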
2.9.1.2 Mergesort
eq mergeSort(nil) = nil .
eq mergeSort(I) = I .
ceq mergeSort(NEL NEL') = merge(mergeSort(NEL), mergeSort(NEL'))
if length(NEL) == length(NEL') or length(NEL) == s length(NEL') .
eq merge(nil, L) = L .
ceq merge(I L, J L') = I merge(L, J L') if I <= J .
endfm
The raison d’être for mergesort is that its execution time is O(n log n). The above
specification may be less efficient, since splitting a list into two halves is done by
matching. The usefulness of this kind of specification is that it is a precise
description of a complex algorithm, and that it is at the same time a prototype of your
algorithm that can be developed quickly to test and further analyze your algorithm
before a detailed and efficient algorithm is implemented in all its glory.
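The mergesort equations can be prototyped in a complete module along the following lines. The module name MERGESORT-SKETCH, the sort and operator declarations, the equations for length, and the comm attribute on merge are assumptions made so that the sketch is self-contained.

```maude
--- Sketch only: the module name, the declarations, the length equations, and
--- the comm attribute on merge are assumptions.
fmod MERGESORT-SKETCH is protecting NAT .
  sorts NeList List .  subsorts Nat < NeList < List .
  op nil : -> List [ctor] .
  op __ : List List -> List [ctor assoc id: nil] .
  op __ : NeList List -> NeList [ctor ditto] .
  op __ : List NeList -> NeList [ctor ditto] .

  op length : List -> Nat .
  op mergeSort : List -> List .
  op merge : List List -> List [comm] .

  vars I J : Nat .  vars L L' : List .  vars NEL NEL' : NeList .

  eq length(nil) = 0 .
  eq length(I L) = s length(L) .

  eq mergeSort(nil) = nil .
  eq mergeSort(I) = I .
  --- matching splits the list into two halves of (almost) equal length
  ceq mergeSort(NEL NEL') = merge(mergeSort(NEL), mergeSort(NEL'))
    if length(NEL) == length(NEL') or length(NEL) == s length(NEL') .

  --- since merge is commutative, these two equations cover all cases
  eq merge(nil, L) = L .
  ceq merge(I L, J L') = I merge(L, J L') if I <= J .
endfm

red mergeSort(4 1 3 2) .       --- reduces to 1 2 3 4
red mergeSort(5 2 9 2 7) .     --- reduces to 2 2 5 7 9
```

Declaring merge commutative means that the case where the first element of the second list is the smaller one does not need an equation of its own.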
15 There are $2^n$ different subsets of an n-element set, and n! different permutations of n elements.
16The complexity of an algorithm can be precisely defined in a machine-independent way as the
number of steps performed by a Turing machine implementing the algorithm.
The following equations take care of the two base cases: (i) there are no (remaining)
elements to choose from, and the (remaining) desired sum is a positive number NZN;
and (ii) the desired (remaining) sum is NZN and there is a number NZN in the multiset:
vars NZN NZN1 NZN2 : NzNat . var S : MSet .
eq subsetSum(none, NZN) = false .
eq subsetSum(NZN S, NZN) = true .
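In the recursive case, we are left with some numbers NZN1 S and the desired sum
NZN2. From those elements, there is a subset with sum NZN2 if and only if either
(i) NZN1 is not in the chosen subset and there is a subset of S with sum NZN2, or
(ii) NZN1 is in the chosen subset, NZN1 is smaller than NZN2, and there is a subset
of S with sum NZN2 − NZN1.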
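One way to turn the two base cases and the recursive case into an executable module is sketched below. The module name SUBSET-SUM-SKETCH, the declarations of MSet and none, and the recursive equation with its owise attribute are assumptions; the two base-case equations are those shown above.

```maude
--- Sketch only: the module name, the MSet declarations, and the recursive
--- equation are assumptions; the two base cases are those shown above.
fmod SUBSET-SUM-SKETCH is protecting NAT .
  sort MSet .  subsort NzNat < MSet .
  op none : -> MSet [ctor] .
  op __ : MSet MSet -> MSet [ctor assoc comm id: none] .
  op subsetSum : MSet Nat -> Bool .

  vars NZN NZN1 NZN2 : NzNat .  var S : MSet .

  eq subsetSum(none, NZN) = false .
  eq subsetSum(NZN S, NZN) = true .

  --- used only when no element equals the desired sum: either NZN1 is left
  --- out, or NZN1 is included and the rest must add up to NZN2 - NZN1
  eq subsetSum(NZN1 S, NZN2)
   = subsetSum(S, NZN2)
     or (NZN1 < NZN2 and subsetSum(S, sd(NZN2, NZN1))) [owise] .
endfm

red subsetSum(3 5 7, 12) .     --- true  (5 + 7)
red subsetSum(3 5 7, 6) .      --- false
```

Both alternatives are explored for every element, so the number of reduction steps can grow exponentially in the size of the multiset.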
Since Hamiltonian Circuit is a graph problem, this section first shows one way of
representing graphs in Maude.
The following module represents a graph as a set of nodes node: n nbs: nbs,
where n is the name of the node and nbs is the (names of the) neighbors of n. An
undirected edge between nodes n1 and n2 must be represented twice: n2 must be in
the set of neighbors of node n1 , and vice versa:
fmod GRAPH is
sort NodeId . --- application-specific node names
sort NodeIdSet . subsort NodeId < NodeIdSet .
op none : -> NodeIdSet [ctor] .
op _ _ : NodeIdSet NodeIdSet -> NodeIdSet
[ctor assoc comm id: none] .
sort Node .
op node:_nbs:_ : NodeId NodeIdSet -> Node [ctor] .
The brute-force way to solve the Hamiltonian Circuit problem goes as follows:
1. Select any node as the starting node, and also as the “current node.”
2. For each neighbor n of the “current node”: either that neighbor n is the next
node in the circuit, in which case n becomes the “current” node, or the neighbor
n is not the next node in the circuit.
3. When all nodes are included in the path, check whether there is an edge from the
last (“current”) node to the starting node. If so, there is a Hamiltonian circuit.
The function hamiltonianCircuit : Graph -> Bool that checks whether an
undirected graph has a Hamiltonian Circuit can then be defined as follows in Maude;
this solution assumes that a graph always has at least three nodes.
fmod HAMILTONIAN-CIRCUIT is including GRAPH .
op hamiltonianCircuit : Graph -> Bool .
from startNode, this path includes all the nodes that are not in remainingNodes, the
“current” node (the last node in the path we are building) has neighbors currNbs,
and remainingNodes are the nodes not yet in the path.
The following equation takes an arbitrary node N as the starting node:
vars N N1 : NodeId . vars NBS NBS2 : NodeIdSet .
var NS : Graph . var NODE : Node .
In the following equation, the “current” node has (remaining) neighbors N1 NBS,
and there is a node N1 that has not yet been visited in the path. There are now two
choices: either N1 is the next node in the path, in which case we remove the node
N1 from the remaining nodes and update the “current neighbors” to N1’s neighbors
NBS2, or N1 is not the next node in the path, in which case we “forget” N1 from the
current neighbors and try the other neighbors NBS:
eq hCircuit(N, N1 NBS, (node: N1 nbs: NBS2) ; NS)
= hCircuit(N, NBS2, NS) --- try N1 as the next node
or hCircuit(N, NBS, (node: N1 nbs: NBS2) ; NS) . --- or not
If there are nodes yet to be visited but there is no (remaining) edge from the
current node, then the current path cannot be extended into a Hamiltonian circuit:
eq hCircuit(N, none, NODE ; NS) = false .
If there are no unvisited nodes, the current path can be extended to a Hamiltonian
circuit if and only if the starting node is a neighbor of the “current” (last) node:
eq hCircuit(N, NBS, emptyGraph) = (N in NBS) .
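The fragments above can be assembled into one loadable module. The following is only a sketch under stated assumptions: the declarations of Graph, emptyGraph, _;_, hCircuit and _in_ are reconstructed from how they are used in the equations and are not taken from the original module, the node names a, b, c are hypothetical, and the final owise equation (the owise attribute is explained in Section 2.10) is an addition of this sketch that covers the case where every remaining neighbor of the current node has already been visited.

```maude
fmod HC-SKETCH is
  protecting BOOL .
  sorts NodeId NodeIdSet Node Graph .
  subsort NodeId < NodeIdSet .
  subsort Node < Graph .
  ops a b c : -> NodeId [ctor] .            --- hypothetical node names
  op none : -> NodeIdSet [ctor] .
  op __ : NodeIdSet NodeIdSet -> NodeIdSet [ctor assoc comm id: none] .
  op node:_nbs:_ : NodeId NodeIdSet -> Node [ctor] .
  op emptyGraph : -> Graph [ctor] .         --- assumed constructor
  op _;_ : Graph Graph -> Graph [ctor assoc comm id: emptyGraph] .  --- assumed
  op _in_ : NodeId NodeIdSet -> Bool .      --- assumed membership test
  op hamiltonianCircuit : Graph -> Bool .
  op hCircuit : NodeId NodeIdSet Graph -> Bool .

  vars N N1 : NodeId .   vars NBS NBS2 : NodeIdSet .
  var NS : Graph .       var NODE : Node .

  eq N in N NBS = true .
  eq N in NBS = false [owise] .

  --- take an arbitrary node N as the starting node
  eq hamiltonianCircuit((node: N nbs: NBS) ; NS) = hCircuit(N, NBS, NS) .
  --- either N1 is the next node on the path, or it is not
  eq hCircuit(N, N1 NBS, (node: N1 nbs: NBS2) ; NS)
   = hCircuit(N, NBS2, NS)
     or hCircuit(N, NBS, (node: N1 nbs: NBS2) ; NS) .
  eq hCircuit(N, none, NODE ; NS) = false .
  eq hCircuit(N, NBS, emptyGraph) = (N in NBS) .
  --- sketch-only addition: only already-visited neighbors are left
  eq hCircuit(N, NBS, NS) = false [owise] .
endfm

--- a triangle has a Hamiltonian circuit
red hamiltonianCircuit((node: a nbs: b c) ; (node: b nbs: a c) ; (node: c nbs: a b)) .
--- a path a - b - c has none
red hamiltonianCircuit((node: a nbs: b) ; (node: b nbs: a c) ; (node: c nbs: b)) .
```

The first reduction should give true and the second false. Without the owise equation, a term such as hCircuit(a, a, (node: c nbs: b)) would match none of the three equations shown in the text and would be left unreduced.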
Exercise 35 Define a version of quicksort which, for lists of at least two elements,
will look at the first and the last element in the list, and choose as pivot element the
number (first + last)/2. (It is possible that such a number is not an element in the list, but
that does not matter.) Explain also why your specification is terminating.
Exercise 37 Specify the insertion sort algorithm in Maude. Insertion sort works as
when you get some cards and have to sort them: you take the (unsorted) cards one
by one, and put them into the right place in your hand, which always remains sorted.
Exercise 38 In the Unbounded Subset Sum problem we can use each number in the
given multiset as many times as we want in order to achieve the desired sum. Specify
a function op unboundedSubsetSum : MSet NzNat -> Bool which solves this
problem. Is it easy to see that the new problem is NP-complete?
Exercise 39 Consider the Traveling Salesman problem, where the cost of a trip
between two cities is given by a function cost : City City -> NzNat [comm].
Exercise 125 shows an example of such a cost function.
1. Specify a function travelingSalesman : Cities NzNat -> Bool so that
travelingSalesman(cities, budget) is true if and only if there is a tour visiting
all cities in cities (once) that does not cost more than budget.
2. Show that Traveling Salesman is NP-complete by showing that a solution for it
can easily be used to solve one of the other NP-complete problems in Section 2.9.2.
3. It is sometimes more expensive to travel directly between A and B than to travel
from A to C and then to B. Specify a solution to the traveling salesman problem
when the salesperson may visit a city more than once if needed.
Exercise 40 Explain how you can use a solution to the Subgraph Isomorphism
problem to solve two other NP-complete problems (which ones?) in Section 2.9.2.
Maude has a number of useful features that will not be mentioned elsewhere in this
book; the reader is referred to the Maude book [21] or the Maude manual for details.
Instead of defining a data type, such as lists, from scratch for each kind of list (lists of
integers, lists of strings, lists of lists of . . . , and so on), we can define parameterized
modules. Assume that we want to define a generic mergesort function that can sort
all kinds of lists, as long as we can compare the elements in the list. A parameter
for this generic function must have a sort for the elements and a total order on those
elements. The “formal parameter” for this function is defined as the theory
fth TOTAL-ORDER is protecting BOOL .
sort Element .
op _le_ : Element Element -> Bool .
vars E E1 E2 E3 : Element .
--- reflexivity, anti-symmetry, transitivity, and totality:
eq E le E = true [nonexec] .
ceq E1 = E2 if (E1 le E2) and (E2 le E1) [nonexec] .
ceq E1 le E3 = true if (E1 le E2) and (E2 le E3) [nonexec] .
eq (E1 le E2) or (E2 le E1) = true [nonexec] .
endfth
This theory defines an “interface” or formal parameter TOTAL-ORDER that any ac-
tual parameter must “satisfy.” That is, an actual parameter must interpret the sort
Element and the function symbol le (the comparison operator), so that the four
equations for a total order are satisfied.
The parametric mergesort module is then given as follows:
fmod PARAM-SORT{X :: TOTAL-ORDER} is protecting INT .
sorts List NeList . subsort X$Element < NeList < List .
op nil : -> List [ctor] .
op _ _ : List List -> List [ctor assoc id: nil] .
op _ _ : NeList NeList -> NeList [ctor assoc id: nil] .
The module defines lists of the sort Element of the parameter X. The rest is our
mergesort function, with the comparison operator le used to compare elements.
Views define how the actual parameter module interprets the formal parameter
module. A view maps the sorts (resp. operators) of the formal parameter to sorts
(resp. operators or even expressions) in the actual parameter. For example, the fol-
lowing view Int<= says that we want to use INT as the actual parameter, with the
sort Element mapped to the sort Int, and with the function le mapped to <=:
view Int<= from TOTAL-ORDER to INT is
sort Element to Int .
op _le_ to _<=_ .
endv
The following module SORT-INT<= then defines lists of integers and the mergesort
function w.r.t. the comparison operator <=:
Maude> fmod SORT-INT<= is protecting PARAM-SORT{Int<=} . endfm
Maude> red mergeSort(5 2 11 23 -4 8) .
result NeList: -4 2 5 8 11 23
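To illustrate that only the view has to change in order to obtain a different ordering, the following self-contained sketch sorts in descending order. The names TOTAL-ORDER-SKETCH, PARAM-SORT-SKETCH, IntDesc and SORT-INT-DESC are hypothetical, the theory omits the four nonexec axioms for brevity, and the body of the parametric module (length, mergeSort and merge, with merge assumed to be declared comm) is reconstructed from the mergesort equations in Section 2.9.1.2 rather than copied from the original module.

```maude
fth TOTAL-ORDER-SKETCH is
  protecting BOOL .
  sort Element .
  op _le_ : Element Element -> Bool .   --- total-order axioms omitted in this sketch
endfth

fmod PARAM-SORT-SKETCH{X :: TOTAL-ORDER-SKETCH} is
  protecting NAT .
  sorts List NeList .
  subsorts X$Element < NeList < List .
  op nil : -> List [ctor] .
  op __ : List List -> List [ctor assoc id: nil] .
  op __ : NeList NeList -> NeList [ctor assoc id: nil] .
  op length : List -> Nat .
  op mergeSort : List -> List .
  op merge : List List -> List [comm] .   --- comm: one merge equation covers both orders

  vars E F : X$Element .  vars L L' : List .  vars NEL NEL' : NeList .

  eq length(nil) = 0 .
  eq length(E L) = s length(L) .
  eq mergeSort(nil) = nil .
  eq mergeSort(E) = E .
  ceq mergeSort(NEL NEL') = merge(mergeSort(NEL), mergeSort(NEL'))
    if length(NEL) == length(NEL') or length(NEL) == s length(NEL') .
  eq merge(nil, L) = L .
  ceq merge(E L, F L') = E merge(L, F L') if E le F .
endfm

--- interpret "le" as >= on the integers, i.e., larger elements come first
view IntDesc from TOTAL-ORDER-SKETCH to INT is
  sort Element to Int .
  op _le_ to _>=_ .
endv

fmod SORT-INT-DESC is protecting PARAM-SORT-SKETCH{IntDesc} . endfm

red mergeSort(5 2 11 23 -4 8) .   --- expected: 23 11 8 5 2 -4
```

The parametric module is untouched by the change of ordering. Mapping le to >= instead of <= in the view is the only difference from the ascending instance above.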
Since the derivation started with 0 ! and has reached a term containing 0 !, the
specification is nonterminating. The point is that we assume that if_then_else_fi
first computes the value of its first argument, and then evaluates “itself” using the
if_then_else_fi-equations above. However, a term if b then t else u fi
could equally well be evaluated by first evaluating t, as happened above.
To avoid such undesired computations, and to increase the efficiency of Maude
computations, we can tell Maude how to evaluate a term by defining an evaluation
strategy of a function using the attribute strat. For example, a declaration
op f : s1 s2 s3 -> s [strat (2 0 1 3 0)] .
tells Maude to first evaluate the second argument (2), then the whole term (0), then
the first argument (1), and so on. That is, an expression f(t1, t2, t3) will be evaluated
by first reducing t2 as much as possible to t2′, and then simplifying the term f(t1, t2′, t3)
“at the top” using f-equations. If the resulting term still has the form f(u1, u2, u3),
then u2 is again evaluated, and so on. For example, if_then_else_fi should have
the attribute strat (1 0 2 3 0) (or even strat (1 0)), which states that the test
is computed first, followed by the application of an if_then_else_fi-equation.
Maude’s default evaluation strategy of a function is (1 2 . . . n 0). This strategy, in
which all subterms are evaluated before the entire term is evaluated, is called eager
evaluation. A strategy that starts with 0 denotes lazy evaluation, since subterms are
not computed before the entire term is evaluated.
The choice of evaluation strategy can have a significant impact on the efficiency.
For example, efficient evaluation strategies for a function f defined by f (x, y, z) = y
are (2 0) or (0), whereas the strategy (1 3 2 0) is very inefficient (why?).
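The effect of such a strategy can be observed with a small experiment. In the sketch below, the module name STRAT-SKETCH, the function pick (playing the role of f (x, y, z) = y) and the deliberately nonterminating constant loop are hypothetical names introduced only for illustration.

```maude
fmod STRAT-SKETCH is
  protecting NAT .
  op loop : -> Nat .                              --- has no normal form
  op pick : Nat Nat Nat -> Nat [strat (2 0)] .    --- evaluate argument 2, then the top
  vars X Y Z : Nat .
  eq loop = s loop .
  eq pick(X, Y, Z) = Y .
endfm

--- only the second argument is reduced before the pick-equation is applied,
--- so the two occurrences of loop are never evaluated
red pick(loop, 1 + 2, loop) .   --- expected result: 3
```

With the default strategy (1 2 3 0), the same reduction would first try to evaluate loop and would never finish, which is the extreme case of the inefficiency mentioned above.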
owise Equations. An equation of the form f (. . .) = t with the owise (for “other-
wise”) attribute can only be applied if no other equation for f can be applied. This
greatly simplifies the definition of some functions, as shown below:
vars I J : Int . vars L L1 L2 L3 : List .
var N : Nat . vars MS MS' : Mset .
eq I occursIn L1 I L2 = true .
eq I occursIn L = false [owise] .
eq I in I MS = true .
eq I in MS = false [owise] .
It is worth remarking how easily the NP-complete Subset Sum problem can be
solved using the owise attribute and assoc and comm symbols:
ceq subsetSum(MS MS', N) = true if sum(MS) == N .
eq subsetSum(MS, N) = false [owise] .
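For readers who want to run these two equations, here is a minimal self-contained sketch. The module name SUBSET-SUM-OWISE, the declaration of Mset as a multiset of natural numbers, and the definition of sum are assumptions of this sketch, since they are not shown in this section.

```maude
fmod SUBSET-SUM-OWISE is
  protecting NAT .
  sort Mset .
  subsort Nat < Mset .
  op none : -> Mset [ctor] .
  op __ : Mset Mset -> Mset [ctor assoc comm id: none] .
  op sum : Mset -> Nat .                 --- assumed helper: sum of all elements
  op subsetSum : Mset Nat -> Bool .

  vars MS MS' : Mset .  vars M N : Nat .

  eq sum(none) = 0 .
  eq sum(M MS) = M + sum(MS) .

  --- AC-matching tries the ways of splitting the multiset into MS and MS'
  ceq subsetSum(MS MS', N) = true if sum(MS) == N .
  eq subsetSum(MS, N) = false [owise] .
endfm

red subsetSum(3 5 8 11, 16) .   --- true, e.g. 5 + 11
red subsetSum(3 5 8 11, 15) .   --- false, no subset sums to 15
```

All the search is hidden in the matching of MS MS' modulo associativity, commutativity and identity, so the brevity of the specification says nothing about its running time.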
Exercise 43 What is the most efficient evaluation strategy for the functions f , g,
and h in the specification
Exercise 44 The Boolean tests && and || evaluate their second argument only if
necessary in languages like C and Java, so that b2 is not evaluated in b1 && b2
if b1 evaluates to “false.” The built-in functions and and or evaluate both their
arguments in Maude:
Maude> red 0 > 0 and (5 / 0 > 4) .
result [Bool]: false and 5 / 0 > 4
Define two Boolean functions and-then and or-else which work more like the C
conjunctions and disjunctions.
3 Operational Semantics of Equational Specifications
This section defines what it means that a term reduces1 in one step using an equation
in an unsorted specification without functional attributes. Function symbols are not
declared explicitly, but their declarations can be inferred from the context. Constants
are denoted a, a′, b, c, . . . , non-constant function symbols f, g, h, . . . , terms t, t1, t′,
u, . . . , and variables x, x′, x1, y, z, . . . . Therefore, a specification
1 Such reduction is often called rewriting (or (equational) simplification). To avoid confusion with
non-equational rewriting in rewriting logic, I use reduction when equations are applied, and rewriting
for the application of (“non-equational”) rewrite rules in rewriting logic. Similarly, I use the
symbol ⇝ instead of the more common arrow −→ for equational reduction/simplification.
© Springer-Verlag London 2017
P.C. Ölveczky, Designing Reliable Distributed Systems, Undergraduate Topics
in Computer Science, DOI 10.1007/978-1-4471-6687-0_3
A term has a tree structure in the absence of equational attributes such as assoc
and comm. For example, the term f (h(a, b, g(x)), f (y, f (z, b))) can be seen as the
tree in Fig. 3.1a. A position in a term is a string of numbers (with ε denoting the
empty string) as seen in Fig. 3.2. The set of (legal) positions in a term can be defined
formally by induction on the structure of the term as follows:
Definition 3.1 (Position) The set Pos(t) of positions in a term t is the following set
of strings of non-zero natural numbers:
• if t is a variable or a constant, then Pos(t) = {ε};
• if t = f(t_1, . . . , t_n), then Pos(t) = {ε} ∪ ⋃_{i=1}^{n} {i.p | p ∈ Pos(t_i)}.
A term with infix function symbols can also be written in prefix form, so that the
term s(s(0 + s(0))) + 0 has the tree structure shown in Fig. 3.1b.
If p is a position in a term t, we denote by t|_p the subterm of t at position p.
Definition 3.2 The subterm of t in position p ∈ Pos(t), written t|_p, is defined by
t|_ε = t
f(t_1, . . . , t_n)|_{i.p} = t_i|_p.
Fig. 3.2 The positions in the term f (h(a, b, g(x)), f (y, f (z, b)))
Example 3.1. The subterm of h(a, b, g(x)) at position 3 is g(x) and h(a, b, g(x))|_{3.1}
is x. The subterms of h(a, b, g(x)) are h(a, b, g(x)), a, b, g(x), and x. The last four
are proper subterms. ♦
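As a small worked instance of Definitions 3.1 and 3.2 (an illustration that is not part of the original text), the position set of the term h(a, b, g(x)) from Example 3.1 can be unfolded step by step. The sketch uses the usual convention that i.ε is written i.

```latex
\begin{align*}
\mathrm{Pos}(a) = \mathrm{Pos}(b) = \mathrm{Pos}(x) &= \{\varepsilon\} \\
\mathrm{Pos}(g(x)) &= \{\varepsilon\} \cup \{\,1.p \mid p \in \mathrm{Pos}(x)\,\}
                    = \{\varepsilon,\ 1\} \\
\mathrm{Pos}(h(a,b,g(x))) &= \{\varepsilon\} \cup \{1\} \cup \{2\} \cup \{3,\ 3.1\}
                    = \{\varepsilon,\ 1,\ 2,\ 3,\ 3.1\} \\
h(a,b,g(x))|_{3.1} &= g(x)|_{1} = x|_{\varepsilon} = x
\end{align*}
```

The five positions correspond exactly to the five subterms listed in Example 3.1.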
The term t[u]_p is t with t|_p replaced by u. That is, we put u into t at position p in t:
Definition 3.3 If t and u are terms, and p ∈ Pos(t), then t[u]_p is defined as follows:
t[u]_ε = u
f(t_1, . . . , t_i, . . . , t_n)[u]_{i.p} = f(t_1, . . . , t_i[u]_p, . . . , t_n).
Example 3.2. f(a, f(x, g(y)))[b]_2 is f(a, b), and f(a, f(x, g(y)))[c]_ε is just c, and
f(a, f(x, g(y)))[c]_{2.2.1} is f(a, f(x, g(c))). ♦
vars(t) denotes the set of variables in t; e.g., vars( f (a, g(x, f (b, z)))) = {x, z}.
A variable substitution (or just substitution) maps variables to terms, and is usually
written explicitly as {x1 ↦ t1, . . . , xn ↦ tn}, where variables that are mapped to
themselves are not mentioned. If σ is a substitution σ : X → T_Σ(Y), we also denote
by σ its (homomorphic) extension σ : T_Σ(X) → T_Σ(Y) which takes a term and
simultaneously replaces each variable x in the term with σ(x). We often write
substitutions in “postfix” notation. For example, if σ is {x ↦ a, y ↦ g(x, y), z ↦ h(z, z)}
and t is the term f(x, x, f(x, y, z)), then tσ is f(a, a, f(a, g(x, y), h(z, z))).
A ground substitution maps each variable to a ground term.
Definition 3.5 (Reduction relation) Let E be a set of equations (with each equation
“directed” from left to right). A term t reduces (in one step) to a term u, written
t ⇝_E u, if and only if there is an equation l = r in E, a position p in t, and a
substitution σ such that t|_p = lσ and u = t[rσ]_p. That is, t = t[lσ]_p ⇝_E t[rσ]_p = u.
(I often write ⇝ instead of ⇝_E when E is given by the context or is unimportant.)
Example 3.4. If E = { f(x, y, z) = g(y) }, then we have both f(a, b, b) ⇝ g(b) and
h(g(b), f(a, g(x), h(z))) ⇝ h(g(b), g(g(x))). ♦
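The ingredients of a single reduction step (equation, position, substitution) can be spelled out on a separate small case. The equation g(x, y) = x and the term below are hypothetical and chosen only for illustration; they do not occur in the text.

```latex
\begin{align*}
E &= \{\, g(x,y) = x \,\}, \qquad t = f(a,\ g(h(b), c)) \\
p &= 2, \qquad \sigma = \{x \mapsto h(b),\ y \mapsto c\} \\
t|_{p} &= g(h(b), c) = g(x,y)\,\sigma \qquad \text{(the subterm at $p$ is an instance of the left-hand side)} \\
u &= t[x\sigma]_{p} = f(a,\ g(h(b), c))[h(b)]_{2} = f(a,\ h(b))
\end{align*}
```

Hence t reduces to u in one step according to the definition of the reduction relation: the subterm at position 2 is replaced by the corresponding instance of the right-hand side, and the rest of t is left unchanged.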
Exercise 45 1. What is f(a, b)|_2, and what is f(h(c), g(d, g(a, f(a, b))))|_{2.2.1}?
2. What is (s(s(0 + s(0))) + 0)[s(0)]_{1.1.1}?
3. What is f(h(c), g(d, g(a, f(a, b))))[f(b, b)]_{2.2}?
Exercise 46 Let t be f(x, x, f(x, y, z)) and σ be {x ↦ a, y ↦ g(x, y), z ↦ h(z, z)}.
What is (tσ)σ?
Exercise 48 For each reduction step in Example 3.4, find the equation, the position,
and the substitution used, and show that the step is indeed a reduction step.
t_1 ⇝_E t_2 ⇝_E · · · ⇝_E t_n
or an infinite sequence
t_1 ⇝_E t_2 ⇝_E t_3 ⇝_E · · ·
of reduction steps t_i ⇝_E t_{i+1} in E.
• A computation in E is either an infinite derivation in E, or a finite derivation in E
which cannot be extended (that is, the last term in the derivation is irreducible).
The following definitions formalize the notions of termination and confluence
introduced informally in Chapter 2.
Definition 3.7 (Confluence) A specification is confluent if and only if for all terms
t, t1, t2 such that t ⇝* t1 and t ⇝* t2, there is a term u such that t1 ⇝* u and t2 ⇝* u.
Confluence, together with termination, essentially means that the result obtained
by a computation in Maude is independent of how/which equations are applied.
Theorem 3.1 Let E be a terminating specification. Then each term t has a unique
normal form if and only if E is confluent.
Proof. We first prove the “if” direction. Assume that E is confluent but that some
term t does not have a unique normal form. If this leads to a contradiction, then
each term has a unique normal form. Since E is terminating, every term has at least
one normal form, so t must have at least two distinct normal forms u1 and u2, and
we have t ⇝* u1 and t ⇝* u2. But then, according to the definition of
confluence, there must be a term u such that u1 ⇝* u and u2 ⇝* u. Since u1 ≠ u2, and
we must have u1 ⇝* u and u2 ⇝* u, it means that either u1 ⇝+ u or u2 ⇝+ u (or both).
But this is impossible, since both u1 and u2 are normal forms, and therefore cannot
be reduced in one or more steps.
To prove the “only if” direction, assume that each term has a unique normal form
but that E is not confluent. If E is not confluent, then there are terms t, t1, t2 such that
t ⇝* t1 and t ⇝* t2, but there is no term u such that t1 ⇝* u and t2 ⇝* u. Since each term
has a unique normal form, t1 and t2 have the respective normal forms t1! and t2!. If
t1! = t2!, then t1 and t2 have such a common successor term u (namely, t1!),
contradicting the assumption that there is no such u. If t1! ≠ t2!, then t1! and t2!
are two different normal forms of t,
which contradicts the assumption that each term has a unique normal form.
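A minimal illustration of why confluence is needed in Theorem 3.1 is the following two-equation specification. It is a hypothetical example, not one taken from the text.

```latex
\begin{align*}
E &= \{\, a = b,\ \ a = c \,\} \\
a &\rightsquigarrow b \quad\text{and}\quad a \rightsquigarrow c,
   \qquad b \text{ and } c \text{ are irreducible and } b \neq c
\end{align*}
```

Every derivation in this E has at most one step, so E is terminating. It is not confluent, since b and c have no common successor, and accordingly the term a has the two distinct normal forms b and c. Adding the equation b = c would restore confluence: every computation from a would then end in c.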
Analyzing whether a specification is terminating and confluent is the topic of the
next two chapters. Not only are these crucial properties by themselves, but Maude
assumes that your specifications are both terminating and confluent. Maude will not
check this for you, for reasons that will be clear soon.
This section briefly discusses the operational semantics of conditional equations and
the computational complexity of matching (and hence of applying an equation) with
operators that are declared to be associative and/or commutative.
l = r if t1 = u1 ∧ . . . ∧ tn = un
{a = b if a = b}.
Proof. Following [6, 10], we show that 1-3-SAT, which is an NP-complete prob-
lem [47], can be solved easily by AC-matching. A 1-3-SAT instance is a set
{(pi ∨ qi ∨ ri ) | 1 ≤ i ≤ n}
for a new symbol fn . We can use an ordinary binary operator f if we want fi-
nite signatures, and the problem becomes whether f (t1 , f (t2 , f (. . . ,tn )) · · · ) matches
f(u1, f(u2, f(. . . , un)) · · · ).) For example, the 1-3-SAT problem for {(p1 ∨ p2 ∨ p3),
(p2 ∨ p3 ∨ p4)} amounts to checking whether f(x_p1 ∨ x_p2 ∨ x_p3, x_p2 ∨ x_p3 ∨ x_p4)
matches f(true ∨ false ∨ false, true ∨ false ∨ false). Since the latter term
is AC-equal to f(false ∨ true ∨ false, true ∨ false ∨ false), there is such a
match {x_p1 ↦ false, x_p2 ↦ true, x_p3 ↦ false, x_p4 ↦ false}.
Proving that A-matching and C-matching are NP-complete can be done in a sim-
ilar way [6, 10]. Although these results indicate that computing with functions that
have the attributes assoc and/or comm may be very inefficient, the Maude develop-
ers have put a lot of effort and ingenuity into making the A-, C-, and AC-matching
algorithms really fast for most patterns occurring in practice [36].
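The small 1-3-SAT instance from the proof can be handed to Maude's AC-matcher with the match command. In this sketch the module name ONE-IN-THREE-SKETCH, the sorts Val, Clause and Inst, and the constants tt and ff (standing for true and false) are assumptions made for illustration; the variables are declared on the fly.

```maude
fmod ONE-IN-THREE-SKETCH is
  sorts Val Clause Inst .
  subsort Val < Clause .
  ops tt ff : -> Val [ctor] .                              --- stand for true and false
  op _v_ : Clause Clause -> Clause [ctor assoc comm] .     --- AC disjunction symbol
  op f : Clause Clause -> Inst [ctor] .                    --- one argument per clause
endfm

--- pattern: the two clauses (p1 v p2 v p3) and (p2 v p3 v p4)
--- subject: each clause has exactly one true and two false literals
match in ONE-IN-THREE-SKETCH :
    f(P1:Val v P2:Val v P3:Val, P2:Val v P3:Val v P4:Val)
  <=? f(tt v ff v ff, tt v ff v ff) .
```

Each matcher that Maude reports is a truth assignment making exactly one literal per clause true. For this instance there should be three of them, one of which is the match given in the proof (P2 is tt and the other variables are ff).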
Exercise 49 What matching problem solves the 1-3-SAT problem for {(p1 ∨ p2 ∨ p3 ),
(p2 ∨ p3 ∨ p4 ), (p1 ∨ p2 ∨ p4 ), (p1 ∨ p3 ∨ p4 )}? Is there such a match?
4 Termination
which, for any specification E, can figure out whether or not E is terminating. It
is well known that it is impossible to have such a function for both standard pro-
gramming languages and for Turing machines. Section 4.1 explains how any Turing
© Springer-Verlag London 2017
P.C. Ölveczky, Designing Reliable Distributed Systems, Undergraduate Topics
in Computer Science, DOI 10.1007/978-1-4471-6687-0_4
1 Turing machines are a model of computation and not a data type. Equational specifications are
therefore not well suited for modeling such machines, which can instead be naturally modeled in
rewriting logic (see Exercise 126). We show how the computations of a Turing machine can be sim-
ulated by equational simplification steps only to prove undecidability of termination of equational
specifications.
2 Our results carry directly over to deterministic Turing machines.
A Turing machine has a tape which is infinite in both directions. This tape is divided
into infinitely many squares. Each square contains one symbol from S, but there are
only a finite number of non-blank symbols on the tape. At any time, the machine is
in one of the states q0 , . . . , qn , and has a (read/write) head that points to some square
on the tape. The machine operates by performing transitions as long as possible:
either until no transition can be taken, or forever. More precisely, if the machine is
in state q, with its head pointing to a square that contains the symbol s, and there is
a transition (q, s, qnext , snext , right) in Δ , the machine can perform this transition, in
which case it writes snext in the square on the tape where its head is (thereby erasing
s from that square), goes to the new state qnext , and moves the head one position
to the right on the tape (if the transition is (q, s, qnext , snext , left), the head is instead
moved one position to the left).
Example 4.1. Two configurations (i.e., state, position of the head, and tape content)
of a Turing machine ({q1, q2}, {□, a, b}, {(q1, b, q2, a, right), . . .}), where □ denotes the blank symbol, are:
In the left-hand side, the machine is in state q1 and its head points to a square con-
taining the symbol ‘b.’ The right-hand side shows the configuration resulting from
performing the transition (q1 , b, q2 , a, right) on the left-hand side configuration. ♦
Example 4.2. The Turing machine ({q_init, q_stop}, {□, 1, 2}, Δ) that changes every
‘1’ to ‘2’, and every ‘2’ to ‘1’, until it reaches a blank (when the initial state is
q_init and the machine reads towards the right) has the following transitions Δ:
Example 4.3. The left-hand side (resp., right-hand side) configuration in Example 4.1
is represented as the term [ a q1 b b a b ] (resp., [ a a q2 b a b ]). ♦
If a transition (q, s_{i_{k+1}}, q_next, s_next, right) is performed when the machine is in the
configuration represented by the term [ s_{i_1} . . . s_{i_k} q s_{i_{k+1}} s_{i_{k+2}} . . . ], then the next
configuration term is [ s_{i_1} . . . s_{i_k} s_next q_next s_{i_{k+2}} . . . ]. If the configuration was
represented by [ s_{i_1} . . . s_{i_k} q s_{i_{k+1}} ] (that is, the head points to the last square represented
in the list), the next configuration will be represented by [ s_{i_1} . . . s_{i_k} s_next q_next □ ],
where the list has been extended with a blank (□). Moving to the left is symmetric.
The Maude representation e(M) of a Turing machine M is defined as follows:
sorts State Symbol Delimiter Tape .
ops q0 ... qn : -> State [ctor] .
ops s1 ... sm : -> Symbol [ctor] .
ops [ ] : -> Delimiter [ctor] .
subsort State Symbol Delimiter < Tape . --- non-empty list
op _ _ : Tape Tape -> Tape [ctor assoc] .
where the second equation takes care of the case when the head points to the leftmost
square represented in the list. In the first equation, the head points to the square
containing s. The content of this square is changed to s′, and the new state q′ jumps
to the left, so that it now points to SYMBOL. The second equation inserts a blank (□)
at the left end of the list and makes the head point to this new blank. A transition
(q, s, q′, s′, right) that moves the head to the right is represented in the same way:
eq q s SYMBOL = s' q' SYMBOL .
eq q s ] = s' q' □ ] .
It should be fairly obvious that e(M) can simulate each step of the Turing
machine M. An infinite computation in M is therefore simulated by an infinite
derivation in e(M), so that e(M) is nonterminating if M is nonterminating.
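To make the encoding concrete, here is a sketch of e(M) for the symbol-swapping machine of Example 4.2. Several things are assumptions of this sketch rather than part of the text: the three transitions (in particular the one taken on a blank), the constant names blank, one and two for the tape symbols, and the delimiter names lend and rend, which are used instead of ‘[’ and ‘]’ only to stay clear of Maude's own bracket tokens. Since every assumed transition moves right, only the right-move equations are needed.

```maude
fmod TM-SWAP-SKETCH is
  sorts State Symbol Delimiter Tape .
  subsorts State Symbol Delimiter < Tape .
  ops qinit qstop : -> State [ctor] .
  ops blank one two : -> Symbol [ctor] .
  ops lend rend : -> Delimiter [ctor] .      --- stand for '[' and ']'
  op __ : Tape Tape -> Tape [ctor assoc] .
  var SYMBOL : Symbol .

  --- assumed transition (qinit, 1, qinit, 2, right)
  eq qinit one SYMBOL = two qinit SYMBOL .
  eq qinit one rend   = two qinit blank rend .
  --- assumed transition (qinit, 2, qinit, 1, right)
  eq qinit two SYMBOL = one qinit SYMBOL .
  eq qinit two rend   = one qinit blank rend .
  --- assumed transition (qinit, blank, qstop, blank, right)
  eq qinit blank SYMBOL = blank qstop SYMBOL .
  eq qinit blank rend   = blank qstop blank rend .
endfm

--- head on the first square of the tape 1 2 2 1
red lend qinit one two two one rend .
--- expected: lend two one one two blank qstop blank rend
```

Each equational step corresponds to one transition of the machine. The reduction stops once the state is qstop, because no equation has qstop in its left-hand side.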
But hold the horses, there are two problems here:
1. A Turing machine is represented by an order-sorted specification with an assoc
operator, whereas we were supposed to reason about the termination of unsorted
specifications without such attributes.
2. e(M) should be terminating if and only if M is terminating. We have only shown
that if M is not terminating, then e(M) is also nonterminating. Can e(M) be
nonterminating even when M is terminating? Remember that for e(M) to be
terminating, it must be terminating for all possible initial terms t0 , even those
that do not represent legal Turing machine configurations. Could e(M) loop on
some “junk terms” even when M is terminating?
Addressing the first problem is easy. The above model was chosen for simplic-
ity of explanation. A list/string rewrite system of this form can be represented by
a term rewrite system where each (state, alphabet, and delimiter) symbol except
‘]’ is represented by a unary function³ symbol with the same name. For example,
the list [ □ s1 □ s2 s5 s1 q □ s3 ] can be represented by the (unsorted) term
[(□(s1(□(s2(s5(s1(q(□(s3(])))))))))). Translating the above system to such an
unsorted system is fairly easy and is left as Exercise 50.
The second issue is trickier. In such an unsorted representation, there are terms
with multiple q’s, de facto representing multiple Turing machine “instances” on the
same tape. For example, the term [ s1 q1 s2 q2 s5 q q5 s3 ] does not represent
any Turing machine configuration. Can the translation e(M) of a terminating Turing
machine M be nonterminating because it is not terminating on such junk terms?
Consider the following Turing machine Mab :
• If Mab initially reads ‘a’, it wants to ensure that the square to the right also con-
tains ‘a’. If the square to the right contains ‘b’, then Mab writes ‘a’ there, goes
one position left, and then one position right, to really ensure that the square to
the right still contains ‘a’. If so, it is done. If not, then Mab again writes
‘a’ and goes left, and then right, and repeats the confirmation process.
• If Mab starts by reading ‘b’, it wants to ensure that the current square always
contains ‘b’. That is, it jumps to the right, then jumps back left, and if the original
square still contains ‘b’, it is done. If the original square contains ‘a’, it writes
‘b’ there, jumps to the right, then back to the left, and repeats the process.
The machine Mab is obviously terminating (for any initial machine state qi ). How-
ever, if you “combine” two versions of Mab on the same tape; that is, if you start
with a “junk term” [ qinit a qinit b ], you get a nonterminating system: the “first”
head reads ‘a’ and remembers this; the “second” head then reads the ‘b’ and jumps
to the right; the first head then reads that ‘b’ and turns it into an ‘a’, goes left and re-
members to check for ‘a’; the second head goes back and checks whether its initial
square still contains ‘b’, and since it does not, it sets that square to ‘b’ and moves
right; and so on. It is an easy exercise (Exercise 53) to formalize Mab and show that
its translation e(Mab ) is nonterminating from the above term.
If M is terminating, it would not be a problem if many “instances” of M work
at the same time independently of each other, since each instance would terminate
sooner or later. The problem occurs when these different instances interact, which is
exactly what happens in Mab : the “left head” insists on having ‘a’ in the second tape
position above, while the “right head” insists on having ‘b’ in this same location.
The solution is to ensure that the different “instances” cannot interact. Baader
and Nipkow [6] achieve this by using two different representations ←s and →s (the
symbol s written under a left-pointing or a right-pointing arrow) of
each alphabet symbol s, with the arrow pointing to the head to which the symbol
“belongs.” A transition only considers symbols pointing towards the head, and
symbols generated by the head will always point towards it. Hence one head can-
not use symbols generated by another head. Modifying our translation in this way
(see Exercise 54) leads to an (unsorted and unconditional) equational specification
ẽ(M) that is terminating if and only if M is terminating. (The representation in [105]
avoids such unfortunate interactions between different Turing machine instances by
representing a configuration list1 q list2 as a term q(reverse(list1 ), list2 ).)
the first equation above gives rise to an equation s_i(q(s(x))) = q′(s_i(s′(x))) for each
symbol s_i, since variables range over terms and not over function symbols.)
Exercise 51 Define a Turing machine over the alphabet {□, 1} that loops forever
if there is an odd number of consecutive 1’s on the tape (moving to the right from
where the head points initially), and that stops if the number of consecutive 1’s
is an even number. Then define a terminating Turing machine over the alphabet
{□, 1, odd, even} that stops by writing odd or even, depending on whether the
“number” on the tape is odd or even. Which are the initial states?
Exercise 52 Define a Turing machine over {□, 0, 1} that adds 1 to the “binary
number” on the tape.
Exercise 53 Define the Turing machine Mab formally and show that its translation
e(Mab ) has an infinite derivation from the term [ qinit a qinit b ].
so that interpret (M, initConfig) returns the configuration resulting from run-
ning the deterministic Turing machine (represented by the term) M with initial
configuration initConfig.
4. Run your Turing machine interpreter on the terminating Turing machines you
defined in Exercises 51 and 52.
Since any computable function can be defined by a deterministic Turing machine,
there is a Turing machine that mimics the behavior of the function interpret. Such
a Turing machine that can simulate the steps of any Turing machine it gets as input
on any initial configuration for that machine is called a universal Turing machine.
4.2 Nontermination
A specification E is looping if there are terms t and u such that t ⇝+_E u and t is a
subterm of u. A looping specification is nonterminating, since the steps from t to u
can be repeated from (the subterm t inside) u.
Example 4.4.
• The specification { f(x) = f(f(x)) } has a reduction f(x) ⇝ f(f(x)) which is a
looping derivation since f(x) is a subterm of f(f(x)). The specification is therefore
nonterminating: f(x) ⇝ f(f(x)) ⇝ f(f(f(x))) ⇝ f(f(f(f(x)))) ⇝ · · · .
• The specification { f(x, y) = f(y, x) } has a looping derivation f(x, y) ⇝ f(y, x) ⇝
f(x, y), where these steps can be repeated forever. ♦
Example 4.5. { f (x) = g(x, y)} has an infinite (and looping) derivation
To make the picture more complicated, there are also nonterminating systems
which are not looping:
Example 4.6. The system { f (x) = f (g(x))} is not looping, but is nonterminating:
This and the next section present some techniques that can be used to prove that a
specification is terminating.
The specification { f(x) = g(x) } is obviously terminating, but how would you
prove that it does not have an infinite derivation for any start term t0? Probably
you would say that the number of (occurrences of) the function symbol f
in the term decreases in each simplification step, and since it cannot be less
than 0, the system must be terminating. Otherwise there would be an infinite
sequence t0 ⇝ t1 ⇝ t2 ⇝ · · · which would lead to an infinite sequence
#f(t0) > #f(t1) > #f(t2) > · · · of decreasing natural numbers (where #f(t) denotes
the number of f's in t), which is impossible, no matter how large #f(t0) is. More
weight : T_Σ → ℕ
mapping a ground term to a natural number such that, for all ground terms t and u,
In the example above, the “weight” (or “progress”) function weight was # f .
One problem is the need to consider all contexts: if t ⇝ u, then there are also
simplification steps f(t) ⇝ f(u), f(f(t)) ⇝ f(f(u)), f(f(f(t))) ⇝ f(f(f(u))),
f(g(t)) ⇝ f(g(u)), and so on, which must all be proved weight-decreasing. We can
avoid having to consider all contexts if the weight function is monotonic.
Definition 4.3 A function w : T_Σ → ℕ is monotonic (w.r.t. the relation >) if and
only if, for each function symbol f, all ground terms t and u, and all lists t1 and t2
of ground terms,
weight(f(g(xσ))) = (2 · weight(xσ))^3 > 2 · (weight(xσ))^3 = weight(g(f(xσ)))
holds for all weight(xσ), since the weight of a ground term is at least 2. ♦
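The inequality above can be checked on concrete ground terms by writing the weight function itself as a Maude function. The sketch below infers from the inequality that weight(f(t)) = weight(t)^3 and weight(g(t)) = 2 · weight(t), and that the equation being analyzed is f(g(x)) = g(f(x)); the single constant a with weight 2, the sort Term, and the module name WEIGHT-SKETCH are assumptions made for illustration.

```maude
fmod WEIGHT-SKETCH is
  protecting NAT .
  sort Term .
  op a : -> Term [ctor] .              --- assumed constant of weight 2
  ops f g : Term -> Term [ctor] .
  op weight : Term -> Nat .
  var T : Term .
  eq weight(a) = 2 .
  eq weight(f(T)) = weight(T) ^ 3 .
  eq weight(g(T)) = 2 * weight(T) .
endfm

red weight(f(g(a))) .   --- (2 * 2)^3 = 64
red weight(g(f(a))) .   --- 2 * 2^3  = 16
```

Since f and g are plain constructors here, the module only computes weights. It does not apply the equation whose termination is being analyzed.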
Example 4.10. The system { f(f(x)) = f(g(f(x))) } can be proved to be terminating
using the non-monotonic weight function weight(t) = “the number of ‘adjacent’
pairs of f's in t”, since t ⇝ u implies weight(t) > weight(u). However, it is hard to
define a monotonic weight function which proves termination of this system. ♦
It is sometimes more convenient, or even necessary, to use weights other than
natural numbers. Any domain S and weight comparison ≻ can be used as long as
t ⇝ u implies weight(t) ≻ weight(u), and there is no infinite sequence s1 ≻ s2 ≻
s3 ≻ ··· of ≻-decreasing S-elements.
Recall that a strict partial order on a set S is a relation ≻ ⊆ S × S which is
• irreflexive: there is no s ∈ S such that s ≻ s, and
• transitive: for all s1, s2, s3 ∈ S, s1 ≻ s2 and s2 ≻ s3 imply that s1 ≻ s3.
Definition 4.4 (Well-founded strict partial order) A strict partial order ≻ on S
is well-founded if there is no infinite sequence
s1 ≻ s2 ≻ s3 ≻ ···
of S-elements s1, s2, s3, ...
Example 4.11. The greater-than relation > is a strict partial order on both the nat-
ural numbers N and the integers Z, but is only well-founded on N. ♦
Exercise 57 Prove termination of { f (h(x, y)) = h(x, x)} using weight functions.
Exercise 58 Explain why the weight function in Example 4.10 is not monotonic.
Exercise 59 1. Explain why the following program terminates for any m and n:
int x := m; int y := n;
while (x>2 and y>0) {
if x>y then {x := x-1; y := x+y;} else y := y/2;
}
2. Explain why the following “Euclidean” algorithm for computing the greatest
common divisor of two natural numbers terminates for all m and n.4
int gcd(int m, int n) { // m,n > 0
int x := m; int y := n; int r := x % y;
while (r>0) {x := y; y := r; r := x % y;}
return y;
}
Since finding suitable weight functions may require some clever ideas, the weight
function method is not suitable for proving termination automatically. This section
introduces the theory of simplification orders, due to Dershowitz (see, e.g., [26]),
and some powerful simplification orders which can be automated.
We start with some terminology. A term t embeds a term u if u is contained
“inside” t, in the sense that if we remove some function symbols from t we get u.
Definition 4.5 (Embedding) A term t embeds a term u, denoted
t ▷ u,
if and only if t ⇝+ u using the equations of the specification EMB, which is given by
{f(x1, ..., xn) = xi | f is a function symbol of arity n ≥ 1, and 1 ≤ i ≤ n}.
Example 4.12. f(g(f(a))) ▷ f(f(a)) holds since f(g(f(a))) ⇝ f(f(a)) in EMB, using
the equation g(x1) = x1 in EMB. We also have f(a, g(h(b, f(c, d)), e)) ▷ f(a, h(b, d))
and f(a, g(h(b, f(c, d)), e)) ▷ g(b, e). However, neither f(a, g(h(b, f(c, d)), e)) ▷
f(a, h(b, e)) nor f(a, g(h(b, f(c, d)), e)) ▷ g(b, d) holds. ♦
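The claims in Example 4.12 can be checked mechanically. The sketch below uses the usual recursive characterization of embedding on ground terms (a term embeds u if they are equal, if one of its arguments embeds u, or if both have the same symbol and the arguments embed pairwise). It assumes this characterization coincides with deleting function symbols via projection equations, and it represents terms as nested tuples; both are illustration choices.

```python
def embeds_or_equal(t, u):
    """True if t equals u or u is obtained from t by deleting function symbols."""
    if t == u:
        return True
    head_t, *args_t = t
    head_u, *args_u = u
    # Delete the top symbol of t and continue inside one of its arguments.
    if any(embeds_or_equal(arg, u) for arg in args_t):
        return True
    # Keep the top symbol: symbols and arities agree, arguments embed pairwise.
    return (head_t == head_u and len(args_t) == len(args_u)
            and all(embeds_or_equal(a, b) for a, b in zip(args_t, args_u)))

def embeds(t, u):
    """Strict embedding: at least one function symbol has been deleted."""
    return t != u and embeds_or_equal(t, u)

A, B, C, D, E = ("a",), ("b",), ("c",), ("d",), ("e",)
# big = f(a, g(h(b, f(c, d)), e))
big = ("f", A, ("g", ("h", B, ("f", C, D)), E))

print(embeds(("f", ("g", ("f", A))), ("f", ("f", A))))   # f(g(f(a))) vs f(f(a))
print(embeds(big, ("f", A, ("h", B, D))))                # vs f(a, h(b, d))
print(embeds(big, ("g", B, E)))                          # vs g(b, e)
print(embeds(big, ("f", A, ("h", B, E))))                # vs f(a, h(b, e))
print(embeds(big, ("g", B, D)))                          # vs g(b, d)
```

The first three checks succeed and the last two fail, in agreement with the example.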
The following fundamental result says that some “patterns” must be repeated in
an infinite sequence of ground terms constructed by a finite set of function symbols:
Theorem 4.2 (Kruskal’s Tree Theorem) If Σ has a finite set of function symbols,
then any infinite sequence
t1, t2, ..., tj, ..., tk, ...
of ground terms in TΣ contains two terms tj and tk, with j < k, such that tk ⊵ tj
(that is, tk ▷ tj or tk = tj).
This theorem implies that if a finite specification does not have any self-embedding
derivation, i.e., a derivation of the form
t1 ⇝ t2 ⇝ ... ⇝ tj ⇝ ... ⇝ tk ⇝ ...
f(t1, ..., tn) ≻ ti
t0 ⇝ t1 ⇝ ··· ⇝ tj ⇝ ··· ⇝ tk ⇝ ···
where all terms are built from a finite set of function symbols. (If the signature
contains an infinite set of function symbols, but a finite set of equations, then all
terms in the above derivation are constructed from the function symbols appearing
in t0 and in the right-hand sides of the equations. Given a finite set of equations,
there is only a finite number of distinct function symbols in these right-hand sides.)
Therefore, Kruskal’s Tree Theorem applies, and we have both tk ⊵ tj and tj ≻ tk
(since tj ⇝+ tk ⟹ tj ≻ tk). This is impossible (tk = tj is impossible because ≻ is
irreflexive, and tk ▷ tj implies that tk ≻ tj by Proposition 4.3, and with a strict
partial order we cannot have both tj ≻ tk and tk ≻ tj)!
Since a simplification order only proves that there is no self-embedding derivation,
a simplification order cannot prove termination of self-embedding and terminating
specifications such as {f(f(x)) = f(g(f(x)))}. Another way to put it is that
no simplification order can prove termination of E if E ∪ EMB is nonterminating.
To have your own simplification order ≻mine, just make sure that ≻mine is irreflexive,
transitive, monotonic, and that it satisfies the subterm property. If you then can
prove lσ ≻mine rσ for each equation l = r and each ground substitution σ, you have
proved that your specification is terminating. In case you do not want to define your
own simplification order, you can use some of the path orders introduced next.
The lexicographic path order (lpo) [58] is a powerful simplification order which can
be applied automatically. lpo requires that you have a strict partial order ≻, called a
precedence, on the function symbols.
Definition 4.7 (Lexicographic path order) Given a strict partial order ≻ on the
function symbols, the lexicographic path order ≻lpo is the smallest relation satisfying
the following conditions for m, n ≥ 0:⁵
lpo-1: If ti ≻lpo u or ti = u for some ti, then
f(..., ti, ...) ≻lpo u.
The lexicographic path order can be extended to terms with variables, where a variable
is treated as a constant that is not comparable to anything in the precedence ≻,
in which case l ≻lpo r implies lσ ≻lpo rσ for all substitutions σ.
The following result is proved, e.g., in [6]:
Proposition 4.4 ≻lpo is a simplification order for any precedence ≻.
Therefore, one way of proving the termination of a finite⁶ specification is to define
a precedence ≻ on the function symbols (and extend it to variables so that no
variable is comparable in ≻ with any other symbol) such that l ≻lpo r holds for each
equation l = r in the specification.
Functions are often defined using previously defined functions. For example,
multiplication (∗) is defined in terms of addition (+), and exponentiation (∗∗) is
defined in terms of multiplication. In these cases, termination can often be shown
by choosing the precedence so that it satisfies ∗∗ ≻ ∗ ≻ +.
Example 4.13. We prove termination of
{ 0 + x = x, 0 ∗ x = 0, x ∗∗ 0 = s(0),
s(x) + y = s(x + y), s(x) ∗ y = y + (x ∗ y), x ∗∗ s(y) = x ∗ (x ∗∗ y) }
5 This definition also applies to constants when m = 0 or n = 0; for example, f(c) ≻lpo b and
a ≻lpo b and a ≻lpo g(b) all hold by lpo-2 if f ≻ b and a ≻ b and a ≻ g.
6 A finite specification in this case is one with only a finite set of function symbols and/or a finite set
of equations. This should be the case for our Maude modules (except for some built-in modules).
The lexicographic path order is fully automatic, since a finite set of function
symbols only has a finite number of precedences ≻, and checking whether each
equation is ≻lpo-decreasing is also a terminating process. A program can then check
lpo-termination for each possible precedence (see Exercise 73).
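Such a program can be sketched in a few lines. The code below is not the Maude solution asked for in Exercise 73. It is a Python illustration that assumes the common textbook formulation of lpo: lpo-1 as stated above, lpo-2 for a greater head symbol, and lpo-3 for equal head symbols with lexicographic comparison of the arguments. The names add, mul and exp stand for +, ∗ and ∗∗, and the search only tries total precedences.

```python
from itertools import permutations

# A variable is a string; f(t1, ..., tn) is the tuple ("f", t1, ..., tn).

def occurs(x, term):
    """True if the variable x occurs in term."""
    if isinstance(term, str):
        return term == x
    return any(occurs(x, arg) for arg in term[1:])

def lpo_greater(s, t, prec):
    """s >lpo t; prec is a set of pairs (f, g) meaning f > g in the precedence."""
    if isinstance(s, str):                 # a variable is greater than nothing
        return False
    if isinstance(t, str):                 # s > x exactly when x occurs in s
        return occurs(t, s)
    f, *ss = s
    g, *ts = t
    # lpo-1: an immediate subterm of s equals t or is greater than t
    if any(si == t or lpo_greater(si, t, prec) for si in ss):
        return True
    # lpo-2 (assumed form): f > g and s is greater than every argument of t
    if (f, g) in prec:
        return all(lpo_greater(s, tj, prec) for tj in ts)
    # lpo-3 (assumed form): same symbol, arguments compared lexicographically
    if f == g and len(ss) == len(ts):
        for si, ti in zip(ss, ts):
            if si != ti:
                return (lpo_greater(si, ti, prec)
                        and all(lpo_greater(s, tj, prec) for tj in ts))
    return False

def chain(*symbols):
    """Total precedence symbols[0] > symbols[1] > ... as a set of pairs."""
    return {(f, g) for i, f in enumerate(symbols) for g in symbols[i + 1:]}

def find_precedence(equations, symbols):
    """Try every total precedence; return the first that orients all equations."""
    for order in permutations(symbols):
        prec = chain(*order)
        if all(lpo_greater(l, r, prec) for l, r in equations):
            return order
    return None

ZERO = ("0",)
EXAMPLE_4_13 = [
    (("add", ZERO, "x"), "x"),
    (("mul", ZERO, "x"), ZERO),
    (("exp", "x", ZERO), ("s", ZERO)),
    (("add", ("s", "x"), "y"), ("s", ("add", "x", "y"))),
    (("mul", ("s", "x"), "y"), ("add", "y", ("mul", "x", "y"))),
    (("exp", "x", ("s", "y")), ("mul", "x", ("exp", "x", "y"))),
]
prec = chain("exp", "mul", "add", "s", "0")
print(all(lpo_greater(l, r, prec) for l, r in EXAMPLE_4_13))
```

With n function symbols the search inspects up to n! precedences, which is fine for small specifications but grows quickly.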
In case lpo-3 in the definition of lpo, the immediate subterms (t1, ..., tn) and
(u1, ..., un) are compared lexicographically. The multiset path order (mpo) is the
same as lpo except that (t1, ..., tn) and (u1, ..., un) are compared as multisets. That
is, mpo is defined as lpo, except that the condition lpo-3 is replaced by
mpo-3: If {t1, ..., tn} ≻^ms_mpo {u1, ..., un} (where ≻^ms_mpo is the “multiset extension”
of ≻mpo), then f(t1, ..., tn) ≻mpo f(u1, ..., un).
mpo and lpo are incomparable: only lpo can prove that { f (a, b) = f (b, a)} is
terminating, and only mpo can prove that {g(x, a) = g(b, x)} is terminating.
As already mentioned, the main difference between “weight functions” and the path
orders lpo and mpo is that the former are custom-defined for each specification—
requiring ingenuity as well as possibly complex proofs of their suitability for prov-
ing termination—whereas the latter are automatic and ready to use.
Intuitively, the path orders seem fairly powerful. They can prove termination of
specifications such as { f (s(y), x, z) = f (y, x + y + z, z + z)}, for which it seems hard
to define a “standard” weight function (try!), and the Ackermann function, whose
termination cannot be proved by a polynomial weight function.7
The inherent weakness of simplification orders is that they cannot prove termi-
nation of self-embedding systems. The path orders lpo and mpo also cannot prove
the termination of a system like
whereas their termination can be proved by trivial weight functions such as the size
of the term. (If different function symbols can be regarded as the same in the precedence,
then the above system can be shown terminating using lpo/mpo. However, in
that case, the system
cannot be proved using mpo, lpo, or their extensions mentioned above, whereas it
can easily be proved terminating using weight functions (Exercise 60).)
If a finite specification can be proved terminating using a simplification order
≻, it can also be proved terminating using a weight function into a well-founded
domain (S, >): the domain S is the set of ground terms TΣ, the weight function is
the identity function, and the comparison operator > is the order ≻.
In the other direction, a monotonic weight function weight : TΣ → S (with comparison
operator >s) that satisfies the property
weight(f(t1, ..., tn)) >s weight(ti)
for each function symbol f and all ti induces a simplification order ≻weight on ground
terms defined by t ≻weight u if and only if weight(t) >s weight(u).
7 A polynomial weight function is one where weight(f(t1, ..., tn)) is defined as a polynomial in
weight(t1), ..., weight(tn) for each function symbol f.
Exercise 62 Show that the last two equations in Example 4.13 are ≻lpo-decreasing
with the given precedence ≻.
{ ack(0, x) = s(x),
ack(s(x), 0) = ack(x, s(0)),
ack(s(x), s(y)) = ack(x, ack(s(x), y))}
Exercise 64 Can lpo prove that {h( f ( f (x))) = h( f (g( f (x))))} is terminating?
Exercise 65 Why is the condition f(t1, ..., tn) ≻lpo ui, for each 2 ≤ i ≤ n, needed in
the case lpo-3 in the definition of lpo? That is, show a nonterminating specification
whose equations would be ≻lpo-decreasing without this condition.
Exercise 67 Use lpo to prove that the specification of binary trees that you defined
in Exercise 13 is terminating.
Exercise 68 Consider the specification { f (a) = g(b), g(a) = f (b), f (x) = a}.
1. Show that the specification cannot be proved terminating using lpo or mpo if
different function symbols cannot have the “same precedence” in ≻.
2. Use lpo to prove termination of the specification if two function symbols may
have the same precedence in ≻.
Exercise 70 The order ≻o extends a total strict partial order ≻ on the (finite) set
of function symbols, and is defined by t ≻o u if and only if the list (number of
occurrences in t of the ≻-greatest function symbol, ..., number of occurrences in
t of the ≻-smallest function symbol) is lexicographically greater than the corresponding
list for u. For example, if f ≻ g ≻ a, then g(f(f(a)), f(f(g(a, a)))) ≻o
f(g(g(f(a), g(a, g(a, f(a)))), a)), since (4, 2, 3) >lex (3, 4, 5).
1. Is ≻o well-founded?
2. Is ≻o a simplification order?
3. Is there a specification that can be proved terminating using ≻o, but that cannot
be proved terminating using lpo?
4. How can ≻o deal with variables?
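The occurrence lists in the example of Exercise 70 can be recomputed with a short script. This is only a sketch for ground terms, with terms as nested tuples (an illustration choice); it does not answer the questions of the exercise. Python compares tuples lexicographically, which is exactly the comparison >lex used here.

```python
def occurrences(term, symbol):
    """Number of occurrences of a function symbol in a ground term."""
    head, *args = term
    return (1 if head == symbol else 0) + sum(occurrences(a, symbol) for a in args)

def profile(term, precedence):
    """Occurrence counts, listed from the greatest to the smallest symbol."""
    return tuple(occurrences(term, symbol) for symbol in precedence)

def greater_o(t, u, precedence):
    """t >o u: the profile of t is lexicographically greater than that of u."""
    return profile(t, precedence) > profile(u, precedence)

A = ("a",)
PRECEDENCE = ["f", "g", "a"]                       # f > g > a

# t = g(f(f(a)), f(f(g(a, a))))
t = ("g", ("f", ("f", A)), ("f", ("f", ("g", A, A))))
# u = f(g(g(f(a), g(a, g(a, f(a)))), a))
u = ("f", ("g", ("g", ("f", A), ("g", A, ("g", A, ("f", A)))), A))

print(profile(t, PRECEDENCE), profile(u, PRECEDENCE), greater_o(t, u, PRECEDENCE))
```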
Exercise 73 In this exercise we implement lpo in Maude. We first define a data type
for representing equational specifications. A term is represented by a term of sort
Term. Such a term is either a constant, a variable, or a function symbol applied to
a list of terms, so that, e.g., the term f (a, g(b)) is represented by f[a, g[b]]:
sorts FuncSymbol VarSymbol .
ops a ack b c d f g h s 0 + * - v w . . . : -> FuncSymbol [ctor] .
ops x x1 x2 x3 x4 x5 y y1 y2 y3 y4 y5 . . . : -> VarSymbol [ctor].
The equations specifying the extremely fast-growing Ackermann function are then
represented by the following term of sort EquationSet:
eq ack[0, x] = s[x] .
eq ack[s[x], 0] = ack[x, s[0]] .
eq ack[s[x], s[y]] = ack[x, ack[s[x], y]] .
1. Define a function
op _>>_in_ : FuncSymbol FuncSymbol Precedence -> Bool .
such that f >> g in P equals true if and only if f is greater than g in the
precedence P. (Hint: it might be useful to extend this function to variables.)
2. Define a function
op lpoTerm : EquationSet Precedence -> Bool .
that checks whether a given set of equations can be proved terminating using
lpo with the given precedence. For example,
red lpoTerm(eq f[a, b, a] = f[a, b, b] .
eq f[a, a, b] = f[a, b, a] .
eq f[b, b, f[a, b, a]] = f[a, b, a] ., f >> a >> b) .
should return true while
red lpoTerm(eq f[a, b, a] = f[a, b, b] .
eq f[a, a, b] = f[a, b, a] .
eq f[b, b, f[a, b, a]] = f[a, b, a] ., f >> b >> a) .
should return false. Test your specification extensively in Maude.
3. Define a function
op lpoTerm : EquationSet -> Bool .
that returns true if there exists a precedence such that the given equations can
be proved terminating using lpo. Hint: it might be useful to recall Exercise 34.
Example 5.1. Both equations in { f ( f (x)) = g(x), a = b} can be applied to the term
f ( f ( f (a))); the first equation can be applied both in position ε and in position 1. ♦
This chapter also considers only unsorted specifications without conditional equa-
tions and operator attributes. We first recall the definition of confluence:
Confluence means that if t can be reduced to two different terms t1 and t2 (for
instance by applying different equations to t), we can always “join” t1 and t2 by
reducing both to a common term u. This property is shown in Fig. 5.1 (left), where
a solid arrow means “for all ⇝*” and a dashed arrow means “there exists ⇝*”.
© Springer-Verlag London 2017
P.C. Ölveczky, Designing Reliable Distributed Systems, Undergraduate Topics
in Computer Science, DOI 10.1007/978-1-4471-6687-0_5
We still have to address issue (ii) above: reducing the check of local confluence
to a finite number of “start terms” t. For that, we introduce the notion of unification.
5.1 Unification
Definition 5.3 (Unifier) A unifier of two terms t and u is a substitution σ such that
tσ = uσ.
Example 5.3. f(x, h(b)) and f(h(y), z) have a unifier σ = {x → h(y), z → h(b)}.
Any instance of σ, such as σ′ = {x → h(f(f(a, a), a)), y → f(f(a, a), a), z → h(b)},
is also a unifier. On the other hand, f(g(x)) and f(h(z)) have no unifier (why not?);
neither has the pair f(x) and g(y), nor the pair f(x) and f(g(x)).
Example 5.3 shows that two terms can have many unifiers. We are interested
in finding the most general unifier (mgu), which is a unifier ρ such that all other
unifiers σ are “instances” of ρ . That is, ρ is an mgu of a pair of terms if for each
unifier σ of the pair, there is a substitution π such that σ = π ◦ ρ , where ◦ denotes
function composition, i.e., ( f ◦ g)(x) = f (g(x)).
Example 5.3. (cont.) The substitution σ is an mgu of f(x, h(b)) and f(h(y), z). Two
other unifiers of these terms are the above σ′ and σ″ = {x → h(h(h(h(h(z))))),
z → h(b), y → h(h(h(h(z))))}. Both σ′ and σ″ are instances of σ:
Proposition 5.1 If two terms have a unifier, then they have a most general unifier.
Furthermore, the most general unifier is unique up to renaming of the variables.
unification problem we want to solve and ρ is the identity (the substitution which
“does nothing”). The algorithm proceeds by applying the following steps until it
returns <Not unifiable> or the desired mgu:
1. Return <Not unifiable> if UP contains a unification problem of the form
f(t1, ..., tn) =? g(u1, ..., um), for m, n ≥ 0, where f ≠ g. (Obviously there is no
unifier for this unification (sub)problem.)
2. If UP has the form²
{f(t1, ..., tn) =? f(u1, ..., un)} ⊎ UP′
then we must find unifiers for t1 =? u1, and ..., and tn =? un. That is,
UP is replaced by UP′ ∪ {t1 =? u1, ..., tn =? un}.
3. If UP contains a unification problem of the form t =? t, then just remove this
trivial unification problem from UP.
4. If UP contains a unification problem x =? t (or t =? x) where x and t are different
terms and x occurs in t, then return <Not unifiable>. (For example, the
terms x and f(x) are not unifiable (why not?).)
5. If UP contains a unification problem of the form x =? t (or t =? x) where x and t
are (syntactically) different terms and x does not occur in t, then:
• remove this unification problem from UP,
• apply the substitution {x → t} on all remaining unification problems in UP,
and
• apply the substitution {x → t} on ρ (one effect is that x → t is added to ρ,
since ρ does not contain an assignment of x (why not?) and hence has x → x).
6. If UP is empty, then return ρ, which is the desired mgu.
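The six steps translate almost directly into a small program. The following is a sketch rather than a reference implementation: variables are Python strings, other terms are nested tuples, None plays the role of <Not unifiable>, and the arity check in step 1 is an added safeguard that the steps above do not mention.

```python
# A variable is a string; f(t1, ..., tn) is the tuple ("f", t1, ..., tn).

def apply(subst, term):
    """Apply a substitution (dict from variables to terms) to a term."""
    if isinstance(term, str):
        return subst.get(term, term)
    head, *args = term
    return (head, *(apply(subst, arg) for arg in args))

def occurs(x, term):
    """True if the variable x occurs in term."""
    if isinstance(term, str):
        return term == x
    return any(occurs(x, arg) for arg in term[1:])

def mgu(t, u):
    """Most general unifier of t and u as a dict, or None if not unifiable."""
    problems = [(t, u)]                      # the set UP
    rho = {}                                 # the identity substitution
    while problems:                          # step 6 applies when UP is empty
        left, right = problems.pop()
        if left == right:                    # step 3: trivial problem
            continue
        if isinstance(right, str) and not isinstance(left, str):
            left, right = right, left        # read t =? x as x =? t
        if isinstance(left, str):
            if occurs(left, right):          # step 4: occurs check
                return None
            binding = {left: right}          # step 5: eliminate the variable
            problems = [(apply(binding, a), apply(binding, b)) for a, b in problems]
            rho = {x: apply(binding, v) for x, v in rho.items()}
            rho[left] = right
            continue
        if left[0] != right[0] or len(left) != len(right):
            return None                      # step 1: different function symbols
        problems.extend(zip(left[1:], right[1:]))   # step 2: decompose
    return rho

B = ("b",)
print(mgu(("f", "x", ("h", B)), ("f", ("h", "y"), "z")))    # terms of Example 5.3
print(mgu(("f", "x", ("h", "x")), ("f", ("h", "y"), "z")))  # terms of Example 5.4
print(mgu("x", ("f", "x")))                                 # occurs check fails
```

For the terms of Example 5.3 the result is the substitution σ given there; for x and f(x) the occurs check rejects the problem.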
Example 5.4. Let’s find the mgu of the pair f (x, h(x)) and f (h(y), z) using the
algorithm: We start with
({f(x, h(x)) =? f(h(y), z)}, Id)
2 The symbol ⊎ denotes disjoint union, which means that f(t1, ..., tn) =? f(u1, ..., un) does not
appear in UP′.
The unification algorithm is correct and terminating (see, e.g., [105] for proof).
Exercise 77 Decide whether the following unification problems have unifiers, and
if so, find their mgus:
1. f(x, y) =? g(a, b)
2. f(x, x) =? f(a, b)
3. f(x, y) =? f(a, f(a, b))
4. f(x, y) =? f(g(x), a)
5. f(x, y) =? f(g(y), h(x))
6. f(x, x) =? f(g(y), g(h(z)))
7. f(a, y) =? f(x, b)
Newman’s Lemma means that it is enough to check the confluence property for
all terms t1 ,t2 , . . . reachable in one step from some term t. However, there can be an
infinite number of such start terms t. The next step is therefore to restrict the number
of terms t for which to check local confluence.
Let li = ri and lj = rj be two equations in our specification (they could be the
same equation!), and rename if necessary the variables in lj = rj so that li and lj
do not have any variables in common. Let p be a position in li so that li|p is not
a variable. If li|p and lj are unifiable with mgu ρ, then the term liρ may reduce to
riρ (by applying li = ri at the top (position ε)). The term liρ may also reduce to
(liρ)[rjρ]p (by applying lj = rj at position p). That is,
liρ ⇝ riρ and liρ ⇝ (liρ)[rjρ]p.
To check local confluence we check whether the critical pair (riρ, (liρ)[rjρ]p) is
joinable (that is, whether there is a term u such that riρ ⇝* u and (liρ)[rjρ]p ⇝* u).
This has to be done for all pairs of equations (including two copies of the same
equation), and for all positions, and then we have checked local confluence:
Example 5.5. Let us check whether {f(f(x)) = g(x)} is confluent. The only pair of
equations is (f(f(x)) = g(x), f(f(x)) = g(x)). Since they share x, we rename one
of them to f(f(x′)) = g(x′) and check the pair (f(f(x)) = g(x), f(f(x′)) = g(x′)).
Now, li is f(f(x)) and we need to check all non-variable subterms of f(f(x)) for an
overlap with f(f(x′)). The non-variable subterms of f(f(x)) are f(f(x)) and f(x),
and there is no need to check the trivial overlap with f(f(x)).
Therefore, the only potentially interesting case happens if the subterm f(x)
(which is the subterm at position 1 of f(f(x))) and f(f(x′)) are unifiable. Are they?
Yes, with mgu ρ = {x → f(x′)}. The resulting “overlap term” liρ = f(f(f(x′))) can
be reduced to g(f(x′)) by using the first equation at the top, and to f(g(x′)) by using
the second equation at position 1. We then need to check whether the critical pair
(g(f(x′)), f(g(x′))) is joinable. Since neither g(f(x′)) nor f(g(x′)) can be further
reduced, they are not joinable. Therefore, the specification is not confluent, since
f(f(f(x′))) ⇝ g(f(x′)) and f(f(f(x′))) ⇝ f(g(x′)), but there is no term u such
that both g(f(x′)) ⇝* u and f(g(x′)) ⇝* u. ♦
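The two normal forms of the overlap term can also be found by brute force. The sketch below hard-codes the single equation f(f(x)) = g(x), represents terms as nested tuples, and uses a hypothetical constant c in place of the variable of the overlap term; it enumerates all one-step reducts and collects every reachable normal form.

```python
def reducts(term):
    """All terms reachable in one step with f(f(x)) = g(x), at any position."""
    head, *args = term
    result = []
    if head == "f" and args[0][0] == "f":          # redex at the top position
        result.append(("g", args[0][1]))
    for i, arg in enumerate(args):                 # redexes inside the arguments
        result += [(head, *args[:i], r, *args[i + 1:]) for r in reducts(arg)]
    return result

def normal_forms(term):
    """Set of all normal forms reachable from term (the system is terminating)."""
    successors = reducts(term)
    if not successors:
        return {term}
    return set().union(*(normal_forms(r) for r in successors))

C = ("c",)                                         # stands in for the variable
overlap = ("f", ("f", ("f", C)))                   # f(f(f(c)))
print(reducts(overlap))
print(normal_forms(overlap))
```

Finding two distinct normal forms for one term is enough to conclude that the specification is not confluent.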
3 Instead of checking joinability directly, one can find some normal forms of the two terms. If they
have the same normal forms, they are obviously joinable; if not, the specification is obviously not
confluent, since the “overlap term” liρ has two different normal forms.
one must check that the resulting specification is confluent (and terminating), since
the new equation could lead to new non-joinable critical pairs.
G = {e ◦ x = x, i(x) ◦ x = e, (x ◦ y) ◦ z = x ◦ (y ◦ z)}.
Exercise 81 Prove that the specification in Example 5.1 extended with the equation
f(g(x)) = g(f(x)) is confluent and terminating. Can you also prove that the specification
in Example 5.1 extended with the equation g(f(x)) = f(g(x)) (the critical
pair in Example 5.5 oriented the other way) is confluent and terminating?
This chapter explains how we can reason about whether two expressions are “logi-
cally equivalent” in a specification E. We consider two different notions of what it
means that two terms t and u (which may contain variables) are logically equivalent:
1. t = u follows from the equations in E, without considering the signature of E.
2. t = u follows from the equations in E, but taking the signature into account, in
the sense that t and u are “equivalent” if and only if t and u are equivalent for
each ground instance: t σ = uσ for each ground substitution σ .
Let us start with the first notion. Many mathematical theories, such as the theory
of groups, the theory of rings, etc., can be defined by giving a set of equations as the
axioms of the theory. Given two terms t and u, a mathematician may be interested in
whether the equivalence t = u “follows logically” from the equations. For example,
do x ◦ i(x) = e and i(i(x)) = x hold in all groups? That is, do they follow logically
from the group axioms {e ◦ x = x, i(x) ◦ x = e, (x ◦ y) ◦ z = x ◦ (y ◦ z)}?
Chapter 7 defines what “follows logically from a set of equations” means:
t = u follows from E if and only if t = u is true in all possible mathematical structures/models
where the equations E hold. For example, x ◦ i(x) = e follows logically
from the group axioms if and only if x ◦ i(x) = e holds in all groups, that is, in all
mathematical structures satisfying the group axioms. The problem is of course that
it is impossible to explicitly check every structure satisfying E to figure out whether
an equality t = u holds in all of them. This chapter therefore introduces equational
logic as a way to reason about whether an equality t = u “follows logically” from an
equational specification E: t and u are logically equivalent if and only if t = u can
be deduced from the equations E using the rules of equational logic. The point is
that we can use equational logic reasoning instead of checking whether t = u holds
in all E-structures, since, as shown in Chapter 7, t = u follows from E in equational
logic if and only if t = u holds in all structures where the equations E hold.
For general theories such as groups, rings, and so on, reasoning about equalities
that hold in all structures is exactly what we want. However, quite often we are not
interested in all E-structures, but only in the intended structure. When reasoning
about NAT-ADD, we are not interested in studying whether something holds in all
systems satisfying the two equations 0 + M = M and s(M) + N = s(M + N); we are
only interested in whether something holds for the natural numbers.
© Springer-Verlag London 2017
P.C. Ölveczky, Designing Reliable Distributed Systems, Undergraduate Topics
in Computer Science, DOI 10.1007/978-1-4471-6687-0_6
For example, addition on natural numbers is commutative, so it should be the
case that m + n = n + m holds in NAT-ADD for all “natural numbers” m and n.1 Like-
wise, to increase our confidence that we have specified lists and trees correctly, we
want to verify that expected properties such as reverse(reverse(bt)) = bt and
length(concat(l1 , l2 )) = length(l1 ) + length(l2 ) are logical consequences of our
specifications for all binary trees bt and all lists l1 and l2 .
It turns out that M + N = N + M (for variables M and N) does not follow logically
from the equations 0 + M = M and s(M) + N = s(M + N), since it does not hold in all
structures satisfying the two equations (just add a new constant a to NAT-ADD; then
a + 0 cannot be reduced and is therefore different from 0 + a). However, M + N = N + M
holds in the intended NAT-ADD-structure, in the sense that it holds for all instances
where M and N are instantiated with the numerals constructed by 0 and s. That is,
m + n = n + m holds for all constructor ground terms m and n of sort NAT.
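Both observations can be replayed with a tiny normalizer for the two NAT-ADD equations. This is an illustrative sketch with terms as nested tuples; the constant a is the extra constant from the argument above and is not part of NAT-ADD itself.

```python
ZERO = ("0",)
A = ("a",)                      # extra constant that is not a numeral

def s(t):
    return ("s", t)

def add(t, u):
    return ("+", t, u)

def numeral(n):
    """The constructor term s(s(...s(0)...)) with n occurrences of s."""
    term = ZERO
    for _ in range(n):
        term = s(term)
    return term

def normalize(term):
    """Normal form w.r.t. 0 + M = M and s(M) + N = s(M + N), innermost first."""
    head, *args = term
    args = [normalize(arg) for arg in args]
    if head == "+":
        left, right = args
        if left == ZERO:
            return right                                   # 0 + M = M
        if left[0] == "s":
            return s(normalize(add(left[1], right)))       # s(M) + N = s(M + N)
    return (head, *args)

print(normalize(add(A, ZERO)))      # a + 0 is stuck
print(normalize(add(ZERO, A)))      # 0 + a reduces to a
print(all(normalize(add(numeral(m), numeral(n))) == normalize(add(numeral(n), numeral(m)))
          for m in range(6) for n in range(6)))
```

The last line only tests finitely many numerals, so it illustrates the inductive theorem but does not prove it.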
Equalities that only hold in the intended structure, whose data elements are constructor
ground terms, are called inductive theorems. Chapter 7 formally defines
what we mean by the “intended” model of a specification and explains that an inductive
theorem holds in this intended structure.
Section 6.1 introduces equational logic and Section 6.2 shows how to prove inductive
theorems. We assume in Section 6.1 that (Σ, E) is an unsorted specification
without conditional equations, and that Σ contains at least one constant.
We write E ⊢ t = u for the sequent which means that the equality t = u can be proved
in equational logic to follow logically from the equations E.
Definition 6.1 (Equational logic) For an unsorted equational specification (Σ, E)
(without conditional equations), we write E ⊢ t = u, for terms t, u ∈ TΣ(X), if and
only if E ⊢ t = u can be derived by a finite number of applications of the following
axiom schemas and deduction rules of equational logic:
E1 (Substitutivity): The sequent E ⊢ lσ = rσ holds for any equation l = r in E
and any substitution σ.
E2 (Reflexivity): E ⊢ t = t holds for any term t.
E3 (Symmetry): If E ⊢ t = u holds, then E ⊢ u = t holds.
E4 (Transitivity): If E ⊢ t1 = t2 and E ⊢ t2 = t3 both hold, then E ⊢ t1 = t3 holds.
E5 (Congruence): If E ⊢ t1 = u1, ..., and E ⊢ tn = un all hold, then
E ⊢ f(t1, ..., tn) = f(u1, ..., un) holds for each function symbol f of arity n.
Reasoning with these kinds of logics, or deduction systems, may take some time
getting used to. The basic facts that we can start each deduction with are that we can
deduce E ⊢ tσ = t′σ for each equation t = t′ in E and each substitution σ, and that
we can deduce E ⊢ t = t for each term t. From these basic facts, we can then use the
deduction rules of equational logic to deduce new facts, as exemplified below.
Example 6.1. For E the equations {f(x) = g(x), a = b, g(c) = c}, we can prove
E ⊢ b = a as follows:
1. By Substitutivity we can prove that E ⊢ a = b, since a = b is an equation in E.
The substitution is of course just the empty substitution.
2. Now we have proved E ⊢ a = b. The deduction rule Symmetry says that if E ⊢
a = b holds, then so does E ⊢ b = a. That’s all! We have proved the unsurprising
fact that b = a follows logically from the above equations E.
Does f(a) = g(b) follow logically from E? That is, can we prove E ⊢ f(a) = g(b)?
1. E ⊢ a = b holds because of Substitutivity, since a = b is an equation in E.
2. Since E ⊢ a = b, the Congruence rule says that then E ⊢ f(a) = f(b) also holds.
3. By Substitutivity w.r.t. the equation f(x) = g(x) and substitution σ = {x → b},
we also have E ⊢ f(b) = g(b).
4. Since both E ⊢ f(a) = f(b) and E ⊢ f(b) = g(b) hold, the Transitivity rule then
says that E ⊢ f(a) = g(b) also holds. This is what we wanted to prove. Q.E.D.
The above proof of E ⊢ f(a) = g(b) can be summarized in the following shorter
form (where E1 denotes Substitutivity, E2 denotes Reflexivity, and so on):
1. E ⊢ a = b (E1; equation a = b)
2. E ⊢ f(a) = f(b) (E5; from 1)
3. E ⊢ f(b) = g(b) (E1; equation f(x) = g(x))
4. E ⊢ f(a) = g(b) (E4; from 2, 3)
Each line in such a deduction/proof must be justified, either by following directly
from Substitutivity or Reflexivity, or by following from claims which have already
been justified and one of the deduction rules Symmetry, Transitivity, or Congruence.
A graphical representation of the same proof shows the deductions used, with the
assumptions above the line and the conclusion below it. Such proofs must start with
instances of Substitutivity or Reflexivity. The proof of E ⊢ f(a) = g(b) can be given
as the following proof tree:
---------- Substitutivity
E ⊢ a = b
---------------- Congruence    ---------------- Substitutivity
E ⊢ f(a) = f(b)                E ⊢ f(b) = g(b)
------------------------------------------------ Transitivity
E ⊢ f(a) = g(b) ♦
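The same four-step deduction can be replayed in a proof assistant. The following Lean 4 sketch is only an analogy: it treats the three equations of E as hypotheses about arbitrary functions f and g on some type α (the names h₁, h₂, h₃ are chosen here), and it uses Lean's built-in equality, whose congruence and transitivity lemmas play the roles of E5 and E4.

```lean
-- Hypotheses h₁, h₂, h₃ stand for the equations f(x) = g(x), a = b, g(c) = c.
example {α : Type} (f g : α → α) (a b c : α)
    (h₁ : ∀ x, f x = g x) (h₂ : a = b) (h₃ : g c = c) :
    f a = g b :=
  -- congrArg f h₂ : f a = f b   (Congruence, from a = b)
  -- h₁ b          : f b = g b   (Substitutivity with x ↦ b)
  Eq.trans (congrArg f h₂) (h₁ b)  -- Transitivity
```

As in the proof tree, the equation g(c) = c is not needed.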
Example 6.2. NAT-ADD ⊢ s(s(0)) + s(0) = s(0) + s(s(0)) holds because it can
be derived as follows in equational logic²:
2 In this chapter, M ⊢ t = t′ denotes eqs(M) ⊢ t = t′ when M is a module name and eqs(M) are the
To prove that an equality follows logically from a set of equations you “just”
need to give a sequence of deductions leading to the desired equality. However, it is
in principle impossible to say that something, like E ⊢ f(a) = f(c), does not hold.
We can only say something like “I have tried a bunch of deductions and I still could
not prove E ⊢ f(a) = f(c).” But this could in principle be either because E ⊢ f(a) =
f(c) does not hold, or because you are not clever enough using the deduction rules.
Fortunately, Theorem 6.5 shows that it is easy to prove that “E ⊢ t = t′ does not
hold,” written E ⊬ t = t′, when the equations E are terminating and confluent.
Another way of proving E ⊬ t = u is to come up with a mathematical structure
satisfying E, but where t = u does not hold. This is because E ⊢ t = u holds if and
only if t = u holds in all “structures” where the equations E hold.
Example 6.3. It seems obvious that s(0) = 0 should not follow logically from the
equations in NAT-ADD. But how can we prove that? The equations in NAT-ADD all
hold for the natural numbers, where 0 is supposed to mean the number 0, s(n) is
supposed to mean 1 plus the interpretation of n, and + is supposed to mean addition
on natural numbers. Therefore, all equalities that follow from NAT-ADD must hold
for the natural numbers. s(0) = 0 does not hold for the natural numbers since 1 ≠ 0,
and we can conclude that s(0) = 0 does not follow logically from NAT-ADD. ♦
The following theorem may not come as a major surprise after seeing how difficult
it is to deduce NAT-ADD ⊢ s(s(0)) + s(0) = s(0) + s(s(0)):
E ⊢ t = u
Proof. This result can be proved in different ways. A well-known proof uses the
fact due to Matiyasevich [77] that it is in general undecidable even for ground terms
t and u whether t = u follows from the specification3
{x ◦ (y ◦ z) = (x ◦ y) ◦ z,
a◦a◦a◦a◦b◦b = b ◦ b ◦ a ◦ a ◦ b ◦ a,
a◦a◦b◦a◦b◦b◦a = b ◦ b ◦ a ◦ a ◦ a ◦ b ◦ a,
a◦b◦a◦a◦a◦b◦b = a ◦ b ◦ b ◦ a ◦ b ◦ a ◦ a,
b◦b◦b◦a◦a◦b◦b◦a◦a◦b◦a = b ◦ b ◦ b ◦ a ◦ a ◦ b ◦ b ◦ a ◦ a ◦ a ◦ a,
a◦a◦a◦a◦b◦b◦a◦a◦b◦a = b ◦ b ◦ a ◦ a ◦ a ◦ a}.
This book has primarily dealt with equational reduction (“applying an equation”).
The following theorem says that equational reduction and equational logic deduction
can be seen as the same thing:
Theorem 6.2 For any set E of equations and terms t, u we have
E ⊢ t = u if and only if t ↔* u.
Proof. We prove this theorem by first proving that E ⊢ t = u implies t ↔* u, and
then we prove the other direction, t ↔* u implies E ⊢ t = u.
Since E ⊢ t = u by definition means that E ⊢ t = u can be derived by a finite
number of applications of the deduction rules of equational logic, we can prove
the “E ⊢ t = u implies t ↔* u” part by induction on the number of deduction steps
needed to prove E ⊢ t = u.
• Base case: one application of an axiom of equational logic proves E ⊢ t = u. This
means that either Substitutivity or Reflexivity was used to prove E ⊢ t = u.
– Assume that Reflexivity was used to prove E ⊢ t = u. Then t and u are the same
term, which means that we need to prove that t ↔* t, which follows from the
definition of ↔* on page 62.
– Assume that Substitutivity was used to prove E ⊢ t = u. This means that there
is an equation l = r in E and a substitution σ such that t = lσ and u = rσ. We
therefore need to prove lσ ↔* rσ. It follows directly from the definition of a
reduction step on page 62 that lσ ↔ rσ; this in turn implies that the desired
lσ ↔* rσ holds by the definition of ↔*.
• Induction step: Assume that E ⊢ t = u has been proved using n + 1 deduction
steps. The induction hypothesis is then that E ⊢ t = u implies t ↔* u if E ⊢
t = u can be proved using n deduction steps or less.
– Assume that the Transitivity rule was used in the last step in the proof of
E ⊢ t = u. That is, we have a proof of the form
    ⋮            ⋮
E ⊢ t = v    E ⊢ v = u
------------------------ Transitivity
E ⊢ t = u
Since this proof uses n + 1 applications of the rules and axioms of equational
logic, both E ⊢ t = v and E ⊢ v = u can be proved in n steps or less. The
98 6 Equational Logic
t ↔*_E t′ implies E ⊢ t = t′
Theorem 6.3 It is undecidable whether t ↔*_E u holds, even for ground terms t and u.
Proof. Let Ê, for any E, contain each equation l = r in E, and its symmetric version
r = l. Then t ↔*_E u if and only if t →*_Ê u.
E ⊢ t = u if and only if t! = u!
We have seen that it is easy to decide whether t and u are logically equivalent in
terminating and confluent specifications. Therefore, if our specification is not
terminating and confluent, we could try to turn it into a terminating and confluent
specification that does not change the meaning of the original specification.
G = {e ◦ x = x, i(x) ◦ x = e, (x ◦ y) ◦ z = x ◦ (y ◦ z)}
{e ◦ x = x, i(x) ◦ x = e, (x ◦ y) ◦ z = x ◦ (y ◦ z),
i(x) ◦ (x ◦ y) = y, x ◦ e = x, i(e) = e,
i(i(x)) = x, x ◦ i(x) = e, x ◦ (i(x) ◦ y) = y, i(x ◦ y) = i(y) ◦ i(x)}
that can be used to decide whether t = u holds in all groups [105]. That is, although
it is in general undecidable whether E ⊢ t = u, this problem becomes decidable if
E can be transformed into an equivalent confluent and terminating specification E′.
This example therefore shows that equality in the theory of groups is decidable. ♦
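The decision procedure described above can be tried out directly in Maude. The following is a minimal sketch, not taken from the book: the module name GROUP-CANONICAL, the sort name Elem, the ASCII operator _o_ (standing in for ◦), and the generic constants a and b (standing in for variables) are all assumptions made for illustration. The equations are the ten equations of the confluent and terminating specification listed above.

```maude
--- Sketch: the ten oriented group equations as a functional module.
fmod GROUP-CANONICAL is
  sort Elem .
  op e : -> Elem .
  op i : Elem -> Elem .
  op _o_ : Elem Elem -> Elem .
  ops a b : -> Elem .        --- hypothetical generic constants
  vars X Y Z : Elem .
  eq e o X = X .
  eq i(X) o X = e .
  eq (X o Y) o Z = X o (Y o Z) .
  eq i(X) o (X o Y) = Y .
  eq X o e = X .
  eq i(e) = e .
  eq i(i(X)) = X .
  eq X o i(X) = e .
  eq X o (i(X) o Y) = Y .
  eq i(X o Y) = i(Y) o i(X) .
endfm

--- Both sides reduce to the same normal form, so the equality holds in all groups.
red i(a o b) o (a o b) == e .
red (a o b) o i(b) == a .
--- The normal forms a o b and b o a differ, so commutativity is not derivable.
red a o b == b o a .
```

The first two commands should return true and the last one false. Since the specification is confluent and terminating, comparing normal forms in this way is enough to decide each equality.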
You do not need to re-prove something you have already proved, if you need that fact
later. For instance, we have already proved in Example 6.1 that E ⊢ f(a) = g(b). If
you need this fact, you can just use it.
2. Prove that
BOOLEAN ⊢ t implies X = (not t) or X
Exercise 89 Explain that if E ⊢ t = u, then it is also the case that E′ ⊢ t = u for any
extension E′ of E, that is, for any set E′ of equations such that E ⊂ E′.
(†) NAT-ADD ⊢ M + N = N + M
does not hold, for variables M and N, since this equality does not hold in all
structures satisfying the equations in NAT-ADD. One such structure adds a constant a
to NAT-ADD; addition is not commutative in this structure since a + 0 ≠ 0 + a.
(Another way to prove NAT-ADD ⊬ M + N = N + M is to consider their normal forms; since
4 We assume in this section that our specifications are sufficiently complete (see Section 2.3.4); that
is, each ground term reduces to some constructor ground term.
5 This undecidability result implies that for any sound and finitary proof system PS for the natural
numbers with addition and multiplication, there are polynomials p1 and p2 over variables x1, ..., xn
(and nonnegative coefficients) such that (∀x1, ..., xn) p1(x1, ..., xn) ≠ p2(x1, ..., xn) holds for the
natural numbers but is not provable in PS. However, this formula is an inequality, whereas our
inductive theorems are equalities. We must introduce another function, such as either equality == :
Nat Nat → Nat, defined by 0 == s(x) = 0, s(x) == 0 = 0, 0 == 0 = s(0), s(x) == s(y) = x == y (this
is our usual function ==, but to keep within a one-sorted framework it returns 0 instead of false
and s(0) instead of true) or the “monus” function in Exercise 9. The unprovable formula that
holds for the natural numbers then becomes (∀x1 , . . . , xn ) p1 (x1 , . . . , xn ) == p2 (x1 , . . . , xn ) = 0 and
s(0) monus ((p1 (x1 , . . . , xn ) monus p2 (x1 , . . . , xn ))+(p2 (x1 , . . . , xn ) monus p1 (x1 , . . . , xn ))) = 0,
respectively. Since the natural numbers with addition and multiplication are the intended structure
for a specification like NAT-MULT, there is no optimal proof system for NAT-MULT extended with
monus or ==. Therefore, there is no optimal proof system for inductive theorems in general.
6.2 Inductive Theorems 103
NAT-ADD ⊢ind x + 0 = x.
                                         Induction hyp.
                                     NAT-ADD ⊢ t + 0 = t
  Subst.                           ---------------------------- Congr.
  NAT-ADD ⊢ s(t) + 0 = s(t + 0)    NAT-ADD ⊢ s(t + 0) = s(t)
  ------------------------------------------------------------- Transitivity
                  NAT-ADD ⊢ s(t) + 0 = s(t)
An important remark is that, since E ⊢ u = v is the same as u ↔*_E v by
Theorem 6.2, we can reason in terms of (two-way) reductions instead of equational
deductions, which is usually more convenient:
s(t) + 0 → s(t + 0) → s(t), where the second step uses the induction hypothesis.
This proves that NAT-ADD ⊢ t + 0 = t holds for all constructor ground terms t and
hence that NAT-ADD ⊢ind x + 0 = x.
We can also formalize the “generic constant” t and the induction hypothesis in
Maude, and use Maude to prove the two steps:
fmod NAT-ADD-IND-PROOF is
  including NAT-ADD .
  op t' : -> Nat .      --- generic constant for induction
  eq t' + 0 = t' .      --- induction hypothesis
endfm
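The two proof steps can then be checked with reduce commands. The following self-contained sketch repeats a version of NAT-ADD whose two equations are assumed from the way they are used in the proofs in this section (0 + M = M and s(M) + N = s(M + N)); the red commands are one plausible way to phrase the base case and the induction step, not necessarily the exact commands intended.

```maude
--- Assumed definition of NAT-ADD (reconstructed from its use in this section).
fmod NAT-ADD is
  sort Nat .
  op 0 : -> Nat [ctor] .
  op s : Nat -> Nat [ctor] .
  op _+_ : Nat Nat -> Nat .
  vars M N : Nat .
  eq 0 + M = M .
  eq s(M) + N = s(M + N) .
endfm

fmod NAT-ADD-IND-PROOF is
  including NAT-ADD .
  op t' : -> Nat .      --- generic constant for induction
  eq t' + 0 = t' .      --- induction hypothesis
endfm

red (0 + 0) == 0 .            --- base case
red (s(t') + 0) == s(t') .    --- induction step
```

Both commands should return true: the base case uses only the equation 0 + M = M, and the induction step uses the second NAT-ADD equation followed by the induction hypothesis.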
A general induction scheme to prove that some property P(t) holds for all con-
structor ground terms t of sort Nat is therefore:
Base case: Prove that P(0) holds.
Induction step: Prove that P(s(t)) holds, when you can assume that the induc-
tion hypothesis P(t) holds. Furthermore, if needed you can assume the stronger
induction hypothesis that P(u) holds for all constructor ground terms u with
depth(u) < depth(s(t)).
Example 6.6. Associativity of addition does not follow from the equations in the
specification NAT-ADD (Exercise 90). However, we can prove that associativity of
addition is an inductive theorem in NAT-ADD, that is, NAT-ADD ⊢ind (x + y) + z = x
+ (y + z). In particular, we can prove NAT-ADD ⊢ (t1 + t2) + t3 = t1 + (t2 + t3) for all
constructor ground terms t1, t2, t3 of sort Nat by induction on the depth of t1:
Base case. t1 is 0, and we need to prove NAT-ADD ⊢ (0 + t2) + t3 = 0 + (t2 + t3)
for all constructor ground terms t2 and t3. Let t2 and t3 be any two constructor
ground terms of sort Nat. Then we have (0 + t2) + t3 → t2 + t3 ← 0 + (t2 + t3),
using the equation 0 + M = M on both sides.
Induction step. Let t1 be s(t). The induction hypothesis that we can assume is
NAT-ADD ⊢ (t + t2) + t3 = t + (t2 + t3) for all constructor ground terms t2, t3 of sort
Nat, and we have to prove NAT-ADD ⊢ (s(t) + t2) + t3 = s(t) + (t2 + t3), which is
left to the reader as an easy exercise.
The proof steps can be represented and performed by Maude as follows:
fmod NAT-ASSOC-IND-PROOF is
  including NAT-ADD .
  ops t t2 t3 : -> Nat .
  eq (t + t2) + t3 = t + (t2 + t3) .   --- induction hypothesis
endfm
The execution of both commands returns true, proving the desired property. ♦
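For reference, here is a self-contained sketch of how the base case and the induction step of this proof could be phrased as reduce commands. The definition of NAT-ADD is again an assumption reconstructed from the equations used in the text, and the two red commands are illustrative rather than quoted.

```maude
--- Assumed definition of NAT-ADD (reconstructed from its use in this section).
fmod NAT-ADD is
  sort Nat .
  op 0 : -> Nat [ctor] .
  op s : Nat -> Nat [ctor] .
  op _+_ : Nat Nat -> Nat .
  vars M N : Nat .
  eq 0 + M = M .
  eq s(M) + N = s(M + N) .
endfm

fmod NAT-ASSOC-IND-PROOF is
  including NAT-ADD .
  ops t t2 t3 : -> Nat .               --- generic constants
  eq (t + t2) + t3 = t + (t2 + t3) .   --- induction hypothesis
endfm

--- Base case: t1 is 0.
red ((0 + t2) + t3) == (0 + (t2 + t3)) .
--- Induction step: t1 is s(t).
red ((s(t) + t2) + t3) == (s(t) + (t2 + t3)) .
```

In the induction step, the left-hand side reduces to s((t + t2) + t3) and then, by the induction hypothesis, to s(t + (t2 + t3)), which is also the normal form of the right-hand side.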
It is not always as easy to prove inductive theorems as in the two examples above.
If there are multiple variables, one may need to choose which one to do the
induction on. For example, it is much harder to prove that associativity of addition
is an inductive theorem in NAT-ADD if you try induction on t2 instead of
t1 (try!). It may even be necessary to do simultaneous induction on the size of pairs
of constructor ground terms (t1, t2) in other cases, and so on.
An important issue is that additional lemmas may be needed in such proofs. Typ-
ically, if you get “stuck” during a proof, you may need to prove some lemma, that is,
a helpful “smaller” inductive theorem, that you can use in the main proof. Indeed,
in the following example, we need the following lemmas:
Lemma 1: NAT-ADD ⊢ t + 0 = t
Lemma 2: NAT-ADD ⊢ s(t1 + t2) = t1 + s(t2)
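One property for which these two lemmas suffice is commutativity of addition, proved by induction on the first argument. The sketch below shows how the lemmas can be added as equations next to the induction hypothesis; it assumes the same reconstruction of NAT-ADD as before, and the module name NAT-COMM-IND-PROOF and the constants t1 and t2 are hypothetical. Lemma 2 is oriented from right to left so that it can be used as a terminating simplification rule.

```maude
--- Assumed definition of NAT-ADD (reconstructed from its use in this section).
fmod NAT-ADD is
  sort Nat .
  op 0 : -> Nat [ctor] .
  op s : Nat -> Nat [ctor] .
  op _+_ : Nat Nat -> Nat .
  vars M N : Nat .
  eq 0 + M = M .
  eq s(M) + N = s(M + N) .
endfm

fmod NAT-COMM-IND-PROOF is
  including NAT-ADD .
  ops t1 t2 : -> Nat .            --- generic constants
  vars M N : Nat .
  eq N + 0 = N .                  --- Lemma 1
  eq N + s(M) = s(N + M) .        --- Lemma 2, used right to left
  eq t1 + t2 = t2 + t1 .          --- induction hypothesis
endfm

red (0 + t2) == (t2 + 0) .            --- base case
red (s(t1) + t2) == (t2 + s(t1)) .    --- induction step
```

Both commands should return true. Without Lemma 1 the base case gets stuck at t2 + 0, and without Lemma 2 the induction step gets stuck at t2 + s(t1), which is the kind of situation that suggests a lemma is needed.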
The induction scheme used to prove inductive theorems of NAT-ADD can be gener-
alized to any data type. Some property P(t) holds for all constructor ground terms t
of sort s if one can prove:
Base case: The depth of t is 1. That is, t is a constant that is a constructor of sort s
(or of a subsort of s, since a constructor for a subsort s′ of s also constructs terms
of sort s). Therefore, we must prove P(c) for all such constructor constants c.
Induction step: The depth of t is n + 1. For each non-constant constructor f of
sort s, or of a subsort of s, one must prove
P(f(t1, ..., tn))
for all constructor ground terms t1, ..., tn. Since the depth of each ti is smaller
than n + 1, we can assume the induction hypothesis P(ti) for each ti of sort s.
More generally, we can assume P(t′) for any constructor ground term t′ of sort s
with depth(t′) ≤ n.
Note that equational logic extended with a deduction rule corresponding to the
above induction scheme in general is not a complete proof system. It is beyond the
scope of this book to present a proof system for inductive theorems, especially since
there are no optimal such proof systems. Instead, a number of examples illustrate
reasoning about inductive properties of different data types.
Example 6.8. A property Q(t) holds for all constructor ground terms t of sort s in
fmod M is
  sorts s s' .
  ops a b : -> s [ctor] .
  ops c d : -> s' [ctor] .
  ops f g : s s' -> s [ctor] .
  op h : s' s s' -> s' [ctor] .
  op k : s s' s -> s [ctor] .
  ops l p : s -> s .
  ops d : s -> s' .
  ... *** variables and equations
endfm
Example 6.9. We prove that the number of elements in a binary tree is the same as
the number of elements in the reversed tree. Recall our definition of binary trees:
fmod BINTREE-NAT1 is ...
sort BinTree .
op empty : -> BinTree [ctor] .
op bintree : BinTree Nat BinTree -> BinTree [ctor] .
ops size weight : BinTree -> Nat .
op reverse : BinTree -> BinTree .
We prove
BINTREE-NAT1 ⊢ size(reverse(t)) = size(t)
assuming both
BINTREE-NAT1 ⊢ size(reverse(t1)) = size(t1)
and
BINTREE-NAT1 ⊢ size(reverse(t2)) = size(t2).
The Maude execution of the first returns true; however, the second command
gives the result (size(t2) + size(t1)) == (size(t1) + size(t2)). Assuming
that our specifications are well-defined (i.e., sufficiently complete), both size(t1)
and size(t2) are natural numbers, and by the previously proven commutativity
property of addition on natural numbers, both sides are the same. ♦
for all constructor ground terms l and l′ of sort List. You can assume that all
functions are well-defined (i.e., the specification is sufficiently complete): each
ground term reduces to a constructor ground term. You will also need to use
lemmas that have already been proved, such as NAT-ADD ⊢ind x + 0 = x and
NAT-ADD ⊢ind x + y = y + x. Hint: prove the property by induction on l.
Exercise 96 Define the function reverse on lists in LIST-NAT1 (or recall your
solution from Exercise 10) and prove that you get the original list back if you reverse
a list twice: LIST-NAT1 ⊢ind reverse(reverse(L)) = L.
The introduction to this book says that the point of formal modeling is to define
a mathematical model of a computer system. However, we have not yet seen the
mathematical model(s) defined by a Maude functional module. What are they? (The
mathematical object defined by a program is called its denotational semantics.)
As mentioned in Chapter 6, we are sometimes interested in all models of a spec-
ification (Σ , E) (such as that of groups), but most often we are only interested in
the intended model of (Σ , E). This chapter therefore defines both all mathematical
models and the intended model specified by an equational specification (Σ , E).
Having such models enables us to reason about properties that hold in all (Σ , E)-
models and that hold in the intended (Σ , E)-model. For example, does x ◦ y = y ◦ x
hold in all groups; that is, in all models of our specification of groups? This chapter
proves that equational logic is sound and complete: t1 = t2 holds in all models of a
specification (Σ, E) if and only if E ⊢ t1 = t2 holds. Furthermore, checking whether
E ⊢ t1 = t2 holds is decidable when E is, or can be turned into, a confluent and
terminating specification. This is the case for the group axioms (see Example 6.4), so
that we can easily check whether an equality holds in all groups. The equalities that
hold in the intended model of a specification are the inductive theorems introduced
in Section 6.2, for which in general there is no sound and complete proof system.
What kind of mathematical models are we looking for? Meseguer and Goguen
argue in [85] that a software module/package involves various sorts of data that form
sets, and defines a number of operations on those data that correspond to functions
on the corresponding sets. In other words, a software module has the structure of
an algebra. Specifying a software module therefore means specifying an algebra,
which is exactly what a Maude specification (Σ , E) does.
Section 7.1 introduces Σ -algebras for a signature Σ , as well as key notions like
Σ -homomorphisms. Section 7.2 defines the class Alg(Σ , E) as all Σ -algebras that
satisfy the equations E. All such algebras are models of a specification (Σ , E).
Section 7.3 proves the soundness and completeness of equational logic.
© Springer-Verlag London 2017
P.C. Ölveczky, Designing Reliable Distributed Systems, Undergraduate Topics
in Computer Science, DOI 10.1007/978-1-4471-6687-0_7
110 7 Models of Equational Specifications
• 0_N is the number 0;
• s_N is the “successor function” s on natural numbers: s(n) = n + 1; and
• +_N is the standard addition function + on natural numbers.
2. The algebra N⊥, which adds an element ⊥ (for “undefined”) to N. The above
functions are extended to ⊥ as follows: s(⊥) = ⊥ and ⊥ + x = x + ⊥ = ⊥.
3. The algebra Nx whose domain Nx_Nat is N, and where 0, s, and + are interpreted
by, respectively, 84, “+4”, and the function returning the first element of a pair.
4. E, whose domain E_Nat is the set {0, 2, 4, 6, ...} of even numbers, and where
the interpretations of the function symbols 0, s, and + in the algebra E are as
expected: 0_E = 0, s_E(n) = n + 2, and n +_E m = n + m.
5. The integers Z, whose domain Z_Nat is the integers {..., −2, −1, 0, 1, 2, ...}
and where 0, s, and + are interpreted in the standard way.
6. The algebra squares with
• squares_Nat = {0, 1, 4, 9, 16, ...}
• 0_squares = 0
• s_squares(n) = n + 2√n + 1
• m +_squares n = (√m + √n)²
7. The algebra bits of bit sequences without leading zeros. The domain bits_Nat is
{0} ∪ {sequences of bits not starting with 0}, and 0, s, and + are interpreted in
bits by, respectively, 0, the function which gives the string resulting from adding
one to a given bit string, and the standard addition function on bit sequences.
8. The algebra bits0, which is defined as bits except that its domain bits0_Nat is all
bit strings, including those with leading zeros.
9. The algebra ∗ with a single element ∗. That is, ∗_Nat = {∗}, and the functions
are interpreted in the only possible way (which way?).
10. The algebra +2 with domain +2_Nat = {0, 1, 2, 3, 4, 5, ...}, but where s is
interpreted as “plus 2”; that is, 0, s, and + are interpreted by 0, λ m . m + 2, and
λ m, n . m + n, respectively.
11. The algebra Q≥0 of non-negative rational numbers, with the obvious
interpretations of 0 (the number 0), s (“plus one”), and + (standard addition on rationals).
12. For any number k > 1, the algebra Nk with domain Nk_Nat = {0, 1, ..., k − 1},
with 0_Nk = 0, s_Nk the function λ n . (n + 1) mod k, and +_Nk the function λ m, n .
(m + n) mod k, where n mod k is the remainder when n is divided by k.
13. The algebra AB with AB_Nat = {∗, a, b}, 0_AB = ∗, s_AB the identity function,
and +_AB the function + defined by ∗ + X = X, a + X = a, and b + X = b
for all X. ♦
Example 7.4. The signature of groups (see Exercise 79) has one sort s, a constant
e (the identity element), a unary function symbol i (inverse), and a binary function
symbol ◦. Three algebras for this signature are:
1. Z, with domain Zs the integers {. . . , −2, −1, 0, 1, 2, . . .}, and with functions 0,
(unary) −, and + interpreting e, i, and ◦, respectively.
2. R>0 , with domain all positive rational numbers, and interpretations 1, λ x . 1/x,
∗ (multiplication) of e, i, and ◦.
3. The algebra funcs({a, b, c}), whose domain is the set of all bijective functions
from {a, b, c} to {a, b, c}. The group operations are interpreted as expected:
• e is interpreted by the identity function λ x . x on {a, b, c};
• i is interpreted by the inverse function λ f . f −1 ; and
• ◦ is interpreted as standard function composition: ( f ◦ g)(x) = f (g(x)). ♦
In this introductory book I refer to, e.g., [50] for the treatment of order-sorted
algebras, and just mention that in such an algebra A, the domain A_s must be a subset
of the domain A_s′ whenever s is a subsort of s′ in the signature. For example, the
(sub)sorts NzNat < Nat < Int could be interpreted by the corresponding three
sets {1, 2, 3, ...} ⊆ {0, 1, 2, 3, ...} ⊆ {..., −2, −1, 0, 1, 2, ...}, which satisfy the
subset requirement.
To avoid cluttering the exposition with details, the rest of this chapter consid-
ers unsorted specifications. The extension to the many-sorted case is straightfor-
ward when all sorts are non-empty. In the unsorted case, a homomorphism is just
a single function φ from the domain As of the algebra A to the domain Bs of the
algebra B. The homomorphism condition can then be written φ ( fA (a1 , . . . , an )) =
fB (φ (a1 ), . . . , φ (an )). I also write A for the domain As of an algebra A, and hope that
using the same name for both an algebra and its domain will not lead to confusion.
Example 7.9. Consider a signature Σ with a single constant a and no other function
symbol. Let A and B be two Σ-algebras with domains A = {1, 2} and B =
{1, 3} and with a_A = 1 and a_B = 1. The homomorphism condition requires that
φ(a_A) = a_B for any homomorphism φ from A to B. The functions φ1 = {1 ↦ 1, 2 ↦ 3}
and φ2 = {1 ↦ 1, 2 ↦ 1} are both homomorphisms from A to B. ♦
Isomorphic algebras have the same structure; they only differ in the representation
of the elements. Isomorphic algebras are therefore often said to be “abstractly
the same algebra.” For example, it does not really matter whether we represent the
natural numbers by 0, 1, 2, 3, ..., by 0, 2, 4, 6, ..., or by 0, 1, 10, 11, ..., as long
as the functions behave in the same way in these algebras.
Example 7.12. The algebras N and N3 are not isomorphic: there is no injective func-
tion from N to {0, 1, 2} (and no surjective function from {0, 1, 2} to N). ♦
7.1 Many-Sorted Σ -Algebras 115
Example 7.15. Let Σ0,s be the signature op 0 : -> Nat . op s : Nat -> Nat .
Then, the Σ0,s-algebras T_Σ0,s and N (when seen as a Σ0,s-algebra by just forgetting
about having to interpret +) are isomorphic (see Exercise 104). ♦
Exercise 101 Show that the algebras N and bits are isomorphic.
Exercise 102 Assume that for two Σ -algebras A and B there is a Σ -homomorphism
φ1 : A → B and a Σ -homomorphism φ2 : B → A. Are A and B Σ -isomorphic?
1 The sets N and N ∪ {⊥} are isomorphic in the sense that there exists a bijective function f
between them (e.g., f (0) = ⊥, f (1) = 0, f (2) = 1, . . .). However, this function is not a
sign(NAT-ADD)-homomorphism.
Example 7.17. The algebra Nx in Example 7.3 is not a NAT-ADD-algebra. It does not
satisfy the first (or the second) equation: σ*(0 + M) = σ*(0) +_Nx σ*(M) = σ*(0) =
84 ≠ σ*(M) = σ(M) for variable assignments σ : {M} → N with σ(M) ≠ 84. ♦
Example 7.18. The quotient algebra of the sign(NAT-ADD)-algebra N over the con-
gruence ≡3 (equality modulo 3) is the algebra N3 . ♦
t =_E t′ if and only if E ⊢ t = t′.
It is intuitively fairly obvious that an algebra where the interpretations of two ground
terms t1 and t2 are the same element if E ⊢ t1 = t2 satisfies all the equations in E
(see, e.g., [85, proof of Theorem 11]):
If the equations E are terminating and ground confluent, then the algebra closest to
what we compute is the normal form algebra, also called the canonical term
algebra, C_Σ,E, whose elements are the E-normal forms of the ground terms: {t!_E | t ∈
T_Σ}, and where the interpretation f_C of a function symbol f in Σ takes t1, ..., tn
to the normal form of f(t1, ..., tn); that is, f_C(t1, ..., tn) = (f(t1, ..., tn))!_E.
Example 7.20. The equations in NAT-ADD are terminating and confluent. The
elements of the algebra C_NAT-ADD are the normal forms {0, s(0), s(s(0)), ...},
and the interpretation of the function + in C_NAT-ADD is the function +_C where
s^m(0) +_C s^n(0) = s^(m+n)(0), with s^k(0) denoting the term s(...s(0)...) with k
occurrences of s. ♦
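The interpretation of + in the normal form algebra is exactly what Maude computes when it reduces a ground term. As a small illustration, under the same assumed reconstruction of NAT-ADD used elsewhere in this chapter, the following sketch applies + to the normal forms for 2 and 3:

```maude
--- Assumed definition of NAT-ADD (reconstructed from its use in the text).
fmod NAT-ADD is
  sort Nat .
  op 0 : -> Nat [ctor] .
  op s : Nat -> Nat [ctor] .
  op _+_ : Nat Nat -> Nat .
  vars M N : Nat .
  eq 0 + M = M .
  eq s(M) + N = s(M + N) .
endfm

--- The result is the normal form of the term, i.e., an element of the normal form algebra.
red s(s(0)) + s(s(s(0))) .
```

The command should return the normal form s(s(s(s(s(0))))), that is, the element of the normal form algebra that corresponds to 2 + 3 = 5.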
Exercise 109 For each of the NAT-ADD-algebras in Example 7.3, can you extend the
algebra to a NAT-MULT-algebra by adding a suitable interpretation of *? Show that
the resulting algebras indeed are NAT-MULT-algebras.
Exercise 110 Show that each algebra in Example 7.4 satisfies the group axioms
{e ◦ x = x, i(x) ◦ x = e, (x ◦ y) ◦ z = x ◦ (y ◦ z)}.
The key point about equational logic is that it allows us to reason about equalities
that hold in all (Σ , E)-models. What we want is therefore the equivalence
Exercise 112 Show that funcs({a, b, c}) in Example 7.4 does not satisfy x ◦y = y◦x.
(This is probably the simplest example of a non-Abelian group.)
Exercise 114 Fill in the missing parts in the proof of Theorem 7.2.
We have seen a number of NAT-ADD-algebras. But which of them is the one that
we wanted to specify? Is it the algebra N, the algebra bits of binary numbers, the
algebra TNAT-ADD , or the normal form algebra CNAT-ADD ? Or could it be the integers
Z , the algebra squares, the even numbers E , the algebra +2, or even AB? The
answer by “ADJ” (Goguen, Thatcher, Wagner, and Wright) in 1975 was that the
intended model of an equational specification (Σ , E) is the initial algebra(s) in the
class Alg(Σ , E) of all (Σ , E)-algebras [49].
Definition 7.5 A Σ -algebra A is initial in a class A of Σ -algebras if and only if for
each algebra B ∈ A, there is exactly one Σ -homomorphism from A to B.
7.4 Intended Models: Initial Algebras 121
Theorem 7.3 If A and B are two initial Σ -algebras in a class A of Σ -algebras, then
A and B are isomorphic.
Can we be sure that a specification (Σ , E) has initial models? After all, Exer-
cise 117 shows that some specification formalisms fail to guarantee initial models.
It turns out that there is always an initial (Σ , E)-algebra, namely, the algebra TΣ ,E :
Theorem 7.4 The algebra TΣ ,E is an initial algebra in the class Alg(Σ , E).
Proof. (The proof roughly follows a similar proof in [60].) We must prove that there
is one, and only one, Σ -homomorphism φ̂ : TΣ ,E → A for each (Σ , E)-algebra A.
Since A is a Σ -algebra, it has interpretations cA and fA for each constant c and
each non-constant f in Σ . We define a Σ -homomorphism φ : TΣ → A as follows:
So the algebra T_NAT-ADD is an intended model of the specification NAT-ADD. The
algebra N of natural numbers is another intended model of NAT-ADD.
Example 7.21. N is an initial algebra in the class Alg(NAT-ADD), since it is
isomorphic to the initial algebra T_NAT-ADD. This can be proved as follows. We have
previously shown that NAT-ADD is terminating, confluent, and sufficiently complete: all
ground terms reduce to constructor ground terms, which cannot be reduced further.
All this implies that the elements in T_NAT-ADD are {[0], [s(0)], [s(s(0))], ...}.
Let the function φ : T_NAT-ADD → N be defined by φ([s^n(0)]) = n, where s^n(0)
denotes the term s(...s(0)...) with n occurrences of s.
φ is a sign(NAT-ADD)-homomorphism, since φ([0]) = 0 = 0_N, and φ([s(t)]) =
1 + φ([t]) = s_N(φ([t])), and φ([t1 + t2]) = φ([t1]) + φ([t2]), which can be proved easily.
φ is an isomorphism, since φ is injective ([t1] ≠ [t2] implies φ([t1]) ≠ φ([t2]),
which should be fairly obvious), and φ is surjective (for any n ∈ N there is a t such
that φ([t]) = n, namely t equal to s^n(0)). ♦
Theorem 7.5 If the equations E are confluent and terminating, the normal form
algebra CΣ ,E is isomorphic to TΣ ,E and therefore an initial algebra in Alg(Σ , E).
As mentioned above, the definition of the intended model as the initial algebra
is somewhat abstract. How can we easily see what is an initial algebra? Two key
properties characterizing any initial (Σ , E)-algebra A are:
• No junk: A does not contain any “junk” element that is not an interpretation of
some ground term.
• No confusion: If two ground terms t1 and t2 are interpreted by the same element
in A, then E ⊢ t1 = t2. That is, A does not identify elements that are not E-equal.
Intuitively, if an algebra A has “confusion,” i.e., identifies (the interpretations of)
two ground terms t1 and t2 that are not E-equivalent, then there is no homomorphism
from A to T_Σ,E. Such a homomorphism should map t1_A (the interpretation of t1 in
A) to [t1]_E and should map t2_A to [t2]_E ≠ [t1]_E, which is impossible if t1_A = t2_A.
Example 7.22. In the algebra N3 both s(s(s(0))) and 0 are interpreted as the
number 0, even though NAT-ADD ⊬ s(s(s(0))) = 0. There is no homomorphism from
N3 to T_NAT-ADD, since a homomorphism φ : N3 → T_NAT-ADD would have
Exercise 116 Show that φ2 ◦ φ1 = idA and φ1 ◦ φ2 = idB imply that φ1 and φ2 are
both injective and surjective.
Exercise 117 Assume that we have a specification formalism that supports disjunc-
tions (or) of equations, with the obvious meaning. Then show that the class of alge-
bras satisfying the specification a = b or a = c does not have an initial algebra.
Exercise 118 Show that the function φ in the proof of Theorem 7.5 is surjective,
injective, and a Σ -homomorphism.
Exercise 119 Which of the NAT-ADD-algebras in Example 7.3 satisfy the “no junk”
property? Which of them satisfy the “no confusion” property? Which ones are initial
algebras in Alg(NAT-ADD)?
Going from the unsorted case to the many-sorted case is straightforward, as long
as all sorts have ground terms. If some sorts do not have any ground terms, which
may be useful, for example, when reasoning about parametric modules, the obvious
extension of unsorted equational logic to the many-sorted setting is unsound.
Example 7.23. The sort Empty has no ground terms in the following specification:
A model A of this specification has elements A_Empty = ∅ and A_Bool = {t, f}, and
the function f could be interpreted in A by the function f_A defined by (∀e ∈ ∅)
f_A(e) = t. This algebra A is a model of the specification EMPTY-SORT, with the
interpretations t and f of, respectively, true and false, being different elements. A
satisfies both equations: (∀x ∈ ∅) f(x) = false holds in A; if it were not to hold,
there should be some element e such that f_A(e) ≠ f. Obviously, there is no such e in
the empty set.
However, EMPTY-SORT ⊢ true = false could be proved in a straightforward
extension of unsorted equational logic. Since this equality does not hold in A, which
is a model of EMPTY-SORT, the logic would be unsound. ♦
Meseguer and Goguen have defined a sound and complete many-sorted equational
logic that treats variables carefully [85]. In their logic we can prove EMPTY-SORT ⊢
(∀x : Empty) true = false, but not the undesired EMPTY-SORT ⊢ true = false.
Part II
Specification and Analysis of Distributed Systems in Maude
8 Modeling Distributed Systems in Rewriting Logic
This chapter introduces rewriting logic, which can be used to model dynamic sys-
tems and to reason about concurrent change in a distributed system. This chapter
may be read together with Chapter 9, which explains how rewriting logic models
can be executed in Maude.
Part I of this book deals with specifying data types by defining what terms are equiv-
alent. There is (mathematically) no dynamic behavior in an equational specification.
A term represents an expression, and two expressions are either equivalent or there
is no relation between them. Due to symmetry of the equivalence relation, both
length(2 5 7) = 3 and 3 = length(2 5 7) hold. Likewise, 2 + 1 is always 3,
and 3 is always 2 + 1.
The rest of this book deals with modeling and analyzing dynamic (or changing
or evolving) systems, where the state—which is also represented as a term—of the
system changes over time. Consider modeling the life of a person. A state in such
a model can be represented by a term person("Peter", 46, married), denoting
a person named Peter, who is currently 46 years old and married. This state can
change to person("Peter",47,married) or to person("Peter",46,divorced).
Change is quite often irreversible: a human being cannot go from being 46 years
old to being 45 years old; a football game with the score 39–38 cannot change the
score back to 14–38; a bad chess move cannot be reversed; an unfortunate email
sent into the network cannot be called back; and so on.
A component in a dynamic computer system is often a reactive system, which in-
teracts with its environment by reacting to input from the environment by changing
its state and/or by providing some output. The prototypical example of a reactive
© Springer-Verlag London 2017
P.C. Ölveczky, Designing Reliable Distributed Systems, Undergraduate Topics
in Computer Science, DOI 10.1007/978-1-4471-6687-0_8
128 8 Modeling Distributed Systems in Rewriting Logic
Sequential computer programs and functional Maude modules are typically deter-
ministic: starting the system from the same state/expression will always lead to the
same result, no matter how often we execute the system. Sorting a given list will
always give the same result, and 3+2 is always 5. Furthermore, functional modules
and sequential programs should typically be terminating.
In contrast, many dynamic systems are nondeterministic: they may have many
different behaviors—possibly resulting in different final outcomes (if any)—from
the same starting state. For example, a state person("Peter", 46, married) could
change in one “step” to either the state person("Peter", 47, married), the state
person("Peter", 46, deceased), or the state person("Peter", 46, divorced)
in a model of a person. Likewise, consider a game of chess: from the initial configu-
ration where all the pieces are in their starting positions, the state of the game could
evolve in many different ways, resulting in different final states.
Networked systems are intrinsically nondeterministic. Consider two persons
vying to buy the last super cheap plane ticket from Oslo to San Francisco at the
same time. The person whose message is routed the fastest from his/her computer
to the ticket reservation server will get the desired plane ticket. However, the winner
depends on a number of factors: network load, routing, whether sites along the route
in the network are down, and so on. The next time the same persons want to buy a
plane ticket at the same time, the outcome may be different.
While some of these systems (such as the life of a person) are terminating, many
distributed systems are not supposed to terminate: your operating system, airplane
ticket reservation services, and most other web services should always be up.
only one event can take place in the system at the same time. We typically assume
that any component can execute its next operation at any time, and must consider all
possible sequences of “interleaved” operations of the three components. Two such
behaviors of the above system are:
1. c1 :op1 → c2 :op1 → c3 :op1 → c2 :op2 → c3 :op2 → c1 :op2 → c2 :op3 →
c2 : op4 → c1 : op3 → c3 : op3 → c3 : op4 → c1 : op4
2. c3 :op1 → c3 :op2 → c3 :op3 → c1 :op1 → c1 :op2 → c2 :op1 → c3 :op4 →
c1 : op3 → c2 : op2 → c2 : op3 → c2 : op4 → c1 : op4
Understanding and analyzing distributed systems is hard. Even this trivial example
has a whopping 34,650 different behaviors!
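The number 34,650 can be checked with a short calculation. The two behaviors listed above involve three components with four operations each, so an interleaving is an ordering of 12 operations in which each component's own four operations keep their order; there are 12! / (4! · 4! · 4!) such orderings. The following sketch computes this number in Maude using the predefined module NAT; the module name INTERLEAVINGS and the function fact are hypothetical helpers introduced only for this illustration.

```maude
fmod INTERLEAVINGS is
  protecting NAT .
  op fact : Nat -> NzNat .          --- factorial (hypothetical helper)
  var N : Nat .
  eq fact(0) = 1 .
  eq fact(s N) = s N * fact(N) .
endfm

--- 12 operations in total, 4 per component, order fixed within each component.
red fact(12) quo (fact(4) * fact(4) * fact(4)) .
```

The command should return 34650, since 12! = 479001600 and 4! · 4! · 4! = 13824.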
The interleaving model is particularly suitable for systems where only one com-
ponent can execute at the same time, for example, because they are all sharing
some resource (like a server executing the operations, or they are accessing the
same shared data). However, in a distributed system, two or more components could
execute at the same time, or concurrently. For example, all three components could
perform their first operation at the same time, and c1 and c3 could perform their
second operation at the same time, as long as they execute on separate computers
and do not access shared resources:
Exercise 120 Three persons share a bank account x, which initially contains $100,
and each person wants to add $20 to this account by executing the three (atomic)
operations
y := read(x); z := y + 20; write(x, z);
in the given order. Assume that the three persons all execute the same program more
or less at the same time in an interleaved fashion.
1. What are the possible outcomes (i.e., balance of the shared bank account x) if y
and z are local variables?
2. What are the possible outcomes if also y and z are shared variables?
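For example, trying to model the aging of a person using an equation like
person(X, N, S) = person(X, N + 1, S)
does not work: since equality is symmetric and transitive, such an equation would
identify a person of age N with the same person at every other age.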
In rewriting logic, dynamic behavior is modeled by rewrite rules that define the
local transitions of a system. The “birthday” action in our simple example could be
modeled by a rewrite rule
person(X, N, S) −→ person(X, N + 1, S)
A sequent
t −→ t′
means that the state t can change to (or evolve to, or reach) the state t′ by applying
the rewrite rules zero or more times. (I will use “term” and “state” interchangeably
since the state of a system is represented by a term.) Therefore, the
sequent person("Peter", 46, married) −→ person("Peter", 56, married)
holds, but person("Peter", 46, married) −→ person("Peter", 20, married)
does not hold.
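In Maude syntax, the birthday rule and a bounded simulation could look as follows. This is only a minimal sketch: the module name ONE-PERSON-SKETCH and the particular list of statuses are assumptions made for illustration, not the module used in the text.

```maude
--- Hypothetical sketch: module name and the list of statuses are assumptions.
mod ONE-PERSON-SKETCH is protecting NAT + STRING .
  sorts Person Status .
  op person : String Nat Status -> Person [ctor] .
  ops single engaged married : -> Status [ctor] .

  var X : String .  var N : Nat .  var S : Status .

  --- the "birthday" action: the age increases by one
  rl [birthday] : person(X, N, S) => person(X, N + 1, S) .
endm

--- apply at most 10 rewrite steps to the initial state
rew [10] person("Peter", 46, married) .
```

Since birthday is the only rule in this sketch, the bounded run applies it ten times and should end in person("Peter", 56, married), in line with the first sequent above. No sequence of rule applications decreases the age, which is why the second sequent does not hold.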
A rewrite rule can be equipped with a label which names the action or event that
causes the state change. In our example, the labeled rule could be
birthday : person(X, N, S) −→ person(X, N + 1, S)
In general, a labeled rewrite rule has the form
l : t −→ t′
and a labeled conditional rewrite rule has the form
l : t −→ t′ if cond
where l ∈ L, t and t′ are terms in TΣ(X), and cond is a conjunction of rewrite
conditions of the form u −→ u′, equational conditions of the form v = v′, and membership
conditions w : s, for u, u′, v, v′, w terms in TΣ(X) and s a sort in Σ. The terms t and
t′ must be terms of the same kind (see footnote 2).
Example 8.1. (Borrowed from [80]) A nondeterministic choice operator _?_, which
nondeterministically returns one of its arguments, can be specified as follows:
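One way such an operator might be written is sketched below. This is an illustration under assumptions, not necessarily the listing of the original example: the module name CHOICE-SKETCH, the rule labels, and the restriction to integers are all hypothetical.

```maude
--- Hypothetical sketch; not necessarily the listing of the original example.
mod CHOICE-SKETCH is protecting INT .
  op _?_ : Int Int -> Int .
  vars I J : Int .
  rl [choose-first]  : I ? J => I .   --- keep the left argument
  rl [choose-second] : I ? J => J .   --- keep the right argument
endm

--- all terms that 3 ? 7 can rewrite to in exactly one step
search 3 ? 7 =>1 K:Int .
```

Both rules apply to the same term, so the search should report the two solutions 3 and 7. The two results cannot be joined by further rewriting, which is what makes this small rule set non-confluent.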
1 Although we use membership equational logic as the underlying equational logic in the definition
below, rewriting logic is actually parametric in the underlying equational logic, which could be
unsorted, order-sorted, membership, or some other kind of equational logic.
2 In order-sorted specifications, the sorts of t and t must be in the same connected component of
(S, ≤).
132 8 Modeling Distributed Systems in Rewriting Logic
Nondeterministic behavior could mean that the set of rewrite rules is not confluent.
Since many distributed systems are nonterminating, the set of rewrite rules may well
be both non-confluent and nonterminating.
8.2.3 Examples
3 The module expression module1 + module2 gives the union of the two modules.
8.2 Modeling Dynamic Systems in Rewriting Logic 133
crl [successful-proposal] :
person(X, N, S) => person(X, N, engaged)
if N >= 15 /\ (S == single or S == divorced) .
Exercise 121 Complete the module ONE-PERSON with rules for, e.g., broken engage-
ment, separation, divorce, death, death of a spouse, and other possible events.
Exercise 123 In the whiteboard game there are a bunch of non-zero natural num-
bers on a whiteboard. Specify the following versions of this exciting game in Maude:
1. Any two numbers m and n on the whiteboard can be replaced by the number
(m + n) quo 2.
2. As above; in addition, if there are two occurrences of the number m on the white-
board, then one of them may be replaced by the numbers m − 1, m − 2, . . . , 2, 1.
3. Any two numbers m and n on the whiteboard can be replaced by m + n + (m · n).
Exercise 124 The “Tower of Hanoi” is a classic “puzzle” with m rods and n disks
of different sizes. The puzzle starts with all the disks, ordered by size, on rod 1, with
the smallest on top. The objective is to move all disks onto the “last” rod m, by
repeatedly moving the upper-most disk from some rod onto another rod, so that a
disk is never placed on top of a smaller disk. A rod can be represented in Maude
as a term rod i stack disks, with i the number of the rod and disks a list of natural
numbers between 1 and n. The state of the system is a multiset of m such rods.
1. Define an initial state init(m,n) with m rods and n disks.
2. Define all possible legal moves of this “puzzle” in Maude.
Exercise 125 Recall the Traveling Salesman (TS) problem: Given a set of cities and
a cost for traveling between each pair of cities, can a salesman start in his home
city and visit every other city exactly once before returning to his home city, for a
total cost of the journey less than or equal to some limit K?
Assume that cost(c1 , c2 ) gives the cost of traveling between c1 and c2 , and
that cities gives the set of cities to visit. There are at least three cities to visit. For
example, some cities and the cost between them could be given in Maude as follows:
sorts City Cities . subsort City < Cities .
op none : -> Cities [ctor] .
op _;_ : Cities Cities -> Cities [ctor assoc comm id: none] .
ops PhnomPenh SiemReap Sisophon Battambang KompongSom : -> City [ctor] .
1. Define a function ts : NzNat -> Bool so that ts(K) returns true if and only
if there is a TS trip with total cost less than or equal to K.
2. One thing is knowing that it is possible to travel for less than $K; another thing
is knowing which route to take. Explain why we cannot have a “well-defined”
function okTrip : NzNat -> Trip which returns a trip with total cost ≤ K.
3. Define a sort State for the states in your system, and define a suitable initial
state. Each state must contain the journey undertaken so far.
4. Specify all possible behaviors of a traveling salesman in Maude.
5. It may sometimes be cheaper to go via a third city instead of traveling directly
between two cities. For example, if you are in Sisophon and must head home
to PhnomPenh, you can save money by going through SiemReap. Specify all
possible behaviors of the salesperson when (s)he can visit a city multiple times.
Exercise 126 Define a simulator for Turing machines in Maude with states of the form
machine: TM state: q tapeLeft: tape1 head: symbol tapeRight: tape2 ,
where TM is a Maude representation of the (transitions of the) Turing machine, q is
the current “state” of the machine, tape1 and tape2 represent, respectively, the tape
to the left and to the right of the current “head,” and symbol is the symbol on the
square the head is pointing at.
1. Assume sorts Symbol and State. Define the data types TuringMachine, rep-
resenting a Turing machine, and Tape, representing tapes of a Turing machine.
2. Define the rewrite rules for simulating the steps of a given Turing machine.
8.3 Concurrency
Different actions may take place concurrently, i.e., at the same time, in a distributed
system. Rewriting logic is a logic of change in which the statements have the form
“state t may evolve to a state t′.”
In addition, rewriting logic is a logic for reasoning about possible concurrent
computation steps, which allows us to reason about properties of the form
“the system in state t may perform actions concurrently to reach a state t′ in one concurrent
step.”
One way to think about “possible concurrent computation steps” is to assume that
we have as many processors as we want and a way of delegating jobs to different
processors. Which actions could be performed at the same time under this scenario?
Assume that from state t1 a system may evolve in one step to state u1. (If it helps
your intuition, imagine that each action takes, say, 10 minutes to perform.) Assume
furthermore that a state t2 could evolve to a state u2 in one step (which may also take
10 minutes). It then seems reasonable that a state f(t1, t2) could evolve to the state
f(u1, u2) in one concurrent step, in which the steps t1 −→ u1 and t2 −→ u2 have
been computed in parallel. (I emphasize that this is abstract reasoning about possible
concurrent computations. A concrete implementation on a distributed architecture
would have to take care of the task of distributing the two computation tasks to two
processors, of synchronizing the results, and so on.)
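Written as an inference rule, the reasoning above is an instance of what the text later refers to as the Congruence rule. The following is a sketch for a binary function symbol f, using the symbols t1, t2, u1, and u2 of the paragraph above:

```latex
\[
\frac{t_1 \longrightarrow u_1 \qquad t_2 \longrightarrow u_2}
     {f(t_1, t_2) \longrightarrow f(u_1, u_2)}
\]
```

The premises are the two independent steps, and the conclusion is the single concurrent step that performs both of them.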
For example, the computation of a sum such as
squareroot(9762385199087) + findPrime(13852379)
could obviously be distributed so that one processor could spend, say, 15 minutes on
computing squareroot(9762385199087), and another processor could be
assigned to compute findPrime(13852379) in the same time. That is, if
squareroot(9762385199087) −→ m and findPrime(13852379) −→ n, for some
numbers m and n, can be computed in one step each, then
squareroot(9762385199087) + findPrime(13852379) −→ m + n
can be computed in one concurrent step.
A concurrent step takes g(a,b) to g(a’,b’) (just let one processor compute
a −→ a’ and another processor compute b −→ b’). Similarly, there is a concurrent
step from g(c,d) to g(c’,d’) and another concurrent step from g(e,f) to g(e’,f’).
Furthermore, since these three steps work on different subterms, there is a concurrent step
h(g(a,b),g(c,d),g(e,f)) −→ h(g(a’,b’),g(c’,d’),g(e’,f’))
Example 8.4. Our specification ONE-PERSON simulates only one person. In this ex-
ample we consider an entire population, which is modeled as a multiset of persons:
mod POPULATION is protecting NAT + STRING .
  sorts Person Population Status .
  subsort Person < Population .

  --- a population is a multiset of persons
  op empty : -> Population [ctor] .
  op __ : Population Population -> Population
       [ctor assoc comm id: empty] .
  op person : String Nat Status -> Person [ctor] .
  ops single divorced : -> Status [ctor] .
  ops engaged separated married : String -> Status [ctor] .

  vars X X' : String .  vars N M : Nat .  vars S S' : Status .

  crl [birthday] :
      person(X, N, S) => person(X, N + 1, S) if N <= 1000 .

  --- two persons who are single or divorced, and at least 16, get engaged
  crl [engagement] :
      person(X, N, S) person(X', M, S')
    =>
      person(X, N, engaged(X')) person(X', M, engaged(X))
    if (S == single or S == divorced) /\ N >= 16
       /\ (S' == single or S' == divorced) /\ M >= 16 .

  --- two persons engaged to each other get married
  rl [wedding] :
      person(X, N, engaged(X')) person(X', M, engaged(X))
    =>
      person(X, N, married(X')) person(X', M, married(X)) .
  ...
endm
An example of a population is
person("Claudius", 60, married("Gertrude"))
person("Gertrude", 50, married("Claudius"))
person("Hamlet", 28, single)
person("Ophelia", 19, single)
person("Old Norway", 67, married("Ingrid"))
person("Fortinbras", 40, single)
person("Laertes", 22, single).
From a state
person("Hamlet", 28, single) person("Ophelia", 19, single)
it is possible to go in one concurrent step to a state
person("Hamlet", 29, single) person("Ophelia", 20, single)
in which two birthday steps have been performed at the same time. (The above
state has the “form” f(a, b), where a −→ a′ and b −→ b′ can be seen as the two
birthday steps and f as the multiset union operator _ _.)
From a state
person("Hamlet", 28, single) person("Ophelia", 19, single)
person("Rosencrantz", 38, single) person("Juliet", 16, single)
two engagement steps involving two disjoint pairs of persons can likewise be
performed in one concurrent step.
However, it should not be possible for one person (say, "Ophelia") to be involved
in two engagements at the same time (which reception should she attend?). It is of
course also possible to go from a state
person("Hamlet", 28, single) person("Ophelia", 19, single)
person("Rosencrantz", 38, single) person("Juliet", 16, single)
to a state
person("Hamlet", 28, engaged("Ophelia"))
person("Ophelia", 19, engaged("Hamlet"))
person("Rosencrantz", 38, single) person("Juliet", 16, single).
Exercise 127 What concurrent steps are possible from g(g(a,a),g(b,c)) and
h(a,b’,g(c,d)) in the specification in Example 8.3?
Exercise 129
1. What is the largest number of “actions” that can be performed concurrently in
one step starting from the state
person("Claudius", 60, married("Gertrude"))
person("Gertrude", 50, married("Claudius"))
person("Hamlet", 28, single)
person("Ophelia", 19, single)
person("Old Norway", 67, married("Ingrid"))
person("Fortinbras", 40, single)
person("Laertes", 22, single)
2. What is the largest number of concurrent actions possible in one step from the
above state if we do not count birthdays and deaths?
3. Is it possible to reach a state in which "Ophelia" is older than "Hamlet" from
the above state?
Exercise 130 How many actions (rule applications) can be performed in one step
from a state f(f(f(a))) in the specification {l1 : f(x) −→ g(x), l2 : a −→ b}?
This section formally defines the rewrite relation and the notion of concurrent
rewrite steps. For simplicity of exposition, we consider one-sorted specifications
without conditional rewrite rules.
Given a rewriting logic specification ℛ = (Σ, E, L, R), the sequents (“logical
formulas”) of rewriting logic have the form
t −→ u
for t and u terms in TΣ(X) belonging to sorts of the same connected component.
This sequent intuitively means that it is possible to reach the state u from the state t
using the rules in R (zero or more times).
Notation. I sometimes write t(x1, ..., xn) for a term t to emphasize that all the
variables in t are in the list x1, ..., xn. I write t(u1/x1, ..., un/xn) for the term t
where each occurrence of xi has been replaced by the term ui. For example, if t
is f(g(x), h(a, y)), then t(g(y)/x, a/y) denotes the term f(g(g(y)), h(a, a)).
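Definition 8.2 (Deduction rules of rewriting logic) The sequent
t −→ u
holds in a rewriting logic specification (Σ, E, L, R) if it can be derived using the
following deduction rules:
Reflexivity: t −→ t holds for each term t.
Equality: If t −→ u holds, and E ⊢ t = t′ and E ⊢ u = u′, then t′ −→ u′ also holds.
Congruence: For each function symbol f in Σ with n arguments, if t1 −→ u1, ...,
tn −→ un all hold, then f(t1, ..., tn) −→ f(u1, ..., un) also holds.
Replacement: For each rule l : t(x1, ..., xn) −→ u(x1, ..., xn) in R, if t1 −→ u1, ...,
tn −→ un all hold, then t(t1/x1, ..., tn/xn) −→ u(u1/x1, ..., un/xn)
also holds.
Transitivity: If t1 −→ t2 and t2 −→ t3 both hold, then t1 −→ t3 also holds.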
These deduction rules look very similar to the deduction rules of equational
logic (with Replacement corresponding to Substitutivity). Indeed, only the Symme-
try property of equational logic is missing.
The rewrite relation −→ corresponds to applying rewrite rules from left to right
zero or more times, and to equational reduction in the following sense:
The following fact follows trivially from Theorem 6.3 and Proposition 8.1:
t ⇝E u if and only if (Σ, ∅, {l}, rules(E)) ⊢ t −→ u is a one-step sequential rewrite.
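\[
\frac{\dfrac{a \longrightarrow a' \quad b \longrightarrow b'}{g(a,b) \longrightarrow g(a',b')}
      \qquad
      \dfrac{c \longrightarrow c' \quad d \longrightarrow d'}{g(c,d) \longrightarrow g(c',d')}
      \qquad
      \dfrac{e \longrightarrow e' \quad f \longrightarrow f'}{g(e,f) \longrightarrow g(e',f')}}
     {h(g(a,b),\, g(c,d),\, g(e,f)) \longrightarrow h(g(a',b'),\, g(c',d'),\, g(e',f'))}
\]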
t −→ t1 −→ · · · −→ tn −→ t′.
5 We follow the notational conventions for one-sorted equational specifications when writing one-
sorted rewriting logic specifications.
Repeated use of the swap rule (for example by using Maude’s rew command
explained in Chapter 9) will swap integers until the list is sorted:
Maude> rew 8 4 0 -3 76 54 21 0 -9 3 23 .
result List: -9 -3 0 0 3 4 8 21 23 54 76
The notions of termination and confluence carry over to rewrite systems as expected:
a rewriting logic specification is terminating if (the underlying equational specifica-
tion is terminating and) there is no infinite sequence of one-step rewrites. Likewise,
a rewriting logic specification is confluent if and only if t −→ t1 and t −→ t2 imply
that there is a term u such that both t1 −→ u and t2 −→ u hold.
Exercise 133 Recall the coffee bean game described in Section 8.2.3 and your
Maude specification of it which solved Exercise 122.
8.4 Deduction in Rewriting Logic 143
Exercise 135 Which/how many “actions” can be performed concurrently in: (i) the
different versions of the whiteboard game with seven numbers on the whiteboard;
and (ii) your “Tower of Hanoi” specification with five rods and seven disks?
4 8 5 0 1 −→ 0 1 4 5 8.
a. What is the smallest number of concurrent rewrite steps required to sort the
lists 8 4 0 -3 76 54 21 and 1 3 2 0 in the modified specification?
b. Is there any list which can be sorted by fewer steps in the modified specifi-
cation than in the original one?
5. What would be the undesired consequence of adding an equationally-defined
function op sorted : List -> Bool to the module SORT? (The expression
sorted(l ) reduces to true if l is a sorted list, and to false otherwise.) Hint:
think about the Congruence rule of rewriting logic. See also Section 8.5.
Suppose that the module SORT is extended with a function
op first : List -> Int .
which returns the first element in a list. This function should be defined equationally.
In the (extended) module SORT there is a rewrite 5 2 −→ 2 5. Using the Congruence
rule of rewriting logic it follows that first(5 2) −→ first(2 5), which
by the Equality rule of rewriting logic gives 5 −→ 2, which seems undesirable. To
avoid such undesired rewrites caused by functions mapping a “dynamic” domain
onto a “static” domain, one can declare such functions to be frozen:
op first : List -> Int [frozen] .
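Put together, the extended module might look as follows. This is a sketch under assumptions: the module SORT itself is not shown in this part of the text, so the list constructors, the swap rule, and the module name SORT-WITH-FIRST-SKETCH below are a plausible reconstruction rather than the original declarations.

```maude
--- Hypothetical sketch: the list declarations and the swap rule are assumed.
mod SORT-WITH-FIRST-SKETCH is protecting INT .
  sort List .
  subsort Int < List .
  op nil : -> List [ctor] .
  op __ : List List -> List [ctor assoc id: nil] .

  vars I J : Int .  var L : List .

  --- swap two adjacent integers that are in the wrong order
  crl [swap] : I J => J I if I > J .

  --- first is defined equationally; frozen forbids rewriting below first
  op first : List -> Int [frozen] .
  eq first(I L) = I .
endm

red first(5 2) .
```

The red command uses only the equation, so first(5 2) reduces to 5. The frozen attribute, given without argument positions, freezes every argument of first, so the Congruence rule can no longer be used to derive first(5 2) −→ first(2 5).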
The intended model of an equational specification is the initial algebra of the speci-
fication. In an algebra, (the interpretation of) two expressions either denote the same
element in the domain of the algebra, or different elements with no relationship
between them. What is the intended model of a rewrite theory? The (interpretations
of) two terms t and t′ may denote different values, but could still be related
by rewriting: t −→ t′. Therefore, algebraic models, whose domains are sets with
no relationship between different elements in the set, may not be the best models.
Instead, the models of rewrite theories are categories, which are sets with arrows
between elements.
Definition 8.4 (Category) A category 𝒜 is a pair (O, M), where O is a set (of
objects) and M is a set of morphisms (or arrows) f : A → B, for A, B ∈ O, such that:
1. if f : A → B and g : B → C are two morphisms in M, then there is a designated
composite morphism f ; g : A → C in M;
2. each object A has an identity morphism idA : A → A such that idA ; f = f and
g ; idA = g for any morphisms f : A → B and g : B → A; and
3. morphism composition is associative: (f ; g) ; h = f ; (g ; h) for all morphisms
f : A → B, g : B → C, and h : C → D.
That is, there must be an arrow from each object to itself, and arrows compose. As the
reader may have guessed, the initial model Tℛ of a rewrite theory ℛ is a category,
whose objects are the elements of the underlying initial algebra TΣ,E, and where
there is a morphism p : t → t′ if and only if t −→ t′ (the p is the “proof term”
representing the proof of t −→ t′). In particular, because of Reflexivity of rewriting logic,
there is an arrow from each t to itself, and because of Transitivity, arrows compose.
It is beyond the scope of this book to further discuss the models of rewrite the-
ories; a thorough exposition is given in [80]. A non-categorical model theory for
rewriting logic with frozen operators is defined in [16].
9 Executing Rewriting Logic Specifications in Maude
This chapter introduces some ways in which a rewriting logic model of a dynamic
system can be analyzed by execution in Maude.
Since an equational specification is assumed to be terminating and confluent, and
the main goal is to compute the normal form of an expression, such specifications
can be executed by applying equations until no equation can be applied, without
worrying about which equation to apply or where to apply it. Rewriting logic (or
just rewrite) specifications, on the other hand, model all possible behaviors of a
dynamic system, and might not be terminating or confluent. The above execution
approach may therefore not make much sense for rewrite specifications.
Chapter 16 explains how Maude can be used to analyze whether each
behavior of a system satisfies a temporal logic formula. This chapter discusses
the following two ways of executing a rewrite specification in Maude:
1. The Maude commands rew (or rewrite) and frew (“fair rewrite”) “simulate”
one of the many possible system behaviors from a given initial state of the
system. This is done by applying rewrite rules to the state, starting with the
initial state. Since this process may not terminate, the user can give an upper
bound on the number of rewrite steps to perform.
2. Maude’s search command uses a breadth-first search strategy to check whether
a given state pattern can be reached from the initial system state.
Although rewriting logic allows reasoning about concurrent rewrites, the Maude
system only executes one-step sequential rewrites, i.e., applying a rewrite rule once
in each step. No rewrites are lost by this approach, since, by Proposition 8.2, a con-
current rewrite can be decomposed into a sequence of one-step sequential rewrites.
© Springer-Verlag London 2017
P.C. Ölveczky, Designing Reliable Distributed Systems, Undergraduate Topics
in Computer Science, DOI 10.1007/978-1-4471-6687-0_9
It is fairly easy to see that applying a rewrite rule when there are no equations
in the specification is the same as applying an equation in the “corresponding
equational specification.” The problem of executing a rewrite rule boils down to dealing
with the Equality rule of rewriting logic. Even checking whether a rule l −→ r
applies to the root (or top) position of a term t requires checking whether there is a
substitution σ such that E ⊢ lσ = t, for E the equational part of the rewriting logic
specification, which is in general undecidable.
While the search for E-equivalent forms in the above small example does not
seem disastrous, checking whether a rule can be applied to a term is in general
undecidable. Maude therefore assumes that the equational part E of a rewriting logic
specification is confluent and terminating, and first reduces a term t to its E-normal
form t! using the equations in the specification, and then checks whether a rewrite
rule can be applied to t!. If so, the rewrite rule is applied, and the resulting term t′
is normalized to t′!. To avoid “losing” rewrites when applying rewrite rules in this
way, the left-hand side of a rewrite rule should be a constructor term.
Example 9.2. The left-hand side of the rule in the specification NON-COHERENT is
not a constructor term, and Maude will first reduce a state a to c, and will then check
whether a rule applies. Therefore, Maude “misses” the rewrite a −→ d. The rule
rl [l] : b => d should therefore be replaced by the rule rl [l] : c => d. ♦
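The module NON-COHERENT is not shown here. The sketch below is a hypothetical reconstruction that matches the description: the two equations are assumptions, chosen so that a is reduced to c while a −→ d still holds logically (because a equals b and b rewrites to d). The module names are made up for this illustration.

```maude
--- Hypothetical reconstruction: the equations are assumptions chosen to
--- match the description (a is reduced to c, and a --> d holds logically).
mod NON-COHERENT-SKETCH is
  sort S .
  ops a b c d : -> S .
  eq a = b .
  eq b = c .
  rl [l] : b => d .   --- left-hand side b is not in E-normal form
endm

rew a .   --- a is first normalized to c, where rule l no longer matches

mod COHERENT-SKETCH is
  sort S .
  ops a b c d : -> S .
  eq a = b .
  eq b = c .
  rl [l] : c => d .   --- left-hand side is now the normal form c
endm

rew a .   --- a is normalized to c, and then rule l applies
```

In the first module the rule can never fire during execution, because every occurrence of b has already been replaced by c before Maude looks for applicable rules. In the second module the same logical rewrite is preserved and also found by Maude.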
Similarly, a rule such as
rl [gettingYounger] : person(X, N + 1, S) => person(X, N, S) .
would never be applied, since a term would first be normalized to a form such as
person("Gilgamesh", 50, married). The problem is that the left-hand side of the
rule is not a constructor term since it contains the defined symbol +. A better rule is
rl [gettingYounger] : person(X, s N, S) => person(X, N, S) . ♦
9.1 Executing One Sequential Rewrite Step 147
A conjunct in the condition of a rewrite rule (or an equation) may also have the form
x := t
where x is a variable which does not appear in the left-hand side of the rule. Log-
ically, this is just an equational condition with the same logical meaning as x = t.
Operationally, it instantiates the variable x to the value to which the corresponding
instance of t is evaluated by the equations E. While it does not make much sense in
our simple example, our birthday rule could have been written
var X : String . vars M N : Nat .
crl [birthday] : person(X, N) => person(X, M) if M := N + 1 .
Conjunctions of conditions are evaluated from left to right, and while one can have
more than one instantiation of new variables, in each conjunct of the condition, all
the variables (except those being instantiated in the conjunct) must have appeared
in the left-hand side of the rule or must have been instantiated in earlier conjuncts.
The left-hand side of such a matching equation in a condition does not need to
be just a variable, but could have the form
t(x1, ..., xn) := t′
for any constructor term t with variables x1, ..., xn. The conjunct holds if there are
terms t1, ..., tn such that t(t1/x1, ..., tn/xn) equals the normal form of t′; if that is
the case, then each xj is assigned the value tj.
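A matching equation with a constructor pattern on its left-hand side can be sketched as follows. All names in this example (HALVING-SKETCH, pair, split, st) are hypothetical and chosen only for illustration.

```maude
--- Hypothetical sketch: all names (HALVING-SKETCH, pair, split, st) are illustrative.
mod HALVING-SKETCH is protecting NAT .
  sorts Pair State .
  op pair : Nat Nat -> Pair [ctor] .
  op split : Nat -> Pair .           --- defined function, not a constructor
  op st : Nat -> State [ctor] .

  vars N Q R : Nat .

  --- split(N) evaluates to the quotient and remainder of N divided by 2
  eq split(N) = pair(N quo 2, N rem 2) .

  --- Q and R do not occur in the left-hand side; they are bound by matching
  --- the constructor pattern pair(Q, R) against the normal form of split(N)
  crl [halve] : st(N) => st(Q) if N > 1 /\ pair(Q, R) := split(N) .
endm

rew st(20) .
```

The conjuncts are evaluated from left to right: N > 1 uses only the variable N from the left-hand side, and the matching equation then instantiates Q and R. Starting from st(20), the rule should apply four times (20, 10, 5, 2) and stop at st(1), where the first conjunct fails.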
Finally, the truth of a rewrite condition
... if ... /\ u => u′ /\ ...
cannot be determined by just computing “normal forms.” Maude must search (in
a breadth-first way) all computation paths from (the instance of) u to check whether (the
corresponding instance of) u′ can be reached. If u′ cannot be reached from u, Maude
might search forever just to determine whether the rule can be applied!
Maude’s rew and frew commands are used to execute a single behavior of a system.
These commands apply rewrite rules to perform one-step sequential rewrites until no
rule can be applied, or until a user-given bound on the number of rewrites has been
reached. The execution could go on forever if the specification is nonterminating
and the user does not provide an upper bound on the number of rewrites.
Since each term is reduced to its E-normal form before a rewrite rule is applied,
a finite Maude execution with the rew or frew command has the form
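\[
t \rightsquigarrow^{*} t! \longrightarrow t_1 \rightsquigarrow^{*} t_1! \longrightarrow t_2 \rightsquigarrow^{*} t_2! \longrightarrow \cdots \longrightarrow t_n \rightsquigarrow^{*} t_n!
\]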
and returns the term tn !. Such an execution is often referred to as “simulating one
behavior” of the system. Giving the Maude command
set trace on .
before running the rew command shows all the intermediate steps in the execution.
The syntax for the rew command is rew t . or rew [n] t ., where n is the max-
imum number of rewrite steps to execute, and t is the term to rewrite. The frew
command has similar syntax. In case the execution should take place in a module
different from the “current” module, one can specify in which module the rewrite
should take place:
Maude> rew [100] in ONE-PERSON : person("Peter", 46, married) .
result Person: person("Peter", 146, married)
Since the specification is not (necessarily) confluent, the choice of which rule
to apply in each step, and where in the term to apply it, is important, as different
choices give different results. Both the rew and the frew commands try to apply
the rules in a “round-robin” fashion. However, the highest priority of rew is to apply
rules as close to the “top” of the term as possible, and thereafter to apply the rules
to the leftmost subterms. The frew command is more “fair” w.r.t. where in the term
to apply the rules. Both rew and frew are deterministic in the sense that two
executions of the same command starting with the same initial term will give the
same result.
Example 9.5. The following examples compare the rewrite commands rew and
frew. Counters of the form rule2(n) indicate that rule2 was applied n times.
Both rew and frew choose rules in a “fair” way when all rewrites happen at the top:
mod TEST-REW1 is protecting NAT .
sort Counter .
ops rule1 rule2 rule3 : Nat -> Counter [ctor] .
op f : Counter Counter Counter -> Counter [ctor] .
vars N M K : Nat .
rl [rule1] : f(rule1(N), rule2(M), rule3(K))
=> f(rule1(s N), rule2(M), rule3(K)) .
rl [rule2] : f(rule1(N), rule2(M), rule3(K))
=> f(rule1(N), rule2(s M), rule3(K)) .
rl [rule3] : f(rule1(N), rule2(M), rule3(K))
=> f(rule1(N), rule2(M), rule3(s K)) .
endm
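The commands that produce the counts discussed next are not shown. They were presumably bounded runs along the following lines; the bound of 100 and the initial counter values 0 are assumptions, chosen because the reported counts add up to 34 + 33 + 33 = 100.

```maude
--- Assumed commands; the bound 100 and the initial counters 0 are assumptions.
rew  [100] in TEST-REW1 : f(rule1(0), rule2(0), rule3(0)) .
frew [100] in TEST-REW1 : f(rule1(0), rule2(0), rule3(0)) .
```

Each result is a term of the form f(rule1(i), rule2(j), rule3(k)), so the counters i, j, and k can be read off directly as the number of times each rule was applied.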
In both cases, rule1 was applied 34 times and the other rules 33 times each. The
application of the rules seems less fair when the rewrites happen in a subterm, since
rew applies rules in a leftmost-outermost way, while frew is fair also w.r.t. giving
each subterm a chance to rewrite.
9.2 Simulating Single Behaviors 149
Since rew first looks at the leftmost subterm, it always applies the rules that are
applicable there, while frew tries to apply rules in all subterms. ♦
Exercise 137 Declare an associative (assoc) and commutative (comm) choice op-
erator _?_ and use only one rewrite rule so that e.g. the term 1 ? 2 ? 3 ? 4 can
change to either 1, 2, 3, or 4. Use Maude’s rew and frew commands to test which
element is chosen from the terms 1 ? 2 ? 3 ? 4 and 6 ? 2 ? 3.
Exercise 139 Another version of the coffee bean game has the following rules:
• • −→ ◦ ◦ ◦ ◦          • ◦ −→ ◦ ◦ ◦ •
◦ • −→ •                ◦ ◦ −→ ◦
1. Specify this game in Maude and play it in Maude. Does it always terminate?
2. Prove that the game is nonterminating or prove that it is terminating.
3. Is the game confluent?
4. If the game is confluent and terminating, what is the result of playing the game?
Exercise 140 Execute your specifications of all the whiteboard games in Exer-
cise 123 with both rew and frew on an initial state with the numbers 2, 11, 21,
27, 77, and 85. Who ends up with the smallest number: you or Maude?
Exercise 141 Simulate your “Tower of Hanoi” specification with four rods and five
disks for at most 1000 rewrite steps. Does Maude find the right solution?
Exercise 142 Execute your Traveling Salesman specifications from Exercise 125
with rew and frew. Does Maude select a trip with cost less than 21?
Exercise 143 Execute your Turing machine simulator from Exercise 126, for exam-
ple on the Turing machines solving Exercises 51 and 52.
9.3 Search
While using the rew and frew commands to execute one out of possibly many
different behaviors can be very useful for a first prototyping of a specification, such
executions may not be sufficient to deeply understand a specification. For example,
no matter how many times we execute the module ONE-FOOTBALL-GAME, the home
team never loses. After many such tests one could therefore be tempted to conclude
that “the visiting team cannot win a football game,” which is clearly wrong. We
therefore need to be able to analyze specifications further.
Maude provides a search command which searches through all behaviors from
a given initial state and returns all—or a user-given number of—states which can
be reached from the initial state and which satisfy the given search condition. The
search may be restricted to analyze all behaviors up to n rewrite steps.
Maude’s search command searches in a breadth-first way through all behaviors
from the initial state. That is, Maude first visits all states reachable in one
(sequential) rewrite step from the initial state, then it visits all states reachable in two steps
from the initial state, and so on. Maude stores the visited states and ignores states
which have been visited earlier during the search. This kind of search may not
terminate if an infinite number of states are reachable from the initial state.
The basic forms of the search command are
search t0 arrow pattern .
and
search t0 arrow pattern such that cond .
The term t0 is the initial state, pattern is a constructor term which can contain vari-
ables, and cond is a condition which has the same form as a condition of an equation.
A term t satisfies the search condition if pattern matches t and cond holds for the
matching substitution. The arrow is either =>1, =>*, =>+, or =>! and indicates in
how many (sequential) rewrite steps the desired terms are to be found:
=>1: states which can be reached in exactly one step from the initial state t0 ;
=>*: states reachable in zero or more steps from t0 ;
=>+: states reachable in one or more steps from t0 ; and
=>!: states reachable from t0 that cannot be further rewritten.
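The four arrows can be compared on a tiny example. The module below is a hypothetical sketch, not taken from the text: a counter that can be incremented as long as it is below 3, so that only four states are reachable.

```maude
--- Hypothetical sketch: a counter that can be incremented up to 3.
mod BOUNDED-COUNTER-SKETCH is protecting NAT .
  sort State .
  op c : Nat -> State [ctor] .
  var N : Nat .
  crl [inc] : c(N) => c(s N) if N < 3 .
endm

search c(0) =>1 S:State .   --- exactly one step: c(1)
search c(0) =>* S:State .   --- zero or more steps: c(0), c(1), c(2), c(3)
search c(0) =>+ S:State .   --- one or more steps: c(1), c(2), c(3)
search c(0) =>! S:State .   --- cannot be rewritten further: c(3)
search c(0) =>* c(M:Nat) such that M:Nat > 1 .   --- c(2) and c(3)
```

The last command combines a pattern with a such that condition: a reachable state is a solution only if it matches c(M:Nat) and the condition holds for the matching value of M.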
The command
Maude> search person("Babko", 84, widow) =>1 P:Person .
searches for all states reachable in one step from person("Babko", 84, widow)
that match the variable P:Person. (Remember that variables of the form name:sort
can be used without being explicitly declared. A search pattern can use both such
undeclared variables and variables declared in the module being analyzed.) Since the
variable P:Person matches all terms of sort Person, the command finds
all states reachable in one step from person("Babko", 84, widow). The output
from a search consists of all the matching substitutions:
Solution 1 (state 1)
P:Person --> person("Babko", 84, deceased)
Solution 2 (state 2)
P:Person --> person("Babko", 85, widow)
No more solutions.
Finally, one may be interested in how it may end; that is, what are the possible
final states from which nothing more will happen?
Maude> search person("Peter", 46, married) =>! P:Person . ♦
The command
show path n .
outputs the shortest rewrite sequence from the initial state to state number n in the
previous search, and the command
show path labels n .
outputs the sequence of rules (represented by their labels) applied in that sequence.
Example 9.7. In Example 9.6 we searched for all states where the age of "Edward" is
35. The solution in which this person was divorced had number 46. The command
show path 46 . then makes Maude show the path leading to the divorced state:
Maude> show path 46 .
birth-day
birth-day
birth-day
successful-proposal
marriage
separation
divorce ♦
A search (with an arrow different from =>1) will not terminate if there are infinitely
many states reachable from the initial state. This is because the search command
looks for all results. One may therefore put an upper bound n on the number of
solutions, using the syntax search [n] ..., and/or put an upper bound d on the
number of rewrite steps in the behaviors, using the syntax search [n, d] ... or
search [, d] ...
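The effect of the two bounds can be tried out on a specification with infinitely many reachable states. The following is a hypothetical sketch; the module name and the counter representation are assumptions made for illustration.

```maude
--- Hypothetical sketch: an unbounded counter with infinitely many reachable states.
mod UNBOUNDED-COUNTER-SKETCH is protecting NAT .
  sort State .
  op c : Nat -> State [ctor] .
  var N : Nat .
  rl [inc] : c(N) => c(s N) .
endm

--- at most 3 solutions; the search stops as soon as they are found
search [3] c(0) =>* c(M:Nat) such that M:Nat > 5 .

--- no bound on solutions, but only behaviors of at most 4 rewrite steps
search [, 4] c(0) =>* S:State .

--- at most 2 solutions within at most 10 rewrite steps
search [2, 10] c(0) =>* c(M:Nat) such that M:Nat rem 2 == 0 .
```

Without any bound, each of these searches would continue forever, since every state c(n) has the successor c(n + 1).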
Exercise 144
1. Assume that Maude instead would search the rewrite paths from the initial state
in a “depth-first” way. Could we still guarantee that searching for n solutions
would always be successful if there exist at least n solutions?
2. Can you use Maude’s search command to prove that it is impossible to go from
the state person("Gilgamesh", 50, married) to a state in which the noble
man’s age is less than 50, provided the birthday rule has no age limit?
3. Explain why it is impossible to implement a search command which always
terminates and which can be used to find whether there exists (at least) one
reachable state from the initial state that is matched by the search pattern.
Exercise 145 In this exercise you should use Maude’s search command to analyze
the coffee bean game described in Section 8.2.3.
1. What are the possible results of the game when starting with the bean sequence
◦ • ◦ ◦ ◦ • • ◦ ◦ • • • ◦ ◦
Ask Maude to display the run which resulted in the fewest remaining beans.
2. Is it the case that each state reachable from an initial state with an even number
of black beans will contain an even number of black beans? Test this on the
initial states ◦ • ◦ ◦ ◦ • • ◦ ◦ • • • ◦ ◦ and ◦ ◦ • • ◦ • ◦ • .
3. Search for all the results of playing the game when the initial state contains an
odd number of black beans. Try this for a couple of initial states, such as for
example • ◦ • ◦ ◦ ◦ • • ◦ ◦ • • • ◦ ◦ and • ◦ ◦ • • ◦ • ◦ •.
Do you see any pattern in the answers?
4. Check some more examples and suggest whether it is always possible to end up
with one coffee bean no matter what the initial state is.
Exercise 146 Use Maude to prove that each nonextensible rewrite sequence in the
module SORT on page 142 starting with the list 8 4 0 -3 76 54 21 ends with
the sorted list -3 0 4 8 21 54 76.
Exercise 147 Analyze your solutions of the whiteboard game to see if it is possible to
end up with a number smaller than 13 or greater than 65, starting from the initial
state in Exercise 140.
Exercise 149 Use search to analyze your specifications of the Traveling Salesman
problem in Exercise 125.
1. Is it possible, in your specification of the standard version of the problem, to
reach a non-final state where the salesperson visits a city for the second time?
2. For each of the specifications: is it possible to find a trip with cost less than 17?
Exercise 150 Use search to check whether all executions of your Turing machines
in Exercise 143 end with the expected tape/state values.
Exercise 151 Storing all visited states can be a bottleneck in a Maude search.
1. Would not storing all visited states lead to (significantly?) less memory usage
in a breadth-first search?
2. What would be the disadvantages of not storing all visited states?
Chapter 10
Concurrent Objects in Maude
denote an object of class C which has the name (or identifier) o and attributes att1
to attn , whose current values are val1 to valn , respectively. Continuing our example
from Chapter 8, a Person object in a certain state could be represented by a term
< "Edward" : Person | age: 32, status: single >.
Letting a sort Object denote objects, a class C can be declared using a constructor
op <_: C | att1 :_, . . ., attn :_> : Oid s1 ... sn -> Object [ctor] .
© Springer-Verlag London 2017
P.C. Ölveczky, Designing Reliable Distributed Systems, Undergraduate Topics
in Computer Science, DOI 10.1007/978-1-4471-6687-0_10
where Oid is some sort denoting object identifiers, and s1 to sn are the sorts of the
attributes att1 to attn . A class Person may therefore be declared
sorts Object Oid .
subsort String < Oid .
op <_: Person | age:_, status:_> : Oid Nat Status -> Object [ctor] .
The objects of the above form are then terms of sort Object.
A system may also contain messages, which are terms of a sort Msg. A distributed
system can then be seen as a multiset of objects and of messages traveling between
objects. A sort Configuration denoting such multisets can be defined as expected:
sorts Object Msg Configuration .
subsorts Object Msg < Configuration .
op none : -> Configuration [ctor] .
op _ _ : Configuration Configuration -> Configuration
[ctor assoc comm id: none] .
Rewrite rules define the behavior of objects, including their treatment of messages.
The left-hand side is a multiset of objects and messages, and so is the right-hand
side. A rule may involve zero, one, or many objects, and zero or more messages.
The objects need not be the same on both sides: objects may be created and/or
deleted by a rule, and so may messages.
The following rewrite rule has the same object on both sides:
vars X X’ : String . vars N N’ : Nat . vars S S’ : Status .
crl [birthday] :
< X : Person | age: N, status: S >
=>
< X : Person | age: N + 1, status: S > if N < 999 .
The rule defines the local state change for an object. With this rule we have
< "Mette" : Person | age: 21, status: single > C −→
< "Mette" : Person | age: 22, status: single > C
for any configuration C because of the Congruence rule in rewriting logic, since
< "Mette" : Person | age: 21, status: single > −→
< "Mette" : Person | age: 22, status: single >
o1 ... on −→ o1’ ... on’,
since the Congruence rule of rewriting logic with respect to the operator _ _ gives
a concurrent step o1 o2 −→ o1’ o2’, another application of Congruence then gives
(o1 o2) o3 −→ (o1’ o2’) o3’, and so on. For example, in the concurrent one-step rewrite
< "Peter" : Person | age: 20, status: single >
< "Mette" : Person | age: 21, status: single >
< "Ingrid" : Person | age: 17, status: single >
−→
< "Peter" : Person | age: 21, status: single >
< "Mette" : Person | age: 22, status: single >
< "Ingrid" : Person | age: 18, status: single >
Such a rule models synchronous communication where two (or more) objects meet
and perform an action together. Any two objects may meet in this way, due to com-
mutativity and associativity of the constructor _ _. For example there is a rewrite
< "Bence" : Person | age: 35, status: single >
< "Peter" : Person | age: 36, status: single >
< "Daniele" : Person | age: 29, status: single >
−→
< "Bence" : Person | age: 35, status: engaged("Daniele") >
< "Peter" : Person | age: 36, status: single >
< "Daniele" : Person | age: 29, status: engaged("Bence") >
is the same as
< "Bence" : Person | age: 35, status: single >
< "Daniele" : Person | age: 29, status: single >
< "Peter" : Person | age: 36, status: single >
to the state
< "Hamlet" : Person | age: 28, status: single >
< "Old Norway" : Person | age: 67, status: married("Ingrid") >.
The right-hand side of a rule may contain objects not present in the left-hand side,
in which case these additional objects are “created” by the rule and added to the
state. For example, to model the birth of a new person, we include an extra object,
containing a list of attractive names, in the state. Using the module
fmod STRING-LIST is protecting STRING .
sort StringList .
subsort String < StringList .
op nil : -> StringList [ctor] .
op _ _ : StringList StringList -> StringList [ctor assoc id: nil] .
endfm
which defines lists of strings, a class containing attractive names could be declared
op <_: Names | OKnames:_> : Oid StringList -> Object [ctor] .
The following rule then models the birth of a person, where the name of the newborn
is chosen nondeterministically among the favored names:
vars L L’ : StringList .
crl [birth] :
< X : Person | age: N, status: married(X’) >
< X’’ : Names | OKnames: L X’’’ L’ >
=>
< X : Person | age: N, status: married(X’) >
< X’’ : Names | OKnames: L X’’’ L’ >
< X’’’ : Person | age: 0, status: single > if N < 60 .
"Zeus" may want a separation (and later a divorce) so that he can marry his sister
"Hera". In one application of the rule separationInit the above state rewrites to
< "Zeus" : Person | age: 700, status: separated("Dione") >
separate("Dione")
< "Hera" : Person | age: 19, status: single >
< "Dione" : Person | age: 21, status: married("Zeus") >
1 Unfortunately, this straightforward way of separating by message passing may destroy future
marriages, as explained in Section 11.2.1.1.
*** Classes:
op <_: Names | OKnames:_> : Oid StringList -> Object [ctor] .
op <_: Person | age:_, status:_> : Oid Nat Status -> Object
[ctor] .
*** Message for separating from spouse:
op separate : Oid -> Msg [ctor] .
sort Status .
op single : -> Status [ctor] .
ops engaged married separated : Oid -> Status [ctor] .
crl [birthday] :
< X : Person | age: N, status: S >
=>
< X : Person | age: N + 1, status: S > if N < 999 .
crl [engagement] :
< X : Person | age: N, status: single >
< X’ : Person | age: N’, status: single >
=>
< X : Person | age: N, status: engaged(X’) >
< X’ : Person | age: N’, status: engaged(X) >
if N > 15 /\ N’ > 15 .
crl [birth] :
10.1 Modeling Concurrent Objects in Maude 161
rl [separationInit] :
< X : Person | age: N, status: married(X’) >
=>
< X : Person | age: N, status: separated(X’) >
separate(X’) .
rl [acceptSeparation] :
separate(X)
< X : Person | age: N, status: married(X’) >
=>
< X : Person | age: N, status: separated(X’) > .
To avoid typing large states each time you execute your specification it can be useful
to define “abbreviations” for initial states such as the constant greeks above, so that
we can execute the specification as follows:
Maude> frew [10] greeks .
Exercise 152
1. Is there a one-step concurrent rewrite
< "Zeus" : Person | age: 700, status: single >
< "Hera" : Person | age: 19, status: single >
< "Dione" : Person | age: 21, status: single >
−→
in which "Hera" celebrates her birthday while the others are getting engaged?
2. Is there a one-step concurrent rewrite
< "Zeus" : Person | age: 700, status: single >
< "Dione" : Person | age: 21, status: single >
−→
< "Zeus" : Person | age: 700, status: engaged("Dione") >
< "Dione" : Person | age: 22, status: engaged("Zeus") >
where "Dione" celebrates her birthday and her engagement at the same time?
3. Define the rule for marriage.
4. Use Maude’s search command to prove that there is a behavior from greeks to
a state in which the age of "Kronos" is 807. Try to avoid mentioning the other
objects explicitly in the search pattern. Repeat the search for ages 810 and 811.
5. Search for a state in which both "Zeus" and "Hades" have been born.
6. Define a rule twinBirth for the birth of twins in one step. You may assume that
the list of names contains at least two distinct names.
7. Can more than one person be born at the same time using rule birth?
8. Define the rules for separation, divorce, and the death of a non-single person.
9. Use the command frew to execute your specification in Maude.
One Full Maude command worth mentioning here is (show all .), which dis-
plays the Maude module which results from Full Maude’s translation into Maude.
The Maude command trace exclude FULL-MAUDE . should be given (without
parentheses) after the command set trace on . to trace a Full Maude execution.
The sorts Oid, Object, Msg, and Configuration with the constructors described
above are defined in the following module CONFIGURATION (given in the file
prelude.maude) which is automatically imported in any object-oriented module.2
mod CONFIGURATION is
sorts Attribute AttributeSet .
subsort Attribute < AttributeSet .
op none : -> AttributeSet [ctor] .
op _,_ : AttributeSet AttributeSet -> AttributeSet
[format (o m so o) ctor assoc comm id: none] .
2 The module below has been slightly changed by the author to get better formatted output; the
same formatting should be added to Full Maude’s CONFIGURATION module.
The sort Cid denotes class identifiers, and the sort AttributeSet denotes multisets
of attribute-value pairs, so that the order in which the attributes are given does not
matter. Classes are declared with syntax (note the blank also before the colon)
class C | att1 : s1 , ..., attn : sn .
if object identifiers are strings. Objects are written as before, with the difference that
a colon is preceded by a blank, and that the order of attributes does not matter:
< "Edward" : Person | status : single, age : 32 > .
Only a few of the attributes of an object may affect, or be affected by, the applica-
tion of a rewrite rule. Only attributes whose values are changed need to be present
in the right-hand side of a rule, and only those attributes whose values affect the
applicability of a rule, the new values of the attributes changed by the rule, or the
messages need to be present in the left-hand side of a rule.
For example, since the status of a person is not changed in the birthday rule,
and the status does not affect the “next” age of a person, the status attribute may
be omitted from the birthday rule:
crl [birthday] :
< X : Person | age : N >
=>
< X : Person | age : N + 1 > if N < 999 .
The age of a person influences whether the person can be engaged, but is not itself
changed by the engagement, so the age may be omitted from the right-hand side:
crl [engagement] :
< X : Person | age : N, status : single >
< X’ : Person | age : N’, status : single >
=>
< X : Person | status : engaged(X’) >
< X’ : Person | status : engaged(X) > if N > 15 and N’ > 15 .
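The convention for omitted attributes can be mimicked in a few lines of Python. This is only an analogy, not how Full Maude works internally: objects are plain dictionaries, the function names birthday and engagement merely echo the rule labels, and the nickname attribute is a hypothetical extra attribute added to show that unmentioned attributes are carried over unchanged.

```python
def birthday(person):
    """Mirror of rule birthday: mentions only age, so all other
    attributes are copied to the result unchanged."""
    if person["age"] < 999:
        return {**person, "age": person["age"] + 1}
    return None  # the rule is not applicable

def engagement(p1, p2):
    """Mirror of rule engagement: reads age and status, rewrites only status."""
    if (p1["status"] == "single" and p2["status"] == "single"
            and p1["age"] > 15 and p2["age"] > 15):
        return ({**p1, "status": ("engaged", p2["name"])},
                {**p2, "status": ("engaged", p1["name"])})
    return None  # the rule is not applicable

edward = {"name": "Edward", "age": 32, "status": "single"}
# 'nickname' is a hypothetical attribute that neither rule mentions.
mette = {"name": "Mette", "age": 21, "status": "single", "nickname": "M"}

older_mette = birthday(mette)
couple = engagement(edward, mette)
```

In both functions the attributes that the rule does not mention survive the step untouched, which is the behavior the omitted-attribute notation is meant to express.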
10.2 Concurrent Objects in Full Maude 165
The partial specification can then be given as follows (note the parentheses):
load full-maude
crl [birthday] :
< X : Person | age : N >
=>
< X : Person | age : N + 1 > if N < 999 .
crl [engagement] :
< X : Person | age : N, status : single >
< X’ : Person | age : N’, status : single >
=>
< X : Person | status : engaged(X’) >
< X’ : Person | status : engaged(X) > if N > 15 and N’ > 15 .
10.2.3 Subclasses
Just as for subsorts, where subsort Ape < Animal means that every Ape is also
an Animal and therefore “inherits” all properties and functionalities of an animal,
so subclass B < C means that every B-object is also a C-object which inherits all
the attributes and all functionality of the class C, so that a rule
also applies to B-objects, whose set of attributes is att1, ..., attn, att1’, ..., attk’.
Full Maude supports multiple inheritance, where a class may be a subclass of a
number of classes:
subclass C < C1 ... Cn .
In this case, the set of attributes of a C-object is the union of the sets of attributes
of C1 to Cn and those declared in C. The class C also inherits all rewrite rules of its
superclasses. A superclass Ci may itself be a subclass of some other class.
Example 10.1. We extend our example to model the fact that some people are Chris-
tian, some are Muslim, and some are neither. Important events for a Christian are
baptism and confirmation, and an important event for a Muslim is the hajj (the pil-
grimage to Mecca). Both Christians and Muslims are persons: they celebrate birth-
days, engagements, marriages, and they separate, divorce, and die like all persons.
There are at least two different ways of extending the module POPULATION with
the important religious events:
1. The religion of a person is given at birth.
2. A person is born without religion, but can become Christian or Muslim by
being baptized or by being read the call-for-prayer (or publicly pronouncing the
declaration of faith), respectively.
We first model that one’s religion is given at birth. Since Christians and Muslims
are persons, we define the classes Christian and Muslim as subclasses of Person:
sort ChristianStatus .
ops notBapt baptized confirmed : -> ChristianStatus [ctor] .
(The attribute hajji is true iff a Muslim has done a hajj.) The rules for baptism
and confirmation are straightforward:
rl [baptism] :
< X : Christian | chrStatus : notBapt >
=>
< X : Christian | chrStatus : baptized > .
rl [confirmation] :
< X : Christian | chrStatus : baptized >
=>
< X : Christian | chrStatus : confirmed > .
The rule for hajj shows that a Muslim can do the pilgrimage more than once:
rl [hajj] : < X : Muslim | > => < X : Muslim | hajji : true > .
Since the rule birth also applies to Muslims and Christians, a religious couple
may get a non-religious offspring. The following rule models the possibility that a
newborn child is a Muslim if one of his/her parents is Muslim:
crl [birthMuslim] :
< X : Names | OKnames : L X’ L’ >
< X’’ : Muslim | age : N, status : married(X’’’) >
< X’’’ : Person | age : N’, status : married(X’’) >
=>
< X : Names | OKnames : L L’ >
< X’’ : Muslim | >
< X’’’ : Person | >
< X’ : Muslim | age : 0, status : single, hajji : false >
if N < 60 or N’ < 60 .
A typical initial state of this system is
< "Possible names" : Names | OKnames : "Aaron" "Isaac" >
< "Imtiaz" : Muslim | age : 30, status : married("Maiken"),
hajji : false >
< "Maiken" : Christian | age : 29, status : married("Imtiaz"),
chrStatus : confirmed >
< "Panchen Lama" : Person | age : 28, status : single >.
In the second version, only a Person is born, and can later become a Christian
or a Muslim. We therefore need a rule for baptism, so that both non-believers
and Muslims can be baptized, while Christians are already baptized and cannot be
baptized again. The rule
rl [baptism] :
< X : Person | age : N, status : S >
=>
< X : Christian | age : N, status : S, chrStatus : baptized > .
cannot be used since it would allow a Christian to be baptized again. How can we
modify the rule so that only non-believers and Muslims can be baptized? The easiest
way is to define two sorts ChrObject and MuslimObject using memberships:
sorts ChrObject MuslimObject .
subsorts ChrObject MuslimObject < Object .
mb (< X : Christian | >) : ChrObject .
mb (< X : Muslim | >) : MuslimObject .
The change of class corresponds to the deletion of an object and the creation of
another object with the same name but a different class. All the attributes of the new
object must therefore be provided in the right-hand sides of class-changing rules. ♦
A search pattern < o : C | att : pattern > in an object-oriented system will match
any object of class C or of a subclass of C whose attribute att is matched by pattern.
Therefore, there is no need to worry about subclasses or mentioning all the attributes
in the search pattern. This can be seen from the “echo” of the search command:
Maude> (search [1] greeks =>*
C:Configuration < "Uranus" : Person | age : 902 > .)
Solution 1
C:Configuration -->
< "Gaia" : Person | age : 999, status : married("Uranus") > ;
V#0:Person --> Person ;
V#1:AttributeSet --> status : married("Gaia")
The command echo shows that Full Maude replaces the class names with vari-
ables (V#0 above) that can be used to capture objects belonging to subclasses of the
class C (Person in the above example). Likewise, the “remaining” attributes of each
object are captured by variables (V#1 above) of the sort AttributeSet. The search
result shows that the (least) class of the object is Person.
Warning: When you search for a pattern that contains an object whose attribute
values you are not interested in, you must use none for the attribute set in the search
pattern instead of just leaving the “place for the attributes” empty.
Variables in search commands with such that-conditions need to be written in
their “explicit” form var:sort:
Maude> (search [1] greeks =>*
C:Configuration < "Uranus" : Person | age : N:Nat >
such that N:Nat > 902 .)
which outputs the (core) Maude version of the current module. One can then cut-
and-paste the output from this command into (core) Maude and perform the search
in (core) Maude:
Maude> search [1] greeks =>*
C:Configuration
< "Uranus" : Person | age : 903, status : S:Status > .
Solution 1 (state 3)
C:Configuration -->
< "Gaia" : Person | age : 999, status : married("Uranus") >
S:Status --> married("Gaia")
Instead of cutting-and-pasting, you can specify your Full Maude module in a file,
say file.maude, which ends with the lines
(show all .)
q
will then write the equivalent Maude module to the file core-maude-file.maude.
Remove the welcome and farewell greetings from this file and enter it into Maude.
The specification can then be analyzed using all of Maude’s features.
• Commands such as red, rew, and search should likewise be enclosed by a pair
of parentheses.
• The commands in and load should be treated by (core) Maude and should not
be enclosed by parentheses.
• Many Maude commands and features—such as the debugger and the show path
command—are not available in Full Maude. See the Maude manual for details.
• Load the file full-maude.maude to activate Full Maude.
Exercise 153 Complete the Full Maude module POPULATION with rules for birth,
marriage, separation, divorce, and death. Avoid superfluous attributes. Execute
your specification in Full Maude.
Exercise 154 The second version of our example allows a Christian to convert to
Islam, and vice versa. How would you modify the specification to disallow that?
The dining philosophers problem [29] is a classic example due to Dijkstra. It is used
to illustrate some concepts in distributed systems whose components need to access
shared resources such as printers or shared memory.
Five philosophers sit around a round table with an enormous bowl filled with de-
licious dumplings in the middle of the table. Each philosopher spends her life al-
ternating between thinking, being hungry, eating, then thinking again, and so on,
in a never-ending cycle. However, even this seemingly idyllic setting is not perfect.
By a cruel quirk of fate there are only five chopsticks on the table: one chopstick
between each neighboring pair of philosophers, as seen in Fig. 10.1. We all know
that dumplings are delicious but hot and slick, so a philosopher needs both her left
chopstick and her right chopstick to eat.
A hungry philosopher will first grab a (left or right) chopstick if one is available,
and will then hold on to this stick until she grabs the other chopstick and starts
eating. No philosopher can eat forever, so after a finite time of eating, an eating
philosopher must put back both chopsticks, and start thinking.
There are some intriguing questions about this world. Is it possible that all
philosophers will starve to death due to lack of available chopsticks? Is it possi-
ble that one philosopher will starve to death while the others are feasting?
10.3 Example: The Dining Philosophers 171
This section presents an object-oriented model which specifies all possible behav-
iors of the philosophers system.
I choose to model an available chopstick as a message, so that a “message”
chopstick(i) means that chopstick i is available, and can be seen as a message
which can be read and consumed by a philosopher, who then “has” the chopstick.
When the philosopher stops eating she sends two chopstick messages into the con-
figuration, making the chopsticks available again. Chopsticks are defined as follows:
msg chopstick : Nat -> Msg .
where i denotes the number of the philosopher. The philosopher class is declared
class Philosopher | state : State, #sticks : Nat, #eats : Nat .
sort State .
ops thinking hungry eating : -> State [ctor] .
Each philosopher starts in a thinking state without a chopstick in hand. The rule
hungry models the philosopher becoming hungry:
vars I J K : Nat .
rl [hungry] :
< I : Philosopher | state : thinking >
=>
< I : Philosopher | state : hungry > .
The rule grabFirst models the philosopher grabbing her first chopstick, which
could be either her left or her right chopstick:
crl [grabFirst] :
chopstick(J)
< I : Philosopher | state : hungry, #sticks : 0 >
=>
< I : Philosopher | state : hungry, #sticks : 1 >
if I can use stick J .
A philosopher can start eating when she grabs her second chopstick:
crl [grabSecond] :
chopstick(J)
< I : Philosopher | #sticks : 1, #eats : K >
=>
< I : Philosopher | state : eating, #sticks : 2, #eats : K + 1 >
if I can use stick J .
The last rule stops the eating and puts the chopsticks back on the table:
rl [stopEating] :
< I : Philosopher | state : eating >
=>
< I : Philosopher | state : thinking, #sticks : 0 >
chopstick(I) chopstick(right(I)) .
A distributed system where processes need exclusive access to shared resources may
deadlock. This means that the system is stuck and nothing can happen in the system
because no process can proceed until it gets a shared resource which is controlled
by another process (which is also stuck, since it may need, e.g., some resource con-
trolled by the first process). A deadlock here could be a state where each philosopher
has one chopstick, and cannot do anything because there are no chopsticks available.
(Exercise 155 uses Full Maude to analyze whether the system may deadlock.)
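To make the notion of a deadlock state concrete, here is a small Python sketch that mirrors the rules hungry, grabFirst, grabSecond, and stopEating and explores all reachable states. It is an illustration under stated assumptions, not the Maude analysis itself: the function right and the test can_use are our guesses for definitions not shown above (sticks numbered 1 to 5, philosopher i using sticks i and right(i)), and the #eats counter is left out so that the state space stays finite.

```python
from collections import deque

N = 5  # number of philosophers and of chopsticks

def right(i):
    # Assumed neighbour function: philosopher i uses sticks i and right(i).
    return i % N + 1

def can_use(i, j):
    # Assumed meaning of the condition "I can use stick J".
    return j == i or j == right(i)

def replace(phils, idx, new_phil):
    return phils[:idx] + (new_phil,) + phils[idx + 1:]

def successors(state):
    """Yield every state reachable in one rewrite step.
    A philosopher is a pair (mode, number of sticks held)."""
    phils, sticks = state
    for idx, (mode, held) in enumerate(phils):
        i = idx + 1
        if mode == "thinking":                         # rule hungry
            yield replace(phils, idx, ("hungry", held)), sticks
        if mode == "hungry":                           # grabFirst / grabSecond
            for j in sticks:
                if can_use(i, j):
                    new_mode = "hungry" if held == 0 else "eating"
                    yield replace(phils, idx, (new_mode, held + 1)), sticks - {j}
        if mode == "eating":                           # rule stopEating
            yield replace(phils, idx, ("thinking", 0)), sticks | {i, right(i)}

def find_deadlocks(initial):
    """Breadth-first exploration; a deadlock is a state without successors."""
    seen, queue, deadlocks = {initial}, deque([initial]), []
    while queue:
        state = queue.popleft()
        nexts = list(successors(state))
        if not nexts:
            deadlocks.append(state)
        for nxt in nexts:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return deadlocks

INITIAL = (tuple(("thinking", 0) for _ in range(N)), frozenset(range(1, N + 1)))
deadlocks = find_deadlocks(INITIAL)
```

Because a philosopher records only how many sticks she holds, the situations "everybody holds her left stick" and "everybody holds her right stick" collapse into the same state in this sketch.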
Livelock (also known as starvation) is a trickier property which means that one
philosopher could starve to death because she can never get hold of both chopsticks,
while at the same time the other philosophers could feast merrily.
The fairness assumptions about this problem are: an eating philosopher eventually
stops eating; a thinking philosopher eventually becomes hungry; and a philosopher
will eventually pick up a needed chopstick if it is available infinitely often. These
assumptions are not captured by our specification (why not?). However, each finite
behavior (simulated with frew [n]) is “correct,” since it is a prefix of a behavior in
which no philosopher eats continuously. Therefore, this deficiency does not affect
the reasoning about deadlocks. However, a livelock (starvation) is an infinite sce-
nario, so that certain livelock behaviors allowed by a specification may not satisfy
the fairness constraints.
Fairness criteria which say that eventually (i.e., “some time in the future”) some-
thing must happen cannot be “implemented” in full generality, since there is no
bound on when that “something” must happen. Instead, as explained in Chapter 16,
we can analyze properties of the form “property X holds in all ‘fair’ computations.”
A solution which has been proposed to avoid deadlocks is to let each philosopher
grab both chopsticks at the same time (and not allow them to grab only one).
The philosophers could get stuck in a deadlock situation where each philosopher
proudly holds, say, her right chopstick and waits for the other chopstick, which will
never become available. The solution where each philosopher grabs both chopsticks
removes the possibility of deadlock, but not the possibility of livelock.
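The claim about deadlock freedom can be checked on a small Python model of the variant in which both chopsticks are grabbed in one step. As before, this is a sketch under assumptions: right is our guess for the neighbour function, the rule name grabBoth is hypothetical, and the #eats counter is omitted to keep the state space finite. The sketch says nothing about livelock, which concerns infinite behaviors.

```python
from collections import deque

N = 5  # number of philosophers and of chopsticks

def right(i):
    # Assumed neighbour function: philosopher i uses sticks i and right(i).
    return i % N + 1

def replace(modes, idx, new_mode):
    return modes[:idx] + (new_mode,) + modes[idx + 1:]

def successors(state):
    """Yield every state reachable in one step when a hungry philosopher
    may only take both of her chopsticks at once."""
    modes, sticks = state
    for idx, mode in enumerate(modes):
        i = idx + 1
        both = {i, right(i)}
        if mode == "thinking":                        # rule hungry
            yield replace(modes, idx, "hungry"), sticks
        elif mode == "hungry" and both <= sticks:     # hypothetical rule grabBoth
            yield replace(modes, idx, "eating"), sticks - both
        elif mode == "eating":                        # rule stopEating
            yield replace(modes, idx, "thinking"), sticks | both

def find_deadlocks(initial):
    """Breadth-first exploration; a deadlock is a state without successors."""
    seen, queue, deadlocks = {initial}, deque([initial]), []
    while queue:
        state = queue.popleft()
        nexts = list(successors(state))
        if not nexts:
            deadlocks.append(state)
        for nxt in nexts:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return deadlocks

INITIAL = (("thinking",) * N, frozenset(range(1, N + 1)))
deadlocks = find_deadlocks(INITIAL)   # empty list: no reachable deadlock
```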
The following solution has been proposed to avoid also livelocks: Philosophers
should not contemplate the deep questions of existence in the dining room, but in the
adjacent library! Furthermore, there is now a doorman (or a sophisticated turnstile
system) allowing at most four philosophers to be in the dining room at any time.
A state in this new setting can be an object of the form
< GlobalSystem : DinPhilHouse | diningRoom : philsAndSticks,
#inDinRoom : n,
library : philosophers >
Philosophers and chopsticks are modeled as before, and so is the rule hungry
which lets a thinking philosopher become hungry (although this transformation now
takes place in the library). A new rule lets a hungry philosopher enter the dining
room if there are fewer than four philosophers in the dining room:
var O : Oid . vars C C’ : Configuration .
crl [enterDinRoom] :
< O : DinPhilHouse | diningRoom : C, #inDinRoom : K,
library :
(< I : Philosopher | state : hungry > C’) >
=>
< O : DinPhilHouse | diningRoom : (< I : Philosopher | > C),
#inDinRoom : K + 1, library : C’ >
if K < 4 .
The variable C matches the configuration consisting of the philosophers and chop-
sticks already in the dining room, and C’ matches the philosophers left in the library.
The rules grabFirst and grabSecond apply as before. We could be harsh and
require that a philosopher leaves the dining room at the moment she stops eating. A
gentler version keeps the rule stopEating and adds the rule
rl [enterLibrary] :
< O : DinPhilHouse | diningRoom :
(< I : Philosopher | state : thinking > C),
#inDinRoom : s K, library : C’ >
=>
< O : DinPhilHouse | diningRoom : C, #inDinRoom : K,
library : (< I : Philosopher | > C’) > .
in which a philosopher who has started thinking leaves the dining room.
In the initial state all philosophers are in the library thinking, while the delicious
dumplings and the chopsticks are in the dining room:
< GlobalSystem : DinPhilHouse |
diningRoom : chopstick(1) chopstick(2) chopstick(3)
chopstick(4) chopstick(5),
#inDinRoom : 0,
library :
(< 1 : Philosopher | state : thinking, #sticks : 0, #eats : 0 >
< 2 : Philosopher | state : thinking, #sticks : 0, #eats : 0 >
< 3 : Philosopher | state : thinking, #sticks : 0, #eats : 0 >
< 4 : Philosopher | state : thinking, #sticks : 0, #eats : 0 >
< 5 : Philosopher | state : thinking, #sticks : 0, #eats : 0 >)
>
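The library-and-doorman setting can also be explored with a small Python model. The sketch below is an illustration, not the Maude specification: each philosopher is a triple (place, mode, sticks held), right is our assumed neighbour function, the #eats counter is omitted, and we let a thinking philosopher become hungry wherever she is, since the rule hungry can also apply inside the dining room. The sketch only looks for deadlocks and records the largest number of diners seen; it does not address livelock.

```python
from collections import deque

N = 5            # number of philosophers and of chopsticks
MAX_DINERS = 4   # the doorman admits at most four philosophers

def right(i):
    # Assumed neighbour function: philosopher i uses sticks i and right(i).
    return i % N + 1

def replace(phils, idx, new_phil):
    return phils[:idx] + (new_phil,) + phils[idx + 1:]

def count_diners(phils):
    return sum(1 for place, _, _ in phils if place == "dining")

def successors(state):
    """Yield every state reachable in one step."""
    phils, sticks = state
    diners = count_diners(phils)
    for idx, (place, mode, held) in enumerate(phils):
        i = idx + 1
        if mode == "thinking":                                   # hungry
            yield replace(phils, idx, (place, "hungry", held)), sticks
        if place == "library" and mode == "hungry" and diners < MAX_DINERS:
            yield replace(phils, idx, ("dining", mode, held)), sticks   # enterDinRoom
        if place == "dining" and mode == "thinking":             # enterLibrary
            yield replace(phils, idx, ("library", mode, held)), sticks
        if place == "dining" and mode == "hungry":               # grabFirst / grabSecond
            for j in sticks:
                if j in (i, right(i)):
                    new_mode = "hungry" if held == 0 else "eating"
                    yield replace(phils, idx, (place, new_mode, held + 1)), sticks - {j}
        if mode == "eating":                                     # stopEating
            yield replace(phils, idx, (place, "thinking", 0)), sticks | {i, right(i)}

def explore(initial):
    """Return the reachable deadlock states and the largest number of diners seen."""
    seen, queue = {initial}, deque([initial])
    deadlocks, most_diners = [], 0
    while queue:
        state = queue.popleft()
        most_diners = max(most_diners, count_diners(state[0]))
        nexts = list(successors(state))
        if not nexts:
            deadlocks.append(state)
        for nxt in nexts:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return deadlocks, most_diners

INITIAL = (tuple(("library", "thinking", 0) for _ in range(N)),
           frozenset(range(1, N + 1)))
deadlocks, most_diners = explore(INITIAL)
```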
I have used Maude to verify in a fully automatic way that this version of the
dining philosophers problem is indeed livelock-free (see Exercise 237).
Exercise 157 Consider the version of the dining philosophers in Section 10.3.6.
1. Specify this version of the dining philosophers and execute your specification.
2. Explain why there cannot be a deadlock in this specification.
The enticing casinos in Las Vegas offer the possibility of striking it rich quickly.
Instead of experimenting with different strategies on the casino floor or performing
complex error-prone statistical calculations to come up with a winning strategy, we
use Maude to simulate the outcome of gambling with different strategies.
Blackjack (“21”) is a popular card game in which each player plays against the
casino (called the dealer). The goal of a player is to amass cards with total value
closer to 21 than the dealer, but without going over 21 (“busting”). A player faces
many choices during a game: should he ask for another card? should he “double
down,” “split,” or “surrender”? should he play at a table marked “dealer must stand
on all 17’s” or at one marked “dealer must hit soft 17’s”? and so on.
Our approach to striking gold in Vegas is to simulate many rounds of the game
with the desired strategy and see how much money we are left with. We use Maude’s
built-in pseudo-random number generator random to perform randomized simula-
tions: the next card is drawn “randomly” from the remaining cards in the deck.
10.4.1 Blackjack
In blackjack, a face card counts as 10, and an ace counts as either 1 or 11. A
player/dealer has blackjack if he has two cards with total value 21.
A round of blackjack goes as follows. The player places his bet and gets one
card; the dealer then gets a card that can be seen by the player; and the player
gets his second card. The player then considers the situation and asks for new cards
(“hit”), one by one, until the player is satisfied or goes bust. The dealer must follow
a fixed pre-defined strategy, and gets his remaining cards when the player is done.
The player loses his bet if either:
• the sum of (the values of) his cards is greater than 21 even when his aces count
as 1 and even if the dealer also busts;
• the dealer has blackjack and the player has not; or
10.4 Randomized Simulations: Winning in Vegas 177
• the sum of the dealer’s cards is closer to 21, without going over 21, than the sum
of the player’s cards.
The player keeps his bet if either:
• both the player and the dealer have blackjack; or
• neither has blackjack and the best sums of their cards have the same value v ≤ 21.
The player wins 1.5 times his bet if he has blackjack and the dealer has not. In all
other cases, the player wins an amount equal to his bet.
After getting his first two cards, the player may perform any of the following
actions (typically at most once, although rules may vary):
Double down: Double his bet and get exactly one more card.
Split: If the two cards have the same value, the player may split them into two
separate hands and play on with two separate hands.
Surrender: Give up, and keep half his bet.
There are many different strategies to consider for the blackjack player, including:
• The dealer must hit (get new cards) until he gets 17; in some casinos, the dealer
must hit on “soft 17” (i.e., an ace and other cards with total sum 6) and in other
casinos the dealer must stand on “soft 17.” In which casino should you play?
• In general, how should you play based on the dealer’s visible card?
• Should you play with one deck of cards or with multiple decks?
• When should you double down, split, or surrender?
For simplicity of exposition, this section models the play of a nervous first-time
visitor to Las Vegas who adopts the following very simple strategy: stand if the
least value of your hand is ≥ 15 or if its best value is ≥ 18. In Exercise 163 you can
modify this strategy to your strategy of choice.
Cards, such as < diamonds , A > (ace of diamonds), are stored in ::-separated lists,
since we will randomly draw the n-th remaining card. A deck is a special list:
fmod CARD is
sorts Suit Value Card .
ops 2 3 4 5 6 7 8 9 10 J Q K A : -> Value [ctor] .
ops spades hearts clubs diamonds : -> Suit [ctor] .
op <_,_> : Suit Value -> Card [ctor] .
Since an ace can count as either 1 or 11, we define here and in Exercise 158
different sums of the cards in a hand: leastValue, largestValue, and bestValue.
fmod RESULT is protecting CARD + NAT .
ops leastValue largestValue bestValue : Cards -> Nat .
The expression result(player, dealer, bet) defines the payment (including the
original bet) to a player after a game in which he bet $bet and ended with hand
player, while the dealer finished with the hand dealer:
op result : Cards Cards Nat -> Nat .
eq result(PLAYER, DEALER, BET)
= if blackJack(PLAYER) and (not blackJack(DEALER))
then (5 * BET) quo 2 --- blackjack!
else (if bestValue(PLAYER) <= 21 and
(bestValue(PLAYER) > bestValue(DEALER)
or leastValue(DEALER) > 21)
then (BET + BET) --- player wins
else (if (blackJack(PLAYER) and blackJack(DEALER))
or
((not blackJack(DEALER))
and
(bestValue(PLAYER) <= 21)
and (bestValue(PLAYER) == bestValue(DEALER)))
then BET --- push
else 0 fi) fi) fi . --- player loses
endfm
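The value functions leastValue, largestValue, and bestValue are left to Exercise 158. As a hedged illustration of the idea only, the following Core Maude sketch abstracts a hand into a list of point values. It assumes that face cards are written as 10 and every ace as 1, and it relies on the observation that at most one ace can usefully count as 11. The module name, the sort PointList, and the helper aces are hypothetical and are not part of the specification above.

```maude
fmod HAND-VALUE-SKETCH is
  protecting NAT .
  sort PointList .                 --- assumption: a hand as a list of point values
  subsort NzNat < PointList .
  op nil : -> PointList [ctor] .
  op _::_ : PointList PointList -> PointList [ctor assoc id: nil] .

  var P : NzNat .  var PL : PointList .

  op aces : PointList -> Nat .     --- number of aces (entries equal to 1)
  eq aces(nil) = 0 .
  eq aces(P :: PL) = (if P == 1 then 1 else 0 fi) + aces(PL) .

  op leastValue : PointList -> Nat .    --- every ace counts as 1
  eq leastValue(nil) = 0 .
  eq leastValue(P :: PL) = P + leastValue(PL) .

  op largestValue : PointList -> Nat .  --- every ace counts as 11
  eq largestValue(PL) = leastValue(PL) + 10 * aces(PL) .

  op bestValue : PointList -> Nat .     --- one ace counts as 11 if that does not bust
  eq bestValue(PL)
   = if aces(PL) > 0 and leastValue(PL) + 10 <= 21
     then leastValue(PL) + 10
     else leastValue(PL) fi .
endfm

red leastValue(1 :: 9) .        --- 10
red bestValue(1 :: 9) .         --- 20
red largestValue(1 :: 1 :: 9) . --- 31
red bestValue(1 :: 1 :: 9) .    --- 21
red bestValue(10 :: 6 :: 1) .   --- 17
```

Under this reading, the hand consisting of an ace and a 9 has least value 10 and best value 20, so the nervous player's rule (stand if the best value is ≥ 18) tells him to stand.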
We next model the game in an object-oriented style, where the state consists of
three classes of objects: a dealer, players, and a Table object, which contains the
remaining cards (attribute shoe), the index for the random function (rndIndex),
and information about whose turn is next. Since I tend to be alone with the dealer
on my one-and-done forays to the high-roller table, I assume for simplicity that there
is only one player at the table (extending this is trivial; see Exercise 160).
Since we deal with objects, we start using Full Maude:
load full-maude
The following rewrite rules model the start of the game: first the player gets his first
card (startGame), then the dealer gets his first card (dealerFirstCard), followed
by the player getting his second card (playerSecond). The index for the random
function must increase each time a card is taken:
vars CARD CARD2 : Card . vars CARDS CARDS2 : Cards .
var N : Nat . var NZN : NzNat . vars T P D : Oid .
rl [startGame] :
< T : Table | shoe : CARDS, rndIndex : N >
< P : Player | hand : nil, bet : NZN >
=>
< T : Table | shoe : remove(getRandomCard(CARDS, N), CARDS),
rndIndex : s N >
< P : Player | hand : getRandomCard(CARDS, N) > .
rl [dealerFirstCard] :
< T : Table | shoe : CARDS, rndIndex : N >
< D : Dealer | hand : nil >
< P : Player | hand : CARD >
=>
< T : Table | shoe : remove(getRandomCard(CARDS, N), CARDS),
rndIndex : s N >
< D : Dealer | hand : getRandomCard(CARDS, N) >
< P : Player | > .
rl [playerSecond] :
< T : Table | shoe : CARDS, rndIndex : N >
< P : Player | hand : CARD >
< D : Dealer | hand : CARD2 >
=>
< T : Table | shoe : remove(getRandomCard(CARDS, N), CARDS),
rndIndex : s N, turn : P >
< P : Player | hand : CARD :: getRandomCard(CARDS, N) >
< D : Dealer | > .
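The functions getRandomCard and remove used in these rules are not shown here. One plausible way to define such functions, given as a sketch under stated assumptions rather than as the book's definitions, is to index into the list with Maude's built-in random function (module RANDOM), where random(N) is the N-th number of a pseudo-random sequence. The sketch works on a generic list of items; the names Item, ItemList, size, nth, and getRandomItem are hypothetical.

```maude
fmod RANDOM-DRAW-SKETCH is
  protecting RANDOM .              --- built-in: op random : Nat -> Nat
  sorts Item ItemList .
  subsort Item < ItemList .
  ops a b c d : -> Item [ctor] .   --- stand-ins for cards
  op nil : -> ItemList [ctor] .
  op _::_ : ItemList ItemList -> ItemList [ctor assoc id: nil] .

  var I : Item .  vars L L' : ItemList .  var N : Nat .

  op size : ItemList -> Nat .
  eq size(nil) = 0 .
  eq size(I :: L) = s size(L) .

  op nth : Nat ItemList ~> Item .  --- 0-based position; undefined if out of range
  eq nth(0, I :: L) = I .
  eq nth(s N, I :: L) = nth(N, L) .

  --- pick the item at position random(N) rem size(L); undefined on the empty list
  op getRandomItem : ItemList Nat ~> Item .
  eq getRandomItem(L, N) = nth(random(N) rem size(L), L) .

  --- remove one occurrence of I; assumes that I occurs exactly once in the list
  op remove : Item ItemList -> ItemList .
  eq remove(I, L :: I :: L') = L :: L' .
endfm

red size(a :: b :: c :: d) .              --- 4
red remove(c, a :: b :: c :: d) .         --- a :: b :: d
red getRandomItem(a :: b :: c :: d, 7) .  --- some item of the list
```

Because random(N) is a function of N alone, the same index always yields the same draw, which is why the rules above increase rndIndex each time a card is taken.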
Next, the player hits or stands according to the simple strategy described above:
crl [playerHit] :
< T : Table | shoe : CARDS, rndIndex : N, turn : P >
< P : Player | hand : CARDS2 >
=>
< T : Table | shoe : remove(getRandomCard(CARDS, N), CARDS),
rndIndex : s N >
< P : Player | hand : CARDS2 :: getRandomCard(CARDS, N) >
if not (leastValue(CARDS2) >= 15 or bestValue(CARDS2) >= 18) .
crl [playerStand] :
< T : Table | turn : P >
< P : Player | hand : CARDS2 >
< D : Dealer | >
=>
< T : Table | turn : D >
< P : Player | >
< D : Dealer | >
if leastValue(CARDS2) >= 15 or bestValue(CARDS2) >= 18 .
We can then simulate one game of blackjack, starting with random number 7:
Maude> (rew < t : Table | shoe : deck, rndIndex : 7, turn : t >
< caesarsPalace : Dealer | hand : nil >
< peter : Player | hand : nil, bet : 100 > .)
result Configuration :
< caesarsPalace : Dealer | hand : < clubs, 3 > :: < hearts,5 > ::
< spades, 4 > :: < spades, K > >
< peter : Player | bet : 100, hand : < clubs, A > :: < diamonds, 9 > >
< t : Table | rndIndex : 13, ... >
So far, so good. Simulating single rounds is, however, not very efficient. We there-
fore model a player who spends an entire day (or as long as money lasts) at the
blackjack table as an object of the subclass MultiPlayer, which adds attributes
gamesLeft (number of games left to play), money (total amount of player money),
and eachBet (bet in each round) to the class Player, and add two rules: reset
cleans up the table after the previous round, and restart starts a new round if the
player has sufficient funds:
(omod PLAY-MANY-ROUNDS is protecting PLAY-BJ .
crl [reset] :
< T : Table | rndIndex : N, turn : D >
< D : Dealer | hand : CARDS1 >
< P : MultiPlayer | hand : CARDS2, bet : NZN, money : N2 >
=>
< T : Table | shoe : deck, rndIndex : s N, turn : T >
< D : Dealer | hand : nil >
< P : MultiPlayer | hand : nil, bet : 0,
money : N2 + result(CARDS2,CARDS1,NZN) >
if bestValue(CARDS1) >= 17 .
crl [restart] :
< P : MultiPlayer | gamesLeft : s N, bet : 0,
money : N2, eachBet : NZN >
=>
< P : MultiPlayer | gamesLeft : N, bet : NZN,
money : sd(N2, NZN) > if NZN <= N2 .
endom)
If we start the day with $1000, how much money do we have left after playing
100 rounds of blackjack with our trivial strategy?
Maude> (rew < t : Table | shoe : deck, rndIndex : 1, turn : t >
< caesarsPalace : Dealer | hand : nil >
< peter : MultiPlayer | hand : nil, bet : 0,
gamesLeft : 100, money : 1000, eachBet : 100 > .)
result Configuration :
< peter : MultiPlayer | gamesLeft : 0, money : 800, ... > ...
This is surprisingly good; the player only lost $200 after 100 rounds of $100-games.
Even if the results of a few simulations of your blackjack strategy look good, you
want stronger guarantees before quitting your day job. Our specification can be seen
as a probabilistic rewrite theory (see Chapter 17) where each card is drawn from
the deck with the same probability. The following analysis methods, discussed in
Chapter 17, provide stronger guarantees than single executions:
Probabilistic model checking: One could prove properties such as “the likelihood
of ending up with more than $1200 after a day’s work is more than 60%.”
Statistical model checking: Unfortunately, probabilistic model checking can be
very inefficient. Statistical model checking [102, 109] trades certainty for effi-
ciency by simulating single runs until the desired confidence level is reached,
and allows you to ascertain properties like “with confidence level 0.9, the
likelihood of ending up with more than $1200 after a day’s work is more than 60%.”
Value estimation: To better plan your economy as a professional blackjack player,
you may be more interested in estimating the amount of money you have at the
end of the day than the likelihood of making more than $200.
© Springer-Verlag London 2017
P.C. Ölveczky, Designing Reliable Distributed Systems, Undergraduate Topics
in Computer Science, DOI 10.1007/978-1-4471-6687-0_11
184 11 Modeling Communication in Maude
in which two parties communicate their mutual desire to marry. Objects can be seen
as “swimming” in the “soup” which makes up the state, and can meet to perform the
rule. More than two objects may of course participate in a communication event.
rl [initiateSeparation] :
< X : Person | status : married(X') >
=>
< X : Person | status : separated(X') >
separate(X') .
rl [acceptSeparation] :
separate(X)
< X : Person | status : married(X') >
=>
< X : Person | status : separated(X') > .
Both "JR" and "Sue Ellen" are now separated, and can divorce (using a
straightforward synchronous divorce rule¹), leading to the state
< "JR" : Person | age : 50, status : single >
separate("Sue Ellen")
< "Cally" : Person | age : 25, status : single >
< "Sue Ellen" : Person | age : 45, status : single >
separate("JR")
< "Cliff" : Person | age : 46, status : single >
"JR" is again single and starts courting "Cally", and eventually marries her. Like-
wise, "Sue Ellen" goes on and marries "Cliff", leading us to the state
< "JR" : Person | age : 50, status : married("Cally") >
separate("Sue Ellen")
< "Cally" : Person | age : 25, status : married("JR") >
< "Sue Ellen" : Person | age : 45, status : married("Cliff") >
separate("JR")
< "Cliff" : Person | age : 46, status : married("Sue Ellen") >
1 There is no contradiction in using a synchronous divorce rule, since the parties do not talk to each
other and hence do not know that both of them have initiated a separation when they meet in court.
11.2 Unordered Asynchronous Communication by Message Passing 187
And disaster strikes! "JR" reads the separate("JR") message (sent by "Sue
Ellen") which has been lying around, and thinks that "Cally" wants a separation:
< "JR" : Person | age : 50, status : separated("Cally") >
separate("Sue Ellen")
< "Cally" : Person | age : 25, status : married("JR") >
< "Sue Ellen" : Person | age : 45, status : married("Cliff") >
< "Cliff" : Person | age : 46, status : married("Sue Ellen") >
In the same way, "Sue Ellen"—now happily married to "Cliff"—could read the
old separation message from "JR". Two happy marriages have been broken up by
old separate messages! The problems are that
1. a separate message is not read if you are in a state separated (you don’t look
for separate messages if you think that you have separated), and
2. an old separate message can arrive a couple of years later, destroying a new
and happy marriage.
The first of these problems could be fixed by adding a rule
rl [sep2] :
separate(X)
< X : Person | status : separated(X') >
=>
< X : Person | > .
Adding this rule does not solve the second problem, since the unfortunate behav-
ior above is still possible. (Adding the sender to the separate message does not
solve our problems, since "JR" and "Sue Ellen" might remarry after their first di-
vorce, and the old separate message would destroy their new and happy marriage.)
Fortunately, it is possible to separate safely as follows: A new status waitSep(p)
denotes that a separation from p has been initiated and that the person is waiting for
the answer. The following rules specify this way of separating:
rl [initSep] :
< X : Person | status : married(X') >
=>
< X : Person | status : waitSep(X') >
separate(X') .
rl [acceptSep] :
separate(X)
< X : Person | status : married(X') >
=>
< X : Person | status : separated(X') >
separate(X') .
rl [acceptSep2] :
separate(X)
< X : Person | status : waitSep(X') >
=>
< X : Person | status : separated(X') > .
This specification describes a protocol for how each spouse should behave to suc-
cessfully separate. “Programs” for distributed systems are often protocols which
define how the distributed components should interact. Correctness of the separa-
tion protocol follows from the fact that each party must send exactly one separate
message, and must consume one separate message, in the separation process.
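The claim can be checked for one small initial state. The following is a minimal Core Maude sketch, assuming the built-in CONFIGURATION module instead of Full Maude classes, so every rule must list the single status attribute explicitly. The module name and the constant init are hypothetical, and the age attribute is omitted.

```maude
mod SAFE-SEPARATION-SKETCH is
  protecting STRING .
  including CONFIGURATION .
  subsort String < Oid .           --- strings as object identifiers
  sort Status .
  ops married waitSep separated : Oid -> Status [ctor] .
  op Person : -> Cid [ctor] .
  op status :_ : Status -> Attribute [ctor gather (&)] .
  op separate : Oid -> Msg [ctor] .
  op init : -> Configuration .

  vars X X' : Oid .

  eq init = < "JR" : Person | status : married("Sue Ellen") >
            < "Sue Ellen" : Person | status : married("JR") > .

  rl [initSep] :
     < X : Person | status : married(X') >
   =>
     < X : Person | status : waitSep(X') >
     separate(X') .

  rl [acceptSep] :
     separate(X)
     < X : Person | status : married(X') >
   =>
     < X : Person | status : separated(X') >
     separate(X') .

  rl [acceptSep2] :
     separate(X)
     < X : Person | status : waitSep(X') >
   =>
     < X : Person | status : separated(X') > .
endm

--- all states from which no further rewrite is possible
search init =>! C:Configuration .
```

For this initial state the search should report a single final state, in which both persons have status separated and no separate message remains in the configuration. This holds whether one spouse or both initiate the separation. It is a check of one initial state only, not a proof of the protocol.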
This example illustrates the difficulty with asynchronously communicating sys-
tems. It seems almost impossible to find a simpler example: only one communi-
cation event (a separation) should take place, and there is no loss or corruption of
messages, yet the problem has a fairly unintuitive solution. Furthermore, if messages
can get lost, then the problem becomes really hard (or unsolvable).
A letter sent in the mail typically consists of an envelope, with the sender and
receiver addresses, inside which there is some message content. In the rest of this
chapter, we use a message wrapper (“envelope”), so that a unicast message in the
global configuration is a term of the form
msg content from sender to receiver
11.2.2 Multicast
Multicast means that a sender sends a message to a group of recipients at once, for
example sending stock quotes or conference announcements to groups of recipients
who subscribe to such notifications.
A group of receivers can be modeled as a set of object identifiers:
class Sender | multicast-group : OidSet, ...
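One way to give meaning to a multicast, sketched here under assumptions and not taken from the text, is to let equations expand a multicast term into one unicast message per group member. The sketch uses Core Maude with the built-in CONFIGURATION module; the operator multicast_from_to_, the set constructors empty and _;_, and the constants a, b, c, and stockQuote are hypothetical names.

```maude
mod MULTICAST-SKETCH is
  including CONFIGURATION .
  sorts MsgContent OidSet .
  subsort Oid < OidSet .
  op empty : -> OidSet [ctor] .
  op _;_ : OidSet OidSet -> OidSet [ctor assoc comm id: empty] .

  --- unicast wrapper ("envelope") and a multicast that dissolves into unicasts
  op msg_from_to_ : MsgContent Oid Oid -> Msg [ctor] .
  op multicast_from_to_ : MsgContent Oid OidSet -> Configuration .

  ops a b c : -> Oid [ctor] .
  op stockQuote : -> MsgContent [ctor] .

  var MC : MsgContent .  vars O O' : Oid .  var OS : OidSet .

  eq multicast MC from O to empty = none .
  eq multicast MC from O to (O' ; OS)
   = (msg MC from O to O') (multicast MC from O to OS) .
endm

red multicast stockQuote from a to (b ; c) .
```

The reduction should yield a configuration with two unicast messages, one addressed to b and one to c. A sender's rule can then simply emit a multicast term addressed to the value of its multicast-group attribute.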
11.2.3 Broadcast
Broadcast means that a node sends a message to all the (other) nodes in the sys-
tem. An example is a television satellite system that broadcasts TV signals to all
households in the world that have certain kinds of reception equipment. Unlike for
multicast, a broadcasting node does not know the group of receivers. The idea is to
transform a broadcast message into a multicast message to all the other nodes in the
system. To have “control” over all the nodes in the system, we introduce an operator
sort GlobalSystem .
op {_} : Configuration -> GlobalSystem [ctor] .
and require that the whole state has the form {conf }, for some configuration conf .
A broadcast message wrapper can be declared
op broadcast_from_ : MsgContent Oid -> Configuration .
Assuming that the nodes in the system are objects of a class Node,² and knowing
that all the objects in the system are enclosed within the curly braces, the following
equations define a broadcast message to be a multicast message to all other nodes
in the system:
2 This is not a significant restriction, since any class can be a subclass of the class Node.
The function objectIds gives the set of object identifiers in a configuration. Broad-
casting a message is done by a rule of the form
rl [broadcast] :
< o : ... > => < o : ... > broadcast content from o .
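The idea of turning a broadcast into a multicast to all other nodes can be sketched as follows. This is a minimal Core Maude sketch under assumptions, not the book's definition: it collects the receivers with a hypothetical function others (playing the role of objectIds with the sender removed), nodes are objects of a single class Node with no attributes, and the multicast equations are included so that the module stands on its own.

```maude
mod BROADCAST-SKETCH is
  including CONFIGURATION .
  sorts MsgContent OidSet GlobalSystem .
  subsort Oid < OidSet .
  op empty : -> OidSet [ctor] .
  op _;_ : OidSet OidSet -> OidSet [ctor assoc comm id: empty] .
  op {_} : Configuration -> GlobalSystem [ctor] .

  op msg_from_to_ : MsgContent Oid Oid -> Msg [ctor] .
  op multicast_from_to_ : MsgContent Oid OidSet -> Configuration .
  op broadcast_from_ : MsgContent Oid -> Configuration .

  op Node : -> Cid [ctor] .
  ops a b c : -> Oid [ctor] .
  op hello : -> MsgContent [ctor] .

  var MC : MsgContent .  vars O O' : Oid .  var OS : OidSet .
  var AS : AttributeSet .  var CONF : Configuration .

  eq multicast MC from O to empty = none .
  eq multicast MC from O to (O' ; OS)
   = (msg MC from O to O') (multicast MC from O to OS) .

  --- identifiers of all Node objects in CONF other than O
  op others : Oid Configuration -> OidSet .
  eq others(O, < O' : Node | AS > CONF)
   = (if O == O' then empty else O' fi) ; others(O, CONF) .
  eq others(O, CONF) = empty [owise] .

  --- the braces give the equation access to the whole state
  eq { (broadcast MC from O) CONF }
   = { (multicast MC from O to others(O, CONF)) CONF } .
endm

red { (broadcast hello from a)
      < a : Node | none > < b : Node | none > < c : Node | none > } .
```

The reduction should leave the three nodes unchanged and add one hello message from a to b and one from a to c. Without the enclosing braces, the equation could not know that CONF contains every object in the system.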
Messages can get lost or corrupted during transmission. Corruption can typically be
detected by the communication infrastructure and is usually modeled as a message
loss. In many systems, a sender resends a message after a certain amount of time if it
has not heard from a receiver in the meantime. To avoid having to deal with time and
timeouts (see Section 17.1 for the treatment of time in Maude), such retransmission
is sometimes modeled abstractly as the duplication of a message in the system.
Since the message wrapper msg_from_to_ is used for all messages in transmis-
sion, message loss and duplication can be modeled by the following modules:
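(omod MESSAGE-LOSS is including MESSAGE-WRAPPER .
var MC : MsgContent . vars O O' : Oid .
rl [lose-msg] :
msg MC from O to O'
=>
none .
endom)
(omod MESSAGE-DUPLICATION is including MESSAGE-WRAPPER .
var MC : MsgContent . vars O O' : Oid .
rl [duplicate-msg] :
msg MC from O to O'
=>
(msg MC from O to O') (msg MC from O to O') .
endom)
(omod MESSAGE-LOSS-DUPLICATION is
including MESSAGE-LOSS + MESSAGE-DUPLICATION .
endom)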
Another solution is to have a “shark” object that swims in the configuration and
devours and duplicates messages:
class Shark .
rl [devour-msg] :
(msg MC from O to O') < O'' : Shark | > => < O'' : Shark | > .
rl [duplicate-msg] :
(msg MC from O to O') < O'' : Shark | >
=>
< O'' : Shark | > (msg MC from O to O') (msg MC from O to O') .
The advantage of the last solution is that it can easily be modified to model a setting
where, say, at most 20 messages are lost or duplicated in a single execution. Its
disadvantage is the lack of concurrency, and that it defines a less elegant model.
Exercise 165 Extend your specification of a population with the “standard” rules
for asynchronous separation (including the rule sep2), so that it includes
(synchronous) rules for divorce, engagement, marriage, etc.
1. Use Full Maude’s search capabilities to show that a married couple can turn
into a couple in which one of them is married to the same spouse, while the
other spouse is separated, and there is no pending (unread) separate message
in the system. (This case corresponds to the case of "JR" and "Sue Ellen"
remarrying and then one of them discovers the old separation message.)
2. Use Maude’s search capabilities to show that, starting from a normal state in
which "JR" and "Sue Ellen" are married and "Cally" is single, it is possible
to reach a state in which "JR" is separated from "Cally", "Cally" is married
to "JR", and there is no message pending.
Exercise 167 A node wants to distribute an important message to all other nodes
(that are reachable from the sender) in a network where each node knows its neigh-
bors. There is only one message to transmit. The following protocol achieves this:
• The sender multicasts the very important message to its neighbors.
• When a node reads an important message for the first time, it stores the content
of the message, and multicasts the message to its neighbors except the node from
which it just received the message.
• When a node receives an important message but has already received some
important message (hopefully the same message), it just ignores the message.
1. Specify the protocol in Full Maude.
2. Define an initial state initState corresponding to the case when node b wants
to distribute a very important message, and where
• node a has neighbors b and e,
• node b has neighbors a and d,
• node c has neighbors d,
• node d has neighbors b, c, and e, and
• node e has neighbors a and d.
3. Execute the protocol using Full Maude’s frew command.
4. Use Full Maude’s search command to check that each final state reachable from
initState is as expected.
Exercise 168 Assume that there are three classes Satellite, HouseWithAntenna,
and HouseWithoutAntenna. Modify the definition of broadcast so that a broadcast
message only reaches objects of class HouseWithAntenna. Test your specification.
Exercise 170 To save battery power, wireless devices may send wireless signals
with different signal strength. Define a model of wireless broadcast where the
sender broadcasts messages of the form wl-broadcast content from o withRange r,
where r is the transmission distance.
Exercise 171 Define a class LimitedShark, whose objects can cause at most 10
message losses and 10 message duplications during an execution.
Ordered message delivery typically means that a sequence of messages sent from a
node a to a node b are received by b in the order in which they were sent by a.
An infrastructure that provides ordered communication can be seen as a link (or
a channel or a buffer) between two components, and can therefore be abstractly
modeled using link objects. A one-directional link from a node (with identifier) a to
a node b can be represented by an object
< a to b : Link | content : mc1 :: mc2 :: . . . :: mck >
The global state should contain one Link object (two for two-way communication
channels) between each pair of nodes that are connected. The network
A message is sent by inserting its content at the back of the link, so that the
sending of a message with content mc from an object a to an object b is modeled by
a rule of the form
var MCL : MsgContentList .
rl [send-mc] :
< a : ... | ... >
< a to b : Link | content : MCL >
=>
< a : ... | ... >
< a to b : Link | content : MCL :: mc > .
11.3 Ordered Asynchronous Communication using Links 195
An object b reads the “next” message (content) from an object a by removing the
first element in the link from a to b:
rl [read-mc] :
< b : ... | ... >
< a to b : Link | content : mc :: MCL >
=>
< b : ... | ... >
< a to b : Link | content : MCL > .
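The two rule schemas above can be instantiated in a small executable model. The following Core Maude sketch is an illustration under assumptions: it uses the built-in CONFIGURATION module instead of Full Maude classes, and the attributes toSend and received, the constants a, b, m1, and m2, and the initial state init are hypothetical.

```maude
mod LINK-SKETCH is
  including CONFIGURATION .
  sorts MsgContent MsgContentList .
  subsort MsgContent < MsgContentList .
  op nil : -> MsgContentList [ctor] .
  op _::_ : MsgContentList MsgContentList -> MsgContentList [ctor assoc id: nil] .
  ops m1 m2 : -> MsgContent [ctor] .

  ops a b : -> Oid [ctor] .
  op _to_ : Oid Oid -> Oid [ctor] .      --- identifier of the link from a to b
  ops Node Link : -> Cid [ctor] .
  op toSend :_ : MsgContentList -> Attribute [ctor gather (&)] .
  op received :_ : MsgContentList -> Attribute [ctor gather (&)] .
  op content :_ : MsgContentList -> Attribute [ctor gather (&)] .
  op init : -> Configuration .

  vars O O' : Oid .  var MC : MsgContent .
  vars MCL MCL' MCL'' : MsgContentList .

  eq init = < a : Node | toSend : (m1 :: m2), received : nil >
            < b : Node | toSend : nil, received : nil >
            < a to b : Link | content : nil > .

  rl [send-mc] :     --- insert at the back of the link
     < O : Node | toSend : (MC :: MCL), received : MCL'' >
     < O to O' : Link | content : MCL' >
   =>
     < O : Node | toSend : MCL, received : MCL'' >
     < O to O' : Link | content : (MCL' :: MC) > .

  rl [read-mc] :     --- remove from the front of the link
     < O' : Node | toSend : MCL'', received : MCL >
     < O to O' : Link | content : (MC :: MCL') >
   =>
     < O' : Node | toSend : MCL'', received : (MCL :: MC) >
     < O to O' : Link | content : MCL' > .
endm

search init =>! C:Configuration .
```

The search should find a single final state, in which the link is empty and b has received m1 :: m2, in that order, no matter how sending and reading are interleaved.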
A lossy link, i.e., a link in which messages in transit can be lost, can be modeled as
an object of the following subclass LossyLink, where the rule lose-msg models
the loss of any message (content):
class LossyLink .
subclass LossyLink < Link .
rl [lose-msg] :
< SOURCE to DEST : LossyLink | content : MCL :: MC :: MCL' >
=>
< SOURCE to DEST : LossyLink | content : MCL :: MCL' > .
rl [duplMsg] :
< SOURCE to DEST : DuplLink | content : MCL :: MC :: MCL' >
=>
< SOURCE to DEST : DuplLink | content : MCL :: MC :: MCL' :: MC > .
Finally, the following class UnrelLink specifies links where messages can get lost
as well as getting duplicated during transmission:
class UnrelLink .
subclass UnrelLink < LossyLink DuplLink .
For full generality, rewrite rules involving sending and receiving messages should
mention links of the superclass Link, so that they apply to all kinds of links. The
initial states should then specify exactly what kind of links are used in each case. In
this way, we can easily model systems with different kinds of links: some links may
be reliable while other links can be lossy and/or duplicating.
In many cases messages are dropped because the link is full. A link which can
transport at most N messages can be modeled by the following class BoundedLink:
class BoundedLink | content : MsgContentList, capacity : NzNat,
currentSize : Nat .
where currentSize is the size of the list in the content attribute.³ The sending of
a message m through such a link should be modeled by rules of the forms
crl [send-OK] :
< a : ... >
< a to b : BoundedLink | content : MCL, capacity : NZ,
currentSize : N >
=>
< a : ... >
< a to b : BoundedLink | content : MCL :: m, currentSize : s N >
if N < NZ .
rl [send-full] :
< a : ... >
< a to b : BoundedLink | capacity : NZ, currentSize : NZ >
=>
< a : ... >
< a to b : BoundedLink | > .
and by using an equation to move messages from the back of the link to the front.
3. Prove that in the resulting specification
< "o1" : Node | ... > < "o2" : Node | ... >
< "o1" to "o2" : LinkFront | front : m3 :: m2 >
< "o1" to "o2" : LinkBack | back : nil >
rewrites in one concurrent step (in which "o1" sends m1 and "o2" reads m3) to
< "o1" : Node | ... > < "o2" : Node | ... >
< "o1" to "o2" : LinkFront | front : m2 :: m1 >
< "o1" to "o2" : LinkBack | back : nil >.
3 The currentSize attribute is not needed, since its value can be computed given the content
value; however, it is usually more efficient to have such an attribute.
11.4 Asynchronous Communication Using Shared Variables 197
where v is the current value of x. If the shared variable ranges over elements of sort
s, such objects are instances of the class
class SharedVar | value : s .
If the system contains shared variables of different sorts s1 , . . . , sn , one could either
1. declare a class SharedVarsi for each si ,
2. let each si be a subsort of a supersort Data and let Data be the sort of the value
attribute, or
3. define a sort Data and an operator [_] : si -> Data for each sort si , so that
a variable is represented by an object < x : SharedVar | value : [ v ] >.
Exercise 174 In Exercise 120 we analyzed “by hand” what could happen if three
persons deposit $20 each to a shared bank account x at the same time (but in
different branches of the bank). In particular, a bank clerk:
• first checks the current balance of the account x, and stores this result in a local
variable y (e.g., a post-it note on his desk);
• receives $20 from the depositor and computes the new balance of the bank
account in a new post-it note/local variable (z := y + 20); and finally
• writes the value of z as the new balance of the account x.
That is, each bank clerk performs the program
y := x; z := y + 20; x := z
where x is a shared variable and y and z are local variables. Each statement is
atomic (can be executed in one step), but since the three bankers perform these
operations more or less at the same time, the execution of the statements can be
interleaved: other clerks may execute statements also between the execution of two
statements by any given clerk.
1. Model this system in Full Maude, with each clerk represented by an object.
2. Use Full Maude search to find all the possible balances of the account x after
the three persons have deposited $20 each, when the original balance was $100.
3. What are the possible outcomes if also y and z are shared variables?
(In databases, the three instances of the program above are three transactions
(requests), and any database management system is expected to ensure atomicity
(either all operations or no operations in a transaction are applied to the database)
and serializability of concurrent transactions: the result of executing the three trans-
actions in parallel must be the same as some execution without interleaving. Such
transaction support would ensure that no deposit was lost in our example.)
Exercise 175 Multiple agents (Orbitz, Expedia, Priceline, etc.) access a global
database for flight tickets, searching for a certain trip, for which there is only one
seat left. Each agent a performs the following transaction:
x := read(seat);
/* If seat is free, wait until the customer makes up her mind.
Then ask for name and credit card details. Takes time */
if x == free then {
y := getCreditCardDetails();
if ok(y) then {write(seat, sold(a)); chargeCustomer(y);}
}
1. Model a system with multiple agents and one desired plane ticket, also recording
which agents charged their customers. Assume that there are two customers,
and either (a) both customers want to buy the plane ticket (modeled by ok(y)
being true), or (b) only one customer wants to buy the ticket.
2. Use search and show that something can go wrong in case (a).
3. Add a test whether the ticket is still available just before selling the ticket (this
is one atomic action: a database query). Can something still go wrong? What
went wrong is called a “lost update” in the database community.
4. Variants of the following solution are often used in practice:
x := read(seat);
if x == free then {
write(seat, sold(a)); /* Hold ticket for up to 15 minutes */
y := getCreditCardDetails(); /* Takes some time */
if ok(y) then chargeCustomer(y) else write(seat, free);
}
This chapter illustrates how (Full) Maude can be used to model and analyze a series
of protocols for achieving reliable ordered communication on top of an underlying
unreliable transmission medium. For example, the IP protocol does not guarantee
reliable delivery of a single message, and may also reorder a sequence of messages
between two nodes as they cross the Internet. The TCP protocol in the transport
layer of the Internet protocol stack then provides reliable ordered communication of
a sequence of messages between two nodes on top of IP.
Section 12.1 specifies a simple protocol that uses sequence numbers and ac-
knowledgments to achieve reliable and ordered delivery of a sequence of messages
when the underlying infrastructure is unreliable and does not guarantee ordered de-
livery. If the infrastructure provides ordered, but unreliable, message delivery (lossy
links), we can use the same protocol, but then we only need two sequence numbers.
This yields the well-known alternating bit protocol discussed in Section 12.2.
These protocols are not very efficient: the sender must know that the receiver has
seen a message before it transmits the next message. In the sliding window protocol,
the sender may send multiple different messages before getting acknowledgments
from the receiver. Sliding window may be the best known algorithm in computer
networking, and the TCP protocol is essentially just the sliding window protocol
on top of IP [96]. Section 12.3 describes the sliding window protocol for both un-
ordered and ordered communication infrastructures, but leaves the actual Maude
modeling and analysis as an exercise/course project.
You want to send a sequence of important messages, and want to be absolutely cer-
tain that the receiver gets all the messages and in the intended order. Unfortunately,
© Springer-Verlag London 2017
P.C. Ölveczky, Designing Reliable Distributed Systems, Undergraduate Topics
in Computer Science, DOI 10.1007/978-1-4471-6687-0_12
200 12 Modeling and Analyzing Transport Protocols
Since we assume that message delivery may be lossy and out of order, we use the
“standard” model of communication in Section 11.2. In particular, we use the mes-
sage wrapper (“envelope”) there, so that each message has the form
12.1 Reliable Communication Using Sequence Numbers 201
rl [start] :
< O : Sender | msgsToSend : S ++ SL, currentMsg : nil >
=>
< O : Sender | msgsToSend : SL, currentMsg : S,
currentSeqNo : 1 > .
The sender repeatedly sends the current string with the current sequence number
(the rule cannot be applied when currentMsg is nil, since S is a variable of sort
String):
rl [sendCurrentMsg] :
< O : Sender | currentMsg : S, currentSeqNo : N,
receiver : O' >
=>
< O : Sender | >
msg (S withSeqNo N) from O to O' .
If the sender gets an acknowledgment for the current sequence number, it prepares
for the sending of the next message. If the current string was the last to be sent,
currentMsg is set to nil, and the sender will not send more messages:
rl [receiveCurrentAckNotLast] :
(msg (ack withSeqNo N) from O' to O)
< O : Sender | currentSeqNo : N, msgsToSend : S ++ SL >
=>
< O : Sender | currentSeqNo : N + 1, currentMsg : S,
msgsToSend : SL > .
rl [receiveAckLast] :
(msg (ack withSeqNo N) from O' to O)
< O : Sender | currentSeqNo : N, msgsToSend : nil >
=>
< O : Sender | currentSeqNo : N + 1, currentMsg : nil > .
The receiver repeatedly sends an acknowledgment for the greatest sequence number
it has seen:
rl [sendAck] :
< O : Receiver | greatestSeqNoRcvd : NZ, sender : O' >
=>
< O : Receiver | >
msg (ack withSeqNo NZ) from O to O' .
When the receiver receives a new message, it stores the content of the new message
and updates its greatestSeqNoRcvd attribute:
rl [rcvNewPacket] :
(msg (S withSeqNo s N) from O' to O)
< O : Receiver | greatestSeqNoRcvd : N, msgsRcvd : SL >
=>
< O : Receiver | greatestSeqNoRcvd : s N,
msgsRcvd : SL ++ S > .
To analyze our protocol, we define an initial state init, in which "Alice" wants to
use the protocol to transmit the sequence "Sequence" ++ "numbers" ++ "are"
++ "great" ++ "fun" of strings to "Bob":
(omod TEST-SEQNO-UNORDERED is including SEQNO-UNORDERED .
subsort String < Oid .
op init : -> Configuration . --- initial state
eq init
= < "Alice" : Sender | msgsToSend : "Sequence" ++ "numbers" ++
"are" ++ "great" ++ "fun",
currentMsg : nil,
currentSeqNo : 0,
receiver : "Bob" >
< "Bob" : Receiver | greatestSeqNoRcvd : 0, msgsRcvd : nil,
sender : "Alice" > .
endom)
result Configuration :
< "Bob" : Receiver | greatestSeqNoRcvd : 5, sender : "Alice",
msgsRcvd :
("Sequence" ++ "numbers" ++ "are" ++ "great" ++ "fun") >
< "Alice" : Sender | currentMsg : nil, currentSeqNo : 6,
msgsToSend : nil, receiver : "Bob" >
msg ack withSeqNo 5 from "Bob" to "Alice"
Although this looks good, we have just analyzed one out of the many possible be-
haviors. We use search to analyze all possible behaviors. The following command
searches for a bad state in which the receiver has received sequence number 5, but
where its stored sequence of strings is different from the desired one:
The execution of this search command does not terminate (before the operating
system kills it), since (i) the reachable state space is infinite, and (ii) such a bad state
should not be reachable, and hence Maude searches forever for the unreachable bad
state. A bounded search (search [1,25] ...), which checks whether the bad state
can be reached in 25 rewrite steps or less, will terminate.
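The difference between bounded and unbounded search can be seen on a much smaller system than the protocol. The following Core Maude sketch uses a hypothetical module with a counter that grows forever: its reachable state space is infinite, and the state counter(7) is unreachable because only even values occur.

```maude
mod UNBOUNDED-COUNTER-SKETCH is
  protecting NAT .
  sort State .
  op counter : Nat -> State [ctor] .
  var N : Nat .
  rl [inc] : counter(N) => counter(N + 2) .
endm

--- at most 1 solution, at most 25 rewrite steps deep: terminates with "No solution."
search [1, 25] counter(0) =>* counter(7) .

--- the same search without a depth bound would run forever:
--- search [1] counter(0) =>* counter(7) .
```

In Core Maude the first number in the brackets bounds the number of solutions and the second bounds the search depth. The bounded command only shows that counter(7) is not reachable within 25 steps; as for the protocol, it says nothing about longer behaviors.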
Although the fact that the search command does not find bad states increases our
confidence in the correctness of the protocol, it does not allow us to conclude that
the protocol is correct. It could happen that bad states could be found if we searched
for a few more hours/days/years. Furthermore, we have only analyzed the protocol
for one initial state. Maybe the protocol behaves incorrectly for other initial states?
12.2 The Alternating Bit Protocol
Assume now that the underlying communication infrastructure provides ordered but
lossy message transmission. That is, the communication can be seen to take place
using lossy links, as explained in Section 11.3.1. The above protocol can of course
be used to achieve reliable communication also in such an infrastructure, with the
only difference that we use link objects for the communication (Exercise 177).
However, this solution is not optimal when a large number of messages are
transmitted, since the sequence numbers can become very large. The point is that all
those sequence numbers are no longer needed when communication is through
lossy (but not duplicating) links. It is enough to consider the sequence numbers
0 and 1. Each sequence number n in the original protocol is just replaced by its
parity n rem 2: the first packet to be transmitted gets sequence number 1, the
second packet gets sequence number 0, the third packet gets sequence number 1, the
fourth gets the number 0, and so on. The reason we can do this optimization is that
if the largest sequence number in the current state of the system is n, then each
message/acknowledgment in the links has sequence number n or n − 1 (Exercise 177).
This optimized protocol is the well-known alternating bit protocol, which can be
summarized as follows:
1. Use the protocol from Section 12.1, but with messages traveling in lossy links.
2. Each sequence number n in that protocol is replaced by its parity bit n rem 2.
The following Maude specification of the alternating bit protocol is a
straightforward modification of our specification in Section 12.1:
(fmod BIT is sort Bit . --- data type for bits
ops 0 1 : -> Bit [ctor] .
op not : Bit -> Bit .
eq not(0) = 1 .
eq not(1) = 0 .
endfm)
(omod ALTERNATING-BIT-PROTOCOL is
including STRING-LIST + MESSAGES .
including LOSSY-LINK . --- Links and the rule lose-msg
rl [start] :
< O : Sender | msgsToSend : S ++ SL, currentMsg : nil >
=>
< O : Sender | msgsToSend : SL, currentMsg : S,
currentBit : 1 > .
rl [sendCurrentMsg] :
< O : Sender | currentMsg : S, currentBit : B,
receiver : O' >
< O to O' : Link | content : MCL >
=>
< O : Sender | >
< O to O' : Link | content : MCL :: (S withBit B) > .
rl [receiveCurrentAckNotLast] :
< O : Sender | currentBit : B, msgsToSend : S ++ SL >
< O' to O : Link | content : (ack withBit B) :: MCL >
=>
< O : Sender | currentBit : not(B), currentMsg : S,
msgsToSend : SL >
< O' to O : Link | content : MCL > .
...
endom)
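To get a feel for the behavior of the protocol, it can help to simulate it. The following Python sketch is not a translation of the Maude module: it picks one random behavior per run instead of exploring all of them, and it assumes FIFO links that may lose, but never duplicate or reorder, messages. The function run_abp, its parameters, and the receiver logic (which the specification above elides) are assumptions made for this illustration.

```python
import random
from collections import deque

def run_abp(messages, loss_prob=0.3, seed=0, max_steps=100_000):
    """Simulate one random run of the alternating bit protocol.

    Returns the list of strings stored by the receiver once the sender
    has seen the acknowledgment of its last message.
    """
    rng = random.Random(seed)
    to_send = deque(messages)
    data_link = deque()   # sender -> receiver, FIFO and lossy
    ack_link = deque()    # receiver -> sender, FIFO and lossy
    received = []
    current = to_send.popleft() if to_send else None
    bit = 1               # the first packet gets bit 1
    last_bit = 0          # bit of the last message accepted by the receiver
    for _ in range(max_steps):
        if current is None:
            return received
        action = rng.choice(["send", "deliver_data", "deliver_ack"])
        if action == "send":
            # The sender may (re)send its current message at any time.
            if rng.random() >= loss_prob:
                data_link.append((current, bit))
        elif action == "deliver_data" and data_link:
            msg, b = data_link.popleft()
            if b != last_bit:          # new message: store it
                received.append(msg)
                last_bit = b
            if rng.random() >= loss_prob:
                ack_link.append(b)     # always (re)acknowledge
        elif action == "deliver_ack" and ack_link:
            b = ack_link.popleft()
            if b == bit:               # ack of the current message
                current = to_send.popleft() if to_send else None
                bit = 1 - bit
    raise RuntimeError("step budget exhausted")

print(run_abp(["Sequence", "numbers", "are", "great", "fun"]))
```

A single simulated run only tests one behavior; it plays the role of Maude's rewrite command, not that of search.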
Exercise 177 In this exercise we consider our sequence number protocol in Section
12.1, but where communication is through lossy links.
1. Model the protocol in Section 12.1 where communication is through links.
2. Define a suitable initial state with lossy links.
3. Perform the same Maude analysis as in Section 12.1 on your specification.
4. Explain why the sequence numbers, in the messages in the links and in the
greatestSeqNoRcvd attribute, are either n or n − 1 if the current value of
currentSeqNo is n.
5. Use Maude search to analyze the property above: Search for a state where a
message/acknowledgment in a link, or the greatestSeqNoRcvd attribute, has
a sequence number that is two less than the sender’s currentSeqNo attribute.
Exercise 178 In this exercise we model and analyze the alternating bit protocol.
1. Complete the above specification of the alternating bit protocol.
2. Define an appropriate initial state with lossy links.
3. Perform the “usual” Maude analysis:
a. test your specification using rewriting;
b. search for a bad state in which the receiver has received at least as many
strings as the sender wanted to transmit, but where the sequence is different
from the one the sender wanted to send; and
c. search for a state in which the receiver has received the desired messages.
4. Explain why the alternating bit protocol does not work if the links also may
duplicate messages according to the link model in Section 11.3.1.
5. Use an initial state with lossy and duplicating links, and use Maude search to
show that the alternating bit protocol does not work in this setting.
12.3 The Sliding Window Protocol
In the above protocols, the sender waits for an acknowledgment of a message before
sending the next message. The two versions of the sliding window protocol
presented in this section generalize our previous two protocols so that the sender
can send multiple different messages before getting an acknowledgment.
Fig. 12.2 The sliding window of the sender after receiving acknowledgments of, respectively,
message 11 (top), message 14 (center), and message 16 (bottom).
Both the sender and the receiver have a window (or “buffer”) of a certain size, and
the sender can send any of the messages in its sending window. For example, Figure
12.1 shows the sending window, of size 3, which currently contains the messages
with sequence numbers 12, 13, and 14. The sender should continuously send these
messages until it receives an acknowledgment of one of the messages. For example,
if the sender receives an acknowledgment of message 14, it “slides” the window
and starts sending the messages 15, 16, and 17, as illustrated in Figure 12.2. If the
sender then receives an acknowledgment of message 16, it again slides the window
and starts sending messages 17, 18, and 19.
The receiver keeps track of the greatest sequence number (currentAck) for which
it has seen all messages with sequence number ≤ currentAck. In Figure 12.3 (top),
currentAck is 11: the receiver has seen all messages with sequence number 1, 2,
. . . , 11, and has delivered them to its application. Since the receiver can receive
either message 12, 13, or 14 next, it must have a buffer (“window”) in which it
stores the messages that cannot be sent to the application yet. For example, if it
receives message 14 next, it cannot send this message to its application, since it has
not yet received messages 12 and 13. Therefore, it stores message 14 in its receiving
buffer/window. If it then receives 13 and thereafter message 12, the receiver has seen
the first 14 messages, and (i) transfers messages 12, 13, and 14 to its application,
(ii) updates currentAck to 14, and (iii) “slides” its receiving window/buffer to make
space for the messages 15, 16, and 17. If the receiver instead receives message 12
before message 13, it acknowledges message 12 and moves its window to make
room for messages 13, 14, and 15 (note that message 14 is already buffered).
Fig. 12.3 The window of the receiver after having received all messages up to sequence number 11
(top); then after also receiving messages 13 and 14 (second row); then after also receiving message
12 (third row); then after also receiving message 16 (fourth row); and, finally, after also receiving
message 15 (bottom).
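The receiver's bookkeeping described above can be sketched in a few lines of Python. This is an illustrative sketch only, under the assumption that the buffer is a dictionary from sequence numbers to payloads; the function receive and its parameters are hypothetical names, and the sending of acknowledgments is not modeled.

```python
def receive(current_ack, buffer, seq_no, payload, window_size):
    """Process one incoming message at the receiver.

    current_ack: greatest n such that all messages 1..n have been delivered.
    buffer: dict mapping buffered sequence numbers to payloads (mutated).
    Returns (new_current_ack, payloads_delivered_to_the_application).
    """
    delivered = []
    # Only messages inside the receiving window are stored.
    if current_ack < seq_no <= current_ack + window_size:
        buffer[seq_no] = payload
    # Slide the window past every message that is now contiguous.
    while current_ack + 1 in buffer:
        current_ack += 1
        delivered.append(buffer.pop(current_ack))
    return current_ack, delivered

# The scenario from the text: currentAck is 11, then 14, 13, and 12 arrive.
buf = {}
ack, out = receive(11, buf, 14, "m14", 3)   # buffered, nothing delivered
ack, out = receive(ack, buf, 13, "m13", 3)  # buffered, nothing delivered
ack, out = receive(ack, buf, 12, "m12", 3)  # delivers 12, 13, and 14
print(ack, out)
```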
More precisely, the sender protocol goes as follows, where k is the window size:
• Initially: put the messages 1, . . . , k into the sending window.
• Repeatedly send any of the messages in the sending window.
• If the sender receives an acknowledgment (with the sequence number) for a
message that is not in its sending window: just ignore the acknowledgment.
• If the sender receives an acknowledgment for a sequence number n that is in the
sending window, put the packets with sequence numbers n + 1, . . . , n + k in the
sending window (unless there are no more messages to be sent).
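The window updates in the sender protocol above can be sketched as follows. This is a minimal Python illustration, assuming the messages are numbered 1, ..., last and that a window is represented as a list of sequence numbers; initial_window and on_ack are hypothetical names, and the actual (repeated) sending of messages is not modeled.

```python
def initial_window(k, last):
    """Sequence numbers 1..k, capped at the number of the last message."""
    return list(range(1, min(k, last) + 1))

def on_ack(window, n, k, last):
    """Return the sending window after receiving an acknowledgment for n."""
    if n not in window:
        return window  # acknowledgment outside the window: ignore it
    # Slide: the window now holds n+1, ..., n+k (as far as messages exist).
    return list(range(n + 1, min(n + k, last) + 1))

window = [12, 13, 14]
window = on_ack(window, 14, k=3, last=100)   # slides to 15, 16, 17
window = on_ack(window, 16, k=3, last=100)   # slides to 17, 18, 19
window = on_ack(window, 11, k=3, last=100)   # ignored
print(window)
```

An empty window after an acknowledgment means that there are no more messages to send.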
If communication is through lossy links, we can optimize the sliding window
protocol, just as we did for the alternating bit protocol. It turns out that it is sufficient
to use only 2k sequence numbers; the alternating bit protocol can then be seen as
the special case of this version of sliding window when the window size k = 1. For
example, if the window size is 3, the sequence numbers used could be 0, . . . , 5; and
the packet that comes after packet 5 has sequence number 0. We model and analyze
this version of the sliding window protocol in Exercise 180.
Exercise 179 This exercise models and analyzes the sliding window protocol in
Maude when the underlying communication infrastructure provides lossy and
unordered message delivery. The setting is the same as in Section 12.1: a sender wants
to use the sliding window protocol to transfer a sequence of strings to a receiver.
1. Model the sliding window protocol (with lossy and unordered communication)
in (Full) Maude by generalizing the Maude specification of the protocol in
Section 12.1. Make sure that the sender can nondeterministically select to send any
message in the sending window.
2. Define an initial state in which the sender wants to transfer the sequence
"Sliding" ++ "window" ++ "is" ++ "an" ++ "amazing" ++ "protocol".
Make the window size a parameter of the initial state, so that init(k) denotes
the initial state with window size k.
3. Use the rew command to test your protocol.
4. Use the search command to search for a state reachable from init(4)
where the receiver has stored the entire sequence "Sliding" ++ "window"
++ "is" ++ "an" ++ "amazing" ++ "protocol" in its msgsRcvd attribute.
5. Repeat the same search, but from initial state init(2).
6. Define a function _prefixOf_ : StringList StringList -> Bool which
checks whether a list is a prefix of another list. Test your function.
7. Use Maude to analyze whether it is possible to reach, in fewer than 19 rewrite
steps, a state in which the receiver’s msgsRcvd attribute is not a prefix of
"Sliding" ++ "window" ++ "is" ++ "an" ++ "amazing" ++ "protocol".
Exercise 180 In this exercise we model the version of the sliding window protocol
where communication takes place through lossy (but not duplicating) links, and
where we use the sequence numbers 0, 1, . . . , 2k − 1, with k the size of the sender’s
window. You should analyze the protocol with both reliable links and lossy links.
1. What could go wrong if we use fewer than 2k sequence numbers? Show a bad
behavior when k is 3, and only the sequence numbers 0, . . . , 4 are used.
2. Model this version of the sliding window protocol in Maude.
3. Define initial states corresponding to those in Exercise 179. Define one
parametric initial state with reliable links and one with lossy links.
4. Perform all the analyses in Exercise 179, for both lossy and reliable links. (If
needed, use a smaller window size (2 or 3) and/or fewer strings stored (4 or 5).)
5. Make a rough estimate of the number of states encountered during a search:
a. What is the smallest number of rewrite steps needed to go from an initial
state with window size 4 to a state in which the receiver has stored all 6
messages in its msgsRcvd attribute?
b. How many different rewrite steps can be performed from a state in which
the sending window is full and each lossy link contains two messages?
c. Based on the answers to the above questions, give a very rough estimate of
the size of the “search tree” from the initial state until a good final state is
reached. (This search tree will contain multiple copies of the same state, but
nevertheless gives you an impression of the state space encountered during
a Maude search.)
6. Modify your specification so that the sequence numbers are 0, . . . , 2k − 2 and
use Maude analysis to show that the protocol then does not work correctly; use
window size k = 3, or, if that analysis does not terminate within a reasonable
amount of time, use k = 2.
13 Distributed Algorithms
This chapter shows how Maude can be used to formally model and analyze a
number of textbook distributed algorithms; that is, algorithms (or protocols) in which a
number of nodes use message passing communication to achieve a common goal.
Section 13.1 explains in detail how Maude can model and analyze the two-phase
commit protocol for transactions on distributed databases, and includes a discussion
on general techniques for modeling node failures and recoveries. Sections 13.2–13.4
treat, respectively, distributed mutual exclusion algorithms, distributed leader
election algorithms, and distributed consensus algorithms. The algorithms discussed are
cornerstones of state-of-the-art cloud computing and wireless systems. For example,
the two-phase commit protocol, distributed leader election, and the Paxos consensus
algorithm mentioned in Section 13.4 are all key building blocks in Google’s
Megastore cloud computing infrastructure used for Gmail, Google+, and AppEngine [9].
Distributed Transactions.
An upscale travel agent may issue the following transaction for a person X who
wants to visit Paris, stay at the Ritz, and have dinner at Chez M:
© Springer-Verlag London 2017
P.C. Ölveczky, Designing Reliable Distributed Systems, Undergraduate Topics
in Computer Science, DOI 10.1007/978-1-4471-6687-0_13
The two-phase commit (2PC) protocol [66] tries to achieve atomicity of transactions
on multiple sites: either all distributed components commit to physically update the
databases, or no component does so. Furthermore, if some participant votes to abort
the transaction, then no updates are performed, and if all nodes can commit, then
all components should be updated. The databases are not physically updated during
the database transaction. Instead, the database is physically changed only at the end
of the transaction if everything went well in each database (replica).
The 2PC protocol starts by selecting some component to be the coordinator. The
two phases of 2PC are then given as follows in the textbook [40, Chapter 23]:
Phase 1. When all participating databases signal the coordinator that the part of the
multidatabase transaction involving each has concluded, the coordinator sends a message
prepare for commit to each participant to get ready for committing the transaction. Each
participating database receiving that message will force-write all log records and needed
information for local recovery and then send a ready to commit or OK signal to the
coordinator. If the force-writing to disk fails or the local transaction cannot commit for
some reason, the participating database sends a cannot commit or not OK signal to the
coordinator. If the coordinator does not receive a reply from the database within a certain
amount of time, it assumes a not OK response.
Phase 2. If all participating databases reply OK, and the coordinator’s vote is also OK, the
transaction is successful, and the coordinator sends a commit signal for the transaction to
the participating databases. [...] Each participating database completes transaction
commit by writing a commit entry for the transaction in the log and permanently updating
the database if needed. On the other hand, if one or more of the participating databases
or the coordinator have a not OK response, the transaction has failed, and the
coordinator sends a message to roll back or UNDO the local effect of the transaction to each
participating database. This is done by undoing the transaction operations.
Notice that 2PC can solve the problem in the world-wide online auction site
where two bidders, one in Norway and one on Tanna, Vanuatu, both (try to) bid
in the dying seconds of the auction: Before the bid from Norway is committed, all
replicas must accept it; however, the replica closest to Tanna could veto the
conflicting bid. The result would be that no bid is committed, and no one gets the item.
13.1.2 Abstraction
When analyzing 2PC we are interested in whether the different databases are
updated or not; we are not interested in their actual content, which therefore can be
abstracted away. The description of 2PC says that “if the coordinator does not
receive a reply from the database within a certain amount of time, it assumes a not OK
response.” We could use timers to capture this, which would give us a more precise
description of 2PC, but at the cost of having to deal with time. Instead, we abstract
from time and the details of how the underlying timeout mechanism detects the loss
of a message, and assume that a prepare for commit message always gets a reply,
where the timeout scenario above corresponds to receiving a not OK message. Other
aspects of a database system, such as reading and writing from/to the database, do
not appear in the description of the 2PC protocol and do not need to be modeled.
13.1.3 Assumptions
This section shows how 2PC can be formally specified and analyzed using Maude.
We first specify and analyze 2PC without communication and site failures.
Section 13.1.4.3 then analyzes 2PC in the presence of message losses. Section 13.1.4.4
presents some general techniques for modeling site failures and recoveries in Maude
that allow us to analyze 2PC also in the presence of site failures.
sort CoordState .
op notCoord : -> CoordState [ctor] . --- not coordinator
op waitFor : OidSet -> CoordState [ctor] . --- wait for replies
sort CommitState .
ops initial ready abort : -> CommitState [ctor] .
The attribute updated is true if and only if the database has performed the update.
state is the internal state of the node (initial in the beginning; and then the node
decides whether it is ready to commit or must abort). otherNodes denotes the
other nodes, coordState is notCoord for nodes that are not currently coordinators,
and is waitFor(os) when a coordinator is waiting for replies from the nodes os,
and, finally, veto is true if the coordinator has received a veto.
The messages are declared as follows, where a “message” startCommit starts a
run of the protocol:
ops prepare OK notOK abort commit : -> MsgContent [ctor] .
msg startCommit : Oid -> Msg .
2PC starts with the coordinator (the node receiving the startCommit message)
sending a prepare message to all the other nodes, and going into waiting mode:
rl [prepareReq] :
startCommit(O)
< O : 2PCDB | state : initial, otherNodes : OS >
=>
< O : 2PCDB | coordState : waitFor(OS) >
multicast prepare from O to OS .
rl [notOK] :
(msg prepare from O to O')
< O' : 2PCDB | state : initial >
=>
< O' : 2PCDB | state : abort >
(msg notOK from O' to O) .
The coordinator itself should also vote (see also Exercise 182):
rl [coordNotOk] :
< O : 2PCDB | state : initial, coordState : waitFor(OS) >
=>
< O : 2PCDB | state : abort, veto : true > .
rl [coordOk] :
< O : 2PCDB | state : initial, coordState : waitFor(OS) >
=>
< O : 2PCDB | state : ready > .
In the second phase, the coordinator reads the responses and decides whether to
order a global abort or a global commit. First, it reads the responses, and sets
veto to true if some node cannot commit:
rl [recOK] :
(msg OK from O' to O)
< O : 2PCDB | coordState : waitFor(O' ; OS) >
=>
< O : 2PCDB | coordState : waitFor(OS) > .
rl [recNotOk] :
(msg notOK from O' to O)
< O : 2PCDB | coordState : waitFor(O' ; OS) >
=>
< O : 2PCDB | coordState : waitFor(OS), veto : true > .
Next, the coordinator sends its decision and stops being a coordinator (and updates
its own database if needed):
rl [commitAll] :
< O : 2PCDB | coordState : waitFor(none),
otherNodes : OS, veto : false >
=>
< O : 2PCDB | coordState : notCoord, updated : true >
(multicast commit from O to OS) .
rl [abortAll] :
< O : 2PCDB | coordState : waitFor(none),
otherNodes : OS, veto : true >
=>
< O : 2PCDB | coordState : notCoord, updated : false >
(multicast abort from O to OS) .
Finally, the other nodes receive the coordinator’s decision and decide whether to
physically update the database:
rl [recAbort] :
(msg abort from O to O')
< O' : 2PCDB | >
=>
< O' : 2PCDB | updated : false > .
rl [recCommit] :
(msg commit from O to O')
< O' : 2PCDB | >
=>
< O' : 2PCDB | updated : true > .
Our specification does not include rules for message loss, so we first analyze our
protocol in a reliable setting. The following module, where some parts are replaced
by ‘...’, defines an initial state with five databases (or database replicas):
(omod TEST-2PC is including TWO-PHASE-COMMIT . protecting STRING .
subsort String < Oid .
op init : -> Configuration .
eq init
= startCommit("a")
< "a" : 2PCDB | updated : false, state : initial,
otherNodes : "b" ; "c" ; "d" ; "e",
coordState : notCoord, veto : false >
< "b" : 2PCDB | updated : false, state : initial,
otherNodes : "a" ; "c" ; "d" ; "e",
coordState : notCoord, veto : false >
< "c" : 2PCDB | updated : false, state : initial,
otherNodes : "b" ; "a" ; "d" ; "e",
coordState : notCoord, veto : false >
< "d" : 2PCDB | ... >
< "e" : 2PCDB | ... > .
endom)
result Configuration :
< "a" : 2PCDB | state : ready, updated : false, ... >
< "b" : 2PCDB | state : abort, updated : false, ... >
< "c" : 2PCDB | state : ready, updated : false, ... >
< "d" : 2PCDB | state : abort, updated : false, ... >
< "e" : 2PCDB | state : ready, updated : false, ... >
This is promising: the databases "b" and "d" could not commit the transaction, and
no database was updated. Since the rewrite command only analyzes one possible
behavior, we check for consistency of the distributed databases at the end of a run
of 2PC by searching for a “bad” final state in which one component has updated its
database while another component has not done so:
Maude> (search [1] init =>! < O:Oid : 2PCDB | updated : false >
< O':Oid : 2PCDB | updated : true >
C:Configuration .)
No solution.
The result shows that it is not possible to reach an inconsistent final state from init.
However, the correctness requirement of 2PC also says that: (i) if one database
decides to abort, then no database should update; and (ii) if all databases are ready
to update, then they should indeed all update. Again, we analyze these properties by
searching for final states in which the properties do not hold:
Maude> (search [1] init =>! < O:Oid : 2PCDB | state : abort >
< O':Oid : 2PCDB | updated : true >
C:Configuration .)
No solution.
No solution.
Although everything looks good, we have not proved 2PC correct, only that it works
well from state init. Maybe inconsistent states can be reached from other initial
states? Nevertheless, this analysis has increased our confidence that 2PC is correct.
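The three properties checked by the searches can also be written as executable checks over a much simpler model. The Python sketch below is an assumption-laden illustration, not the Maude analysis: it collapses a failure-free run of 2PC into a single function from votes to final updated flags, and then checks every combination of votes. The names two_phase_commit and check_all_votes are hypothetical.

```python
from itertools import product

def two_phase_commit(votes, coordinator):
    """Run one failure-free round of 2PC.

    votes maps each node name to True (ready to commit) or False (abort).
    Returns a dict mapping each node name to its final 'updated' flag.
    """
    others = [n for n in votes if n != coordinator]
    # Phase 1: a notOK reply from any other node sets the veto.
    veto = any(not votes[n] for n in others)
    # The coordinator votes as well.
    veto = veto or not votes[coordinator]
    # Phase 2: the decision is multicast and applied by every node.
    decision = not veto
    return {n: decision for n in votes}

def check_all_votes(nodes, coordinator):
    """Check the correctness properties for every combination of votes."""
    for choice in product([True, False], repeat=len(nodes)):
        votes = dict(zip(nodes, choice))
        updated = two_phase_commit(votes, coordinator)
        # Consistency: all nodes agree on whether they updated.
        assert len(set(updated.values())) == 1
        if not all(votes.values()):
            # (i) If some node aborts, no node updates.
            assert not any(updated.values())
        else:
            # (ii) If all nodes are ready, every node updates.
            assert all(updated.values())
    return True

print(check_all_votes(["a", "b", "c", "d", "e"], "a"))
```

Because this model is a plain function, the checks are close to trivial. The value of the Maude search lies in covering all interleavings of message deliveries, which this sketch does not model.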
We next analyze 2PC when messages may be lost during transmission. As mentioned
in Section 13.1.2, we assume that a prepare request always gets a reply, so
the loss of prepare, OK, and notOK messages does not need to be modeled. The
following module extends our model of 2PC with a rewrite rule modeling the loss
of an abort or a commit message:
(omod TWO-PHASE-COMMIT-WITH-MESSAGE-LOSS is including TEST-2PC .
vars O O' : Oid . var MC : MsgContent .
result Configuration :
< "a" : 2PCDB | state : ready, updated : false, ... >
< "b" : 2PCDB | state : abort, updated : false, ... >
< "c" : 2PCDB | state : ready, updated : false, ... >
< "d" : 2PCDB | state : abort, updated : false, ... >
< "e" : 2PCDB | state : ready, updated : false, ... >
Solution 1
... ; O':Oid --> "a" ; O:Oid --> "e"
The result shows that it is possible to reach an inconsistent final state. It is necessary
to exhibit a behavior leading to the inconsistent state, for the following reasons:
• To ensure that the faulty behavior really corresponds to a flaw in 2PC, and is not
just an error in our model of 2PC.
• To learn about the flaw in the protocol.
Since Full Maude cannot exhibit the path to a state found during a search, we use
the method described in Section 10.2.4.1 to transform a Full Maude module into a
(core) Maude module, repeat the search in (core) Maude, and obtain the path to the
inconsistent state. The path shows that all nodes could commit; however, the commit
message from "a" to "e" was lost, so that node "e" never updates its database.
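The effect of losing decision messages can be illustrated with a small enumeration. The following Python sketch is a simplification made for illustration, not the Maude model: it assumes that phase 1 has completed, that every node starts with updated set to false, and that any subset of the coordinator's commit or abort messages may be lost. The names final_states_with_loss and inconsistent are hypothetical.

```python
from itertools import combinations

def final_states_with_loss(nodes, coordinator, all_ready=True):
    """Yield (lost, updated) for every subset of lost decision messages.

    lost is the set of nodes whose commit/abort message was lost;
    updated maps each node to its final 'updated' flag.
    """
    others = [n for n in nodes if n != coordinator]
    decision = all_ready  # commit exactly when every node voted ready
    for r in range(len(others) + 1):
        for lost in combinations(others, r):
            updated = {n: False for n in nodes}  # initial value
            updated[coordinator] = decision
            for n in others:
                if n not in lost:
                    updated[n] = decision        # message was delivered
            yield set(lost), updated

def inconsistent(updated):
    """True if some node has updated while another has not."""
    return len(set(updated.values())) > 1

nodes = ["a", "b", "c", "d", "e"]
bad = [lost for lost, upd in final_states_with_loss(nodes, "a")
       if inconsistent(upd)]
print({"e"} in bad)
```

In this simplified model, losing an abort message is harmless, since a node that never hears the decision simply keeps updated set to false; only lost commit messages lead to inconsistency.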
A process (server, database, etc.) can “fail” in a number of ways for various reasons.
A common source of unavailability is scheduled upgrades of software or hardware.
A failed process can behave in different ways, from being unresponsive (omission
failures) to producing completely arbitrary values/messages (Byzantine failures).
Byzantine failures happen for example when an airplane sensor is broken and reports
bogus values, or when the process is (taken over by) an attacker sending bogus
messages. Chapter 14 defines such a Byzantine attacker on a security protocol.
This section focuses on omission failures, such as crash failures, where a failed
process becomes unresponsive. We also model the recovery of a failed process.
This rule creates a new object of the class Failed2PCDB, with the old object
identifier, and deletes the old object. Since a new object is created, all the attributes of the
new object must be present in the right-hand side.
The recovery of a failed process can be modeled by the following rewrite rule:
rl [nodeRecovery1] :
< O : Failed2PCDB | updated : B, state : S, otherNodes : OS,
coordState : CS, veto : B2 >
=>
< O : 2PCDB | updated : B, state : S, otherNodes : OS,
coordState : CS, veto : B2 > .
This model could lead to too many failures, which quickly makes search infeasible.
It is often more practical and common to explicitly inject faults by using messages
fail and recover, so that a node fails when it reads a fail message and recovers
when it reads a recover message. If the message does not specify which node
should fail, then any node could fail at any time. Including n such fail messages
in the initial state would allow us to analyze the protocol with any combination of
n failures, including the possibility that the same node fails multiple times. This
approach is used next to analyze the 2PC protocol with process failures and recoveries.
The following rewrite rules model a node failing and recovering from failure:
vars O O' : Oid . var S : CommitState . var OS : OidSet .
var CS : CoordState . vars B B2 : Bool . var MC : MsgContent .
rl [nodeFailure] :
fail
< O : 2PCDB | updated : B, state : S, otherNodes : OS,
coordState : CS, veto : B2 >
=>
< O : Failed2PCDB | updated : B, state : S, otherNodes : OS,
coordState : CS, veto : B2 > .
rl [nodeRecovery] :
recover
< O : Failed2PCDB | updated : B, state : S, otherNodes : OS,
coordState : CS, veto : B2 >
=>
< O : 2PCDB | updated : B, state : S, otherNodes : OS,
coordState : CS, veto : B2 > .
Finally, the term initWithFailures defines an initial state with two arbitrary
failures and only one recovery, by adding two fail messages and one recover
message to the previous initial state init:
op initWithFailures : -> Configuration .
eq initWithFailures = fail fail recover init .
endom)
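To see which failure scenarios a fixed budget of fail and recover messages allows, one can enumerate the reachable failure patterns. The Python sketch below is only an illustration: it tracks which nodes are down and how many fail/recover tokens remain, and ignores the protocol rules that would interleave with these steps in the Maude model. The function reachable_failure_states is a hypothetical name.

```python
def reachable_failure_states(nodes, fails, recovers):
    """All (failed_nodes, fails_left, recovers_left) states reachable by
    consuming 'fail' and 'recover' tokens in any order."""
    init = (frozenset(), fails, recovers)
    seen, frontier = {init}, [init]
    while frontier:
        failed, f, r = frontier.pop()
        succs = []
        if f > 0:
            # Any node that is up may consume a fail token.
            succs += [(failed | {n}, f - 1, r)
                      for n in nodes if n not in failed]
        if r > 0:
            # Any failed node may consume a recover token.
            succs += [(failed - {n}, f, r - 1) for n in failed]
        for s in succs:
            if s not in seen:
                seen.add(s)
                frontier.append(s)
    return seen

# Two failures and one recovery among five nodes, as in initWithFailures.
states = reachable_failure_states("abcde", fails=2, recovers=1)
# Node "a" can fail, recover, and fail again:
print((frozenset("a"), 0, 0) in states)
```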
Exercise 181 Explain informally why it is impossible (in the context of replicated
data stores) to have very high availability, tolerance for network failures, and
consistency all at the same time. (This impossibility result is called the CAP Theorem [15].)
Exercise 182 Modify the above specification of 2PC so that the coordinator itself
must choose whether it is ready to commit or wants to abort.
Exercise 183 One problem with 2PC is that the system could deadlock when the
coordinator fails. Use Maude search to show that it is possible to reach a deadlocked
state where no node has received a prepare message.
Fig. 13.1 Token ring, where the token is being sent from process p2 to process p3
Each process accessing a shared resource executes the following “program scheme”:
<execute outside critical section>;
<request to enter critical section>; // wait until access granted
<execute in critical section>; // access shared resources
<release critical section>;
<execute outside critical section>;
waitForCS (the node is waiting to enter its critical section), insideCS (the node is
executing in its critical section), and afterCS (the node has left the critical section):
(omod MUTEX-WITH-CENTRAL-SERVER is including MESSAGE-WRAPPER .
sort MutexState .
ops beforeCS waitForCS insideCS afterCS : -> MutexState [ctor] .
where b is true when a node is in its critical section, and waiting nodes is the list
of processes that are waiting to enter their critical sections:
class MutexServer | nodeInCS : Bool, waiting : OidList .
op server : -> Oid [ctor] . --- name of server object
When a node wants to enter its critical section, it sends a requestCS message to
the server. If the server’s nodeInCS value is false, it grants the node access to the
critical section by sending an accessGranted message; otherwise, the requesting
node is added to the server’s waiting list and remains in the waitForCS state:
ops requestCS accessGranted releaseCS : -> MsgContent [ctor] .
rl [requestAccessToCS] :
< O : Node | state : beforeCS >
=>
< O : Node | state : waitForCS >
(msg requestCS from O to server) .
rl [grantAccess] :
(msg requestCS from O to server)
< server : MutexServer | nodeInCS : false >
=>
< server : MutexServer | nodeInCS : true >
(msg accessGranted from server to O) .
rl [putInWaitQueue] :
(msg requestCS from O to server)
< server : MutexServer | nodeInCS : true, waiting : OL >
=>
< server : MutexServer | waiting : OL :: O > .
rl [startExecutingInCS] :
(msg accessGranted from server to O)
< O : Node | state : waitForCS >
=>
< O : Node | state : insideCS > .
When a process has finished executing its critical section, it sends a releaseCS
message to the server. If nodes are waiting, the longest-waiting node is given access:
rl [exitCS] :
< O : Node | state : insideCS >
=>
< O : Node | state : afterCS >
(msg releaseCS from O to server) .
rl [nooneWaiting] :
(msg releaseCS from O to server)
< server : MutexServer | waiting : nil >
=>
< server : MutexServer | nodeInCS : false > .
rl [grantAccessToFirstWaiting] :
(msg releaseCS from O to server)
< server : MutexServer | waiting : O' :: OL >
=>
< server : MutexServer | waiting : OL >
(msg accessGranted from server to O') .
endom)
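The rules above define a finite state space for a fixed number of nodes, so the mutual exclusion property can be checked by exhaustive exploration. The following Python sketch mimics, rather than reproduces, such a search: it hand-translates the rewrite rules into a successor function, represents the messages in transit as a set, and looks for a state with two nodes in insideCS. The names explore_mutex and set_state are hypothetical, and the state encoding is an assumption of this sketch.

```python
from collections import deque

def set_state(nodes, i, s):
    """Return a copy of the tuple nodes with entry i replaced by s."""
    return nodes[:i] + (s,) + nodes[i + 1:]

def explore_mutex(n):
    """Explore all reachable states of the central-server algorithm.

    A state is (node states, messages in transit, nodeInCS, waiting list).
    Returns (number of states seen, a violating state or None).
    """
    init = (("beforeCS",) * n, frozenset(), False, ())
    seen, queue = {init}, deque([init])
    while queue:
        state = queue.popleft()
        nodes, msgs, in_cs, waiting = state
        if nodes.count("insideCS") > 1:
            return len(seen), state  # mutual exclusion violated
        succs = []
        for i, s in enumerate(nodes):
            if s == "beforeCS":    # requestAccessToCS
                succs.append((set_state(nodes, i, "waitForCS"),
                              msgs | {("requestCS", i)}, in_cs, waiting))
            elif s == "insideCS":  # exitCS
                succs.append((set_state(nodes, i, "afterCS"),
                              msgs | {("releaseCS", i)}, in_cs, waiting))
        for m in msgs:
            kind, i = m
            rest = msgs - {m}
            if kind == "requestCS" and not in_cs:      # grantAccess
                succs.append((nodes, rest | {("accessGranted", i)},
                              True, waiting))
            elif kind == "requestCS":                  # putInWaitQueue
                succs.append((nodes, rest, in_cs, waiting + (i,)))
            elif kind == "accessGranted" and nodes[i] == "waitForCS":
                succs.append((set_state(nodes, i, "insideCS"),
                              rest, in_cs, waiting))   # startExecutingInCS
            elif kind == "releaseCS" and not waiting:  # nooneWaiting
                succs.append((nodes, rest, False, waiting))
            elif kind == "releaseCS":        # grantAccessToFirstWaiting
                succs.append((nodes,
                              rest | {("accessGranted", waiting[0])},
                              in_cs, waiting[1:]))
        for s in succs:
            if s not in seen:
                seen.add(s)
                queue.append(s)
    return len(seen), None

count, violation = explore_mutex(3)
print(violation)
```

A result of None for the violation corresponds to a search that finds no solution for the chosen number of nodes; as with Maude search, it says nothing about other initial states.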
The term init(n) defines an initial state with n nodes and one server:
(omod MUTEX-WITH-CENTRAL-SERVER-INITIAL-STATE is
including MUTEX-WITH-CENTRAL-SERVER . protecting NAT .
op node : NzNat -> Oid [ctor] . --- names node(1), node(2), ...
No solution
It is easy to see that requests to enter the critical section eventually will
succeed (why?). However, this algorithm does not ensure that processes access their
respective critical sections in the order in which they wanted to access them (why not?).
Section 16.3.5 explains how Maude can be used to analyze these two properties.
Exercise 184 Modify the central server mutual exclusion algorithm so that each
process executes forever, alternating between executing outside and inside the
critical section. Use Maude to analyze whether the mutual exclusion property is
satisfied. Will the search command terminate? Is it still the case that each process will
eventually be able to enter its critical section? Could the system deadlock?
Exercise 185 In the “token ring” mutual exclusion algorithm, the nodes logically
form a “ring” structure, as shown in Figure 13.1 where a node only knows the next
node in this ring. The algorithm works as follows: there is one “token,” and only
the node that holds the token may enter its critical section. The node then holds on
to the token during its execution in the critical section, and passes the token to the
next node in the ring when it exits its critical section. If a node that is not waiting to
enter its critical section receives the token, it just passes the token to the next node.
1. Model the token ring algorithm in Maude.
2. Use Maude to analyze whether this algorithm guarantees mutual exclusion.
3. Does the algorithm guarantee that nodes enter the critical section in the order
in which they want to enter the critical section?
4. Explain why the algorithm cannot terminate, even after all nodes have finished
executing their critical sections.
5. Can you modify/extend the algorithm so that it terminates?
6. Modify your model so that each node executes forever, again alternating be-
tween executing outside and inside the critical section.
7. In this new version, is it possible that a node that wants to enter its critical
section never gets to do so?
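The token-passing rule described in Exercise 185 can be illustrated outside Maude with a few lines of Python. This is only a sketch of the informal description, not a Maude model: the function name `token_ring`, the choice of node 0 as the initial token holder, and the decision to stop the loop once nobody is waiting are all assumptions made for illustration.

```python
def token_ring(n, wants):
    """Simulate one token circulating among nodes 0..n-1.

    wants: the nodes that want to enter their critical section once.
    Returns the order in which nodes enter the critical section.
    """
    waiting = set(wants)
    holder = 0              # assumption: node 0 holds the token initially
    order = []
    while waiting:          # the sketch stops here; the algorithm itself does not
        if holder in waiting:
            # Only the token holder may enter; it keeps the token meanwhile.
            order.append(holder)
            waiting.remove(holder)
        # On exit (or if not interested), pass the token to the next node.
        holder = (holder + 1) % n
    return order

print(token_ring(4, {2, 0, 3}))
```

Since there is a single token and only its holder is ever appended to `order`, at most one node is in its critical section at any time in this sketch.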
Exercise 186 In Maekawa’s voting algorithm, each node i has a voting set $V_i$, so
that any pair $(V_i, V_j)$ of voting sets has at least one element in common: $V_i \cap V_j \neq \emptyset$.
A node that wants to enter its critical section multicasts a request message to all
nodes in its voting set. The node then enters its critical section when it has received
a go-ahead message from each node in its voting set. When the node exits its critical
section, it multicasts a release message to the nodes in its voting set.
226 13 Distributed Algorithms
A node that receives a request message replies with a go-ahead message if: (i)
it is not in the critical section itself, and (ii) it has not already voted (i.e., has not
sent a go-ahead message) for someone without receiving a release message from
that node. Otherwise, the node just queues the request.
When a node receives a release message, it sends a go-ahead message to the first
node in its request queue (if any).
1. Model this algorithm in Maude.
2. Define a number of suitable initial states.
3. Use Maude to analyze whether this algorithm guarantees mutual exclusion.
4. Use Maude to analyze whether the system may deadlock.
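When defining initial states for Exercise 186, it may help to check that the chosen voting sets really satisfy the pairwise-intersection requirement. The following Python sketch does just that; the function name `valid_voting_sets` and the two example assignments are hypothetical and not taken from the text.

```python
from itertools import combinations

def valid_voting_sets(voting_sets):
    """Return True if every pair of voting sets shares at least one node.

    voting_sets maps each node to its voting set.
    """
    return all(voting_sets[i] & voting_sets[j]
               for i, j in combinations(voting_sets, 2))

# Hypothetical examples with three nodes.
good = {1: {1, 2}, 2: {2, 3}, 3: {1, 3}}
bad = {1: {1, 2}, 2: {2, 3}, 3: {3}}     # sets of nodes 1 and 3 are disjoint

print(valid_voting_sets(good))  # True
print(valid_voting_sets(bad))   # False
```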
Exercise 187 Although these algorithms were not designed to tolerate message
losses and node crashes, we can nevertheless analyze what kinds of failures, if any,
each algorithm can withstand. Therefore, for each of the three algorithms:
1. What messages can be lost without affecting the operation of the system?
2. What nodes (and in which circumstances) can crash (and not recover) without
affecting the rest of the system?
A distributed system often needs to select one of the nodes to be the leader. For
example, the two-phase commit protocol assumes that there is a leader, called
the coordinator. Likewise, airplanes typically have multiple “copies” of each com-
puter/cabinet, in case one fails; which computer is currently running the airplane?
A leader election algorithm should elect one of the nodes to be the leader, and all
nodes should agree on the leader. If the leader crashes, then another leader must be
elected. Since multiple nodes may discover that the leader is down, more than one
node may initiate a leader election process. This section considers two leader elec-
tion algorithms: a ring-based algorithm and a spanning-tree-based algorithm. The
goal of these algorithms is to elect the node with the best value of some parameter
(e.g., processor capacity, remaining amount of energy, number of Facebook friends,
etc.) as the leader. These algorithms do not tolerate node or communication failures.
The bully algorithm [46] is a well-known leader election algorithm that can deal
with node failures (and recoveries) but requires real-time features such as timeouts
and time-bounded communication, since it is impossible to detect a node failure in
an untimed asynchronous distributed system (why?).
In the ring-based leader election algorithm by Chang and Roberts [18], the nodes
are arranged in a logical ring and each node knows the next node in the ring.
13.3 Distributed Leader Election 227
Fig. 13.2 A graph (left) and two of its spanning trees (the “thick” edges)
A node that starts a new round of the leader election algorithm, for example upon
discovering that the current leader has failed, sends an election message, containing
its own value and identity, to the next node in the ring. When a node receives an
election message, it compares the received value with its own value: If the received
value is better, the node forwards the election message to the next node in the ring;
if the received value is worse, then the node sends an election message with its own
value and id to the next node; and, finally, if a node receives an election message
with its own identity,2 then the node knows that it is the new leader (why?), and
sends a leader message with its own identity to the next node in the ring. A node
that receives a leader message stores the identity of the new leader; furthermore, if
the receiver is not the new leader, it forwards the leader message to the next node in
the ring. Exercise 188 deals with modeling and analyzing this algorithm in Maude.
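The message handling just described can be tried out with a small Python simulation. This is a sketch under stated assumptions, not the Maude model asked for in the exercise: node values are assumed to be distinct, "better" is taken to mean "larger", only one node starts the election, and messages are delivered in FIFO order. The name `ring_election` is made up for this illustration.

```python
from collections import deque

def ring_election(values, starter):
    """values[i] is the value of node i; the successor of node i is (i+1) % n.

    Returns, for each node, the identity of the leader it has stored.
    """
    n = len(values)
    leader = [None] * n
    queue = deque()                       # messages in transit: (receiver, message)

    def successor(i):
        return (i + 1) % n

    queue.append((successor(starter), ("election", values[starter], starter)))
    while queue:
        node, msg = queue.popleft()
        if msg[0] == "election":
            _, value, ident = msg
            if ident == node:
                # The node's own election message went all the way around.
                leader[node] = node
                queue.append((successor(node), ("leader", node)))
            elif value > values[node]:
                # Received value is better: forward the message unchanged.
                queue.append((successor(node), msg))
            else:
                # Received value is worse: replace it with own value and id.
                queue.append((successor(node), ("election", values[node], node)))
        else:
            _, ident = msg
            leader[node] = ident
            if ident != node:             # the leader itself stops the round
                queue.append((successor(node), msg))
    return leader

print(ring_election([3, 9, 4, 7], starter=0))   # [1, 1, 1, 1]
```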
In wireless networks, and in many other networks, a node has a number of neighbors
that it can reach in “one hop.” It is desirable to use one-hop communication as much
as possible. The ring-based algorithm is not well suited for such networks since
it assumes that the nodes are arranged in a ring structure. However, finding a ring
of one-hop links—if it exists—is an NP-hard problem (the “Hamiltonian Circuit”
problem), and therefore quite costly. Furthermore, this must be done quite often
since the topology in a wireless network may change frequently.
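To get a feeling for why searching for such a ring is costly, consider the following brute-force Python sketch, which simply tries every ordering of the nodes. It is only meant to illustrate the factorial number of candidates; the function name `find_ring` and the example topologies are hypothetical.

```python
from itertools import permutations

def find_ring(neighbors):
    """Return a list of nodes forming a ring of one-hop links, or None.

    neighbors maps each node to the set of its one-hop neighbors.
    """
    nodes = sorted(neighbors)
    first, rest = nodes[0], nodes[1:]
    for perm in permutations(rest):           # (n-1)! candidate orderings
        ring = [first, *perm]
        if all(ring[(i + 1) % len(ring)] in neighbors[ring[i]]
               for i in range(len(ring))):
            return ring
    return None

square = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
star = {1: {2, 3, 4}, 2: {1}, 3: {1}, 4: {1}}
print(find_ring(square))   # [1, 2, 3, 4]
print(find_ring(star))     # None
```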
The following spanning-tree-based leader election algorithm assumes that each
node knows its neighbors, and that the network topology is a connected undirected
graph. The algorithm has three “phases”:
1. Build a “tree” of all the nodes in the graph. Such a tree is called a spanning
tree. (Figure 13.2 shows a graph and two of its spanning trees.) The starting
node sends an election message to its neighbors. A node that sees an election
message for the first time remembers the sender as its parent in the tree, and
sends the election message to its other neighbors.
2. When the spanning tree has been built, each node sends the best value in its
“subtree” to its parent, starting with the leaf nodes and going towards the root.
The root/starting node will receive the best value in each of its subtrees, and can
determine the best-valued node in the entire system.
3. The root node then sends a leader message, with the new leader, to all its neigh-
bors, who then propagate this information to their neighbors, and so on.
Phase 1 can be described in more detail as follows:
• The node starting the leader election sends an election message to its neighbors.
• A node that receives an election message for the first time sets its parent to be the
sender of this message. It then sends an election message to all other neighbors.
• A node that receives an election message, but not for the first time, simply replies
with an ack(0) message.
Each node maintains a value max that stores the best node value that the node has
seen; initially the value of max is the node’s own value. Phase 2 of the algorithm
can then be described as follows:
• When a node has received an ack message from all neighbors, except its parent,
it sends a message ack(max) to its parent (unless it is the root node).
• When a node receives a message ack(n), it updates max to n if n is better than
the node’s current max value.
When all this is done, the root node knows the best node in the entire system and
can start propagating the identity of the leader l by sending a leader(l) message to
its neighbors, who then send the message to their neighbors, and so on (Phase 3).
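The three phases can be sketched as a small Python simulation before turning to a formal model. The sketch makes several assumptions that are not part of the description above: node values are positive numbers, "better" means "larger", the leader is identified by its value, and pending messages are delivered in a pseudo-random order controlled by `seed`. All names (`spanning_tree_election`, `awaiting`, `best`) are chosen for this illustration only.

```python
import random

def spanning_tree_election(neighbors, values, root, seed=0):
    """neighbors: node -> set of neighbors; values: node -> positive value."""
    rng = random.Random(seed)
    state = {n: "idle" for n in neighbors}
    awaiting = {n: set() for n in neighbors}    # neighbors still owing an ack
    parent = {n: None for n in neighbors}
    best = dict(values)                         # the node's current max
    leader = {n: None for n in neighbors}
    msgs = []                                   # in transit: (sender, receiver, content)

    def multicast(content, sender, receivers):
        msgs.extend((sender, r, content) for r in receivers)

    def maybe_finish(n):
        # Phase 2: once all expected acks have arrived, report or announce.
        if state[n] == "waitForAck" and not awaiting[n]:
            if parent[n] == n:                  # root: start Phase 3
                state[n], leader[n] = "idle", best[n]
                multicast(("leader", best[n]), n, neighbors[n])
            else:
                state[n] = "waitForLeader"
                msgs.append((n, parent[n], ("ack", best[n])))

    # Phase 1 starts at the root, which is its own parent.
    state[root], parent[root] = "waitForAck", root
    awaiting[root] = set(neighbors[root])
    multicast(("election",), root, neighbors[root])
    maybe_finish(root)

    while msgs:
        sender, n, content = msgs.pop(rng.randrange(len(msgs)))
        if content[0] == "election":
            if state[n] == "idle":              # first election message seen
                parent[n], state[n] = sender, "waitForAck"
                awaiting[n] = set(neighbors[n]) - {sender}
                multicast(("election",), n, awaiting[n])
                maybe_finish(n)                 # a leaf has nobody to wait for
            else:
                msgs.append((n, sender, ("ack", 0)))
        elif content[0] == "ack":
            awaiting[n].discard(sender)
            best[n] = max(best[n], content[1])
            maybe_finish(n)
        elif state[n] == "waitForLeader":       # first leader message seen
            state[n], leader[n] = "idle", content[1]
            multicast(content, n, set(neighbors[n]) - {sender})
        # a leader message reaching an idle node is ignored
    return leader

triangle = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
print(spanning_tree_election(triangle, {1: 1, 2: 2, 3: 3}, root=1))
```

Running the sketch with different seeds exercises different delivery orders; it samples behaviors and is no substitute for an exhaustive analysis of all interleavings.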
sort STstate .
ops idle waitForLeader : -> STstate [ctor] .
op waitForAck : OidSet -> STstate [ctor] .
The max attribute denotes the best value in the node’s subtree and is initially set to
the node’s value; parent and leader are initially 0. The state is idle before the
node starts the election, is waitForAck(nodes) when the node awaits ack messages
from nodes, and is waitForLeader after the node has sent an ack to its parent.
A message electLeader(n) starts the algorithm with n as the starting node. This
node sets itself as its parent and multicasts an election message to its neighbors:
msg electLeader : Oid -> Msg . --- kick off leader election
op election : -> MsgContent [ctor] .
ops ack leader : Oid -> MsgContent [ctor] .
rl [startLeaderElection] :
electLeader(O)
< O : Node | neighbors : OS >
=>
< O : Node | state : waitForAck(OS), parent : O >
(multicast election from O to OS) .
When a node receives an election message for the first time (the node is idle),
it remembers its parent, sets its state to wait for acknowledgments from its other
neighbors, and propagates the election message to those neighbors:
rl [rcvElection1] :
(msg election from O1 to O)
< O : Node | state : idle, neighbors : O1 ; OS >
=>
< O : Node | parent : O1, state : waitForAck(OS) >
(multicast election from O to OS) .
A node that is already in an election (the state is different from idle) just replies
with an ack(0) message when it receives another election message:
crl [rcvElection2] :
(msg election from O1 to O)
< O : Node | state : S >
=>
< O : Node | >
(msg ack(0) from O to O1) if S =/= idle .
When a node receives an ack message, from a “child” or a “sibling” in the span-
ning tree, it removes the sender from the set of nodes from which it awaits an ack
message, and updates its max attribute if it received a better max value:
rl [rcvAck] :
(msg ack(N) from O1 to O)
< O : Node | state : waitForAck(O1 ; OS), max : MAX >
=>
< O : Node | state : waitForAck(OS), max : max(MAX, N) > .
When a node has received all the acks it is waiting for, it sends an ack message to
its parent with the best-value node in its subtree:
crl [ackParent] :
< O : Node | state : waitForAck(none), max : MAX, parent : O1 >
=>
< O : Node | state : waitForLeader >
(msg ack(MAX) from O to O1) if O1 =/= O .
When the root node (whose parent points to itself) has received all the acks it is
waiting for, its max attribute denotes the best node in the entire tree. The root node
then starts Phase 3 of the protocol by propagating the new leader downstream:
rl [sendLeader] :
< O : Node | state : waitForAck(none), neighbors : OS,
max : MAX, parent : O >
=>
< O : Node | state : idle, leader : MAX >
(multicast leader(MAX) from O to OS) .
A node that sees the leader message for the first time stores the new leader and
propagates the leader message further downstream:
rl [rcvLeader1] :
(msg leader(LEADER) from O1 to O)
< O : Node | state : waitForLeader, neighbors : O1 ; OS >
=>
< O : Node | state : idle, leader : LEADER >
(multicast leader(LEADER) from O to OS) .
Finally, a node that has already seen the leader message just ignores it:
rl [rcvLeader2] :
(msg leader(LEADER) from O1 to O)
< O : Node | state : idle >
=>
< O : Node | > .
endom)
This model only allows one round of the leader election algorithm; I leave it to
the reader to come up with an extension supporting multiple elections.
Maude Analysis.
The following module defines an initial state init1 with three nodes:
(omod ST-LEADER-STATES is protecting ST-LEADER-ELECTION .
op init1 : -> Configuration .
eq init1
= electLeader(1)
< 1 : Node | state : idle, max : 1, parent : 0, leader : 0,
neighbors : 2 ; 3 >
< 2 : Node | state : idle, max : 2, parent : 0, leader : 0,
neighbors : 1 ; 3 >
< 3 : Node | state : idle, max : 3, parent : 0, leader : 0,
neighbors : 1 ; 2 > .
endom)
The algorithm should terminate with node 3 as the leader. To analyze whether
this is the case, we search for a final state where some node has a different leader:
Maude> (search init1 =>!
C:Configuration < O:Oid : Node | leader : N:Nat >
such that N:Nat =/= 3 .)
Exercise 189 Assume that a node may fail, but that the failed node is so kind as
to let its predecessor know about both the failure and its next neighbor in the ring.
Show that the “obvious extension” of the ring-based algorithm, that just bypasses
the failed node, may fail to terminate.
Exercise 190 Extend the spanning-tree-based algorithm to deal with multiple elec-
tions at the same time. You may assume that a node never initiates more than one
election. Hint: Maybe it is useful to label each round of the algorithm with its initia-
tor, and just “vacate” the leader election process initiated by a lower-valued node?
Model and analyze your algorithm in Maude.
Exercise 191 Model the consensus algorithm described above in Maude. Include
the possibility of message losses and that a site may fail (to ensure termination, it
might be useful not to model node recovery). Define a number of suitable initial
states, and use Maude to analyze the following properties:
1. It is impossible to reach a state in which two nodes “agree” on different values.
2. It is possible to reach a final state in which all nodes agree on a value.
3. It is possible to reach a final state in which no node has agreed on a value.
4. It is possible to reach a final state in which no node has been elected leader.
Exercise 192 Explain how nodes easily can reach consensus if they have access to
an atomic multicast primitive (see Exercise 172).
Exercise 193 (Slightly tricky?) Model and analyze the Paxos algorithm in Maude.
The paper [65] gives a fairly precise and brief description of Paxos.
14 Analyzing a Cryptographic Protocol
Web services such as email, photo, social networks, internet commerce, and
online banking require that entities authenticate themselves. Scrooge McDuck must
be sure that he (it?) is communicating with the bank, and not with some bad guy
with a look-a-like web page. Likewise, when the bank gets the request “transfer 5
gazillions from my account to the Beagle Boys” from “Scrooge,” the bank must be
sure that it is communicating with Scrooge and not with the Beagle Boys.
Back in the 20th century, such mutual authentication was trivial: you knew that
you were entering your bank by its imposing building, and the bank authenticated
you by asking you to show some photo identification. But how can we achieve
authentication online? Messages can be faked, communication can be overheard
and/or intercepted, and genuine-looking web sites can easily be set up.
Authentication protocols are used to achieve the desired authentication. In this
chapter we model and analyze one of the best-known and most influential mutual
authentication protocols: the Needham-Schroeder public-key authentication protocol
(NSPK) [88] from 1978. Is this protocol secure, or can the Beagle Boys fool the
bank into thinking that it has a trusted connection with Scrooge?
Instead of thinking hard and trying to break this well-known and well-studied
protocol (see, e.g., [17, 79]) by finding some really clever attacks, we will do a “brute
force” analysis of the protocol by adding an intruder to the system, and by modeling
all possible behaviors of an intruder. If the protocol is safe with such intruders, then
the protocol is safe.
In public-key cryptography [28, 98] each agent A has a public key, denoted $PK_A$,
and a private key, denoted $PrvK_A$. All agents know the public key of each
agent,¹ but the private key of an agent A is only known by A. An agent which knows
key K can encrypt the plaintext data m with K. The data m encrypted with key K is
written $\{m\}_K$. Data which have been encrypted with a public key $PK_A$ can only² be
decrypted with the private key $PrvK_A$, i.e., only by A. Likewise, data encrypted with
$PrvK_A$ can only be decrypted with $PK_A$.³
© Springer-Verlag London 2017
P.C. Ölveczky, Designing Reliable Distributed Systems, Undergraduate Topics
in Computer Science, DOI 10.1007/978-1-4471-6687-0_14
The amazing thing about public-key cryptography is that two parties Alice and
Bob can communicate secretly without having a shared secret key! (This is obvi-
ously very useful when you want to communicate securely, for example sending
credit card numbers, with a web service that you have not interacted with before.)
If Alice wants to send a secret message m that only Bob can understand, she just
sends the message encrypted with Bob’s public key ($\{m\}_{PK_{Bob}}$). Only Bob can decrypt
this message; no other agent who sees this message can decrypt it. This does
not solve the authentication problem, however: Bob cannot be sure that Alice sent
the message; everybody knows Bob’s public key and can send the message.
Public-key cryptography is based on finding public/private key pairs and encryp-
tion/decryption algorithms so that it is computationally infeasible (meaning that it
should take large networks of computers many years) to:
1. figure out the private key of an agent, and
2. decrypt an encrypted message without knowing the decryption key.
The RSA algorithm is the main framework for public-key cryptography. It is based
on selecting two very large (1024 bits or so) prime numbers p and q; their product
n = p · q is part of the public key. RSA cryptography relies on the fact that it is
impossible to factor n into its two constituents p and q within reasonable time.4
(If there were a quick way to factor a very large number, for example by quantum
computing [104], then RSA-based public-key cryptography would no longer work.)
Encryption and decryption in RSA are done by modular exponentiation: the encrypted
version of the plaintext message m is $\{m\}_{(n,e)} = m^e \bmod n$, and decryption
also uses modular exponentiation: $\mathit{decrypt}(\{m\}_{(n,e)})_{(n,d)} = (\{m\}_{(n,e)})^d \bmod n = m$,
where (n, e) and (n, d) form a public key/private key pair, with the secret d easily
obtained from e and the prime factors p and q of n = p · q.
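The arithmetic can be tried out with a toy key pair in Python. The primes 61 and 53 and the exponent 17 are illustrative values chosen here, not taken from the text, and are of course far too small to be secure; the helper name `crypt` is also made up. Computing d with `pow(e, -1, phi)` requires Python 3.8 or later.

```python
# Toy RSA key generation (assumed example values, insecure sizes).
p, q = 61, 53
n = p * q                    # 3233, part of both keys
phi = (p - 1) * (q - 1)      # 3120, easy to compute only if p and q are known
e = 17                       # public exponent, coprime with phi
d = pow(e, -1, phi)          # private exponent: (e * d) mod phi == 1

def crypt(x, key):
    """Modular exponentiation x^k mod n for a key (n, k)."""
    modulus, exponent = key
    return pow(x, exponent, modulus)

public_key, private_key = (n, e), (n, d)

m = 65                                   # plaintext, must be smaller than n
c = crypt(m, public_key)                 # {m}_(n,e) = m^e mod n
print(d, c, crypt(c, private_key))       # 2753 2790 65
```

The same function serves for encryption and decryption, which mirrors the symmetry of the two formulas above: only the exponent differs.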
In real life a person signs a document to prove that (s)he wrote/saw the document
and to ensure that the document cannot be forged. Public-key cryptography can be
used to “sign” digital contracts. If Peter wants to sign a contract m (such as “Peter
owes the bank $1000”) with the bank so that:
1 An agent can get the public key of another agent from a trusted key server, but we abstract from
such details. Such a key server setup is, however, itself a nontrivial issue.
2 This is called the perfect cryptography assumption. In reality, keys could be weak enough to be
1. the bank knows—and can prove—that Peter has agreed to the contract, and
2. neither Peter nor the bank can later fake the contract (to either “Peter owes the
bank $1” or to “Peter owes the bank $1,000,000”)
then Peter just encrypts the contract m with his private key and sends the encrypted
message $\{m\}_{PrvK_{Peter}}$ to the bank.
The bank can now decrypt the received message with Peter’s public key: if the
result is as expected, then the bank knows that Peter signed the document (nobody
else could send the message $\{m\}_{PrvK_{Peter}}$). Furthermore, Peter cannot later on
claim that the contract has been altered (since the bank can just present $\{m\}_{PrvK_{Peter}}$
and decrypt it), and the bank cannot fake the contract, since it cannot produce the
encrypted version $\{m'\}_{PrvK_{Peter}}$ of the faked message $m'$.
As explained below, public-key cryptography is somewhat inefficient. The entire
message m is therefore usually not encrypted. Instead, a hash function h “shortens”
the message to h(m), and the pair $(m, \{h(m)\}_{PrvK_{Peter}})$ is sent to the bank.
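The hash-then-sign idea can be sketched in Python as follows. Everything concrete in this sketch is an assumption made for illustration: SHA-256 plays the role of h, the key pair is built from two small Mersenne primes (much too small and too structured for real use), and the hash is reduced modulo n so that it fits the toy key. The names `sign` and `verify` are hypothetical.

```python
import hashlib

# Toy key pair (assumed values): n = p * q with two known Mersenne primes.
p, q = 2**31 - 1, 2**61 - 1
n = p * q
e = 65537                                # Peter's public exponent
d = pow(e, -1, (p - 1) * (q - 1))        # Peter's private exponent (Python 3.8+)

def h(message: bytes) -> int:
    """Shorten the message to a number below n."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(message: bytes) -> int:
    """Peter computes {h(m)} encrypted with his private key."""
    return pow(h(message), d, n)

def verify(message: bytes, signature: int) -> bool:
    """Anyone can check the signature with Peter's public key."""
    return pow(signature, e, n) == h(message)

contract = b"Peter owes the bank $1000"
signature = sign(contract)               # the pair (contract, signature) is sent

print(verify(contract, signature))                       # True
print(verify(b"Peter owes the bank $1", signature))      # False
```

Only the short hash value goes through the expensive modular exponentiation, regardless of how long the contract is.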
The agent A is the initiator who wants to establish a communication session with
the responder B.
In the first step, A generates the nonce $N_a$, adds her identity A, encrypts this
concatenation $N_a . A$ with the public key of B, and sends this encrypted message,
together with her own and B’s name (unencrypted) to B. When B receives this first
message, he decrypts the encrypted part using his private key $PrvK_B$ to obtain the
nonce $N_a$. Only A and B know the value of $N_a$ at this stage, even if there are
eavesdroppers “listening” to the messages being transmitted in the network. (Why?)
The responder B then generates his own nonce $N_b$, and returns the nonce $N_a$ along
with the new nonce $N_b$, encrypted with the public key of A. In addition, B adds the
names B and A (unencrypted) and sends this Message 2 back to A. When A receives
this Message 2 she decrypts it with her private key to read both $N_a$ and $N_b$. It seems
that at this stage of the protocol run A should be assured that she is talking to B
while B cannot be sure that he is talking to A.
To convince B that he is talking to A, the initiator A encrypts the received nonce
$N_b$ with B’s public key, and sends the message back to B (together with the receiver
and sender names). Since only A could decrypt Message 2, only A and B know
$N_b$, and when B receives $\{N_b\}_{PK_B}$ he is convinced that only A could have sent this
message. At the end of a protocol run A is convinced that she is talking to B, and B
is convinced that he is talking to A.
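The three messages of an honest run can be traced with a symbolic Python sketch. Encryption is not computed here: under the perfect cryptography assumption, an encrypted message is simply represented as a record that only the owner of the key may open. The class `Encrypted`, the functions `decrypt` and `run_nspk`, and the tuple representation of nonces are all assumptions made for this illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Encrypted:
    key_owner: str      # agent whose public key was used
    content: tuple      # plaintext, readable only by key_owner

def decrypt(msg, agent):
    """Symbolic decryption: succeeds only for the owner of the key."""
    if msg.key_owner != agent:
        raise PermissionError(f"{agent} cannot decrypt a message for {msg.key_owner}")
    return msg.content

def run_nspk(a, b):
    """One honest run between initiator a and responder b."""
    na, nb = ("nonce", a, 1), ("nonce", b, 1)
    # Message 1: a sends {Na . a} encrypted with b's public key.
    msg1 = Encrypted(b, (na, a))
    received_na, claimed_sender = decrypt(msg1, b)
    # Message 2: b sends {Na . Nb} encrypted with a's public key.
    msg2 = Encrypted(claimed_sender, (received_na, nb))
    echoed_na, received_nb = decrypt(msg2, a)
    assert echoed_na == na          # a recognizes her own nonce
    # Message 3: a sends {Nb} encrypted with b's public key.
    msg3 = Encrypted(b, (received_nb,))
    (echoed_nb,) = decrypt(msg3, b)
    assert echoed_nb == nb          # b recognizes his own nonce
    return [msg1, msg2, msg3]

for message in run_nspk("A", "B"):
    print(message)
```

An eavesdropper "C" calling `decrypt` on any of the three messages gets a `PermissionError`, which is how this sketch expresses that the nonces stay secret in an honest run.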
Exercise 194 Assume that we have intruders who can send fake messages but can-
not guess private keys and nonces. After A has received Message 2,
1. why would it seem that A should be assured that she is talking to B?, and
2. why cannot B be sure he is talking to A?
Exercise 196 Can you indicate how the NSPK protocol can be extended/used to
establish a secret key between two (mutually authenticated) agents?
14.3 Modeling NSPK in Maude
This section shows how the NSPK protocol can be modeled in Maude. Although
the informal specification of NSPK only describes a single run of the protocol, our
model allows more than two agents in the system and also allows multiple concurrent
runs, or sessions, of the protocol. An agent can be either an initiator, a responder,
or both initiator and responder (in different runs of the protocol). For simplicity
I assume that an agent A can initiate at most one run of the protocol with the same
responder. Two agents may however simultaneously initiate contact with each other.
For reasons explained above we assume that: (i) no agent can successfully guess
the value of a nonce or a private key whose value it does not know, (ii) no agent can
decrypt a ciphertext (encrypted plaintext) whose decryption key it does not know,
and (iii) no agent can encrypt plaintext with a key whose value it does not know.
Modeling Nonces and Keys. We abstract from the numerical value of a nonce, and
represent the i-th nonce generated by agent A by the term nonce(A, i):
(omod NSPK is protecting NAT . including MESSAGE-WRAPPER .
sort Nonce .
op nonce : Oid Nat -> Nonce [ctor] .
It is not necessary to model the private keys since we assume that only the agent A
can decrypt a ciphertext which was encrypted with the public key of A.
Modeling the Messages. The three messages in the protocol all have the form
O1 . O2 . {message content}K where message content is either a nonce and an agent
identifier, two nonces, or just a single nonce. This part of the message content is
modeled by the following sort PlainTextMsgContent:
sorts PlainTextMsgContent EncrMsgContent .
op _;_ : Nonce Oid -> PlainTextMsgContent [ctor] . --- Message 1
op _;_ : Nonce Nonce -> PlainTextMsgContent [ctor] . --- Message 2
subsort Nonce < PlainTextMsgContent . --- Message 3
Finally, a message is equipped with the (presumed!) sender and receiver identities;
they are included in the usual message wrapper, which means that an encrypted
message content is the content of a message sent around the network:
subsort EncrMsgContent < MsgContent .
Modeling Initiators. An agent which can initiate a run of the protocol is modeled as
an object of the following class Initiator:
class Initiator | initSessions : InitSessions, nonceCtr : Nat .
The initiator must remember the nonce it sent in Message 1, so that it can check
whether this is the same nonce that it receives in Message 2. Since an initiator may
be simultaneously involved in many runs of the protocol, it must remember the
nonces in all these sessions. The attribute initSessions of an initiator A stores
such information in a multiset of elements of the following kinds:
• notInitiated(B) indicates that A wants to initiate contact with B but has not
yet done so;
• initiated(B, N) indicates that A has sent Message 1 to B with nonce N and is
waiting for Message 2 from B; and
• trustedConnection(B) indicates that A has established (what she thinks is) an
authenticated connection with B.
The data type representing this kind of information is defined as follows:
sorts Sessions InitSessions .
subsort Sessions < InitSessions .
op emptySession : -> Sessions [ctor] .
op __ : InitSessions InitSessions -> InitSessions
[ctor assoc comm id: emptySession] .
op __ : Sessions Sessions -> Sessions
[ctor assoc comm id: emptySession] .
op notInitiated : Oid -> InitSessions [ctor] .
op initiated : Oid Nonce -> InitSessions [ctor] .
op trustedConnection : Oid -> Sessions [ctor] .
The attribute nonceCtr denotes the index of the next nonce generated by the object.
The following variables are used in the definition of the initiator:
vars A B : Oid . vars M N : Nat .
vars NONCE NONCE' : Nonce . var IS : InitSessions .
The rule send-1 models sending Message 1. The agent A has notInitiated(B)
in its initSessions attribute, which means that it wants to establish a connection
with B. The agent A generates a fresh nonce nonce(A, N) and sends the correspond-
ing Message 1 to B. Agent A must also remember that it has initiated contact with B
using nonce nonce(A, N) and must increase its nonce counter:
rl [send-1] :
< A : Initiator | initSessions : notInitiated(B) IS,
nonceCtr : N >
=>
< A : Initiator | initSessions : initiated(B, nonce(A, N)) IS,
nonceCtr : N + 1 >
msg (encrypt (nonce(A, N) ; A) with pubKey(B)) from A to B .
rl [read-2-send-3] :
(msg (encrypt (NONCE ; NONCE') with pubKey(A)) from B to A)
< A : Initiator | initSessions : initiated(B, NONCE) IS >
=>
< A : Initiator | initSessions : trustedConnection(B) IS >
msg (encrypt NONCE' with pubKey(B)) from A to B .
The attribute respSessions keeps track of the sessions in which the agent is
responder; a value responded(A, N) means that the agent has received Message 1
from A and has responded using its own nonce N:
sort RespSessions .
subsort Sessions < RespSessions .
op __ : RespSessions RespSessions -> RespSessions
[ctor assoc comm id: emptySession] .
op responded : Oid Nonce -> RespSessions [ctor] .
The rule read-1-send-2 models the reception of Message 1. The condition not
A inSession RS ensures that the responder B is not already a responder in a ses-
sion with the initiator A. When B receives the message, it creates its own nonce
(nonce(B, N)) and sends this nonce together with the received nonce (NONCE),
appropriately encrypted, back to A:
var RS : RespSessions .
crl [read-1-send-2] :
(msg (encrypt (NONCE ; A) with pubKey(B)) from A to B)
< B : Responder | respSessions : RS, nonceCtr : N >
=>
< B : Responder | respSessions : responded(A, nonce(B, N)) RS,
nonceCtr : N + 1 >
msg (encrypt (NONCE ; nonce(B,N)) with pubKey(A)) from B to A
if not A inSession RS .
The second, and last, responder rule models the reception of Message 3 with the
expected nonce from A:
rl [read-3] :
(msg (encrypt NONCE with pubKey(B)) from A to B)
< B : Responder | respSessions : responded(A, NONCE) RS >
=>
< B : Responder | respSessions : trustedConnection(A) RS > .
Agents that are Both Initiators and Responders. An agent that may be both initiator
and responder is modeled as an object instance of the class InitAndResp, which is
a subclass of both Initiator and Responder and therefore inherits the union of
the attributes of these classes, as well as their rewrite rules:
class InitAndResp .
subclass InitAndResp < Initiator Responder .
endom)
To analyze NSPK in the absence of “bad guys” we define an initial state init2
with three agents "a", "Bank", and "c". The agents "a" and "c" may initiate
a session with each other simultaneously (remember the “separation problem”?).
Furthermore, "a" does not want to establish communication with "Bank", so the
"Bank" should never have a trusted connection with "a".
(omod TEST-NSPK is including NSPK . protecting STRING .
subsort String < Oid .
Solution 1
C:Configuration -->
< "Bank" : Responder | nonceCtr : 2,
respSessions : trustedConnection("c")>
< "a" : InitAndResp | initSessions : trustedConnection("c"),
nonceCtr : 3,
respSessions : trustedConnection("c")>
< "c" : InitAndResp | initSessions : trustedConnection("Bank")
trustedConnection("a"),
nonceCtr : 4,
respSessions : trustedConnection("a")>
No more solutions.
All behaviors lead to the single final state in which all the desired connections have
been established: the protocol seems to be doing its job in the absence of “bad guys.”
14.4 Modeling Intruders
This section presents a model of an intruder (also called attacker, adversary, enemy,
etc.) which allows us to analyze our protocol in the presence of “bad guys.”
Since messages may be transmitted over an unprotected network, we use the
well-known “Dolev-Yao” intruder model [30, 79] where an intruder can:
• Overhear and/or intercept (steal) messages that are sent around in the system.
• Decrypt messages that are encrypted with its own public key.
• Introduce new messages into the system, using nonces that the intruder knows.
• Replay any message it has seen, even if it cannot understand the encrypted part
of the message. The intruder may change the plaintext parts of such messages.
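The capabilities listed above can be pictured as operations on the intruder's accumulated knowledge. The following Python sketch is a symbolic illustration only: the classes `Encrypted` and `Intruder`, their field names, and the representation of nonces as tuples and agent names as strings are assumptions introduced here, not the Maude specification.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Encrypted:
    key_owner: str      # agent whose public key was used
    content: tuple      # nonces (tuples) and agent names (strings)

@dataclass
class Intruder:
    name: str
    agents_seen: set = field(default_factory=set)
    nonces_seen: set = field(default_factory=set)
    encr_msgs_seen: set = field(default_factory=set)

    def intercept(self, sender, receiver, body):
        """Overhear or steal one message and record what can be learned."""
        self.agents_seen.add(sender)
        if body.key_owner == self.name:
            # Encrypted with the intruder's own public key: read the content.
            for item in body.content:
                if isinstance(item, tuple):
                    self.nonces_seen.add(item)
                else:
                    self.agents_seen.add(item)
        else:
            # Cannot decrypt, but can keep the ciphertext for a later replay.
            self.agents_seen.add(receiver)
            self.encr_msgs_seen.add(body)

    def replay(self, body, claimed_sender):
        """Resend a stored ciphertext under a possibly false sender name."""
        return (claimed_sender, body.key_owner, body)

beagle = Intruder("BeagleBoys")
n1 = ("nonce", "Scrooge", 1)
beagle.intercept("Scrooge", "BeagleBoys", Encrypted("BeagleBoys", (n1, "Scrooge")))
secret = Encrypted("Bank", (("nonce", "Scrooge", 2), "Scrooge"))
beagle.intercept("Scrooge", "Bank", secret)

print(sorted(beagle.agents_seen))         # ['Bank', 'Scrooge']
print(beagle.nonces_seen == {n1})         # True: only the first nonce was readable
```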
The intruders are assumed to be part of the computer network and can also take part
in normal runs of the protocol [79]. (After all, an intruder must contact the bank as
an ordinary agent to reap the benefits of his illegal activities.) This also means that
an intruder knows the protocol being used.
The following specification defines all possible behaviors of an intruder, most of
which make no sense whatsoever. The point is that if the protocol can withstand all
possible attacks, then it is secure (under the perfect cryptography assumption).
The following variables are used to specify the intruder:
(omod NSPK-INTRUDER is
including NSPK . including OID-SET .
Since an intruder is also a normal actor, it has all the attributes of a normal agent. In
addition, an intruder stores the information it gathers in three attributes:
• agentsSeen contains the set of agent identifiers known by the intruder;
• noncesSeen contains the set of nonces the intruder knows; and
• encrMsgsSeen contains the set of encrypted message contents which the intruder
has seen without being able to decrypt.
The sort NonceSet is defined as expected:
sort NonceSet .
subsort Nonce < NonceSet .
op emptyNonceSet : -> NonceSet [ctor] .
op __ : NonceSet NonceSet -> NonceSet
[ctor assoc comm id: emptyNonceSet] .
eq NONCE NONCE = NONCE .
That is, when receiving Message 1, the intruder responds to the message accord-
ing to the NSPK protocol. In addition, it stores the identity of the sender (A) in
its agentsSeen attribute, and stores the received nonce NONCE and its own newly
created nonce nonce(I, N) in its noncesSeen attribute.
The following rule intercept-but-not-understand models the case when an
intruder intercepts (steals) a message which is encrypted with another agent’s public
key. (Since each message in NSPK is encrypted with the public key of the intended
receiver, the intruder knows that the message is encrypted with O’s public key, even
though it cannot decrypt the message.) The intruder cannot decrypt the message, but
stores the encrypted message content and the sender and receiver names:
crl [intercept-but-not-understand] :
(msg ENCRMSG from O' to O)
< I : Intruder | agentsSeen : OS, encrMsgsSeen : ENCRMSGS >
=>
< I : Intruder | agentsSeen : OS ; O ; O',
encrMsgsSeen : ENCRMSG ENCRMSGS >
if O =/= I .
rl [intercept-msg1-and-understand] :
(msg (encrypt (NONCE ; A) with pubKey(I)) from O to I)
< I : Intruder | agentsSeen : OS, noncesSeen : NSET >
=>
< I : Intruder | agentsSeen : OS ; O ; A,
noncesSeen : NSET NONCE > .
rl [intercept-msg2-and-understand] :
(msg (encrypt (NONCE ; NONCE') with pubKey(I)) from O to I)
< I : Intruder | agentsSeen : OS, noncesSeen : NSET >
=>
< I : Intruder | agentsSeen : OS ; O,
noncesSeen : NSET NONCE NONCE' > .
We next model an intruder’s capabilities for sending fake messages, using the
agent identities, the nonces, and the encrypted message contents it knows.
The rule send-encrypted models the case in which an intruder sends a fake
message with a content that it has previously stored but could not decrypt. Since
the content is encrypted with B’s public key, the fake message will be sent to B. The
claimed “sender” could be any agent A whose identity the intruder knows:
crl [send-encrypted] :
< I : Intruder | encrMsgsSeen :
(encrypt MSGC with pubKey(B)) ENCRMSGS,
agentsSeen : A ; OS >
=>
< I : Intruder | >
(msg (encrypt MSGC with pubKey(B)) from A to B)
if A =/= B .
(A skeptical reader may wonder how the intruder knows that the encrypted message
is encrypted with the public key of B, since the ciphertext itself does not reveal
this. As mentioned above, the intruder can store this information
when it intercepts the message, since it can read the receiver part of the message.)
Finally, an intruder may compose any Message 1, Message 2, or Message 3 (see
Exercise 197) using the nonces and agent identifiers it knows:
crl [send-1-fake] :
< I : Intruder | agentsSeen : A ; B ; OS,
noncesSeen : NONCE NSET >
=>
< I : Intruder | >
(msg (encrypt (NONCE ; A) with pubKey(B)) from A to B)
if A =/= B /\ B =/= I .
crl [send-2-fake] :
< I : Intruder | agentsSeen : A ; B ; OS,
noncesSeen : NONCE NONCE' NSET >
=>
< I : Intruder | >
(msg (encrypt (NONCE ; NONCE') with pubKey(A)) from B to A)
if A =/= B /\ A =/= I .
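As a rough illustration of what send-1-fake and send-2-fake allow, the following Python sketch (not part of the Maude specification) enumerates the forged messages that an intruder could compose from a given set of known agents and nonces. The tuple encoding of messages and the function names are assumptions made for this sketch only.

```python
from itertools import permutations


def encrypt(content, key_owner):
    """Symbolic ciphertext: content encrypted with key_owner's public key."""
    return ("encrypt", content, key_owner)


def fake_message_1(intruder, agents_seen, nonces_seen):
    """Forgeries allowed by send-1-fake: A =/= B and B =/= I."""
    forged = set()
    for a, b in permutations(agents_seen, 2):  # ordered pairs with a != b
        if b == intruder:
            continue
        for nonce in nonces_seen:
            # Each entry is (content, claimed sender, receiver).
            forged.add((encrypt((nonce, a), b), a, b))
    return forged


def fake_message_2(intruder, agents_seen, nonces_seen):
    """Forgeries allowed by send-2-fake: A =/= B and A =/= I."""
    forged = set()
    for a, b in permutations(agents_seen, 2):
        if a == intruder:
            continue
        # Two distinct known nonces, in either order.
        for n1, n2 in permutations(nonces_seen, 2):
            forged.add((encrypt((n1, n2), a), b, a))
    return forged


agents = {"Bank", "Scrooge", "BeagleBoys"}
nonces = {("nonce", "Scrooge", 1)}
msgs1 = fake_message_1("BeagleBoys", agents, nonces)
msgs2 = fake_message_2("BeagleBoys", agents, nonces)
```

With a single known nonce the intruder can already forge several versions of Message 1, but no Message 2, since that rule needs two distinct nonces.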
Exercise 198 Why are no behaviors lost by adding the equation eq MSG MSG = MSG?
Exercise 199 Are the three rules in which the intruder intercepts a message to itself
really necessary? Why/why not?
This section uses Maude to analyze whether the Beagle Boys can fool the bank into
thinking that it has an authenticated connection with Scrooge, who does not want to
connect to the bank. We define the following initial state intruderInit:
op intruderInit : -> Configuration .
eq intruderInit
= < "Scrooge" : Initiator |
initSessions : notInitiated("BeagleBoys"), nonceCtr : 1 >
< "Bank" : Responder |
respSessions : emptySession, nonceCtr : 1 >
< "BeagleBoys" : Intruder |
initSessions : emptySession, respSessions : emptySession,
nonceCtr : 1, agentsSeen : "Bank" ; "BeagleBoys",
noncesSeen : emptyNonceSet, encrMsgsSeen : emptyEncrMsg > .
The Beagle Boys do not know any other agent, except the bank, but hope to be con-
tacted by some rich guys after creating an enticing web site promising . . . Indeed,
Scrooge wants to contact the Beagle Boys but not the bank. Therefore, if it is pos-
sible to reach a state where the bank thinks that it has established an authenticated
connection with Scrooge, then the protocol is broken, and Scrooge’s wealth can
be transferred to the Beagle Boys. The following search command checks whether
such an undesired state is reachable from intruderInit:
After about a hundred minutes of execution on a 1.7 GHz laptop, Maude replies with:
Solution 1
C:Configuration -->
< "Scrooge" : Initiator |
initSessions : trustedConnection("BeagleBoys"),
nonceCtr : 2 >
< "BeagleBoys" : Intruder |
agentsSeen :("Bank" ; "Scrooge" ; "BeagleBoys"),
encrMsgsSeen : encrypt nonce("Scrooge",1) ; nonce("Bank",1)
with pubKey("Scrooge"),
initSessions : emptySession, nonceCtr : 1,
noncesSeen : nonce("Bank",1) nonce("Scrooge",1),
respSessions : emptySession > ;
...
The Beagle Boys have fooled the bank into thinking that it has a trusted connection
with the unknowing Scrooge! The NSPK protocol is therefore insecure . . . or our
Maude model is incorrect. To be sure that NSPK can be broken, and to learn about
the attack on NSPK, we need to obtain the path leading to the bad state. Using the
technique in Sections 10.2.4.1 and 13.1.4.3, we obtain the following path:
Maude> show path 3443070 .
state 0, Configuration:
< "Bank" : Responder | nonceCtr : 1, respSessions : emptySession >
< "Scrooge" : Initiator | initSessions : notInitiated("BeagleBoys"),
nonceCtr : 1 >
< "BeagleBoys" : Intruder | agentsSeen : ("Bank" ; "BeagleBoys"),
encrMsgsSeen : emptyEncrMsg, initSessions : emptySession,
nonceCtr : 1, noncesSeen : emptyNonceSet, respSessions : emptySession >
===[ rl ... [label start-send-1] . ]===>
state 1, Configuration:
< "Bank" : Responder | ... > < "BeagleBoys" : Intruder | ... >
< "Scrooge" : Initiator | initSessions :
initiated("BeagleBoys", nonce("Scrooge", 1)),
nonceCtr : 2 >
msg encrypt nonce("Scrooge", 1) ; "Scrooge" with pubKey("BeagleBoys")
from "Scrooge" to "BeagleBoys"
===[ rl ... [label intercept-msg1-and-understand] . ]===>
state 2, Configuration:
< "Bank" : Responder | ... > < "Scrooge" : Initiator | ... >
< "BeagleBoys" : Intruder | agentsSeen : ("Bank" ; "Scrooge" ; "BeagleBoys"),
noncesSeen : nonce("Scrooge", 1), ... >
===[ crl ... [label send-1-fake] . ]===>
state 9, Configuration:
< "Bank" : Responder | ... > < "Scrooge" : Initiator | ...>
< "BeagleBoys" : Intruder | ... >
msg encrypt nonce("Scrooge", 1) ; "Scrooge" with pubKey("Bank")
from "Scrooge" to "Bank"
===[ crl ... [label read-1-send-2] . ]===>
state 66, Configuration:
< "Bank" : Responder | nonceCtr : 2,
respSessions : responded("Scrooge", nonce("Bank", 1)) >
< "Scrooge" : Initiator | ... > < "BeagleBoys" : Intruder | ... >
msg encrypt nonce("Scrooge", 1) ; nonce("Bank", 1) with pubKey("Scrooge")
from "Bank" to "Scrooge"
All steps are indeed valid steps: the bank and Scrooge follow the protocol, and the
Beagle Boys only send things they know. This is therefore a valid attack on NSPK.
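The attack can also be replayed outside Maude. The following Python sketch uses a hypothetical symbolic encoding of encryption (a ciphertext can only be opened by the owner of the key) and walks through the message exchange. The steps up to Message 2 follow the path shown above; the remaining steps follow Lowe's published attack and are an assumption about the part of the path not shown here.

```python
def encrypt(content, key_owner):
    """Symbolic ciphertext: content encrypted with key_owner's public key."""
    return ("encrypt", content, key_owner)


def decrypt(agent, ciphertext):
    """Only the owner of the key can recover the content."""
    _, content, key_owner = ciphertext
    if key_owner != agent:
        raise PermissionError(f"{agent} cannot decrypt a message for {key_owner}")
    return content


def replay_attack():
    scrooge, bank, intruder = "Scrooge", "Bank", "BeagleBoys"
    ns = ("nonce", scrooge, 1)
    nb = ("nonce", bank, 1)
    known_nonces = set()

    # Scrooge starts an honest session with the intruder (start-send-1).
    msg1 = encrypt((ns, scrooge), intruder)

    # The intruder decrypts Message 1 and learns Scrooge's nonce.
    nonce, sender = decrypt(intruder, msg1)
    known_nonces.add(nonce)

    # The intruder forges Message 1 to the bank, posing as Scrooge (send-1-fake).
    fake1 = encrypt((nonce, sender), bank)

    # The bank follows the protocol and answers "Scrooge" (read-1-send-2).
    received_nonce, initiator = decrypt(bank, fake1)
    bank_session = ("responded", initiator, nb)
    msg2 = encrypt((received_nonce, nb), initiator)

    # The intruder cannot read Message 2, but can relay it to Scrooge unchanged.
    relayed = msg2

    # Scrooge recognizes his own nonce and returns the other nonce to the
    # agent he believes he is talking to: the intruder.
    first, second = decrypt(scrooge, relayed)
    assert first == ns
    msg3 = encrypt(second, intruder)

    # The intruder now learns the bank's nonce and forges Message 3.
    leaked = decrypt(intruder, msg3)
    known_nonces.add(leaked)
    fake3 = encrypt(leaked, bank)

    if decrypt(bank, fake3) == bank_session[2]:
        bank_session = ("trustedConnection", initiator)
    return bank_session, known_nonces


session, nonces = replay_attack()
```

In this sketch the intruder never decrypts anything that is not encrypted with its own key, and the honest agents only follow the protocol, yet the bank ends up believing it has a trusted connection with Scrooge.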
Exercise 200 Is the set of states reachable from intruderInit finite or infinite?
(Remember the equation that removes copies of a message from the configuration.)
Exercise 201 A search for attacks often searches for compromised nonces or keys.
Search for a state reachable from intruderInit where the bank has responded to
a (perceived) request from Scrooge with nonce nb , and where the intruder knows
both Scrooge and the bank, and also knows the nonce nb . (The intruder then has all
the knowledge needed to send the appropriate fake Message 3 to the bank.)
Exercise 202 The Handbook of Applied Cryptography [79] presents the following
version of the NSPK protocol that avoids encrypting the third message:
14.6 Discussion
The NSPK protocol, which was published in 1978, is discussed in the Handbook
of Applied Cryptography [79] from 1996 without any comments that it is insecure.
The protocol was also proved correct in the absence of intruders in 1989 [17].
The attack on the protocol was originally reported by Gavin Lowe in 1995 [72].
The attack was supposedly found during formal analysis using the FDR tool for the
process algebra CSP [73], and is the same attack found in our Maude analysis.
Although it might look slightly disconcerting that the Maude search took a hundred
minutes, this reflects the complexity of the problem: after all, the attack had escaped
the attention of experts for 17 years. In Lowe’s analysis, the intruder model did not
include rules for taking part in original runs of the protocol; if we ignore those four
rules, Maude finds the attack in a few seconds. Another common way of speeding
up the search is to search for compromised nonces/keys: it is sufficient to search for
a state in which the bank is waiting for some nonce and the attacker has that nonce;
this search takes about 15 seconds (see Exercise 201).
Lowe’s work showed the need for automatic analysis of cryptographic proto-
cols, since humans could no longer be expected to be able to manually verify their
correctness. This led to the development of a number of successful formal tools for
analyzing such protocols; examples include the TAMARIN Prover [78], Scyther [24],
ProVerif [13], the Avispa toolset [5], and the Maude-based Maude-NPA tool [42].
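Gavin Lowe also suggested a modification of the protocol to make it secure: the
responder adds its own identity to the encrypted part of Message 2, which then
contains the initiator’s nonce, the responder’s nonce, and the responder’s identity,
all encrypted with the initiator’s public key.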
Exercise 203 Explain why the attack on NSPK no longer works (or can be easily
modified to work) in the modified protocol.
Exercise 204 Modify the Maude specification of the protocol to model the new
version of the protocol. Define an initial state with one honest initiator, one
honest responder, and one intruder. Can you break the modified protocol?
Exercise 205 Explain why the reachable state space is finite if each honest agent in
the three-agent setting above generates at most one nonce.
15 System Requirements
The previous chapters of this book explained how the behaviors of a system can be
specified mathematically. Such a system specification must be complemented by a
requirement specification defining the properties that the system must satisfy.
Example 15.1. In our building metaphor in Section 1.1, the system specification
corresponds to a model of the building, e.g., a physical scale model, a set of draw-
ings of the building, and/or a virtual model of the building that together describe
(aspects of) the building to be constructed. The requirement specification, defining
the requirements that the building must satisfy, could include properties such as:
• The building should be able to withstand an 8.0 magnitude earthquake.
• The building should be able to withstand winds of up to 95 knots.
• All rooms in the building must be wheelchair accessible.
• There must be at least one bathroom for every 12 bedrooms. ♦
Example 15.2. This book has presented a number of system specifications. Some
desired requirements of the respective systems are:
1. Two philosophers should never hold the same chopstick at the same time.
2. Each philosopher must eat infinitely often.
3. Each philosopher could eat infinitely often.
4. The receiver will sooner or later receive all the messages in the right order.
5. The Beagle Boys are able to establish a trusted connection with the bank.
6. The bank never has a trusted connection with Scrooge.
7. Two processes will not execute in their critical sections at the same time.
8. Each process will eventually execute in its critical section.
9. The processes will execute in their critical sections in the order in which they
wanted to access their critical sections.
10. All nodes will eventually elect the same leader. ♦
© Springer-Verlag London 2017
P.C. Ölveczky, Designing Reliable Distributed Systems, Undergraduate Topics
in Computer Science, DOI 10.1007/978-1-4471-6687-0_15
Given a system (represented by its model) and the requirements that the sys-
tem should satisfy, the all-important question is whether the system satisfies its
requirements. The answer to that question obviously depends on the initial states.
Even a “correct” system will not satisfy a desired requirement from a bad ini-
tial state. For example, the requirement “two processes are never in their re-
spective critical sections at the same time” is not satisfied by the specification
MUTEX-WITH-CENTRAL-SERVER on page 223 if the initial state contains two nodes
whose state attribute has the value insideCS. Therefore, the main question is:
Does the system S satisfy the requirement R when started from any initial state s0 ∈ I from
a set I of admissible initial states?
System requirements may be stated either in terms of the actions (or events) that are
performed during system executions, or in terms of the states that are encountered
during system executions, or both.
In some cases, action/event-based properties are more natural:
Example 15.3. Consider (American) football games (see Section 8.2.3). An impor-
tant requirement (that is not satisfied by the module ONE-FOOTBALL-GAME) is that
an extra point or a two-point conversion by a team may only be performed immediately after
that team has scored a touchdown.
In other words, a two-point conversion may not follow directly after a field goal, a
safety, an extra point, a two-point conversion, or a touchdown by the other team. ♦
Some other natural action-based requirements are:
• No person can be baptized more than once.
• Each wedding must be preceded by an engagement involving the same persons.
• Each philosopher can start eating infinitely many times.
• The order in which the nodes enter their critical section should equal the order in
which the nodes perform the requestAccessToCS action.
Other system requirements are more conveniently expressed as properties of the
states of the system:
• Two nodes are never in local state insideCS in any state.
• The population is never inconsistent: there is no state in which Bridget is married
to Tom and Tom is married to Gisele.
• Two nodes do not have different leaders in any state.
• The value of the receiver’s msgsRcvd attribute should eventually equal the value
of the sender’s msgsToSend attribute in the initial state.
• The bank should not have an established connection with Scrooge unless Scrooge
has an established connection with the bank.
• Two neighboring philosophers should never eat at the same time.
Other requirements are most naturally given by combining actions and states:
• A person in state baptized should not be able to make a hajj (pilgrimage to
Mecca) or undergo another baptism.
• No person older than 50 years should be able to give birth.
A requirement that is naturally expressed using actions can often also be ex-
pressed (albeit less conveniently) using states, and vice versa. However, the require-
ment in Example 15.3 cannot be expressed in terms of states (unless the original
specification is modified), because it is impossible to differentiate a two-point con-
version from a safety by just looking at the states, since both are worth two points.
This book focuses on state-based requirements, which are more commonly used.
15.1.1 Actions/Events
where State denotes the sort of the states. For example, for object-oriented systems,
the State sort is Configuration.
Exercise 206 Show a behavior that illustrates that the action-based requirement
“philosopher 2 starts eating infinitely often” is different from the state-based re-
quirement “philosopher 2 is in state eating infinitely often.” Are the requirements
“each philosopher starts eating infinitely often” and “each philosopher is in state
eating infinitely often” different (in terms of being satisfied by different behaviors)?
Exercise 207 For each system requirement in this section, decide whether it is satis-
fied by the corresponding specification(s) (for the obvious admissible initial states).
A state proposition p is (an) invariant with respect to an initial state t0 if and only
if p holds in each state that can be reached (in zero or more rewrite steps) from t0:
that is, t0 −→* t implies p(t). The state proposition p is an invariant with respect to
set I of initial states if and only if p is an invariant w.r.t. each initial state t0 ∈ I. An
invariant is often called a safety property, since it can be seen to mean that “nothing
bad will happen.” Figure 15.1 illustrates invariants.
Fig. 15.1 “The state is red” is an invariant in the “tree” of possible system behaviors from the given
initial state. (Each state is shown as a circle; an arrow means that there is a one-step sequential
rewrite from the source state to the destination state.)
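When the set of reachable states is finite, the definition of invariance suggests a direct check: explore every state reachable from t0 and test p in each of them. The following Python sketch does this by breadth-first search on a hypothetical two-process mutual exclusion system. The system, the state names outsideCS and waiting, and the function names are assumptions for illustration and are not the Maude specifications of this book.

```python
from collections import deque


def successors(state, guarded=True):
    """One-step successors of a hypothetical two-process system."""
    result = []
    for i in (0, 1):
        me, other = state[i], state[1 - i]
        if me == "outsideCS":
            new = "waiting"
        elif me == "waiting":
            if guarded and other == "insideCS":
                continue  # entry is blocked while the other process is inside
            new = "insideCS"
        else:
            new = "outsideCS"
        nxt = list(state)
        nxt[i] = new
        result.append(tuple(nxt))
    return result


def check_invariant(initial, succ, p):
    """Return None if p holds in all reachable states, else a reachable non-p state."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if not p(state):
            return state
        for nxt in succ(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None


def mutex(state):
    """State proposition: the two processes are not both inside."""
    return state != ("insideCS", "insideCS")


init = ("outsideCS", "outsideCS")
ok = check_invariant(init, successors, mutex)
bad = check_invariant(init, lambda s: successors(s, guarded=False), mutex)
```

The guarded version satisfies the invariant, whereas dropping the guard makes a state with both processes inside reachable. The search only terminates because the toy system has finitely many reachable states.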
Example 15.5. A useful invariant pt0 in the alternating bit protocol w.r.t. a “normal”
initial state t0 is that “the value of the receiver’s msgsRcvd attribute is a prefix of the
value of the sender’s msgsToSend attribute in the initial state t0 .” ♦
Example 15.6. “At most one node has state attribute value insideCS” is a desired
invariant in mutual exclusion algorithms. ♦
Example 15.7. “Neighboring philosophers are not both in state eating” should be
an invariant for our initial states in solutions to the dining philosophers problem. ♦
A state may be “inconsistent” until some nodes receive a message. In these cases
the invariant should take the messages traveling between nodes into account:
Example 15.8. The state proposition “if person A is married to person B, then also
B is married to A” is not an invariant (w.r.t. sensible initial states) in our model
of populations. However, the state proposition “if A is married to B, then either B
is married to A or there is a message (msg separate from B to A) in the state”
should be invariant. ♦
Example 15.9. The state proposition “either all nodes have updated equal true or
all nodes have updated equal false” is not invariant (w.r.t. good initial states) in
the two-phase commit protocol without node or communication failures. However,
the state formula “either all nodes have updated equal false, or all nodes have
updated equal true, or some nodes have updated equal true and there is a commit
message in the state addressed to each node with updated equal false” should be
an invariant in the two-phase commit protocol without failures. This invariant also
implies that the databases are consistent when all messages have been consumed. ♦
Invariants say that something bad will not happen. We also want to be able to say
that something good must eventually happen. A state proposition p is a guarantee
(or liveness) property if a p-state can be reached in all possible computations from
the initial state. That is, it is guaranteed that a p-state will be reached sooner or later,
no matter how the rules are applied. Guarantee properties are illustrated in Fig. 15.2.
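For a finite state space, one way to decide whether p is guaranteed is to look for a behavior that avoids p forever: either a reachable cycle that consists only of non-p states, or a deadlocked non-p state. The following Python sketch implements this idea by depth-first search. The three small graphs and all names are hypothetical examples.

```python
def is_guaranteed(initial, successors, p):
    """True if every maximal behavior from `initial` reaches a p-state."""
    on_stack = set()   # non-p states on the current search path
    safe = set()       # non-p states from which p cannot be avoided

    def can_avoid_p(state):
        if p(state) or state in safe:
            return False
        if state in on_stack:
            return True  # a cycle of non-p states: p can be avoided forever
        on_stack.add(state)
        succs = list(successors(state))
        if not succs:
            avoid = True  # deadlock in a non-p state
        else:
            avoid = any(can_avoid_p(s) for s in succs)
        on_stack.discard(state)
        if not avoid:
            safe.add(state)
        return avoid

    return not can_avoid_p(initial)


def done(state):
    return state == "done"


always_finishes = {"start": ["work"], "work": ["done"], "done": ["done"]}
may_loop = {"start": ["work"], "work": ["work", "done"], "done": ["done"]}
may_deadlock = {"start": ["stuck", "done"], "stuck": [], "done": ["done"]}

g1 = is_guaranteed("start", always_finishes.__getitem__, done)
g2 = is_guaranteed("start", may_loop.__getitem__, done)
g3 = is_guaranteed("start", may_deadlock.__getitem__, done)
```

In the second graph a done-state is reachable but not guaranteed, because the system may stay in work forever.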
Example 15.10. The state proposition “process node(4) is executing inside its crit-
ical section” is not guaranteed in the central server algorithm (w.r.t. initial state
init(5)) if all processes execute forever, alternating between executing outside
and inside the critical section (see Exercise 184). The property is guaranteed by the
ring-based mutual exclusion algorithm also when all nodes execute forever. ♦
Example 15.11. “Philosopher 3 is in state eating” is not guaranteed in any of the
solutions to the dining philosophers problem (why not?). ♦
Example 15.12. “Each node has elected the highest-valued node as its leader” is
guaranteed in both leader election algorithms in Section 13.3. ♦
Example 15.13. “The desired string has been stored in the receiver’s msgsRcvd
attribute” is not guaranteed in our transport protocols. ♦
15.2.2.1 Fairness
It is often impossible to guarantee that a desired property (such as “philosopher 2
is eating” in the deadlock-free solutions, and “the receiver has received all strings
in the desired order”) will be reached in all behaviors, since the model may allow
extreme behaviors in which, e.g., all messages are lost, or only philosopher 1 does
something. Such behaviors typically do not represent realistic system behaviors.
Therefore, we can (and must) often assume fairness requirements on how the rewrite
rules are applied in order to guarantee that a desired state will be reached. Since just
imposing requirements on which rules are applied does not exclude unfair behaviors
in which only philosopher 1 gets to execute, we generalize fairness to events.
Two classes of fairness requirements are:
• Compassion (or strong fairness): if an event is enabled (i.e., could take place)
infinitely often, then the event must take place infinitely often.
• Justice (or weak fairness): an event cannot be continuously enabled from a certain
point on without taking place.
Event fairness notions can state that the rule applications should be fair w.r.t. both
which objects and which rules are executed. We also have to consider communica-
tion fairness. If messages can be lost, then there are (unrealistic) behaviors in which
all messages are lost. One communication fairness assumption could be that “if an
infinite number of copies of a certain message are sent, then an infinite number of
such messages are not dropped.” Another fairness assumption is that no message is
“overtaken” infinitely often by other messages. For example, the central server mu-
tual exclusion algorithm in Section 13.2 with continuously executing processes does
not guarantee that a given process p will be able to execute inside its critical sec-
tion, since the requestCS message from p could be overtaken forever by messages
from the other processes. We therefore need a “no infinite message overtaking”
fairness assumption such as “a message cannot be available for reading continu-
ously/infinitely often without being read.” Both of the above fairness assumptions
can be seen as event fairness conditions: the event(s) in which the message m is read
must be applied in a fair way.
There are many different notions and variations of fairness [45], and discussing
them further is beyond the scope of this book.
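The difference between justice and compassion is easiest to see on a behavior that eventually repeats the same finite loop of steps forever. On such a behavior, an event is enabled infinitely often if it is enabled somewhere in the loop, and it is continuously enabled from some point on if it is enabled at every step of the loop. The following Python sketch checks both fairness notions under this assumption; the encoding of a step as a pair (enabled events, event taken) and the event names are hypothetical.

```python
def is_just(loop, event):
    """Justice (weak fairness) for `event` on the run that repeats `loop` forever."""
    continuously_enabled = all(event in enabled for enabled, _ in loop)
    taken_infinitely_often = any(taken == event for _, taken in loop)
    return taken_infinitely_often or not continuously_enabled


def is_compassionate(loop, event):
    """Compassion (strong fairness) for `event` on the run that repeats `loop` forever."""
    enabled_infinitely_often = any(event in enabled for enabled, _ in loop)
    taken_infinitely_often = any(taken == event for _, taken in loop)
    return taken_infinitely_often or not enabled_infinitely_often


# Philosopher 2 could grab the chopsticks at every other step, but never does.
alternating = [
    ({"phil1-grabs", "phil2-grabs"}, "phil1-grabs"),
    ({"phil1-releases"}, "phil1-releases"),
]

# Philosopher 2 could grab the chopsticks at every step, but never does.
starved = [
    ({"phil1-thinks", "phil2-grabs"}, "phil1-thinks"),
]
```

The first loop is just but not compassionate w.r.t. the event phil2-grabs, since the event is enabled infinitely often but never continuously. The second loop violates both requirements.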
A state proposition p is reachable w.r.t. an initial state t0 if there exists some state
t such that t0 −→* t and p(t) holds. That is: it is possible to reach a p-state. The
difference between a guaranteed property and a reachable property is that the for-
mer requires that a p-state is reached in all possible runs, whereas the latter only
requires that a p-state is reached in some run, as illustrated in Fig. 15.3. There is a
Reachability properties are mostly used to analyze the possibility of reaching bad
states in a specification.
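When a bad state is reachable, one usually also wants a witness: a sequence of steps leading to it. The following Python sketch finds a shortest such path by breadth-first search while remembering the predecessor of each state. The toy system, in which a number n can be rewritten to n + 3 or to 2n as long as the result does not exceed a bound, and all names are hypothetical.

```python
from collections import deque


def successors(n, bound=20):
    """Hypothetical rules: n => n + 3 and n => 2 * n, up to a bound."""
    return [m for m in (n + 3, n * 2) if m <= bound]


def find_witness(initial, succ, p):
    """Return a shortest path from `initial` to a p-state, or None if none exists."""
    parent = {initial: None}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if p(state):
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in succ(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None


witness = find_witness(1, successors, lambda n: n == 11)
no_witness = find_witness(1, successors, lambda n: n == 3)
```

A result of None shows that no p-state is reachable only because the whole (finite) reachable state space has been explored.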
Fig. 15.4 Response: Each yellow state must eventually be followed by a red state
15.2.5 Stability
A state proposition p is stable if it never stops holding after it first holds. For example,
the property “the receiver’s msgsRcvd attribute equals the desired string s” is
a crucial stable property in the alternating bit protocol, which continues its execution
even after the desired state has been reached. Stability, illustrated in Fig. 15.5,
ensures that this result will not be destroyed by remaining actions of the system.
Likewise, “all nodes have the best-valued node as their leader” should be stable in
a setting where nodes do not fail, so that new rounds of the leader election protocol
do not destroy this property.
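On a finite state space, stability can be checked by testing that no reachable p-state has a successor that is not a p-state. The following Python sketch does so for two hypothetical variants of a small transfer system; the graphs and names are assumptions for illustration.

```python
def check_stable(initial, successors, p):
    """Return None if p is stable, else a pair (p-state, non-p successor)."""
    seen = {initial}
    stack = [initial]
    while stack:
        state = stack.pop()
        for nxt in successors(state):
            if p(state) and not p(nxt):
                return (state, nxt)
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return None


def delivered(state):
    return state == "delivered"


# Retransmissions after delivery do not change the result.
keeps_result = {"sending": ["sending", "delivered"], "delivered": ["delivered"]}
# A later step may destroy the result.
loses_result = {"sending": ["delivered"], "delivered": ["delivered", "sending"]}

stable = check_stable("sending", keeps_result.__getitem__, delivered)
unstable = check_stable("sending", loses_result.__getitem__, delivered)
```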
Until. The property “p1 until p2 ” means that, in each behavior from the initial
state(s), each state is a p1 -state until a p2 -state is reached. For example, “there is
a message in the state” until “all nodes have elected the best-valued node as their
leader” should hold in a distributed leader election algorithm, and “there is a mes-
sage in the state” until “all databases have the same updated value” holds in 2PC
without failures. Two variations of the until property are shown in Fig. 15.6:
• Weak until: It is not necessary that a p2 -state is eventually reached, in which case
p1 holds all the time. For example, in the Paxos consensus protocol “the nodes
try to achieve consensus” weak-until “all nodes have agreed on a value” holds.
• Strong until: A p2 -state must eventually be reached in all behaviors.
Fig. 15.6 “The state is yellow” weak-until “the state is red” (left), and “the state is yellow”
strong-until “the state is red” (right)
Exercise 210 Consider the second solution to the dining philosophers problem.
1. Show a behavior in which philosopher 2 could grab both chopsticks infinitely
often, and never does so, but where (s)he cannot continuously grab both chop-
sticks from some point on.
2. Show a behavior in which philosopher 2 from a certain point on continuously
can grab both chopsticks, but never gets to do so.
Which of these behaviors are illegal if we assume compassion w.r.t. the event
“philosopher 2 grabs both chopsticks”? Which is illegal if we assume justice?
Exercise 211 Is the state proposition “philosopher 2 is in state eating” guaran-
teed in the deadlock-free solutions to the dining philosophers problem if we assume
compassion w.r.t. “all events”? Is it guaranteed if we only assume justice?
Exercise 212 Consider the three solutions to the dining philosophers problem and
the corresponding initial states.
1. In which solution(s) is “some philosopher is in state eating” guaranteed?
2. Is “two philosophers are in state eating” guaranteed in any of the solutions?
3. Is “two philosophers are in state eating” guaranteed in any of the solutions if
we assume justice? How about if we assume compassion?
Exercise 213 1. Is the property “the receiver has received all the desired strings”
guaranteed in the transport protocol SEQNO-UNORDERED (or the alternating bit
protocol for that matter) under the “message loss fairness” assumption?
2. What additional fairness requirements (and for which events) are needed to
guarantee that the above property will be reached? Is justice sufficient?
3. Assume message loss fairness, object/rule compassion, and no infinite message
overtaking. Is the above property guaranteed in the sliding window protocol?
Exercise 214 Which is the state proposition whose reachability would show an er-
ror in the two-phase commit protocol? (Remember that a property of the form “. . .
and the state is a final state” is not a state proposition.)
Exercise 215 The reachability of which state proposition would imply that the al-
ternating bit protocol is incorrect?
Exercise 216 Consider the following statements about the Traveling Salesman
problem with the parameters in Exercise 125:
1. The cost of the trip to now (i.e., stored in the current state) is greater than zero.
2. The (incomplete) trip up to the current state can be extended to a completed trip
with total cost ≤ 45.
3. The trip (stored in the current state) will sooner or later end in PhnomPenh.
4. The cost of the trip stored in the current state is greater than 12.
5. The cost of the trip stored in the current state is less than 22.
Which of these statements are state propositions? For each state proposition: is it
invariant, guaranteed, reachable, and/or stable, for the obvious initial state?
Exercise 217 Assume that the initial state satisfies the state proposition p1 . Explain
why it is still not the case that “p2 is guaranteed” and “(p1 , p2 ) is a response
property” are equivalent for this initial state. Does one of these imply the other?
Exercise 218 Consider the following classes of requirements: invariance, guaran-
tee, reachability, response, stability, and strong and weak until. What are the rela-
tionships between these properties? For example, are any of these special cases of
others? Does any of them imply any others?
Exercise 219 Explain how the “no deadlock” requirement can be seen as a special
case of the “all final states must satisfy p” requirement.
• There is no bound on the number of objects that can be created. For example, in
our population examples, there is no limitation on how many new Person objects
can be created by the rule birth.
Using search to analyze invariance has some limitations:
• It cannot be used to prove that a state proposition is an invariant if the reachable
state space is infinite.
• Invariance can only be analyzed for single initial states, not for infinite sets of
admissible initial states.
For example, Maude search cannot even prove that “two neighboring philosophers
do not eat at the same time” is an invariant for the single initial state with 5 philosophers. Even if
we remove the #eats attribute, which is the source of the infinite state space, search
cannot prove this invariant for any number of philosophers. Search cannot even
prove that “the total number of points scored is greater than 10” is an invariant when
starting with initial state "Steelers" vs "Ravens" 9 : 3. If we limit the scoring
in a football game, we would like to prove that the above property is invariant w.r.t.
all initial states with at least 11 points scored. This is impossible using search.
We can instead prove inductively “by hand” that a state proposition p is an
invariant w.r.t. a set I of initial states by proving that:
• Each initial state s0 ∈ I satisfies p.
• For each rewrite rule r: if t −→ t′ is a one-step rewrite using rule r, and t and t′
are ground terms such that t is a p-state, then t′ must also be a p-state.
Example 15.14. Let us prove that “the number of points scored is greater than 10”
is an invariant w.r.t. all initial states (i.e., all ground terms of sort Game) where at
least 11 points have been scored.
• Each initial state has at least 11 points scored, and therefore satisfies the property.
• Any ground term of sort Game has the form a vs b m : n. Any initial state then
has this form, where, in addition, m + n > 10. Assume that we apply the rule
touchdown-home to any such state; the resulting state is a vs b (m + 6) : n,
which also satisfies the desired property since m + 6 + n > 10 holds when we
assume that m + n > 10. In this same way, we can show that each rewrite rule
preserves the desired formula.
We have therefore proved that the formula is an invariant in ONE-FOOTBALL-GAME
for all possible initial states with at least 11 points scored. ♦
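The inductive step of such a proof can be sanity-checked by brute force before attempting the proof by hand. The following Python sketch applies every scoring rule to all scores up to a bound and reports the cases where a p-state is rewritten to a non-p state. Only the rule name touchdown-home is taken from the text; the other rule names and their point values are assumptions based on the usual football scoring. A bounded test like this can reveal that a property is not inductive, but it does not replace the proof.

```python
# Points added to (home, away) by each rule; names other than
# touchdown-home are hypothetical.
RULES = {
    "touchdown-home": (6, 0),
    "touchdown-away": (0, 6),
    "field-goal-home": (3, 0),
    "field-goal-away": (0, 3),
    "safety-home": (2, 0),
    "safety-away": (0, 2),
    "extra-point-home": (1, 0),
    "extra-point-away": (0, 1),
    "two-point-conversion-home": (2, 0),
    "two-point-conversion-away": (0, 2),
}


def inductive_step_counterexamples(p, bound=30):
    """All (rule, before, after) with p(before) but not p(after), for scores <= bound."""
    found = []
    for m in range(bound + 1):
        for n in range(bound + 1):
            if not p(m, n):
                continue
            for rule, (dm, dn) in RULES.items():
                if not p(m + dm, n + dn):
                    found.append((rule, (m, n), (m + dm, n + dn)))
    return found


def more_than_ten(m, n):
    return m + n > 10


def fewer_than_twenty(m, n):
    return m + n < 20


preserved = inductive_step_counterexamples(more_than_ten)
violated = inductive_step_counterexamples(fewer_than_twenty)
```

No rule can take a state with more than 10 points to a state with at most 10 points, whereas “fewer than 20 points have been scored” is not preserved.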
If p is not inductive, that is, p is not strong enough to prove that p(t) ∧ t −→ t′
implies p(t′), we can strengthen p to a property p′. The property p is then an invariant if: (i) the
strengthened version p′ is an invariant, and (ii) p′ implies p. Exercise 224 is an
exercise where the desired property must be strengthened in this way.
Exercise 220 Assume that the reachable state space from the single initial state is
finite. Invariance and reachability can be analyzed using Maude’s search command.
Explain why guarantee requirements cannot be analyzed using Maude’s search
command. How about response, stability, and until requirements?
1 Edmund Clarke, Allen Emerson, and Joseph Sifakis received the Turing Award in 2007 for their
pioneering work on temporal logic model checking.
© Springer-Verlag London 2017
P.C. Ölveczky, Designing Reliable Distributed Systems, Undergraduate Topics
in Computer Science, DOI 10.1007/978-1-4471-6687-0_16
16 Formalizing and Checking Requirements
Example 16.2. In rewriting logic, the formulas have the form t −→ u. As for the
semantics, the models were just briefly mentioned in Section 8.6, and the proof
system is given in Section 8.4. ♦
This section presents the syntax of LTL (defining the set of LTL formulas) and
its semantics. This book focuses on using model checking to automatically check
whether a property is satisfied. We are less interested in coming up with a proof
that a formula holds, and therefore do not provide a proof system for LTL, although
sound and complete proof systems exist for LTL.
16.1.1 Behaviors
In this chapter we assume that each behavior from an initial state t0 is an infi-
nite sequence of one-step sequential rewrites. This assumption avoids having to
define many concepts twice: one for finite behaviors and one for infinite behav-
iors. The point is that any finite behavior t0 −→ t1 −→ · · · −→ tn , where tn cannot
be further rewritten, can be extended to an infinite sequence (also called a path)
t0 −→ t1 −→ · · · −→ tn −→ tn −→ · · · −→ tn −→ · · · by just adding a self-loop
from any deadlocked state tn . Maude’s model checker does this automatically.
t0 −→ t1 −→ t2 −→ t3 −→ · · ·
of one-step sequential rewrites ti −→ ti+1 in R . The set of all such behaviors starting
with t0 is denoted paths_R(t0). If π is the above behavior and k ∈ N, then
• π(k) = t_k (the (k + 1)-th state in the path), and
• π^k = t_k −→ t_{k+1} −→ t_{k+2} −→ · · · (the rest of the behavior from state t_k).
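An infinite behavior that eventually repeats a finite loop of states can be represented finitely as a prefix followed by a loop. The following Python sketch uses this representation to illustrate the k-th state and the k-th suffix of a path, and how a finite behavior ending in a deadlocked state is extended with a self-loop. The class and function names are hypothetical, and the sketch only covers such eventually repeating behaviors.

```python
class Lasso:
    """The infinite path prefix[0], ..., prefix[-1], then loop repeated forever."""

    def __init__(self, prefix, loop):
        if not loop:
            raise ValueError("the loop must contain at least one state")
        self.prefix = list(prefix)
        self.loop = list(loop)

    def state(self, k):
        """The state at position k of the path (positions start at 0)."""
        if k < len(self.prefix):
            return self.prefix[k]
        return self.loop[(k - len(self.prefix)) % len(self.loop)]

    def suffix(self, k):
        """The rest of the path from position k."""
        if k <= len(self.prefix):
            return Lasso(self.prefix[k:], self.loop)
        shift = (k - len(self.prefix)) % len(self.loop)
        return Lasso([], self.loop[shift:] + self.loop[:shift])


def extend_deadlocked(finite_behavior):
    """Turn t0 -> ... -> tn, with tn deadlocked, into an infinite path."""
    return Lasso(finite_behavior[:-1], [finite_behavior[-1]])


pi = extend_deadlocked(["t0", "t1", "t2"])
cycle = Lasso(["a"], ["b", "c"])
```

For every k and j, the state at position j of the k-th suffix equals the state at position k + j of the original path.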
The basic building blocks in linear temporal logic formulas are atomic propositions.
In a state-based logic, an atomic proposition p is a state proposition, which is either
true or false in a state t of sort State, as explained in Section 15.1.2.
Example 16.3. Consider the specification ONE-PERSON in Section 8.2.3, which spec-
ifies the life of a single person. The designated sort State is Person, and some
examples of state propositions are alive, dead, and teenager. ♦
Example 16.5. A useful family of state propositions, one for each pair of agents a
and b, for NSPK is a_hasTrustedConnectionWith_b. ♦
Linear temporal logic then adds to the atomic propositions the usual Boolean
connectives ¬ (“not”), ∧ (“and”), ∨ (“or”), → (“implies”), and ↔ (“if and only
if”), and the following temporal operators: □, ♦, ○, U , and W . Intuitively,
• the formula □ ϕ holds in a path π if the formula ϕ holds everywhere in the path;
• the formula ♦ ϕ holds in a path if ϕ holds somewhere in the path;
• the formula ○ ϕ holds if ϕ holds in the next position/state in the path;
• the formula ϕ U ψ holds in a path if the formula ψ holds somewhere in the path,
and all positions in the path up to that point satisfy the formula ϕ ; and
• ϕ W ψ is similar to ϕ U ψ , except that it is possible that ψ never holds in the
path (in which case the formula ϕ must hold everywhere in the path).
Definition 16.2 Given a set AP of atomic propositions, the set of linear temporal
logic (LTL) formulas is defined inductively as follows:
• true and false are LTL formulas;
• any state proposition p ∈ AP is an LTL formula;
• if both ϕ and ψ are LTL formulas, then the following are also LTL formulas:
– ¬ϕ (not ϕ )
– ϕ ∧ψ (ϕ and ψ )
– ϕ ∨ψ (ϕ or ψ )
– ϕ →ψ (ϕ implies ψ )
– ϕ ↔ψ (ϕ if and only if ψ )
– □ ϕ (always ϕ )
– ♦ϕ (eventually ϕ )
– ϕU ψ (“ϕ (strong-) until ψ ”)
– ϕW ψ (“ϕ weak-until ψ ”)
– ○ ϕ (ϕ holds in the next state)
Example 16.6. Examples of temporal logic formulas involving the atomic proposi-
tions in Examples 16.3–16.5 are:
1. □ alive (the person is always alive).
2. ♦ dead (sooner or later a state where the person is dead must be reached).
3. alive U dead (the person is continuously alive until she becomes dead).
4. alive W dead (as above, except that the person could live forever).
266 16 Formalizing and Checking Requirements
The only operators needed are true, p, ¬, ∧, U , and ○. The other operators can
be defined in terms of these. For example, ϕ ∨ ψ can be seen as an abbreviation of
¬((¬ϕ ) ∧ (¬ψ )), ϕ → ψ as an abbreviation of (¬ϕ ) ∨ ψ , and ϕ ↔ ψ
as an abbreviation of (ϕ → ψ ) ∧ (ψ → ϕ ). Likewise, as explained
below, the temporal logic operators □, ♦, and W can be defined in terms of U .
The formulas ϕ and ψ can themselves be LTL formulas, which means that we
can have nested formulas such as □ (p → ♦ q). (What does this formula mean?)
To formally define the meaning of LTL, i.e., to define whether an LTL formula ϕ
holds in a specification R with initial state t0 , we must first define what it means
that an atomic proposition holds in a state. A labeling function maps each state to
those atomic propositions which hold in the state:
Example 16.7. The obvious labeling function L in Example 16.3 gives us:
• L(person("Peter", 46, married)) = {alive} and
• L(person("Joan of Arc", 19, deceased)) = {dead, teenager}. ♦
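A labeling function is just a map from states to sets of propositions. The following Python fragment is a minimal sketch of the labeling in Example 16.7; the `Person` record, the function name `label`, and the reading of "teenager" as ages 13 to 19 are assumptions made for illustration, not part of the ONE-PERSON specification.

```python
from typing import NamedTuple


class Person(NamedTuple):
    name: str
    age: int
    status: str  # e.g. "single", "married", "deceased"


def label(p: Person) -> frozenset:
    """Return the set of atomic propositions that hold in state p."""
    props = {"dead" if p.status == "deceased" else "alive"}
    if 13 <= p.age <= 19:  # assumed meaning of "teenager"
        props.add("teenager")
    return frozenset(props)


def holds(p: Person, prop: str) -> bool:
    """p |= prop, i.e., prop is an element of L(p)."""
    return prop in label(p)
```

With these definitions, `label(Person("Peter", 46, "married"))` is `{"alive"}` and `label(Person("Joan of Arc", 19, "deceased"))` is `{"dead", "teenager"}`, as in the example.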
Definition 16.4 Let R be a rewrite theory with a specific sort State denoting the
sort of the states. Then R^•_State is R , except that there is a rewrite t −→ t for each
deadlocked state t ∈ T_{Σ,State} .
R , State, L,t0 |= ϕ
Notation: We often omit the state sort State and/or the labeling function L from
R , State, L,t0 |= ϕ and R , L, π |= ϕ .
We define what it means that an (infinite) path π satisfies a formula ϕ inductively
on the structure of ϕ :
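As a reminder, the clauses of the usual textbook semantics of LTL can be written as follows in the notation of this section, with π(0) the first state of π and π^k the suffix of π from position k. This is a sketch of the standard definition and may differ in presentation from the author's own.

```latex
\begin{align*}
\mathcal{R}, L, \pi &\models \mathit{true} \\
\mathcal{R}, L, \pi &\models p
  &&\iff p \in L(\pi(0)) \\
\mathcal{R}, L, \pi &\models \neg\varphi
  &&\iff \mathcal{R}, L, \pi \not\models \varphi \\
\mathcal{R}, L, \pi &\models \varphi \wedge \psi
  &&\iff \mathcal{R}, L, \pi \models \varphi
      \text{ and } \mathcal{R}, L, \pi \models \psi \\
\mathcal{R}, L, \pi &\models \bigcirc\varphi
  &&\iff \mathcal{R}, L, \pi^{1} \models \varphi \\
\mathcal{R}, L, \pi &\models \varphi \,\mathcal{U}\, \psi
  &&\iff \exists k \in \mathbb{N}.\;
      \mathcal{R}, L, \pi^{k} \models \psi
      \text{ and } \forall i < k.\; \mathcal{R}, L, \pi^{i} \models \varphi \\
\mathcal{R}, L, \pi &\models \Box\varphi
  &&\iff \forall k \in \mathbb{N}.\; \mathcal{R}, L, \pi^{k} \models \varphi \\
\mathcal{R}, L, \pi &\models \Diamond\varphi
  &&\iff \exists k \in \mathbb{N}.\; \mathcal{R}, L, \pi^{k} \models \varphi
\end{align*}
```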
We can visualize this definition as follows, where we write below each state/po-
sition the subformula holding in the rest of the path starting in that state:
t0 −→ t1 −→ t2 −→ t3 −→ t4 −→ · · ·    satisfies □ ϕ
ϕ     ϕ     ϕ     ϕ     ϕ     · · ·
t0 −→ t1 −→ t2 −→ · · · −→ tk −→ · · ·    satisfies ♦ ϕ
                           ϕ
t0 −→ t1 −→ t2 −→ t3 −→ · · ·    satisfies ○ ϕ
      ϕ
It is worth emphasizing that temporal formulas are not evaluated on states, but
on (sub)paths starting at certain positions. We say that ϕ holds at position j in π if
and only if ϕ holds in π^j . Notice that if the first position in the path satisfies ϕ , then
♦ ϕ and ψ U ϕ both hold.
a      −→ a      −→ b      −→ a      −→ a      −→ · · ·
isA       isA       isB       isA       isA       · · ·
¬ ○ isB   ○ isB     ¬ ○ isB   ¬ ○ isB   ¬ ○ isB   · · ·
♦ isB     ♦ isB     ♦ isB     ¬ ♦ isB   ¬ ♦ isB   · · · ♦
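The position-wise evaluation in the table above can be reproduced mechanically when the path is ultimately periodic (a finite prefix followed by a loop that repeats forever). The following Python fragment is a minimal sketch under that assumption; the tuple encoding of formulas and the names `sat` and `loop_start` are made up for illustration and have nothing to do with Maude's model checker.

```python
def sat(formula, labels, loop_start):
    """Positions of a lasso path whose suffix satisfies `formula`.

    labels[i] is the set of propositions at position i; after the last
    position the path jumps back to position loop_start and repeats.
    Formulas are nested tuples such as ("next", ("ap", "isB")).
    """
    n = len(labels)
    every = set(range(n))

    def succ(i):
        return i + 1 if i + 1 < n else loop_start

    def suffix_positions(i):
        # All positions that occur in the path suffix starting at i.
        seen = set()
        while i not in seen:
            seen.add(i)
            i = succ(i)
        return seen

    def go(f):
        op = f[0]
        if op == "true":
            return set(every)
        if op == "ap":
            return {i for i in every if f[1] in labels[i]}
        if op == "not":
            return every - go(f[1])
        if op == "and":
            return go(f[1]) & go(f[2])
        if op == "next":
            target = go(f[1])
            return {i for i in every if succ(i) in target}
        if op == "always":      # holds everywhere in the suffix
            target = go(f[1])
            return {i for i in every if suffix_positions(i) <= target}
        if op == "eventually":  # holds somewhere in the suffix
            target = go(f[1])
            return {i for i in every if suffix_positions(i) & target}
        if op == "until":       # least fixpoint of the unfolding of U
            left, result = go(f[1]), go(f[2])
            changed = True
            while changed:
                changed = False
                for i in every - result:
                    if i in left and succ(i) in result:
                        result.add(i)
                        changed = True
            return result
        raise ValueError(f"unknown operator: {op}")

    return go(formula)


# The path a, a, b, a, a, ... where the final a-state repeats forever.
LABELS = [{"isA"}, {"isA"}, {"isB"}, {"isA"}]
IS_A, IS_B = ("ap", "isA"), ("ap", "isB")
```

For this path, `sat(("next", IS_B), LABELS, 3)` is `{1}` and `sat(("eventually", IS_B), LABELS, 3)` is `{0, 1, 2}`, which agrees with the last two rows of the table.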
As already mentioned, the operators W , □, and ♦ are not strictly necessary, since
they can be defined in terms of U :
• ♦ ϕ can be defined as true U ϕ (why?);
• ϕ W ψ can be defined as (ϕ U ψ ) ∨ □ ϕ ; and
• □ ϕ can be defined in terms of U and the Boolean operators (see Exercise 227).
We have defined the meaning of LTL formulas in terms of rewrite theories. However,
LTL formulas can also talk about behaviors specified using other formalisms (such
as Petri nets, automata, process algebras, etc.). The semantics of LTL is therefore
usually defined on a more abstract model called a Kripke structure.
A rewrite theory R = (Σ , E, R) with designated state sort State and labeling func-
tion L defines a Kripke structure (TΣ ,E State , −→• , L) in the obvious way, where:
• the set of states consists of the (E-equivalence classes of) ground terms of sort State;
• the transition relation −→• is the one-step sequential rewrite relation on the states
extended with transitions t −→• t for deadlocked states; and
• L is the labeling function. Notice that for L to be a well-defined function (that
is, assigning to each E-equivalence class of terms a single set of propositions
holding in that equivalence class), E-equivalent states must be equivalent under
L, which was assumed above.
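For a finite state space, the construction above can be sketched in a few lines of Python. The dictionary-based representation and the names `add_deadlock_loops`, `STATES`, `STEP`, and `LABELING` are assumptions chosen for illustration; the three states are hypothetical and do not come from any specification in the book.

```python
def add_deadlock_loops(states, transitions):
    """Return a total transition relation: every deadlocked state
    (one without successors) gets a self-loop t --> t."""
    total = {s: set(transitions.get(s, ())) for s in states}
    for s in states:
        if not total[s]:
            total[s].add(s)
    return total


# A hypothetical three-state system in which t2 is deadlocked.
STATES = {"t0", "t1", "t2"}
STEP = {"t0": {"t1"}, "t1": {"t0", "t2"}}
LABELING = {"t0": {"p"}, "t1": set(), "t2": {"q"}}

# The Kripke structure is the triple (states, total relation, labeling).
KRIPKE = (STATES, add_deadlock_loops(STATES, STEP), LABELING)
```

After this completion every state has at least one successor, so every behavior can be extended to an infinite path, which is what the LTL semantics requires.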
Exercise 225 1. Explain why it could be the case that neither R , L,t0 |= ϕ nor
R , L,t0 |= ¬ϕ holds.
2. Prove that it is always the case that either R , L, π |= ϕ or R , L, π |= ¬ϕ holds.
Exercise 226 Consider the formulas in Example 16.6. You can assume that the un-
derlying specifications have been suitably completed, e.g., with rules for divorce.
1. Which of the formulas hold for the “standard” initial states?
2. For each formula, give the set of initial states for which the formula holds.
3. Give some examples of formulas ϕ and initial states t0 such that neither
R , L,t0 |= ϕ nor R , L,t0 |= ¬ϕ holds.
4. Define other useful atomic propositions and LTL formulas.
Exercise 227 Define □ ϕ in terms of U and the Boolean operators. Hint: Remember
that ♦ can be defined by U , and then define □ ϕ in terms of ♦.
This section discusses different LTL formulas, including the formalization of the
different classes of properties mentioned in Chapter 15 and fairness assumptions.
This section formalizes the properties in Chapter 15 and discusses other properties.
(Those properties talk about state formulas, which are LTL formulas without temporal operators.)
Stability. Stability, which means that a property continues to hold forever once it
starts holding, can be formalized as the property □ (ϕ → □ ϕ ). That is, □ ϕ must
hold whenever ϕ holds. For example: □ (dead → □ dead).
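[Diagram: a path t0 −→ t1 −→ · · · −→ tk −→ tk+1 −→ tk+2 −→ · · · annotated with ϕ , ¬ϕ , ¬ϕ , · · · below its states.]
[Diagram: a path t0 −→ · · · −→ tk −→ tk+1 −→ tk+2 −→ · · · −→ tl −→ · · · satisfying □ ♦ ϕ , which means that ♦ ϕ holds at every position; in particular, ♦ ϕ holds right after position k, which means that ϕ holds at position l for an l > k.]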
Recall that certain fairness assumptions are often necessary to prove that any kind
of progress will be made by excluding obviously “unfair” behaviors in which, for
example, messages are created and dropped all the time, or in which a person con-
tinuously marries and divorces all the time without even having time to celebrate
her birthday. We mention two classes of fairness assumptions in Chapter 15:
• Compassion: If an event could be taken infinitely often, it should be taken
infinitely often.
• Justice: An event cannot be enabled continuously from some point on without
being taken infinitely often. (See also Exercise 232.)
If the formulas e_enabled and e_taken denote, respectively, that a certain event e is
enabled and taken, then compassion fairness with respect to the event e can be
expressed as the LTL formula
(□ ♦ e_enabled ) → □ ♦ e_taken
and justice fairness with respect to e as the LTL formula
(♦ □ e_enabled ) → □ ♦ e_taken .
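On an ultimately periodic path, only the loop decides whether something happens "infinitely often" or "continuously from some point on". The following Python fragment is a minimal sketch of the two fairness notions under that assumption; the proposition names "enabled" and "taken" and all function names are chosen for illustration.

```python
def infinitely_often(prop, loop):
    """prop holds infinitely often iff it holds somewhere in the loop."""
    return any(prop in labels for labels in loop)


def eventually_always(prop, loop):
    """prop holds continuously from some point on iff it holds
    everywhere in the loop."""
    return all(prop in labels for labels in loop)


def compassion(loop):
    """Infinitely often enabled implies infinitely often taken."""
    return (not infinitely_often("enabled", loop)
            or infinitely_often("taken", loop))


def justice(loop):
    """Continuously enabled from some point on implies infinitely
    often taken."""
    return (not eventually_always("enabled", loop)
            or infinitely_often("taken", loop))


# The event is enabled in every second state of the loop, never taken.
INTERMITTENT = [{"enabled"}, set()]
```

The loop `INTERMITTENT` is just but not compassionate: the event is never continuously enabled, so justice holds vacuously, whereas compassion demands that the event be taken.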
Example 16.9. For the dining philosophers, one compassion fairness condition on
the application of the rules could be that if it happens infinitely often that philoso-
pher number 2 already has one chopstick and the other chopstick is free, then this
philosopher should be able to eat infinitely often:
Justice, however, would not help our philosopher much (why not?). In an unfair
world, philosopher 2 may not even become hungry, since she could be thinking
forever while other philosophers are doing stuff continuously. Justice is enough to
ensure that philosopher 2 becomes hungry:
(♦ □ phil2thinking) → □ ♦ phil2hungry,
If your LTL model checker does not support fairness, and you can encode your
fairness assumptions as a formula ψ , you can model check the desired property ϕ
under the fairness assumption by analyzing the formula ψ → ϕ instead.
One problem is that, since we use a state-based logic, e_taken cannot be defined
directly, but must be defined by considering the effect of performing the event e,
if possible. In Example 16.9 the event performed is “philosopher 2 applies the rule
grabSecond,” and the effect of performing this event is that philosopher 2 is in
state eating. In Section 16.3.5 we model check the central server mutual exclusion
algorithm in Maude, and formalize all of its fairness assumptions in LTL.
Exercise 228 Why are the following formalizations of the response property wrong?
1. ϕ → ♦ ψ
2. (ϕ → ψ )
Exercise 230 Explain why obtaining a counterexample from model checking the
formula (hasValidLotteryTicket → ¬isMillionaire) does not imply
that the desired “may-lead-to” requirement holds.
Exercise 231 Two LTL formulas ϕ and ψ are equivalent if they are evaluated in the
same way in every possible path π . For example, ¬ ♦ ϕ and □ ¬ ϕ are equivalent:
• Assume that a path π satisfies ¬ ♦ ϕ . This means that a ϕ -position is never
reached in the path, which of course means that all positions in the path satisfy
¬ ϕ , which again means that the whole path satisfies □ ¬ ϕ .
• The other way: Assume that a path ρ satisfies □ ¬ ϕ . This means that all positions
in the path are ¬ ϕ positions, and hence nowhere do we reach a ϕ -position, and
therefore ¬ ♦ ϕ holds.
For each of the following pairs of LTL formulas (the last four of which are borrowed
from [75]), determine whether the two formulas in the pair are equivalent. If not,
show a path where one formula holds and the other formula does not hold. Does
one of the formulas imply the other?
16.2 Some LTL Formulas 273
1. ϕ and ϕ
2. ♦ ϕ and ♦ ϕ
3. ( ϕ ) → ψ and (ϕ → ψ )
4. ((♦ ϕ ) → ♦ ψ ) and (ϕ → ♦ ψ )
5. (♦ ϕ ) ∧ ( ψ ) and ♦ (ϕ ∧ ψ )
6. (♦ ϕ ) ∧ (♦ ψ ) and ♦ (( ϕ ) ∧ ( ψ ))
7. (ϕ U ψ ) ∧ (ψ U θ ) and (ϕ U θ )
8. ( ϕ ) ∧ (♦ ψ ) and ϕ W (♦ ψ )
Exercise 232 The justice property is often (including in Chapter 15) defined as “if,
from a certain point on, an event is continuously enabled, then it must be taken,”
which directly translates to the LTL formula □ ((□ e_enabled ) → ♦ e_taken ). Is this
formula equivalent to (♦ □ e_enabled ) → (□ ♦ e_taken )? Why/why not?
Exercise 233 (From [75]; tricky?) We can define the before operator B by ϕ B ψ =
(¬ ψ ) W (ϕ ∧ ¬ ψ ). That is, the first occurrence of ϕ comes strictly before the first
occurrence of ψ . Define U in terms of B and the Boolean connectives; that is,
without using any temporal operator except B .
mod MODEL-CHECK-MY-SPEC is
protecting MY-SPEC . including MODEL-CHECKER .
subsort s < State .
--- declare and define atomic propositions
--- and define complex formulas, if any
endm
When using Full Maude, this module should be enclosed between parentheses.
Next we need to define the meaning of the atomic propositions; i.e., the labeling
function L. This is done by defining the built-in function
op _|=_ : State Prop -> Bool [frozen] .
so that t |= p evaluates to true whenever p ∈ L(t). That is, we need to define the
states in which p holds. It is not necessary to define explicitly the cases when the
propositions do not hold. For example:
var X : String . vars M N : Nat . var S : Status .
These equations also define the false cases; since this is not strictly needed, the
second and fourth equation could have been replaced by
eq person(X, N, deceased) |= dead = true .
eq person(X, N, S) |= is N yearsOld = true .
The syntax of formulas is therefore pretty much the typewriter version of LTL formulas,
with True for true (to avoid confusion with the Boolean true), ~ for ¬
(negation), /\ for ∧, [] for □, <> for ♦, and so on.
If the formula ϕ does not hold in all paths from the initial state t0 , a counterexample
Example 16.10. The following command checks whether the formula alive U dead
holds in ONE-PERSON from initial state person("Methuselah", 999, single):
Maude> red modelCheck(person("Methuselah", 999, single),
alive U dead) .
result ModelCheckResult:
counterexample(
{person("Methuselah", 999, single), 'birth-day}
{person("Methuselah", 1000, single), 'birth-day}
{person("Methuselah", 1001, single), 'successful-proposal} ,
{person("Methuselah", 1001, engaged), 'marriage}
{person("Methuselah", 1001, married), 'separation}
{person("Methuselah", 1001, separated), 'divorce}
{person("Methuselah", 1001, divorced), 'successful-proposal})
The counterexample shows that after becoming 1001 years old and engaged, Methu-
selah spends his remaining days marrying, separating, divorcing, proposing, remar-
rying, separating, and so on, forever and ever.
Although death may not be certain, it should be a stable property:
Maude> red modelCheck(person("Peter", 46, single),
[] (dead -> [] dead)) .
A person should be able to reach any age or be dead. This property does not hold
in our specification due to possible loops of marriages, divorces, and remarriages.
Since such behaviors are unrealistic, we can assume justice fairness on the appli-
cation of the birth-day rule. However, one fairness condition is needed for each
birthday event, so that the fairness assumption is formalized as the formula
((<> [] (alive /\ is 0 yearsOld)) -> <> (is 1 yearsOld)) /\
((<> [] (alive /\ is 1 yearsOld)) -> <> (is 2 yearsOld)) /\
((<> [] (alive /\ is 2 yearsOld)) -> <> (is 3 yearsOld)) /\
...
((<> [] (alive /\ is 1000 yearsOld)) -> <> (is 1001 yearsOld)) .
Fortunately, we can exploit the fact that Maude allows us to (i) specify parametric
atomic propositions; and (ii) define more complex formulas equationally, to specify
a function fairBirthdays(currAge, desiredAge), which defines the fairness no-
tions for all birthday events between age currAge and desiredAge:
op fairBirthdays : Nat Nat -> Formula .
vars M N : Nat .
ceq fairBirthdays(M, N)
 = ((<> [] (alive /\ is M yearsOld)) -> <> (is M + 1 yearsOld))
 /\ fairBirthdays(M + 1, N) if M < N .
eq fairBirthdays(N, N) = True .
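To see which formula these equations unfold to, the following Python fragment builds the same conjunction as a string. It is only an illustration of the recursion, written iteratively; the function name `fair_birthdays` is made up, and, unlike the Maude equations, the sketch returns True when the current age exceeds the desired age.

```python
AND = "/\\"  # the two characters / and \, Maude's typewriter form of conjunction


def fair_birthdays(curr_age, desired_age):
    """Build the text of fairBirthdays(curr_age, desired_age).

    One justice conjunct is produced for each birthday from curr_age
    up to (but not including) desired_age, followed by True.
    """
    conjuncts = [
        f"((<> [] (alive {AND} is {m} yearsOld)) -> <> (is {m + 1} yearsOld))"
        for m in range(curr_age, desired_age)
    ]
    conjuncts.append("True")
    return f" {AND} ".join(conjuncts)


if __name__ == "__main__":
    print(fair_birthdays(0, 2))
```

For instance, `fair_birthdays(0, 2)` consists of the conjuncts for the birthdays 0 and 1 followed by `True`, and `fair_birthdays(2, 2)` is just `True`.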
In this section we analyze the central server mutual exclusion algorithm, but where
each process loops forever (see Exercise 184; the only difference w.r.t. the specifi-
cation in Section 13.2 is that a processor goes to state beforeCS instead of afterCS
when exiting its critical section). Such distributed mutual exclusion should achieve:
(i) two processes are never in the critical section at the same time; (ii) each process
executes infinitely often in its critical section; and (iii) the processes access their
critical sections in the order in which they wanted to enter it.
Requirement (i) is an invariant which is analyzed using search in Section 13.2.
In this section we analyze the two other requirements.
Requirement (ii).
We start by checking whether each process is infinitely often in its critical section.
The following parametric state propositions beforeCS(o), waiting(o), and
inCS(o) hold when node o is, respectively, executing outside its critical section,
blocked waiting to enter its critical section, and executing inside its critical section:
(omod MODEL-CHECK-MUTEX-LOOP is
protecting MUTEX-WITH-CENTRAL-SERVER-INITIAL-STATE .
including MODEL-CHECKER .
We therefore only consider just paths, and must define the just use of the rewrite
rules for an object o.
We first consider rule requestAccessToCS in Section 13.2, where a node in
state beforeCS sends a request to the central server asking for access to the critical
section. The result of applying this rule is that the node is waiting; the justice
assumption therefore says that the rule cannot be continuously enabled for a node o
(i.e., the node satisfies beforeCS(o)) without being taken infinitely often:
The rules grantAccess and putInWaitQueue in Section 13.2 define how the
central server reads request messages. The problem is that the server may always
choose to read requests from other nodes, ignoring all requests from an unlucky
node. In this case, not only is rule fairness required (is it really required?), but also
fairness concerning which message the server reads.
We must define that if a request from object o is in the state forever, then it
must eventually be read. The following formalization is based on the fact that there
should never be more than one request message from the same node in the state. The
desired communication fairness assumption reqMsgFairness(o) then just says that
a request message from o cannot be in the state continuously from some point on:
op reqMsgFairness : Oid -> Formula .
op reqFrom_inState : Oid -> Prop [ctor] .
eq REST (msg requestCS from O to server) |= reqFrom O inState = true .
eq reqMsgFairness(O) = ~ (<> [] reqFrom O inState) .
Since this definition relies on the fact that there are not multiple requests from the
same node in the state, we should first verify this fact:
Maude> (search [1] init(4) =>* (msg requestCS from O:Oid to server)
(msg requestCS from O:Oid to server)
REST:Configuration .)
No solution.
We can now check Requirement (ii) for node(3), assuming justice fairness for
node(3) (the rules may be applied justly or unjustly w.r.t. node(1) and node(2)):
Maude> (red modelCheck(init(3),
justice(node(3)) -> [] <> inCS(node(3))) .)
We can also check the requirement for all three nodes in one shot:
Maude> (red modelCheck(init(3),
(justice(node(1)) /\ justice(node(2))
/\ justice(node(3)))
-> (([] <> inCS(node(1))) /\ ([] <> inCS(node(2)))
/\ ([] <> inCS(node(3))))) .)
Requirement (iii).
The nodes should access the critical section in the order in which they request it.
This should also hold when the nodes execute forever.
We first define the before operator on LTL formulas (see Exercise 233):
op _before_ : Formula Formula -> Formula .
eq P before Q = (~ Q) W (P /\ ~ Q) .
The point is that when two nodes send a request to the server, the server may not
read the first request until the second arrives, and the server may then choose to
read any of these multiple requests. Therefore, Requirement (iii) can only hold if
the server reads request messages in the order in which they were sent.
The following formula orderedReqRead(o1 , o2 ) states that if there is a request
from node o1 , but not from node o2 , in the state, then the request from o1 will be
read (i.e., disappear from the state) before a possible message from o2 :
op orderedReqRead : Oid Oid -> Formula .
eq orderedReqRead(O1, O2)
 = (reqFrom O1 inState /\ ~ reqFrom O2 inState)
W (~ reqFrom O1 inState
\/ ((reqFrom O1 inState /\ reqFrom O2 inState)
W (~ reqFrom O1 inState))) .
We can then check the desired property (for three pairs of nodes) when the server
reads requests in order:
Exercise 235 Explain why fairness assumptions on the application of the last four
rewrite rules of the central server mutual exclusion algorithm are not needed to
prove Requirement (ii).
Exercise 236 Consider the token ring mutual exclusion protocol in Exercise 185,
item 6, where each process executes forever. Use LTL model checking to show that
each process can start executing inside its critical section infinitely many times.
What justice assumptions are needed? Are any compassion assumptions necessary?
The counterexample shows a path which demonstrates that the formula does not
hold; first the initial segment and then a loop: the first position satisfies ¬ P, the
second position satisfies P ∧ ¬ Q, and the loop part does not matter (True).
∀ □ (hasValidLotteryTicket → ∃ ♦ isMillionaire).
On the other hand, CTL cannot formalize fairness assumptions (which concern
paths); LTL and CTL are therefore incomparable in expressiveness and have dif-
ferent strengths and weaknesses [107]. The logic CTL* extends both CTL and LTL.
(p → q) (p S q) ♦ p p.
Exercise 240 Specify all chess moves in Maude (if you want). Explain why you
cannot express “white wins in two moves” in LTL. Can you express it in CTL?
17 Real-Time and Probabilistic Systems
The previous chapters abstract away timed and probabilistic aspects of distributed
systems. This chapter briefly explains how real-time systems (Section 17.1) and
probabilistic systems (Section 17.2) can be modeled and analyzed in rewriting logic.
Real-time systems are systems where the duration of/between events affects the
functionality of the system. This book abstracts from real-time features in its treat-
ment of the two-phase commit protocol in Section 13.1, where, instead of using
time-outs to determine whether a message has been lost, we assume that messages
of certain types are never lost. More generally, message duplication is an abstrac-
tion for re-sending a message when the sender has not received feedback from the
receiver for some time.
However, real-time features cannot be abstracted away in many distributed sys-
tems, for example for the following reasons:
• Fault-tolerant systems must estimate whether messages are lost and/or whether
other nodes are down, and must take appropriate action if so. However, it is
impossible to check message loss and node crashes without taking time into
account. Using time, a node can assume that a message was lost or that another
node crashed if it has not gotten a reply within a certain time bound.
• Time is a key parameter in many distributed algorithms and protocols, for
example to fine-tune performance.
• Most computer systems today, from those in toasters and cars to airplanes, are
embedded systems, where processors interact with some physical devices/envi-
ronments. Such systems tend to be time-critical: an action that happens at the
wrong time could have unfortunate consequences.
© Springer-Verlag London 2017
P.C. Ölveczky, Designing Reliable Distributed Systems, Undergraduate Topics
in Computer Science, DOI 10.1007/978-1-4471-6687-0_17
284 17 Real-Time and Probabilistic Systems
The operator {_} can be defined as follows when the states are configurations:
(omod OO-TIMED-PRELUDE is protecting NAT-TIME .
sorts GlobalState ClockedState .
subsort GlobalState < ClockedState .
op `{_`} : Configuration -> GlobalState [ctor] .
Assuming a sort Time for the time domain, the ‘in time’ part of tick rules can be
modeled by the operator
op _in`time_ : GlobalState Time -> ClockedState [ctor] .
var CLS : ClockedState . vars T1 T2 : Time .
eq (CLS in time T1) in time T2 = CLS in time (T1 + T2) .
endom)
so that the “clocked state” of the system has the form {t} in time r, where r is the
total amount of time that has elapsed in the system since the start of the execution.
17.1 Real-Time Systems 285
We can let the natural numbers be the time domain Time. It is often useful to
have a supersort TimeInf of Time with the additional value oo (for infinity) and a
function monus denoting “minus down to 0”:
fmod NAT-TIME is protecting NAT .
sorts Time NzTime TimeInf . subsort NzTime < Time < TimeInf .
subsort Nat < Time . subsort NzNat < NzTime .
We extend these modules to model and execute some real-time systems in Maude.
Example 17.1. We model a stylish modern watch with only an hour hand/marker.
The watch is retrograde, so that the hour hand must jump from 12 to 0, instead of
time 12 being equal to time 0. The watch runs perfectly while it runs, but can break
at any time (battery exhausted, dropped on bathroom floor, . . . ). Such a retrograde
watch can be modeled as an object of the class Clock as follows:
(omod SINGLE-CLOCK is including OO-TIMED-PRELUDE .
class Clock | state : ClockState, time : Time .
sort ClockState . ops running stopped : -> ClockState [ctor] .
op genta : -> Oid [ctor] .
This specification has two instantaneous rewrite rules: At any time, a running
watch may break, and when the watch shows 12, it must immediately jump to 0:
var C : Oid . var T : Time .
rl [batteryDies] :
< C : Clock | state : running > => < C : Clock | state : stopped > .
rl [jumpToZero] :
< C : Clock | state : running, time : 12 >
=>
< C : Clock | time : 0 > .
The condition ensures that the watch is reset as soon as it reaches 12.
Finally, as we all know, time continues to fly even if your watch has stopped:
rl [tickOneStopped] :
{< C : Clock | state : stopped >} => {< C : Clock | >} in time 1 .
endom)
result ClockedState :
{< genta : Clock | state : stopped, time : 12 >} in time 98
Since the state has the form {t} in time r, where r can grow beyond any bound,
the reachable state space is infinite. We therefore use bounded search to analyze the
main safety requirement: the watch never shows a value greater than 12:
Maude> (search [1,1000]
{< genta : Clock | state : running, time : 0 >} =>*
{< genta : Clock | time : T:Time >} in time T2:Time
such that T:Time > 12 .)
No solution. ♦
If the states have multiple objects and/or messages, the following tick rule has
proved useful in most large Real-Time Maude applications [90]:
var CONF : Configuration .
crl [tick] :
{CONF} => {timeEffect(CONF,τ )} in time τ if τ <= mte(CONF) .
The function timeEffect defines what happens with a configuration when a certain
amount of time has elapsed. It distributes over the elements in a configuration, so
the user must define timeEffect only on single objects and messages:
vars CONF1 CONF2 : Configuration . vars T T' : Time .
The function mte, for maximum time elapse, defines how much time can pass in the
configuration before something must happen. This function also distributes over the
elements in a configuration:
op mte : Configuration -> TimeInf [frozen (1)] .
eq mte(none) = oo .
ceq mte(CONF1 CONF2) = min(mte(CONF1), mte(CONF2))
if CONF1 =/= none and CONF2 =/= none .
This infrastructure is used in all the subsequent examples in Section 17.1. The
following example models a system with multiple retrograde watches.
Example 17.2. The state may now have multiple retrograde watches, each of which
behaves as in Example 17.1. The running watches need not all show the same time, since
they can have different values initially.
The class Clock and the two instantaneous rules are as in Example 17.1 and
are not shown. The tick rules in Example 17.1 are replaced by the above tick rule
for object-oriented specifications, with the value 1 for τ . What remains is to define
the functions timeEffect and mte on single Clock objects. Time elapse affects a
running watch by increasing the time it shows by the amount of elapsed time, and
passage of time does not affect a stopped watch at all:
eq timeEffect(< C : Clock | state : S, time : T >, T')
 = if S == running then < C : Clock | time : T + T' >
 else < C : Clock | > fi .
Time is allowed to advance until the moment when a running watch would show 12,
and can advance forever when the watch is broken:
eq mte(< C : Clock | state : running, time : T >) = 12 monus T .
eq mte(< C : Clock | state : stopped >) = oo .
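The interplay between timeEffect, mte, and the tick rule can also be mimicked outside Maude. The following Python fragment is a minimal sketch for the retrograde watches; the `Clock` record and all function names are assumptions, and the instantaneous rules (battery failure, jump to zero) are deliberately left out.

```python
import math
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Clock:
    name: str
    running: bool
    time: int


def time_effect_clock(c, elapsed):
    """A running watch advances; a stopped watch is unaffected."""
    return replace(c, time=c.time + elapsed) if c.running else c


def mte_clock(c):
    """Time may advance until a running watch would show 12
    (12 monus time); a stopped watch imposes no bound."""
    return max(12 - c.time, 0) if c.running else math.inf


def mte(conf):
    """mte distributes over the configuration as a minimum."""
    return min((mte_clock(c) for c in conf), default=math.inf)


def time_effect(conf, elapsed):
    """timeEffect distributes over the elements of the configuration."""
    return [time_effect_clock(c, elapsed) for c in conf]


def tick(conf, elapsed=1):
    """Apply the tick rule, or return None if its condition
    elapsed <= mte(conf) fails."""
    if elapsed > mte(conf):
        return None
    return time_effect(conf, elapsed)
```

A configuration containing a running watch that shows 12 has mte 0, so `tick` returns `None`: time is blocked until the instantaneous rule jumpToZero has been applied.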
We can then simulate a system with three watches:
Maude> (rew [100] {< seiko : Clock | state : running, time : 0 >
< dubuis : Clock | state : running, time : 0 >
< ap : Clock | state : running, time : 0 >} .)
result ClockedState :
{< ap : Clock | state : stopped, time : 0 >
< dubuis : Clock | state : stopped, time : 0 >
< seiko : Clock | state : stopped, time : 12 >} in time 94 ♦
Our watches keep perfect rate in Examples 17.1 and 17.2. It is more common that
some watches are slow, while others are fast. However, time advances by the same
amount in all parts of a distributed system, even if the local clocks are imperfect.
Example 17.3. We now consider imperfect watches. Each watch has a rate, which
tells how fast or slow it is. For example, a (fast) watch with rate 5/4 increases its
time value by 1.25 in an hour. A slow watch has rate < 1. We use the non-negative
rational numbers as the time domain. To ensure that each watch resets when it shows
12, we no longer advance time by one time unit in each tick step; we instead advance
time to the next moment when some watch must be reset, and by 10 time units when
all clocks have stopped:
(omod MANY-SKEWED-CLOCKS is including OO-TIMED-PRELUDE .
protecting POSRAT-TIME .
class Clock | state : ClockState, time : Time, rate : PosRat .
...
crl [tick] : {CONF1} => {timeEffect(CONF1, min(10, mte(CONF1)))}
in time min(10, mte(CONF1)) if mte(CONF1) =/= 0 .
The instantaneous rules are as before, and it remains to define timeEffect and
mte on single watches. Time affects a watch in the expected way:
mte defines how much time can advance before a watch must be reset:
eq mte(< C : Clock | state : running, time : T, rate : RATE >)
= (12 monus T) / RATE .
eq mte(< C : Clock | state : stopped >) = oo . ♦
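A short calculation with exact rationals shows why dividing by the rate gives the right bound. The Python fragment below is a sketch that assumes time affects a running watch by adding rate × elapsed time to the value it shows (consistent with the rate 5/4 example above, but an assumption about the equation for timeEffect); the function names are made up.

```python
from fractions import Fraction


def mte_skewed(time, rate):
    """(12 monus time) / rate for a running watch."""
    return max(Fraction(12) - time, Fraction(0)) / rate


def shown_after(time, rate, elapsed):
    """Assumed time effect: the shown value grows by rate * elapsed."""
    return time + rate * elapsed


def tick_step(mte_value):
    """The tick rule above advances time by min(10, mte)."""
    return min(Fraction(10), mte_value)


FAST = Fraction(5, 4)   # gains a quarter of an hour per hour
SLOW = Fraction(1, 2)   # runs at half speed
```

A fast watch showing 0 has mte 48/5, and after exactly that much time it shows 12. A slow watch showing 0 has mte 24, so the tick rule advances time by only 10.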
The following example is a small network example with common timing features
such as timers, clocks, and time-out-based message retransmissions.
Example 17.4. We consider a protocol for finding the round trip time (RTT)
between two nodes; i.e., the time it takes for a message to travel from sender to
receiver, and back. The sender sends a message rttReq(t ), where t is the value of
the sender’s local clock. When the receiver receives this message, it replies with the
message rttReply(t ), with the same timestamp t. When the original sender receives
rttReply(t ), it computes the RTT as t1 − t, with t1 its current clock value.
The message delay may be arbitrarily long, and messages could be lost. There-
fore, if the original sender has not received the reply within 10 time units, it assumes
that some message was lost or hopelessly delayed, and sends a new RTT request.
This process goes on until the sender has recorded an RTT value smaller than 10.
The sender, receiver, and the messages are declared as follows:
(omod FIND-RTT is including OO-TIMED-PRELUDE .
including MESSAGE-LOSS . --- message wrapper and message loss
--- (from Section 11.2.5)
class Sender | clock : Time, rtt : Time,
resendTimer : TimeInf, receiver : Oid .
class Receiver .
The clock attribute denotes the value of the sender’s local clock, and rtt stores
the desired RTT value. resendTimer is a timer. A timer counts down, and when it
reaches zero, time does not advance; this forces the application of an action which
either resets or turns off the timer before time can advance further.
The following instantaneous rewrite rule starts an iteration of the RTT-finding
process when the resendTimer expires (becomes zero). The sender then sends an
rttReq message to the receiver with its current clock value as timestamp. The rule
also resets the resendTimer to 10, so that the process will repeat itself in 10 time
units from now, unless the resendTimer is turned off before then:
vars T T’ T1 T2 : Time . var TI : TimeInf . vars S R O1 O2 : Oid .
vars CONF CONF1 CONF2 : Configuration . var MC : MsgContent .
rl [sendRequest] :
< S : Sender | clock : T, resendTimer : 0, receiver : R >
=>
< S : Sender | resendTimer : 10 >
(msg rttReq(T) from S to R) .
The receiver replies to a request with an rttReply message with the received
timestamp T:
rl [reply] :
(msg rttReq(T) from S to R)
< R : Receiver | >
=>
< R : Receiver | >
(msg rttReply(T) from R to S) .
When the sender receives the reply, it checks whether this message is a response
to its latest request, or to a previous request. If it is the former, the sender computes
and stores the rtt value and turns off its timer by setting it to the infinity value oo.
If the received message is a reply to an older request, it is just ignored:
rl [recReply] :
(msg rttReply(T1) from R to S)
 < S : Sender | clock : T2 >
=>
if (T2 monus T1) < 10
then < S : Sender | rtt : T2 monus T1, resendTimer : oo >
else < S : Sender | > fi .
Those are all the instantaneous rewrite rules. The tick rule is the standard one:
crl [tick] :
{CONF} => {timeEffect(CONF, 1)} in time 1 if 1 <= mte(CONF) .
timeEffect is defined on sender objects by increasing the local clock and decreas-
ing the timer value according to the elapsed time:
eq timeEffect(< S : Sender | clock : T1, resendTimer : TI >, T2)
= < S : Sender | clock : T1 + T2, resendTimer : TI monus T2 > .
The elapse of time does not affect the receiver or the messages:
eq timeEffect(< R : Receiver | >, T) = < R : Receiver | > .
eq timeEffect(msg MC from O1 to O2, T) = (msg MC from O1 to O2) .
mte must ensure that time advance stops when the resendTimer expires, and
that time cannot advance when the timer value is zero:
eq mte(< S : Sender | resendTimer : TI >) = TI .
The receiver and the messages do not place any restrictions on time advance:
eq mte(< R : Receiver | >) = oo .
eq mte(msg MC from O1 to O2)= oo .
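The interplay between the instantaneous rules, timeEffect, and mte can be tried out in a small Python sketch. This is an illustration only, not Real-Time Maude: the names `Sender`, `run`, and `monus` are hypothetical, message loss is left out, and, unlike the specification above (where a message may be read at any time), each message is assumed to take a fixed `one_way_delay` to arrive, so that the run is deterministic.

```python
import math
from dataclasses import dataclass

INF = math.inf  # plays the role of the infinity value oo


def monus(a, b):
    """Truncated subtraction, as in T1 monus T2 (never below zero)."""
    return max(a - b, 0)


@dataclass
class Sender:
    clock: int = 0
    rtt: int = 0
    resend_timer: float = 0  # 0 means expired: sendRequest must fire


def run(one_way_delay, max_time=100, period=10):
    """Run until an RTT value is stored or the clock reaches max_time."""
    snd = Sender()
    msgs = []  # in-flight messages as [kind, timestamp, remaining_delay]
    while snd.clock < max_time:
        enabled = True
        while enabled:  # apply instantaneous rules until none is enabled
            enabled = False
            if snd.resend_timer == 0:  # rule sendRequest
                msgs.append(["rttReq", snd.clock, one_way_delay])
                snd.resend_timer = period
                enabled = True
            for msg in [m for m in msgs if m[2] == 0]:
                msgs.remove(msg)
                enabled = True
                kind, stamp, _ = msg
                if kind == "rttReq":  # rule reply
                    msgs.append(["rttReply", stamp, one_way_delay])
                elif monus(snd.clock, stamp) < period:  # rule recReply
                    snd.rtt = monus(snd.clock, stamp)
                    snd.resend_timer = INF  # turn the timer off
                    return snd
                # otherwise the reply answers an older request: ignore it
        # tick rule: time advances by 1 only if 1 <= mte
        mte = min([snd.resend_timer] + [m[2] for m in msgs])
        assert mte >= 1
        snd.clock += 1
        snd.resend_timer = monus(snd.resend_timer, 1)
        for m in msgs:
            m[2] = monus(m[2], 1)
    return snd
```

With a one-way delay of 3, the reply to the first request arrives at time 6, so `run(3).rtt` is 6 and the timer is turned off. With a one-way delay of 6, every reply arrives 12 time units after its request, is treated as a reply to an older request, and no RTT value is ever stored.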
We test our specification, which fails to record an RTT value within 200 steps:
Maude> (rew [200] {init} .)
result ClockedState :
{< rec : Receiver | none >
< snd : Sender | resendTimer : 6, rtt : 0, clock : 154, ... >
msg rttReq(150) from snd to rec} in time 154
Solution 1
N:Nat --> 3 ; ... ♦
Message Delays.
The treatment of message delays (the time it takes for a message to travel from
sender to receiver) in Example 17.4 is not very sophisticated. We briefly discuss
how the following types of message delays can be specified in our methodology:
1. The message delay is exactly Δ time units.
2. The message delay is at most Δ time units.
3. The message delay is at least Δ time units.
4. The message delay is any value in the time interval [δ, Δ].
To address the first three types, we introduce a message delay operator dly:
sort DlyMsg . subsorts Msg < DlyMsg < Configuration .
op dly : Msg Time -> DlyMsg [ctor right id: 0] .
so that dly(m,t) denotes a message with remaining delay t. right id: 0 means
that a message m is considered identical to dly(m,0). In the delay kinds 1–3 above,
• the sender sends a message of the form dly(m,Δ), and
• time advance decreases the remaining delay according to the elapsed time:
eq timeEffect(dly(M, T1), T2) = dly(M, T1 monus T2) .
2. If the message delay is at most Δ, the receiver should read a message of the
form dly(m, T), and the above definition mte(dly(M, T)) = T ensures that
the message is read no later than after its (maximal) delay has expired.
3. If the message delay is at least Δ, the receiver must read the undelayed message
m, while the time advance does not need to stop when the message is ripe:
eq mte(dly(M, T)) = oo .
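The three delay kinds differ only in when the receiver may read the message and in how mte treats it. The following Python sketch tabulates these differences; the function names and the `kind` strings are hypothetical, and the treatment of the "exact" case (receiver reads only the undelayed message, mte equal to the remaining delay) is an assumption inferred from the reference to "the above definition".

```python
import math


def monus(a, b):
    """Truncated subtraction: the remaining delay never drops below zero."""
    return max(a - b, 0)


def time_effect(delay, elapsed):
    """Remaining delay of dly(m, delay) after `elapsed` time units."""
    return monus(delay, elapsed)


def mte(delay, kind):
    """Maximal time that may elapse before the message must be dealt with."""
    if kind in ("exact", "at_most"):
        return delay       # time advance stops when the delay reaches 0
    if kind == "at_least":
        return math.inf    # a ripe message may wait arbitrarily long
    raise ValueError(kind)


def may_receive(delay, kind):
    """Whether the receive rule may fire for this remaining delay."""
    if kind == "at_most":
        return True        # receiver matches dly(m, T) for any T
    if kind in ("exact", "at_least"):
        return delay == 0  # receiver matches only the undelayed message m
    raise ValueError(kind)
```

For instance, a message with remaining delay 2 may already be read under the "at most" kind, but not under the other two; once its delay has reached 0, only the "at least" kind lets time advance further.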
No solution.
Real-time system requirements are often timed properties, such as “the airbag will
deploy within 10 milliseconds after a crash has been detected” or “the ventilator
machine cannot be paused more than once, and for no longer than two seconds,
every ten minutes during surgery.”
There are a number of timed extensions of temporal logics for specifying timed
system requirements (see, e.g., [4]). The standard extension (also called metric tem-
poral logic) equips the temporal operator U (and therefore also □, ♦, and W) with a
time interval I: φ1 U_I φ2. A path π satisfies such a formula if it reaches a φ2-position
at some time within the interval I, and all positions up to that point satisfy φ1.
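The semantics of the time-bounded until operator can be made concrete on a finite timed path. The sketch below is a simplification under stated assumptions: a path is a list of (time, propositions) pairs with non-decreasing times, the interval is closed, and formulas are plain Python predicates; the function name and the example paths are hypothetical.

```python
def holds_until_within(path, phi1, phi2, lo, hi):
    """Check phi1 U_[lo,hi] phi2 at the first position of a finite timed path."""
    start = path[0][0]
    for time, props in path:
        elapsed = time - start
        if elapsed > hi:
            return False   # the interval has passed without a phi2-position
        if lo <= elapsed and phi2(props):
            return True    # phi2-position reached inside the interval
        if not phi1(props):
            return False   # phi1 violated before a suitable phi2-position
    return False


def anything(props):
    return True


def airbag(props):
    return "airbag" in props


# "Eventually within 10 time units" is the special case true U_[0,10] airbag.
on_time = [(0, {"crash"}), (4, set()), (9, {"airbag"})]
too_late = [(0, {"crash"}), (4, set()), (12, {"airbag"})]
```

Here `holds_until_within(on_time, anything, airbag, 0, 10)` holds, whereas the same check on `too_late` fails because the airbag deploys only after 12 time units.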
The first property above can then be formalized as □ (crash → ♦≤10ms airbag).
Real-Time Maude [90, 93] supports the specification and analysis of real-time
rewrite theories. It is implemented in Maude as an extension of Full Maude, and
provides timed versions of Maude’s analysis methods: simulate the system up to
a certain time; search for states satisfying a certain pattern that are reachable in a
certain time interval; and timed temporal logic model checking [68].
Real-Time Maude has been applied to a wide range of state-of-the-art appli-
cations, including wireless sensor networks and cloud storage systems, and also
provides semantics and formal analysis to industrial modeling languages such as
AADL and Ptolemy II [89, 90]. It is worth remarking that Real-Time Maude ran-
domized simulations could estimate the performance as well as dedicated simula-
tion tools for wireless sensor networks [95].
Exercise 241 Our watches only show the hours. Specify a watch, or a system with
multiple watches, that display time in terms of hours, minutes, and seconds.
Exercise 242 Specify populations in a timed setting, so that when time advances by
one time unit, a new year begins and everybody becomes one year older. Further-
more, an engaged couple should marry or break the engagement the same year, and
a separated couple must be divorced within two years.
Exercise 243 Model the two-phase commit protocol as it was described: assume
that prepare, ok, and notOK messages may be lost or much delayed, and that the
coordinator assumes a “not OK” answer if it does not receive an answer from a
node within 20 time units. Assume furthermore that abort and commit messages
are not lost, and use Maude to check whether all “final” states are consistent.
Exercise 244 Assuming discrete time, how can we model that the message delay
can be any time value in the interval [τ1, τ2]? What kind of message should the
sender send, the receiver receive, and how should timeEffect and mte be defined?
17.2 Probabilistic Systems
In probabilistic rewrite theories [1], a probabilistic rewrite rule has the form
Example 17.5. A system where a rewrites to b with 30% probability and to c with
70% probability, and where c rewrites to d with 40% probability and to e with 60%
probability, can be specified with the following probabilistic rewrite rules, using a
Maude-like syntax:
prl a => Y with probability Y := {b with prob 0.3; c with prob 0.7} .
prl c => Y with probability Y := {d with prob 0.4; e with prob 0.6} .♦
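For a small acyclic system like this one, the probability of eventually reaching a given state can be computed exactly by multiplying probabilities along each path and summing over paths. The following Python sketch does so with exact fractions; the dictionary encoding `RULES` and the function name are hypothetical, and the recursion assumes that the system has no cycles.

```python
from fractions import Fraction

# Each state maps to its possible successors and their probabilities.
RULES = {
    "a": {"b": Fraction(3, 10), "c": Fraction(7, 10)},
    "c": {"d": Fraction(2, 5), "e": Fraction(3, 5)},
}


def reach_probability(state, target):
    """Probability of eventually reaching `target` from `state` (acyclic only)."""
    if state == target:
        return Fraction(1)
    successors = RULES.get(state, {})  # states without rules are final
    return sum(
        (p * reach_probability(nxt, target) for nxt, p in successors.items()),
        Fraction(0),
    )
```

In particular, `reach_probability("a", "e")` is 7/10 · 3/5 = 21/50, that is, 42%.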
Example 17.6. Let us revisit our person/population example. There should be a cer-
tain probability of dying and of living one more year. This probability is a function
of the age of a person: at early age the probability of celebrating a birthday should
be much higher than that of dying. Estimating the probability of dying is far from
my expertise, so I assume for illustration purposes that the probability of dying at
age x is x^4/120^4. The following probabilistic rewrite rule specifies birthdays and
deaths with these probabilities:
cprl [birthdayOrDeath] :
< P : Person | age : X, state : S >
=>
if B then < P : Person | state : deceased >
else < P : Person | age : X + 1 > fi
with probability B := bernoulli((X ^ 4) / (120 ^ 4))
if S =/= deceased .
The Bernoulli distribution with probability p returns true with probability p and
false with probability 1 − p. The probability of assigning the value true to the
new Boolean variable B in the right hand side of the rule is therefore X^4/120^4. If
true is sampled, the person becomes deceased; otherwise, the person has dodged
the Grim Reaper for another year and celebrates his/her birthday. ♦
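The effect of repeatedly applying such a rule can be imitated by a short randomized simulation. The Python sketch below is only an illustration of the rule's behavior, using the same made-up death probability; the function names are hypothetical, and Python's `random.Random` stands in for the sampling of B.

```python
import random


def death_probability(age):
    """Illustrative probability of dying at the given age: age^4 / 120^4."""
    return min(1.0, age ** 4 / 120 ** 4)


def birthday_or_death(age, rng):
    """One application of the rule; returns (new_age, deceased)."""
    if rng.random() < death_probability(age):  # B sampled as true
        return age, True
    return age + 1, False                      # B sampled as false


def sample_lifespan(rng):
    """Apply the rule until the person is deceased; return the final age."""
    age, deceased = 0, False
    while not deceased:
        age, deceased = birthday_or_death(age, rng)
    return age
```

Since the death probability is 1 at age 120, every sampled lifespan is at most 120, and a call such as `sample_lifespan(random.Random(1))` always terminates.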
The probability distribution is again a function of the current state, namely, of the
number of cards remaining in the shoe. ♦
rl bernoulli(P)
=> if (random(counter) / max-rand) <= P then true else false fi .
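The same idea, comparing a uniformly distributed pseudo-random number scaled to [0, 1] with the desired probability, can be sketched in Python. The constant `MAX_RAND` is an assumption (the largest value of a 32-bit generator), and an explicitly seeded `random.Random` object takes the place of random and counter.

```python
import random

MAX_RAND = 2 ** 32 - 1  # assumed largest value returned by the generator


def bernoulli(p, rng):
    """Return True with probability (approximately) p."""
    draw = rng.randrange(MAX_RAND + 1)  # uniform integer in 0 .. MAX_RAND
    return draw / MAX_RAND <= p


def frequency(p, n, rng):
    """Fraction of True outcomes among n samples."""
    return sum(bernoulli(p, rng) for _ in range(n)) / n
```

With many samples, `frequency(0.3, 100000, random.Random(42))` should be close to 0.3. Because the comparison is non-strict, `bernoulli(0, rng)` can still return True in the rare case that the draw is exactly 0.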
Example 17.8. Assume that there are two persons, "Robert" and "Roland", in the
state. The rule birthdayOrDeath could be applied on either person. If the system
is a timed system, we can start by generating the messages
dly(firstBDay("Robert"), X) dly(firstBDay("Roland"), Y)
with probability X := contDist(365) /\ Y := contDist(365)
Just as temporal logics can be extended with time, so can they be extended to proba-
bilistic temporal logics [8, 55], many of which deal with both time and probabilities.
For example, the formula P≥0.9 (□ (crash → ♦≤10ms airbag)) says that the airbag
will deploy within 10 milliseconds after a crash, with probability 90% or higher.
There are a number of probabilistic model checkers which can check whether a
probabilistic temporal logic formula holds in a model [56, 63]. The difference
between such (precise) probabilistic model checking and statistical model checking
is that the former guarantees that it gets the right answer, whereas the latter cannot
guarantee that its answer is correct, only that it is likely to be correct. For exam-
ple, consider the property Ψ ≡ a |= P≥0.41 (♦ E) of the system in Example 17.5,
with E an atomic proposition that holds only in state e. Since the probability of
reaching e from a is exactly 42%, a probabilistic model checker will always answer
that Ψ holds, whereas a statistical model checker might be very unlucky with its
randomized simulations and could say that Ψ does not hold. On the other hand, as
already mentioned, probabilistic model checking suffers from state space explosion
and does not scale up well, in contrast to statistical model checking.
2 That the probability of something happening is 0 does not imply that it cannot happen.
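The difference can be seen in a naive Monte Carlo experiment on the system of Example 17.5. The Python sketch below estimates the probability of reaching e by randomized simulation and compares the estimate with the threshold 0.41. It is not how an actual statistical model checker works (such tools use hypothesis testing or confidence intervals to bound the error); the function names and the plain frequency estimate are assumptions made for illustration.

```python
import random


def run_reaches_e(rng):
    """One randomized simulation starting in state a."""
    if rng.random() < 0.3:
        return False            # a => b, a final state
    return rng.random() >= 0.4  # a => c, then c => d (40%) or c => e (60%)


def estimate(n, rng):
    """Fraction of n simulations that end in state e."""
    return sum(run_reaches_e(rng) for _ in range(n)) / n


def statistical_verdict(n, rng, threshold=0.41):
    """Naive verdict: does the estimated probability meet the threshold?"""
    return estimate(n, rng) >= threshold
```

The exact probability is 0.7 · 0.6 = 0.42, so an exact analysis always confirms the property. The naive verdict, in contrast, depends on the sampled runs: with only a few simulations the estimate can easily fall below 0.41, and the verdict is then wrong.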
17.2.3 PVeStA Analysis
Example 17.9. In Example 17.6, the interesting value of a run is the value of the
age attribute in the final state of the execution; in the blackjack game, the key value
of a run is the value of the money attribute in the final state of the run.
The probabilities we might be interested in are:
1. What is the probability of becoming at least 65 years old?
2. Starting with $1000, what is the probability of having at least $1200 after 20
rounds of blackjack at the $100 table? ♦
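For the first question, the simple structure of the model in Example 17.6 even allows an exact calculation, which can serve as a sanity check for simulation-based estimates. The Python sketch below assumes that "becoming at least 65" means surviving the rule applications at ages 0 to 64, and it uses the illustrative death probability x^4/120^4; the function name is hypothetical.

```python
def survival_probability(target_age):
    """Probability of reaching `target_age` when starting at age 0."""
    prob = 1.0
    for age in range(target_age):
        # Survive the rule application at this age.
        prob *= 1.0 - min(1.0, age ** 4 / 120 ** 4)
    return prob
```

A statistical estimate of the same quantity, obtained from many simulated lives, should be close to `survival_probability(65)`.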
Exercise 245 How would you model message communication where the message
delay is chosen probabilistically by some distribution, and where the likelihood of a
message being dropped is k%?
Exercise 246 Add suitable probabilities (and define the corresponding probabilis-
tic rewrite rules) to other events in a population.
Exercise 247 Explain how some of your favorite probability distributions can be
sampled in Maude, using random and counter.
Exercise 248 How would you model failures that occur with probability of one fail-
ure every 100 days?
3 It is known that “dealer must hit soft 17” is indeed better for the casino, even when faced with
more sophisticated players.
Appendix A
Mathematical Preliminaries
This appendix gives some necessary background to the mathematical concepts used
in this book, many of which are only needed in Chapter 7.
Sets. A set is a finite or infinite collection of elements. Examples of sets are the
natural numbers N = {0, 1, 2, 3, . . .} and the set {a, b, c}. We write a ∈ A if a is an
element in the set A. If A1 , …, An are sets, then A1 × · · · × An is another set, the
product of those sets, whose elements are n-tuples (a1 , . . . , an ), where each ai ∈ Ai .
The powerset P (A) of A is the set whose elements are the subsets of A.
Partial Orders. A binary relation R over A is a partial order (over A) if and only if
it satisfies the following properties for all a, b, c in A:
• Reflexivity: a R a holds for all a ∈ A.
• Antisymmetry: If a R b and b R a both hold, then a = b.
• Transitivity: If a R b and b R c both hold, then a R c also holds.
Examples of partial orders include the relations ≤ and ≥ on numbers, and the (re-
flexive) subset relation ⊆ on sets. The relations < and > on numbers (do not satisfy
reflexivity), and “brother of” (does not satisfy antisymmetry) are not partial orders.
A binary relation R over A is a strict partial order if and only if it satisfies:
• Irreflexivity: There is no a ∈ A such that a R a.
• Transitivity: Defined as above.
Examples of strict partial orders are the relations < and > on numbers, the “forefather
of” relation, and the proper subset relation ⊂ on sets. Antisymmetry is not mentioned,
since a R b and b R a cannot both hold in a strict partial order (why not?).
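On a finite set, these properties can be checked mechanically. The following Python sketch represents a relation as a set of pairs and tests each property directly from its definition; the function names and the example relations on {0, ..., 4} are hypothetical.

```python
def is_reflexive(rel, elems):
    return all((a, a) in rel for a in elems)


def is_irreflexive(rel, elems):
    return all((a, a) not in rel for a in elems)


def is_antisymmetric(rel):
    return all(a == b for (a, b) in rel if (b, a) in rel)


def is_transitive(rel):
    return all((a, d) in rel for (a, b) in rel for (c, d) in rel if b == c)


def is_partial_order(rel, elems):
    return is_reflexive(rel, elems) and is_antisymmetric(rel) and is_transitive(rel)


def is_strict_partial_order(rel, elems):
    return is_irreflexive(rel, elems) and is_transitive(rel)


ELEMS = range(5)
LEQ = {(a, b) for a in ELEMS for b in ELEMS if a <= b}
LT = {(a, b) for a in ELEMS for b in ELEMS if a < b}
```

Here `LEQ` is a partial order but not a strict one, while `LT` is a strict partial order but not a partial order, since it is not reflexive.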
[0]≡3 = {0, 3, 6, 9, . . .}
[1]≡3 = {1, 4, 7, 10, . . .}
[2]≡3 = {2, 5, 8, 11, . . .},
References
1. G. Agha, J. Meseguer, and K. Sen. PMaude: Rewrite-based specification language for
probabilistic object systems. Electronic Notes in Theoretical Computer Science, 153(2):213–239,
2006.
2. M. AlTurki and J. Meseguer. PVeStA: A parallel statistical model checking and quantitative
analysis tool. In Proc. Algebra and Coalgebra in Computer Science (CALCO 2011), volume
6859 of Lecture Notes in Computer Science. Springer, 2011.
3. M. AlTurki, J. Meseguer, and C. A. Gunter. Probabilistic modeling and analysis of DoS
protection for the ASV protocol. Electronic Notes in Theoretical Computer Science, 234:3–
18, 2009.
4. R. Alur and T. A. Henzinger. Logics and models of real time: A survey. In Real-Time: Theory
in Practice, volume 600 of Lecture Notes in Computer Science. Springer, 1992.
5. A. Armando et al. The AVISPA tool for the automated validation of internet security protocols
and applications. In Proc. Computer Aided Verification (CAV 2005), volume 3576 of Lecture
Notes in Computer Science. Springer, 2005.
6. F. Baader and T. Nipkow. Term Rewriting and All That. Cambridge University Press, 1998.
7. K. Bae and J. Meseguer. Model checking linear temporal logic of rewriting formulas under
localized fairness. Science of Computer Programming, 99:193–234, 2015.
8. C. Baier, J.-P. Katoen, and H. Hermanns. Approximate symbolic model checking of
continuous-time Markov chains. In Proc. Concurrency Theory (CONCUR 1999), volume
1664 of Lecture Notes in Computer Science. Springer, 1999.
9. J. Baker et al. Megastore: Providing scalable, highly available storage for interactive services.
In Proc. Innovative Data Systems Research (CIDR 2011). www.cidrdb.org, 2011.
10. D. Benanav, D. Kapur, and P. Narendran. Complexity of matching problems. In Proc. Rewriting
Techniques and Applications (RTA 1985), volume 202 of Lecture Notes in Computer Science.
Springer, 1985.
11. J. A. Bergstra and J. V. Tucker. A characterization of computable data types by means of a finite,
equational specification method. CWI Technical Report IW 124/79, Stichting Mathematisch
Centrum, Amsterdam, 1979.
12. J. A. Bergstra and J. V. Tucker. Algebraic specification of computable and semicomputable
data types. Theoretical Computer Science, 50:137–181, 1987.
13. B. Blanchet. Automatic verification of security protocols in the symbolic model: The verifier
ProVerif. In Foundations of Security Analysis and Design VII (FOSAD 2012/2013), volume
8604 of Lecture Notes in Computer Science. Springer, 2014.
14. D. Bogdanas and G. Rosu. K-Java: A complete semantics of Java. In Proc. Principles of
Programming Languages (POPL 2015). ACM, 2015.
15. E. A. Brewer. Towards robust distributed systems (abstract). In Proc. Principles of Distributed
Computing (PODC 2000). ACM, 2000.
16. R. Bruni and J. Meseguer. Semantic foundations for generalized rewrite theories. Theoretical
Computer Science, 360(1-3):386–414, 2006.
17. M. Burrows, M. Abadi, and R. M. Needham. A logic of authentication. ACM Transactions on
Computer Systems, 8(1):18–36, 1990.
18. E. Chang and R. Roberts. An improved algorithm for decentralized extrema-finding in circular
configurations of processes. Communications of the ACM, 22:281–283, 1979.
19. S. Chen, J. Meseguer, R. Sasse, H. J. Wang, and Y.-M. Wang. A systematic approach to
uncover security flaws in GUI logic. In Proc. IEEE Symposium on Security and Privacy. IEEE
Computer Society, 2007.
20. M. Clavel, F. Durán, S. Eker, S. Escobar, P. Lincoln, N. Martí-Oliet, J. Meseguer, and C. Talcott.
Maude Manual (Version 2.7.1), July 2016. http://maude.cs.illinois.edu.
21. M. Clavel, F. Durán, S. Eker, P. Lincoln, N. Martí-Oliet, J. Meseguer, and C. Talcott. All
About Maude – A High-Performance Logical Framework, volume 4350 of Lecture Notes in
Computer Science. Springer, 2007.
22. S. A. Cook. The complexity of theorem-proving procedures. In Proc. ACM Symposium on
Theory of Computing (STOC 1971). ACM, 1971.
23. G. Coulouris, J. Dollimore, and T. Kindberg. Distributed Systems: Concepts and Design.
Addison-Wesley, third edition, 2001.
24. C. J. F. Cremers. The Scyther Tool: Verification, falsification, and analysis of security
protocols. In Proc. Computer Aided Verification (CAV 2008), volume 5123 of Lecture Notes
in Computer Science. Springer, 2008.
25. M. Davis, Y. Matijasevič, and J. Robinson. Hilbert’s tenth problem. Diophantine equations:
positive aspects of a negative solution. In Mathematical Developments Arising from Hilbert
Problems, Part 2, volume 28.2 of Proceedings of Symposia in Pure Mathematics. American
Mathematical Society, 1976.
26. N. Dershowitz. Orderings for term-rewriting systems. Theoretical Computer Science, 17:279–
301, 1982.
27. N. Dershowitz. Termination of rewriting. Journal of Symbolic Computation, 3:69–116, 1987.
28. W. Diffie and M. Hellman. New directions in cryptography. IEEE Transactions on Information
Theory, 22:644–654, 1976.
29. E. W. Dijkstra. Two starvation free solutions to a general exclusion problem. EWD 625,
Plataanstraat 5, 5671 Al Nuenen, The Netherlands, 1978.
30. D. Dolev and A. Yao. On the security of public-key protocols. IEEE Transactions on Infor-
mation Theory, 29:198–208, 1983.
31. G. Dowek, C. A. Muñoz, and C. Rocha. Rewriting logic semantics of a plan execution language.
In Proc. Structural Operational Semantics (SOS 2009), volume 18 of Electronic Proceedings
in Theoretical Computer Science, 2009.
32. F. Durán, S. Lucas, C. Marché, J. Meseguer, and X. Urbain. Proving operational termination of
membership equational programs. Higher-Order and Symbolic Computation, 21(1-2):59–88,
2008.
33. F. Durán and J. Meseguer. On the Church-Rosser and coherence properties of conditional
order-sorted rewrite theories. Journal of Logic and Algebraic Programming, 81(7-8):816–
850, 2012.
56. A. Hartmanns and H. Hermanns. The Modest toolset: An integrated environment for quan-
titative modelling and verification. In Proc. Tools and Algorithms for the Construction and
Analysis of Systems (TACAS 2014), volume 8413 of Lecture Notes in Computer Science.
Springer, 2014.
57. J. Hendrix, J. Meseguer, and H. Ohsaki. A sufficient completeness checker for linear order-
sorted specifications modulo axioms. In Proc. Automated Reasoning (IJCAR 2006), volume
4130 of Lecture Notes in Computer Science. Springer, 2006.
58. S. Kamin and J.-J. Lévy. Two generalizations of the recursive path ordering. Unpublished
Note, Department of Computer Science, University of Illinois, Urbana, IL, 1980.
59. M. Katelman, J. Meseguer, and J. Hou. Redesign of the LMST wireless sensor protocol through
formal modeling and statistical model checking. In Proc. Formal Methods for Open Object-
Based Distributed Systems (FMOODS 2008), volume 5051 of Lecture Notes in Computer
Science. Springer, 2008.
60. B. Kirkerud. Lecture notes on rewrite systems. Dept. of Informatics, University of Oslo, 1994.
http://heim.ifi.uio.no/~in307/notater/.
61. T. Kleinjung et al. Factorization of a 768-bit RSA modulus. In Proc. Advances in Cryptology
(CRYPTO 2010), volume 6223 of Lecture Notes in Computer Science. Springer, 2010.
62. D. E. Knuth and P. B. Bendix. Simple word problems in universal algebras. In J. Leech, editor,
Computational Problems in Abstract Algebra, pages 263–297. Pergamon Press, 1970.
63. M. Kwiatkowska, G. Norman, and D. Parker. PRISM 4.0: Verification of probabilistic real-
time systems. In Proc. Computer Aided Verification (CAV 2011), volume 6806 of Lecture
Notes in Computer Science. Springer, 2011.
64. L. Lamport. The part-time parliament. ACM Transactions on Computer Systems, 16(2):133–
169, 1998.
65. L. Lamport. Paxos made simple. ACM SIGACT News, 32:51–58, 2001.
66. B. Lampson and H. Sturgis. Crash recovery in a distributed data storage system. Technical
report, Xerox Palo Alto Research Center, 1976.
67. F. Laroussinie, N. Markey, and P. Schnoebelen. Temporal logic with forgettable past. In Proc.
Logic in Computer Science (LICS 2002). IEEE Computer Society, 2002.
68. D. Lepri, E. Ábrahám, and P. C. Ölveczky. Sound and complete timed CTL model checking
of timed Kripke structures and real-time rewrite theories. Science of Computer Programming,
99:128–192, 2015.
69. E. Lien and P. C. Ölveczky. Formal modeling and analysis of an IETF multicast protocol.
In Proc. Software Engineering and Formal Methods (SEFM 2009). IEEE Computer Society,
2009.
70. S. Liu, J. Ganhotra, M. R. Rahman, S. Nguyen, I. Gupta, and J. Meseguer. Quantitative analy-
sis of consistency in NoSQL key-value stores. Leibniz Transactions on Embedded Systems,
4(1):03:1–03:26, 2017.
71. S. Liu, M. R. Rahman, S. Skeirik, I. Gupta, and J. Meseguer. Formal modeling and analysis
of Cassandra in Maude. In Proc. Formal Methods and Software Engineering (ICFEM 2014),
volume 8829 of Lecture Notes in Computer Science. Springer, 2014.
72. G. Lowe. An attack on the Needham-Schroeder public-key authentication protocol. Informa-
tion Processing Letters, 56:131–133, 1995.
73. G. Lowe. Breaking and fixing the Needham-Schroeder public-key protocol using FDR. In
Proc. Tools and Algorithms for Construction and Analysis of Systems (TACAS 1996), volume
1055 of Lecture Notes in Computer Science. Springer, 1996.
74. R. R. Lutz. Analyzing software requirements errors in safety-critical embedded systems. In
Proc. IEEE International Symposium on Requirements Engineering. IEEE, 1993.
75. Z. Manna and A. Pnueli. Temporal Verification of Reactive Systems: Safety. Springer, 1995.
76. N. Martí-Oliet, M. Palomino, and A. Verdejo. Rewriting logic bibliography by topic: 1990-
2011. Journal of Logic and Algebraic Programming, 81(7-8):782–815, 2012.
99. J. Rushby. Mechanized formal methods: Progress and prospects. In Proc. Foundations of
Software Technology and Theoretical Computer Science (FSTTCS 1996), volume 1180 of
Lecture Notes in Computer Science. Springer, 1996.
100. R. Sasse, S. T. King, J. Meseguer, and S. Tang. IBOS: A correct-by-construction modular
browser. In Proc. Formal Aspects of Component Software (FACS 2012), volume 7684 of
Lecture Notes in Computer Science. Springer, 2012.
101. S. Sebastio and A. Vandin. MultiVeStA: Statistical model checking for discrete event simu-
lators. In Proc. Performance Evaluation Methodologies and Tools (ValueTools 2013). ICST,
Brussels, Belgium, 2013.
102. K. Sen, M. Viswanathan, and G. Agha. On statistical model checking of stochastic systems. In
Proc. Computer Aided Verification (CAV 2005), volume 3576 of Lecture Notes in Computer
Science. Springer, 2005.
103. K. Sen, M. Viswanathan, and G. A. Agha. VeStA: A statistical model-checker and analyzer
for probabilistic systems. In Proc. Quantitative Evaluation of Systems (QEST 2005). IEEE
Computer Society, 2005.
104. P. W. Shor. Polynomial-time algorithms for prime factorization and discrete logarithms on a
quantum computer. SIAM Journal on Computing, 26(5):1484–1509, 1997.
105. Terese. Term Rewriting Systems, volume 55 of Cambridge Tracts in Theoretical Computer
Science. Cambridge University Press, 2003.
106. Y. Toyama. Counterexamples to termination for the direct sum of term rewriting systems.
Information Processing Letters, 25:141–143, 1987.
107. M. Y. Vardi. Branching vs. linear time: Final showdown. In Proc. Tools and Algorithms for
the Construction and Analysis of Systems (TACAS 2001), volume 2031 of Lecture Notes in
Computer Science. Springer, 2001.
108. M. Wirsing. Algebraic specification. In J. van Leeuwen, editor, Handbook of Theoretical
Computer Science, volume B. Elsevier, 1990.
109. H. L. S. Younes and R. G. Simmons. Probabilistic verification of discrete event systems using
acceptance sampling. In Proc. Computer Aided Verification (CAV 2002), volume 2404 of
Lecture Notes in Computer Science. Springer, 2002.
© Springer-Verlag London 2017
P.C. Ölveczky, Designing Reliable Distributed Systems, Undergraduate Topics
in Computer Science, DOI 10.1007/978-1-4471-6687-0