Slides (Updated On 27-05-2024)
Summer 2024
1
0 General
• Correctness of programs
• Functional programming with OCaml
2
1 Correctness of Programs
Problem
How can it be guaranteed that a program behaves as it should behave?
3
Approaches
Tool: assertions
4
Example
5
Example
6
Comments
7
Caveat
The run-time check should evaluate a property of the program state when a
particular program point is reached.
The check must not change the program state in any significant way !!!
Otherwise, the behavior of the observed system differs from that of the unobserved system.
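As a small illustration in OCaml (the language used in the second part of these slides), the following sketch places assertions that only read the program state. The function name sum_nonneg is hypothetical and not part of the slides.

```ocaml
(* Hypothetical example: both assertions only inspect values,
   they never modify any state. *)
let sum_nonneg (xs : int list) : int =
  (* property of the state on entry *)
  assert (List.for_all (fun x -> x >= 0) xs);
  let total = List.fold_left (+) 0 xs in
  (* property of the state before returning (ignoring overflow) *)
  assert (total >= 0);
  total
```

Removing the two assert lines leaves the computed result unchanged, which is exactly the requirement stated above.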
8
Problem
==⇒
We require a general method in order to guarantee that a given assertion is valid ...
9
1.1 Program Verification
10
Simplification
Idea
• We annotate each program point with an assertion !
• At every program point, we argue that the assertion is valid ...
==⇒ logic
11
1.2 Background: Logic
∀ x. human(x) ⇒ mortal(x)
human(Socrates), mortal(Socrates)
Tautology: A ∨ ¬A
∀ x ∈ Z. x < 0 ∨ x = 0 ∨ x > 0
13
Laws:
¬¬A ≡ A                               double negation
A ∧ A ≡ A                             idempotence
A ∨ A ≡ A
¬(A ∨ B) ≡ ¬A ∧ ¬B                    De Morgan
¬(A ∧ B) ≡ ¬A ∨ ¬B
A ∧ (B ∨ C) ≡ (A ∧ B) ∨ (A ∧ C)       distributivity
A ∨ (B ∧ C) ≡ (A ∨ B) ∧ (A ∨ C)
A ∨ (B ∧ A) ≡ A                       absorption
A ∧ (B ∨ A) ≡ A
14
The Count Example
[Control-flow diagram of the program:]
x = 0;
while (x < 100) { write(x); x = x+1; }
15
The Count Example
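[Control-flow diagram of the count program with program points: A before x = 0;, the join point B before the test x < 100, C on the yes-edge before write(x);, D after x = x+1;, E on the no-edge, F before Stop.]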
16
Discussion
• We require one assertion per program point ...
• The program points correspond to
• the join points;
• all edges of the control-flow diagram not entering or leaving join points.
A true
B 0 ≤ x ≤ 100
C 0 ≤ x < 100
D 1 ≤ x ≤ 100
E x = 100
F x = 100
17
Assertions: Idea
18
Question
Sub-problem 1: Assignments
Consider the assignment: x = y+z;
In order to have after the assignment: x > 0, // post-condition
we must have before the assignment: y + z > 0. // pre-condition
19
General Principle
• Every assignment transforms a post-condition B into a minimal assumption
that must be valid before the execution so that B is valid after the execution.
A ⇒ WP⟦s⟧(B)
20
Example
assignment: x = x-y;
post-condition: x>0
21
... in the Count Program (1):
assignment: x = 0;
post-condition: B
weakest pre-condition:
B[0/x] ≡ (0 ≤ x ∧ x ≤ 100)[0/x]
≡ 0 ≤ 0 ∧ 0 ≤ 100
≡ true
≡ A
22
... in the Count Program (2):
assignment: x = x+1;
post-condition: D
weakest pre-condition:
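The calculation for this step is not preserved in the text. A worked sketch, using the assertions C ≡ 0 ≤ x < 100 and D ≡ 1 ≤ x ≤ 100 from the table of program points (x ranges over the integers):

```latex
\begin{aligned}
\mathbf{WP}[\![x = x+1;]\!](D) &\equiv D[(x+1)/x] \\
 &\equiv (1 \le x \wedge x \le 100)[(x+1)/x] \\
 &\equiv 1 \le x+1 \wedge x+1 \le 100 \\
 &\equiv 0 \le x \wedge x < 100 \\
 &\equiv C
\end{aligned}
```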
23
Wrap-up
WP⟦;⟧(B) ≡ B
WP⟦x = e;⟧(B) ≡ B[e/x]
WP⟦x = read();⟧(B) ≡ ∀ x. B
WP⟦write(e);⟧(B) ≡ B
24
Discussion
• For all actions, the wrap-up provides the corresponding weakest pre-conditions
for a post-condition B.
• An output statement does not change any variable. Therefore, the weakest
pre-condition is B itself.
• An input statement x=read(); modifies the variable x unpredictably.
In order for B to hold after the input, B must hold for every possible x
before the input.
25
Orientation
[Control-flow diagram of the count program with program points A to F, as before.]
26
Sub-problem 2: Conditionals
A
no yes
b
B0 B1
It should hold:
• A ∧ ¬b ⇒ B0 and
• A ∧ b ⇒ B1 .
27
This is the case if A implies the weakest pre-condition of the conditional branching:
WP⟦b⟧(B0, B1) ≡ (¬b ⇒ B0) ∧ (b ⇒ B1)
28
Example
29
The Count Example
30
The GCD Example
[Control-flow diagram of the GCD program, equivalent to:]
a = read(); x = a;
b = read(); y = b;
while (x != y)
    if (x < y) y = y−x;
    else       x = x−y;
write(x);
31
Insight
You can only verify what you understand ...
Background
d | x holds iff x = d · z for some integer z.
32
Discussion
gcd(x, 0) = |x|
gcd(x, x) = |x|
gcd(x, y) = gcd(x, y − x)
gcd(x, y) = gcd(x − y, y)
33
Idea for GCD
A ≡ gcd(a, b) = gcd(x, y)
B ≡ A ∧ (x = y)
34
The GCD Example
Start
true
x = a = read();
a=x
y = b = read();
A
no yes
x != y
B A
no yes
write(x); x<y
A A
B x=x−y; y=y−x;
Stop A
35
... in the GCD Program (1):
assignment: x = x-y;
post-condition: A
weakest pre-condition:
36
... in the GCD Program (2):
assignment: y = y-x;
post-condition: A
weakest pre-condition:
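The two calculations for the assignments in the loop body are not preserved in the text. A worked sketch, using A ≡ gcd(a, b) = gcd(x, y) and the identities gcd(x, y) = gcd(x − y, y) and gcd(x, y) = gcd(x, y − x) from the discussion above:

```latex
\begin{aligned}
\mathbf{WP}[\![x = x-y;]\!](A) &\equiv A[(x-y)/x]
   \equiv \gcd(a,b) = \gcd(x-y,\,y)
   \equiv \gcd(a,b) = \gcd(x,\,y) \equiv A \\
\mathbf{WP}[\![y = y-x;]\!](A) &\equiv A[(y-x)/y]
   \equiv \gcd(a,b) = \gcd(x,\,y-x)
   \equiv \gcd(a,b) = \gcd(x,\,y) \equiv A
\end{aligned}
```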
37
Orientation
[Control-flow diagram of the GCD program annotated with the assertions true, a = x, A and B, as above.]
38
WP⟦y = b;⟧(A) ≡ A[b/y]
              ≡ gcd(a, b) = gcd(x, b)
39
Orientation
[Control-flow diagram of the GCD program annotated with the assertions true, a = x, A and B, as above.]
40
For the statements: a = read(); x = a; we calculate:
41
Conditionals
b ≡ y>x
¬b ∧ A ≡ x ≥ y ∧ gcd(a, b) = gcd(x, y)
b ∧ A ≡ y > x ∧ gcd(a, b) = gcd(x, y)
gcd(a, b) = gcd(x, y)
... i.e., exactly A
42
Orientation
[Control-flow diagram of the GCD program annotated with the assertions true, a = x, A and B, as above.]
43
Conditionals II
The argument for the assertion before the loop is analogous:
b ≡ y ≠ x
¬b ∧ B ≡ A ∧ x = y
b ∧ A ≡ A ∧ x ≠ y
44
Summary of the Approach
• Annotate each program point with an assertion.
• Program start should receive annotation true.
• Verify for each statement s between two assertions A and B that A
implies the weakest pre-condition of s for B, i.e.,
A ⇒ WP⟦s⟧(B)
• Verify for each conditional branching with condition b that the assertion A
before the condition implies the weakest pre-condition for the post-conditions B0
and B1 of the branching, i.e.,
A ⇒ WP⟦b⟧(B0, B1)
An annotation with the last two properties is called locally consistent (lc).
45
1.3 Correctness
Questions
46
Recap (1)
σ = {x ↦ 5, y ↦ −42}
A[σ(x)/x] for all variables x occurring in A
We write: σ |= A.
47
Example
σ = {x ↦ 5, y ↦ 2}
A ≡ (x > y)
A[5/x, 2/y] ≡ (5 > 2)
≡ true
σ = {x ↦ 5, y ↦ 12}
A ≡ (x > y)
A[5/x, 12/y] ≡ (5 > 12)
≡ false
48
Trivial Properties
σ |= A1 and σ |= A2 is equivalent to
σ |= A1 ∧ A2
σ |= A1 or σ |= A2 is equivalent to
σ |= A1 ∨ A2
49
Recap (2)
50
Example
Start
0
x = a = read();
y = b = read();
no yes
6 x != y 2
no yes
write(x); 4 x<y 3
7 x=x−y; y=y−x;
Stop 5
51
Assume that we start in point 3 with {x ↦ 6, y ↦ 12}.
σ ⊕ {x ↦ d} = {z ↦ σ z | z ≢ x} ∪ {x ↦ d}
{x ↦ 6, y ↦ 12} ⊕ {y ↦ 6} = {x ↦ 6, y ↦ 6}
52
Theorem
Let p be a MiniJava program, let π be an execution trace starting in program
point u and leading to program point v.
Assumptions:
• The program points in p are annotated by assertions which are
lc.
• The program point u is annotated with A.
• The program point v is annotated with B.
Conclusion:
If the initial state of π satisfies the assertion A, then the final state
satisfies the assertion B.
53
Remarks
• If the start point of the program is annotated with true, then every execution
trace reaching program point v satisfies the assertion at v.
• In order to prove that an assertion A holds at a program point v, we
require a lc annotation satisfying:
(1) The start point is annotated with true.
(2) The assertion at v implies A.
• So far, our method does not provide any guarantee that v is ever reached !!!
• If a program point v can be annotated with the assertion false, then v
cannot be reached.
54
Proof
Let π = (u_0, σ_0) s_1 (u_1, σ_1) ... s_m (u_m, σ_m)
Assumption: σ0 |= A.
Proof obligation: σm |= B.
Idea
Induction on the length m of the execution trace.
Base m = 0:
The endpoint of the execution equals the startpoint.
==⇒ σ0 = σm and A ≡ B
==⇒ the claim holds.
55
Important Notion: Evaluation of Expressions
Program State
σ = {x ↦ 5, y ↦ −1, z ↦ 21}
Arithmetic Expression
t ≡ 2∗z+y
Evaluation
⟦t⟧ σ = ⟦2 ∗ z + y⟧ {x ↦ 5, y ↦ −1, z ↦ 21}
= 2 · 21 + (−1)
= 41
56
Proposition
For (arithmetic) expressions t, e,
⟦t[e/x]⟧ σ = ⟦t⟧ (σ ⊕ {x ↦ ⟦e⟧ σ})
E.g., consider t ≡ x + y, e ≡ 2 ∗ z
for σ = {x ↦ 5, y ↦ −1, z ↦ 21}.
57
Proposition
Proof
σ ⊕ {x ↦ ⟦e⟧ σ} |= t1 < t2
iff ⟦t1⟧ (σ ⊕ {x ↦ ⟦e⟧ σ}) < ⟦t2⟧ (σ ⊕ {x ↦ ⟦e⟧ σ})
iff ⟦t1[e/x]⟧ σ < ⟦t2[e/x]⟧ σ
iff σ |= t1[e/x] < t2[e/x] □
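The relation between substitution and state update can also be tried out executably. The following OCaml sketch models states as association lists and represents expressions and assertions as functions on states; all names (state, lookup, update, wp_assign) are assumptions made for this illustration, and unknown variables default to 0.

```ocaml
(* A state maps variable names to integers. *)
type state = (string * int) list

(* Assumption: variables not present in the state have value 0. *)
let lookup (s : state) (x : string) : int =
  try List.assoc x s with Not_found -> 0

(* update s x d  corresponds to  s ⊕ {x ↦ d}. *)
let update (s : state) (x : string) (d : int) : state =
  (x, d) :: List.remove_assoc x s

(* Semantic counterpart of WP⟦x = e;⟧(B) ≡ B[e/x]:
   B must hold in the state updated with the value of e. *)
let wp_assign (x : string) (e : state -> int) (post : state -> bool)
  : state -> bool =
  fun s -> post (update s x (e s))

let sigma = [("x", 5); ("y", -1); ("z", 21)]
let e s = 2 * lookup s "z" + lookup s "y"   (* e ≡ 2*z + y *)
let post s = lookup s "x" > 0               (* B ≡ x > 0 *)
let holds_before = wp_assign "x" e post sigma
```

Here e evaluates to 41 in sigma, so the post-condition x > 0 holds after the assignment x = 2*z+y;, and holds_before is true.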
58
Proposition
Proof
Induction on the structure of formula A □
59
Induction Proof of Correctness (cont.)
Step m > 0:
Induction Hypothesis: The statement holds already for m − 1.
Let B′ denote the assertion at point u_{m−1}.
==⇒ σ_{m−1} |= B′
Case 1. s_m ≡ ;
Then
• σ_{m−1} = σ_m
• WP⟦;⟧(B) ≡ B
==⇒ B′ ⇒ B
==⇒ σ_{m−1} = σ_m |= B □
60
Induction Proof of Correctness (cont.)
Case 2. s_m ≡ write(e);
Then
• σ_{m−1} = σ_m
• WP⟦write(e);⟧(B) ≡ B
==⇒ B′ ⇒ B
==⇒ σ_{m−1} = σ_m |= B □
61
Induction Proof of Correctness (cont.)
Case 4. s_m ≡ x = read();
Then
• σ_m = σ_{m−1} ⊕ {x ↦ c} for some c ∈ Z
• WP⟦x = read();⟧(B) ≡ ∀ x. B
==⇒ B′ ⇒ ∀ x. B ⇒ B[c/x]
==⇒ σ_m |= B □
62
Induction Proof of Correctness (cont.)
Step m > 0:
Induction Hypothesis: The statement holds already for m − 1.
Let B′ denote the assertion at point u_{m−1}.
==⇒ σ_{m−1} |= B′
63
Induction Proof of Correctness (cont.)
Case 1. σ_m |= b
==⇒ B′ ⇒ WP⟦b⟧(C, B) where
WP⟦b⟧(C, B) ≡ (¬b ⇒ C) ∧ (b ⇒ B)
==⇒ σ_m |= b ∧ (b ⇒ B)
==⇒ σ_m |= B □
Case 2. σ_m |= ¬b
==⇒ B′ ⇒ WP⟦b⟧(B, C) where
WP⟦b⟧(B, C) ≡ (¬b ⇒ B) ∧ (b ⇒ C)
==⇒ σ_m |= ¬b ∧ (¬b ⇒ B)
==⇒ σ_m |= B □
65
1.4 Optimization
Observation
If the program has no loops, a weakest pre-condition can be calculated for each
program point !!!
66
Example
B3
x = x + 2;
B2
z = z + x;
B1
i = i + 1;
B
67
Example (cont.)
Assume B ≡ z = i² ∧ x = 2i − 1
Then we calculate:
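The calculation itself is not preserved in the text. A worked sketch that propagates B backwards through the three assignments, with B1, B2, B3 as named in the diagram:

```latex
\begin{aligned}
B_1 &\equiv \mathbf{WP}[\![i = i+1;]\!](B)
     \equiv z = (i+1)^2 \wedge x = 2(i+1) - 1
     \equiv z = (i+1)^2 \wedge x = 2i + 1 \\
B_2 &\equiv \mathbf{WP}[\![z = z+x;]\!](B_1)
     \equiv z + x = (i+1)^2 \wedge x = 2i + 1
     \equiv z = i^2 \wedge x = 2i + 1 \\
B_3 &\equiv \mathbf{WP}[\![x = x+2;]\!](B_2)
     \equiv z = i^2 \wedge x + 2 = 2i + 1
     \equiv z = i^2 \wedge x = 2i - 1 \equiv B
\end{aligned}
```

The step for B2 uses (i+1)² − (2i+1) = i². Since B3 ≡ B, the assertion B is preserved by the three assignments.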
68
Idea
Meaningful choices:
→ Before the condition
→ At the entry of the loop body
→ At the exit of the loop body ...
• For all other program points, the assertions are obtained by WPJ...K().
69
Example
int a, i, x, z;
a = read();
i = 0;
x = -1;
z = 0;

while (i != a) {
    x = x + 2;
    z = z + x;
    i = i + 1;
}

assert(z == a * a);

write(z);
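For experimentation, the same program can be transliterated into OCaml with the loop invariant B ≡ z = i² ∧ x = 2i − 1 checked at run time. This is only a sketch: the function name square and the use of references are choices made here, and the input is passed as a parameter instead of being read.

```ocaml
let square (a : int) : int =
  (* assumption: a >= 0, otherwise the loop does not terminate *)
  assert (a >= 0);
  let i = ref 0 and x = ref (-1) and z = ref 0 in
  while !i <> a do
    (* invariant B at the entry of the loop body *)
    assert (!z = !i * !i && !x = 2 * !i - 1);
    x := !x + 2;
    z := !z + !x;
    i := !i + 1
  done;
  (* post-condition from the original program *)
  assert (!z = a * a);
  !z
```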
70
Example
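[Control-flow diagram of the program: the assertion B is attached to the join point before the test i != a and to the yes-edge entering the loop body; the assertion z = a² is attached to the no-edge before write(z);.]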
71
We verify
WP⟦i != a⟧(z = a², B) ≡ (i = a ∧ z = a²) ∨ (i ≠ a ∧ B)
≡ (i = a ∧ z = a²) ∨ (i ≠ a ∧ z = i² ∧ x = 2i − 1)
⇐ (i = a ∧ z = i² ∧ x = 2i − 1) ∨ (i ≠ a ∧ z = i² ∧ x = 2i − 1)
≡ z = i² ∧ x = 2i − 1 ≡ B
72
Orientation
[Control-flow diagram of the program with the assertions B and z = a², as above.]
73
We verify:
WP⟦z = 0;⟧(B) ≡ 0 = i² ∧ x = 2i − 1
              ≡ i = 0 ∧ x = −1
WP⟦x = -1;⟧(i = 0 ∧ x = −1) ≡ i = 0
WP⟦i = 0;⟧(i = 0) ≡ true
WP⟦a = read();⟧(true) ≡ true
74
1.5 Termination
Problem
• By our approach, we can only prove that an assertion is valid at a program point
whenever that program point is reached !!!
• How can we guarantee that a program always terminates ?
• How can we determine a sufficient condition which guarantees termination of the
program ??
75
Examples
• The GCD program only terminates for inputs a, b with a = b or a > 0 and b > 0.
• The square program terminates only for inputs a ≥ 0.
• while(true); never terminates.
• Programs without loops always terminate!
76
Example
int i, j, t;
t = 0;
i = read();
while (i > 0) {
    j = read();
    while (j > 0) { t = t + 1; j = j - 1; }
    i = i - 1;
}
write(t);
Question
How can we turn this observation into a method that is applicable to arbitrary loops ?
78
Idea
• Make sure that each loop is executed only finitely often ...
• For each loop, identify an indicator value r, that has two properties
(1) r > 0 whenever the loop is entered;
(2) r is decreased during every iteration of the loop.
• Transform the program in a way that, alongside ordinary program execution, the
indicator value r is computed.
• Verify that properties (1) and (2) hold!
79
Example: Safe GCD Program
int a, b, x, y;
a = read(); b = read();
if (a < 0) x = -a; else x = a;
if (b < 0) y = -b; else y = b;
if (x == 0) { write(y); return; }
if (y == 0) { write(x); return; }
while (x != y)
    if (y > x) y = y - x;
    else x = x - y;
write(x);
80
We choose: r = x + y
Transformation
int a, b, x, y, r;
a = read(); b = read();
if (a < 0) x = -a; else x = a;
if (b < 0) y = -b; else y = b;
if (x == 0) { write(y); return; }
if (y == 0) { write(x); return; }
r = x + y;
while (x != y) {
    if (y > x) y = y - x;
    else x = x - y;
    r = x + y;
}
write(x);
81
Orientation
[Control-flow diagram of the transformed safe GCD program. Program point 3 is the join point before the test x != y; program point 1, on the yes-edge entering the loop body, carries the requirement r > 0; program point 2, before the assignment r = x+y; at the end of the loop body, carries the requirement r > x + y.]
82
At program points 1, 2 and 3, we assert:
Then we have:
A ⇒ r > 0 and B ⇒ r > x + y
83
We verify:
84
Orientation
[Control-flow diagram of the transformed safe GCD program with program points 1, 2 and 3, as above.]
85
Further propagation of C through the control-flow graph completes the locally
consistent annotation with assertions.
We conclude:
86
General Method
r = e0;
while(b) {
    assert(r > 0);
    s;
    assert(r > e1);
    r = e1;
}
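As an executable illustration, the general method can be instantiated for the safe GCD program with the indicator value r = x + y. The following OCaml sketch is a transliteration made for this purpose; the name safe_gcd and the use of parameters instead of read() are assumptions.

```ocaml
let safe_gcd (a : int) (b : int) : int =
  let x = ref (abs a) and y = ref (abs b) in
  if !x = 0 then !y
  else if !y = 0 then !x
  else begin
    let r = ref (!x + !y) in            (* r = e0 *)
    while !x <> !y do
      assert (!r > 0);                  (* property (1) *)
      if !y > !x then y := !y - !x else x := !x - !y;
      assert (!r > !x + !y);            (* property (2): r decreases *)
      r := !x + !y                      (* r = e1 *)
    done;
    !x
  end
```

If either assertion could fail, the choice of r would not be a valid termination argument; here both hold because x and y stay positive inside the loop.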
87
1.6 Modular Verification and Procedures
88
Idea
• Modularize the correctness proof in a way that sub-proofs for replicated program
fragments can be reused.
• Consider statements of the form:
{A} p {B}
A : pre-condition
B : post-condition
89
Examples
90
Modular verification can be used to prove the correctness of programs using
functions/methods.
Simplification
We only consider
• procedures, i.e., static methods without return values;
• global variables, i.e., all variables are static as well.
// will be generalized later
91
Example
92
Comment
{a ≥ b} mm(); {a = x}
93
Approach
94
... in the Example
We verify:
mm() a≥b
no yes
a>b
x = b; x = a;
y = a; y = b;
Stop a=x
95
... in the Example
We verify:
mm() a≥b
no yes
a=b a>b true
x = b; x = a;
y = a; y = b;
Stop a=x
96
Discussion
• The approach also works in case the procedure has a return value: that can be
simulated by means of a global variable return which receives the respective
function results.
• It is not obvious, though, how pre- and post-conditions of procedure calls can be
chosen if a procedure is called in multiple places ...
• Even more complicated is the situation when a procedure is recursive: then it has
possibly unboundedly many distinct calls !?
97
Example
98
Comment
99
Problem
• In the logic, we must be able to distinguish between the ith and the (i + 1)th call.
• This is easier, if we have logical auxiliaries l = l1 , . . . , ln at hand to store
(selected) values before the call ...
In the Example
{A} f(); {B} where
A ≡ x = l ∧ x > 1 ∧ m0 = m1 = 1
B ≡ l > 1 ∧ m1 ≤ 2^l ∧ m0 ≤ 2^(l−1)
100
General Approach
101
... in the Example
f() A
x = x−1;
x = l − 1 ∧ x > 0 ∧ m0 = m1 = 1
no yes
x>1
D
f();
C
t = m1;
m1=m1+m0;
m0 = t;
Stop B
102
• We start with an assertion for the end point:
B ≡ l > 1 ∧ m1 ≤ 2^l ∧ m0 ≤ 2^(l−1)
103
Question
How can the global hypothesis be used to deal with a specific procedure call ???
Idea
• The assertion {A} f(); {B} represents a value table for f().
• This value table can be logically represented by the implication:
∀ l. (A[h/x] ⇒ B)
104
Examples
∀ l. (h > 1 ∧ h = l ∧ h0 = h1 = 1) ⇒
l > 1 ∧ m1 ≤ 2^l ∧ m0 ≤ 2^(l−1)
≡ (h > 1 ∧ h0 = h1 = 1) ⇒ m1 ≤ 2^h ∧ m0 ≤ 2^(h−1)
105
Another pair (A1, B1) of assertions forms a valid triple {A1} f(); {B1} if
we are able to prove that
    ∀ l. A[h/x] ⇒ B      A1[h/x]
    ─────────────────────────────
                 B1
Example: double()
A ≡ x=l B ≡ x = 2l
A1 ≡ x ≥ 3 B1 ≡ x ≥ 6
We verify:
    x = 2h      h ≥ 3
    ─────────────────
          x ≥ 6
106
Remarks
{x = l} double(); {x = 2l}
────────────────────────────────────
{x = l − 1} double(); {x = 2(l − 1)}

{x = l} double(); {x = 2l}
──────────────────────────────────────────
{x = l ∧ l > 0} double(); {x = 2l ∧ l > 0}
107
Remarks (cont.)
{x = l} double(); {x = 2l}
──────────────────────────────────
{x > 0 ∧ x = l} double(); {x = 2l}

{x = l} double(); {x = 2l}
────────────────────────────────────
{x = l} double(); {x = 2l ∨ x = −1}
108
Application to Fibonacci
A ≡ x > 1 ∧ l = x ∧ m0 = m1 = 1
A[(l − 1)/l] ≡ x > 1 ∧ l − 1 = x ∧ m0 = m1 = 1
≡ D
B ≡ l > 1 ∧ m1 ≤ 2^l ∧ m0 ≤ 2^(l−1)
B[(l − 1)/l] ≡ l − 1 > 1 ∧ m1 ≤ 2^(l−1) ∧ m0 ≤ 2^(l−2)
≡ C
109
Orientation
[Control-flow diagram of f() with the assertions A, D, C and B, as above.]
110
For the conditional, we verify:
⇐ x > 0 ∧ x = l − 1 ∧ m0 = m1 = 1
111
1.7 Procedures with Local Variables
Example
{int y = 17; double(); write(y);}
112
• The values of local variables are automatically preserved, if the global hypothesis
has the following properties:
→ The pre- and post-conditions: {A}, {B} of procedures only speak
about global variables !
→ The h are only used for global variables !!
• As a new specific instance of adaptation, we obtain:
113
Summary
114
Functional Programming
115
John McCarthy, Stanford
116
Robin Milner, Edinburgh
117
Xavier Leroy, INRIA, Paris
118
2 Basics
• Interpreter Environment
• Expressions
• Definitions of Values
• More Complex Datatypes
• Lists
• Definitions (cont.)
• User-defined Datatypes
119
2.1 The Interpreter Environment
The basic interpreter is called with ocaml. A more elaborate version is utop.
seidl@linux:~> ocaml
OCaml version 5.1.1
...
#

# #use "Hello.ml";;
120
2.2 Expressions
# 3+4;;
- : int = 7
# 3+
  4;;
- : int = 7
#
121
Pre-defined Constants and Operators
122
Type Comparison operators
int = <> < <= >= >
float = <> < <= >= >
bool = <> < <= >= >
string = <> < <= >= >
char = <> < <= >= >
# -3.0 /. 4.0;;
- : float = -0.75
# "So" ^ " " ^ "it" ^ " " ^ "goes";;
- : string = "So it goes"
# 1 > 2 || not (2.0 < 1.0);;
- : bool = true
123
2.3 Definitions of Values
124
Another definition of seven does not assign a new value to seven, but creates a new
variable with the name seven.
125
2.4 More Complex Datatypes
• Pairs
# (3 , 4);;
- : int * int = (3, 4)
# (1=2,"hello");;
- : bool * string = (false, "hello")
• Tuples
# (2, 3, 4, 5);;
- : int * int * int * int = (2, 3, 4, 5)
# ("hello", true, 3.14159);;
- : string * bool * float = ("hello", true, 3.14159)
126
Simultaneous Definition of Variables
It is used in places where a pattern requires a variable to receive parts of a value which
are of no interest.
127
Records: Example
128
Remark
... Records are tuples with named components whose ordering, therefore, is
irrelevant.
... As a new type, a record must be introduced before its use by means of a type
declaration.
... Type names and record components start with a small letter.
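The record example itself is not preserved in the text. A minimal sketch that is consistent with the later pattern-matching example (components given, sur, age and the value paul); the type name person and the second value paul' are assumptions:

```ocaml
(* The record type must be declared before its first use. *)
type person = { given : string; sur : string; age : int }

let paul = { given = "Paul"; sur = "Meier"; age = 24 }

(* The ordering of the named components is irrelevant. *)
let paul' = { age = 24; sur = "Meier"; given = "Paul" }

(* Components are accessed by name. *)
let name = paul.given
```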
129
... with pattern matching
# let {given = x; sur = y; age = z} = paul;;
val x : string = "Paul"
val y : string = "Meier"
val z : int = 24
130
Case Distinction: match and if
match n
with 0 -> "null"
   | 1 -> "one"
   | _ -> "uncountable!"

match e
with true -> e1
   | false -> e2

if e then e1 else e2
131
Watch out for redundant and incomplete matches!
# let n = 7;;
val n : int = 7

# match n with 0 -> "zero";;
Warning: this pattern-matching is not exhaustive.
Here is an example of a value that is not matched:
1
Exception: Match_failure ("", 5, -13).

# match n
  with 0 -> "zero"
     | 0 -> "one"
     | _ -> "uncountable!";;
Warning: this match case is unused.
- : string = "uncountable!"
132
2.5 Lists
# let mt = [];;
val mt : 'a list = []

# let l1 = 1::mt;;
val l1 : int list = [1]

# let l = [1;2;3];;
val l : int list = [1; 2; 3]

# let l = 1::2::3::[];;
val l : int list = [1; 2; 3]
133
Caveat
All elements must have the same type:
# 1.0 :: 1 :: [];;
This expression has type int but is here used with type float
134
Pattern Matching on Lists
# match l
  with [] -> -1
     | x::xs -> x;;
- : int = 1
135
2.6 Definition of Functions
136
→ Alternatively, we may introduce a variable whose value is a function.
→ This function definition starts with fun, followed by the sequence of formal
parameters.
→ After -> follows the specification of the return value.
→ The variables from the left can be accessed on the right.
137
Caveat
Functions may additionally access the values of variables which have been visible at
their point of definition:
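The example for this caveat is not preserved in the text. A minimal sketch of the effect, with the names factor and result chosen here for illustration:

```ocaml
let factor = 2

(* double refers to the variable factor visible at this point. *)
let double = fun x -> factor * x

(* This introduces a new variable factor; double is unaffected. *)
let factor = 4

let result = double 3
```

Here result is 6, not 12, because double keeps using the first factor.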
138
Caveat
A function is a value:
# double;;
- : int -> int = <fun>
139
Recursive Functions
140
If functions call themselves indirectly via other functions, they are called
mutually recursive.
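A standard small illustration of mutual recursion, defined with let rec ... and ...; the functions even and odd are an example chosen here and are meant for arguments n ≥ 0:

```ocaml
(* even and odd call each other, so they must be defined together. *)
let rec even n = if n = 0 then true else odd (n - 1)
and odd n = if n = 0 then false else even (n - 1)
```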
141
Definition by Case Distinction
# let rec length = fun l -> match l with
    | [] -> 0
    | x::xs -> 1 + length xs;;
val length : 'a list -> int = <fun>
# length [1; 2; 3];;
- : int = 3
142
Case distinction for several arguments
# let rec app l y = match l with
    | [] -> y
    | x::xs -> x :: app xs y;;
val app : 'a list -> 'a list -> 'a list = <fun>
# app [1; 2] [3; 4];;
- : int list = [1; 2; 3; 4]
143
Local Definitions
# let x = 5 in
  let sq = x * x in
  sq + sq;;
- : int = 50

# let facit n =
    let rec iter m yet =
      if m > n then yet
      else iter (m + 1) (m * yet) in
    iter 2 1;;
val facit : int -> int = <fun>
144
2.7 User-defined Datatypes
145
Disadvantages
146
Example: Playing cards
147
Advantages
→ The representation is intuitive.
→ Typing errors are recognized:
# (Culbs,Nine);;
Unbound constructor Culbs
Remark
→ By type, a new type is defined.
→ The alternatives are called constructors and are separated by |.
→ Every constructor starts with a capital letter and is uniquely assigned to a type.
148
Enumeration Types (cont.)
Constructors can be compared:
149
By that, e.g.,
# is_trump (Gras,Jack);;
- : bool = true
# is_trump (Clubs,Nine);;
- : bool = false
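The type and function definitions behind these interactions are not preserved in the text. The following sketch is one possible reconstruction that reproduces the two results above; the exact lists of constructors and the trump rule (all Hearts, all Jacks, all Queens) are assumptions.

```ocaml
(* Enumeration types: constructors without arguments. *)
type color = Diamonds | Hearts | Gras | Clubs
type value = Seven | Eight | Nine | Jack | Queen | King | Ten | Ace

(* Assumed trump rule, defined by pattern matching on pairs. *)
let is_trump = function
  | (Hearts, _) -> true
  | (_, Jack) -> true
  | (_, Queen) -> true
  | (_, _) -> false

(* Constructors are ordered by their position in the type declaration. *)
let diamonds_below_clubs = Diamonds < Clubs
```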
150
Another useful function:
Remark
The function string_of_color returns for a given color the corresponding string
in constant time (the compiler, hopefully, uses jump tables).
151
Now, OCaml can (almost) play cards:
152
...
# let take card2 card1 =
    if takes card2 card1 then card2 else card1;;

# let trick card1 card2 card3 card4 =
    take card4 (take card3 (take card2 card1));;
153
Sum Types
Sum types generalize enumeration types in that constructors may now have
arguments.
154
...
let get x = match x with
  | Some y -> y
let value x a = match x with
  | Some y -> y
  | None -> a
let map f x = match x with
  | Some y -> Some (f y)
  | None -> None
let join a = match a with
  | Some a' -> a'
  | None -> None
155
Option is a module, which collects useful functions and values for option.
A constructor defined inside type t = Con of <type> | ...
has functionality Con : <type> -> t — must, however, always occur applied
...
# Some;;
The constructor Some expects 1 argument(s),
but is here applied to 0 argument(s)

# None;;
- : 'a option = None

# Some 10;;
- : int option = Some 10
156
The type option is polymorphic – which means that it can be constructed for any type
'a, in particular int or string.
Polymorphic types with parameters 'a, 'b, 'c are then introduced by
type ('a,'b,'c) t = ...
157
Datatypes can be recursive:
158
Recursive datatypes lead to recursive functions:
159
Another Example
160
3 A closer Look at Functions
• Tail Calls
• Higher-order Functions
→ Currying
→ Partial Application
• Polymorphic Functions
• Polymorphic Datatypes
• Anonymous Functions
161
3.1 Tail Calls
A tail call in the body e of a function is a call whose value provides the value of e ...
let f x = x + 5

let g y = let z = 7
          in if y > 5 then f (-y)
             else z + f y
==⇒ From a tail call, we need not return to the calling function.
==⇒ The stack space of the calling function can immediately be recycled !!!
162
A recursive function f is called tail recursive, if all (direct or indirect) calls to f and all
functions mutually recursive with f in the right-hand sides of any of these functions
are tail calls.
Examples
let fac x = let rec facit n acc =
              if n <= 1 then acc
              else facit (n - 1) (n * acc)
            in facit x 1

let rec collatz x = if x < 2 then x
                    else if x mod 2 = 0 then collatz (x / 2)
                    else collatz (3 * x + 1)
163
Discussion
164
Reversing a List – Version 1
[]
[n-1]
[n-1; n-2]
...
[n-1; ...; 1]
165
Reversing a List – Version 2
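The code of the two versions is not preserved in the text. A common way to write them, given here as a sketch with the hypothetical names rev1 and rev2: the first version appends after the recursive call and is therefore not tail recursive, the second accumulates the result and uses a tail call.

```ocaml
(* Version 1: the recursive call is not a tail call,
   and each step appends at the end of a list. *)
let rec rev1 = function
  | [] -> []
  | x :: xs -> rev1 xs @ [x]

(* Version 2: the helper aux is tail recursive;
   acc holds the already reversed prefix. *)
let rev2 list =
  let rec aux acc = function
    | [] -> acc
    | x :: xs -> aux (x :: acc) xs
  in
  aux [] list
```

Version 2 needs one step per element, whereas the repeated appends in version 1 add up to quadratic running time in the length of the list.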
166
3.2 Higher Order Functions
let f (a, b) = a + b + 1

let g a b = a + b + 1
At first sight, f and g differ only in the syntax. But they also differ in their types:
# f;;
- : int * int -> int = <fun>

# g;;
- : int -> int -> int = <fun>
167
• Function f has a single argument, namely, the pair (a,b). The return value is
given by a+b+1.
• Function g has the one argument a of type int. The result of application to a is
a function that, when applied to the other argument b, returns the result a+b+1 :
# f (3, 5);;
- : int = 9

# let g1 = g 3;;
val g1 : int -> int = <fun>

# g1 5;;
- : int = 9
168
Haskell B. Curry, 1900–1982
169
In honor of its inventor Haskell B. Curry, this principle is called Currying.
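The definition of curry used in the following interaction is elided in the text. The usual definitions are sketched below; uncurry is added here for symmetry and is not referenced by the slides.

```ocaml
(* curry turns a function on pairs into a function
   that takes its arguments one after the other. *)
let curry f x y = f (x, y)

(* uncurry goes in the opposite direction. *)
let uncurry f (x, y) = f x y

let plus (x, y) = x + y
let plus2 = curry plus 2     (* partial application *)
```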
170
...

# let plus (x,y) = x+y;;
val plus : int * int -> int = <fun>

# curry plus;;
- : int -> int -> int = <fun>

# let plus2 = curry plus 2;;
val plus2 : int -> int = <fun>

# let plus3 = curry plus 3;;
val plus3 : int -> int = <fun>

# plus2 (plus3 4);;
- : int = 9
171
3.3 Some List Functions
172
let rec find_opt f = function
    [] -> None
  | x::xs -> if f x then Some x
             else find_opt f xs
Remarks
→ These functions abstract from the behavior of the function f. They specify the
recursion according to the list structure, independently of the elements of the list.
→ Therefore, such functions are sometimes called recursion schemes or (list)
functionals.
→ List functionals are independent of the element type of the list. That type must
only be known to the function f.
→ Functions which operate on equally structured data of various types are called
polymorphic.
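Only find_opt is preserved in the text. The other list functionals referred to in the next section (map, fold_left, fold_right) are commonly defined as in the following sketch, which agrees with the types listed there; the concrete formulation is an assumption.

```ocaml
(* Apply f to every element. *)
let rec map f = function
  | [] -> []
  | x :: xs -> f x :: map f xs

(* Combine the elements from the left, starting with a. *)
let rec fold_left f a = function
  | [] -> a
  | x :: xs -> fold_left f (f a x) xs

(* Combine the elements from the right, ending with b. *)
let rec fold_right f l b = match l with
  | [] -> b
  | x :: xs -> f x (fold_right f xs b)
```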
173
3.4 Polymorphic Functions
The OCaml system infers the following types for the given functionals:
map : ('a -> 'b) -> 'a list -> 'b list

fold_left : ('a -> 'b -> 'a) -> 'a -> 'b list -> 'a

fold_right : ('a -> 'b -> 'b) -> 'a list -> 'b -> 'b

find_opt : ('a -> bool) -> 'a list -> 'a option
→ 'a and 'b are type variables. They can be instantiated by any type (but each
occurrence with the same type).
174
→ By partial application, some of the type variables may be instantiated:
# string_of_int;;
- : int -> string = <fun>

# map string_of_int;;
- : int list -> string list = <fun>

# fold_left (+);;
- : int -> int list -> int = <fun>
175
→ If a functional is applied to a function that is itself polymorphic, the result may
again be polymorphic:
176
Some of the Simplest Polymorphic Functions
let compose f g x = f (g x)
let twice f x = f (f x)
let rec iter f g x = if g x then x else iter f g (f x);;

val compose : ('a -> 'b) -> ('c -> 'a) -> 'c -> 'b = <fun>
val twice : ('a -> 'a) -> 'a -> 'a = <fun>
val iter : ('a -> 'a) -> ('a -> bool) -> 'a -> 'a = <fun>

type 'a tree = Leaf of 'a | Node of ('a tree * 'a tree)
→ tree is called a type constructor, because it allows creating a new type from
another type, namely its parameter 'a.
→ In the right-hand side, only those type variables may occur, which have been
listed on the left.
→ The application of constructors to data may instantiate type variables:
178
# Leaf 1;;
- : int tree = Leaf 1

# Node (Leaf ('a', true), Leaf ('b', false));;
- : (char * bool) tree = Node (Leaf ('a', true), Leaf ('b', false))
179
let rec size = function
  | Leaf _ -> 1
  | Node(t,t') -> size t + size t'
180
1 ...
2
181
3.6 Application: Queues
Wanted
182
First Idea
• Represent the queue by a list:
183
Discussion
184
Second Idea
• Represent the queue as two lists !!!
type 'a queue = Queue of 'a list * 'a list

let is_empty = function
    Queue ([],[]) -> true
  | _ -> false

let queue_of_list list = Queue (list,[])

let list_of_queue = function
    Queue (first,[]) -> first
  | Queue (first,last) ->
      first @ List.rev last
• The second list represents the tail of the queue and is therefore stored in reverse order ...
185
Second Idea (cont.)
• Insertion is in the second list:
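The code for insertion and extraction is not preserved in the text. A sketch of how both operations could look for the two-list representation; the names enqueue and dequeue and the option-valued result are assumptions.

```ocaml
type 'a queue = Queue of 'a list * 'a list

(* Insertion: put the new element in front of the (reversed) second list. *)
let enqueue x (Queue (first, last)) = Queue (first, x :: last)

(* Extraction: take from the first list; if it is empty,
   reverse the second list once and continue with it. *)
let rec dequeue = function
  | Queue ([], []) -> (None, Queue ([], []))
  | Queue (x :: xs, last) -> (Some x, Queue (xs, last))
  | Queue ([], last) -> dequeue (Queue (List.rev last, []))

let q = enqueue 3 (enqueue 2 (Queue ([1], [])))
```

For q, successive calls of dequeue deliver 1, 2 and 3 in this order; the reversal of the second list happens only when the first list is exhausted.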
186
Discussion
187
3.7 Anonymous Functions
As we have seen, functions are data. Data, e.g., [1;2;3] can be used without naming
them. This is also possible for functions:
188
Alonzo Church, 1903–1995
189
• Pattern matching can be used by applying match ... with for the
corresponding argument.
• In case of a single argument, function can be considered ...
190
Anonymous functions are convenient if they are used just once in a program. Often,
they occur as arguments to functionals:
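A minimal sketch of anonymous functions used as arguments to functionals from the standard List module; the concrete values are chosen here for illustration.

```ocaml
(* Square every element without naming the squaring function. *)
let squares = List.map (fun x -> x * x) [1; 2; 3]

(* Sum the lengths of all strings in a list. *)
let total = List.fold_left (fun acc s -> acc + String.length s) 0 ["So"; "it"; "goes"]
```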
191
4 A Larger Application:
Balanced Trees
2 3 5 7 11 13 17
192
Properties
193
Wanted:
194
First Idea
195
First Idea
3 13
2 5 11 17
196
Discussion
197
Second Idea
198
An AVL Tree
199
An AVL Tree
200
Not an AVL Tree
201
G.M. Adelson-Velskij, 1922; E.M. Landis, Moscow, 1921–1997
202
We prove:
fib(k) ≥ A^(k−1) nodes, where A = (√5 + 1)/2 // golden ratio
203
We calculate:
(1) Each AVL tree of depth k > 0 has at least
fib(k) ≥ A^(k−1) nodes, where A = (√5 + 1)/2 // golden ratio
(2) Every AVL tree with n > 0 internal nodes has depth at most
(1 / log(A)) · log(n) + 1
==⇒ N(k) = N(k − 1) + N(k − 2) + 1
≥ fib(k − 1) + fib(k − 2)
= fib(k)
205
Second Idea (cont.)
206
Representation
207
Representation
3 2
2 1 1
208
Third Idea
• Instead of the absolute depth, we store at each node only whether the difference
in depth of the two subtrees is negative, positive or equal to zero !!!
• As datatype, we therefore define
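The type definition is not preserved in the text. The following sketch is consistent with the constructors Null, Neg, Pos and Eq used by insert and the rotations later on; the helper depth and the example tree are additions for illustration only.

```ocaml
(* Neg: the left subtree is deeper; Pos: the right subtree is deeper;
   Eq: both subtrees have the same depth; Null: the empty tree (a leaf). *)
type 'a avl =
  | Null
  | Neg of 'a avl * 'a * 'a avl
  | Pos of 'a avl * 'a * 'a avl
  | Eq of 'a avl * 'a * 'a avl

(* Hypothetical helper: recompute the depth to check a marking. *)
let rec depth = function
  | Null -> 0
  | Neg (l, _, r) | Pos (l, _, r) | Eq (l, _, r) ->
      1 + max (depth l) (depth r)

(* A tree with root 2 whose left subtree is deeper by one. *)
let example = Neg (Eq (Null, 1, Null), 2, Null)
```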
209
Representation
3 2
2 1 1
210
Representation
N P
P E E
211
Insertion
• If the tree is a leaf, i.e., empty, an internal node is created with two new leaves.
• If the tree is non-empty, the new value is compared with the value at the root.
→ If it is larger, it is inserted to the right.
→ Otherwise, it is inserted to the left.
• Caveat: Insertion may increase the depth and thus
may destroy the AVL property !
• That must be subsequently dealt with ...
212
let rec insert x avl = match avl
  with Null -> (Eq (Null,x,Null), true)
     | Eq (left,y,right) -> if x < y then
           let (left,inc) = insert x left
           in if inc then (Neg (left,y,right), true)
              else (Eq (left,y,right), false)
         else let (right,inc) = insert x right
           in if inc then (Pos (left,y,right), true)
              else (Eq (left,y,right), false)
     ...
• Besides the new AVL tree, the function insert also returns the information
whether the depth of the result has increased.
• If the depth is not increased, the marking of the root need not be changed.
213
     | Neg (left,y,right) -> if x < y then
           let (left,inc) = insert x left
           in if inc then let (avl,_) = rotateRight (left,y,right)
                          in (avl,false)
              else (Neg (left,y,right), false)
         else let (right,inc) = insert x right
           in if inc then (Eq (left,y,right), false)
              else (Neg (left,y,right), false)
     | Pos (left,y,right) -> if x < y then
           let (left,inc) = insert x left
           in if inc then (Eq (left,y,right), false)
              else (Pos (left,y,right), false)
         else let (right,inc) = insert x right
           in if inc then let (avl,_) = rotateLeft (left,y,right)
                          in (avl,false)
              else (Pos (left,y,right), false);;
214
Comments
• Insertion into the less deep subtree never increases the total depth.
The depths of the two subtrees, though, may become equal.
• Insertion into the deeper subtree may increase the difference in depth to 2.
Then the node at the root must be rotated in order to decrease the difference ...
215
rotateRight
N P
E N
216
rotateRight
N E
N E
217
rotateRight
N E
P E E
218
rotateRight
N E
P E P
219
rotateRight
N E
P N E
220
let rotateRight (left, y, right) = match left
  with Eq (l1,y1,r1) -> (Pos (l1, y1, Neg (r1,y,right)), false)
     | Neg (l1,y1,r1) -> (Eq (l1, y1, Eq (r1,y,right)), true)
     | Pos (l1, y1, Eq (l2,y2,r2)) ->
         (Eq (Eq (l1,y1,l2), y2, Eq (r2,y,right)), true)
     | Pos (l1, y1, Neg (l2,y2,r2)) ->
         (Eq (Eq (l1,y1,l2), y2, Pos (r2,y,right)), true)
     | Pos (l1, y1, Pos (l2,y2,r2)) ->
         (Eq (Neg (l1,y1,l2), y2, Eq (r2,y,right)), true)
• The extra bit now indicates whether the depth of the tree after rotation has
decreased ...
• This is not the case only when the deeper subtree is of the form Eq (...),
which never occurs here.
221
rotateLeft
P N
E P
222
rotateLeft
P E
P E
223
rotateLeft
P E
N E E
224
rotateLeft
P E
N E P
225
rotateLeft
P E
N N E
226
let rotateLeft (left, y, right) = match right
  with Eq (l1,y1,r1) -> (Neg (Pos (left,y,l1), y1, r1), false)
     | Pos (l1,y1,r1) -> (Eq (Eq (left,y,l1), y1, r1), true)
     | Neg (Eq (l1,y1,r1), y2 ,r2) ->
         (Eq (Eq (left,y,l1),y1, Eq (r1,y2,r2)), true)
     | Neg (Neg (l1,y1,r1), y2 ,r2) ->
         (Eq (Eq (left,y,l1),y1, Pos (r1,y2,r2)), true)
     | Neg (Pos (l1,y1,r1), y2 ,r2) ->
         (Eq (Neg (left,y,l1),y1, Eq (r1,y2,r2)), true)
227
Discussion
• Insertion requires at most as many calls of insert as the depth of the tree.
• After returning from a call for a subtree, at most three nodes must be
re-arranged.
• The total effort therefore is bounded by a constant multiple of log(n).
• In general, though, we are not interested in the extra bit at every call. Therefore,
we define:
228
Extraction of the Minimum
229
let rec extract_min avl = match avl
  with Null -> (None, Null, false)
     | Eq (Null,y,right) -> (Some y, right, true)
     | Eq (left,y,right) -> let (first,left,dec) = extract_min left
         in if dec then (first, Pos (left,y,right), false)
            else (first, Eq (left,y,right), false)
     | Neg (left,y,right) -> let (first,left,dec) = extract_min left
         in if dec then (first, Eq (left,y,right), true)
            else (first, Neg (left,y,right), false)
     | Pos (Null,y,right) -> (Some y, right, true)
     | Pos (left,y,right) -> let (first,left,dec) = extract_min left
         in if dec then let (avl,b) = rotateLeft (left,y,right)
                        in (first,avl,b)
            else (first, Pos (left,y,right), false)
230
Discussion
• Rotation is only required when extracting from a tree of the form Pos (...)
and the depth of the left subtree is decreased.
• Altogether, the number of recursive calls is bounded by the depth. For every call,
at most three nodes are re-arranged.
• Therefore, the total effort is bounded by a constant multiple of log(n).
• Functions for maximum or last element from an interval are constructed
analogously ...
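The discussion above can be spot-checked on a small tree. The following is a minimal sketch: the type declaration is an assumption inferred from the constructors used in the code (it is not shown in this part of the slides), rotateLeft receives an extra catch-all case to keep the match exhaustive, and check and drain are hypothetical helpers introduced only for this illustration.

```ocaml
(* Assumed type: Neg = left subtree deeper, Eq = equal depth, Pos = right deeper. *)
type 'a avl =
  | Null
  | Neg of 'a avl * 'a * 'a avl
  | Eq of 'a avl * 'a * 'a avl
  | Pos of 'a avl * 'a * 'a avl

let rotateLeft (left, y, right) = match right with
  | Eq (l1, y1, r1) -> (Neg (Pos (left, y, l1), y1, r1), false)
  | Pos (l1, y1, r1) -> (Eq (Eq (left, y, l1), y1, r1), true)
  | Neg (Eq (l1, y1, r1), y2, r2) ->
      (Eq (Eq (left, y, l1), y1, Eq (r1, y2, r2)), true)
  | Neg (Neg (l1, y1, r1), y2, r2) ->
      (Eq (Eq (left, y, l1), y1, Pos (r1, y2, r2)), true)
  | Neg (Pos (l1, y1, r1), y2, r2) ->
      (Eq (Neg (left, y, l1), y1, Eq (r1, y2, r2)), true)
  | _ -> invalid_arg "rotateLeft" (* added: ill-formed right subtree *)

let rec extract_min avl = match avl with
  | Null -> (None, Null, false)
  | Eq (Null, y, right) -> (Some y, right, true)
  | Eq (left, y, right) ->
      let (first, left, dec) = extract_min left in
      if dec then (first, Pos (left, y, right), false)
      else (first, Eq (left, y, right), false)
  | Neg (left, y, right) ->
      let (first, left, dec) = extract_min left in
      if dec then (first, Eq (left, y, right), true)
      else (first, Neg (left, y, right), false)
  | Pos (Null, y, right) -> (Some y, right, true)
  | Pos (left, y, right) ->
      let (first, left, dec) = extract_min left in
      if dec then let (avl, b) = rotateLeft (left, y, right) in (first, avl, b)
      else (first, Pos (left, y, right), false)

(* Hypothetical helper: Some depth if every balance marker is truthful. *)
let rec check = function
  | Null -> Some 0
  | Neg (l, _, r) -> node l r 1
  | Eq (l, _, r) -> node l r 0
  | Pos (l, _, r) -> node l r (-1)
and node l r diff = match check l, check r with
  | Some dl, Some dr when dl - dr = diff -> Some (1 + max dl dr)
  | _ -> None

(* Hypothetical helper: extract minima until the tree is empty,
   checking the balance invariant after every extraction. *)
let rec drain t = match extract_min t with
  | (None, _, _) -> []
  | (Some x, rest, _) -> assert (check rest <> None); x :: drain rest

let sample = Pos (Eq (Null, 1, Null), 2, Neg (Eq (Null, 3, Null), 4, Null))
```

Here drain sample repeatedly removes the minimum; the first extraction empties the left subtree of a Pos node and therefore triggers the only rotation.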
231
5 Practical Features of OCaml
• Exceptions
• Input and Output as Side-effects
• Sequences
232
5.1 Exceptions
In case of a runtime error, e.g., division by zero, the OCaml system generates an
exception:
1 # 1 / 0;;
2 Exception: Division_by_zero.
3
4 # List.tl (List.tl [1]);;
5 Exception: Failure "tl".
6
7 # Char.chr 300;;
8 Exception: Invalid_argument "Char.chr".
234
Pre-defined Constructors for Exceptions
Division_by_zero division by 0
Invalid_argument of string wrong usage
Failure of string general error
Match_failure of string * int * int incomplete match
Not_found not found
Out_of_memory memory exhausted
End_of_file end of file
Exit for the user ...
An exception is a first class citizen, i.e., a value from a datatype exn ...
235
1 # Division_by_zero;;
2 - : exn = Division_by_zero
3
4 # Failure "complete nonsense!";;
5 - : exn = Failure "complete nonsense!"
1 # exception Hell;;
2 exception Hell
3
4 # Hell;;
5 - : exn = Hell
236
237
Handling of Exceptions
238
let rec member x l = try if x = List.hd l then true
                         else member x (List.tl l)
                     with Failure _ -> false

# member 2 [1;2;3];;
- : bool = true
# member 4 [1;2;3];;
- : bool = false
Following the keyword with, the exception value can be inspected by means of
pattern matching for the exception datatype exn :
1 try <exn>
2 with <pat1> -> <exp1> | ... | <patN> -> <expN>
==⇒ several exceptions can be caught (and thus handled) at the same time.
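As a small illustration of handling several exceptions in one try expression, consider the following sketch. The function quotient and the association list table are hypothetical names chosen for this example; List.assoc raises Not_found for a missing key, and integer division raises Division_by_zero.

```ocaml
(* Hypothetical helper: look up a divisor by key and divide 100 by it. *)
let quotient table key =
  try string_of_int (100 / List.assoc key table)
  with
  | Not_found -> "unknown key"
  | Division_by_zero -> "division by zero"

let table = [("a", 5); ("b", 0)]
```

Each pattern after with selects one exception constructor; an exception matched by none of the patterns is passed on to the enclosing handler.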
239
The programmer may trigger exceptions on his/her own
by means of the keyword raise ...
1 # 1 + (2 / 0);;
2 Exception: Division_by_zero.
3
4 # 1 + raise Division_by_zero;;
5 Exception: Division_by_zero.
Handling an exception results in the evaluation of another expression (of the correct
type), or raises another exception.
240
Exception handling may occur at any sub-expression, arbitrarily nested:
8 # g (6, 1);;
9 - : string = "Error: Division by zero"
10
11 # g (6, 3);;
12 - : string = "9"
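The definitions of f and g are not reproduced in the listing above. The following sketch is one possible pair of definitions that is consistent with the two outputs shown; it is a reconstruction under that assumption, not necessarily the original code.

```ocaml
(* Assumed definition: divides by y - 1, so y = 1 raises Division_by_zero. *)
let f (x, y) = x / (y - 1)

(* The inner handler turns Division_by_zero into a Failure;
   the outer handler turns any Failure into an error string. *)
let g (x, y) =
  try
    let n =
      try f (x, y)
      with Division_by_zero -> raise (Failure "Division by zero")
    in
    string_of_int (n * n)
  with Failure str -> "Error: " ^ str
```

The inner try sits inside the sub-expression bound to n, while the outer try covers the whole body, which shows that handlers can be nested arbitrarily.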
241
5.2 Textual Input and Output
• Reading from the input and writing to the output violates the paradigm of purely
functional programming !
• These operations are therefore realized by means of side-effects, i.e., by means of
functions whose return value is irrelevant (e.g., unit).
• During execution, though, the required operation is executed
==⇒ now, the ordering of the evaluation matters !!!
242
• Naturally, OCaml allows writing to standard output and reading from standard input:
1 # read_line ();;
2 Hello World!
3 - : string = "Hello World!"
243
In order to read from file, the file must be opened for reading ...
1 # close_in infile;;
2 - : unit = ()
244
Further Useful Values
1 stdin : in_channel
2 input_char : in_channel -> char
3 in_channel_length : in_channel -> int
245
Output to files is analogous ...
The separately written words may only appear in the file once the file has been
closed ...
1 # close_out outfile;;
2 - : unit = ()
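A minimal round trip through a file might look as follows. The file name comes from Filename.temp_file, and the names path and first_line are chosen for this sketch only; all functions used are from the standard library.

```ocaml
(* Create a temporary file so that the sketch does not depend on any fixed path. *)
let path = Filename.temp_file "slides" ".txt"

(* Write two strings; closing the channel flushes them to the file. *)
let () =
  let outfile = open_out path in
  output_string outfile "Hello ";
  output_string outfile "world!\n";
  close_out outfile

(* Re-open the file for reading and fetch its first line. *)
let first_line =
  let infile = open_in path in
  let line = input_line infile in
  close_in infile;
  line

let () = Sys.remove path
```

Since output is buffered, reading the file back is only reliable after close_out (or an explicit flush).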
246
5.3 Sequences
1 # print_string "Hello";
2 print_string " ";
3 print_string "world!\n";;
4 Hello world!
5 - : unit = ()
247
Often, several strings must be output !
Given a list of strings, the list functional List.iter can be used:
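A small sketch of this use of List.iter follows. The list words and the value collected are names introduced here; the second traversal writes into a Buffer instead of standard output so that the result can be inspected.

```ocaml
let words = ["Hello"; " "; "world!\n"]

(* Print every string of the list, in list order. *)
let () = List.iter print_string words

(* The same traversal, collecting the strings in a buffer. *)
let collected =
  let buf = Buffer.create 16 in
  List.iter (Buffer.add_string buf) words;
  Buffer.contents buf
```

List.iter applies its function argument to the elements from left to right, so the side effects happen in list order.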
248
6 The Module System of OCaml
→ Modules
→ Signatures
→ Information Hiding
→ Functors
→ Separate Compilation
249
6.1 Modules
In order to organize larger software systems, OCaml offers the concept of modules:
1 module Pairs =
2 struct
3 type 'a pair = 'a * 'a
4 let pair (a,b) = (a,b)
5 let first (a,b) = a
6 let second (a,b) = b
7 end
250
On this input, the compiler answers with the type of the module, its signature:
1 module Pairs :
2 sig
3 type 'a pair = 'a * 'a
4 val pair : 'a * 'b -> 'a * 'b
5 val first : 'a * 'b -> 'a
6 val second : 'a * 'b -> 'b
7 end
1 # first;;
2 Unbound value first
251
Access onto Components of a Module
Components of a module can be accessed via qualification:
1 # Pairs.first;;
2 - : 'a * 'b -> 'a = <fun>
Thus, several functions can be defined all with the same name:
1 # module Triples =
2 struct
3 type 'a triple = Triple of 'a * 'a * 'a
4 let first (Triple (a, _, _)) = a
5 let second (Triple (_, b, _)) = b
6 let third (Triple (_, _, c)) = c
7 end;;
8 ...
252
1 ...
2 module Triples :
3 sig
4 type 'a triple = Triple of 'a * 'a * 'a
5 val first : 'a triple -> 'a
6 val second : 'a triple -> 'a
7 val third : 'a triple -> 'a
8 end
9
10 # Triples.first;;
11 - : 'a Triples.triple -> 'a = <fun>
253
... or several implementations of the same function:
1 # module Pairs2 =
2 struct
3 type 'a pair = bool -> 'a
4 let pair (a,b) = fun x -> if x then a else b
5 let first ab = ab true
6 let second ab = ab false
7 end;;
254
Opening Modules
In order to avoid explicit qualification, all definitions of a module can be made directly
accessible:
1 # open Pairs2;;
2
3 # pair;;
4 - : 'a * 'a -> bool -> 'a = <fun>
5
6 # pair (4,3) true;;
7 - : int = 4
The keyword include allows including the definitions of another module in the
present module ...
255
1 # module A = struct let x = 1 end;;
2 module A : sig val x : int end
3
4 # module B =
5 struct
6 open A
7 let y = 2
8 end;;
9 module B : sig val y : int end
10
11 # module C =
12 struct
13 include A
14 include B
15 end;;
16 module C : sig val x : int val y : int end
256
Nested Modules
Modules may again contain modules:
257
1 ...
2 let first q = Pairs.first (Pairs.first q)
3 let second q = Pairs.second (Pairs.first q)
4 let third q = Pairs.first (Pairs.second q)
5 let fourth q = Pairs.second (Pairs.second q)
6 end
7
8 # Quads.quad (1, 2, 3, 4);;
9 - : (int * int) * (int * int) = ((1, 2), (3, 4))
10
11 # Quads.Pairs.first;;
12 - : 'a * 'b -> 'a = <fun>
258
6.2 Module Types or Signatures
... an Example
259
module Sort = struct
  (* turn every element into a singleton list *)
  let single lst = List.map (fun x -> [x]) lst

  (* merge two sorted lists *)
  let rec merge l1 l2 = match (l1, l2)
    with ([],_) -> l2
       | (_,[]) -> l1
       | (x::xs,y::ys) -> if x <= y then x :: merge xs l2
                          else y :: merge l1 ys

  (* merge neighbouring lists pairwise *)
  let rec merge_lists = function
      [] -> [] | [l] -> [l]
    | l1::l2::ll -> merge l1 l2 :: merge_lists ll

  let sort lst = let lst = single lst
    in let rec doit = function
           [] -> [] | [l] -> l
         | l -> doit (merge_lists l)
       in doit lst
end
260
The implementation allows the auxiliary functions single, merge and
merge_lists to be accessed from the outside:
1 # Sort.single [1;2;3];;
2 - : int list list = [[1]; [2]; [3]]
In order to hide the functions single and merge_lists, we introduce the signature
261
The functions single and merge_lists are no longer exported:
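The signature itself is not reproduced above. A sketch of how such a restriction can look is given below; the names SORT and MySort are assumptions made for this example, and the module Sort is repeated so that the block is self-contained.

```ocaml
module Sort = struct
  let single lst = List.map (fun x -> [x]) lst
  let rec merge l1 l2 = match (l1, l2) with
    | ([], _) -> l2
    | (_, []) -> l1
    | (x :: xs, y :: ys) ->
        if x <= y then x :: merge xs l2 else y :: merge l1 ys
  let rec merge_lists = function
    | [] -> []
    | [l] -> [l]
    | l1 :: l2 :: ll -> merge l1 l2 :: merge_lists ll
  let sort lst =
    let rec doit = function
      | [] -> []
      | [l] -> l
      | l -> doit (merge_lists l)
    in
    doit (single lst)
end

(* Hypothetical signature: only merge and sort are exported. *)
module type SORT = sig
  val merge : 'a list -> 'a list -> 'a list
  val sort : 'a list -> 'a list
end

(* Restricting Sort by the signature hides single and merge_lists. *)
module MySort : SORT = Sort
```

With these definitions, an access such as MySort.single is rejected by the compiler as an unbound value, while Sort.single remains available.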
262
Signatures and Types
The types mentioned in the signature must be instances of the types for the exported
definitions.
In that way, these types are specialized:
263
1 # module A1 : A1 = A;;
2 Signature mismatch:
3 Modules do not match: sig val f : 'a -> 'b -> 'a end
4 is not included in A1
5 Values do not match:
6 val f : 'a -> 'b -> 'a
7 is not included in
8 val f : 'a -> 'b -> 'b
9
10 # module A2 : A2 = A;;
11 module A2 : A2
12
13 # A2.f;;
14 - : int -> char -> int = <fun>
264
6.3 Information Hiding
For reasons of modularity, we often would like to prevent the structure of a
module's exported types from being visible from the outside.
Example
1 module ListQueue = struct
2 type 'a queue = 'a list
3 let empty_queue () = []
4 let is_empty = function
5 [] -> true | _ -> false
6 let enqueue xs y = xs @ [y]
7 let dequeue (x::xs) = (x,xs)
8 end
265
A signature allows hiding the implementation of a queue:
266
1 # module Queue : Queue = ListQueue;;
2 module Queue : Queue
3
4 # open Queue;;
5
6 # is_empty [];;
7 This expression has type 'a list but is here used with type
8 'b queue = 'b Queue.queue
==⇒
The restriction via signature is sufficient to obfuscate the true nature of the type
queue.
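The hiding signature is not reproduced above. The following sketch shows one way to write it; the names QUEUE and AbsQueue are assumptions (chosen to avoid a clash with the standard library module Queue), and dequeue gets an explicit failure case so that the match is exhaustive.

```ocaml
(* Hypothetical signature: the type 'a queue is abstract. *)
module type QUEUE = sig
  type 'a queue
  val empty_queue : unit -> 'a queue
  val is_empty : 'a queue -> bool
  val enqueue : 'a queue -> 'a -> 'a queue
  val dequeue : 'a queue -> 'a * 'a queue
end

module ListQueue = struct
  type 'a queue = 'a list
  let empty_queue () = []
  let is_empty = function [] -> true | _ -> false
  let enqueue xs y = xs @ [y]
  let dequeue = function
    | x :: xs -> (x, xs)
    | [] -> failwith "dequeue: empty queue"
end

(* Outside of AbsQueue, a queue can no longer be used as a list. *)
module AbsQueue : QUEUE = ListQueue

let q = AbsQueue.enqueue (AbsQueue.enqueue (AbsQueue.empty_queue ()) 1) 2
let (front, rest) = AbsQueue.dequeue q
```

An expression such as AbsQueue.is_empty [] is now a type error, because the equation 'a queue = 'a list is not visible through the signature.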
267
If the datatype should be exported together with all constructors, its definition is
repeated in the signature:
268
6.4 Functors
Since (almost) everything in OCaml is higher order, it is no surprise that there are
modules of higher order: Functors.
269
First, we specify the functor’s argument and result by means of signatures:
270
1 ...
2 module Fold : GenFold = functor (X : Decons) ->
3 struct
4 let rec fold_left f b t =
5 match X.decons t with None -> b |
6 Some (x, t) -> fold_left f (f b x) t
7
8 let rec fold_right f t b =
9 match X.decons t with None -> b |
10 Some (x, t) -> f x (fold_right f t b)
11
12 let size t = fold_left (fun a x -> a + 1) 0 t
13
14 let list_of t = fold_right (fun x xs -> x :: xs) t []
15
16 let iter f t = fold_left (fun () x -> f x) () t
17 end
Now, we can apply the functor to the module to obtain a new module ...
271
1 module MyQueue = struct open Queue
2 type 'a t = 'a queue
3
272
1 module FoldAVL = Fold (MyAVL)
2 module FoldQueue = Fold (MyQueue)
Caveat
A module satisfies a signature whenever it implements it !
It is not required to explicitly declare that !!
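The signatures Decons and GenFold are not reproduced above. The sketch below assumes a plausible form of Decons (a type constructor t with a deconstruction function decons) and a reduced version of the functor; the module MyList is a hypothetical argument module.

```ocaml
(* Assumed argument signature: a container that can be taken apart. *)
module type Decons = sig
  type 'a t
  val decons : 'a t -> ('a * 'a t) option
end

(* Reduced functor: only fold_left and size. *)
module Fold (X : Decons) = struct
  let rec fold_left f b t = match X.decons t with
    | None -> b
    | Some (x, t) -> fold_left f (f b x) t
  let size t = fold_left (fun a _ -> a + 1) 0 t
end

(* MyList never mentions Decons, yet it satisfies the signature. *)
module MyList = struct
  type 'a t = 'a list
  let decons = function [] -> None | x :: xs -> Some (x, xs)
end

module FoldList = Fold (MyList)
```

The application Fold (MyList) type-checks although MyList was never declared to implement Decons, which is exactly the point of the caveat.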
273
6.5 Separate Compilation
• In reality, deployed OCaml programs will not run within the interactive shell.
• Instead, there is a compiler ocamlc ...
> ocamlc Test.ml
that interprets the contents of the file Test.ml as a sequence of definitions
of a module Test.
• As a result, the compiler ocamlc generates the files
274
• If there is already a file Test.mli this is interpreted as the signature for
Test. Then we call
> ocamlc Test.mli Test.ml
• Given a module A and a module B, then these should be compiled by
> ocamlc B.mli B.ml A.mli A.ml
• If a re-compilation of B should be omitted, ocamlc may receive a
pre-compiled file
> ocamlc B.cmo A.mli A.ml
• For practical management of required re-compilation after modification of files,
Linux offers the tool make. The script of required actions then is stored in a
Makefile.
• ... alternatively, dune can be used.
275
7 Formal Verification for OCaml
Question
How can we make sure that an OCaml program behaves as it should ???
We require:
• a formal semantics
• means to prove assertions about programs ...
276
7.1 MiniOCaml
We consider ...
• only base types int, bool as well as tuples and lists
• recursive function definitions only at top level
277
This fragment of OCaml is called MiniOCaml.
Expressions in MiniOCaml can be described by the grammar
Short-cut
fun x1 -> …fun xk -> e ≡ fun x1 . . . xk -> e
278
Caveat
• The set of admissible expressions must be further restricted to those which are
well typed, i.e., for which the OCaml compiler infers a type ...
(1, [true; false]) well typed
(1 [true; false]) not well typed
([1; true], false) not well typed
• We also rule out if ... then ... else ... , since it can be simulated by
match ... with true -> ... | false -> ....
• We could also have omitted let ... in ... (why?)
279
A program consists of a sequence of mutually recursive global definitions of variables
f1 , . . . , f m :
let rec f1 = E1
and f2 = E2
...
and fm = Em
280
7.2 A Semantics for MiniOCaml
Question
Which value is returned for the expression E ??
281
A MiniOCaml Program ...
282
Idea
283
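Tuples
$$\frac{e_1 \Rightarrow v_1 \quad \ldots \quad e_k \Rightarrow v_k}{(e_1, \ldots, e_k) \Rightarrow (v_1, \ldots, v_k)}\ (\mathrm{TU})$$
Lists
$$\frac{e_1 \Rightarrow v_1 \quad e_2 \Rightarrow v_2}{e_1 :: e_2 \Rightarrow v_1 :: v_2}\ (\mathrm{LI})$$
Global definitions
$$\frac{f = e \quad e \Rightarrow v}{f \Rightarrow v}\ (\mathrm{GD})$$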
284
Local definitions
$$\frac{e_1 \Rightarrow v_1 \quad e_0[v_1/x] \Rightarrow v_0}{\texttt{let}\ x = e_1\ \texttt{in}\ e_0 \Rightarrow v_0}\ (\mathrm{LD})$$
Function calls
285
By repeated application of the rule for function calls, a rule for functions with multiple
arguments can be derived:
286
Pattern Matching
Built-in operators
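$$\frac{e_1 \Rightarrow v_1 \quad e_2 \Rightarrow v_2 \quad v_1\ \mathit{op}\ v_2 \Rightarrow v}{e_1\ \mathit{op}\ e_2 \Rightarrow v}\ (\mathrm{OP})$$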
Unary operators are treated analogously.
287
The built-in equality operator
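$$v = v \Rightarrow \texttt{true} \qquad v_1 = v_2 \Rightarrow \texttt{false}$$
given that v, v1, v2 are values that do not contain functions, and v1, v2 are
syntactically different.
Example 1
$$\frac{\dfrac{17 \Rightarrow 17 \quad 4 \Rightarrow 4 \quad 17 + 4 \Rightarrow 21}{17 + 4 \Rightarrow 21}\,(\mathrm{OP}) \quad 21 \Rightarrow 21 \quad 21 = 21 \Rightarrow \texttt{true}}{17 + 4 = 21 \Rightarrow \texttt{true}}\,(\mathrm{OP})$$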
288
Leaving out the premises of the form v ⇒ v, the derivation of Example 1 shortens to:
$$\frac{\dfrac{17 + 4 \Rightarrow 21}{17 + 4 \Rightarrow 21}\,(\mathrm{OP}) \quad 21 = 21 \Rightarrow \texttt{true}}{17 + 4 = 21 \Rightarrow \texttt{true}}\,(\mathrm{OP})$$
289
Example 2
290
Example 3
291
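Derivation, read from the conclusion upwards (each indented line is a premise of the line above it):
app (1::[]) (2::[]) ⇒ 1::2::[]                  by (APP') from
  app ⇒ fun x y -> ...                          by (GD) from app = fun x y -> ...
  match 1::[] ... ⇒ 1::2::[]                    by (PM) from
    1 :: app [] (2::[]) ⇒ 1::2::[]              by (LI) from
      app [] (2::[]) ⇒ 2::[]                    by (APP') from
        app ⇒ fun x y -> ...                    by (GD) from app = fun x y -> ...
        match [] ... ⇒ 2::[]                    by (PM) from [] ⇒ [] and 2::[] ⇒ 2::[]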
292
Discussion
• The big-step operational semantics is not well suited for tracking step-by-step
how evaluation by MiniOCaml proceeds.
• It is quite convenient, though, for proving that the evaluation of a function for
particular argument values terminates:
For that, it suffices to prove that there are values to which the corresponding
function calls can be evaluated …
293
Example Claim
Proof
Induction on the length n of the list l1 .
294
n>0: I.e., l1 = h::t.
In particular, we assume that the claim already holds for all shorter lists. Then we have:
app t l2 ⇒ l
295
Discussion (cont.)
• The big-step semantics also allows us to verify that optimizing transformations are
correct, i.e., preserve the semantics.
• Finally, it can be used to prove the correctness of assertions about functional
programs !
• The big-step operational semantics suggests considering expressions as
specifications of values.
• Expressions which evaluate to the same values, should be interchangeable ...
296
Caveat
C :: = const | (C1 , . . . , Ck ) | [] | C1 :: C2
• Apparently, a value of MiniOCaml is comparable if and only if its type does not
contain functions:
297
Discussion
• Apparently, the functions to the right and left of the equality sign cannot be
compared by OCaml for equality.
==⇒
298
Extension of Equality
The equality = of OCaml is extended to expressions which may not terminate, and to
functions.
Non-termination
$$\frac{e_1, e_2\ \text{both not terminating}}{e_1 = e_2}$$
Termination
$$\frac{e_1 \Rightarrow v_1 \quad e_2 \Rightarrow v_2 \quad v_1 = v_2}{e_1 = e_2}$$
299
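Structured values
$$\frac{v_1 = v_1' \quad \ldots \quad v_k = v_k'}{(v_1, \ldots, v_k) = (v_1', \ldots, v_k')} \qquad \frac{v_1 = v_1' \quad v_2 = v_2'}{v_1 :: v_2 = v_1' :: v_2'}$$
Functions
$$\frac{e_1[v/x_1] = e_2[v/x_2]\ \text{for all}\ v}{\texttt{fun}\ x_1\ \texttt{->}\ e_1 = \texttt{fun}\ x_2\ \texttt{->}\ e_2}$$
==⇒ extensional equality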
300
We have:
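$$\frac{e \Rightarrow v}{e = v} \qquad \frac{e_1 = e_2 \quad e_1\ \text{terminates}}{e_1 = e_2 \Rightarrow \texttt{true}} \qquad \frac{e_1 = e_2 \Rightarrow \texttt{true}}{e_1 = e_2 \quad e_i\ \text{terminates}}$$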
301
Substitution Lemma
$$\frac{e_1 = e_2}{e[e_1/x] = e[e_2/x]}$$
302
Discussion
• The lemma tells us that in every context, all occurrences of the expression e1 can
be replaced by the expression e2 — whenever e1 and e2 represent the same
values.
• The lemma can be proven by induction on the depth of the required derivations
(which we omit).
• The exchange of expressions proven equal, allows us to design a calculus for
proving the equivalence of expressions ...
303
We equip ourselves with a repertoire of rewrite rules for reducing the equality of expressions
to the equality of possibly simpler expressions ...
$$\frac{e_1\ \text{terminates}}{\texttt{let}\ x = e_1\ \texttt{in}\ e = e[e_1/x]}$$
304
Proof of the let rule
e1 ⇒ v1
305
Then
e[e1 /x] = e[v1 /x] = v
Because of the big-step semantics, however, we have:
Accordingly,
let x = e1 in e = e[e1 /x]
306
By repeated application of the rule for function calls, an extra rule for functions with
multiple arguments can be deduced:
307
Rule for pattern matching
$$\frac{e_0 = \texttt{[]}}{\texttt{match}\ e_0\ \texttt{with [] ->}\ e_1\ \texttt{|}\ \ldots\ \texttt{|}\ p_m\ \texttt{->}\ e_m = e_1}$$
308
7.3 Equational Proofs for MiniOCaml
Example 1
1 let rec app = fun x -> fun y -> match x
2 with [] -> y
3 | h::t -> h :: app t y
309
Idea: Induction on the length n of x
n=0 Then: x = []
We deduce:
app x [] = app [] []                                       (def x)
         = match [] with [] -> [] | h::t -> h :: app t []  (app)
         = []                                              (match)
         = x                                               (def x)
310
n>0 Then: x = h::t where t has length n − 1.
We deduce:
app x [] = app (h::t) []                                     (def x)
         = match h::t with [] -> [] | h::t -> h :: app t []  (app)
         = h :: app t []                                     (match)
         = h :: t                                            (I.H.)
         = x                                                 (def x)
311
Analogously we proceed for assertion (2) ...
n=0 Then: x = []
We deduce:
app x (app y z) = app [] (app y z)                          (def x)
                = match [] with [] -> app y z | h::t -> ... (app)
                = app y z                                   (match)
                = app (match [] with [] -> y | ...) z       (match)
                = app (app [] y) z                          (app)
                = app (app x y) z                           (def x)
312
n>0 Then x = h::t where t has length n − 1.
We deduce:
app x (app y z) = app (h::t) (app y z)                  (def x)
                = match h::t with [] -> app y z
                  | h::t -> h :: app t (app y z)        (app)
                = h :: app t (app y z)                  (match)
                = h :: app (app t y) z                  (I.H.)
                = app (h :: app t y) z                  (match, app)
                = app (match h::t with [] -> y
                  | h::t -> h :: app t y) z             (match)
                = app (app (h::t) y) z                  (app)
                = app (app x y) z                       (def x)
313
Discussion
• For the correctness of our induction proofs, we require that all occurring function
calls terminate.
• In the example, it suffices to prove that for all x, y, there exists some v
such that:
app x y ⇒ v
314
Example 2
Claim
315
More generally,
app (rev x) y = rev1 x y for all lists x, y.
316
n>0 Then x = h::t where t has length n − 1.
317
Discussion
• Again, we have implicitly assumed that all calls of app, rev and rev1 terminate.
• Termination of these can be proven by induction on the length of their first
arguments.
• The claim:
rev x = rev1 x []
follows from:
app (rev x) y = rev1 x y
by setting: y = [] and assertion (1) from example 1.
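The definitions of rev and rev1 are not reproduced above. The sketch below assumes the usual ones (naive reversal via app, and reversal with an accumulating parameter), which are consistent with the claim; evaluating both sides on sample inputs is only a sanity check and does not replace the induction proof.

```ocaml
let rec app x y = match x with
  | [] -> y
  | h :: t -> h :: app t y

(* Assumed definition: naive reversal, appends the head at the end. *)
let rec rev x = match x with
  | [] -> []
  | h :: t -> app (rev t) [h]

(* Assumed definition: reversal with an accumulating parameter y. *)
let rec rev1 x y = match x with
  | [] -> y
  | h :: t -> rev1 t (h :: y)

(* Both sides of the generalized claim, for concrete arguments. *)
let lhs x y = app (rev x) y
let rhs x y = rev1 x y
```

Setting y = [] in the generalized claim and using app x [] = x then yields rev x = rev1 x [].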
318
Example 3
319
Claim
sorted x ∧ sorted y → sorted (merge x y)
for all lists x, y.
n=0 Then: x = [] = y
We deduce:
sorted (merge x y) = sorted (merge [] [])   (def x,y)
                   = sorted []              (merge, match)
                   = true                   (sorted)
320
n>0
Case 1: x = [].
We deduce:
sorted (merge x y) = sorted (merge [] y)   (def x)
                   = sorted y              (merge, match)
                   = true                  (by assumption)
Case 2: y = [] analogous.
321
Case 3: x = x1::xs ∧ y = y1::ys ∧ x1 ≤ y1.
We deduce:
sorted (merge x y) = sorted (merge (x1::xs) (y1::ys))   (def x,y)
                   = sorted (x1 :: merge xs (y1::ys))   (merge, match, x1 ≤ y1)
                   = sorted (x1 :: merge xs y)          (def y)
                   = ...
Case 3.1: xs = []
We deduce:
... = sorted (x1 :: merge [] y)   (def xs)
    = sorted (x1 :: y)            (merge, match)
    = sorted y                    (sorted, match, x1 ≤ y1)
    = true                        (by assumption)
322
Case 3.2: xs = x2::xs' ∧ x2 ≤ y1.
We deduce:
... = sorted (x1 :: merge (x2::xs') y)    (def xs)
    = sorted (x1 :: x2 :: merge xs' y)    (merge, match, x2 ≤ y1)
    = sorted (x2 :: merge xs' y)          (sorted, match, x1 ≤ x2)
    = sorted (merge xs y)                 (match, merge)
    = true                                (I.H.)
323
Case 3.3: xs = x2::xs' ∧ x2 > y1.
We deduce:
... = sorted (x1 :: merge (x2::xs') (y1::ys))   (def xs, y)
    = sorted (x1 :: y1 :: merge (x2::xs') ys)   (merge, match, y1 < x2)
    = sorted (x1 :: y1 :: merge xs ys)          (def xs)
    = sorted (y1 :: merge xs ys)                (sorted, match, x1 ≤ y1)
    = sorted (merge xs y)                       (match, merge)
    = true                                      (I.H.)
324
Case 4: x = x1::xs ∧ y = y1::ys ∧ x1 > y1.
We deduce:
sorted (merge x y) = sorted (merge (x1::xs) (y1::ys))   (def x,y)
                   = sorted (y1 :: merge (x1::xs) ys)   (merge, match, y1 < x1)
                   = sorted (y1 :: merge x ys)          (def x)
                   = ...
Case 4.1: ys = []
We deduce:
... = sorted (y1 :: merge x [])   (def ys)
    = sorted (y1 :: x)            (merge, match)
    = sorted x                    (sorted, match, y1 < x1)
    = true                        (by assumption)
325
Case 4.2: ys = y2::ys' ∧ x1 > y2.
We deduce:
... = sorted (y1 :: merge x (y2::ys'))    (def ys)
    = sorted (y1 :: y2 :: merge x ys')    (merge, match, y2 < x1)
    = sorted (y2 :: merge x ys')          (sorted, match, y1 ≤ y2)
    = sorted (merge x ys)                 (match, merge)
    = true                                (I.H.)
326
Case 4.3: ys = y2::ys' ∧ x1 ≤ y2.
We deduce:
... = sorted (y1 :: merge (x1::xs) (y2::ys'))   (def x, ys)
    = sorted (y1 :: x1 :: merge xs (y2::ys'))   (merge, match, x1 ≤ y2)
    = sorted (y1 :: x1 :: merge xs ys)          (def ys)
    = sorted (x1 :: merge xs ys)                (sorted, match, y1 < x1)
    = sorted (merge x ys)                       (match, merge)
    = true                                      (I.H.)
327
Discussion
• Again, we have assumed for the proof that all calls of the functions sorted
and merge terminate.
• As an additional technique, we required a thorough case distinction over the
various possibilities for arguments in calls.
• The case distinction made the proof longish and cumbersome.
// The case n=0 is in fact superfluous,
// since it is covered by the cases 1 and 2.
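The definitions of merge and sorted are not reproduced above. The following sketch assumes definitions that match the rewriting steps used in the case distinction (merge compares the two heads, sorted compares the first two elements); running the claim on a few inputs is a plausibility check only.

```ocaml
(* Assumed definition, matching the proof steps labeled "merge, match". *)
let rec merge x y = match (x, y) with
  | ([], _) -> y
  | (_, []) -> x
  | (x1 :: xs, y1 :: ys) ->
      if x1 <= y1 then x1 :: merge xs y else y1 :: merge x ys

(* Assumed definition, matching the proof steps labeled "sorted, match". *)
let rec sorted = function
  | x1 :: x2 :: xs -> x1 <= x2 && sorted (x2 :: xs)
  | _ -> true

(* The claim for concrete arguments: sorted inputs imply a sorted merge. *)
let claim x y = not (sorted x && sorted y) || sorted (merge x y)
```

The cases 3.1 to 4.3 of the proof correspond to the different ways in which the first two elements of merge x y can arise.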
328
8 Parallel Programming
329
When your program requires multiple threads, use
When you want to play with it within utop, use the following sequence of commands:
1 #thread;;
2 #directory "+unix";;
3 #load "unix.cma";;
4 #directory "+threads";;
5 #load "threads.cma";;
330
Example
331
Comments
332
Further useful Functions
• The function join: t -> unit blocks the current thread until the
evaluation of the given thread has terminated.
• The function kill: t -> unit stops a given thread (not implemented);
• The function delay: float -> unit delays the current thread by a time
period in seconds;
• The function exit: unit -> unit terminates the current thread.
333
... running the compiled code yields:
1 > ./a.out
2 Hello Echo!
3 Hello Echo!
4 0
5 >
334
8.1 Channels
335
• Each call new_channel() creates another channel.
• Arbitrary data may be sent across a channel !!!
• always wraps a value into an event.
• Sending and receiving generates events ...
• Synchronization on events returns their values.
• sync (send ch str) exposes the event of sending to the outside world and
blocks the sender, until another thread has read the value from the channel ...
• sync (receive ch) blocks the receiver, until a value has been made available
on the channel. Then this value is returned as the result.
• Synchronous communication is one alternative for exchange of data between
threads as well as for orchestration of concurrency ==⇒ rendezvous
• It can be used to realize asynchronous communication between threads.
337
In the example, main spawns a thread. Then it sends it a string and waits for the
answer. Accordingly, the new thread waits for the transfer of a string value over the
channel. As soon as the string is received, an answer is sent on the same channel.
Caveat
If the ordering of send and receive is not carefully designed, threads easily get
blocked ...
1 > ./a.out
2 main is running ...
3 Greetings!
4 It got it!
5 >
338
Example: A global memory cell
The implementation must take care that the get and put calls are sequentialized.
339
This task is delegated to a server thread that reacts to get and put:
The channel transports requests to the memory cell, which either provide the new
value or the back channel ...
340
1 let get cell = let reply = new_channel () in
2 sync (send cell (Get reply));
3 sync (receive reply)
The function get sends a new back channel on the channel cell. If the latter is
received, it waits for the return value.
The function put sends a Put element which contains the new value for the memory
cell.
341
Of interest now is the implementation of the cell itself:
let new_cell x =
  let cell = new_channel () in
  let rec serve x =
    match sync (receive cell) with
    | Get reply ->
        sync (send reply x);
        serve x
    | Put y -> serve y
  in
  let _ = create serve x in
  cell
342
Creation of the cell with initial value x spawns a server thread that evaluates the call
serve x.
Caveat
The server thread is possibly non-terminating!
This is why it can respond to arbitrarily many requests.
Only because it is tail-recursive, it does not successively consume the whole storage
...
343
1 let main =
2 let cell = new_cell 1 in
3 print_int (get cell);
4 print_string "\n";
5 put cell 2;
6 print_int (get cell);
7 print_string "\n"
1 > ./a.out
2 1
3 2
4 >
Instead of get and put, also more complex query or update operations could be
executed by the cell server ...
344
Example: Locks
Often, only one at a time out of several active threads should be allowed access to a
given resource. In order to realize such a mutual exclusion, locks can be applied:
345
Execution of the operation acquire returns an element of type ack which is
used to return the lock:
For simplicity, ack is chosen itself as the channel by which the lock is returned.
346
The unlock channel is created by acquire itself
let new_lock () =
  let lock = new_channel () in
  let rec acq_server () = rel_server (sync (receive lock))
  and rel_server ack =
    sync (receive ack);
    acq_server ()
  in
  let _ = create acq_server () in
  lock
347
The core of the implementation consists of the two mutually recursive functions acq_server and
rel_server.
acq_server expects an element ack, i.e., a channel, and upon reception, calls
rel_server.
rel_server expects a signal on the received channel indicating that the lock is
released ...
348
1 let dead =
2 let l1 = new_lock () in
3 let l2 = new_lock () in
4 let th (l1, l2) =
5 let a1 = acquire l1 in
6 let _ = delay 1.0 in
7 let a2 = acquire l2 in
8 release a2; release a1;
9 print_int (id (self ())); print_string " finished\n"
10 in
11 let t1 = create th (l1, l2) in
12 let t2 = create th (l2, l1) in
13 join t1
The result is
1 > ./a.out
Occasionally, there is more than one copy of a resource. Then semaphores are the
method of choice ...
350
Idea
Again, a server is realized using an accumulating parameter, now maintaining the
number of free resources or, if zero, the queue of waiting threads ...
Apparently, the queue does not maintain the waiting threads, but only their back
channels.
352
8.2 Selective Communication
is meant to read integers from two channels and send their sum to the third.
353
First Attempt
Disadvantage
If a value arrives at the second input channel first, the thread nonetheless must wait.
354
Second Attempt
355
Idea
1 wrap : 'a event -> ('a -> 'b) -> 'b event
356
The list thus consists of (int*int) events.
The functions
Typically, that event occurs that finds its communication partner first.
357
Further Examples
The function
358
Apparently, the event list may also consist of send events — or contain both kinds.
359
In general, there could be a tree of events:
sync
f3 f4
send c3 x send c4 y
f1 f2
receive c1 receive c2
360
→ The leaves are basic events.
→ A wrapper function may be applied to any given event.
→ Several events of the same type may be combined into a choice.
→ Synchronization on such an event tree activates a single leaf event. The result is
obtained by successively applying the wrapper functions from the path to the
root.
361
Example: A Swap Channel
Upon rendezvous, a swap channel is meant to exchange the values of the two
participating threads. The signature is given by
In the implementation with ordinary channels, every participating thread must offer the
possibility to receive and to send.
362
As soon as a thread has successfully completed its send (i.e., the other thread has successfully
synchronized on a receive event), the second value must be transmitted in the opposite
direction.
Together with the first value, we therefore transmit a channel for the second value:
363
module Swap = struct
  open Thread
  open Event

  type 'a swap = ('a * 'a channel) channel

  let new_swap () = new_channel ()

  let swap ch x =
    let c = new_channel () in
    choose [
      wrap (receive ch) (fun (y, c) -> sync (send c x); y);
      wrap (send ch (x, c)) (fun () -> sync (receive c));
    ]
end
364
Timeouts
365
1 module Timer = struct open Thread open Event
2
3 let set_timer t = let ack = new_channel () in
4 let serve () = delay t; sync (receive ack) in
5 let _ = create serve () in
6 send ack ()
7
366
8.3 Threads and Exceptions
An exception must be handled within the thread where it has been raised.
367
... yields
> ./a.out
Thread 1 killed on uncaught exception Division_by_zero
main terminated regularly ...
Also, uncaught exceptions within the wrapper function terminate the running thread:
368
Then we have
1 > ./a.out
2 Fatal error: exception Division_by_zero
Caveat
Exceptions can only be caught in the body of the wrapper function itself, not behind
the sync !
369
8.4 Buffered Communication
A channel for buffered communication allows sending without blocking. Receiving
may still block if no messages are available. For such channels, we realize a module
Mailbox:
For the implementation, we rely on a server which maintains a queue of sent but not
yet received messages.
370
Then we implement:
371
...
let new_mailbox () = let in_chan = new_channel ()
  and out_chan = new_channel () in
  let rec serve q =
    if is_empty q then serve (enqueue
        (sync (Event.receive in_chan)) q)
    else select [
      wrap (Event.receive in_chan)
        (fun y -> serve (enqueue y q));
      wrap (Event.send out_chan (first q))
        (fun () -> let _, q = dequeue q in
                   serve q);
    ]
  in
  let _ = create serve (new_queue ()) in
  (in_chan, out_chan)
end
... where first : 'a queue -> 'a returns the first element in the queue
without removing it.
8.5 Multicasts
373
The operation new_port generates a fresh port where a message can be received.
The (non-blocking) operation multicast sends to all registered ports.
374
The operation multicast sends the message on channel send_ch. The operation
receive reads from the mailbox of the port.
The multicast channel’s server thread maintains the list of ports:
1 ...
2 let new_mchannel () = let send_ch = new_channel () in
3 let req = new_channel () in
4 let rec serve ports = select [
5 wrap (Event.receive req) (fun p -> serve (p :: ports));
6 wrap (Event.receive send_ch) (fun x ->
7 let _ = create (List.iter (
8 fun p -> M.send p x)) ports in
9 serve ports)
10 ]
11 in
12 let _ = create serve [] in
13 (send_ch, req)
14 ...
375
Note that the server thread must respond both to port requests over the channel req
and to send requests over send_ch.
Caveat
Our implementation supports addition, but not removal of obsolete ports.
For an example run, we use a test expression main:
376
1 ...
2 let main = let mc = new_mchannel () in
3 let thread i = let p = new_port mc in
4 while true do
5 let x = sync (receive p) in
6 print_int i; print_string ": ";
7 print_string (x ^ "\n")
8 done
9 in
10 let _ = create thread 1 in
11 let _ = create thread 2 in
12 let _ = create thread 3 in
13 delay 1.0;
14 multicast mc "Hello!";
15 multicast mc "World!";
16 multicast mc "... the end.";
17 delay 10.0
18 end
377
We obtain
> ./a.out
3: Hello!
2: Hello!
1: Hello!
3: World!
2: World!
1: World!
3: ... the end.
2: ... the end.
1: ... the end.
378
Summary
379
Perspectives
• Beyond the language concepts discussed in the lecture, OCaml has diverse
further concepts, which also enable object oriented programming.
• Moreover, OCaml has elegant means to access functionality of the operating
system, to employ graphical libraries and to communicate with other computers
...
380
9 Datalog: Computing with Relations
381
Discussion
Lecturer:
Name Telephone Email
Esparza 17204 [email protected]
Nipkow 17302 [email protected]
Seidl 18155 [email protected]
382
Module:
Title Room Time
Discrete Structures MI 1 Thu 12:15-13, Fri 10-11:45
Pearls of Informatics III MI 3 Thu 8:30-10
Funct. Programming and Verification MI 1 Tue 16-18
Optimization MI 2 Mon 12-14, Tue 12-14
Student:
Matr.nr. Name Sem.
123456 Hans Dampf 03
007042 Fritz Schluri 11
543345 Anna Blume 03
131175 Effi Briest 05
383
Discussion (cont.)
offers:
Name Title
Esparza Discrete Structures
Nipkow Pearls of Informatics III
Seidl Funct. Programming and Verification
Seidl Optimization
384
attends:
Matr.nr. Title
123456 Funct. Programming and Verification
123456 Optimization
123456 Discrete Structures
543345 Funct. Programming and Verification
543345 Discrete Structures
131175 Optimization
385
Possible Queries
==⇒ Datalog
386
Idea: Table ⇐=⇒ Relation
   R ⊆ U_1 × ... × U_n
where U_i is the set of all possible values for the i-th component. In our example,
there are:
387
Predicates can be defined by enumeration of facts ...
388
Rules can be used to deduce further facts ...
389
The knowledge base consisting of facts and rules now can be queried ...
• Datalog finds all values for Z so that the query can be deduced from the
given facts by means of the rules.
• In our example these are:
   Z = "Hans Dampf"
   Z = "Anna Blume"
   Z = "Effi Briest"
390
Further Queries
Caveat
A query may contain none, one or several variables.
391
An Example Proof
By means of the rule
   has_attendant (X,Y) :- offers (X,Z), attends (M,Z),
                          student (M,Y,_).
we prove from the facts
   offers ("Seidl", "Funct. Programming ...")
   attends (543345, "Funct. Programming ...")
   student (543345, "Anna Blume", 3)
the fact
   has_attendant ("Seidl", "Anna Blume")
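The rule can also be read as a nested loop over the three tables. The following OCaml sketch is only an illustration under assumptions: the tables are encoded as lists of tuples, the matriculation number 007042 is written as the integer 7042, and the query `?- has_attendant ("Seidl", Z).` is a hypothetical query chosen for the example.

```ocaml
(* Tables from the example, encoded as lists of tuples (an assumed encoding). *)
let offers = [
  ("Esparza", "Discrete Structures");
  ("Nipkow", "Pearls of Informatics III");
  ("Seidl", "Funct. Programming and Verification");
  ("Seidl", "Optimization");
]

let attends = [
  (123456, "Funct. Programming and Verification");
  (123456, "Optimization");
  (123456, "Discrete Structures");
  (543345, "Funct. Programming and Verification");
  (543345, "Discrete Structures");
  (131175, "Optimization");
]

(* Matr.nr. 007042 is written as the integer 7042 here. *)
let student = [
  (123456, "Hans Dampf", 3);
  (7042, "Fritz Schluri", 11);
  (543345, "Anna Blume", 3);
  (131175, "Effi Briest", 5);
]

(* has_attendant (X,Y) :- offers (X,Z), attends (M,Z), student (M,Y,_). *)
let has_attendant =
  List.concat_map
    (fun (x, z) ->
      List.concat_map
        (fun (m, z') ->
          if z <> z' then []
          else
            List.filter_map
              (fun (m', y, _) -> if m = m' then Some (x, y) else None)
              student)
        attends)
    offers
  |> List.sort_uniq compare

(* Hypothetical query: ?- has_attendant ("Seidl", Z). *)
let answers =
  List.filter_map
    (fun (x, y) -> if x = "Seidl" then Some y else None)
    has_attendant
```

Each body literal of the rule becomes one loop, and a shared variable such as Z or M becomes an equality test between the corresponding tuple components.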
392
Example 2: A Weblog
[Figure: class diagram with the classes Weblog, Entry (+ ID, + Title, + Contents,
+ Date), Group and Person (+ Account, + Name, − Password), and the associations
edits, contains, has member, owns and trusts]
393
Task: Specification of access rights
394
Specification in Datalog
395
Remark
• All available predicates, and even fresh auxiliary predicates, can be used in the
definition of new predicates.
• Evidently, predicate definitions may be recursive.
• If a person X owns an entry, then all persons trusted by X are also entitled to
modify it.
• If a person Y is entitled to read, then all persons trusted by Y are also entitled
to read.
396
9.1 Answering a Query
Problem
   equals (X,X).
397
Theorem
Assume that W is a finite set of facts and rules with the following properties:
(1) Facts do not contain variables.
(2) Every variable in the head also occurs in the body.
Then the set of provable facts is finite.
Proof Sketch
For every provable fact p(a_1, ..., a_k), it is shown that each constant a_i
already occurs in W.
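The two conditions of the theorem can be checked syntactically. The OCaml sketch below is an illustration only: the types `term`, `atom` and `clause` and the function `admissible` are hypothetical names, and a fact is assumed to be a clause with an empty body. Under that encoding, conditions (1) and (2) collapse into one test.

```ocaml
type term = Var of string | Const of string
type atom = { pred : string; args : term list }
type clause = { head : atom; body : atom list }  (* a fact has body = [] *)

(* Names of the variables occurring in an atom. *)
let vars_of (a : atom) : string list =
  List.fold_right
    (fun t acc -> match t with Var x -> x :: acc | Const _ -> acc)
    a.args []

(* Every variable of the head must occur in the body.
   For a fact the body is empty, so the head must be variable-free. *)
let admissible (c : clause) : bool =
  let body_vars = List.concat (List.map vars_of c.body) in
  List.for_all (fun x -> List.mem x body_vars) (vars_of c.head)

(* equals (X,X).  violates condition (1). *)
let equals_fact =
  { head = { pred = "equals"; args = [Var "X"; Var "X"] }; body = [] }

(* edge (a,b). *)
let edge_fact =
  { head = { pred = "edge"; args = [Const "a"; Const "b"] }; body = [] }

(* t (X,Y) :- edge (X,Z), t (Z,Y). *)
let t_rule =
  { head = { pred = "t"; args = [Var "X"; Var "Y"] };
    body = [ { pred = "edge"; args = [Var "X"; Var "Z"] };
             { pred = "t"; args = [Var "Z"; Var "Y"] } ] }
```

Here `admissible equals_fact` is false, whereas `edge_fact` and `t_rule` pass the check.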
398
Calculation of All Provable Facts
Successively compute the sets R^(i) of all facts having proofs of depth at most i
...
   F(M) = { h[a/X] | ∃ (h :- l_1, ..., l_k.) ∈ W :
                     l_1[a/X], ..., l_k[a/X] ∈ M }
   // [a/X] a substitution of the variables X
   // k can be equal to 0.
399
We have: R^(i) = F^i(∅) ⊆ F^(i+1)(∅) = R^(i+1)
Example
   edge (a,b).
   edge (a,c).
   edge (b,d).
   edge (d,a).
   t (X,Y) :- edge (X,Y).
   t (X,Y) :- edge (X,Z), t (Z,Y).
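The successive computation can be tried out on this example. The OCaml sketch below makes simplifying assumptions: relations are sorted lists of pairs, the relation edge is kept fixed, and only the approximations of t are iterated, so the round numbers need not coincide with the indices of the sets R^(i). The names `step` and `fixpoint` are hypothetical.

```ocaml
let edge = [ ("a", "b"); ("a", "c"); ("b", "d"); ("d", "a") ]

(* One application of both rules for t to the current approximation:
     t (X,Y) :- edge (X,Y).
     t (X,Y) :- edge (X,Z), t (Z,Y). *)
let step t =
  let via_second_rule =
    List.concat_map
      (fun (x, z) ->
        List.filter_map
          (fun (z', y) -> if z = z' then Some (x, y) else None)
          t)
      edge
  in
  List.sort_uniq compare (edge @ via_second_rule)

(* Iterate, starting from the empty relation, until nothing changes. *)
let rec fixpoint t =
  let t' = step t in
  if t' = t then t else fixpoint t'

let t = fixpoint []
```

The iteration terminates because every round can only add pairs over the four constants a, b, c, d. Note that every round recomputes all pairs found in earlier rounds.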
400
Relation edge (row: first argument, column: second argument):
        a   b   c   d
    a       x   x
    b               x
    c
    d   x
401
[Tables: the relations t^(0) and t^(1) over a, b, c, d]
402
[Tables: the relations t^(2) and t^(3) over a, b, c, d]
403
Discussion
• Our considerations are strong enough to calculate all facts implied by a Datalog
program.
• From that, the set of answer substitutions can be extracted.
• The naive approach, however, is hopelessly inefficient.
• Smarter approaches try to avoid calculating the same facts over and over
again ...
• In particular, only those facts need be proven which are useful for answering the
query ==⇒ compiler construction, databases
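One well-known way to avoid recomputing the same facts is semi-naive evaluation: in each round, the recursive rule is applied only to the facts that were new in the previous round. The OCaml sketch below applies this idea to the edge/t example. It is one possible approach under the assumption that relations are lists of pairs, not necessarily the method the lecture has in mind, and the names `extend` and `semi_naive` are hypothetical.

```ocaml
let edge = [ ("a", "b"); ("a", "c"); ("b", "d"); ("d", "a") ]

(* Facts derivable by  t (X,Y) :- edge (X,Z), t (Z,Y).
   when the t-literal is matched against the facts in delta only. *)
let extend delta =
  List.concat_map
    (fun (x, z) ->
      List.filter_map
        (fun (z', y) -> if z = z' then Some (x, y) else None)
        delta)
    edge

(* known: all t-facts found so far; delta: those found in the last round. *)
let rec semi_naive known delta =
  let fresh =
    extend delta
    |> List.filter (fun f -> not (List.mem f known))
    |> List.sort_uniq compare
  in
  if fresh = [] then List.sort_uniq compare known
  else semi_naive (fresh @ known) fresh

(* The first rule  t (X,Y) :- edge (X,Y).  provides the initial facts. *)
let t = semi_naive edge edge
```

This works here because the recursive rule contains only one t-literal. The result is the same relation as with the naive iteration, but each pair is derived from new facts only.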
404
9.2 Operations on Relations
405
1. Union
406
... in Datalog:
   r(X1, ..., Xk) :- s1(X1, ..., Xk).
   r(X1, ..., Xk) :- s2(X1, ..., Xk).
Example
407
2. Intersection
408
... in Datalog:
   r(X1, ..., Xk) :- s1(X1, ..., Xk),
                     s2(X1, ..., Xk).
Example
409
3. Relative Complement
410
... in Datalog:
Example
411
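Caveat
The query
   p("Hello!").
   ?- not (p(X)).
has infinitely many answers, since X is not restricted by any non-negated literal.
If X is first bound by a non-negated literal, the answer is finite:
   p("Hello!").
   q("Damn ...").
   ?- q(X), not (p(X)).
   X = "Damn ..."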
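Union, intersection and relative complement can be mirrored directly on finite relations. The OCaml sketch below assumes that a relation is a list of tuples without a fixed order; the function names `union`, `inter` and `diff` are hypothetical. The sample relations are taken from the attends table and from the query `?- q(X), not (p(X)).`

```ocaml
(* r(X1,...,Xk) :- s1(X1,...,Xk).   r(X1,...,Xk) :- s2(X1,...,Xk). *)
let union s1 s2 = List.sort_uniq compare (s1 @ s2)

(* r(X1,...,Xk) :- s1(X1,...,Xk), s2(X1,...,Xk). *)
let inter s1 s2 =
  List.filter (fun t -> List.mem t s2) s1 |> List.sort_uniq compare

(* Relative complement: the tuples of s1 that do not occur in s2. *)
let diff s1 s2 =
  List.filter (fun t -> not (List.mem t s2)) s1 |> List.sort_uniq compare

(* Matriculation numbers attending the two modules, read off the attends table. *)
let fpv = [123456; 543345]
let optimization = [123456; 131175]

let either = union fpv optimization      (* attends at least one of the two *)
let both = inter fpv optimization        (* attends both *)
let only_fpv = diff fpv optimization     (* attends the first, not the second *)

(* The query  ?- q(X), not (p(X)).  as a relative complement. *)
let p = ["Hello!"]
let q = ["Damn ..."]
let answer = diff q p
```

The complement is always taken relative to a finite relation such as q, which is why the result stays finite.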
412
Caveat (cont.)
413
4. Cartesian Product
   S_1 × S_2 = { (a_1, ..., a_k, b_1, ..., b_m) | (a_1, ..., a_k) ∈ S_1,
                                                  (b_1, ..., b_m) ∈ S_2 }
... in Datalog:
414
[Figure: Cartesian product of relations over a, b, c, d]
415
Example
Comments
416
5. Projection
... in Datalog:
417
[Figure: projection π_1 of a relation over a, b, c, d]
418
[Figure: projection π_{1,1} of a relation over a, b, c, d]
419
6. Join
   S_1 ⋈ S_2 = { (a_1, ..., a_k, b_1, ..., b_m) | (a_1, ..., a_{k+1}) ∈ S_1,
                                                  (b_1, ..., b_m) ∈ S_2,
                                                  a_{k+1} = b_1 }
... in Datalog:
420
Discussion
   S_1 ⋈ S_2 = π_{1,...,k,k+2,...,k+1+m} (
                  (S_1 × S_2) ∩
                  (U^k × π_{1,1}(U) × U^(m−1)) )
The presented operations on relations form the basis of Relational Algebra ...
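The decomposition of the join into product, intersection and projection can be checked on a small instance. The OCaml sketch below is an illustration under assumptions: tuples are lists of strings, relations are lists of tuples, the intersection with U^k × π_{1,1}(U) × U^(m−1) is realised as a filter that compares components k+1 and k+2, and all function names are hypothetical. The sample data is a part of the offers table and a part of the attends table with its columns swapped.

```ocaml
type tuple = string list
type relation = tuple list

(* The first n components of a tuple. *)
let rec take n = function
  | x :: xs when n > 0 -> x :: take (n - 1) xs
  | _ -> []

let product (s1 : relation) (s2 : relation) : relation =
  List.concat_map (fun a -> List.map (fun b -> a @ b) s2) s1

(* Projection onto the 1-based component indices idxs. *)
let project (idxs : int list) (r : relation) : relation =
  List.map (fun t -> List.map (fun i -> List.nth t (i - 1)) idxs) r
  |> List.sort_uniq compare

(* Direct join: tuples of s1 have k+1 components, the last one must
   agree with the first component of the tuple from s2. *)
let join (k : int) (s1 : relation) (s2 : relation) : relation =
  List.concat_map
    (fun a ->
      List.filter_map
        (fun b ->
          if List.nth a k = List.hd b then Some (take k a @ b) else None)
        s2)
    s1
  |> List.sort_uniq compare

(* The same join via product, selection and projection; m is the arity of s2. *)
let join_via_algebra k m s1 s2 =
  let range lo hi = List.init (max 0 (hi - lo + 1)) (fun i -> lo + i) in
  product s1 s2
  |> List.filter (fun t -> List.nth t k = List.nth t (k + 1))
  |> project (range 1 k @ range (k + 2) (k + 1 + m))

let offers = [ ["Seidl"; "Optimization"]; ["Esparza"; "Discrete Structures"] ]
let attended =
  [ ["Optimization"; "131175"];
    ["Discrete Structures"; "543345"];
    ["Optimization"; "123456"] ]

let direct = join 1 offers attended
let composed = join_via_algebra 1 2 offers attended
```

Both `direct` and `composed` contain the same three tuples of the form (Name, Title, Matr.nr.). The direct version never builds the full product, which matters once the relations become large.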
421
Background
422
Example
423
Perspective
• Besides a query language, a realistic database language must also offer the
possibility for insertion / modification / deletion.
• The implementation of a database must be able to handle not just toy
applications like our examples, but also gigantic amounts of data !!!
• It must be able to reliably execute multiple concurrent transactions without
messing up individual tasks.
• A database should also be able to survive a power failure
==⇒ Database Lecture
424