Unit III Relational Algebra and
Relational Calculus
[Figure: Schema Diagram for the Banking Enterprise]
Query Languages
• Language in which user requests information from the
database.
• Categories of languages
– procedural
– non-procedural
• “Pure” languages:
– Relational Algebra
– Tuple Relational Calculus
– Domain Relational Calculus
• Pure languages form underlying basis of query
languages that people use.
Relational Algebra
• Procedural language
• Six basic operators
– select
– project
– union
– set difference
– Cartesian product
– rename
• The operators take one or more relations as
inputs and give a new relation as a result.
Select Operation – Example
• Relation r:
    A   B   C    D
    α   α   1    7
    α   β   5    7
    β   β   12   3
    β   β   23   10
• σ_{A=B ∧ D>5}(r):
    A   B   C    D
    α   α   1    7
    β   β   23   10
Select Operation
• Notation: σ_p(r)
• p is called the selection predicate
• Defined as:
σ_p(r) = {t | t ∈ r and p(t)}
where p is a formula in propositional calculus
consisting of terms connected by: ∧ (and), ∨ (or),
¬ (not)
Each term is one of:
<attribute> op <attribute> or
<attribute> op <constant>
where op is one of: =, ≠, >, ≥, <, ≤
• Example of selection:
σ_{branch-name = “Perryridge”}(account)
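The definition of select can be sketched in Python. This is a minimal illustration, not a database implementation: a relation is modelled as a list of dicts, the helper name select is an assumption, and the sample rows are hypothetical.

```python
# A relation is modelled as a list of dicts (attribute name -> value).
# The rows below are hypothetical sample data.
account = [
    {"account-number": "A-101", "branch-name": "Downtown", "balance": 500},
    {"account-number": "A-102", "branch-name": "Perryridge", "balance": 400},
    {"account-number": "A-201", "branch-name": "Perryridge", "balance": 900},
]

def select(predicate, relation):
    """Return the tuples t of `relation` for which predicate(t) is true."""
    return [t for t in relation if predicate(t)]

# sigma_{branch-name = "Perryridge"}(account)
perryridge = select(lambda t: t["branch-name"] == "Perryridge", account)
```

The predicate is an ordinary function of one tuple, which mirrors the role of p in the definition.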
Project Operation – Example
• Relation r:
    A   B    C
    α   10   1
    α   20   1
    β   30   1
    β   40   2
• π_{A,C}(r), before and after duplicate elimination:
    A   C         A   C
    α   1         α   1
    α   1    =    β   1
    β   1         β   2
    β   2
Project Operation
• Notation:
π_{A1, A2, …, Ak}(r)
where A1, A2, …, Ak are attribute names and r is a relation
name.
• The result is defined as the relation of k columns
obtained by erasing the columns that are not listed
• Duplicate rows are removed from the result, since relations
are sets
• E.g., to eliminate the branch-name attribute of account:
π_{account-number, balance}(account)
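Projection with duplicate elimination can be sketched as follows. The helper name project and the sample values "x" and "y" are assumptions for illustration; a relation is again a list of dicts.

```python
def project(attrs, relation):
    """Keep only the attributes in `attrs`, dropping duplicate rows."""
    seen = set()
    result = []
    for t in relation:
        row = tuple(t[a] for a in attrs)
        if row not in seen:  # relations are sets, so emit each row once
            seen.add(row)
            result.append(dict(zip(attrs, row)))
    return result

# Hypothetical relation r(A, B, C)
r = [
    {"A": "x", "B": 10, "C": 1},
    {"A": "x", "B": 20, "C": 1},
    {"A": "y", "B": 30, "C": 1},
    {"A": "y", "B": 40, "C": 2},
]
ac = project(["A", "C"], r)  # four input rows collapse to three
```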
Union Operation – Example
• Relations r, s:
    r:  A   B        s:  A   B
        α   1            α   2
        α   2            β   3
        β   1
• r ∪ s:
    A   B
    α   1
    α   2
    β   1
    β   3
Union Operation
• Notation: r ∪ s
• Defined as:
r ∪ s = {t | t ∈ r or t ∈ s}
• For r ∪ s to be valid:
1. r, s must have the same arity (same number of attributes)
2. The attribute domains must be compatible (e.g., 2nd column
of r deals with the same type of values as does the 2nd
column of s)
• E.g. to find all customers with either an account or a loan
π_{customer-name}(depositor) ∪ π_{customer-name}(borrower)
Set Difference Operation – Example
• Relations r, s:
    r:  A   B        s:  A   B
        α   1            α   2
        α   2            β   3
        β   1
• r – s:
    A   B
    α   1
    β   1
Set Difference Operation
• Notation r – s
• Defined as:
r – s = {t | t ∈ r and t ∉ s}
• Set differences must be taken between
compatible relations.
– r and s must have the same arity
– attribute domains of r and s must be compatible
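Union and set difference map directly onto Python set operations. The sketch below assumes relations are sets of tuples and checks only the arity condition; domain compatibility is not checked. The helper names and the values "a" and "b" are hypothetical.

```python
def check_compatible(r, s):
    """Raise ValueError unless all tuples of r and s have the same arity."""
    arities = {len(t) for t in r | s}
    if len(arities) > 1:
        raise ValueError("relations must have the same arity")

def union(r, s):
    check_compatible(r, s)
    return r | s

def difference(r, s):
    check_compatible(r, s)
    return r - s

r = {("a", 1), ("a", 2), ("b", 1)}
s = {("a", 2), ("b", 3)}
r_union_s = union(r, s)
r_minus_s = difference(r, s)
```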
Cartesian-Product Operation – Example
• Relations r, s:
    r:  A   B        s:  C   D    E
        α   1            α   10   a
        β   2            β   10   a
                         β   20   b
                         γ   10   b
• r × s:
    A   B   C   D    E
    α   1   α   10   a
    α   1   β   10   a
    α   1   β   20   b
    α   1   γ   10   b
    β   2   α   10   a
    β   2   β   10   a
    β   2   β   20   b
    β   2   γ   10   b
Cartesian-Product Operation
• Notation: r × s
• Defined as:
r × s = {t q | t ∈ r and q ∈ s}
• Assume that attributes of r(R) and s(S) are
disjoint. (That is,
R ∩ S = ∅).
• If attributes of r(R) and s(S) are not disjoint,
then renaming must be used.
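A minimal sketch of the Cartesian product, assuming relations are lists of dicts. The function name product and the sample values are hypothetical; the disjointness requirement is enforced by raising an error when attribute names overlap.

```python
def product(r, s):
    """Pair every tuple of r with every tuple of s.

    Attribute names must be disjoint; otherwise rename one relation first.
    """
    result = []
    for t in r:
        for q in s:
            if set(t) & set(q):
                raise ValueError("attribute names overlap; rename first")
            result.append({**t, **q})
    return result

r = [{"A": "x", "B": 1}, {"A": "y", "B": 2}]
s = [
    {"C": "x", "D": 10, "E": "a"},
    {"C": "y", "D": 10, "E": "a"},
    {"C": "y", "D": 20, "E": "b"},
    {"C": "z", "D": 10, "E": "b"},
]
rs = product(r, s)  # 2 * 4 = 8 tuples
```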
Composition of Operations
• Can build expressions using multiple operations
• Example: σ_{A=C}(r × s)
• r × s:
    A   B   C   D    E
    α   1   α   10   a
    α   1   β   10   a
    α   1   β   20   b
    α   1   γ   10   b
    β   2   α   10   a
    β   2   β   10   a
    β   2   β   20   b
    β   2   γ   10   b
• σ_{A=C}(r × s):
    A   B   C   D    E
    α   1   α   10   a
    β   2   β   10   a
    β   2   β   20   b
Rename Operation
• Allows us to name, and therefore to refer to, the
results of relational-algebra expressions.
• Allows us to refer to a relation by more than one name.
Example:
ρ_x(E)
returns the expression E under the name x
If a relational-algebra expression E has arity n, then
ρ_{x(A1, A2, …, An)}(E)
returns the result of expression E under the name x, and
with the attributes renamed to A1, A2, …, An.
Banking Example
branch (branch-name, branch-city, assets)
customer (customer-name, customer-street,
customer-city)
account (account-number, branch-name,
balance)
loan (loan-number, branch-name, amount)
depositor (customer-name, account-number)
borrower (customer-name, loan-number)
Example Queries
• Find all loans of over $1200
σ_{amount > 1200}(loan)
• Find the loan number for each loan of an amount greater than
$1200
π_{loan-number}(σ_{amount > 1200}(loan))
Example Queries
• Find the names of all customers who have a loan, an
account, or both, from the bank
π_{customer-name}(borrower) ∪ π_{customer-name}(depositor)
• Find the names of all customers who have a loan and an
account at the bank.
π_{customer-name}(borrower) ∩ π_{customer-name}(depositor)
Example Queries
• Find the names of all customers who have a loan at
the Perryridge branch.
π_{customer-name}(σ_{branch-name = “Perryridge”}
(σ_{borrower.loan-number = loan.loan-number}(borrower × loan)))
• Find the names of all customers who have a loan at the
Perryridge branch but do not have an account at any branch of
the bank.
π_{customer-name}(σ_{branch-name = “Perryridge”}
(σ_{borrower.loan-number = loan.loan-number}(borrower × loan))) –
π_{customer-name}(depositor)
Example Queries
• Find the names of all customers who have a loan at the
Perryridge branch.
Query 1
π_{customer-name}(σ_{branch-name = “Perryridge”}(
σ_{borrower.loan-number = loan.loan-number}(borrower × loan)))
Query 2
π_{customer-name}(σ_{loan.loan-number = borrower.loan-number}(
(σ_{branch-name = “Perryridge”}(loan)) × borrower))
Example Queries
• Find the largest account balance
• Rename account relation as d
• The query is:
π_{balance}(account) – π_{account.balance}
(σ_{account.balance < d.balance}(account × ρ_d(account)))
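The largest-balance query works by first collecting every balance that is smaller than some other balance, then subtracting those from the set of all balances. A sketch under the assumption that account is a list of (account-number, branch-name, balance) tuples with hypothetical rows:

```python
account = [
    ("A-101", "Downtown", 500),
    ("A-102", "Perryridge", 400),
    ("A-201", "Perryridge", 900),
]
d = account  # rho_d(account): a second name for the same relation

# pi_{account.balance}(sigma_{account.balance < d.balance}(account x d))
not_largest = {a[2] for a in account for b in d if a[2] < b[2]}

# pi_{balance}(account) minus the balances that lose some comparison
all_balances = {a[2] for a in account}
largest = all_balances - not_largest
```

Only the maximum survives the subtraction, because it is the one balance that is never smaller than another.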
Formal Definition
• A basic expression in the relational algebra consists of
either one of the following:
– A relation in the database
– A constant relation
• Let E1 and E2 be relational-algebra expressions; the
following are all relational-algebra expressions:
– E1 ∪ E2
– E1 – E2
– E1 × E2
– σ_P(E1), P is a predicate on attributes in E1
– π_S(E1), S is a list consisting of some of the attributes in E1
– ρ_x(E1), x is the new name for the result of E1
Additional Operations
We define additional operations that do not add any
power to the
relational algebra, but that simplify common queries.
• Set intersection
• Natural join
• Division
• Assignment
Set-Intersection Operation
• Notation: r ∩ s
• Defined as:
• r ∩ s = { t | t ∈ r and t ∈ s }
• Assume:
– r, s have the same arity
– attributes of r and s are compatible
• Note: r ∩ s = r – (r – s)
Set-Intersection Operation - Example
• Relations r, s:
    r:  A   B        s:  A   B
        α   1            α   2
        α   2            β   3
        β   1
• r ∩ s:
    A   B
    α   2
Natural-Join Operation
• Notation: r ⋈ s
• Let r and s be relations on schemas R and S respectively.
Then, r ⋈ s is a relation on schema R ∪ S obtained as
follows:
– Consider each pair of tuples tr from r and ts from s.
– If tr and ts have the same value on each of the attributes in
R ∩ S, add a tuple t to the result, where
• t has the same value as tr on the attributes of R
• t has the same value as ts on the attributes of S
• Example:
R = (A, B, C, D)
S = (E, B, D)
– Result schema = (A, B, C, D, E)
– r ⋈ s is defined as:
π_{r.A, r.B, r.C, r.D, s.E}(σ_{r.B = s.B ∧ r.D = s.D}(r × s))
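The pairwise procedure described above can be sketched directly. Relations are assumed to be lists of dicts, the function name natural_join is hypothetical, and the sample values are made up.

```python
def natural_join(r, s):
    """Combine tuples of r and s that agree on all shared attribute names."""
    result = []
    for tr in r:
        for ts in s:
            common = set(tr) & set(ts)  # attributes in both schemas
            if all(tr[a] == ts[a] for a in common):
                merged = {**tr, **ts}
                if merged not in result:  # keep set semantics
                    result.append(merged)
    return result

r = [
    {"A": "p", "B": 1, "C": "p", "D": "a"},
    {"A": "q", "B": 2, "C": "r", "D": "a"},
    {"A": "s", "B": 2, "C": "q", "D": "b"},
]
s = [
    {"B": 1, "D": "a", "E": "p"},
    {"B": 3, "D": "a", "E": "q"},
    {"B": 2, "D": "b", "E": "s"},
]
joined = natural_join(r, s)
```

When the two schemas share no attribute names, the condition is vacuously true and the function degenerates to a Cartesian product.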
Natural Join Operation – Example
• Relations r, s:
    r:  A   B   C   D        s:  B   D   E
        α   1   α   a            1   a   α
        β   2   γ   a            3   a   β
        γ   4   β   b            1   a   γ
        α   1   γ   a            2   b   δ
        δ   2   β   b            3   b   ε
• r ⋈ s:
    A   B   C   D   E
    α   1   α   a   α
    α   1   α   a   γ
    α   1   γ   a   α
    α   1   γ   a   γ
    δ   2   β   b   δ
Division Operation
• Notation: r ÷ s
• Suited to queries that include the phrase “for all”.
• Let r and s be relations on schemas R and S
respectively where
– R = (A1, …, Am, B1, …, Bn)
– S = (B1, …, Bn)
The result of r ÷ s is a relation on schema
R – S = (A1, …, Am)
r ÷ s = { t | t ∈ π_{R-S}(r) ∧ ∀ u ∈ s (tu ∈ r) }
Division Operation – Example
Relations r, s:
    r:  A   B        s:  B
        α   1            1
        α   2            2
        α   3
        β   1
        γ   1
        δ   1
        δ   3
        δ   4
        ε   6
        ε   1
        β   2
r ÷ s:
    A
    α
    β
Another Division Example
Relations r, s:
    r:  A   B   C   D   E        s:  D   E
        α   a   α   a   1            a   1
        α   a   γ   a   1            b   1
        α   a   γ   b   1
        β   a   γ   a   1
        β   a   γ   b   3
        γ   a   γ   a   1
        γ   a   γ   b   1
        γ   a   β   b   1
r ÷ s:
    A   B   C
    α   a   γ
    γ   a   γ
Division Operation (Cont.)
• Property
– Let q = r ÷ s
– Then q is the largest relation satisfying q × s ⊆ r
• Definition in terms of the basic algebra operation
Let r(R) and s(S) be relations, and let S ⊆ R
r ÷ s = π_{R-S}(r) – π_{R-S}((π_{R-S}(r) × s) – π_{R-S,S}(r))
To see why
– π_{R-S,S}(r) simply reorders attributes of r
– π_{R-S}((π_{R-S}(r) × s) – π_{R-S,S}(r)) gives those tuples t in
π_{R-S}(r) such that for some tuple u ∈ s, tu ∉ r.
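The definition in terms of basic operations can be traced step by step in code. This sketch assumes the simplest case, where r is a set of (a, b) pairs and s is a set of b values; the function name divide and the sample data are hypothetical.

```python
def divide(r, s):
    """Return r ÷ s for r a set of (a, b) pairs and s a set of b values."""
    candidates = {a for a, _ in r}                       # pi_{R-S}(r)
    all_pairs = {(a, b) for a in candidates for b in s}  # pi_{R-S}(r) x s
    missing = {a for a, _ in all_pairs - r}              # a lacks some b in s
    return candidates - missing

r = {("a", 1), ("a", 2), ("a", 3), ("b", 1), ("c", 1), ("b", 2)}
s = {1, 2}
q = divide(r, s)

# Property check: q x s is contained in r
q_times_s = {(a, b) for a in q for b in s}
```

Here "c" is dropped because the pair ("c", 2) is not in r, while "a" and "b" are paired with every value in s.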
Assignment Operation
The assignment operation (←) provides a convenient way to express
complex queries.
– Write query as a sequential program consisting of
• a series of assignments
• followed by an expression whose value is displayed as a result of the
query.
– Assignment must always be made to a temporary relation variable.
• Example: Write r ÷ s as
temp1 ← π_{R-S}(r)
temp2 ← π_{R-S}((temp1 × s) – π_{R-S,S}(r))
result = temp1 – temp2
– The result to the right of the ← is assigned to the relation variable on the left
of the ←.
– May use variable in subsequent expressions.
Example Queries
• Find all customers who have an account from at least
the “Downtown” and the “Uptown” branches.
Query 1
π_{CN}(σ_{BN = “Downtown”}(depositor ⋈ account)) ∩
π_{CN}(σ_{BN = “Uptown”}(depositor ⋈ account))
where CN denotes customer-name and BN denotes
branch-name.
Query 2
π_{customer-name, branch-name}(depositor ⋈ account)
÷ ρ_{temp(branch-name)}({(“Downtown”), (“Uptown”)})
Example Queries
• Find all customers who have an account at all
branches located in Brooklyn city.
π_{customer-name, branch-name}(depositor ⋈ account)
÷ π_{branch-name}(σ_{branch-city = “Brooklyn”}(branch))
Extended Relational-Algebra Operations
• Generalized Projection
• Outer Join
• Aggregate Functions
Generalized Projection
• Extends the projection operation by allowing
arithmetic functions to be used in the projection list.
π_{F1, F2, …, Fn}(E)
• E is any relational-algebra expression
• Each of F1, F2, …, Fn is an arithmetic expression
involving constants and attributes in the schema of E.
• Given relation credit-info(customer-name, limit, credit-balance),
find how much more each person can spend:
π_{customer-name, limit – credit-balance}(credit-info)
Aggregate Functions and Operations
• Aggregation function takes a collection of values and returns a
single value as a result.
avg: average value
min: minimum value
max: maximum value
sum: sum of values
count: number of values
• Aggregate operation in relational algebra
G1, G2, …, Gn g F1( A1), F2( A2),…, Fn( An) (E)
– E is any relational-algebra expression
– G1, G2 …, Gn is a list of attributes on which to group (can be empty)
– Each Fi is an aggregate function
– Each Ai is an attribute name
Aggregate Operation – Example
• Relation r:
    A   B   C
    α   α   7
    α   β   7
    β   β   3
    β   β   10
• g sum(C) (r):
    sum-C
    27
Aggregate Operation – Example
• Relation account grouped by branch-name:
branch-name account-number balance
Perryridge A-102 400
Perryridge A-201 900
Brighton A-217 750
Brighton A-215 750
Redwood A-222 700
branch-name g sum(balance) (account)
branch-name balance
Perryridge 1300
Brighton 1500
Redwood 700
Aggregate Functions (Cont.)
• Result of aggregation does not have a name
– Can use rename operation to give it a name
– For convenience, we permit renaming as part of
aggregate operation
branch-name g sum(balance) as sum-balance (account)
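The grouped sum over account can be reproduced with a small sketch. It uses the account rows from the example above as (branch-name, account-number, balance) tuples; the helper name group_sum is an assumption.

```python
from collections import defaultdict

account = [
    ("Perryridge", "A-102", 400),
    ("Perryridge", "A-201", 900),
    ("Brighton", "A-217", 750),
    ("Brighton", "A-215", 750),
    ("Redwood", "A-222", 700),
]

def group_sum(rows, key_index, value_index):
    """Group rows on one column and sum another column within each group."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key_index]] += row[value_index]
    return dict(totals)

# branch-name g sum(balance) as sum-balance (account)
sum_balance = group_sum(account, 0, 2)
```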
Outer Join
• An extension of the join operation that avoids
loss of information.
• Computes the join and then adds tuples from one
relation that do not match tuples in the other
relation to the result of the join.
• Uses null values:
– null signifies that the value is unknown or does not
exist
– All comparisons involving null are (roughly speaking)
false by definition.
• Will study precise meaning of comparisons with nulls later
Outer Join – Example
• Relation loan
loan-number branch-name amount
L-170 Downtown 3000
L-230 Redwood 4000
L-260 Perryridge 1700
Relation borrower
customer-name loan-number
Jones L-170
Smith L-230
Hayes L-155
Outer Join – Example
• Inner Join
loan ⋈ borrower
loan-number branch-name amount customer-name
L-170 Downtown 3000 Jones
L-230 Redwood 4000 Smith
Left Outer Join
loan ⟕ borrower
loan-number branch-name amount customer-name
L-170 Downtown 3000 Jones
L-230 Redwood 4000 Smith
L-260 Perryridge 1700 null
Outer Join – Example
• Right Outer Join
loan ⟖ borrower
loan-number branch-name amount customer-name
L-170 Downtown 3000 Jones
L-230 Redwood 4000 Smith
L-155 null null Hayes
Full Outer Join
loan ⟗ borrower
loan-number branch-name amount customer-name
L-170 Downtown 3000 Jones
L-230 Redwood 4000 Smith
L-260 Perryridge 1700 null
L-155 null null Hayes
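The four join variants above can be sketched with the same loan and borrower rows, using None to stand for null. The function names are hypothetical, and the joins are hard-coded to the loan-number attribute for brevity.

```python
loan = [
    ("L-170", "Downtown", 3000),
    ("L-230", "Redwood", 4000),
    ("L-260", "Perryridge", 1700),
]
borrower = [("Jones", "L-170"), ("Smith", "L-230"), ("Hayes", "L-155")]

def inner_join(loan, borrower):
    return [(ln, bn, amt, cn)
            for ln, bn, amt in loan
            for cn, bln in borrower if ln == bln]

def left_outer_join(loan, borrower):
    matched = {bln for _, bln in borrower}
    # Unmatched loan tuples are padded with null for customer-name.
    padded = [(ln, bn, amt, None) for ln, bn, amt in loan if ln not in matched]
    return inner_join(loan, borrower) + padded

def right_outer_join(loan, borrower):
    matched = {ln for ln, _, _ in loan}
    # Unmatched borrower tuples are padded with null for the loan attributes.
    padded = [(bln, None, None, cn) for cn, bln in borrower if bln not in matched]
    return inner_join(loan, borrower) + padded

def full_outer_join(loan, borrower):
    inner = inner_join(loan, borrower)
    left_only = left_outer_join(loan, borrower)[len(inner):]
    right_only = right_outer_join(loan, borrower)[len(inner):]
    return inner + left_only + right_only
```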
Null Values
• It is possible for tuples to have a null value, denoted by null,
for some of their attributes
• null signifies an unknown value or that a value does not
exist.
• The result of any arithmetic expression involving null is null.
• Aggregate functions simply ignore null values
– Is an arbitrary decision. Could have returned null as result
instead.
– We follow the semantics of SQL in its handling of null values
• For duplicate elimination and grouping, null is treated like
any other value, and two nulls are assumed to be the same
– Alternative: assume each null is different from each other
– Both are arbitrary decisions, so we simply follow SQL
Null Values
• Comparisons with null values return the special
truth value unknown
– If false was used instead of unknown, then not (A < 5)
would not be equivalent to A >= 5
• Three-valued logic using the truth value unknown:
– OR: (unknown or true) = true,
(unknown or false) = unknown
(unknown or unknown) = unknown
– AND: (true and unknown) = unknown,
(false and unknown) = false,
(unknown and unknown) = unknown
– NOT: (not unknown) = unknown
– In SQL “P is unknown” evaluates to true if predicate P
evaluates to unknown
• Result of select predicate is treated as false if it
evaluates to unknown
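The three-valued truth tables can be written out as a short sketch, with Python's None standing in for unknown. All function names here are assumptions for illustration.

```python
def and3(a, b):
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True

def or3(a, b):
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False

def not3(a):
    return None if a is None else not a

def less_than(x, y):
    """Comparison that yields unknown (None) when either operand is null."""
    if x is None or y is None:
        return None
    return x < y

def select(predicate, rows):
    """Keep a row only when the predicate is True; unknown counts as false."""
    return [row for row in rows if predicate(row) is True]

values = [3, None, 7]
below_five = select(lambda v: less_than(v, 5), values)
not_below_five = select(lambda v: not3(less_than(v, 5)), values)
```

The null row appears in neither result, since both the predicate and its negation evaluate to unknown for it.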
Modification of the Database
• The content of the database may be modified
using the following operations:
– Deletion
– Insertion
– Updating
• All these operations are expressed using the
assignment operator.
Deletion
• A delete request is expressed similarly to a query,
except instead of displaying tuples to the user,
the selected tuples are removed from the
database.
• Can delete only whole tuples; cannot delete
values on only particular attributes
• A deletion is expressed in relational algebra by:
r ← r – E
where r is a relation and E is a relational algebra
query.
Deletion Examples
• Delete all account records in the Perryridge branch.
account ← account – σ_{branch-name = “Perryridge”}(account)
• Delete all loan records with amount in the range of 0 to 50
loan ← loan – σ_{amount ≥ 0 and amount ≤ 50}(loan)
• Delete all accounts at branches located in Needham.
r1 ← σ_{branch-city = “Needham”}(account ⋈ branch)
r2 ← π_{branch-name, account-number, balance}(r1)
r3 ← π_{customer-name, account-number}(r2 ⋈ depositor)
account ← account – r2
depositor ← depositor – r3
Insertion
• To insert data into a relation, we either:
– specify a tuple to be inserted
– write a query whose result is a set of tuples to be
inserted
• in relational algebra, an insertion is expressed by:
r ← r ∪ E
where r is a relation and E is a relational algebra
expression.
• The insertion of a single tuple is expressed by
letting E be a constant relation containing one
tuple.
Insertion Examples
• Insert information in the database specifying that Smith
has $1200 in account A-973 at the Perryridge branch.
account ← account ∪ {(“Perryridge”, A-973, 1200)}
depositor ← depositor ∪ {(“Smith”, A-973)}
• Provide as a gift for all loan customers in the Perryridge
branch, a $200 savings account. Let the loan number serve
as the account number for the new savings account.
r1 ← (σ_{branch-name = “Perryridge”}(borrower ⋈ loan))
account ← account ∪ π_{branch-name, loan-number, 200}(r1)
depositor ← depositor ∪ π_{customer-name, loan-number}(r1)
Updating
• A mechanism to change a value in a tuple
without changing all values in the tuple
• Use the generalized projection operator to do
this task
r ← π_{F1, F2, …, Fl}(r)
• Each Fi is either
– the ith attribute of r, if the ith attribute is not
updated, or,
– if the attribute is to be updated Fi is an expression,
involving only constants and the attributes of r, which
gives the new value for the attribute
Update Examples
• Make interest payments by increasing all balances by 5 percent.
account ← π_{AN, BN, BAL * 1.05}(account)
where AN, BN and BAL stand for account-number, branch-name
and balance, respectively.
• Pay all accounts with balances over $10,000 6 percent interest
and pay all others 5 percent
account ← π_{AN, BN, BAL * 1.06}(σ_{BAL > 10000}(account))
∪ π_{AN, BN, BAL * 1.05}(σ_{BAL ≤ 10000}(account))
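Deletion, insertion and updating all reduce to assigning a new set to the relation variable. The sketch below assumes account is a set of (account-number, branch-name, balance) tuples with hypothetical rows, and rounds to two decimals to keep floating-point results tidy.

```python
account = {("A-101", "Downtown", 12000), ("A-102", "Perryridge", 400)}

# Deletion: account <- account - sigma_{branch-name = "Perryridge"}(account)
after_delete = account - {t for t in account if t[1] == "Perryridge"}

# Insertion: account <- account U {one constant tuple}
after_insert = account | {("A-973", "Perryridge", 1200)}

# Update: 6 percent interest above $10,000, 5 percent for all others,
# written as the union of two generalized projections.
after_update = (
    {(an, bn, round(bal * 1.06, 2)) for an, bn, bal in account if bal > 10000}
    | {(an, bn, round(bal * 1.05, 2)) for an, bn, bal in account if bal <= 10000}
)
```

Both halves of the update read the old value of account, so no tuple receives interest twice.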
Views
• In some cases, it is not desirable for all users to
see the entire logical model (i.e., all the actual
relations stored in the database.)
• Consider a person who needs to know a
customer’s loan number but has no need to
see the loan amount. This person should see a
relation described, in the relational algebra, by
π_{customer-name, loan-number}(borrower ⋈ loan)
• Any relation that is not part of the conceptual
model but is made visible to a user as a “virtual
relation” is called a view.
View Definition
• A view is defined using the create view statement which
has the form
create view v as <query expression>
where <query expression> is any legal relational algebra
query expression. The view name is represented by v.
• Once a view is defined, the view name can be used to refer
to the virtual relation that the view generates.
• View definition is not the same as creating a new relation
by evaluating the query expression
– Rather, a view definition causes the saving of an expression; the
expression is substituted into queries using the view.
View Examples
• Consider the view (named all-customer) consisting of
branches and their customers.
create view all-customer as
π_{branch-name, customer-name}(depositor ⋈ account)
∪ π_{branch-name, customer-name}(borrower ⋈ loan)
• We can find all customers of the Perryridge branch by writing:
π_{customer-name}
(σ_{branch-name = “Perryridge”}(all-customer))
Updates Through View
• Database modifications expressed as views must be translated to
modifications of the actual relations in the database.
• Consider the person who needs to see all loan data in the loan
relation except amount. The view given to the person, branch-loan,
is defined as:
create view branch-loan as
π_{branch-name, loan-number}(loan)
• Since we allow a view name to appear wherever a relation name is
allowed, the person may write:
branch-loan ← branch-loan ∪ {(“Perryridge”, L-37)}
Updates Through Views (Cont.)
• The previous insertion must be represented by an insertion into
the actual relation loan from which the view branch-loan is
constructed.
• An insertion into loan requires a value for amount. The insertion
can be dealt with by either:
– rejecting the insertion and returning an error message to the user.
– inserting a tuple (“L-37”, “Perryridge”, null) into the loan relation
• Some updates through views are impossible to translate into
database relation updates
– create view v as σ_{branch-name = “Perryridge”}(account)
v ← v ∪ {(L-99, Downtown, 23)}
• Others cannot be translated uniquely
– all-customer ← all-customer ∪ {(“Perryridge”, “John”)}
• Have to choose loan or account, and
create a new loan/account number!
Views Defined Using Other Views
• One view may be used in the expression defining
another view
• A view relation v1 is said to depend directly on a
view relation v2 if v2 is used in the expression
defining v1
• A view relation v1 is said to depend on view
relation v2 if either v1 depends directly on v2 or there is
a path of dependencies from v1 to v2
• A view relation v is said to be recursive if it
depends on itself.
View Expansion
• A way to define the meaning of views defined in terms of
other views.
• Let view v1 be defined by an expression e1 that may itself
contain uses of view relations.
• View expansion of an expression repeats the following
replacement step:
repeat
    Find any view relation vi in e1
    Replace the view relation vi by the expression
    defining vi
until no more view relations are present in e1
• As long as the view definitions are not recursive, this loop
will terminate
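The expansion loop can be sketched as a recursive substitution. The representation is an assumption made for illustration: an expression is either a relation or view name (a string) or a tuple (operator, operand, ...), and views is a dict from view name to defining expression. The view perryridge-customer is hypothetical.

```python
def expand(expr, views):
    """Replace view names in expr by their definitions until none remain.

    Assumes the view definitions are not recursive; otherwise this
    function would never terminate.
    """
    if isinstance(expr, str):
        # A name: expand it if it is a view, else it is a base relation.
        return expand(views[expr], views) if expr in views else expr
    op, *args = expr
    return (op, *(expand(a, views) for a in args))

views = {
    "all-customer": ("union",
                     ("join", "depositor", "account"),
                     ("join", "borrower", "loan")),
    "perryridge-customer": ("select", "all-customer"),
}
expanded = expand("perryridge-customer", views)
```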
Tuple Relational Calculus
• A nonprocedural query language, where each query is of
the form
{t | P (t) }
• It is the set of all tuples t such that predicate P is true for t
• t is a tuple variable, t[A] denotes the value of tuple t on
attribute A
• t ∈ r denotes that tuple t is in relation r
• P is a formula similar to that of the predicate calculus
Predicate Calculus Formula
1. Set of attributes and constants
2. Set of comparison operators: (e.g., <, ≤, =, ≠, >, ≥)
3. Set of connectives: and (∧), or (∨), not (¬)
4. Implication (⇒): x ⇒ y, if x is true, then y is true
x ⇒ y ≡ ¬x ∨ y
5. Set of quantifiers:
∃ t ∈ r (Q(t)): “there exists” a tuple t in relation r
such that predicate Q(t) is true
∀ t ∈ r (Q(t)): Q is true “for all” tuples t in relation r
Banking Example
• branch (branch-name, branch-city, assets)
• customer (customer-name, customer-street,
customer-city)
• account (account-number, branch-name,
balance)
• loan (loan-number, branch-name, amount)
• depositor (customer-name, account-number)
• borrower (customer-name, loan-number)
Example Queries
• Find the loan-number, branch-name, and amount
for loans of over $1200
{t | t ∈ loan ∧ t[amount] > 1200}
• Find the loan number for each loan of an amount greater than $1200
{t | ∃ s ∈ loan (t[loan-number] = s[loan-number] ∧ s[amount] > 1200)}
Notice that a relation on schema [loan-number] is implicitly defined
by the query
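Tuple-calculus queries read much like set comprehensions, which the following sketch illustrates. The first three loan rows come from the outer-join example in these slides; the fourth row is hypothetical, added so that the predicate filters something out.

```python
loan = [
    {"loan-number": "L-170", "branch-name": "Downtown", "amount": 3000},
    {"loan-number": "L-230", "branch-name": "Redwood", "amount": 4000},
    {"loan-number": "L-260", "branch-name": "Perryridge", "amount": 1700},
    {"loan-number": "L-999", "branch-name": "Downtown", "amount": 900},  # hypothetical
]

# {t | t in loan and t[amount] > 1200}
big_loans = [t for t in loan if t["amount"] > 1200]

# {t | exists s in loan (t[loan-number] = s[loan-number] and s[amount] > 1200)}
# The result has a single attribute, so a set of loan numbers suffices.
big_loan_numbers = {s["loan-number"] for s in loan if s["amount"] > 1200}
```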
Example Queries
• Find the names of all customers having a loan, an
account, or both at the bank
{t | ∃ s ∈ borrower (t[customer-name] = s[customer-name])
∨ ∃ u ∈ depositor (t[customer-name] = u[customer-name])}
• Find the names of all customers who have a loan and an account
at the bank
{t | ∃ s ∈ borrower (t[customer-name] = s[customer-name])
∧ ∃ u ∈ depositor (t[customer-name] = u[customer-name])}
Example Queries
• Find the names of all customers having a loan
at the Perryridge branch
{t | ∃ s ∈ borrower (t[customer-name] = s[customer-name]
∧ ∃ u ∈ loan (u[branch-name] = “Perryridge”
∧ u[loan-number] = s[loan-number]))}
• Find the names of all customers who have a loan at the
Perryridge branch, but no account at any branch of the bank
{t | ∃ s ∈ borrower (t[customer-name] = s[customer-name]
∧ ∃ u ∈ loan (u[branch-name] = “Perryridge”
∧ u[loan-number] = s[loan-number]))
∧ not ∃ v ∈ depositor (v[customer-name] =
t[customer-name])}
Example Queries
• Find the names of all customers having a loan from
the Perryridge branch, and the cities they live in
{t | ∃ s ∈ loan (s[branch-name] = “Perryridge”
∧ ∃ u ∈ borrower (u[loan-number] = s[loan-number]
∧ t[customer-name] = u[customer-name]
∧ ∃ v ∈ customer (u[customer-name] = v[customer-name]
∧ t[customer-city] = v[customer-city])))}
Example Queries
• Find the names of all customers who have an
account at all branches located in Brooklyn:
{t | ∃ c ∈ customer (t[customer-name] = c[customer-name]) ∧
∀ s ∈ branch (s[branch-city] = “Brooklyn” ⇒
∃ u ∈ account (s[branch-name] = u[branch-name]
∧ ∃ d ∈ depositor (t[customer-name] = d[customer-name]
∧ d[account-number] = u[account-number])))}
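The "for all" query can be sketched with Python's all and any, rewriting the implication x ⇒ y as (not x) or y. All rows below are hypothetical, and relations are trimmed to the attributes the query needs.

```python
branch = [("Brighton", "Brooklyn"), ("Downtown", "Brooklyn"),
          ("Perryridge", "Horseneck")]           # (branch-name, branch-city)
account = [("A-101", "Downtown"), ("A-201", "Brighton"),
           ("A-102", "Perryridge"), ("A-215", "Brighton")]  # (account-number, branch-name)
depositor = [("Johnson", "A-101"), ("Johnson", "A-201"),
             ("Hayes", "A-102"), ("Smith", "A-215")]        # (customer-name, account-number)
customer = ["Johnson", "Hayes", "Smith"]

def has_account_at(name, branch_name):
    """There exist an account u and a depositor d linking name to the branch."""
    return any(cn == name and an == acc and bn == branch_name
               for cn, an in depositor
               for acc, bn in account)

# For all branches s: s is in Brooklyn implies the customer has an account at s.
answer = {c for c in customer
          if all(city != "Brooklyn" or has_account_at(c, bn)
                 for bn, city in branch)}
```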
Safety of Expressions
• It is possible to write tuple calculus expressions that
generate infinite relations.
• For example, {t | ¬ (t ∈ r)} results in an infinite relation
if the domain of any attribute of relation r is infinite
• To guard against the problem, we restrict the set of
allowable expressions to safe expressions.
• An expression {t | P(t)} in the tuple relational calculus
is safe if every component of t appears in one of the
relations, tuples, or constants that appear in P
– NOTE: this is more than just a syntax condition.
• E.g. { t | t[A] = 5 ∨ true } is not safe: it defines an infinite set with
attribute values that do not appear in any relation or tuples or
constants in P.
Domain Relational Calculus
• A nonprocedural query language equivalent in
power to the tuple relational calculus
• Each query is an expression of the form:
{ ⟨x1, x2, …, xn⟩ | P(x1, x2, …, xn) }
– x1, x2, …, xn represent domain variables
– P represents a formula similar to that of the
predicate calculus
Example Queries
• Find the loan-number, branch-name, and amount for
loans of over $1200
{ ⟨l, b, a⟩ | ⟨l, b, a⟩ ∈ loan ∧ a > 1200 }
• Find the names of all customers who have a loan of over $1200
{ ⟨c⟩ | ∃ l, b, a (⟨c, l⟩ ∈ borrower ∧ ⟨l, b, a⟩ ∈ loan ∧ a > 1200) }
• Find the names of all customers who have a loan from the
Perryridge branch and the loan amount:
{ ⟨c, a⟩ | ∃ l (⟨c, l⟩ ∈ borrower ∧ ∃ b (⟨l, b, a⟩ ∈ loan ∧
b = “Perryridge”)) }
or { ⟨c, a⟩ | ∃ l (⟨c, l⟩ ∈ borrower ∧ ⟨l, “Perryridge”, a⟩ ∈ loan) }
Example Queries
• Find the names of all customers having a loan, an
account, or both at the Perryridge branch:
{ ⟨c⟩ | ∃ l (⟨c, l⟩ ∈ borrower
∧ ∃ b, a (⟨l, b, a⟩ ∈ loan ∧ b = “Perryridge”))
∨ ∃ a (⟨c, a⟩ ∈ depositor
∧ ∃ b, n (⟨a, b, n⟩ ∈ account ∧ b = “Perryridge”)) }
• Find the names of all customers who have an account at all
branches located in Brooklyn:
{ ⟨c⟩ | ∃ s, n (⟨c, s, n⟩ ∈ customer) ∧
∀ x, y, z ((⟨x, y, z⟩ ∈ branch ∧ y = “Brooklyn”) ⇒
∃ a, b (⟨a, x, b⟩ ∈ account ∧ ⟨c, a⟩ ∈ depositor)) }
Safety of Expressions
{ ⟨x1, x2, …, xn⟩ | P(x1, x2, …, xn) }
is safe if all of the following hold:
1. All values that appear in tuples of the expression are
values from dom(P) (that is, the values appear either in
P or in a tuple of a relation mentioned in P).
2. For every “there exists” subformula of the form ∃ x (P1(x)),
the subformula is true if and only if there is a
value of x in dom(P1) such that P1(x) is true.
3. For every “for all” subformula of the form ∀ x (P1(x)),
the subformula is true if and only if P1(x) is true
for all values x from dom(P1).