The Relational Database Model
Functional dependence
The attribute B is functionally dependent on the attribute A if
each value in column A determines one and only one value in
column B.
In the table above, no single attribute is unique, so no single
attribute can serve as a primary key or a determinant.
In such a case, a combination of attributes can act as the
determinant of a functional dependence.
FLIGHT_NO, FLIGHT_DATE -> FLIGHT_TIME, FLIGHT_DURATION
Such a multi-attribute key is called a composite key.
Any attribute that is part of a key is known as a key attribute.
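As a minimal sketch, the definition can be tested against sample rows in Python. The helper name holds and the flight rows below are hypothetical illustrations, not data from the original table. A check over sample data can only refute a dependency; it cannot prove that the dependency holds in general.

```python
def holds(rows, determinant, dependent):
    """True if each determinant value maps to exactly one dependent value."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical sample rows
columns = ("FLIGHT_NO", "FLIGHT_DATE", "FLIGHT_TIME", "FLIGHT_DURATION")
data = [
    ("AI101", "2024-01-05", "09:00", 120),
    ("AI101", "2024-01-06", "10:30", 125),
    ("AI202", "2024-01-05", "09:00", 90),
]
flights = [dict(zip(columns, record)) for record in data]

# The composite determinant works; FLIGHT_NO alone does not.
composite_ok = holds(flights, ["FLIGHT_NO", "FLIGHT_DATE"],
                     ["FLIGHT_TIME", "FLIGHT_DURATION"])
single_ok = holds(flights, ["FLIGHT_NO"], ["FLIGHT_TIME"])
print(composite_ok, single_ok)
```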
In the previous STUDENT table, the combination of last name,
first name, initial, and phone is likely to produce unique matches
(in case STU_NUM were not available).
For example:
STU_LNAME, STU_FNAME, STU_INIT, STU_PHONE -> STU_HRS, STU_CLASS
Or
STU_LNAME, STU_FNAME, STU_INIT, STU_PHONE -> STU_HRS, STU_CLASS, STU_GPA
Or
STU_LNAME, STU_FNAME, STU_INIT, STU_PHONE -> STU_HRS, STU_CLASS, STU_GPA, STU_DOB
The notion of functional dependence can be further refined by specifying
full functional dependence.
If the attribute (B) is functionally dependent on a composite
key (A) but not on any proper subset of that composite key, the
attribute (B) is fully functionally dependent on (A).
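A rough Python sketch of this test: confirm the dependency on the whole composite key, then confirm that it fails for every proper subset. The functions holds and fully_dependent, and the enrollment rows with their attribute names, are hypothetical and chosen only for illustration.

```python
from itertools import combinations

def holds(rows, determinant, dependent):
    """True if each determinant value maps to exactly one dependent value."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        if seen.setdefault(key, value) != value:
            return False
    return True

def fully_dependent(rows, composite_key, attribute):
    """True if attribute depends on composite_key but on no proper subset."""
    if not holds(rows, composite_key, [attribute]):
        return False
    for size in range(1, len(composite_key)):
        for subset in combinations(composite_key, size):
            if holds(rows, subset, [attribute]):
                return False  # a partial dependency exists
    return True

# Hypothetical enrollment rows
columns = ("STU_NUM", "CLASS_CODE", "GRADE", "STU_LNAME")
data = [
    (1, "C1", "A", "Smith"),
    (1, "C2", "B", "Smith"),
    (2, "C1", "B", "Jones"),
    (2, "C2", "A", "Jones"),
]
rows = [dict(zip(columns, record)) for record in data]
key = ["STU_NUM", "CLASS_CODE"]

grade_full = fully_dependent(rows, key, "GRADE")      # needs both attributes
lname_full = fully_dependent(rows, key, "STU_LNAME")  # STU_NUM alone suffices
print(grade_full, lname_full)
```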
Types of Keys (cont’d.)
Within the broad key classification, several specialized keys can be defined
Candidate key
A super key without unnecessary attributes, that is, a minimal super key.
It can consist of more than one attribute, as long as the combination never produces duplicate values.
STU_NUM, or the combination
STU_LNAME, STU_FNAME, STU_INIT, STU_PHONE, is an example of a
candidate key.
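The idea of a minimal super key can be sketched in Python by testing attribute combinations from smallest to largest and skipping any combination that already contains a key. The function names and the student rows are hypothetical; the rows were constructed so that no proper subset of the four name and phone attributes is unique. Keys found this way reflect only the sample data, whereas real candidate keys follow from the meaning of the attributes.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows share the same values for attrs."""
    projected = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(projected)) == len(projected)

def candidate_keys(rows, attributes):
    """Return the minimal super keys of the sample rows."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # A superset of a key already found is not minimal.
            if any(set(k) <= set(attrs) for k in keys):
                continue
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys

# Hypothetical sample rows
columns = ("STU_NUM", "STU_LNAME", "STU_FNAME", "STU_INIT", "STU_PHONE")
data = [
    (1, "Smith", "John", "A", "111"),
    (2, "Smith", "John", "A", "222"),
    (3, "Smith", "John", "B", "111"),
    (4, "Smith", "Jane", "A", "111"),
    (5, "Jones", "John", "A", "111"),
]
students = [dict(zip(columns, record)) for record in data]

keys = candidate_keys(students, columns)
print(keys)
```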
A primary key is a candidate key chosen to be the unique row identifier.
A primary key is therefore both a super key and a candidate key.
To maintain entity integrity, a null (no data entry at all) is not
permitted in the primary key.
Example
Year  Month  Date  Major  Minor
2008  01     13    0      1
2008  04     23    0      2
2009  11     05    1      0
2010  04     05    1      1
Controlled redundancy
Controlled redundancy makes the relational database work.
Tables within the database share common attributes, which
enables tables to be linked together.
Multiple occurrences of values are not redundant when they are
required to make the relationship work.
Redundancy exists only when there is unnecessary
duplication of attribute values.
Example in the table below:
In the above diagram, the link is indicated by the line that
connects the VENDOR and PRODUCT tables.
The link is created when two tables share an attribute with
common values.
The primary key of one table (VENDOR) appears as a foreign key in
a related table (PRODUCT).
A foreign key (FK) is an attribute whose values match the
primary key values in the related table.
VENDOR (VEND_CODE, VEND_CONTACT, VEND_AREA, VEND_PHONE)
PRODUCT (PROD_CODE, PROD_DESCRIPT, PROD_PRICE, PROD_ON_HAND, VEND_CODE)
VEND_CODE is the primary key in the VENDOR table and occurs as a
foreign key in the PRODUCT table.
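One way to see the link in action is a small sketch with Python's built-in sqlite3 module. The column types and the sample values are assumptions, since the schema above gives only attribute names. SQLite enforces foreign keys only after PRAGMA foreign_keys is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite

# Column types are assumed for illustration.
conn.execute("""CREATE TABLE VENDOR (
    VEND_CODE INTEGER PRIMARY KEY,
    VEND_CONTACT TEXT, VEND_AREA TEXT, VEND_PHONE TEXT)""")
conn.execute("""CREATE TABLE PRODUCT (
    PROD_CODE TEXT PRIMARY KEY,
    PROD_DESCRIPT TEXT, PROD_PRICE REAL, PROD_ON_HAND INTEGER,
    VEND_CODE INTEGER REFERENCES VENDOR (VEND_CODE))""")

# Hypothetical sample values
conn.execute("INSERT INTO VENDOR VALUES (230, 'Contact A', '615', '555-0100')")
conn.execute("INSERT INTO PRODUCT VALUES ('P1', 'Hammer', 9.95, 12, 230)")

# Vendor 999 does not exist, so this foreign key value has no match.
try:
    conn.execute("INSERT INTO PRODUCT VALUES ('P2', 'Saw', 14.50, 4, 999)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

product_count = conn.execute("SELECT COUNT(*) FROM PRODUCT").fetchone()[0]
print(rejected, product_count)
```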
Examples
Secondary key
A key used strictly for data retrieval purposes. For example, in the
CUSTOMER table the primary key is the customer number, which is difficult
to remember. A secondary key can instead be the combination of the customer’s
last name and phone number. Such a combination may not yield a unique
outcome; it can still return several matches.
Integrity Rules
Example
Table: AGENT
Primary Key: AGENT_CODE
Table: CUSTOMER
Primary Key: CUS_CODE
Foreign Key: AGENT_CODE
Entity Integrity: The primary key in both tables has no null entries, and all entries are unique.
Referential Integrity: Every AGENT_CODE entry in the CUSTOMER table matches an AGENT_CODE entry in the AGENT table.
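Both checks can be sketched in a few lines of Python. The function names and the sample rows are hypothetical. Treating a null foreign key as acceptable is an assumption made here; it is not stated in the example above.

```python
def entity_integrity(rows, pk):
    """Primary key values must be non-null and unique."""
    values = [row[pk] for row in rows]
    return None not in values and len(set(values)) == len(values)

def referential_integrity(child, fk, parent, pk):
    """Every non-null foreign key value must match a parent primary key.

    Allowing a null foreign key is an assumption of this sketch.
    """
    parent_keys = {row[pk] for row in parent}
    return all(row[fk] is None or row[fk] in parent_keys for row in child)

# Hypothetical sample rows
agents = [{"AGENT_CODE": 501}, {"AGENT_CODE": 502}]
customers = [
    {"CUS_CODE": 10010, "AGENT_CODE": 502},
    {"CUS_CODE": 10011, "AGENT_CODE": 501},
]

consistent = (
    entity_integrity(agents, "AGENT_CODE")
    and entity_integrity(customers, "CUS_CODE")
    and referential_integrity(customers, "AGENT_CODE", agents, "AGENT_CODE")
)

# A customer that points at a missing agent breaks referential integrity.
customers.append({"CUS_CODE": 10012, "AGENT_CODE": 999})
still_consistent = referential_integrity(
    customers, "AGENT_CODE", agents, "AGENT_CODE")
print(consistent, still_consistent)
```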
Relational Database Operators
The data in relational tables are of limited value unless the data can be
manipulated to generate useful information.
Relational algebra defines the theoretical way of manipulating table contents
using the eight relational operators.
Relational algebra is a procedural query language: its operators take
instances of relations as input and yield instances of relations as output.
The eight relational operators are: UNION, INTERSECT, DIFFERENCE,
PRODUCT, SELECT, PROJECT, JOIN, and DIVIDE.
Application of these operators is based on relational algebra theory.
They define functions to manipulate data in one or more tables (relations).
SQL commands can be used to accomplish relational algebra operations;
these will be covered in future chapters.
SELECT
SELECT, also known as RESTRICT, yields values for all rows found in a table that
satisfy a given condition. SELECT can be used to list all of the row values, or
only those that match a specified condition, as shown in the figure below.
Formally, SELECT is denoted by the lowercase Greek letter sigma (σ). Sigma is followed by the
condition to be evaluated (called a predicate) as a subscript, and then the relation is listed in
parentheses.
Notation: σp(r)
Here σ stands for selection and r stands for the relation. p is a propositional logic formula that may
use connectors such as and, or, and not. Its terms may use comparison operators such as =, ≠, ≥, <, >, ≤.
For example, to SELECT all of the rows in the CUSTOMER table that have the value ‘10010’ in the
CUS_CODE attribute, you would write the following:
σ CUS_CODE = 10010 (CUSTOMER)
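A minimal Python sketch of SELECT, treating a relation as a list of rows and the predicate p as a function. The helper name select and the customer rows are hypothetical.

```python
def select(relation, predicate):
    """Sigma: keep only the rows for which the predicate is true."""
    return [row for row in relation if predicate(row)]

# Hypothetical sample rows
customer = [
    {"CUS_CODE": 10010, "CUS_LNAME": "Lee", "CUS_FNAME": "Ana"},
    {"CUS_CODE": 10011, "CUS_LNAME": "Patel", "CUS_FNAME": "Raj"},
    {"CUS_CODE": 10012, "CUS_LNAME": "Lee", "CUS_FNAME": "Kim"},
]

# Rows whose CUS_CODE equals 10010
by_code = select(customer, lambda row: row["CUS_CODE"] == 10010)

# Predicates can combine conditions with and, or, and not.
lee_not_ana = select(
    customer,
    lambda row: row["CUS_LNAME"] == "Lee" and not row["CUS_FNAME"] == "Ana",
)
print(by_code, lee_not_ana)
```

The result keeps every column of the matching rows, which is why SELECT is described as yielding a horizontal subset of a table.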
PROJECT
PROJECT produces a list of all values for selected attributes.
It yields a vertical subset of a table.
Formally, PROJECT is denoted by the Greek letter pi (π). Pi is followed by the list
of attributes to be returned as subscripts, and then the relation is listed in parentheses.
Notation: π A1, A2, ..., An (r)
Here A1, A2, ..., An are attribute names of relation r.
For example, to PROJECT the CUS_FNAME and CUS_LNAME attributes in the CUSTOMER
table, you would write the following:
π CUS_FNAME, CUS_LNAME (CUSTOMER)
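A minimal Python sketch of PROJECT. The helper name project and the customer rows are hypothetical. Because the result of a relational algebra operation is itself a relation, the sketch also drops duplicate rows.

```python
def project(relation, attributes):
    """Pi: keep the chosen columns and drop duplicate result rows."""
    projected = (tuple(row[a] for a in attributes) for row in relation)
    # dict.fromkeys removes duplicates while keeping first-seen order
    return list(dict.fromkeys(projected))

# Hypothetical sample rows; two customers share the same name
customer = [
    {"CUS_CODE": 10010, "CUS_FNAME": "Ana", "CUS_LNAME": "Lee"},
    {"CUS_CODE": 10011, "CUS_FNAME": "Raj", "CUS_LNAME": "Patel"},
    {"CUS_CODE": 10012, "CUS_FNAME": "Ana", "CUS_LNAME": "Lee"},
]

names = project(customer, ["CUS_FNAME", "CUS_LNAME"])
print(names)
```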
UNION
UNION combines all rows from two tables, excluding duplicate rows. That is, the
union of two relations includes every tuple that is in Relation 1 or in Relation 2,
and tuples common to both relations are shown only once.
The tables must have the same attribute characteristics to be used in the UNION; that is,
they must be union-compatible.
Notation: r ∪ s
Here r and s are either database relations or relation result sets.
For a union operation to be valid, the following conditions must hold:
r and s must have the same number of attributes.
Attribute domains must be compatible.
Duplicate tuples are automatically eliminated.
If the relations SUPPLIER and VENDOR are union-compatible, then a UNION between
them would be denoted as follows:
SUPPLIER ∪ VENDOR
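As a rough sketch, a relation can be modelled in Python as a pair of attribute names and a set of tuples, so that duplicates disappear automatically. The function name union and the sample rows are hypothetical. The sketch checks only the number of attributes; domain compatibility is not verified.

```python
def union(r, s):
    """Union of two relations given as (attribute_names, set_of_tuples)."""
    r_attrs, r_rows = r
    s_attrs, s_rows = s
    # Only the attribute count is checked here, not the domains.
    if len(r_attrs) != len(s_attrs):
        raise ValueError("relations are not union-compatible")
    return r_attrs, r_rows | s_rows

# Hypothetical sample relations
supplier = (("CODE", "NAME"), {(1, "Acme"), (2, "Bolt")})
vendor = (("CODE", "NAME"), {(2, "Bolt"), (3, "Crux")})

attrs, rows = union(supplier, vendor)
print(sorted(rows))

# A relation with a different number of attributes is rejected.
try:
    union(supplier, (("CODE",), {(1,)}))
    incompatible_rejected = False
except ValueError:
    incompatible_rejected = True
```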
INTERSECT
INTERSECT yields only the rows that appear in both tables.
The tables must be union-compatible. For example, one cannot use INTERSECT if
one of the attributes is numeric and the corresponding attribute is character-based.
INTERSECT is denoted by the symbol ∩. If the relations SUPPLIER and VENDOR are
union-compatible, then an INTERSECT between them would be denoted as follows:
SUPPLIER ∩ VENDOR
DIFFERENCE
DIFFERENCE yields all rows in one table that are not found in
the other table.
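INTERSECT and DIFFERENCE map directly onto Python set operations when rows are stored as tuples. The sample rows are hypothetical. The sketch also shows that DIFFERENCE depends on the order of its operands.

```python
# Hypothetical union-compatible relations, each a set of (CODE, NAME) tuples
supplier = {(1, "Acme"), (2, "Bolt")}
vendor = {(2, "Bolt"), (3, "Crux")}

# INTERSECT: rows that appear in both tables
in_both = supplier & vendor

# DIFFERENCE: rows in the first table that are not found in the second
supplier_only = supplier - vendor
vendor_only = vendor - supplier

print(in_both, supplier_only, vendor_only)
```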
A right outer join yields all of the rows in the AGENT table, including those
that do not have matching values in the CUSTOMER table. An example is shown in
figure 3.16.
Outer Joins
Outer joins are especially useful when one is trying to determine
what value(s) in related tables cause referential integrity
problems.
Such problems are created when foreign key values do not match
the primary key values in the related tables.
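A minimal Python sketch of this diagnostic use. The function keeps every row of its first argument and pairs it with None when the second argument has no match; passing the tables in the opposite order gives the other direction of outer join. The function name and the sample rows are hypothetical.

```python
def left_outer_join(left, right, key):
    """Pair every left row with each matching right row, or with None."""
    lookup = {}
    for row in right:
        lookup.setdefault(row[key], []).append(row)
    pairs = []
    for row in left:
        # Unmatched rows are kept and paired with None.
        for match in lookup.get(row[key], [None]):
            pairs.append((row, match))
    return pairs

# Hypothetical sample rows
agents = [
    {"AGENT_CODE": 501, "AGENT_LNAME": "Alby"},
    {"AGENT_CODE": 502, "AGENT_LNAME": "Hahn"},
]
customers = [
    {"CUS_CODE": 10010, "AGENT_CODE": 502},
    {"CUS_CODE": 10011, "AGENT_CODE": 501},
    {"CUS_CODE": 10012, "AGENT_CODE": 999},
]

pairs = left_outer_join(customers, agents, "AGENT_CODE")

# Customers whose foreign key matches no agent
orphans = [cus["CUS_CODE"] for cus, agent in pairs if agent is None]
print(orphans)
```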
DIVIDE
In relational algebra, DIVIDE is an operator that answers queries about one set of
data being associated with all values of data in another set of data.
The output of the DIVIDE operation is a single column containing the values of the
unshared column of the dividend table that are paired with every value of the
common column in the divisor table.
Table 1 is divided by Table 2 to produce Table 3. The tables must have
a common column.
Tables 1 and 2 both contain the column CODE but do not share LOC.
To be included in the resulting Table 3, a value in the unshared
column (LOC) must be associated with every CODE value in Table 2.
The only value associated with both A and B is 5.
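DIVIDE can be sketched in Python with sets. The function name divide is hypothetical, and the sample rows are made-up data chosen only to agree with the statement that 5 is the single LOC value associated with both A and B.

```python
def divide(dividend, divisor):
    """Return the LOC values paired in dividend with every CODE in divisor.

    dividend is a set of (CODE, LOC) tuples; divisor is a set of CODE values.
    """
    candidates = {loc for _, loc in dividend}
    return {
        loc for loc in candidates
        if all((code, loc) in dividend for code in divisor)
    }

# Hypothetical sample data
table1 = {
    ("A", 5), ("A", 9), ("A", 4),
    ("B", 5), ("B", 3),
    ("C", 6), ("D", 7), ("D", 8), ("E", 8),
}
table2 = {"A", "B"}

table3 = divide(table1, table2)
print(table3)
```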
The Data Dictionary and System Catalog
Data dictionary
The data dictionary contains metadata that describes the data stored in
the database, and it is used to produce database documentation automatically.
It stores:
- the names of the data items in the database
- the types and sizes of the data items
- the constraints on each data item
- the names of authorized users, the data items that each user can
  access, and the types of access allowed
System Catalog
The system catalog is a very detailed system data dictionary. It
describes all objects within the database.
The system catalog is a system-created database whose tables store
the database characteristics and contents.
System catalog tables can be queried just like any other tables.
The system catalog automatically produces database documentation.
All data dictionary information is found in the system catalog.
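As one concrete illustration, SQLite keeps its catalog in a table named sqlite_master, which can be queried like any other table from Python's built-in sqlite3 module. Other database systems use different catalog tables, and the CUSTOMER definition below is a hypothetical example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CUSTOMER ("
    "CUS_CODE INTEGER PRIMARY KEY, CUS_LNAME TEXT NOT NULL)"
)

# The catalog is queried with an ordinary SELECT statement.
tables = [
    name for (name,) in
    conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [
    (row[1], row[2]) for row in conn.execute("PRAGMA table_info(CUSTOMER)")
]
print(tables, columns)
```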
Relationships within the Relational Database
Relationships are classified as one-to-one (1:1), one-to-many (1:M),
and many-to-many (M:N).
The 1:M relationship is the relational database norm.
The data model in figure 3.18 is implemented as shown in figure 3.19.
As in the figure above, each painting is painted by one and only one
painter, but each painter could have painted many paintings.
There is only one row in the PAINTER table for any given row in the
PAINTING table, but there may be many rows in the PAINTING table for
any given row in the PAINTER table.
A 1:1 relationship implies that one entity can be related to only one
other entity, and vice versa.
For example, one department chair (a professor) can chair only
one department, and one department can have only one
department chair. The entities PROFESSOR and DEPARTMENT
exhibit a 1:1 relationship. It is modelled in figure 3.22, and its
implementation is shown in figure 3.23.
M:N relationships are not supported directly in the relational
environment.
However, they can be implemented by creating a new linking entity that
is in a 1:M relationship with each of the original entities.
Consider the example below: a STUDENT can take many CLASSes,
and each CLASS can contain many STUDENTs. The M:N relationship
is shown in figure 3.24.
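A small sketch of the linking-entity approach using Python's built-in sqlite3 module. The linking table name ENROLL, the column names and types, and the sample rows are all assumptions made for illustration. Each row of ENROLL pairs one student with one class, so STUDENT to ENROLL and CLASS to ENROLL are both 1:M.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (STU_NUM INTEGER PRIMARY KEY, STU_LNAME TEXT);
CREATE TABLE CLASS (CLASS_CODE TEXT PRIMARY KEY, CLASS_TITLE TEXT);
-- Linking table: its composite primary key is made of two foreign keys.
CREATE TABLE ENROLL (
    STU_NUM INTEGER REFERENCES STUDENT (STU_NUM),
    CLASS_CODE TEXT REFERENCES CLASS (CLASS_CODE),
    PRIMARY KEY (STU_NUM, CLASS_CODE));
INSERT INTO STUDENT VALUES (1, 'Bowser'), (2, 'Smithson');
INSERT INTO CLASS VALUES ('C10', 'Accounting'), ('C20', 'Databases');
INSERT INTO ENROLL VALUES (1, 'C10'), (1, 'C20'), (2, 'C20');
""")

# One student, many classes
classes_of_1 = [code for (code,) in conn.execute(
    "SELECT CLASS_CODE FROM ENROLL WHERE STU_NUM = 1 ORDER BY CLASS_CODE")]

# One class, many students
students_in_c20 = [name for (name,) in conn.execute(
    "SELECT STU_LNAME FROM STUDENT JOIN ENROLL USING (STU_NUM) "
    "WHERE CLASS_CODE = 'C20' ORDER BY STU_LNAME")]
print(classes_of_1, students_in_c20)
```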
TABLE NAME : STUDENT
Indexes
To locate a particular book in the library, one does not look
through each and every book but uses the library’s catalog. This
catalog is indexed by title, topic, and author.
An index points one to the book’s location to make retrieval
quick and easy.
An index is an orderly arrangement used to logically access rows in
a table.
Index key
- The index’s reference point
- Points to the data location identified by the key
Unique index
- An index in which the index key can have only one pointer value
  (row) associated with it
Each index is associated with only one table.
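The idea can be sketched in Python with a dictionary that maps each index key to the positions of the matching rows. The function name build_index and the painting rows are hypothetical, and a real DBMS uses more elaborate structures than a plain dictionary.

```python
def build_index(rows, key, unique=False):
    """Map each index key value to the positions of the rows that hold it."""
    index = {}
    for position, row in enumerate(rows):
        pointers = index.setdefault(row[key], [])
        if unique and pointers:
            raise ValueError(f"duplicate index key: {row[key]!r}")
        pointers.append(position)
    return index

# Hypothetical sample rows
paintings = [
    {"PAINTING_NUM": 1338, "PAINTER_NUM": 123},
    {"PAINTING_NUM": 1339, "PAINTER_NUM": 126},
    {"PAINTING_NUM": 1340, "PAINTER_NUM": 123},
]

# Non-unique index: one key may point to several rows.
by_painter = build_index(paintings, "PAINTER_NUM")

# Unique index: each key points to exactly one row.
by_painting = build_index(paintings, "PAINTING_NUM", unique=True)

# PAINTER_NUM repeats, so it cannot back a unique index.
try:
    build_index(paintings, "PAINTER_NUM", unique=True)
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
print(by_painter, by_painting, duplicate_rejected)
```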
Codd’s Relational Database Rules
In 1985, Codd published a list of 12 rules to define a relational
database system.
The list was a response to products that were marketed as “relational”
but did not meet minimum relational standards.
Even dominant database vendors do not fully support all 12 rules.
Foundation Rule
A relational database management system must manage its stored data
using only its relational capabilities.
1. Information Rule
All information in the database should be represented in one and
only one way - as values in a table.
2. Guaranteed Access Rule
Each and every value is guaranteed to be logically accessible by a
combination of table name, primary key value and column name.
3. Systematic Treatment of Null Values
Nulls must be represented and treated in a systematic way,
independent of data type.
4. Dynamic On-line Catalog Based on the Relational Model
The metadata must be stored and managed as ordinary data,
that is, in tables within the database. Such data must be available
to authorized users using the standard database relational
language.
5. Comprehensive Data Sublanguage Rule
A relational system may support several languages and various modes of
terminal use. However, there must be at least one language whose statements
are expressible, per some well-defined syntax, as character strings and whose
ability to support all of the following is comprehensive:
   1. data definition
   2. view definition
   3. data manipulation (interactive and by program)
   4. integrity constraints
   5. authorization
   6. transaction boundaries (begin, commit, and rollback)
6. View Updating
All views that are theoretically updateable are also updateable by the system.
7. High-level Insert, Update, and Delete
The database must support set-level inserts, updates, and deletes.
8. Physical Data Independence
Application programs and terminal activities remain logically unaffected
whenever any changes are made in either storage representation or access
methods.
9. Logical Data Independence
Application programs and ad hoc facilities are logically unaffected when changes are
made to the table structures.
10. Integrity Independence
Integrity constraints specific to a particular relational database must be definable in
the relational data sublanguage and storable in the catalog, not in the application
programs.
11. Distribution Independence
The data manipulation sublanguage of a relational DBMS must enable application
programs and terminal activities to remain logically unaffected whether and whenever
data are physically centralized or distributed.
12. Nonsubversion Rule
If a relational system has or supports a low-level access of data, there must not be a
way to bypass the integrity rules of the database.
References
Carlos Coronel, Steven Morris, Peter Rob. Database Principles: Fundamentals of Design, Implementation, and Management.