CSCI 5333: Database Management System
PART-3, Chapter-7
Database Design and the
Entity-Relationship Model
(ER-Data Model)
Khondker S. Hasan
Department of Computing Sciences
University of Houston – Clear Lake
Chapter 6: Entity-Relationship Model
Outline
Design Process
Modeling
Constraints
E-R Diagram
Design Issues
Weak Entity Sets
Extended E-R Features
Design of the Example Database
Reduction to Relation Schemas
Database Design
Database Design Phases
The initial phase of database design is to characterize fully the data
needs of the prospective database users.
Next, the designer chooses a data model and, by applying the
concepts of the chosen data model, translates these requirements
into a conceptual schema of the database.
A fully developed conceptual schema also indicates the functional
requirements of the enterprise.
In a “specification of functional requirements”, users describe the
kinds of operations (or transactions) that will be performed on the
data.
Design Phases (Cont.)
The process of moving from an abstract data model to the
implementation of the database proceeds in two final design phases.
Logical Design – Deciding on the database schema. Database
design requires that we find a “good” collection of relation
schemas.
Business decision – What attributes should we record in the database?
Computer Science decision – What relation schemas should we have and how
should the attributes be distributed among the various relation schemas?
Physical Design – Deciding on the physical layout of the database
Entity-Relationship Model
The ER data model was developed to facilitate database design
by allowing specification of an enterprise schema that
represents the overall logical structure of a database.
Based on a perception of a real-world enterprise that consists
of a set of basic objects called entities and of relationships among them.
The ER model is an object-based logical model.
Specifies an enterprise schema
represents the overall logical structure of a database
Semantics of data
Constraints
maps the meanings and interactions of real-world enterprises onto a
conceptual schema
Entities and Attributes
Entity
thing e.g. person, car, account, tree
has a set of properties that uniquely identify it
may be concrete or conceptual
Entity Set
the collection of all entities of the same type that share the
same properties e.g. customer, supplier, student
individual entities that constitute a set are said to be the
extension of the entity set
Entity Sets -- instructor and student
[Figure: entity set instructor (instructor_ID, instructor_name) and entity set student (student_ID, student_name)]
Entities and Attributes
Attributes
the descriptive properties of each member of the entity set
customer=(cus-name, cus-address, cus-phone)
database stores similar information for each entity, but
values may differ
the set of permitted values for each attribute is known as the
domain or value set of the attribute
Entities and Attributes
Formally, an attribute of an entity set is a function that maps
from the entity set into a domain
[Figure: the attribute cus-state maps each entity in the entity set to a value in its domain, e.g. NY, MD, FL, VA, OK, ...]
Entities and Attributes
Attribute Types
simple
not divided into subparts, e.g. cus-state
composite
can be divided into subparts
e.g. cus-name: firstname, middlename, surname
help us group together related attributes
single-valued
attribute contains a single value for each entity
e.g. customer age, birth date etc.
multi-valued
attribute may contain a set of values
e.g. dependent names, phone numbers
Entities and Attributes
null attributes
used when an entity does not have a value for an attribute
can also mean that the value is unknown
E.g., middle-name
derived
the value for this type of attribute can be derived from other attributes
e.g. total-due derived from multiple purchases from a supplier
The value of a derived attribute is not stored, but is computed when
required.
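The attribute types above can be sketched in Python. This is only an illustration under stated assumptions: the class names Name and Customer, the field names, and the sample values are made up for the example and are not part of the course material.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Name:
    """Composite attribute: cus-name is divided into subparts."""
    firstname: str
    surname: str
    middlename: Optional[str] = None  # null attribute: absent or unknown


@dataclass
class Customer:
    cus_name: Name                                        # composite
    cus_state: str                                        # simple, single-valued
    phones: List[str] = field(default_factory=list)       # multi-valued
    purchases: List[float] = field(default_factory=list)  # basis for total_due

    @property
    def total_due(self) -> float:
        """Derived attribute: computed when required, never stored."""
        return sum(self.purchases)


# Hypothetical sample entity.
c = Customer(Name("Ada", "Lovelace"), "NY",
             phones=["555-0100", "555-0101"],
             purchases=[20.0, 15.5])
print(c.total_due)            # 35.5
print(c.cus_name.middlename)  # None
```

Modeling total_due as a property rather than a field mirrors the rule that a derived value is computed on demand instead of being stored.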
Entities and Attributes
Redundant Attribute:
Suppose we have entity sets:
instructor, with attributes: ID, name, dept_name, salary
department, with attributes: dept_name, building, budget
We model the fact that each instructor has an associated department using
a relationship set inst_dept
The attribute dept_name appears in both entity sets.
Since it is the primary key for the entity set department, it replicates
information present in the relationship and is therefore redundant in the
entity set instructor and needs to be removed.
BUT: when converting back to tables, in some cases the attribute gets
reintroduced, as we will see later.
E-R Diagram With Composite, Multivalued,
and Derived Attributes
Relationship Sets
A relationship is an association among two or more entities
A relationship represents an association that exists in the real-
world enterprise. e.g., depositor
Verbs are good candidates for relationships
A relationship set is a set of relationships of the same type
Formally, it is a mathematical relation on n ≥ 2 entity sets
[Figure: entity sets x and y connected by relationship set xy; xy is a relation]
Relationship Sets
If E1, …, En are entity sets, then a relationship set R is a subset of
{(e1, e2, …, en) | e1 ∈ E1, e2 ∈ E2, …, en ∈ En}
Association between entity sets is referred to as participation.
That is, the entity sets E1, …, En participate in relationship set R.
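The subset definition can be checked directly with Python sets. A minimal sketch, assuming two small entity sets whose members are identified by made-up names:

```python
from itertools import product

# Hypothetical entity sets, identified here by name only.
instructor = {"Katz", "Singh"}
student = {"Shankar", "Peltier", "Zhang"}

# All possible (instructor, student) pairs: the product E1 x E2.
all_pairs = set(product(instructor, student))

# A relationship set is any subset of that product.
advisor = {("Katz", "Shankar"), ("Singh", "Peltier")}

print(len(all_pairs))        # 6
print(advisor <= all_pairs)  # True: advisor is a valid relationship set
```

Here Zhang appears in no pair, which is allowed: the definition only requires that every tuple in the relationship set draws its components from the participating entity sets.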
Relationship Set advisor
Relationship Sets
Relationships may have descriptive attributes
[Figure: relationship set depositor between customer and account, with descriptive attribute access_date]
Most relationship sets are binary, that is, they involve two entity sets.
Occasionally, there may be more than two entity sets involved in a
relationship.
Relationship Sets (Cont.)
An attribute can also be associated with a relationship set.
For instance, the advisor relationship set between entity sets
instructor and student may have the attribute date which tracks
when the student started being associated with the advisor
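One way to picture an attribute on a relationship set is a mapping from each relationship (a pair of entities) to the attribute value. A small sketch, assuming made-up names and dates:

```python
from datetime import date

# Each (instructor, student) relationship carries its own date value.
# The attribute belongs to the pair, not to either entity alone.
advisor_date = {
    ("Katz", "Shankar"): date(2023, 8, 21),
    ("Singh", "Peltier"): date(2024, 1, 16),
}


def advising_since(instructor: str, student: str) -> date:
    """Look up the descriptive attribute of one advisor relationship."""
    return advisor_date[(instructor, student)]


print(advising_since("Katz", "Shankar"))  # 2023-08-21
```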
Recursive Relationship Sets (Roles)
The function an entity plays in a relationship is called its role
The labels “manager” and “worker” are called roles; they specify how
employee entities interact via the works_for relationship set.
Roles are indicated in E-R diagrams by labeling the lines that connect diamonds
to rectangles.
Role labels are optional, and are used to clarify semantics of the relationship
If the entity sets of a relationship set are not distinct (a recursive relationship), role names may be
necessary
Degree of a Relationship Set
Binary relationship
involves two entity sets (degree two).
most relationship sets in a database system are binary.
Relationships among more than two entity sets are rare. (More on this later.)
Example: students work on research projects under the
guidance of an instructor.
relationship proj_guide is a ternary relationship between
instructor, student, and project
Mapping Constraints
E-R schema may specify certain constraints to which the database must
conform
Mapping cardinalities
Existence dependencies
Participation constraints
Mapping Cardinalities (cardinality ratios)
express the number of entities to which another entity can be
associated via a relationship set
For a binary relationship R between entity sets A and B, the mapping
cardinality must be one of the following...
Mapping Constraints
One to One
an entity in A is associated with at most one entity in B. An entity
in B is associated with at most one entity in A
[Figure: one-to-one mapping between A = {a1, a2, a3, a4} and B = {b1, b2, b3, b4}]
Mapping Constraints
One to Many
an entity in A is associated with any number of entities in B.
An entity in B can be associated with at most one entity in A
[Figure: one-to-many mapping from A = {a1, a2} to B = {b1, b2, b3, b4}]
Mapping Constraints
Many to One
An entity in A is associated with at most one entity in B. An
entity in B can be associated with any number of entities in A
[Figure: many-to-one mapping from A = {a1, a2, a3, a4} to B = {b1, b2}]
Mapping Constraints
Many to Many
An entity in A is associated with any number of entities in B.
An entity in B is associated with any number of entities in A
[Figure: many-to-many mapping between A = {a1, a2, a3, a4} and B = {b1, b2, b3, b4}]
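The four mapping cardinalities can be told apart by counting how many partners each entity has. The function below is a sketch under one assumption: it classifies the tightest cardinality consistent with one observed set of pairs. A real mapping cardinality is a design constraint declared by the designer, so data can only show which constraints are violated, not which one was intended. The function name and the sample pairs are made up for the example.

```python
from collections import Counter
from typing import Iterable, Tuple


def mapping_cardinality(pairs: Iterable[Tuple[str, str]]) -> str:
    """Classify the (a, b) pairs present in a binary relationship set."""
    pairs = set(pairs)
    per_a = Counter(a for a, _ in pairs)  # number of B entities per A entity
    per_b = Counter(b for _, b in pairs)  # number of A entities per B entity
    a_at_most_one = all(n <= 1 for n in per_a.values())
    b_at_most_one = all(n <= 1 for n in per_b.values())
    if a_at_most_one and b_at_most_one:
        return "one-to-one"
    if b_at_most_one:
        return "one-to-many"   # each b has at most one a; an a may have many b
    if a_at_most_one:
        return "many-to-one"   # each a has at most one b; a b may have many a
    return "many-to-many"


print(mapping_cardinality({("a1", "b1"), ("a2", "b2")}))                # one-to-one
print(mapping_cardinality({("a1", "b1"), ("a1", "b2"), ("a2", "b3")}))  # one-to-many
print(mapping_cardinality({("a1", "b1"), ("a2", "b1")}))                # many-to-one
print(mapping_cardinality({("a1", "b1"), ("a1", "b2"), ("a2", "b1")}))  # many-to-many
```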
Existence Dependencies
If the existence of entity x depends on the existence of
entity y, then x is said to be existence dependent on y
y is dominant entity
x is subordinate entity
e.g., payment is existence dependent on loan
Participation Constraints
The participation of an entity set E in a relationship R is said
to be total if every entity in E participates in at least one
relationship in R (denoted by double line)
If only some entities in E participate in relationships in R, the
participation is said to be partial
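Total versus partial participation reduces to a subset test: is every entity of E found in at least one relationship of R? A minimal sketch, assuming relationships are stored as tuples and that the hypothetical sets loan, customer, and borrower below stand in for real data:

```python
def participation(entity_set, relationship_set, position):
    """Return 'total' if every entity appears at `position` in some tuple
    of the relationship set, otherwise 'partial'."""
    participating = {rel[position] for rel in relationship_set}
    return "total" if set(entity_set) <= participating else "partial"


# Hypothetical data: borrower pairs are (customer, loan).
customer = {"C-1", "C-2", "C-3"}
loan = {"L-1", "L-2"}
borrower = {("C-1", "L-1"), ("C-2", "L-2")}

print(participation(loan, borrower, 1))      # total: every loan has a borrower
print(participation(customer, borrower, 0))  # partial: C-3 has no loan
```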
Keys
It is necessary to specify how entities within entity sets and
relationship sets are uniquely identified
We use keys to distinguish between entities
A superkey is a set of one or more attributes, that taken
collectively, allows us to uniquely identify an entity in an entity
set
e.g., {social-security-number, student-name}
Keys
However, superkeys may contain extraneous attributes
Minimal superkeys are known as candidate keys, e.g.,
{customer-id} and {cust-name, cust-street} are candidate keys.
A primary key is the candidate key that is chosen to uniquely
identify individual entities
A primary key that consists of more than one attribute is known
as a composite key
An attribute of an entity that is the primary key in another
entity is known as a foreign key
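The difference between a superkey and a candidate key can be illustrated by brute force on a small table. This is a sketch with an important caveat: keys are declared by the designer from the meaning of the data, and a sample of rows can only rule a key out, never prove it. The function names and the sample rows are assumptions made for the example.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two rows agree on all attributes in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def candidate_keys(rows, attributes):
    """Minimal superkeys for this particular set of rows."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # Skip supersets of a key already found: they are not minimal.
            if any(set(k) <= set(attrs) for k in keys):
                continue
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys


customers = [
    {"customer-id": 1, "cust-name": "Lee", "cust-street": "Main"},
    {"customer-id": 2, "cust-name": "Lee", "cust-street": "Oak"},
    {"customer-id": 3, "cust-name": "Kim", "cust-street": "Main"},
]
attributes = ["customer-id", "cust-name", "cust-street"]

print(is_superkey(customers, ["customer-id", "cust-name"]))  # True, but not minimal
print(candidate_keys(customers, attributes))
# [('customer-id',), ('cust-name', 'cust-street')]
```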
Guidelines for choosing primary key
Identify all candidate keys for all entities and select
primary key based on the following guideline:
minimal set of attributes
least likely to have value changed
least likely to lose uniqueness
fewest characters
easiest to use from the user's perspective
E-R Diagrams
The overall logical structure of the database can be expressed
graphically using an E-R Diagram
[Figure: legend of E-R diagram symbols: entity sets, weak entity sets, relationship sets, identifying relationship sets, attributes, multi-valued attributes, derived attributes, and lines (which join entities to relationships and attributes to entities)]
E-R Diagrams
Attributes that are primary keys are underlined
Total participation in a relationship is denoted by a double line
Partial participation in a relationship is denoted by a single line
A directed line into an entity set means that entity set forms the
'One' part of a Many to One / One to Many relationship
An undirected line denotes the 'Many' side of a relationship
E-R Diagrams
[Figure: four E-R diagrams of relationship set xy between entity sets x and y, labeled Many to Many, One to Many, Many to One, and One to One]
Practice Problem
Construct an E-R diagram for a car insurance company whose customers
own one or more cars each.
Each car has associated with it zero to any number of recorded
accidents.
Each insurance policy covers one or more cars, and has one or more
premium payments associated with it.
Each payment is for a particular period of time, and has an associated
due date, and the date when the payment was received.
Weak Entity Sets
A weak entity set is an entity set that does not have sufficient
attributes to form a primary key (as opposed to a strong entity set)
It must be associated with another entity set, called the identifying or
owner entity set. That is, the weak entity set is said to be existence
dependent on the identifying entity set.
The relationship between the weak entity set and the identifying entity set is
known as the identifying relationship. It must be a many-to-one
relationship (from weak to identifying entity set), and the participation
of the weak entity set is total.
The discriminator of a weak entity set is a set of attributes that
distinguishes among the weak entities that depend on one particular strong entity. E.g.,
payment-number
Weak Entity Sets
The primary key of a weak entity set is formed by a combination of
the primary key of the strong entity set on which the weak entity set
depends
plus the weak entity set's discriminator
A weak entity set is designated by a double box and the corresponding
identifying relationship by a double diamond
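The way a weak entity set borrows part of its key can be sketched with the payment and loan example. The attribute name loan_number, the Payment class, and the sample values are assumptions for illustration only; the source names only payment-number as the discriminator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Payment:
    loan_number: str     # primary key of the identifying (strong) entity set
    payment_number: int  # discriminator of the weak entity set
    amount: float

    @property
    def primary_key(self):
        """Strong entity set's key plus the discriminator."""
        return (self.loan_number, self.payment_number)


payments = [
    Payment("L-17", 1, 100.0),
    Payment("L-17", 2, 100.0),
    Payment("L-23", 1, 250.0),  # same payment_number, different loan
]

keys = [p.primary_key for p in payments]
discriminators = [p.payment_number for p in payments]

print(len(set(keys)) == len(keys))                      # True: combined key is unique
print(len(set(discriminators)) == len(discriminators))  # False: discriminator alone is not
```

The second check shows why the discriminator is not a key by itself: payment number 1 occurs under two different loans.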
Extended E-R Features
Specialization
an entity set may have certain entities that are distinct in some way from
other entities, e.g. they have extra attributes
The process of designating subgroupings within an entity set is called specialization.
Extended E-R Features
[Figure: entity set account (a/c number, balance) with lower-level entity sets savings (interest) and checking (overdraft)]
Extended E-R Features
[Figure: entity set account (a/c number, balance) connected through ISA to savings (interest) and checking (overdraft)]
Extended E-R Features
Specialization is also referred to as a superclass-subclass relationship
Subclass inherits the attributes of the superclass
Top-down approach: design process is from upper-level to lower-level
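Attribute inheritance in a specialization resembles class inheritance in an object-oriented language. The sketch below uses the account example; the class names, field names, and values are assumptions, and Python classes are only an analogy for E-R specialization, not an equivalent.

```python
from dataclasses import dataclass


@dataclass
class Account:
    """Higher-level entity set (superclass)."""
    account_number: str
    balance: float


@dataclass
class SavingsAccount(Account):
    """Lower-level entity set: inherits account_number and balance."""
    interest: float = 0.0


@dataclass
class CheckingAccount(Account):
    """Lower-level entity set with its own extra attribute."""
    overdraft: float = 0.0


s = SavingsAccount("A-101", 500.0, interest=0.02)
print(s.balance)               # 500.0, inherited from Account
print(isinstance(s, Account))  # True: every savings account ISA account
```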
Extended E-R Features
Generalization
same as specialization except done using a bottom-up approach, i.e. identify entity sets
with common features and combine them into a superclass
Simple inversion of specialization
We apply both processes in combination when designing an E-R schema
Extended E-R Features
Constraints: for accurate design, DB designers may choose to place
certain constraints on a particular generalization
condition defined
membership of lower-level entity set determined by an entity satisfying a
condition
E.g., to distinguish kinds of Account, we can use an account_type attribute.
user defined
membership defined by database user. That is, DB user assigns entities to a
given entity set.
Not automatic, user in charge
E.g., membership of an employee in a group
disjoint
an entity belongs to no more than one lower-level entity set. E.g., a bank account
can be either a savings-account or a checking-account but not both, so
savings and checking accounts are disjoint
Extended E-R Features
overlapping
an entity may belong to more than one lower-level entity set
E.g., a manager can work in multiple work groups
total
each higher-level entity must belong to at least one lower-level entity
set
E.g., Account
partial
some higher-level entities may not belong to any lower-level
entity set. E.g., Employee in a work group
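The disjoint/overlapping and total/partial constraints can both be read off from how many lower-level entity sets each higher-level entity belongs to. A sketch under stated assumptions: the function name, the dictionary layout, and the sample identifiers are invented for the example, and the check describes one data instance rather than the declared constraint.

```python
def check_specialization(higher, lower_sets):
    """Classify a specialization instance.

    higher: set of higher-level entities
    lower_sets: dict mapping lower-level entity set name -> set of members
    Returns a pair: ('disjoint' or 'overlapping', 'total' or 'partial').
    """
    membership_count = {e: 0 for e in higher}
    for name, members in lower_sets.items():
        if not members <= higher:
            raise ValueError(f"{name} contains entities outside the higher-level set")
        for e in members:
            membership_count[e] += 1
    disjointness = ("disjoint"
                    if all(n <= 1 for n in membership_count.values())
                    else "overlapping")
    completeness = ("total"
                    if all(n >= 1 for n in membership_count.values())
                    else "partial")
    return disjointness, completeness


accounts = {"A-1", "A-2", "A-3"}
account_kinds = {"savings": {"A-1", "A-2"}, "checking": {"A-3"}}
print(check_specialization(accounts, account_kinds))  # ('disjoint', 'total')

employees = {"E-1", "E-2", "E-3"}
work_groups = {"group-x": {"E-1", "E-2"}, "group-y": {"E-2"}}
print(check_specialization(employees, work_groups))   # ('overlapping', 'partial')
```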
Extended E-R Features
Aggregation
The basic E-R model cannot express relationships among relationships
Aggregation is an abstraction through which relationships are treated as
higher-level entities.
[Figure: E-R diagram with entity sets customer, loan, and employee and relationship sets borrower and loan-officer]
Extended E-R Features
Every customer-loan pair in borrower is repeated in loan-officer, which causes data
redundancy
If borrower and loan-officer are combined into one relationship set, that implies a loan officer must be
assigned to each loan.
Solution: treat customer, loan, and borrower as a higher-level entity
Extended E-R Features
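[Figure: the same E-R diagram with aggregation: customer, borrower, and loan treated as a higher-level entity that is related to employee through loan-officer]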
Conceptual Database Design
A design methodology is a structured approach that uses
procedures, techniques, tools and documentation aids to support
and facilitate the process of design
Conceptual Database Design is the process of constructing a model
of the information used in an enterprise, independent of ALL
physical considerations
Step 1 - Identify Entity Types
Identify and define the main objects (entities) of the enterprise
examine user requirements
look for noun and noun-phrases
look for major objects
identify existence dependent entities (if any)
Entity, relationship or attribute?
Depends on the enterprise
Document entity types
record names and descriptions
Step 2 - Identify Relationships
Determine relationships that exist among the entities
sometimes recorded as verbs in user requirements
relationships may be implicit as well as explicit
relationships may be binary, ternary, recursive etc.
Examples
Customer places Order
Instructor teaches Students
Employee is assigned to Project
Author writes Books
Determine the cardinality and participation constraints
Document relationships
name, descriptions, cardinality, participation
Step 3 - Identify attributes & Domains
Determine the attributes of the entities and relationships
attributes are the particular pieces of information we need to describe
each entity
nouns, noun phrases
what information are we required to hold?
single/composite, derived etc.
Determine the value sets for each attribute
Document the name and value type
Document attributes
name, description, aliases, data type and length, whether NULL is allowed, etc.
Step 4 – Define Constraints
Once the relationships between entities have been defined,
cardinality and participation constraints must be defined
For each relationship, indicate the cardinality, e.g. one-to-one,
many-to-one etc.
Also indicate the participation, total or partial
Step 5 - Determine Candidate and Primary Keys
Identify all candidate keys for all entities and select primary key
Guidelines for choosing primary key
minimal set of attributes
least likely to have value changed
least likely to lose uniqueness
fewest characters
easiest to use from the user's perspective
Note whether each entity set is strong or weak
Document the key
Step 6 - Specialization/Generalization
Optional
depends on enterprise being modeled
Step 7 - Draw E-R Diagram
Step 8 - Review
Total and Partial Participation
Total participation (indicated by double line): every entity in the entity set participates
in at least one relationship in the relationship set
participation of student in advisor relation is total
every student must have an associated instructor
Partial participation: some entities may not participate in any relationship in the
relationship set
Example: participation of instructor in advisor is partial
Notation for Expressing More Complex Constraints
A line may have an associated minimum and maximum cardinality, shown in the form
l..h, where l is the minimum and h the maximum cardinality
A minimum value of 1 indicates total participation.
A maximum value of 1 indicates that the entity participates in at most one
relationship
A maximum value of * indicates no limit.
Instructor can advise 0 or more students. A student must have 1 advisor; cannot
have multiple advisors
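An l..h label can be checked by counting, for each entity, the relationships it participates in. The sketch below applies this to the advisor example with 0..* on the instructor side and 1..1 on the student side. The function name, the use of None for *, and the sample names are assumptions made for the illustration.

```python
from collections import Counter


def satisfies(entity_set, relationship_set, position, low, high=None):
    """Check an l..h constraint; high=None stands for '*' (no limit)."""
    counts = Counter(rel[position] for rel in relationship_set)
    for e in entity_set:
        n = counts[e]  # Counter returns 0 for entities that never appear
        if n < low or (high is not None and n > high):
            return False
    return True


instructors = {"Katz", "Singh", "Crick"}
students = {"Shankar", "Peltier"}
advisor = {("Katz", "Shankar"), ("Katz", "Peltier")}

print(satisfies(instructors, advisor, 0, 0))  # True: 0..* on the instructor side
print(satisfies(students, advisor, 1, 1, 1))  # True: 1..1 on the student side

# Giving Shankar a second advisor violates the maximum of 1.
two_advisors = advisor | {("Singh", "Shankar")}
print(satisfies(students, two_advisors, 1, 1, 1))  # False
```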
Notation to Express Entity with Complex Attributes
Practice Problem #2
Design an E-R diagram for keeping track of the
activities of your favorite sports team.
You should store the matches played, the scores in
each match, the players in each match, and
individual player statistics for each match.
Summary statistics should be modeled as derived
attributes.
Top 5 Free Database Diagram (ERD) Design Tools
dbdiagram.io
draw.io
Lucidchart
SQLDBM
QuickDBD