INTRODUCTION TO
INFORMATION
MODELING
Matilda Wilson
INTRODUCTION
• In today’s competitive environment, the efficient management of
data (or information) is one of the most critical business
objectives of an organisation.
• It is also a fact that we are in the age of information
explosion where people are bombarded with data and it
is a difficult task to get the right information at the
right time to take the right decision.
• Therefore, the success of an organisation is now, more
than ever, dependent on its ability to acquire accurate,
reliable and timely data about its business or operation
for effective decision-making process.
• A database system is a tool that simplifies the above
tasks of managing the data and extracting useful
information in a timely fashion.
• It analyses and guides the activities or business
purposes of an organisation.
• It is the central repository of the data in the organisation’s
information system and is essential for supporting the organisation’s
functions, maintaining the data for these functions and helping users
interpret the data in decision-making.
• Managers are seeking to use knowledge derived from databases for
competitive advantage, for example to:
• determine customer buying patterns,
• track sales,
• support customer relationship management (CRM),
• support on-line shopping,
• support employee relationship management,
• implement decision support systems (DSS),
• manage inventories, and so on.
• To meet changing organisational needs, database structures must
be flexible enough to:
• accept new data,
• accommodate new relationships,
• support new decisions.
• With the rapid growth in computing technology and
its application in all spheres of modern society,
databases have become an integral component of
our everyday life.
• We encounter several activities in our day-to-day life
that involve interaction with a database,
• for example, bank database to withdraw and
deposit money
• air or railway reservation databases for booking of
tickets
• library database for searching of a particular book
• supermarket goods databases to keep the inventory, to check
for sufficient credit balance while purchasing goods using credit
• In fact, databases and database management systems (DBMS)
have become essential for managing
• our business,
• governments,
• banks,
• universities and every other kind of human endeavour.
• Thus, they are a critical element of today’s software industry,
which faces the daunting task of managing the huge amounts of data
that are increasingly being stored.
MOTIVATION
Motivation for Information Modeling
This lecture provides a motivation for studying conceptual modeling
and presents a brief historical and structural overview of information
systems.
Database management systems are widely used and are a major
productivity tool for businesses that are information oriented.
For a database to be used effectively, its data should be correct,
complete, and efficiently accessed.
• This requires that the database is well-designed. Designing a
database involves building a formal model of the business domain
or universe of discourse (UoD).
• To do this properly requires a good understanding of the UoD and a
means of specifying this understanding in a clear, unambiguous
way.
• Object-Role Modeling (ORM) simplifies the analysis and design
process by using natural language, intuitive diagrams, and
examples, and by examining the information in terms of
simple, elementary facts.
• By expressing the model in terms of natural concepts, such
as objects and roles, this fact-oriented method provides a
truly conceptual approach to modeling.
• Other valuable modeling approaches include Entity-Relationship
(ER) modeling and object-oriented modeling with the Unified
Modeling Language (UML).
• Although ER and UML models are typically more compact than
ORM models, they are arguably less suitable than ORM for
formulating, transforming, or evolving a conceptual information
model.
• ER models and UML class diagrams are further removed from
natural language, lack the expressibility and simplicity of a role-
based constraint notation, are less stable in the face of domain
evolution, are harder to populate with fact instances, and may
hide information about the semantic domains that glue the model
together.
• However, ER and UML models better highlight the major
features of the domain being modeled by representing currently
less important features as attributes.
• ORM is used as our basic conceptual modeling method.
• ER models and UML class diagrams are useful as well,
especially for providing compact summaries, and are best
developed as views of ORM models.
• For database applications, conceptual models typically need to
be mapped to attribute-based logical and physical models.
• ER models provide designs that are closer to relational database
structures.
• For object-oriented applications, UML models can incorporate
implementation details as well as behavior and deployment
aspects not covered by the ORM and ER approaches.
• Programming tasks are typically coded in third generation
languages such as C# and Java.
• Fourth generation database languages such as SQL are
declarative in nature, enabling users to declare what has to be
done without the fine detail of how to do it, and are set
oriented rather than record oriented.
• Fifth generation languages such as ConQuer enable users to
query conceptual models directly.
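To make the contrast between record-oriented and set-oriented processing concrete, here is a minimal sketch using Python's built-in sqlite3 module. The employee table and its rows are invented for illustration and are not part of the lecture material.

```python
import sqlite3

# Hypothetical sample data, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dept TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Ann", "Sales", 50000.0), ("Bob", "Sales", 42000.0), ("Cho", "IT", 61000.0)],
)

# Record-oriented (third-generation style): spell out HOW, one record at a time.
procedural = []
for name, dept, salary in conn.execute("SELECT name, dept, salary FROM employee"):
    if dept == "Sales" and salary > 45000:
        procedural.append(name)

# Set-oriented (fourth-generation style): declare WHAT is wanted and let
# the database engine decide how to find it.
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM employee WHERE dept = 'Sales' AND salary > 45000"
    )
]

conn.close()
```

Both approaches select the same employees; the declarative query simply leaves the iteration and filtering strategy to the DBMS.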
• Hierarchic and network database systems store some facts in
record types and some facts in links between record types.
• Relational database systems store all facts in tables.
• No matter how “intelligent” software systems become, people
are needed to describe the universe of discourse and to ask the
relevant questions about it.
Information Modeling
• Information modeling means modeling the information requirements of
a business domain in a way that business users can easily understand,
and from which a database structure to store that information can be
generated automatically.
• Although information models are commonly called data models,
information adds semantics or meaning to the data, which may be
just a bunch of numbers or character strings.
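Example: the same data values (10 and 100) convey different
information depending on the headings they appear under:

Headings                             Data
(none)                               10    100
X, Y                                 10    100
Number, Square                       10    100
PatientNumber, Temperature           10    100
PatientNumber, Temperature (°F)      10    100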
Data Modeling
• The data model is a
cornerstone for every
information system,
because it describes the
entities that the system
will create and maintain
during its lifetime.
Data Modeling
• Data modeling is a technique for organizing and documenting a
system’s DATA. Data modeling is sometimes called database modeling
because a data model is usually implemented as a database.
Systems thinking is the application of formal systems theory and concepts to
systems problem solving.
An ERD depicts data in terms of the entities and relationships described by the
data.
MODELING ENTERPRISE DATA
[Figure: layers labeled Enterprise View (corporate strategies, congruent goals, etc.), Conceptual Data Models, Logical Models, Physical Models, and Production System, with arrows labeled Satisfy (toward Business Needs), Implement, and Maintenance.]
Data Modeling (Cont.)
• Building the data model is probably the most important activity
during requirements definition.
• In the process of understanding how the data is organized and
identifying the relationships that exist between entities, you can
discover most of the functionality that the system will satisfy.
Business area of Data Model
• The data model of a business area tends to be
relatively stable, compared, for example, to the
set of operational procedures or organizational
structure, which changes frequently.
• Therefore, basing the implementation of the
future system upon a well-defined data model is
a good first step towards developing a system
that meets the real requirements of the users.
Techniques and Approaches
• Two techniques are used together to model the data of a
system:
• Entity relationship model (E-R/M)
• Data normalization
E-R/M and Normalization
• E-R/M (and EE-R/M) aims at identifying the entities that are part of
the system, the attributes that make up these entities, and the
dependencies between entities.
• Normalization makes the data model created using E-R/M more
robust and extends the life of systems based on the model.
E-R/M and Normalization (Cont.)
• These two techniques
go hand in hand and
should be applied
conscientiously during
data modeling
activities.
Approaches to Data Modeling
• Two approaches to
data modeling:
Top-down
Bottom-up
Top-Down vs. Bottom-Up Approach
Top-down:
1. Identify entities
2. Discover relationships
3. Define attributes
Bottom-up:
1. Collect data
2. Analyze/synthesize data
3. Define attributes
4. Discover relationships
5. Identify entities
Conceptual Data Model
Principles
Graphical Languages
Modeling
Constraints
Principles
• Main approach – object-oriented
• Class (entity set, object)
• Association (relationship, relation)
• Data member (attribute, property)
• Instance (entity, occurrence)
Languages
• Entity Relationship model (E-R) (ERM)
• Entity set
• Relationship
• Attribute
Languages
• MERISE
• Object with occurrences
• Relation
• Property
[Diagram: entities STUDENT (Name, Address), SUBJECT (Name), and TEXTBOOK; relations ENROLL and IS GIVEN.]
Languages
• Object Role Modeling (ORM)
Languages
• Class Diagram
• Class with instances
• Association
• Property
[Diagram: class STUDENT (Name, FamilyName, Address) linked to class SUBJECT (SubName) by a many-to-many (* to *) association with role names "enrolls" and "is enrolled".]
Conceptual Model
• Goals
• Starting from the dictionary and the rules, this model tries to reveal
the relations among the data and their interaction
• Example – School
Rules
1. Every class has one and only one room.
2. Every subject is taught by only one teacher.
3. Every class is taught a subject for a fixed number of hours.
4. Every student can have no more than one mark in every subject.
5. The school manages the timetable and the rating of students and
teachers.
Dictionary
• Student’s Address
• Subject
• Number of Hours
• Class Name
• Student’s Family Name
• Teacher’s Name
• Mark
• Room Number
• Student’s Name
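As a rough illustration of how such rules constrain the data, the sketch below checks the "at most one" part of rules 1 and 4 against a few sample facts. The sample facts and the helper name duplicate_keys are assumptions made up for this example; the check does not cover the "at least one" part of rule 1.

```python
# Hypothetical sample facts; names and values are invented for illustration.
class_rooms = [("1A", "R101"), ("1B", "R102")]              # (class name, room number)
marks = [("Ivanov", "Math", 5), ("Ivanov", "Physics", 4)]   # (student, subject, mark)


def duplicate_keys(facts, key_len):
    """Return the keys (first key_len fields) that occur in more than one fact."""
    seen, duplicates = set(), set()
    for fact in facts:
        key = fact[:key_len]
        if key in seen:
            duplicates.add(key)
        seen.add(key)
    return duplicates


# Rule 1: a class has only one room, so a class name may appear only once.
rule1_violations = duplicate_keys(class_rooms, 1)

# Rule 4: at most one mark per (student, subject) pair.
rule4_violations = duplicate_keys(marks, 2)

# A second Math mark for the same student breaks rule 4.
bad_marks = marks + [("Ivanov", "Math", 3)]
rule4_bad = duplicate_keys(bad_marks, 2)
```

In a conceptual model these rules are stated once as constraints rather than re-checked in application code; the sketch only shows what the constraints mean for the facts.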
Concepts
• Class (Entity class, Entity instance)
• Association
• Relationship between entity instances
• Attribute
• Properties
Defining an Entity Class
• Give it a name (a noun)
• Define its attributes
• Define the rules
• What belongs to the class?
• How are the instances identified in the class?
• Identifying an instance (Identifier)
Template: NAME
1. Attribute
2. Attribute
Example: STUDENT
1. First Name
2. Last Name
3. Address
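The steps for defining an entity class can be sketched in code as follows. This is only an illustration: the slide's STUDENT example lists no identifier, so the student_id field is a hypothetical addition, and the sample values are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    """Entity class: the name is a noun, the fields are its attributes."""
    student_id: int   # hypothetical identifier, not shown on the slide
    first_name: str
    last_name: str
    address: str


# Two instances that share every descriptive attribute are still
# distinct entities because their identifiers differ.
a = Student(1, "Ana", "Lee", "1 Main St")
b = Student(2, "Ana", "Lee", "1 Main St")

# Index the instances by identifier.
by_id = {s.student_id: s for s in (a, b)}
```

Without an identifier, the two students above could not be told apart, which is why identification is part of defining the class.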
Association
• Give a name (a verb)
• Determine the participating classes
• Define the cardinalities
Examples
Identifier of an Association
Cardinalities of an Association
• Cardinalities One to One
• 0..1 – 0..1 – Every student can use one
locker
• 0..1 – 1
• 1 – 1 Every student uses a locker and
there are no free lockers
Cardinalities of an Association
• Cardinalities One to Many
• 1 – 1..N
• 0..1 – 1..N
• 1 – 0..N
• 0..1 – 0..N
Cardinalities of an Association
• Cardinalities Many to Many
• 1..N – 1..N
• 0..N – 1..N
• 0..N – 0..N
Cardinalities of an Association
• Generalization
• Minimal cardinality
• Mandatory participation of every instance - 1
• Optional participation of every instance - 0
• Maximal cardinality
• To only one instance of the other class – 1
• To multiple instances of the other class - N
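The minimal/maximal reading of a cardinality can be sketched as a small parser. The function names parse_cardinality and describe are hypothetical, and the sketch assumes the notation used on these slides ("0..1", "1", "1..N", "0..N").

```python
def parse_cardinality(text):
    """Split a cardinality such as '0..N' into (minimum, maximum).

    The maximum is None when it is 'N' (many). A bare '1' is read as '1..1'.
    """
    low, _, high = text.partition("..")
    if not high:              # e.g. '1' is shorthand for '1..1'
        high = low
    minimum = int(low)
    maximum = None if high.upper() == "N" else int(high)
    return minimum, maximum


def describe(text):
    """Return (participation, reach) for one end of an association."""
    minimum, maximum = parse_cardinality(text)
    participation = "mandatory" if minimum >= 1 else "optional"
    reach = "many" if maximum is None or maximum > 1 else "one"
    return participation, reach
```

For example, describe("0..N") reads as optional participation with links to multiple instances, while describe("1") reads as mandatory participation with exactly one instance.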
Dimension of an Association
• Number of different classes participating in it
• Multidimensional
[Diagram: a three-class association "Make cours" with attribute HourNumber, linking TEACHER (TeacherName), SUBJECT (SubName), and ROOM (RoomNo), each end with multiplicity *.]
Dimension of an Association
• Multidimensional
Dimension of an Association
• One-dimensional (Reflexive)
Aggregate Associations
• Aggregation
• Composition
Weak Entities
• It is identified through the association
Recommendations
• Don’t use high-dimension associations
• Be careful not to replace classes with associations
Case Study – Management Rules
1. A patient is characterized by:
• Unique Number
• Name
• Address
• Phone Number
2. A general practitioner (GP) is characterized by:
• Serial Number
• Name
• Phone Number
3. Each patient is supervised by a GP.
4. A policlinic is characterized by:
• Name
• Address
• Phone Number
5. A specialist is characterized by:
• Serial Number
• Name
• Phone Number
6. Each specialist has one or more specialties.
7. Each specialist can give consultations in one or more policlinics.
8. Each policlinic groups several specialists.
9. A patient can make an appointment for a consultation with a specialist in a given policlinic. The specialist must work in this policlinic.
10. The appointment is for a date that is later than the date on which the appointment is made.
11. If the consultation does not take place, a new appointment must be made, whatever the reasons for the failure.
12. Lists of appointments for every specialist are made at the beginning of the day.
13. At the end of every day two reports are made:
• A log of appointments made
• A log of consultations done
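As a hedged illustration of how rules 9 and 10 constrain an appointment, the sketch below validates a booking request. The specialist and policlinic names, the works_in set, and the can_book function are all invented for this example, and rule 10 is read here as "the consultation date must be after the date on which the booking is made".

```python
from datetime import date

# Hypothetical data: which specialists work in which policlinics (rules 7 and 8).
works_in = {("Dr. A", "Central"), ("Dr. A", "North"), ("Dr. B", "North")}


def can_book(specialist, policlinic, made_on, scheduled_for):
    """Check rules 9 and 10 for a requested appointment."""
    works_there = (specialist, policlinic) in works_in   # rule 9
    in_future = scheduled_for > made_on                  # rule 10 (assumed reading)
    return works_there and in_future


ok = can_book("Dr. A", "Central", date(2024, 3, 1), date(2024, 3, 8))
wrong_place = can_book("Dr. B", "Central", date(2024, 3, 1), date(2024, 3, 8))
wrong_date = can_book("Dr. A", "North", date(2024, 3, 1), date(2024, 3, 1))
```

The first request satisfies both rules; the second fails rule 9 because the specialist does not work in that policlinic; the third fails rule 10 because the consultation is not later than the booking date.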
Case Study - Policlinic
Subtypes
• Example – Hardware components order
Data Modeling for
Database Design
Study Objectives
• Understand concepts of data modeling and its purpose
• Learn how relationships between entities are defined
and refined, and how such relationships are incorporated
into the database design process
• Learn how ERD components affect database design and
implementation
• Learn how to interpret the modeling symbols
Data Model
• Model: an abstraction of a real-world object or
event
• Useful in understanding complexities of the real-
world environment
• Data model
• A diagram that displays a set of tables and the
relationships between them
• Next Slide: “Restaurant” Access data model using
Entity Relationship Diagram (ERD)
Access Data Model using ERD
What is an Entity Relationship Diagram (ERD)?
• ERD is a data modeling technique used in software engineering to
produce a conceptual data model of an information system.
• So, ERDs illustrate the logical structure of databases.
• ERD development using a CASE tool
• PowerDesigner by SAP
• Data Modeler by Oracle
The Importance of Data Model
• Blueprint: official documentation
• Like the blueprint of a house
• Employees without database knowledge can understand a data model
diagram more easily than a list of tables
• Used as an effective Communication Tool
• Improve interaction among the managers, the designers,
and the end users
• Independence from a particular DBMS
• Network DB, Object-oriented DB, etc.
Data Model (cont.)
• Data modeling revolves around discovering and
analyzing organizational and user data
requirements.
• Requirements based on policies, meetings,
procedures, system specifications, etc.
• Identify what data is important
• Identify what data should be maintained
ERD
• The major activity of this phase is identifying entities,
attributes, and their relationships to construct model
using the Entity Relationship Diagram.
• Entity → table
• Attribute → column
• Relationship → line
How to find entities?
• Entity:
• "...anything (people, places, objects, events, etc.) about
which we store information (e.g. supplier, machine tool,
employee, utility pole, airline seat, etc.).”
• Tangible: customer, product
• Intangible: order, accounts receivable
• Look for singular nouns (beginner)
• BUT a proper noun is not a good candidate….
Entity Instance
Entity instance: a single occurrence of an entity.
• Entity: student (6 instances; each row is one instance)

Student ID   Last Name   First Name
2144         Arnold      Betty
3122         Taylor      John
3843         Simmons     Lisa
9844         Macy        Bill
2837         Leath       Heather
2293         Wrench      Tim
How to find attributes?
• Attribute:
• Attributes are data objects that either identify or
describe entities (property of an entity).
• In other words, it is a descriptor whose values are
associated with individual entities of a specific entity type
• The process for identifying attributes is similar except now you
want to look for and extract those names that appear to be
descriptive noun phrases.
How to find relationships?
• Relationship:
• Relationships are associations between entities.
• Typically, a relationship is indicated by a verb connecting
two or more entities.
• Employees are assigned to projects
• Relationships should be classified in terms of cardinality.
• One-to-one, one-to-many, etc.
How to find cardinalities?
• Cardinality:
• Cardinality expresses how many occurrences of one
entity can be associated with how many occurrences
of another.
• There are three basic cardinalities (degrees of
relationship).
• one-to-one (1:1), one-to-many (1:M), and many-to-many
(M:N)
Identifier
“attributes that uniquely identify entity instances”
• Becomes a primary key (PK) in a relational database system (RDS)
• Composite identifiers are identifiers that consist of
two or more attributes
• Identifiers are represented by underlining the name of
the attribute(s)
• Employee (Employee_ID), student (Student_ID)
Crow’s Foot Notation
• Known as Information Engineering (IE) notation (most popular)
• Entity:
• Represented by a rectangle, with its name on the top.
The name is singular (entity) rather than plural
(entities).
Attributes
• Identifiers are represented by underlining the name
of the attribute(s)
Basic Cardinality Type
• 1-to-1 relationship
• 1-to-M relationship
• M-to-N relationship
Cardinality (cont.)
Example Model
Data Model in Peter Chen’s Notation
(first, original)
Business Rule Example 1
• Finalized business rules must be bi-
directional.
• Draft: one sentence
• Finalized: two sentences
• A professor advises many students
(professor to student). Each student
is advised by one professor (student
to professor).
• A professor must teach many classes.
Each class must be taught by one
professor.
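One common way to carry a finalized one-to-many rule such as "a professor advises many students; each student is advised by one professor" into a relational design is a foreign key on the "many" side. The sketch below uses Python's built-in sqlite3 module; the table names, column names, and sample rows are assumptions invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute(
    "CREATE TABLE professor (prof_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
# "Each student is advised by one professor": one mandatory advisor per student.
conn.execute(
    """CREATE TABLE student (
           student_id INTEGER PRIMARY KEY,
           name       TEXT NOT NULL,
           advisor_id INTEGER NOT NULL REFERENCES professor(prof_id)
       )"""
)

conn.execute("INSERT INTO professor VALUES (1, 'Prof. Gray')")
# "A professor advises many students": several rows may share one advisor_id.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(10, "Betty", 1), (11, "John", 1)],
)
advisee_count = conn.execute(
    "SELECT COUNT(*) FROM student WHERE advisor_id = 1"
).fetchone()[0]

# A student pointing at a non-existent professor is rejected.
try:
    conn.execute("INSERT INTO student VALUES (12, 'Lisa', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

conn.close()
```

Reading the rule in both directions is what tells the designer which table receives the foreign key and whether it may be empty.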
Business Rule 1
• Business Rules are used to define entities, attributes, relationships and
constraints.
• More generally, a business rule is a brief explanation of a policy, procedure,
or principle within the organization that stores or uses the data.
• The data can be considered significant only after business rules are defined.
• Without them, the stored values are just records, not meaningful data for a
relational database system (RDS).
Business Rule 2
• When creating business rules, keep them simple, broad, and easy to understand,
so that everyone shares a similar understanding and interpretation.
• Sources of business rules:
• Direct interviews with internal & external stakeholders
• Site visitations (collect data) and observation of the work process or procedure
• Review and study of documents (Policies, Procedures, Forms, Operation manuals, etc..)
Discovering Business Rules
• Real world example on the class website
• After reviewing and studying the interview and various forms, develop draft
business rules; these need not be bi-directional and may use less precise wording
• Keep on going until “optimized”
• Then, finalize Business Rules: bi-directional.
Business Rule Example 2
• A sales representative must write many
invoices. Each invoice has to be written
by one sales representative.
• Each sales representative must be
assigned to many departments. Each
department has only one sales
representative.
• A customer has to generate many
invoices. An invoice is generated by
only one customer.
Attributes
“Describe detailed information about an entity”
• Entity: Employee
• Attributes:
• Employee-Name
• Address (composite)
• Phone Extension
• Date-Of-Hire
• Job-Skill-Code
• Salary
Classes of attributes
• Simple attribute
• Composite attribute
• Derived attributes
• Single-valued attribute
• Multi-valued attribute
Simple/Composite attribute
• A simple attribute cannot be subdivided.
• Examples: Age, Gender, and Marital status
• A composite attribute can be further subdivided to
yield additional attributes.
• Examples:
• ADDRESS -- Street, City, State, Zip
• PHONE NUMBER -- Area code, Exchange number
Derived attribute
• is not physically stored within the database
• instead, it is derived by using an algorithm.
• Example 1: Late Charge of 2%
• MS Access: InvoiceAmt * 0.02
• Example 2: AGE can be derived from the date of birth
and the current date.
• MS Access: Int((Date() - Emp_Dob) / 365)
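The two derived attributes above can also be sketched outside MS Access. The function names below are hypothetical. Note that the age calculation swaps in a calendar comparison instead of dividing by 365, which avoids the small drift that leap days introduce into the division-based formula.

```python
from datetime import date


def late_charge(invoice_amount, rate=0.02):
    """Derived attribute: a 2% late charge computed from the stored amount."""
    return invoice_amount * rate


def age_in_years(date_of_birth, today):
    """Derived attribute: whole years between date_of_birth and today."""
    years = today.year - date_of_birth.year
    # Subtract one if this year's birthday has not happened yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years
```

Neither value needs to be stored: the late charge follows from the invoice amount, and the age follows from the date of birth and the current date, for example age_in_years(some_dob, date.today()).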
Single-valued attribute
• can have only a single (atomic) value.
• Examples:
• A person can have only one social security number.
• A manufactured part can have only one serial number.
• A single-valued attribute is not necessarily a simple
attribute.
• Part No: CA-08-02-189935
• Location: CA, Factory#:08, shift#: 02, part#: 189935
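To show that a single-valued attribute can still be composite, the sketch below splits the part number from the slide into its components. The function name and the dictionary keys are assumptions chosen for this illustration.

```python
def split_part_number(part_no):
    """Split a part number of the form LOCATION-FACTORY-SHIFT-PART."""
    location, factory, shift, part = part_no.split("-")
    return {"location": location, "factory": factory, "shift": shift, "part": part}


# The part still has exactly one part number (single-valued),
# but that one value can be subdivided (not simple).
parts = split_part_number("CA-08-02-189935")
```

The components are kept as strings so that leading zeros such as "08" and "02" are preserved.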
Multi-valued attributes
• can have many values.
• Examples:
• A person may have several college degrees.
• A household may have several phones with different numbers
• A car may have more than one color
Example - “Movie Database”
• Entity:
• Movie Star
• Attributes:
• SS#: “123-45-6789” (single-valued)
• Cell Phone: “(661)123-4567, (661)234-5678” (multi-
valued)
• Name: “Harrison Ford” (composite)
• Address: “123 Main Str., LA, CA” (composite)
• Gender: “Female” (simple)
• Age: 24 (derived)
Procedure of ERD
• Relatively simple representations of complex real-
world data structures
• Data modeling is an iterative process.
• “complete” and “100% error free” model is not
possible!
• Only “Optimized” model is possible….