CH2Database Models.2024
Diploma: FIS
Welcome
This document contains all the learning material for your course. Your course material is organized in the same way as
your Learner Guide. Your learning content is divided into Learning Units, and each unit is aligned to a specific chapter
of your prescribed textbook. All learning material (notes, slides, videos, etc.) will be uploaded into the relevant
learning unit folder. You are encouraged to read your material from Learning Unit 1 through to Learning Unit 5 in that
order, as each unit builds on the previous one. It is also important to explore and familiarize yourself with the content
on WiSeUp.
This learning unit is linked to Chapter 2 and is one of the first units that will be covered in the first semester. This chapter
examines data modeling. Data modeling is the first step in the database design journey, serving as a bridge between
real-world objects and the computer database. One of the most frustrating problems of database design is that designers,
programmers, and end users see data in different ways. Consequently, different views of the same data can lead to database
designs that do not reflect an organization’s actual operation, thus failing to meet end-user needs and data efficiency requirements.
To avoid such failures, database designers must obtain a precise description of the data’s nature and many uses within the
organization. Communication among database designers, programmers, and end users should be frequent and clear. Data
modeling clarifies such communication by reducing the complexities of database design to more easily understood abstractions
that define entities, relations, and data transformations.
Learning objectives
After successful completion of this learning unit, students will understand the following:
First, you will learn some basic data-modeling concepts and how current data models have developed from earlier models. Tracing
the development of those database models will help you understand the database design and implementation issues that are
addressed in the rest of this book.
In chronological order, you will be introduced to the hierarchical and network models, the relational model, and the entity
relationship (ER) model. You will also learn about the use of the entity relationship diagram (ERD) as a data-modeling tool and
the different notations used for ER diagrams. Next, you will be introduced to the object-oriented (OO) model and the
object/relational model. Then, you will learn about the emerging NoSQL data model and how it is being used to fulfill the current
need to manage very large social media data sets efficiently and effectively. Finally, you will learn how data
abstraction helps reconcile varying views of the same data.
LESSON 2.1- Data Modeling and Data Models
Data modeling is the process of documenting a complex system design as an easily understood diagram using text and symbols
to represent the way data needs to flow.
A data model is a relatively simple representation, usually graphical, of more complex real-world data structures. In general
terms, a model is an abstraction of a more complex real-world object or event. A data model is a plan, or blueprint, for a
database design; it is a generalized, non-DBMS-specific design. By analogy, consider the construction of your dorm or
apartment building.
The contractor did not just buy some timber, call for the concrete trucks, and start work. Instead, an architect constructed
plans and blueprints for that building long before construction began.
A model’s main function is to help you understand the complexities of the real-world environment. Within the database
environment, a data model represents data structures and their characteristics, relations, constraints, transformations, and
other constructs with the purpose of supporting a specific problem domain. Data models are built during the analysis and
design phase of a project to ensure that the requirements of a new application are fully understood.
A data model can be thought of as a flowchart that illustrates the relationships between data.
Designing a database properly is fundamental to establishing a database that meets the needs of the users. Data models
capture the nature of and relationships among data and are used at different levels of abstraction as a database is
conceptualized and designed. The effectiveness and efficiency of a database are directly associated with the structure of the
database. Various graphical systems exist that convey this structure and are used to produce data models that can be
understood by end users, systems analysts, and database designers.
LESSON 2.3- Data Model Basic Building Blocks
The basic building blocks of all data models are entities, attributes, relationships, and constraints.
• An entity is anything (a person, a place, a thing, or an event) about which data are to be collected and stored. An
entity represents a particular type of object in the real world. Entities may be physical objects, such as customers or
products, but entities may also be concepts, such as flight routes or musical concerts.
• An attribute is a characteristic of an entity. For example, a CUSTOMER entity would be described by attributes such as
customer last name, customer first name, customer phone, customer address, and customer credit limit. Attributes are
the equivalent of fields in file systems.
• A relationship describes an association among entities. For example, a relationship exists between customers and
agents that can be described as follows: an agent can serve many customers, and each customer may be served by one agent.
o Data models use three types of relationships: one-to-many, many-to-many, and one-to-one.
o One-to-many (1:M or 1..*) relationship. A customer (the “one”) may generate many invoices, but each invoice
(the “many”) is generated by only a single customer. The “CUSTOMER generates INVOICE” relationship would
therefore be labeled 1:M.
o Many-to-many (M:N or *..*) relationship. An employee may learn many job skills, and each job skill may be
learned by many employees. Database designers label the relationship “EMPLOYEE learns SKILL” as M:N. Similarly,
a student can take many classes and each class can be taken by many students, so the “STUDENT takes CLASS”
relationship is also M:N.
o One-to-one (1:1 or 1..1) relationship. A retail company may require that each of its stores be managed by a
single employee. In turn, each store manager, who is an employee, manages only a single store. Therefore, the
relationship “EMPLOYEE manages STORE” is labeled 1:1.
• A constraint is a restriction placed on the data. Constraints are important because they help to ensure data integrity.
Constraints are normally expressed in the form of rules. For example: An employee’s salary must have values that are
between 6,000 and 350,000.
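The four building blocks can be sketched in a few lines of Python using the standard sqlite3 module. This is only an illustration, not a prescribed design: the table names, column names, and sample values are assumptions made up for the example. It shows two entities with their attributes, a 1:M relationship (CUSTOMER generates INVOICE), and the salary constraint from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical table and column names, chosen only for this sketch.
conn.executescript("""
CREATE TABLE customer (                -- entity
    cus_id     INTEGER PRIMARY KEY,
    cus_lname  TEXT NOT NULL,          -- attributes
    cus_fname  TEXT NOT NULL,
    cus_phone  TEXT
);
CREATE TABLE invoice (                 -- entity on the "many" side
    inv_id  INTEGER PRIMARY KEY,
    cus_id  INTEGER NOT NULL REFERENCES customer(cus_id)   -- 1:M relationship
);
CREATE TABLE employee (
    emp_id      INTEGER PRIMARY KEY,
    emp_salary  REAL CHECK (emp_salary BETWEEN 6000 AND 350000)  -- constraint
);
""")

# One customer generates many invoices (sample data is invented).
conn.execute("INSERT INTO customer VALUES (1, 'Dlamini', 'Thandi', '555-0100')")
conn.executemany("INSERT INTO invoice VALUES (?, ?)", [(1001, 1), (1002, 1)])


def try_insert(sql, params):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


salary_ok = try_insert("INSERT INTO employee VALUES (?, ?)", (1, 25000))
salary_too_high = try_insert("INSERT INTO employee VALUES (?, ?)", (2, 500000))
# An invoice must belong to an existing customer; customer 99 does not exist.
orphan_invoice = try_insert("INSERT INTO invoice VALUES (?, ?)", (1003, 99))
invoice_count = conn.execute(
    "SELECT COUNT(*) FROM invoice WHERE cus_id = 1"
).fetchone()[0]
```

In this sketch the database itself rejects a salary outside the allowed range and an invoice without a customer, which is how constraints help to ensure data integrity.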
When database designers go about selecting or determining the entities, attributes, and relationships that will be used to build a data
model, they might start by gaining a thorough understanding of what types of data exist in an organization, how the data is used, and
in what time frames it is used. From a database point of view, the collection of data becomes meaningful only when it reflects properly
defined business rules.
A business rule is a brief, precise, and unambiguous description of a policy, procedure, or principle within a specific organization.
Business rules derived from a detailed description of an organization’s operations help to create and enforce actions within that
organization’s environment. Business rules must be rendered in writing and updated to reflect any change in the organization’s
operational environment.
Properly written business rules are used to define entities, attributes, relationships, and constraints. To be effective, business rules must
be easy to understand and widely disseminated to ensure that every person in the organization shares a common interpretation of the rules.
Examples of business rules are as follows:
• A customer may generate many invoices.
• An invoice is generated by only one customer.
• A training session cannot be scheduled for fewer than 10 employees or for more than 30 employees.
Note that those business rules establish entities, relationships, and constraints. For example, the first two business rules establish two
entities (CUSTOMER and INVOICE) and a 1:M relationship between those two entities. The third business rule establishes a constraint
(no fewer than 10 people and no more than 30 people) and two entities (EMPLOYEE and TRAINING), and also implies a relationship
between EMPLOYEE and TRAINING.
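As a rough sketch of how the third business rule becomes a constraint, the check below is written in plain Python. The class name, function name, and field names are hypothetical; only the limits of 10 and 30 employees come from the business rule itself.

```python
from dataclasses import dataclass, field

# Limits taken from the business rule: no fewer than 10, no more than 30.
MIN_EMPLOYEES = 10
MAX_EMPLOYEES = 30


@dataclass
class TrainingSession:
    """Hypothetical TRAINING entity related to many EMPLOYEE entities."""
    title: str
    employee_ids: list = field(default_factory=list)


def can_schedule(session):
    """Return True only if the session satisfies the business rule."""
    # Duplicate ids are counted once, since they refer to the same employee.
    head_count = len(set(session.employee_ids))
    return MIN_EMPLOYEES <= head_count <= MAX_EMPLOYEES


too_small = TrainingSession("Data modeling basics", list(range(9)))
just_right = TrainingSession("Data modeling basics", list(range(10)))
too_large = TrainingSession("Data modeling basics", list(range(31)))
```

Here `can_schedule(just_right)` is accepted while the other two sessions are rejected, because the rule is stated precisely enough to be tested.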
Discovering Business Rules
The main sources of business rules are company managers, policy makers, department managers, and written
documentation such as a company’s procedures, standards, and operations manuals. A faster and more direct source of
business rules is direct interviews with end users.
Attribute name
• Required to be descriptive of the data represented by the attribute
Proper naming
• Facilitates communication between parties
• Promotes self-documentation
LESSON 2.5- The Evolution of Data Models
The quest for better data management has led to several models that attempt to resolve the previous model’s critical
shortcomings and to provide solutions to ever-evolving data management needs. Dozens of different tools and techniques
for constructing data models have been defined over the years.
Hierarchical Model
• Hierarchical models were developed to manage large amounts of data for complex manufacturing projects
• Represented by an upside-down tree which contains segments
• Segments are the equivalent of a file system’s record type
• Depicts a set of one-to-many (1:M) relationships
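The upside-down tree can be sketched in Python as follows. This is a simplified illustration, not how a hierarchical DBMS is actually implemented, and the segment names are invented for the example. Each segment has at most one parent, and a segment is reached by following the path from the root.

```python
class Segment:
    """A node in the upside-down tree; each segment has at most one parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)   # one parent, many children (1:M)

    def path(self):
        """Return the hierarchical path from the root down to this segment."""
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return "/".join(reversed(names))


# Hypothetical manufacturing example: an assembly made of components and parts.
root = Segment("FINAL_ASSEMBLY")
component_a = Segment("COMPONENT_A", root)
component_b = Segment("COMPONENT_B", root)
part_a1 = Segment("PART_A1", component_a)

part_path = part_a1.path()
root_children = [child.name for child in root.children]
```

The `path` method hints at why users of such a system need to know the hierarchical path: the only way to reach `PART_A1` is through its parent and the root.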
Network Model
The network model improved on the hierarchical model.
• Network models were created to represent complex data relationships effectively
• Improved database performance and imposed a database standard
• Allows a record to have more than one parent
Entities are represented as a network of connected records. One child entity can have more than one parent entity. For
example, in the figure, the Subject has two children: one child is STUDENT and the other is Degree.
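The key difference from the hierarchical model, a record with more than one parent, can be sketched in Python as below. The class and the record names (CUSTOMER, SALESREP, INVOICE, INVLINE) are assumptions chosen for illustration and do not come from the figure.

```python
from collections import defaultdict


class RecordNetwork:
    """Stores parent/child links; a child may have several parents."""

    def __init__(self):
        self._parents = defaultdict(set)    # child  -> set of parents
        self._children = defaultdict(set)   # parent -> set of children

    def link(self, parent, child):
        """Connect a parent record type to a child record type."""
        self._parents[child].add(parent)
        self._children[parent].add(child)

    def parents_of(self, child):
        return sorted(self._parents[child])

    def children_of(self, parent):
        return sorted(self._children[parent])


net = RecordNetwork()
net.link("CUSTOMER", "INVOICE")   # an invoice belongs to a customer ...
net.link("SALESREP", "INVOICE")   # ... and also to a sales representative
net.link("INVOICE", "INVLINE")

invoice_parents = net.parents_of("INVOICE")
invoice_children = net.children_of("INVOICE")
```

In a hierarchical tree the second `link` call for INVOICE would not be allowed, because each segment has only one parent.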
Relational model
The relational model was introduced in 1970. It represented a major breakthrough for both users and designers. You can
think of a relation (sometimes called a table) as a structure composed of intersecting rows and columns.
Each row in a relation is called a tuple. Each column represents an attribute.
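To make the terms concrete, the short Python sketch below builds one relation with the standard sqlite3 module and reads back its attributes (columns) and tuples (rows). The AGENT table, its column names, and its sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A relation (table) with three attributes (columns); names are hypothetical.
conn.execute(
    "CREATE TABLE agent ("
    " agent_code INTEGER PRIMARY KEY,"
    " agent_lname TEXT,"
    " agent_phone TEXT)"
)

# Each inserted row is one tuple of the relation.
conn.executemany(
    "INSERT INTO agent VALUES (?, ?, ?)",
    [(501, "Alby", "555-0101"), (502, "Hahn", "555-0102")],
)

cursor = conn.execute("SELECT * FROM agent ORDER BY agent_code")
attributes = [column[0] for column in cursor.description]  # column names
tuples = cursor.fetchall()                                  # list of rows
```

Each element of `tuples` is an ordinary Python tuple with one value per attribute, which mirrors the row-and-column picture of a relation.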
LESSON 2.6- Big Data
“Big Data” refers to a movement to find new and better ways to manage large amounts of web- and sensor-generated
data and derive business insight from it. Big Data is a massive volume of both structured and unstructured data that
is so large that it is difficult to process using traditional database and software techniques.
Basic characteristics of Big Data databases: volume, velocity, and variety, or the 3 Vs.
Volume refers to the amount of data being stored.
Velocity is the measure of how fast the data is coming in. Facebook has to handle a tsunami of photographs every
day. It has to ingest it all, process it, file it, and somehow, later, be able to retrieve it.
Variety refers to the fact that the data being collected comes in different data formats.
Advantages and Disadvantages of various Data Models
Hierarchical Model
Advantages
• Promotes data sharing
• Parent/child relationship promotes conceptual simplicity and data integrity
• Database security is provided and enforced by DBMS
• Efficient with 1:M relationships
Disadvantages
• Requires knowledge of physical data storage characteristics
• Navigational system requires knowledge of hierarchical path
• Changes in structure require changes in all application programs
• Implementation limitations
• No data definition
• Lack of standards
Network Model
Advantages
• Conceptual simplicity
• Handles more relationship types
• Data access is flexible
• Data owner/member relationship promotes data integrity
• Conformance to standards
• Includes data definition language (DDL) and data manipulation language (DML)
Disadvantages
• System complexity limits efficiency
• Navigational system yields complex implementation, application development, and management
• Structural changes require changes in all application programs
Relational Model
Advantages
• Structural independence is promoted using independent tables
• Tabular view improves conceptual simplicity
• Ad hoc query capability is based on SQL
• Isolates the end user from physical-level details
• Improves implementation and management simplicity
Disadvantages
• Requires substantial hardware and system software overhead
• Conceptual simplicity gives untrained people the tools to use a good system poorly
• May promote information problems
Entity Relationship Model
Advantages
• Visual modeling yields conceptual simplicity
• Visual representation makes it an effective communication tool
• Is integrated with the dominant relational model
Disadvantages
• Limited constraint representation
• Limited relationship representation
• No data manipulation language
• Loss of information content occurs when attributes are removed from entities to avoid crowded displays
Object-Oriented Model
Advantages
• Semantic content is added
• Visual representation includes semantic content
• Inheritance promotes data integrity
Disadvantages
• Slow development of standards caused vendors to supply their own enhancements
• Complex navigational system
• Learning curve is steep
• High system overhead slows transactions
NoSQL
Advantages
• High scalability, availability, and fault tolerance are provided
• Uses low-cost commodity hardware
• Supports Big Data
• Key-value model improves storage efficiency
Disadvantages
• Complex programming is required
• There is no relationship support
• There is no transaction integrity support
• In terms of data consistency, it provides an eventually consistent model
LEVELS OF ABSTRACTION
The process of hiding irrelevant information at each level of a database is known as data abstraction.
Which information is relevant and which is irrelevant depends on the level itself, as described below.
Data abstraction in DBMS is very helpful in dealing with the complex database system because it breaks the problem into
sub-problems, which makes it easy to manage.
Example: We use Google daily, but we have no idea how it stores its data. Information such as how and where Google stores
its data is irrelevant to us, which is why it is hidden from us. This is known as data abstraction.
There are three levels of data abstraction in DBMS which reduce the complexity of the database and also provide data
independence at each level.
Physical level
This is the first or lowest level of abstraction which describes how a record is actually stored in the system memory. It is a
low-level representation of the database. Physical level deals with the storage of the data for the whole database system.
The Database Administrator (DBA) manages the physical level. The DBA decides things such as the drive where the data will
actually be stored in the system and whether the storage will be centralized or decentralized.
Logical level
This is the second level of abstraction in DBMS. It describes the data stored in the database and the relationships among them.
The logical level contains the data that is actually stored in the database. It defines the overall structure of the database and
relationships between the data.
In simple words, we create the blueprint of the database at the logical level.
Example: Take the example of a university database. We need to store data about teachers and students. But what data
are we going to store? What are their types? How will they be related to each other?
At the logical level, we define all of these. Take a table of teachers that contains TEACHER_ID, NAME, and SALARY, and a
table of students that contains STUDENT_ID, NAME, COURSE, PROJECT_NAME, PROJECT_GUIDE, and so on. PROJECT_GUIDE may
contain only values that are present in TEACHER_ID. Here we define the structure of the database and the relationships
among the data.
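The university example can be sketched with Python's standard sqlite3 module. The column types and the sample rows are assumptions added for illustration; the table and column names follow the example above. The foreign key expresses the rule that PROJECT_GUIDE may hold only a value that exists in TEACHER_ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Logical-level definition: what data is stored, its types, and how it relates.
conn.executescript("""
CREATE TABLE teacher (
    teacher_id  INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    salary      REAL
);
CREATE TABLE student (
    student_id     INTEGER PRIMARY KEY,
    name           TEXT NOT NULL,
    course         TEXT,
    project_name   TEXT,
    project_guide  INTEGER REFERENCES teacher(teacher_id)
);
""")

# Invented sample rows.
conn.execute("INSERT INTO teacher VALUES (1, 'T. Mokoena', 42000)")
conn.execute("INSERT INTO student VALUES (10, 'S. Naidoo', 'FIS', 'Library DB', 1)")

# Teacher 7 does not exist, so this student row should be rejected.
try:
    conn.execute("INSERT INTO student VALUES (11, 'A. Botha', 'FIS', 'Payroll DB', 7)")
    guide_rule_enforced = False
except sqlite3.IntegrityError:
    guide_rule_enforced = True

student_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```

Nothing in this definition says where or how the rows are stored on disk; that belongs to the physical level.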
View level
This is the last level of abstraction in DBMS. It is intended for final users.
The application program (which general users use) presents the data according to the user's role. Data that is irrelevant
to a user is hidden from that user's view. This is easier to understand with an example.
Example: Students only need to view their scores, courses, attendance, and other details that are relevant to them. Students
cannot view a teacher's salary because that data is irrelevant to them.
But teachers can view every detail of the students as well as their own data.
Here we create two separate views: one for the students and one for the teachers, each with the appropriate set of data.
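The two views can be sketched with SQL views in Python's sqlite3 module. The view names, column aliases, and sample rows are assumptions for this illustration. For simplicity, the teacher view here joins every student to their guide and does not restrict a teacher to their own salary row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE teacher (
    teacher_id INTEGER PRIMARY KEY, name TEXT, salary REAL
);
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY, name TEXT, course TEXT, score INTEGER,
    project_guide INTEGER REFERENCES teacher(teacher_id)
);
INSERT INTO teacher VALUES (1, 'T. Mokoena', 42000);
INSERT INTO student VALUES (10, 'S. Naidoo', 'FIS', 78, 1);

-- Student view: relevant student details only, no salary column.
CREATE VIEW student_view AS
    SELECT s.student_id AS student_id, s.name AS student_name,
           s.course AS course, s.score AS score, t.name AS guide_name
    FROM student AS s
    LEFT JOIN teacher AS t ON t.teacher_id = s.project_guide;

-- Teacher view: student details plus the teacher's own data.
CREATE VIEW teacher_view AS
    SELECT s.student_id AS student_id, s.name AS student_name,
           s.course AS course, s.score AS score,
           t.teacher_id AS teacher_id, t.name AS teacher_name,
           t.salary AS salary
    FROM student AS s
    JOIN teacher AS t ON t.teacher_id = s.project_guide;
""")


def columns(view_name):
    """Return the column names exposed by one of the two views above."""
    cursor = conn.execute(f"SELECT * FROM {view_name}")
    return [column[0] for column in cursor.description]


student_columns = columns("student_view")
teacher_columns = columns("teacher_view")
```

Both views read the same underlying tables; only what each group of users is allowed to see differs.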
The external model is the end users’ view of the data environment. The term end users refers to people who use the application programs
to manipulate the data and generate information. End users usually operate in an environment in which an application has a specific
business unit focus. Companies are generally divided into several business units, such as sales, finance, and marketing. Each business
unit is subject to specific constraints and requirements, and each one uses a subset of the overall data in the organization.
The conceptual model represents a global view of the entire database by the entire organization. That is, the conceptual model
integrates all external views (entities, relationships, constraints, and processes) into a single global view of the data in the
enterprise, as shown in Figure 2.8. Also known as a conceptual schema, it is the basis for the identification and high-level
description of the main data objects (avoiding any database model-specific details).
The Internal Model
Once a specific DBMS has been selected, the internal model maps the conceptual model to the DBMS. The internal model is the
representation of the database as “seen” by the DBMS. In other words, the internal model requires the designer to match the conceptual
model’s characteristics and constraints to those of the selected implementation model.
The physical model operates at the lowest level of abstraction, describing the way data is saved on storage media such as magnetic, solid
state, or optical media. The physical model requires the definition of both the physical storage devices and the (physical) access methods
required to reach the data within those storage devices, making it both software and hardware dependent. The storage structures used
are dependent on the software (the DBMS and the operating system) and on the type of storage devices the computer can handle.