student-ch02-data-models-solutions
Data Models
Answers to Review Questions
1. Discuss the importance of data modeling.
A data model is a relatively simple representation, usually graphical, of a more complex real-world
object or event. The data model’s main function is to help us understand the complexities of the
real-world environment. The database designer uses data models to facilitate the interaction among
designers, application programmers, and end users. In short, a good data model is a
communications device that helps eliminate (or at least substantially reduce) discrepancies between
the database design’s components and the real-world data environment. The development of data
models, bolstered by powerful database design tools, has made it possible to substantially diminish
the database design error potential. (Review Section 2.1 in detail.)
3. How would you translate business rules into data model components?
As a general rule, a noun in a business rule will translate into an entity in the model, and a verb
(active or passive) associating nouns will translate into a relationship among the entities. For
example, the business rule “a customer may generate many invoices” contains two nouns (customer
and invoice) and a verb (“generate”) that associates them.
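The noun-to-entity and verb-to-relationship translation can be sketched in code. The following is a minimal illustration, not the text's own implementation: it uses Python's built-in sqlite3 module, and the column names (CUS_CODE, CUS_LNAME, INV_NUMBER) and sample values are assumptions made for the example. Each noun becomes a table, and the verb "generate" becomes a foreign key on the "many" side.

```python
import sqlite3

# Assumed column names and sample rows; only the entity names CUSTOMER and
# INVOICE come from the business rule itself.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE CUSTOMER (
    CUS_CODE  INTEGER PRIMARY KEY,
    CUS_LNAME TEXT NOT NULL
);
-- "generate" is carried by the foreign key on the many side.
CREATE TABLE INVOICE (
    INV_NUMBER INTEGER PRIMARY KEY,
    CUS_CODE   INTEGER NOT NULL REFERENCES CUSTOMER (CUS_CODE)
);
""")

conn.execute("INSERT INTO CUSTOMER VALUES (1, 'Smith')")
conn.executemany("INSERT INTO INVOICE VALUES (?, ?)", [(1001, 1), (1002, 1)])

# One customer, many invoices.
invoice_count = conn.execute(
    "SELECT COUNT(*) FROM INVOICE WHERE CUS_CODE = 1"
).fetchone()[0]
print(invoice_count)
```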
5. Explain how the entity relationship (ER) model helped produce a more structured
relational database design environment.
An entity relationship model, also known as an ERM, helps identify the database's main entities and
their relationships. Because the ERM components are graphically represented, their role is more
easily understood. Using the ER diagram, it’s easy to map the ERM to the relational database
model’s tables and attributes. This mapping process uses a series of well-defined steps to generate
all the required database structures. (This structures mapping approach is augmented by a process
known as normalization, which is covered in detail in Chapter 7, “Normalization of Database
Tables.”)
An object has greater semantic content because it embodies both data and behavior. That is, in
addition to data, the object also contains the description of the operations that may be performed by
the object.
For use with Database Principles, Second Edition, Cengage Learning EMEA
9. How would you model Question 8 with an OODM? (Use Figure 2.7 as your guide.)
The OODM that corresponds to Question 8’s ERD is shown in Figure Q2.9:
[Figure Q2.9: OODM relating CUSTOMER to many (*) PAYMENT objects]
11. In terms of data and structural independence, compare file system data management with
the five data models discussed in this chapter.
Remind students to review the definitions of data- and structural independence found in Chapter 1,
“Database Systems.” Data independence exists when it is possible to make changes in the data
storage characteristics without affecting the application program’s ability to access the data.
Conversely, data dependence exists when an application program is unable to access the data after
a change in the data storage characteristics has been made. The practical significance of data
dependence is the difference between the logical data format (how the human being views the
data) and the physical data format (how the computer “sees” the data). Because a file system
exhibits data dependence, any program that accesses a file system’s file must not only tell the
computer what to do, but also how to do it.
In contrast to the file system, a relational database exhibits data independence. Therefore,
any program can access the data regardless of a change in the data storage characteristics. For
example, if you want to get a listing of all the customers whose last name is “Smith” in a relational
database, the command set
SELECT *
FROM CUSTOMER
WHERE CUS_LNAME = 'Smith';
reads the same regardless of the last name field’s size (for example, up to 20 bytes or up to 35
bytes) or characteristics (fixed field length or variable field length).
Structural independence exists when it is possible to make changes in the database structure
without affecting the application program’s ability to access the data. For example, the preceding
SQL command set works fine regardless of whether the CUS_LNAME is listed first, third, or last
in the CUSTOMER table structure.
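The independence claims above can be tried out with a small sketch. This is an illustration under stated assumptions, not part of the text: it uses Python's built-in sqlite3 module, the columns CUS_CODE and CUS_PHONE and the sample row are invented, and SQLite treats the declared VARCHAR sizes as informational only. The same SQL statement is run, unchanged, against two differently laid-out CUSTOMER tables.

```python
import sqlite3

# The statement from the text, run unchanged against both layouts.
QUERY = "SELECT * FROM CUSTOMER WHERE CUS_LNAME = 'Smith';"

def smith_last_names(create_sql, row):
    """Build a one-row CUSTOMER table and return the last names the query finds."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # lets results be read by column name
    conn.execute(create_sql)
    conn.execute("INSERT INTO CUSTOMER VALUES (?, ?, ?)", row)
    return [r["CUS_LNAME"] for r in conn.execute(QUERY)]

# Layout A: last name listed first, declared as VARCHAR(20).
layout_a = smith_last_names(
    "CREATE TABLE CUSTOMER (CUS_LNAME VARCHAR(20), CUS_CODE INTEGER, CUS_PHONE TEXT)",
    ("Smith", 10010, "555-0100"),
)

# Layout B: last name listed last, declared as VARCHAR(35).
layout_b = smith_last_names(
    "CREATE TABLE CUSTOMER (CUS_CODE INTEGER, CUS_PHONE TEXT, CUS_LNAME VARCHAR(35))",
    (10010, "555-0100", "Smith"),
)

print(layout_a, layout_b)
```

Only the table definitions and the insert order differ between the two layouts; the query text is identical in both calls.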
The comparisons are summarized in Table Q1.11.
1:1
An academic department is chaired by one professor; a professor may chair only one academic
department.
1:*
A customer may generate many invoices; each invoice is generated by one customer.
*:*
An employee may have earned many degrees; a degree may have been earned by many employees.
A relational diagram is a visual representation of the relational database’s entities, the attributes
within those entities, and the relationships between those entities. Therefore, it is easy to see what
the entities represent and to see what types of relationships (1:1, 1:*, *:*) exist among the entities
and how those relationships are implemented. An example of a relational diagram is found in the
text’s Figure 2.4.
You have physical independence when you can change the physical model without affecting the
internal model. Therefore, a change in storage devices or methods and even a change in operating
system will not affect the internal model.
The terms physical model and internal model may require a bit of additional discussion:
• The physical model operates at the lowest level of abstraction, describing the way data are
saved on storage media such as disks or tapes. The physical model requires the definition of
both the physical storage devices and the (physical) access methods required to reach the data
within those storage devices, making it both software- and hardware-dependent. The storage
structures used are dependent on the software (DBMS, operating system) and on the type of
storage devices that the computer can handle. The precision required in the physical model’s
definition demands that database designers who work at this level have a detailed knowledge
of the hardware and software used to implement the database design.
• The internal model is the representation of the database as “seen” by the DBMS. In other
words, the internal model requires the designer to match the conceptual model’s
characteristics and constraints to those of the selected implementation model. An internal
schema depicts a specific representation of an internal model, using the database constructs
supported by the chosen database.
Over the last few years, a new wave of data has emerged into the limelight. Such data have always
existed but did not receive the attention that they are receiving today. These data are characterized by
being high volume (petabyte size and beyond), high frequency (data are generated almost
constantly), and mostly semi-structured. These data come from multiple and varied sources such as
web site logs, posts on social web sites, and machine-generated information (GPS, sensors, etc.).
Such data have been accumulated over the years, and companies are now awakening to the fact that
they contain a lot of hidden information that could help the day-to-day business (such as browsing
patterns, purchasing preferences, behavior patterns, etc.). The need to manage and leverage these data
has triggered a phenomenon labeled “Big Data.” Big Data refers to a movement to find new and
better ways to manage large amounts of web-generated data and derive business insight from them,
while, at the same time, providing high performance and scalability at a reasonable cost.
Every time you search for a product on Amazon, send messages to friends on Facebook, watch a
video on YouTube, or search for directions in Google Maps, you are using a NoSQL database.
NoSQL refers to a new generation of databases that address the very specific challenges of the “big
data” era and have the following general characteristics:
• Not based on the relational model.
These databases are generally based on a variation of the key-value data model rather than on the
relational model, hence the NoSQL name. The key-value data model is based on a structure
composed of two data elements, a key and a value, in which for every key there is a
corresponding value (or a set of values). The key-value data model is also referred to as the
attribute-value or associative data model. In the key-value data model, each row represents one
attribute of one entity instance. The “key” column points to an attribute and the “value” column
contains the actual value for the attribute. The data type of the “value” column is generally a long
string to accommodate the variety of actual data types of the values that are placed in the
column.
• Support distributed database architectures.
One of the big advantages of NoSQL databases is that they generally use a distributed
architecture. In fact, several of them (Cassandra, Big Table) are designed to use low cost
commodity servers to form a complex network of distributed database nodes.
• Provide high scalability, high availability and fault tolerance.
NoSQL databases are designed to support the ability to add capacity (add database nodes to the
distributed database) when the demand is high and to do it transparently and without downtime.
Fault tolerant means that if one of the nodes in the distributed database fails, the database will
keep operating as normal.
• Support very large amounts of sparse data.
Because NoSQL databases use the key-value data model, they are suited to handle very high
volumes of sparse data; that is, for cases where the number of attributes is very large but the
number of actual data instances is low.
• Geared toward performance rather than transaction consistency.
One of the biggest problems of very large distributed databases is to enforce data consistency.
Distributed databases automatically make copies of data elements at multiple nodes – to ensure
high availability and fault tolerance. If the node with the requested data goes down, the request
can be served from any other node with a copy of the data. However, what happens if the network
goes down during a data update? In a relational database, transaction updates are guaranteed to
be consistent or the transaction is rolled back. NoSQL databases sacrifice consistency in order to
attain high levels of performance. NoSQL databases provide eventual consistency. Eventual
consistency is a feature of NoSQL databases that indicates that data are not guaranteed to be
consistent immediately after an update (across all copies of the data) but rather, that updates will
propagate through the system and eventually all data copies will be consistent.
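The key-value structure described under the first characteristic above can be sketched in a few lines of plain Python. This is a simplified illustration only, not how any particular NoSQL product stores data; the entity identifiers, keys, and values are invented for the example. Each row holds one attribute of one entity instance, and every value is kept as a string.

```python
# Each row is (entity instance, key, value). All sample data are hypothetical.
rows = [
    ("sensor-1", "temperature", "21.5"),
    ("sensor-1", "unit", "C"),
    ("sensor-2", "gps", "51.5,-0.12"),
]

def get_entity(rows, entity_id):
    """Collect the attribute-value pairs recorded for one entity instance."""
    return {key: value for (eid, key, value) in rows if eid == entity_id}

sensor_1 = get_entity(rows, "sensor-1")
sensor_2 = get_entity(rows, "sensor-2")
print(sensor_1)
print(sensor_2)
```

Because an attribute that an entity lacks is simply absent (no row is stored for it), the layout suits the sparse data discussed above.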
Problem Solutions
Use the contents of Figure 2.3 to work problems 1-5.
1. Write the business rule(s) that governs the relationship between AGENT and CUSTOMER.
Given the data in the two tables, you can see that an AGENT, through AGENT_CODE, can
occur many times in the CUSTOMER table. But each customer has only one agent. Therefore, the
business rules may be written as follows:
One agent can have many customers.
Each customer has only one agent.
Given these business rules, you can conclude that there is a 1:* relationship between AGENT and
CUSTOMER.
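The reasoning used to read the 1:* relationship from the table contents can be sketched as a small check. The rows below are hypothetical stand-ins, not the actual Figure 2.3 data, and the column pairing (CUS_CODE, AGENT_CODE) is an assumption made for the example.

```python
from collections import Counter

# Hypothetical CUSTOMER rows as (CUS_CODE, AGENT_CODE) pairs.
customers = [(10010, 502), (10011, 501), (10012, 502), (10013, 503)]

# Each customer appears in exactly one row, so each customer has one agent.
each_customer_once = len({cus for cus, _ in customers}) == len(customers)

# An agent code that repeats across customer rows marks the "many" side.
customers_per_agent = Counter(agent for _, agent in customers)
agent_repeats = any(count > 1 for count in customers_per_agent.values())

is_one_to_many = each_customer_once and agent_repeats
print(is_one_to_many)
```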
3. If the relationship between AGENT and CUSTOMER were implemented in a hierarchical
model, what would the hierarchical structure look like? Label the structure fully, identifying
the root segment and the Level 1 segment.
[Figure: hierarchical structure with child segments CUSTOMER 1, CUSTOMER 2, CUSTOMER 3, and CUSTOMER 4]
5. Using the ERD you drew in Problem 2, create the equivalent OO model. (Use Figure 2.7 as
your guide.)
[Figure: OO model relating AGENT to many (*) CUSTOMER objects]
Using Figure P2.1 as your guide, work Problems 6–7. The DealCo ERD shows the initial entities
and attributes for the DealCo stores, located in two regions of the country.
Using Figure P2.2 as your guide, answer Problems 7−9. The Tiny University Class ERD shows
Figure P2.2 The Tiny University Class ERD
7. Identify each relationship type and write all of the business rules.
The simplest way to illustrate the relationship between ENROLL, CLASS, and STUDENT is to
discuss the data shown in Table P2.7. As you examine the Table P2.7 contents and compare the
attributes to the relational schema shown in Figure P2.8, note these features:
• We have added an attribute, ENROLL_SEMESTER, to identify the enrollment period.
• Naturally, no grade has yet been assigned when the student is first enrolled, so we have
entered a default value “NA” for “Not Applicable.” The letter grade, one of A, B, C, D, F, I
(Incomplete), or W (Withdrawal), will be entered at the conclusion of the enrollment
period, the SPRING-13 semester.
• Student 11324 is enrolled in two classes; student 11892 is enrolled in three classes, and
student 10345 is enrolled in one class.
All of the relationships are 1:*. The relationships may be written as follows:
COURSE generates CLASS. One course can generate many classes. Each class is generated by
one course.
CLASS is referenced in ENROLL. One class can be referenced in enrollment many times. Each
individual enrollment references one class. Note that the ENROLL entity is also related to
STUDENT. Each entry in the ENROLL entity references one student and the class for which that
student has enrolled. A student cannot enroll in the same class more than once. If a student
enrolls in four classes, that student will appear in the ENROLL entity four times, each time for a
different class.
STUDENT is shown in ENROLL. One student can be shown in enrollment many times. (In
database design terms, “many” simply means “more than once.”) Each individual enrollment
entry shows one student.
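The rule that a student cannot enroll in the same class more than once can be enforced by the ENROLL table's key. The sketch below is one possible way to express it, assuming a composite primary key; it uses Python's built-in sqlite3 module, and the column names and the class code value are hypothetical. The STUDENT and CLASS tables are left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed column names; the composite key allows one row per student per class.
conn.execute("""
CREATE TABLE ENROLL (
    CLASS_CODE   TEXT    NOT NULL,
    STU_NUM      INTEGER NOT NULL,
    ENROLL_GRADE TEXT    DEFAULT 'NA',
    PRIMARY KEY (CLASS_CODE, STU_NUM)
)""")

conn.execute("INSERT INTO ENROLL (CLASS_CODE, STU_NUM) VALUES ('10014', 11324)")

# A second enrollment of the same student in the same class is rejected.
try:
    conn.execute("INSERT INTO ENROLL (CLASS_CODE, STU_NUM) VALUES ('10014', 11324)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# The grade starts at the default until one is assigned.
grade = conn.execute("SELECT ENROLL_GRADE FROM ENROLL").fetchone()[0]
print(duplicate_rejected, grade)
```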
9. United Broke Artists (UBA) is a broker for not-so-famous painters. UBA maintains a small
database to track painters, paintings, and galleries. A painting is painted by a particular
artist, and that painting is exhibited in a particular gallery. A gallery can exhibit many
paintings, but each painting can be exhibited in only one gallery. Similarly, a painting is
painted by a single painter, but each painter can paint many paintings. Using PAINTER,
PAINTING, and GALLERY, in terms of a relational database:
a. What tables would you create, and what would the table components be?
We would create the three tables shown in Figure P2.9a. (Use the teacher’s Ch02_UBA
database in your instructor's resources to illustrate the table contents.)
As you discuss the UBA database contents, note in particular the following business rules that
are reflected in the tables and their contents:
• A painter can paint many paintings.
• Each painting is painted by only one painter.
• A gallery can exhibit many paintings.
• A painter can exhibit paintings at more than one gallery at a time. (For example, if a
painter has painted six paintings, two may be exhibited in one gallery, one at another, and
three at the third gallery. Naturally, if galleries specify exclusive contracts, the database
must be changed to reflect that business rule.)
• Each painting is exhibited in only one gallery.
The last business rule reflects the fact that a painting can be physically located in only one
gallery at a time. If the painter decides to move a painting to a different gallery, the database
must be updated to remove the painting from one gallery and add it to the different gallery.
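The business rules above can be sketched as three tables in which PAINTING carries a foreign key to each of the other two. This is a minimal illustration using Python's built-in sqlite3 module; the column names and sample rows are assumptions and are not taken from Figure P2.9a or the Ch02_UBA database. In this layout, moving a painting to a different gallery is a single update of its gallery foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
# Hypothetical columns and rows, for illustration only.
conn.executescript("""
CREATE TABLE PAINTER (PTR_NUM INTEGER PRIMARY KEY, PTR_NAME TEXT);
CREATE TABLE GALLERY (GAL_NUM INTEGER PRIMARY KEY, GAL_NAME TEXT);
CREATE TABLE PAINTING (
    PNTG_NUM   INTEGER PRIMARY KEY,
    PNTG_TITLE TEXT,
    PTR_NUM    INTEGER NOT NULL REFERENCES PAINTER (PTR_NUM),  -- one painter
    GAL_NUM    INTEGER REFERENCES GALLERY (GAL_NUM)            -- one gallery
);
INSERT INTO PAINTER VALUES (1, 'Painter A');
INSERT INTO GALLERY VALUES (10, 'Gallery X'), (20, 'Gallery Y');
INSERT INTO PAINTING VALUES (100, 'Untitled', 1, 10);
""")

# Move painting 100 from gallery 10 to gallery 20.
conn.execute("UPDATE PAINTING SET GAL_NUM = 20 WHERE PNTG_NUM = 100")

galleries = conn.execute(
    "SELECT GAL_NUM FROM PAINTING WHERE PNTG_NUM = 100"
).fetchall()
print(galleries)
```

Because each PAINTING row has a single GAL_NUM column, a painting cannot be recorded in two galleries at once.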
11. Describe the relationships (identify the business rules) depicted in the ERD shown in Figure
P2.3.
12. Convert the ERD from Problem 11 into a UML Class Diagram.
13. Describe the relationships shown in the ERD in Figure P2.4.
14. Create a UML ERD for each of the following descriptions. (Note: The word many merely
means “more than one” in the database modeling environment.)
a. Each of the MegaCo Corporation’s divisions is composed of many departments. Each of
those departments has many employees assigned to it, but each employee works for only
one department. Each department is managed by one employee, and each of those
managers can manage only one department at a time.
FIGURE P2.14A The MegaCo ERD
As you discuss the contents of Figure P2.14A, note the 1:1 relationship between the
EMPLOYEE and the DEPARTMENT in the “manages” relationship and the 1:* relationship
between the DEPARTMENT and the EMPLOYEE in the “is assigned to” relationship.
b. During some period of time, a customer can rent many videotapes from the BigVid store.
Each of the BigVid’s videotapes can be rented to many customers during that period of
time.
The solution is presented in Figure P2.14B. Note the *:* relationship between CUSTOMER
and VIDEO. Such a relationship is not implementable in a relational model.
c. An airliner can be assigned to fly many flights, but each flight is flown by only one airliner.
We have created a small Ch02_Airline database in Access to let you explore the
implementation of the model. The tables and the relational diagram are shown in the
following two figures.
d. The KwikTite Corporation operates many factories. Each factory is located in a region.
Each region can be “home” to many of KwikTite’s factories. Each factory employs many
employees, but each of those employees is employed by only one factory.
e. An employee may have earned many degrees, and each degree may have been earned
by many employees.
Note that this *:* relationship must be broken up into two 1:* relationships before it can be
implemented in a relational database. Use the Airline ERD’s decomposition in Figure P2.14C
as the focal point in your discussion.
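The decomposition of a *:* relationship into two 1:* relationships can be sketched with a bridge table. This is an illustration under stated assumptions, not the text's own solution: it uses Python's built-in sqlite3 module, and the bridge table name EMP_DEGREE, the column names, and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# EMPLOYEE 1:* EMP_DEGREE and DEGREE 1:* EMP_DEGREE replace EMPLOYEE *:* DEGREE.
conn.executescript("""
CREATE TABLE EMPLOYEE (EMP_NUM INTEGER PRIMARY KEY, EMP_LNAME TEXT);
CREATE TABLE DEGREE (DEG_CODE TEXT PRIMARY KEY);
CREATE TABLE EMP_DEGREE (
    EMP_NUM  INTEGER REFERENCES EMPLOYEE (EMP_NUM),
    DEG_CODE TEXT    REFERENCES DEGREE (DEG_CODE),
    PRIMARY KEY (EMP_NUM, DEG_CODE)
);
INSERT INTO EMPLOYEE VALUES (1, 'Ames'), (2, 'Boyd');
INSERT INTO DEGREE VALUES ('BS'), ('MBA');
INSERT INTO EMP_DEGREE VALUES (1, 'BS'), (1, 'MBA'), (2, 'BS');
""")

# One employee, many degrees.
degrees_of_emp_1 = [r[0] for r in conn.execute(
    "SELECT DEG_CODE FROM EMP_DEGREE WHERE EMP_NUM = 1 ORDER BY DEG_CODE")]

# One degree, many employees.
holders_of_bs = [r[0] for r in conn.execute(
    "SELECT EMP_NUM FROM EMP_DEGREE WHERE DEG_CODE = 'BS' ORDER BY EMP_NUM")]

print(degrees_of_emp_1, holders_of_bs)
```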