Database Principles: Fundamentals of
Design, Implementation and Management
Stephen Morris, Keeley Crockett,
Peter Rob and Carlos Coronel
CHAPTER 15 Databases For
Decision Support
In this chapter, you will learn:
• How business intelligence provides a
comprehensive business decision support
framework
• About business intelligence architecture, its
evolution and reporting styles
• The Data Warehouse Life cycle
• How to prepare data for the data warehouse
using the Extraction, Transformation and
Loading Process.
In this chapter, you will learn (continued):
• What star schemas are and how they are
constructed
• About data analytics, data mining, and
predictive analysis
• How SQL extensions are used to support
OLAP-type data manipulations
The Need for Data Analysis
• Managers must be able to track daily
transactions to evaluate how the business is
performing
• By tapping into the operational database,
management can develop strategies to meet
organizational goals
• Data analysis can provide information about
short-term tactical evaluations and strategies
Business Intelligence
• Business intelligence (BI) is a term that describes a
comprehensive, cohesive, and integrated set of tools and
processes used to capture, collect, integrate, store, and analyze
data with the purpose of generating and presenting
information to support business decision making.
• Intelligence is based on learning and understanding the
facts about the business environment.
Business Intelligence (continued)
• Implementing BI in an organization involves
capturing not only internal and external
business data, but also the metadata, or
knowledge about the data.
• In practice, BI is a complex proposition that
requires a deep understanding and alignment
of the business processes, business data, and
information needs of users at all levels in an
organization.
Business Intelligence (continued)
• BI provides a framework for:
1. Collecting and storing operational data
2. Aggregating the operational data into decision support data
3. Analyzing decision support data to generate information
4. Presenting such information to the end user to support business
decisions
5. Making business decisions, which in turn generate more data that
are collected, stored, and so on (restarting the process)
6. Monitoring results to evaluate outcomes of the business decisions,
which again provides more data to be collected, stored, and so on
7. Predicting future behaviors and outcomes with a high degree of
accuracy
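The first three steps of this framework can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the transaction rows, the field names (`day`, `region`, `amount`) and the helper functions are hypothetical and do not come from the chapter.

```python
from collections import defaultdict

# Step 1: operational data, one row per transaction (hypothetical sample values).
transactions = [
    {"day": "2024-01-02", "region": "North", "amount": 120.0},
    {"day": "2024-01-02", "region": "South", "amount": 80.0},
    {"day": "2024-01-03", "region": "North", "amount": 200.0},
    {"day": "2024-01-03", "region": "South", "amount": 50.0},
]

def aggregate_by_region(rows):
    """Step 2: roll transactions up into decision support data."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)

def best_region(totals):
    """Step 3: analyze the aggregated data to produce information."""
    return max(totals, key=totals.get)

region_totals = aggregate_by_region(transactions)
top = best_region(region_totals)
print(region_totals, top)
```

Steps 4 to 7 (presenting, deciding, monitoring and predicting) would build on the same aggregated data.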
Business Intelligence Architecture
Business Intelligence Architecture
(continued)
• BI tools focus on the strategic and tactical use of information.
• Therefore, BI uses an arrangement of best management
practices to manage data as a corporate asset.
• Master data management (MDM) is a collection of concepts,
techniques, and processes for the proper identification,
definition, and management of data elements within an
organization.
• MDM’s main goal is to provide a comprehensive and consistent
definition of all data within an organization.
• MDM ensures that all company resources (people, procedures,
and IT systems) that work with data have uniform and consistent
views of the company’s data.
Business Intelligence Architecture
(continued)
• Governance is a method or process of
government.
• BI provides a method for controlling and
monitoring business health and for consistent
decision making.
• Having such governance creates accountability
for business decisions.
• In the present age of business flux,
accountability is increasingly important.
Business Intelligence Architecture
(continued)
• Key performance indicators (KPIs) are quantifiable numeric or scale-
based measurements that assess the company’s effectiveness or
success in reaching its strategic and operational goals.
• Some examples of KPIs are:
• General. Year-to-year measurements of profit by line of business,
same-store sales, product turnovers, product recalls, sales by
promotion, and sales by employee
• Finance. Earnings per share, profit margin, revenue per employee,
percentage of sales to accounts receivable, and assets to sales
• Human resources. Applicants to job openings, employee turnover,
and employee longevity
• Education. Graduation rates, number of incoming freshmen, student
retention rates, publication rates, and teaching evaluation scores
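Because KPIs are quantifiable, each one reduces to a small calculation. The sketch below uses common textbook definitions of three of the KPIs listed above; the formulas are assumptions (organizations often define these measures slightly differently) and the input figures are invented.

```python
def profit_margin(net_profit, revenue):
    """Net profit as a fraction of revenue (one common definition)."""
    return net_profit / revenue

def revenue_per_employee(revenue, employees):
    """Revenue divided by headcount."""
    return revenue / employees

def employee_turnover(separations, average_headcount):
    """Employees who left during the period, as a fraction of average headcount."""
    return separations / average_headcount

# Hypothetical figures for one year.
margin = profit_margin(net_profit=250_000, revenue=2_000_000)
per_employee = revenue_per_employee(revenue=2_000_000, employees=40)
turnover = employee_turnover(separations=5, average_headcount=40)
print(margin, per_employee, turnover)
```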
Business Intelligence Architecture
(continued)
• A modern BI system provides three distinctive reporting
styles:
– Advanced reporting. A BI system presents insightful information
about the organization in a variety of presentation formats.
– Monitoring and alerting. After a decision has been made, the BI
system offers ways to monitor the decision’s outcome. The BI
system provides the end user with ways to define metrics and
other key performance indicators to evaluate different aspects of
an organization.
– Advanced data analytics. A BI system provides tools to help the
end user discover relationships, patterns, and trends hidden
within the organization’s data. These tools are used to create two
types of data analysis: explanatory and predictive.
Business Intelligence Benefits
• Integrating architecture. Like any other IT project, BI has the potential of
becoming the integrating umbrella for a disparate mix of IT systems within
an organization.
• Common user interface for data reporting and analysis. BI front ends can
provide up-to-the-minute consolidated information using a common
interface for all company users. IT departments no longer have to provide
multiple training options for diverse interfaces. End users benefit from
similar or common interfaces in different devices that use multiple clever
and insightful presentation formats.
• Common data repository fosters single version of company data. In the past,
multiple IT systems supported different aspects of an organization’s
operations. Such systems collected and stored data in separate data stores.
A common repository brings those data together so that all users work from
a single version.
• Improved organizational performance. BI can provide competitive
advantages in many different areas, from customer support to
manufacturing processes.
Business Intelligence Evolution
Decision Support Systems
• Decision support is a methodology (or series of
methodologies) designed to extract information from data
and to use such information as a basis for decision making
• Decision support system (DSS)
– Arrangement of computerized tools used to assist managerial
decision making within a business
– Usually requires extensive data “massaging” to produce
information
– Used at all levels within an organization
– Often tailored to focus on specific business areas
– Provides ad hoc query tools to retrieve data and to display data in
different formats
Decision Support Systems (continued)
• Composed of the following four main components:
– Data store component
• Basically a DSS database
– Data extraction and data filtering component
• Used to extract and validate data taken from the operational database and
external data sources
– End-user query tool
• Used to create queries that access the database
– End-user presentation tool
• Used to organize and present data
Operational Data vs. Decision Support Data
• Operational Data
– Mostly stored in a relational database
– Optimized to support transactions representing daily operations
• DSS Data
– Give tactical and strategic business meaning to operational data
– Differs from operational data in the following three main areas:
• Timespan
• Granularity
• Dimensionality
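Granularity and dimensionality can be illustrated by summarizing the same rows in two ways. The sketch below is a hedged example only: the sales tuples and the `summarize` helper are hypothetical. It rolls daily rows up to months (a coarser granularity) and then adds region as a second dimension.

```python
from collections import defaultdict

# Hypothetical operational rows: (date, region, product, units sold).
sales = [
    ("2024-01-05", "North", "Widget", 10),
    ("2024-01-19", "North", "Gadget", 4),
    ("2024-02-02", "South", "Widget", 7),
    ("2024-02-09", "North", "Widget", 3),
]

def summarize(rows, key_func):
    """Sum units sold for each key produced by key_func."""
    totals = defaultdict(int)
    for row in rows:
        totals[key_func(row)] += row[3]
    return dict(totals)

# Coarser granularity: daily rows rolled up to months ("YYYY-MM").
by_month = summarize(sales, lambda r: r[0][:7])

# Added dimension: the same measure viewed by month and region.
by_month_region = summarize(sales, lambda r: (r[0][:7], r[1]))
print(by_month)
print(by_month_region)
```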
DSS Database Requirements
• A specialized DBMS tailored to provide fast
answers to complex queries.
• Four main requirements:
– Database schema
– Data extraction and loading
– End-user analytical interface
– Database size
DSS Database Requirements (continued)
• Database schema
– Must support complex data representations
– Must contain aggregated and summarized data
– Queries must be able to extract multidimensional
time slices
DSS Database Requirements (continued)
• Data extraction
– Should allow batch and scheduled data extraction
– Should support different data sources
• Flat files
• Hierarchical, network, and relational databases
• Multiple vendors
• Data filtering
– Must allow checking for inconsistent data
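A minimal sketch of extraction from two kinds of source followed by filtering, using only the Python standard library. The flat-file contents, the `orders` table and the rule that a quantity must be positive are all assumptions made for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file source (inline text stands in for a CSV file on disk).
FLAT_FILE = "order_id,quantity\n1,5\n2,-3\n"

def extract_flat_file(text):
    """Read (order_id, quantity) pairs from CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    return [(int(r["order_id"]), int(r["quantity"])) for r in reader]

def extract_relational():
    """Read the same fields from a relational source (in-memory SQLite here)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_id INTEGER, quantity INTEGER)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", [(3, 2), (4, 0)])
    rows = conn.execute(
        "SELECT order_id, quantity FROM orders ORDER BY order_id"
    ).fetchall()
    conn.close()
    return rows

def filter_rows(rows):
    """Split rows into accepted and rejected using an assumed rule: quantity > 0."""
    accepted = [r for r in rows if r[1] > 0]
    rejected = [r for r in rows if r[1] <= 0]
    return accepted, rejected

accepted, rejected = filter_rows(extract_flat_file(FLAT_FILE) + extract_relational())
print(accepted, rejected)
```

Keeping the rejected rows, rather than silently dropping them, makes it possible to report inconsistencies back to the owners of the source systems.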
DSS Database Requirements (continued)
• End-user analytical interface
– One of the most critical DSS DBMS components
– Permits the user to navigate through data to simplify
and accelerate the decision-making process
The Data Warehouse
• Integrated, subject-oriented, time-variant,
nonvolatile collection of data that provides
support for decision making
• Usually a read-only database optimized for
data analysis and query processing
• Requires time, money, and considerable
managerial effort to create
The Data Warehouse (continued)
Figure 15.4 Creating a Data warehouse
Twelve Rules That Define a Data
Warehouse
• Data warehouse and operational environments are separated
• Data warehouse data are integrated
• Data warehouse contains historical data over a long time
horizon
• Data warehouse data are snapshot data captured at a given
point in time
• Data warehouse data are subject oriented
Twelve Rules That Define a Data
Warehouse (continued)
• Data warehouse data are mainly read-only with periodic
batch updates from operational data
– No online updates allowed
• Data warehouse development life cycle differs from classical
systems development
• Data warehouse contains data with several levels of detail:
current detail data, old detail data, lightly summarized data,
and highly summarized data
• Data warehouse environment is characterized by read-only
transactions against very large data sets
Twelve Rules that Define a Data
Warehouse (continued)
• Data warehouse environment has a system that traces data
sources, transformations, and storage
• Data warehouse’s metadata are a critical component of this
environment
• Data warehouse contains a chargeback mechanism for resource
usage that enforces optimal use of data by end users
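The rules on snapshot data and periodic batch updates can be sketched as an append-only load. This is an illustrative assumption about how such a load might look, not a prescribed design; the `batch_load` function and the account rows are hypothetical.

```python
from datetime import date

def batch_load(warehouse, operational_rows, snapshot_date):
    """Append a dated copy of each operational row; existing rows are never modified."""
    for row in operational_rows:
        warehouse.append({**row, "snapshot_date": snapshot_date})
    return len(operational_rows)

# Hypothetical operational table whose values change over time.
operational = [{"account": "A1", "balance": 100}]
warehouse = []

batch_load(warehouse, operational, date(2024, 1, 31))
operational[0]["balance"] = 75  # online update in the operational system
batch_load(warehouse, operational, date(2024, 2, 29))

# The warehouse keeps both snapshots, so the history is preserved.
balances = [row["balance"] for row in warehouse]
print(balances)
```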
Data marts
• Data mart
– Small, single-subject data warehouse subset
– Each is a more manageable data set than a data
warehouse
– Provides decision support to a small group of people
– Typically lower cost and shorter implementation
time than a data warehouse
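One way to picture a data mart is as a single-subject slice of the warehouse. The sketch below assumes a hypothetical `warehouse_facts` table with a `subject` column and derives a `sales_mart` table from it using SQLite; real warehouses are rarely organized this simply.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical warehouse table covering several subjects.
conn.execute("CREATE TABLE warehouse_facts (subject TEXT, period TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO warehouse_facts VALUES (?, ?, ?)",
    [
        ("sales", "2024-01", 500.0),
        ("payroll", "2024-01", 300.0),
        ("sales", "2024-02", 650.0),
    ],
)

# The data mart keeps only the rows for one subject.
conn.execute(
    "CREATE TABLE sales_mart AS "
    "SELECT period, amount FROM warehouse_facts WHERE subject = 'sales'"
)

mart_rows = conn.execute(
    "SELECT period, amount FROM sales_mart ORDER BY period"
).fetchall()
print(mart_rows)
```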
Designing and Implementing a
Data warehouse
• The Data Warehouse as an Active Decision
Support Framework
• A Company-Wide Effort That Requires User
Involvement
– Building the perfect data warehouse is not just a
matter of knowing how to create a star schema; it
requires managerial skills to deal with conflict
resolution, mediation and arbitration.
Designing and Implementing a Data
warehouse (continued)
• The DW designer must:
– Involve end users in the process.
– Secure end users’ commitment from the
beginning.
– Create continuous end-user feedback.
– Manage end-user expectations.
– Establish procedures for conflict resolution
Designing and Implementing a Data
warehouse (continued)
• The data warehouse designer must satisfy:
• Data integration and loading criteria.
• Data analysis capabilities with acceptable query
performance.
• End-user data analysis needs.
• The foremost technical concern in implementing a
data warehouse is to provide end-user decision
support with advanced data analysis capabilities—at
the right moment, in the right format, with the right
data, and at the right cost.
Designing and Implementing a Data
warehouse (continued)
• Apply Database Design Procedures
• Each business process that is to be modelled within the data
warehouse must be described in detail in order to:
• Identify business measures. For example, a sales business
measure may be the number of a particular product that has
been sold in a week.
• Identify the level of detail or granularity of the data.
– The general rule of thumb is design for one grain finer than what the users
require.
• Check all data sources to ensure that the level of data
required can actually be obtained from the existing source
systems.
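The rule of thumb about grain can be illustrated with the weekly sales measure mentioned above. In this sketch the data are stored one grain finer (daily) than the weekly figure users are assumed to ask for, and the weekly measure is derived by roll-up. The rows and product codes are hypothetical.

```python
from collections import Counter
from datetime import date

# Hypothetical daily rows: (sale date, product code, units sold).
daily_sales = [
    (date(2024, 1, 1), "P100", 3),
    (date(2024, 1, 3), "P100", 2),
    (date(2024, 1, 8), "P100", 4),
    (date(2024, 1, 9), "P200", 1),
]

def weekly_units(rows):
    """Roll daily rows up to (ISO year, ISO week, product) totals."""
    totals = Counter()
    for day, product, units in rows:
        iso = day.isocalendar()  # (ISO year, ISO week number, ISO weekday)
        totals[(iso[0], iso[1], product)] += units
    return dict(totals)

weekly = weekly_units(daily_sales)
print(weekly)
```

Storing the finer grain means a later request for daily figures can be met without reloading the warehouse, whereas weekly data could never be broken back down into days.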
The Extraction, Transformation,
Loading Process
• The Extraction, Transformation, Loading process
(ETL) is critical to a successful data warehouse.
• It must ensure that the data loaded into the
warehouse are high-quality, accurate, relevant,
useful, and accessible.
• This is the most time-consuming phase in
building a warehouse, as routines must be
developed to select the required fields from
what are often many sources of data.
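A minimal ETL sketch, assuming two hypothetical sources (`crm_rows` and `billing_rows`) that name the same fields differently and a hypothetical target table `dim_customer`. It selects only the required fields, converts them to one format and loads them into an in-memory SQLite database.

```python
import sqlite3

# Hypothetical source systems with different field names and formats.
crm_rows = [{"cust": "ann lee", "country": "uk", "fax": "n/a"}]
billing_rows = [{"customer_name": "Bob Roy", "country_code": "US", "notes": ""}]

def extract():
    """Extraction: select only the required fields from each source."""
    selected = [(r["cust"], r["country"]) for r in crm_rows]
    selected += [(r["customer_name"], r["country_code"]) for r in billing_rows]
    return selected

def transform(rows):
    """Transformation: bring names and country codes into one format."""
    return [(name.title(), country.upper()) for name, country in rows]

def load(rows):
    """Loading: write the cleaned rows into the warehouse table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dim_customer (name TEXT, country TEXT)")
    conn.executemany("INSERT INTO dim_customer VALUES (?, ?)", rows)
    conn.commit()
    return conn

conn = load(transform(extract()))
loaded = conn.execute(
    "SELECT name, country FROM dim_customer ORDER BY name"
).fetchall()
print(loaded)
```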
The Extraction, Transformation,
Loading Process (continued)
• Types of data that may be extracted include:
– Operational data. The main source of data into the warehouse. This
data can directly come from any DBMS or application within the
organisation.
– Historical archived data. This type of data is useful for
predictive analytics. Unique data extraction and transformation
routines are therefore required to load the data into the warehouse
during the first-time load.
– Internal data. Data within the organisation, such as budgets or sales
forecasts, which may exist in spreadsheets.
– External data. Important for comparing the business performance to
enable an organisation to be competitive. Sources of external data
include real time data feeds, newspapers and reports (from the
internet) and marketing data that has been purchased.
The Extraction, Transformation,
Loading Process (continued)
• Some common source data anomalies include:
– Name and address inconsistencies
– Multiple coding problems
– Different country standards
– Missing values
– Referential integrity
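Three of the anomalies above (multiple coding, missing values and referential integrity) can be detected with simple checks. The sketch below is illustrative only: the code table `GENDER_CODES`, the sample rows and both helper functions are assumptions.

```python
# Hypothetical mapping that resolves multiple codings to one standard code.
GENDER_CODES = {"m": "M", "male": "M", "f": "F", "female": "F"}

# Hypothetical source data.
customers = {1: "Ann", 2: "Bob"}
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 20.0},
    {"order_id": 11, "customer_id": 2, "amount": None},
    {"order_id": 12, "customer_id": 9, "amount": 5.0},
]

def find_anomalies(rows, known_customers):
    """Report missing values and orders that point to an unknown customer."""
    issues = []
    for row in rows:
        if row["amount"] is None:
            issues.append((row["order_id"], "missing value"))
        if row["customer_id"] not in known_customers:
            issues.append((row["order_id"], "referential integrity"))
    return issues

def standardize_gender(value):
    """Map differently coded values to one code; returns None if unrecognized."""
    return GENDER_CODES.get(value.strip().lower())

issues = find_anomalies(orders, customers)
codes = [standardize_gender(v) for v in ["Male", "f", "M"]]
print(issues, codes)
```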