
Chapter 2: Data Warehousing and OLAP Technologies

Chapter 2 discusses data warehousing and OLAP technologies, defining a data warehouse as an integrated, subject-oriented, time-variant, and non-volatile database that supports decision-making. It contrasts operational databases, which handle day-to-day transactions, with data warehouses that focus on historical data analysis for informed decision-making. The chapter also highlights the functionalities of OLAP for data analysis and the challenges faced in data warehousing, such as data integration and ensuring data quality.

Uploaded by

gutageremu024

Chapter 2

Data Warehousing and OLAP Technologies

1
9/25/2020
Data Mining
Introduction to Data Warehouse
▪ What is a data warehouse?
▪ Data warehouse vs. operational DBMS
▪ OLTP vs. OLAP
▪ Design of a data warehouse
▪ Conceptual modeling of data warehouses
▪ Data warehouse design process
▪ Data warehouse models
▪ OLAP functionalities on data warehouses
Data warehousing

▪ A data warehouse is an integrated, subject-oriented, time-variant, non-volatile database that provides support for decision making.
▪ Integrated → a centralized, consolidated database that integrates data derived from the entire organization.
• Consolidates data from multiple, diverse sources with diverse formats.
• Helps managers better understand the company’s operations.
▪ Subject-oriented → the data warehouse contains data organized by topics (for example, sales, customers, or products).
Data warehousing
▪ Time-variant → in contrast to an operational database, which focuses on current transactions, the data warehouse represents the flow of data through time.
▫ The data warehouse contains data that reflect what happened last week, last month, over the past five years, and so on.

▪ Non-volatile → once data enter the data warehouse, they are never removed, because the data in the warehouse represent the company’s entire history.
▪ Because data are added all the time, the warehouse keeps growing.
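The properties listed above (integrated, subject-oriented, time-variant, non-volatile) can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the `Warehouse` class, the `SalesFact` row type, the two source layouts (`crm_rows`, `billing_rows`), and all field names are hypothetical and do not come from any real product.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SalesFact:
    """One immutable, time-stamped row in the (hypothetical) warehouse."""
    subject: str     # subject-oriented: rows are grouped by topic
    customer: str
    amount: float
    as_of: date      # time-variant: each row records when it was true


class Warehouse:
    """Append-only store: rows are added, never updated or deleted."""

    def __init__(self):
        self._rows = []

    def load(self, facts):
        # Non-volatile: loading only appends to the history.
        self._rows.extend(facts)

    def history(self, customer):
        # Every snapshot for one customer, oldest first.
        matches = [r for r in self._rows if r.customer == customer]
        return sorted(matches, key=lambda r: r.as_of)

    def __len__(self):
        return len(self._rows)


# Integrated: two sources with different formats map onto one schema.
crm_rows = [{"cust": "Abebe", "total": "120.50", "day": "2020-08-31"}]
billing_rows = [("Abebe", 99.0, date(2020, 9, 25))]


def from_crm(row):
    return SalesFact("sales", row["cust"], float(row["total"]),
                     date.fromisoformat(row["day"]))


def from_billing(row):
    customer, amount, day = row
    return SalesFact("sales", customer, amount, day)


wh = Warehouse()
wh.load(from_crm(r) for r in crm_rows)
wh.load(from_billing(r) for r in billing_rows)
```

Calling `wh.history("Abebe")` returns both snapshots in date order. Nothing is overwritten, so the warehouse only grows as new loads arrive.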
Data Warehouse
▪ Data warehouse
▫ A data warehouse is a relational database management system responsible for the collection and storage of data to support management decision making and problem solving.
▫ It enables managers and other business professionals to carry out data mining, online analytical processing, market research, and decision support.
▫ It is the current stage in the evolution of Decision Support Systems (DSSs).
▪ Data mart
▫ A subset of a data warehouse for small and medium-size businesses or for departments within larger companies.
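As a rough sketch of the idea that a data mart is a subset of the warehouse, the snippet below filters a small in-memory table down to one department. The row layout and the `build_data_mart` helper are hypothetical, chosen only for illustration.

```python
# Hypothetical warehouse rows: (department, item, amount).
warehouse = [
    ("sales", "laptop", 1200.0),
    ("sales", "phone", 650.0),
    ("hr", "training", 300.0),
    ("finance", "audit", 900.0),
]


def build_data_mart(rows, department):
    """Return the subset of warehouse rows that serves one department."""
    return [row for row in rows if row[0] == department]


sales_mart = build_data_mart(warehouse, "sales")
```

The mart holds the same kind of rows as the warehouse, only fewer of them, which is why it suits a single department or a smaller business.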
Data Warehouse Stores Heterogeneous Data

Data Warehouse as Part of Data Mining

Data Mining Works with Data Warehouse
▪ The data warehouse provides the enterprise with a memory.
▪ Data mining provides the enterprise with intelligence.

Operational Database vs. Data Warehouse
▪ The operational database is the source of information for the data warehouse.
▪ It includes detailed information used to run the day-to-day operations of the business.
▪ The data change frequently as updates are made, and they reflect the current value of the latest transactions.
▪ Operational database management systems, also called OLTP (Online Transaction Processing) databases, are used to manage dynamic data in real time.
▪ Data warehouse systems serve users or knowledge workers for the purpose of data analysis and decision making. Such systems can organize and present information in specific formats to accommodate the diverse needs of various users.
▪ These systems are called Online Analytical Processing (OLAP) systems.

Operational Database vs. Data Warehouse
▪ The data warehouse and operational environments are separated.
▪ The data warehouse receives its data from operational databases.
▪ The data warehouse environment is characterized by read-only transactions on very large data sets.
▪ The operational environment is characterized by numerous update transactions on a few data entities at a time.
▪ The data warehouse contains historical data over a long time horizon.
▪ Ultimately, information is created from data warehouses. Such information becomes the basis for rational decision making.
▪ The data found in the data warehouse are analyzed to discover previously unknown data characteristics, relationships, dependencies, or trends.
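The contrast above can be made concrete with Python's built-in `sqlite3` module. This is a minimal sketch, assuming a hypothetical `account` table for the operational side and a hypothetical `balance_history` table for the warehouse side. Real systems would use separate servers and a proper load process.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Operational side: one row per account, holding only the current value.
cur.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL)")
cur.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100.0), (2, 50.0)])

# Warehouse side: one row per account per day, kept as history.
cur.execute("CREATE TABLE balance_history (id INTEGER, day TEXT, balance REAL)")


def snapshot(day):
    """Copy the current operational values into the warehouse."""
    cur.execute(
        "INSERT INTO balance_history SELECT id, ?, balance FROM account",
        (day,),
    )


snapshot("2020-09-24")
# A typical operational transaction: update one entity in place.
cur.execute("UPDATE account SET balance = balance - 30 WHERE id = 1")
snapshot("2020-09-25")

# A typical warehouse query: a read-only scan over the whole history.
trend = cur.execute(
    "SELECT day, SUM(balance) FROM balance_history GROUP BY day ORDER BY day"
).fetchall()
current = cur.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
```

After the update, the operational table only knows the latest balance, while `trend` still shows the total for each day.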
Data Processing Technologies
▪ OLAP (Online Analytical Processing)
► Refers to an advanced data analysis environment that supports decision making.
► Provides access to multidimensional databases and managerially useful display techniques.

▪ Data mining tools analyze the data to uncover problems or opportunities hidden in the data relationships.
▫ E.g., credit system: who is likely not to pay their debts?
▫ E.g., crime database: who is likely to commit what kind of crime?

▪ OLAP provides top-down, query-driven analysis.
▪ Data mining provides bottom-up, discovery-driven analysis.
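The difference between query-driven and discovery-driven analysis can be sketched with the credit example. The loan records and both helper functions below are hypothetical, and the "discovery" step is a deliberately simple scan, not a real data mining algorithm.

```python
from collections import defaultdict

# Hypothetical loan records: (region, job_type, defaulted).
loans = [
    ("north", "salaried", False),
    ("north", "salaried", False),
    ("north", "self-employed", True),
    ("south", "salaried", False),
    ("south", "self-employed", True),
    ("south", "self-employed", False),
]


def default_rate_by(records, column):
    """Default rate grouped by one column index (0 = region, 1 = job_type)."""
    totals, defaults = defaultdict(int), defaultdict(int)
    for rec in records:
        totals[rec[column]] += 1
        defaults[rec[column]] += rec[2]  # True counts as 1
    return {key: defaults[key] / totals[key] for key in totals}


# Top-down, query-driven: the analyst chooses which dimension to inspect.
by_region = default_rate_by(loans, 0)


# Bottom-up, discovery-driven: scan every dimension and report the
# attribute value with the highest default rate, without being told
# where to look.
def riskiest_group(records, columns=(0, 1)):
    candidates = [
        (rate, col, value)
        for col in columns
        for value, rate in default_rate_by(records, col).items()
    ]
    return max(candidates)


rate, col, value = riskiest_group(loans)
```

In this toy data the region view shows no difference between groups, while the scan points to job type as the dimension that separates likely defaulters.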
Data Processing Technologies
▪ OLTP (on-line transaction processing)
► Major task of traditional relational DBMS
► Day-to-day operations: purchasing, inventory, banking, manufacturing,
payroll, registration, accounting, etc.
▪ OLAP (on-line analytical processing)
► Major task of data warehouse system
► Data analysis and decision making
▪ Distinct features (OLTP vs. OLAP):
► User and system orientation: customer vs. market
► Data contents: current, detailed vs. historical, consolidated
► Database design: ER + application vs. star + subject
► View: current, local vs. evolutionary, integrated
► Access patterns: update vs. read-only but complex queries
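The "star + subject" design mentioned above can be illustrated with a small star schema in `sqlite3`. This is a sketch only: the table names (`fact_sales`, `dim_time`, `dim_product`), the columns, and the sample values are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables describe the subject from different angles.
    CREATE TABLE dim_time    (time_id INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    -- The central fact table holds measures plus a key into each dimension.
    CREATE TABLE fact_sales (
        time_id    INTEGER REFERENCES dim_time(time_id),
        product_id INTEGER REFERENCES dim_product(product_id),
        amount     REAL
    );
    INSERT INTO dim_time    VALUES (1, 2019, 'Q4'), (2, 2020, 'Q1');
    INSERT INTO dim_product VALUES (1, 'laptop'), (2, 'phone');
    INSERT INTO fact_sales  VALUES (1, 1, 1000), (1, 2, 400), (2, 1, 1200), (2, 2, 300);
""")

# Read-only, consolidated query: total sales per year and category.
rows = conn.execute("""
    SELECT t.year, p.category, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_time t    ON t.time_id = f.time_id
    JOIN dim_product p ON p.product_id = f.product_id
    GROUP BY t.year, p.category
    ORDER BY t.year, p.category
""").fetchall()
```

The query never modifies data. It joins the fact table to its dimensions and summarizes, which matches the "read-only but complex queries" access pattern.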

Data Processing Technologies
                    OLTP                        OLAP
users               clerk, IT professional      knowledge worker
function            day-to-day operations       decision support
DB design           application-oriented        subject-oriented
data                current, up-to-date,        historical,
                    detailed, flat relational,  summarized, multidimensional,
                    isolated                    integrated, consolidated
usage               repetitive                  ad hoc
access              read/write,                 lots of scans
                    index/hash on primary key
unit of work        short, simple transaction   complex query
# records accessed  tens                        millions
# users             thousands                   hundreds
DB size             100MB-GB                    100GB-TB
metric              transaction throughput      query throughput, response time

Common Challenges in Data Warehousing
▪ Data integration
▪ Ensuring data quality
▪ Handling large data volumes
▪ Scalability
▪ Maintaining performance
▪ Managing cost
▪ User adoption
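To make the data quality challenge concrete, here is one possible sketch of a batch check run before loading. The `quality_report` function, its parameters, and the sample rows are hypothetical. Real pipelines usually apply many more rules.

```python
def quality_report(rows, key, required, ranges):
    """Count common data-quality problems in a batch of dict rows.

    key:      field that should be unique (hypothetical business key)
    required: fields that must not be None or empty
    ranges:   {field: (low, high)} bounds for numeric fields
    """
    report = {"duplicates": 0, "missing": 0, "out_of_range": 0}
    seen = set()
    for row in rows:
        if row.get(key) in seen:
            report["duplicates"] += 1
        seen.add(row.get(key))
        if any(row.get(field) in (None, "") for field in required):
            report["missing"] += 1
        for field, (low, high) in ranges.items():
            value = row.get(field)
            if value is not None and not low <= value <= high:
                report["out_of_range"] += 1
    return report


batch = [
    {"id": 1, "customer": "Abebe", "age": 34},
    {"id": 1, "customer": "Abebe", "age": 34},   # duplicate key
    {"id": 2, "customer": "", "age": 29},        # missing customer
    {"id": 3, "customer": "Sara", "age": 240},   # implausible age
]
report = quality_report(batch, "id", ["customer"], {"age": (0, 120)})
```

A load process could reject or quarantine the batch whenever any counter in `report` is non-zero.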

Assignment 1 (Individual)
1. Why is a data warehouse selected for data analysis and decision making?
2. How is security managed in a data warehouse?
3. How do you ensure data quality in a data warehouse?

Deadline: 16/7/2017 E.C.

Note: The assignment must be submitted in handwriting.

Questions?
