
Chapter 1

This document provides an introduction and overview of data warehousing. It discusses the evolution of information technology and databases. The key aspects covered include the definition of a data warehouse, how it differs from operational transaction systems, its uses in supporting management decision making, and the roadmap for building a data warehouse including data extraction, loading, and use of metadata.


Introduction to Data Warehousing
Lecture Notes
32113 Advanced Database

Chapter 1, Sri Madhisetty, 2010 Spring

Outline of Lecture

Introduction to Databases
Evolution of information technology
What is a Data Warehouse?
Differences between OLTP and DSS Systems
Types of Data and Their Uses


What is a database?

a collection of records
shared access
a central repository


What are the main components that make a database?

Files (tables)
Records (rows)
Fields (columns)


What is a relational database?

Data is stored in tables, and a relationship can be made between tables based on common information/data.
Access to the database is through ANSI Structured Query Language (SQL).


What is a join?

A relation, established through SQL, that relates data between one or more tables to form a meaningful result set.


What is a key?

a common field in 2 tables


primary key
secondary key
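
The ideas of a key and a join can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table and column names (customer, orders, customer_id) are hypothetical and chosen only for illustration; they do not come from the lecture.

```python
import sqlite3

# In-memory database; all table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,   -- primary key
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(customer_id),  -- common field
        amount      REAL
    );
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

# The join relates rows of the two tables through the common field.
rows = conn.execute("""
    SELECT c.name, o.order_id, o.amount
    FROM customer AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    ORDER BY o.order_id
""").fetchall()

for row in rows:
    print(row)
```

Each result row combines a customer name with one of that customer's orders, which is the "meaningful result set" a join is meant to produce.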


Shifting Business Landscape

Consumer demand
Competition & Complexity
Operating efficiencies

Responding to Change

[Figure labels: Consumer demand; Competition & Complexity; Operating efficiencies; Business Intelligence; Business Management; Business Operations; Corporate Information Factory]

What is a data warehouse?

a collection of records
shared access
a central repository


What does a data warehouse do?

a collection of records
shared access
a central repository


How is a Data Warehouse different from OLTP?

Organized by a particular subject vs. organized by transactions.
Few users compared to OLTP.
Queries access entire tables vs. a few records.
Very large database (VLDB) vs. small/medium database.
Un-normalized data structure vs. normalized.
Single database for all business units vs. multiple databases.
Periodic update (bulk) vs. continuous update (online).
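
The difference in access pattern can be sketched as follows, again with Python's sqlite3 module. The sales table and its contents are made up for illustration, and a single small table stands in for both kinds of system, which in practice would be separate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("North", 100.0), ("South", 50.0), ("North", 25.0), ("South", 75.0)],
)

# OLTP style: touch one record, identified by its key, and update it online.
conn.execute("UPDATE sales SET amount = 120.0 WHERE sale_id = 1")
one_record = conn.execute(
    "SELECT region, amount FROM sales WHERE sale_id = 1"
).fetchone()

# Warehouse style: a read-only query that scans the whole table and summarises it.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()

print(one_record)
print(totals)
```

The first query reads and changes a single row; the second reads every row and returns only aggregates, which is the typical shape of a decision-support query.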


How is a Data Warehouse different from OLTP? (contd.)

                             OPERATIONAL PROCESSING             MANAGEMENT/DECISION SUPPORT
                             (TPS, OLTP)                        (DSS, OLAP)
ORIENTATION:                 transaction / event                decision
TO SUPPORT:                  operations                         management
SCOPE:                       narrow                             broad / global
TEMPORAL VIEW:               now, current                       historical [forecast]
SCOPE of DATA:               internal                           internal & external
AMOUNT of DATA:              individual records                 many records, database
RESPONSE TIME:               (sub)seconds                       minutes to days
PROCESSING:                  primitive - update                 derivative - read only
                             direct (raw), atomic               aggregates, summary, trend
USAGE FREQUENCY:             frequent                           infrequent
USAGE PATTERN:               stable, predictable                peaks, variable, unpredictable
DATABASE STRUCTURE:          static                             dynamic (views)
DATABASE CONTENT:            dynamic (updates)                  static (analysis, discovery)
SYSTEM REQUIREMENTS/DESIGN:  known                              variable, shifting, vague
PRIORITIES:                  performance, throughput            flexibility, user autonomy
                             availability, accuracy, security   exploratory

REFERENCES: INMON+LOPER, BISCHOFF (p.12)


Why a data warehouse?

Technology                      End User Value
File Access Systems             Physical data access
Network DB                      Support for complex data inter-relations
Hierarchical DB                 Support for hierarchical views of data
Inverted File DB                Support of unplanned text-based enquiry
Relational DB                   Flexibility and ease of updating
Query Language                  Support of simple ad hoc reporting
4GL                             Development of simple reporting systems
Decision Support Systems        Ability to support financial and statistical data analysis
Executive Information System    Ability to present information to management
Corporate Information Factory   Support for users enterprise-wide, each customizing data to their needs


Evolution of End User Computing

Inmon, Building DW, fig. 1.1

1960s: LOTS of MASTER FILES!!!
Problems of redundancy, synchronization, maintenance

1970s: Database/s (DBMS)
Single source of data for all processing
REPORTS

1980s: Transaction Processing
Online, high-performance transaction processing to support operations

Management / Decision Support Systems: ?

Who is a data warehouse for?

Compared to OLTP (TRANSACTION PROCESSING) SYSTEMS

USER: a different type - manager, decision maker
DATA: looking at broad vistas; summaries over time, to detect patterns and trends; up-to-the-minute updates not required
NEEDS: different processing requirements; response times greatly relaxed, and highly variable

ALL OF WHICH SUGGESTS:

PLATFORM: separated from operations; separate systems; separate data stores

Data Warehousing

Sample Data Warehouse User: Manager as Decision Maker

[Figure labels: GOALS & CRITERIA for Choice; profitability; accountability; stimuli; REQUESTS; External Influences; EXCEPTIONS; FILTER; MANAGER: conceptualization of a choice situation; DECISION; ACTION; ACTUAL RESULTS in operations; EXPECTED RESULTS; COMPARISON of actual and expected results]

New Data Warehouse Definition

A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision-making process.
Data warehousing is the process whereby organizations extract value from their informational assets through the use of special stores called data warehouses.
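
The terms "time-variant" and "non-volatile" can be made concrete with a small sketch. It assumes, purely for illustration, that the warehouse is loaded as dated snapshots of stock levels; the function and field names are hypothetical.

```python
from datetime import date

warehouse = []  # append-only list of snapshot rows

def load_snapshot(snapshot_date, stock_levels):
    """Append one dated row per item; existing rows are never changed."""
    for item, quantity in stock_levels.items():
        warehouse.append(
            {"as_of": snapshot_date, "item": item, "quantity": quantity}
        )

# Two periodic (bulk) loads. The earlier snapshot is kept, not overwritten.
load_snapshot(date(2010, 3, 1), {"widget": 40})
load_snapshot(date(2010, 4, 1), {"widget": 25})

# Time-variant: the history of one item can be read back over time.
history = [
    (row["as_of"].isoformat(), row["quantity"])
    for row in warehouse
    if row["item"] == "widget"
]
print(history)
```

An operational system would typically hold only the current quantity and update it in place; here every load adds rows, so earlier states remain available for analysis.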


Types

Operational Data Store: operational data mirror. E.g., items in stock.
Enterprise data warehouse: historical analysis, complex pattern analysis.
Data Marts


Uses of a data warehouse

Presentation of standard reports and graphs
Dimensional analysis
Data mining
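
Dimensional analysis means summarising a measure along one or more dimensions. A minimal sketch in plain Python, using made-up fact rows with two dimensions (region, quarter) and one measure (sales amount):

```python
from collections import defaultdict

# Hypothetical fact rows: (region, quarter, sales_amount).
facts = [
    ("North", "Q1", 100),
    ("North", "Q2", 150),
    ("South", "Q1", 80),
    ("South", "Q2", 70),
]

def roll_up(rows, dim_index):
    """Sum the measure (last column) for each value of one dimension."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[dim_index]] += row[-1]
    return dict(totals)

by_region = roll_up(facts, 0)    # view the same facts by region
by_quarter = roll_up(facts, 1)   # ...and by quarter
print(by_region)
print(by_quarter)
```

The same facts are viewed along different dimensions without changing the underlying data, which is what distinguishes this kind of analysis from a fixed standard report.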


Advantages

Lowers cost of information access


Improves customer responsiveness
Identifies hidden business opportunities
Strategic decision making


Roadmap to Data Warehousing

Data extracted, transformed and cleaned
Stored in a database - RDBMS, MDD (multidimensional database)
Query and reporting systems
Executive Information System and Decision Support System


Data Extraction and Load

Find sources of data: tables, files, documents, commercial databases, emails, Internet
Bad data quality: same name used for different things, different units
Tool to clean data - Apertus
Tool to convert codes, aggregate and calculate derived values - SAS
Data reengineering tools
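
The cleaning steps listed above (reconciling units, converting codes, then aggregating and deriving values) can be sketched in plain Python. This is not how Apertus or SAS work; the source records, field names and mapping tables are invented for illustration.

```python
# Hypothetical extracts from two source systems that disagree on units and codes.
source_a = [{"shipment": "S1", "weight": 2.0, "unit": "kg", "country": "AU"}]
source_b = [{"shipment": "S2", "weight": 500.0, "unit": "g", "country": "Australia"}]

GRAMS_PER_UNIT = {"kg": 1000.0, "g": 1.0}          # unit reconciliation
COUNTRY_CODES = {"AU": "AU", "Australia": "AU"}    # code conversion

def clean(record):
    """Return a new record with weight in kg and a standard country code."""
    grams = record["weight"] * GRAMS_PER_UNIT[record["unit"]]
    return {
        "shipment": record["shipment"],
        "weight_kg": grams / 1000.0,
        "country": COUNTRY_CODES[record["country"]],
    }

cleaned = [clean(record) for record in source_a + source_b]

# Aggregate, then calculate a derived value from the aggregate.
total_kg = sum(record["weight_kg"] for record in cleaned)
average_kg = total_kg / len(cleaned)
print(cleaned)
print(total_kg, average_kg)
```

Only after both sources use the same unit and the same code does it make sense to add their values together.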


Metadata

Database that describes various aspects of data in the warehouse
Administrative Metadata: source database and contents, transformations required, history of migrated data
End User Metadata: definition of warehouse data, descriptions of it, consolidation hierarchy
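
One way to picture the two kinds of metadata is as a small catalog keyed by warehouse column. The sketch below uses Python dataclasses; the class names, field names and the sample entry are assumptions made for illustration, not a standard metadata schema.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class AdministrativeMetadata:
    source_database: str
    transformations: List[str]
    migration_history: List[str] = field(default_factory=list)

@dataclass
class EndUserMetadata:
    definition: str
    description: str
    consolidation_hierarchy: List[str]

@dataclass
class ColumnMetadata:
    administrative: AdministrativeMetadata
    end_user: EndUserMetadata

# Hypothetical catalog entry for one warehouse column.
catalog = {
    "sales.amount": ColumnMetadata(
        administrative=AdministrativeMetadata(
            source_database="orders_db",
            transformations=["convert currency codes", "sum per day"],
            migration_history=["initial load"],
        ),
        end_user=EndUserMetadata(
            definition="Total value of sales",
            description="Daily sales total per store",
            consolidation_hierarchy=["day", "month", "quarter", "year"],
        ),
    )
}

entry = catalog["sales.amount"]
print(entry.administrative.source_database)
print(" -> ".join(entry.end_user.consolidation_hierarchy))
```

The administrative part answers where the data came from and how it was changed; the end-user part answers what it means and how it rolls up.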
