Lecture 2 Data Access
Conducted by
Ms. Akila Brahmana
Department of ICT
Faculty of Technology
University of Ruhuna
Objectives
Describe the importance of defining the access strategy early
in the development of the warehouse.
Identify the different categories of data access tools.
Identify the different database models that support OLAP
query tools.
Identify OLAP query techniques.
Identify non-OLAP tools for data access.
Identify factors influencing query tool choice.
Data Warehouse vs. Heterogeneous DBMS
Traditional heterogeneous DB integration:
Build wrappers/mediators on top of heterogeneous databases
Query driven approach
When a query is posed to a client site, a meta-dictionary is used to
translate the query into queries appropriate for individual
heterogeneous sites involved, and the results are integrated into a
global answer set
Requires complex information filtering, and queries compete for resources at the local sources
Data warehouse: update-driven, high performance
Information from heterogeneous sources is integrated in
advance and stored in warehouses for direct query and analysis
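The contrast between the two approaches can be sketched in a few lines. This is only an illustration: the two sources, their field names (`cust`, `total`, `customer_name`, `amount_cents`) and the function names are hypothetical and are not taken from the lecture.

```python
# Two hypothetical operational sources with different schemas and units.
SOURCE_A = [{"cust": "Ann", "total": 120.0}, {"cust": "Bob", "total": 80.0}]
SOURCE_B = [{"customer_name": "Ann", "amount_cents": 5000}]


def query_driven_total(customer):
    """Mediator style: translate the query for each source at query time."""
    total_a = sum(r["total"] for r in SOURCE_A if r["cust"] == customer)
    total_b = sum(
        r["amount_cents"] / 100
        for r in SOURCE_B
        if r["customer_name"] == customer
    )
    # Partial results are integrated into one global answer.
    return total_a + total_b


def load_warehouse():
    """Warehouse style: integrate all sources in advance into one schema."""
    warehouse = {}
    for r in SOURCE_A:
        warehouse[r["cust"]] = warehouse.get(r["cust"], 0.0) + r["total"]
    for r in SOURCE_B:
        name = r["customer_name"]
        warehouse[name] = warehouse.get(name, 0.0) + r["amount_cents"] / 100
    return warehouse


WAREHOUSE = load_warehouse()  # done once, before any query arrives


def update_driven_total(customer):
    """Direct lookup against the pre-integrated store."""
    return WAREHOUSE.get(customer, 0.0)
```

Both functions return the same answer; the difference is when the translation and integration work happens (on every query versus once, at load time).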
Data Warehouse vs. Operational DBMS
OLTP (on-line transaction processing)
Major task of traditional relational DBMS
Day-to-day operations: purchasing, inventory, banking, manufacturing, payroll, registration,
accounting, etc.
OLAP (on-line analytical processing)
Major task of data warehouse system
Data analysis and decision making
Distinct features (OLTP vs. OLAP):
User and system orientation: customer vs. market
Data contents: current, detailed vs. historical, consolidated
Database design: ER + application vs. star + subject
View: current, local vs. evolutionary, integrated
Access patterns: update vs. read-only but complex queries
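As a rough illustration of the two access patterns, the sketch below uses Python's built-in `sqlite3` module. The `sales` table, its columns and the sample rows are assumptions made for this example only.

```python
import sqlite3


def build_demo_db():
    """Create a small in-memory table of hypothetical sales records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales ("
        "id INTEGER PRIMARY KEY, region TEXT, year INTEGER, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO sales (region, year, amount) VALUES (?, ?, ?)",
        [
            ("North", 2022, 100.0),
            ("North", 2023, 150.0),
            ("South", 2022, 80.0),
            ("South", 2023, 120.0),
        ],
    )
    conn.commit()
    return conn


def oltp_update(conn, sale_id, new_amount):
    """OLTP style: a short transaction that updates one current record."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE sales SET amount = ? WHERE id = ?", (new_amount, sale_id)
        )


def olap_summary(conn):
    """OLAP style: a read-only query that consolidates many records."""
    return conn.execute(
        "SELECT region, year, SUM(amount) FROM sales "
        "GROUP BY region, year ORDER BY region, year"
    ).fetchall()
```

The update touches a single row by key, while the summary scans and aggregates the whole table, which is the kind of workload a warehouse is designed for.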
OLTP vs. OLAP
         OLTP                      OLAP
Users    clerk, IT professional    knowledge worker
6. Multi-user support
7. Unrestricted cross dimensional operations
8. Intuitive data manipulation
9. Flexible reporting
10. Unlimited dimensions and aggregation levels
Relational Database Model
[Figure: a relational table made up of columns and rows; fact tables SALES and FINANCE with the related tables Customer, Store, Time, Product and GL_Line]
[Figure: OLAP architectures. Labels: Desktop, OLAP Server, MD Server, OLAP Database Server, Warehouse]
The MOLAP Model
Data is stored in a multidimensional structure.
The presentation layer provides the multidimensional view.
[Figure: MOLAP engine in the application layer, on top of the warehouse]
The MOLAP Model
Data: arrays, cached, offloaded from server
Efficient storage and processing
Complexity hidden from the user
Analysis using pre-aggregated summaries and pre-calculated measures
[Figure: DSS client, MOLAP engine in the application layer, warehouse]
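The idea of array storage with pre-aggregated summaries can be sketched as follows. The dimension members and the numbers in `CUBE` are made up for illustration, and a real MOLAP engine uses far more compact storage than nested Python lists.

```python
from itertools import product

# Hypothetical dimension members; the array index gives the position.
PRODUCTS = ["TV", "PC"]
REGIONS = ["North", "South"]
QUARTERS = ["Q1", "Q2"]

# CUBE[p][r][q] holds the sales measure for one cell.
CUBE = [
    [[10, 12], [7, 9]],     # TV: North (Q1, Q2), South (Q1, Q2)
    [[20, 22], [15, 18]],   # PC: North (Q1, Q2), South (Q1, Q2)
]


def precompute_totals(cube):
    """Pre-aggregate every combination of fixed member or 'ALL'."""
    totals = {}
    cells = product(
        range(len(PRODUCTS)), range(len(REGIONS)), range(len(QUARTERS))
    )
    for p, r, q in cells:
        value = cube[p][r][q]
        # Each cell contributes to 2 * 2 * 2 = 8 summary keys.
        keys = product(
            (PRODUCTS[p], "ALL"), (REGIONS[r], "ALL"), (QUARTERS[q], "ALL")
        )
        for key in keys:
            totals[key] = totals.get(key, 0) + value
    return totals


# Built once at load time; later queries are plain dictionary lookups.
TOTALS = precompute_totals(CUBE)
```

For example, `TOTALS[("TV", "ALL", "ALL")]` answers "total TV sales" without touching the individual cells again, which is where the quick response of this model comes from.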
The ROLAP Model
[Figure: ROLAP model. Labels: Multiple SQL, Warehouse, Server]
The ROLAP Model
Data and metadata in server
Multidimensional views of data
High connectivity
Unlimited database size and query criteria
Complex SQL generated by tool
[Figure: DSS client, ROLAP engine in the application layer, multiple SQL statements sent to the warehouse server]
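A minimal sketch of how a ROLAP-style tool might turn a multidimensional request into SQL is shown below, using Python's built-in `sqlite3` module. The `sales` table, the dimension names in `ALLOWED_DIMENSIONS` and the helper `build_sql` are hypothetical; real tools generate considerably more complex SQL.

```python
import sqlite3

# Hypothetical dimension columns of a single 'sales' fact table.
ALLOWED_DIMENSIONS = {"product", "region", "quarter"}


def build_sql(group_by, filters):
    """Translate a multidimensional request into one SQL statement."""
    for name in list(group_by) + list(filters):
        if name not in ALLOWED_DIMENSIONS:
            raise ValueError(f"unknown dimension: {name}")
    columns = ", ".join(group_by)
    select_list = f"{columns}, SUM(amount)" if group_by else "SUM(amount)"
    sql = f"SELECT {select_list} FROM sales"
    if filters:
        # Values are bound as parameters, never pasted into the SQL text.
        sql += " WHERE " + " AND ".join(f"{name} = ?" for name in filters)
    if group_by:
        sql += f" GROUP BY {columns} ORDER BY {columns}"
    return sql, list(filters.values())


def build_demo_db():
    """Create a small in-memory fact table with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales "
        "(product TEXT, region TEXT, quarter TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?, ?)",
        [
            ("TV", "North", "Q1", 10),
            ("TV", "North", "Q2", 12),
            ("TV", "South", "Q1", 7),
            ("PC", "North", "Q1", 20),
            ("PC", "South", "Q1", 15),
        ],
    )
    conn.commit()
    return conn
```

A request such as "sales by region for Q1" becomes `build_sql(["region"], {"quarter": "Q1"})`, and the returned statement and parameters are passed straight to `conn.execute`. The data stays in relational tables; only the view presented to the user is multidimensional.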
OLAP Query Characteristics
Access to large amounts of data
Analysis of data relationships by many business criteria
Analysis of data by time
Display of data across different dimensions
Complex calculations using formulas
Quick response
Standard Query Techniques
[Figure: slice/dice (Customer, Time, Account dimensions), drill-down, drill-up, drill-across and pivoting, each moving between the questions "What?" and "Why?"]
Roll up (drill-up): summarize data
by climbing up a hierarchy or by dimension reduction
Other operations
Drill across: queries involving (across) more than one fact table
Drill through: going through the bottom level of the cube to its back-end
relational tables (using SQL)
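Drill across can be sketched as summarizing each fact table separately on the dimensions they share and then merging the results. The two fact tables below and their values are hypothetical, chosen only to show the mechanics.

```python
from collections import defaultdict

# Two hypothetical fact tables sharing the Store and Year dimensions.
SALES = [("Store A", "2023", 100), ("Store A", "2023", 50), ("Store B", "2023", 70)]
FINANCE = [("Store A", "2023", 120), ("Store B", "2023", 60)]  # costs


def summarize(facts):
    """Total the measure of one fact table per (store, year)."""
    totals = defaultdict(int)
    for store, year, value in facts:
        totals[(store, year)] += value
    return totals


def drill_across(sales, finance):
    """Line up measures from both fact tables on the shared dimensions."""
    sales_totals = summarize(sales)
    cost_totals = summarize(finance)
    keys = sorted(set(sales_totals) | set(cost_totals))
    return [
        (store, year, sales_totals.get((store, year), 0),
         cost_totals.get((store, year), 0))
        for store, year in keys
    ]
```

Each output row carries one measure from each fact table, so sales and costs for a store can be compared side by side.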
A Sample Data Cube
Example: Standard Query Techniques
Subject: Sales
Slice
The slice operation selects on one
specific dimension of a given cube
and produces a new sub-cube, which
presents the information from
another point of view.
A new sub-cube can be created by
choosing one or more dimensions.
Using slice implies a specified
granularity level for the
dimension.
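A minimal sketch of slice, assuming the cube is held as a list of records. The `SALES` data and the dimension names (`product`, `region`, `quarter`) are hypothetical.

```python
# Hypothetical sales cube: three dimensions and one measure.
SALES = [
    {"product": "TV", "region": "North", "quarter": "Q1", "amount": 10},
    {"product": "TV", "region": "South", "quarter": "Q1", "amount": 7},
    {"product": "PC", "region": "North", "quarter": "Q1", "amount": 20},
    {"product": "TV", "region": "North", "quarter": "Q2", "amount": 12},
]


def slice_cube(facts, dimension, value):
    """Fix one dimension to a single value and drop it from the result."""
    return [
        {key: val for key, val in fact.items() if key != dimension}
        for fact in facts
        if fact[dimension] == value
    ]


q1_slice = slice_cube(SALES, "quarter", "Q1")
```

Here `q1_slice` is a two-dimensional sub-cube (product by region) containing only the Q1 cells.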
Dice
The dice operation selects on two
or more dimensions of a given cube
and produces a new sub-cube, just
as the slice operation does.
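A minimal sketch of dice, again assuming a list-of-records cube. The `SALES` data and the dimension names are hypothetical.

```python
# Hypothetical sales cube: three dimensions and one measure.
SALES = [
    {"product": "TV", "region": "North", "quarter": "Q1", "amount": 10},
    {"product": "TV", "region": "South", "quarter": "Q1", "amount": 7},
    {"product": "PC", "region": "North", "quarter": "Q1", "amount": 20},
    {"product": "TV", "region": "North", "quarter": "Q2", "amount": 12},
]


def dice_cube(facts, criteria):
    """Keep cells whose members fall in the allowed set on every listed dimension."""
    return [
        fact
        for fact in facts
        if all(fact[dim] in allowed for dim, allowed in criteria.items())
    ]


tv_north = dice_cube(SALES, {"product": {"TV"}, "region": {"North"}})
```

Unlike slice, the result keeps all of its dimensions; it is simply restricted to a smaller range of members on each one named in `criteria`.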
Drill Up
Drill-up aggregates data in the
cube, either by ascending a
concept hierarchy for a dimension
or by dimension reduction, so that
measures are obtained at a less
detailed granularity.
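Both forms of drill-up can be sketched on a small example. The city-to-country hierarchy and the sales figures are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical concept hierarchy for the location dimension: city -> country.
CITY_TO_COUNTRY = {"Colombo": "Sri Lanka", "Galle": "Sri Lanka", "Chennai": "India"}

# Hypothetical facts: (city, quarter, amount).
SALES = [
    ("Colombo", "Q1", 10),
    ("Galle", "Q1", 5),
    ("Chennai", "Q1", 8),
    ("Colombo", "Q2", 12),
]


def climb_hierarchy(facts, city_to_country):
    """Drill up by ascending the hierarchy: city totals become country totals."""
    totals = defaultdict(int)
    for city, quarter, amount in facts:
        totals[(city_to_country[city], quarter)] += amount
    return dict(totals)


def reduce_dimension(facts):
    """Drill up by dimension reduction: drop the quarter dimension entirely."""
    totals = defaultdict(int)
    for city, _quarter, amount in facts:
        totals[city] += amount
    return dict(totals)
```

In both cases the measure is summed into fewer, coarser cells: the first keeps two dimensions at a higher level, the second keeps one dimension at its original level.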
[Figure: factors in choosing a query tool. Labels: query type (simple to complex), analysis, computer architecture, performance, openness, management, enterprise-wide perspective]
Tools Comparison
Relational
Known environment
Use with operational and warehouse systems
No complex analysis functions
Multidimensional
Quick access to data
Extensive libraries of complex functions
Strong modeling and forecasting capabilities
Use with operational and warehouse systems
Difficulty of changing dimensions
Summary