
Lecture 4

The document summarizes key differences between data warehouses and online transaction processing (OLTP) systems. It also provides examples of typical applications of data warehouses, such as fraud detection, profitability analysis, direct marketing, credit risk prediction, and yield management. Finally, it discusses how data warehouses are being applied to new domains like agriculture to improve decision making.

Uploaded by

za6372571
Copyright
© © All Rights Reserved

Lecture-4

Data Warehouse vs. OLTP
OLTP: Online Transaction Processing (MIS or database system)

                   Data Warehouse                     OLTP

Scope              * Application-neutral              * Application-specific
                   * Single source of “truth”         * Multiple databases with repetition
                   * Evolves over time                * Off-the-shelf application
                   * How to improve the business      * Runs the business

Data perspective   * Historical, detailed data        * Operational data
                   * Some summary                     * No summary
                   * Lightly denormalized             * Fully normalized

Queries            * Hardly uses PK                   * Based on PK
                   * Number of results returned       * Number of results returned
                     in thousands                       in hundreds

Time factor        * Minutes to hours                 * Sub-seconds to seconds
                   * Typical availability 6x12        * Typical availability 24x7
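
The contrast in the Queries row can be sketched with a small example. This is only an illustration, not taken from the lecture: the `sales` table, its columns, and the sample rows are hypothetical, and an in-memory SQLite database stands in for both kinds of system.

```python
import sqlite3

# Hypothetical sales table; names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "sale_id INTEGER PRIMARY KEY, region TEXT, year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        (1, "North", 2022, 100.0),
        (2, "North", 2023, 150.0),
        (3, "South", 2022, 80.0),
        (4, "South", 2023, 120.0),
    ],
)


def oltp_lookup(sale_id):
    """OLTP-style query: fetch a single row by its primary key."""
    return conn.execute(
        "SELECT sale_id, region, year, amount FROM sales WHERE sale_id = ?",
        (sale_id,),
    ).fetchone()


def warehouse_summary():
    """Warehouse-style query: scan the history and aggregate it.

    No primary key appears in the predicate.
    """
    return conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    ).fetchall()


print(oltp_lookup(3))        # one operational record
print(warehouse_summary())   # totals per region over all years
```

The first query touches one row and would return in sub-seconds on an OLTP system. The second reads every row, which is why warehouse queries over years of history are slower.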

Comparison of Response Times
* Online analytical processing (OLAP) queries must be executed in a small number of seconds.
  * Often requires denormalization and/or sampling.
* Complex query scripts and large list selections can generally be executed in a small number of minutes.
* Sophisticated clustering algorithms (e.g., data mining) can generally be executed in a small number of hours (even for hundreds of thousands of customers).
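
Sampling, mentioned above as one way to reach OLAP response times, can be sketched as follows. The lecture does not say how the sample is drawn; uniform random sampling is assumed here, and the `amounts` data and function names are hypothetical.

```python
import random

# Hypothetical fact data: 100,000 sale amounts between 100 and 149.
amounts = [100 + (i % 50) for i in range(100_000)]


def exact_average(values):
    """Full scan: touches every row."""
    return sum(values) / len(values)


def sampled_average(values, sample_size, seed=0):
    """Estimate the average from a uniform random sample of the rows."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    sample = rng.sample(values, sample_size)
    return sum(sample) / len(sample)


exact = exact_average(amounts)
estimate = sampled_average(amounts, 1_000)  # reads 1% of the rows
print(exact, estimate)
```

The estimate is approximate, so this trade of accuracy for speed only suits queries where a close answer is good enough.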

Putting the pieces together

[Figure: four-tier data warehouse architecture]
* Data (Tier 0), the data sources: semistructured sources, WWW data, archived data, operational databases.
* Data Warehouse Server (Tier 1): Extract, Transform, Load (ETL) into the warehouse, with metadata and data marts.
* OLAP Servers (Tier 2): MOLAP and ROLAP.
* Clients (Tier 3), the tools: query/reporting, data analysis, and data mining, used by IT users and business users.

Types & Typical Applications of DWH

Types of data warehouse
* Financial
* Telecommunication
* Insurance
* Human Resource
* Global
* Exploratory

Types of data warehouse
Financial
* Often the first data warehouse that an organization builds. This is appealing because:
  * Nerve center, easy to get attention.
  * In most organizations, the smallest data set.
  * Touches all aspects of an organization, with a common denominator, i.e., money.
  * Inherent structure of data directly influenced by the day-to-day activities of financial processing.

Word of caution: Net balances.


Types of data warehouse
Telecommunication
* Dominated by sheer volume of data.
* Many ways to accommodate call-level detail:
  * Keeping only a few months of call-level detail,
  * Storing lots of call-level detail scattered over different storage media,
  * Storing only selective call-level detail, etc.
* Unfortunately, for many kinds of processing, working at an aggregate level is simply not possible.

Types of data warehouse
Insurance
Insurance data warehouses are similar to other data warehouses, BUT with a few exceptions.
* Stored data is very, very old and is used for actuarial processing.
* A typical business may change dramatically over the last 40-50 years, but not insurance.
* In retailing or telecom there are a few important dates, but in the insurance environment there are many dates of many kinds.

Types of data warehouse
Insurance (continued)
* Long operational business cycles, measured in years.
* Processing time in months. Thus the operating speed is different.
* Transactions are not gathered and processed, but are kept in a kind of “frozen” state.
* Thus a unique approach to design and implementation is needed.

Typical Applications
The impact on an organization’s core business is to streamline it and maximize profitability.

* Fraud detection.
* Profitability analysis.
* Direct mail/database marketing.
* Credit risk prediction.
* Customer retention modeling.
* Yield management.
* Inventory management.

ROI on any one of these applications can justify HW/SW & consultancy costs in most organizations.

Typical Applications
Fraud detection
* By observing data usage patterns.
  * People have typical purchase patterns.
  * Deviation from these patterns may indicate fraud.
  * Certain cities are notorious for fraud.
  * Certain items are typically bought with stolen cards.
  * Similar behavior for stolen phone cards.
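
Deviation from a customer's typical purchase pattern can be sketched as a simple statistical check. The lecture does not name a detection method; the z-score rule, the threshold of 3, and the sample amounts below are assumptions chosen for illustration.

```python
from statistics import mean, stdev


def is_unusual(history, amount, threshold=3.0):
    """Flag a purchase that deviates strongly from the customer's own history.

    Assumption: "typical" means within `threshold` standard deviations
    of the mean of past purchase amounts.
    """
    if len(history) < 2:
        return False  # too little history to define a typical pattern
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return amount != mu
    return abs(amount - mu) / sigma > threshold


# Hypothetical past purchase amounts for one card holder.
history = [42.0, 38.5, 45.0, 40.0, 41.5, 39.0]
print(is_unusual(history, 43.0))   # close to the usual amounts
print(is_unusual(history, 900.0))  # far outside the usual range
```

A real system would combine several signals from the list above, such as city and item type, rather than the amount alone.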

Typical Applications
Profitability Analysis
* Banks know whether they are profitable or not.
* They don’t know which customers are profitable.
  * Typically more than 50% of customers are NOT profitable.
  * But which ones?
* Balance is not enough; transactional behavior is the key.
* Restructure products and pricing strategies.
* Life-time profitability models (next 3-5 years).
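
The point that transactional behavior, not balance, determines profitability can be sketched as follows. The channel costs, revenue figures, and customer IDs are all hypothetical; the lecture gives no cost model.

```python
from collections import defaultdict

# Hypothetical cost of serving one transaction, by channel.
COST_PER_TXN = {"branch": 2.50, "atm": 0.50, "online": 0.10}


def customer_profit(transactions):
    """Sum (revenue - servicing cost) per customer.

    Each transaction is a tuple (customer_id, channel, revenue).
    """
    profit = defaultdict(float)
    for customer_id, channel, revenue in transactions:
        profit[customer_id] += revenue - COST_PER_TXN[channel]
    return dict(profit)


transactions = [
    ("A", "online", 1.00),
    ("A", "online", 1.00),
    ("B", "branch", 0.50),
    ("B", "branch", 0.50),
    ("B", "atm", 0.25),
]
profits = customer_profit(transactions)
unprofitable = sorted(c for c, p in profits.items() if p < 0)
print(profits, unprofitable)
```

In this toy data customer B loses money for the bank because of expensive branch visits, whatever the account balance may be.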

Typical Applications
Direct mail marketing
* Targeted marketing.
  * Offering a high-bandwidth package, but NOT to all users.
  * Identify likely buyers from call detail records of web surfing.
  * Saves marketing expense, saving pennies.
* Knowing your customers better.

Typical Applications
Credit risk prediction
* Who should get a loan?
* Customer segregation, i.e., stable vs. rolling.
* Quantitative decision making, NOT subjective.
* Different interest rates for different customers.
* Do not subsidize bad customers on the basis of good ones.

Typical Applications
Yield Management
* Works for fixed-inventory businesses.
* The value of an unsold item suddenly goes to zero (e.g., an empty seat once the flight departs).
* Item prices vary for different customers.
* Examples: airlines, hotels, etc.
* Price of (say) an air ticket depends on:
  * How far in advance was the ticket bought?
  * How many vacant seats were there?
  * How profitable is the customer?
  * Is the ticket one-way or return?
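
The four pricing factors listed above can be combined in a toy fare rule. This is a sketch only: the lecture names the factors but gives no formula, so every multiplier, the 7-day cutoff, and the direction of each adjustment are assumptions.

```python
def ticket_price(base_fare, days_in_advance, vacant_seats, total_seats,
                 profitable_customer, is_return):
    """Toy fare rule; every multiplier below is an illustrative assumption."""
    price = base_fare
    # Assumption: bookings made less than a week ahead pay more.
    if days_in_advance < 7:
        price *= 1.5
    # Assumption: scarcity raises the fare.
    # load_factor is the share of seats already sold.
    load_factor = 1 - vacant_seats / total_seats
    price *= 1 + 0.5 * load_factor
    # Assumption: profitable customers get a loyalty discount.
    if profitable_customer:
        price *= 0.9
    # Assumption: a return ticket costs less than two one-way tickets.
    if is_return:
        price *= 1.8
    return round(price, 2)


early_empty = ticket_price(100, 30, 100, 100, False, False)
late_full = ticket_price(100, 2, 0, 100, False, False)
print(early_empty, late_full)
```

The same seat is priced differently depending on when and by whom it is bought, which is the core idea of yield management.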

Recent Application
Agriculture Systems
* Agricultural and related data has been collected for decades.
* Meteorological data consists of 50+ attributes.
* Decision making is based on expert judgment.
* Lack of integration results in underutilization of the data.
* What is required, in what amount, and when?

