Unit5 DM&DW

Uploaded by

srp27391

III BCA

Data Warehouse

Unit 5 : Data Warehouse

Introduction
A data warehouse is a repository of information collected from multiple sources,
stored under a unified schema, and that usually resides at a single site.

• Data warehouses generalize and consolidate data in multidimensional space.

The construction of data warehouses involves data cleaning, data integration,
and data transformation and can be viewed as an important preprocessing step
for data mining.
• Moreover, data warehouses provide on-line analytical processing (OLAP) tools
for the interactive analysis of multidimensional data of varied granularities,
which facilitates effective data generalization and data mining.
• An ordinary operational database typically stores megabytes to gigabytes of
data for one specific purpose. A data warehouse is designed to hold much
larger volumes, often terabytes, consolidated from many sources.
• Metadata Repository : Metadata are data about data. When used in a data
warehouse, metadata are the data that define warehouse objects.
Definition of Data Warehouse

A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile
collection of data in support of management’s decision making process.

• The definition presents the major features (characteristics) of a data
warehouse. The four keywords (subject-oriented, integrated, time-variant,
and nonvolatile) distinguish data warehouses from other data repository
systems, such as relational database systems, transaction processing systems,
and file systems.
Key features of data warehouse are :

▪ Subject-oriented : A data warehouse is organized around major subjects
such as customer, supplier, product, and sales.
▪ Integrated : A data warehouse is usually constructed by integrating
multiple heterogeneous sources, such as relational databases, flat files, and
online transaction records.
▪ Time-variant : Data are stored to provide information from an historic
perspective (e.g., the past 5–10 years). Every key structure in the data
warehouse contains, either implicitly or explicitly, a time element.

Govt. first Grade College, Shimoga 1

▪ Nonvolatile : A data warehouse is always a physically separate data store.
Due to this separation, a data warehouse does not require transaction
processing, recovery, and concurrency control mechanisms. It usually requires
only two operations in data accessing: initial loading of data and access of
data.
Differences between Operational Database Systems and Data Warehouses

• The major task of online operational database systems is to perform online
transaction and query processing. These systems are called online transaction
processing (OLTP) systems.
• Data warehouse systems, on the other hand, serve users or knowledge workers
in the role of data analysis and decision making. These systems are known as
online analytical processing (OLAP) systems.

The major distinguishing features of OLTP and OLAP are summarized as follows:

• Users and system orientation: An OLTP system is customer-oriented and
is used for transaction and query processing by clerks, clients, and information
technology professionals. An OLAP system is market-oriented and is used for
data analysis by knowledge workers, including managers, executives, and
analysts.
• Data contents: An OLTP system manages current data that, typically, are
too detailed to be easily used for decision making. An OLAP system manages
large amounts of historic data and provides facilities for summarization and
aggregation.
• Database design: An OLTP system usually adopts an entity-relationship
(ER) data model and an application-oriented database design. An OLAP
system typically adopts either a star or a snowflake model and a subject-
oriented database design.
• View: An OLTP system focuses mainly on the current data within an
enterprise or department, without referring to historic data or data in different
organizations. In contrast, an OLAP system often spans multiple versions of a
database schema, due to the evolutionary process of an organization.
• Access patterns: The access patterns of an OLTP system consist mainly of
short, atomic transactions. Such a system requires concurrency control and
recovery mechanisms. However, accesses to OLAP systems are mostly read-
only operations although many could be complex queries.
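
The contrast in access patterns can be sketched with Python's built-in sqlite3 module. This is only an illustration: the orders table, its columns, and its values are hypothetical, and a single in-memory SQLite database stands in for both kinds of system.

```python
import sqlite3

# Hypothetical in-memory table standing in for an operational "orders" store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (region, year, amount) VALUES (?, ?, ?)",
    [("North", 2023, 120.0), ("North", 2024, 80.0),
     ("South", 2023, 50.0), ("South", 2024, 150.0)],
)
conn.commit()

# OLTP-style access: a short, atomic transaction that touches a single row.
with conn:  # commits on success, rolls back if an exception is raised
    conn.execute("UPDATE orders SET amount = amount + 10 WHERE id = ?", (1,))

# OLAP-style access: a read-only query that scans and summarizes many rows.
totals_by_region = dict(
    conn.execute("SELECT region, SUM(amount) FROM orders GROUP BY region").fetchall()
)
```

The update needs the concurrency control and recovery guarantees of a transaction, while the summary query only reads, which is why the two workloads are usually served by separate systems.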


Data Warehousing: A Multitiered Architecture

Data warehouses often adopt a three-tier architecture, as presented in
Figure 3.1.

1. The bottom tier is a warehouse database server that is almost always a
relational database system. Back-end tools and utilities are used to feed data
into the bottom tier from operational databases or other external sources.
2. The middle tier is an OLAP server that is typically implemented using either
a relational OLAP (ROLAP) model (i.e., an extended relational DBMS that
maps operations on multidimensional data to standard relational operations);
or a multidimensional OLAP (MOLAP) model (i.e., a special-purpose server
that directly implements multidimensional data and operations).
3. The top tier is a front-end client layer, which contains query and reporting
tools, analysis tools, and/or data mining tools (e.g., trend analysis, prediction,
and so on).

Data Warehouse Modeling

Data warehouse modeling refers to the process and techniques used to design
the structure of a data warehouse.

• Data modeling refers to the process of designing and handling the data model
within a data warehouse platform.
• It consists of designing an appropriate database schema so that the data can
be stored in a form that is useful to the user.
• Data warehouse modeling uses warehouse models, different schemas, measures,
and concept hierarchies to design the structure of a data warehouse.

❖ Data Warehouse Models
From the architecture point of view, there are three data warehouse models: the
enterprise warehouse, the data mart, and the virtual warehouse.
• Enterprise warehouse: An enterprise warehouse collects all of the information
about subjects spanning the entire organization.
• Data mart: A data mart contains a subset of corporate-wide data that is of
value to a specific group of users. The scope is confined to specific selected
subjects. For example, a marketing data mart may confine its subjects to
customer, item, and sales.
• Virtual warehouse: A virtual warehouse is a set of views over operational
databases. For efficient query processing, only some of the possible summary
views may be materialized. A virtual warehouse is easy to build but requires
excess capacity on operational database servers.
❖ Schemas for Multidimensional Data Models

The most popular data model for a data warehouse is a multidimensional model.
Such a model can exist in the form of a star schema, a snowflake schema, or a fact
constellation schema.

• Star schema: The most common modeling paradigm is the star schema, in
which the data warehouse contains:
(1) a large central table (fact table) containing the bulk of the data, with no
redundancy, and
(2) a set of smaller attendant tables (dimension tables), one for each
dimension.
The schema graph resembles a starburst, with the dimension tables
displayed in a radial pattern around the central fact table.

• Snowflake schema: The snowflake schema is a variant of the star schema
model, where some dimension tables are normalized, thereby further splitting
the data into additional tables. The resulting schema graph forms a shape
similar to a snowflake.

• Fact constellation: Sophisticated applications may require multiple fact tables
to share dimension tables. This kind of schema can be viewed as a collection of
stars, and hence is called a galaxy schema or a fact constellation.
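
A star schema can be sketched with Python's built-in sqlite3 module. The table names (dim_time, dim_item, dim_location, fact_sales), the columns, and the sample rows are assumptions made for illustration, not part of the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: one small table per dimension.
CREATE TABLE dim_time (time_key INTEGER PRIMARY KEY, quarter TEXT, year INTEGER);
CREATE TABLE dim_item (item_key INTEGER PRIMARY KEY, item_name TEXT, brand TEXT);
CREATE TABLE dim_location (location_key INTEGER PRIMARY KEY, city TEXT, country TEXT);

-- Central fact table: a key to each dimension plus the numeric measures.
CREATE TABLE fact_sales (
    time_key INTEGER REFERENCES dim_time(time_key),
    item_key INTEGER REFERENCES dim_item(item_key),
    location_key INTEGER REFERENCES dim_location(location_key),
    dollars_sold REAL,
    units_sold INTEGER
);

INSERT INTO dim_time VALUES (1, 'Q1', 2024), (2, 'Q2', 2024);
INSERT INTO dim_item VALUES (1, 'Laptop', 'Acme'), (2, 'Phone', 'Acme');
INSERT INTO dim_location VALUES (1, 'Chicago', 'USA'), (2, 'Toronto', 'Canada');
INSERT INTO fact_sales VALUES
    (1, 1, 1, 1000.0, 2), (1, 2, 2, 500.0, 1), (2, 1, 2, 1500.0, 3);
""")

# A typical star-join: the fact table joined to one dimension, then aggregated.
sales_by_country = dict(conn.execute("""
    SELECT l.country, SUM(f.dollars_sold)
    FROM fact_sales AS f
    JOIN dim_location AS l ON l.location_key = f.location_key
    GROUP BY l.country
""").fetchall())
```

In a snowflake variant of this sketch, dim_location would be normalized further, for example by moving country into its own table referenced from the city rows. A fact constellation would add a second fact table that reuses the same dimension tables.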

❖ Measures: Their Categorization and Computation

A data cube measure is a numerical function that can be evaluated at each
point in the data cube space.

• A measure value is computed for a given point by aggregating the data
corresponding to the respective dimension-value pairs defining the given point.
• Measures can be organized into three categories (i.e., distributive, algebraic,
holistic), based on the kind of aggregate functions used.

1. Distributive : These measures can be computed in a distributive manner.
This means that the result can be obtained by partitioning the data, computing
the measure on each partition, and then combining the results.
• Examples : sum(), min(), and max() are distributive aggregate functions
2. Algebraic : These measures are composed of multiple distributive measures
combined using algebraic formulas. They can be computed by breaking them
down into basic distributive measures and then combining the results using
an algebraic formula.
• For example, avg() (average) can be computed by sum()/count(), where both
sum() and count() are distributive aggregate functions
3. Holistic : These measures require access to all the data to compute the result
and cannot be broken down into smaller pieces for intermediate computation.
• Common examples of holistic functions include median(), mode(), and
rank().
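
The three categories can be illustrated with a small Python sketch. The partitioned values are made up for the example; the point is how each kind of measure behaves when the data is split into partitions.

```python
from statistics import median

# Hypothetical data split into three partitions.
partitions = [[4, 8, 15], [16, 23], [42]]
all_values = [v for part in partitions for v in part]

# Distributive: apply the function per partition, then combine the partial results.
total = sum(sum(part) for part in partitions)
largest = max(max(part) for part in partitions)
count = sum(len(part) for part in partitions)

# Algebraic: avg() is derived from two distributive measures, sum() and count().
average = total / count

# Holistic: median() needs all the values at once. Combining per-partition
# medians does not, in general, give the true median.
true_median = median(all_values)
median_of_medians = median(median(part) for part in partitions)
```

Here the true median and the median of the partition medians differ, which is why holistic measures are costly to compute on large cubes.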
❖ Concept Hierarchies

A concept hierarchy defines a sequence of mappings from a set of low-level
concepts to higher-level, more general concepts.

• A concept hierarchy includes a set of nodes organized in a tree, where the
nodes define values of an attribute, known as concepts.
• The hierarchies allow the user to summarize the data at various levels.

Figure: (a) a hierarchy for location; (b) a lattice for time
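
A location hierarchy of the form city < state < country can be sketched as a chain of mappings. The cities, the sales figures, and the helper names generalize and summarize are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Each mapping climbs one level of a hypothetical location hierarchy.
city_to_state = {
    "Vancouver": "British Columbia",
    "Victoria": "British Columbia",
    "Toronto": "Ontario",
    "Chicago": "Illinois",
}
state_to_country = {"British Columbia": "Canada", "Ontario": "Canada", "Illinois": "USA"}


def generalize(city, level):
    """Map a city to its ancestor concept at the requested hierarchy level."""
    if level == "city":
        return city
    state = city_to_state[city]
    if level == "state":
        return state
    if level == "country":
        return state_to_country[state]
    raise ValueError(f"unknown level: {level}")


def summarize(sales_by_city, level):
    """Total the sales at the chosen level of the hierarchy."""
    totals = defaultdict(float)
    for city, amount in sales_by_city.items():
        totals[generalize(city, level)] += amount
    return dict(totals)


sales_by_city = {"Vancouver": 100.0, "Victoria": 50.0, "Toronto": 200.0, "Chicago": 300.0}
sales_by_country = summarize(sales_by_city, "country")
```

Calling summarize with "state" or "country" gives the same data at a coarser level of detail, which is what lets a user view the data at various levels.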

Data Cube and OLAP :

Data cube :
A grouping of data in a multidimensional matrix is called a data cube.

▪ In data warehousing, we generally deal with multidimensional data models,
because the data are described by multiple dimensions and multiple
attributes.
• This multidimensional data is represented as a data cube, since a cube can
represent a high-dimensional space.
• The data cube shows pictorially how the different attributes of the data are
arranged in the data model.

Data cube classification:

The data cube can be classified into two categories:

• Multidimensional data cube: Stores large amounts of data in a
multidimensional array. It increases efficiency by keeping an index on each
dimension, so data can be retrieved quickly.
• Relational data cube: Stores large amounts of data in relational tables.
Each relational table represents the dimensions of the data cube. It is
generally slower than a multidimensional data cube.
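
The two storage styles can be contrasted in a short Python sketch. The dimension values and sales rows are hypothetical, and nested lists stand in for a real multidimensional storage engine.

```python
# Per-dimension indexes map a dimension value to its position in the array.
time_index = {"Q1": 0, "Q2": 1}
item_index = {"Laptop": 0, "Phone": 1}
city_index = {"Chicago": 0, "Toronto": 1}

# Relational style: one row per fact, as in a table.
rows = [
    ("Q1", "Laptop", "Chicago", 1000.0),
    ("Q1", "Phone", "Toronto", 500.0),
    ("Q2", "Laptop", "Toronto", 1500.0),
]

# Multidimensional style: a dense 3-D array with one cell per combination.
cube = [[[0.0 for _ in city_index] for _ in item_index] for _ in time_index]
for quarter, item, city, dollars in rows:
    cube[time_index[quarter]][item_index[item]][city_index[city]] += dollars


def lookup_array(quarter, item, city):
    """Direct cell access through the dimension indexes."""
    return cube[time_index[quarter]][item_index[item]][city_index[city]]


def lookup_rows(quarter, item, city):
    """Scan the rows and total the matching facts."""
    return sum(d for q, i, c, d in rows if (q, i, c) == (quarter, item, city))
```

The array lookup goes straight to a cell, while the row lookup has to scan. The trade-off is that the dense array also stores every empty cell, so sparse data wastes space in this simple form.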

Advantages of data cubes:

• Multi-dimensional analysis: Data cubes enable multi-dimensional analysis
of business data, allowing users to view data from different perspectives and
levels of detail.
• Interactivity: Data cubes provide interactive access to large amounts of data,
allowing users to easily navigate and manipulate the data to support their
analysis.
• Speed and efficiency: Data cubes are optimized for OLAP analysis, enabling
fast and efficient querying and aggregation of data.
• Data aggregation: Data cubes support complex calculations and data
aggregation, enabling users to quickly and easily summarize large amounts of
data.
• Improved decision-making: Data cubes provide a clear and comprehensive
view of business data, enabling improved decision-making and business
intelligence.
• Accessibility: Data cubes can be accessed from a variety of devices and
platforms, making it easy for users to access and analyze business data from
anywhere.

Disadvantages of data cube:

• Complexity: OLAP systems can be complex to set up and maintain, requiring
specialized technical expertise.
• Data size limitations: OLAP systems can struggle with very large data sets
and may require extensive data aggregation or summarization.
• Performance issues: OLAP systems can be slow when dealing with large
amounts of data, especially when running complex queries or calculations.
• Data integrity: Inconsistent data definitions and data quality issues can
affect the accuracy of OLAP analysis.
• Cost: OLAP technology can be expensive, especially for enterprise-level
solutions, due to the need for specialized hardware and software.
• Inflexibility: OLAP systems may not easily accommodate changing business
needs and may require significant effort to modify or extend.

OLAP :

OLAP stands for Online Analytical Processing, which is a technology that enables
multi-dimensional analysis of business data.

▪ It provides interactive access to large amounts of data and supports complex
calculations and data aggregation. OLAP is used to support business
intelligence and decision-making processes.

Types of OLAP Systems:

1. MOLAP (Multidimensional OLAP): Uses specialized storage to handle
multidimensional data and provide fast query performance.
▪ MOLAP uses array-based multidimensional storage engines for
multidimensional views of data.
2. ROLAP (Relational OLAP): Leverages relational databases to store data
and performs on-the-fly aggregation.
▪ ROLAP servers are placed between the relational back-end server
and the client front-end tools.
3. HOLAP (Hybrid OLAP): Combines the capabilities of MOLAP and
ROLAP to balance the benefits of both.
▪ HOLAP servers allow large volumes of detailed data to be stored.

Characteristics of OLAP system

Fast :

The system is targeted to deliver most responses to the user within about five
seconds, with the simplest analyses taking no more than one second and very few
taking more than 20 seconds.

Analysis :

The system can cope with any business logic and statistical analysis that is
appropriate for the application and the user, while keeping it easy enough for
the target user. Although some pre-programming may be required, it is not
acceptable for every application definition to depend on it: the system should
allow the user to define new ad hoc calculations as part of the analysis and to
report on the data in any desired way.

Shared :

The system implements all the security requirements for confidentiality
(possibly down to cell level) and, if multiple write access is required,
concurrent update locking at a suitable level. Not all applications require
users to write data back, but for the increasing number that do, the system
must be able to handle several updates in an appropriate, secure manner.

Multidimensional :

This is the basic requirement. An OLAP system must provide a multidimensional
conceptual view of the data, including full support for hierarchies, as this is
certainly the most logical method to analyze business and organizations.

Information :

The system should be able to hold all the data needed by the applications. Data
sparsity should be handled in an efficient manner.

Data cube Operations

Data cube operations are key concepts in OLAP (Online Analytical Processing)
systems, which are used for analyzing data in a multidimensional space.

• These operations allow users to manipulate and analyze the data cube to derive
meaningful insights.

• These operations allow users to navigate through data cubes in a flexible and
interactive way, enabling detailed data analysis.

Here are the main operations:

❖ Roll-up :

The roll-up operation (also called the drill-up operation by some vendors)
performs aggregation on a data cube, either by climbing up a concept hierarchy for a
dimension or by dimension reduction.

• When roll-up is performed by dimension reduction, one or more dimensions are
removed from the given cube. For example, consider a sales data cube
containing only the two dimensions location and time. Roll-up may be
performed by removing, say, the time dimension, resulting in an aggregation
of the total sales by location, rather than by location and by time.

❖ Drill-down:
Drill-down is the reverse of roll-up. It navigates from less detailed data to
more detailed data.

• For example, drill-down can be performed by descending the time hierarchy
from the level of quarter to the more detailed level of month. The resulting
data cube details the total sales per month rather than summarizing them by
quarter.
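
Roll-up and drill-down can both be sketched as re-aggregating the same facts at a different level of detail. The facts, the column names, and the aggregate helper are hypothetical. Real OLAP servers typically precompute such aggregates rather than rescanning the base data.

```python
from collections import defaultdict

# Hypothetical facts: (month, quarter, city, country, dollars_sold)
facts = [
    ("Jan", "Q1", "Chicago", "USA", 100.0),
    ("Feb", "Q1", "Chicago", "USA", 150.0),
    ("Jan", "Q1", "Toronto", "Canada", 200.0),
    ("Apr", "Q2", "Toronto", "Canada", 50.0),
]
COLUMNS = ("month", "quarter", "city", "country")


def aggregate(facts, group_by):
    """Sum dollars_sold for each combination of the requested columns."""
    positions = [COLUMNS.index(name) for name in group_by]
    totals = defaultdict(float)
    for fact in facts:
        totals[tuple(fact[p] for p in positions)] += fact[-1]
    return dict(totals)


# Starting view: sales by quarter and city.
by_quarter_city = aggregate(facts, ["quarter", "city"])

# Roll-up by climbing the location hierarchy: city -> country.
by_quarter_country = aggregate(facts, ["quarter", "country"])

# Roll-up by dimension reduction: drop the time dimension entirely.
by_city = aggregate(facts, ["city"])

# Drill-down by descending the time hierarchy: quarter -> month.
by_month_city = aggregate(facts, ["month", "city"])
```

Drill-down is only possible here because the detailed monthly facts were kept. A cube that stored only quarterly totals could not be drilled down to months.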

❖ Slice and dice:
• Slice : The slice operation performs a selection on one dimension of the given
cube, resulting in a subcube.
This operation filters out the portions that are not needed: when the user
needs only one particular value of a dimension for analysis, rather than all
of its values, the slice keeps just that part of the cube.
• Dice : The dice operation defines a subcube by performing a selection on two or
more dimensions.

Figure: Slice operation Figure: Dice operation
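
Slice and dice can be sketched as filters over a list of facts. The fact records and the helper names slice_cube and dice_cube are hypothetical.

```python
# Hypothetical facts with three dimensions (quarter, item, city) and one measure.
facts = [
    {"quarter": "Q1", "item": "Laptop", "city": "Chicago", "dollars": 1000.0},
    {"quarter": "Q1", "item": "Phone", "city": "Toronto", "dollars": 500.0},
    {"quarter": "Q2", "item": "Laptop", "city": "Toronto", "dollars": 1500.0},
    {"quarter": "Q2", "item": "Phone", "city": "Chicago", "dollars": 700.0},
]


def slice_cube(facts, dimension, value):
    """Slice: select on a single dimension."""
    return [f for f in facts if f[dimension] == value]


def dice_cube(facts, criteria):
    """Dice: select on two or more dimensions.

    criteria maps each dimension name to the set of values to keep.
    """
    return [f for f in facts if all(f[dim] in allowed for dim, allowed in criteria.items())]


q1_slice = slice_cube(facts, "quarter", "Q1")
diced = dice_cube(facts, {"city": {"Toronto"}, "item": {"Laptop", "Phone"}})
```

The slice fixes one value of the quarter dimension, while the dice restricts both the city and item dimensions at once.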

❖ Pivot (rotate):

Pivot (also called rotate) is a visualization operation that rotates the data axes
in view in order to provide an alternative presentation of the data.

• It may involve swapping the rows and columns or moving one of the row
dimensions into the column dimensions.
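
For a two-dimensional view, pivoting amounts to transposing the table. The labels and values in this sketch are hypothetical.

```python
# A 2-D view with quarters as rows and cities as columns.
row_labels = ["Q1", "Q2"]
column_labels = ["Chicago", "Toronto"]
view = [
    [1000.0, 500.0],   # Q1
    [700.0, 1500.0],   # Q2
]


def pivot(row_labels, column_labels, view):
    """Swap the row and column axes. Values are only rearranged, not changed."""
    rotated = [list(column) for column in zip(*view)]
    return column_labels, row_labels, rotated


new_rows, new_columns, rotated = pivot(row_labels, column_labels, view)
```

After the pivot, cities label the rows and quarters label the columns. No aggregation takes place, so the operation changes only the presentation.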


Multidimensional Data Model

The multidimensional data model is a method for organizing the data in a
database so that its contents are properly arranged and assembled for
analysis.

• Data warehouses and OLAP tools are based on a multidimensional data model.
This model views data in the form of a data cube.
• A data cube allows data to be modeled and viewed from many dimensions and
perspectives. It is defined by dimensions and facts.
Dimensions :

• Dimensions are the perspectives or entities with respect to which an
organization wants to keep records.
• Each dimension may have a table associated with it, called a dimension table,
which further describes the dimension.
• For example, a dimension table for item may contain the attributes item name,
brand, and type.
• Dimension tables can be specified by users or experts, or automatically
generated and adjusted based on data distributions.
Facts :

• A multidimensional data model is typically organized around a central theme,
like sales, for instance. This theme is represented by a fact table. Facts are
numerical measures. Think of them as the quantities by which we want to
analyze relationships between dimensions.
• Examples of facts for a sales data warehouse include dollars sold (sales amount
in dollars), units sold (number of units sold), and amount budgeted.
• The fact table contains the names of the facts, or measures, as well as keys to
each of the related dimension tables.


Every project that builds a multidimensional data model should follow these
stages:

Stage 1 : Assembling data from the client : In the first stage, the correct
data is collected from the client.

Stage 2 : Grouping different segments of the system : In the second stage, all
the data is recognized and classified into the sections it belongs to, which
makes the model easier to apply step by step.

Stage 3 : Identifying the different dimensions : In the third stage, the main
factors are identified from the user’s point of view. These factors are known as
“dimensions”.

Stage 4 : Preparing the factors and their respective qualities : In the fourth
stage, the factors identified in the previous stage are used to identify their
related qualities. These qualities are known as “attributes” in the database.

Stage 5 : Finding the facts for the factors listed previously and their
qualities : In the fifth stage, the facts are separated and distinguished from
the factors collected so far.

Stage 6 : Building the schema to place the data, with respect to the
information collected in the stages above : In the sixth stage, a schema is
built on the basis of the data collected previously.


Data cube implementation

The implementation of a data cube in a data warehouse involves several key
steps to design, build, and utilize the cube for multi-dimensional data analysis.

Data cubes in data mining can be classified into two main categories:

1. Multidimensional data cube – This type of data cube in data mining is based
on the concept of dimensions and measures.
2. Relational data cube – This type of data cube in data mining is based on the
relational database model and represents data in tables with rows and
columns.

The implementation steps are outlined below:

1. Requirement Analysis
Understand the business requirements to determine what dimensions and
measures are needed. This includes identifying the key metrics and
dimensions of analysis, such as time, geography, and product categories
2. Schema Design :
Design the database schema to support the data cube. This typically
involves creating a star schema or a snowflake schema.
• Star Schema: Central fact table connected to multiple dimension tables.
• Snowflake Schema: A normalized form of the star schema where dimension
tables are further broken down into related tables.
3. ETL Process
Extract, Transform, and Load (ETL) data into the data warehouse:
• Extract: Retrieve data from source systems.
• Transform: Cleanse, format, and prepare data.
• Load: Insert data into the fact and dimension tables in the data warehouse.
4. Create the Data Cube
• Use OLAP tools to create the data cube.
• Tools like Microsoft SQL Server Analysis Services (SSAS), Oracle OLAP,
and IBM Cognos are commonly used.
5. Build the Cube:
• Select Measures: Choose columns from the fact table (e.g., SalesAmount,
UnitsSold).
• Define Dimensions: Specify the dimension tables and create hierarchies
(e.g., Year > Quarter > Month > Day).
• Process the Cube: Build and populate the cube with data

6. Querying the Data Cube

Once the cube is created and processed, you can query it using MDX
(Multidimensional Expressions) or other supported query languages.
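
The ETL step (step 3 above) can be sketched with Python's standard library. This is a minimal illustration under stated assumptions: the inline CSV text, the fact_sales table, and the extract, transform, and load helpers are all hypothetical, and the final SQL query is only a stand-in for a cube query. It is not MDX and does not use any of the OLAP tools named above.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from a source system (inline so the sketch is self-contained).
RAW = """date,product,amount
2024-01-15, laptop ,1000.50
2024-02-03,PHONE,500
2024-02-30,phone,not_a_number
"""


def extract(text):
    """Extract: read raw records from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(records):
    """Transform: cleanse and format the records for loading."""
    clean = []
    for record in records:
        try:
            amount = float(record["amount"])
        except ValueError:
            continue  # cleansing: skip rows whose amount is not numeric
        year, month, _day = record["date"].split("-")
        clean.append((int(year), int(month), record["product"].strip().lower(), amount))
    return clean


def load(conn, rows):
    """Load: insert the prepared rows into the fact table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_sales "
        "(year INTEGER, month INTEGER, product TEXT, sales_amount REAL)"
    )
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW)))

# Aggregate the measure along the time dimension (year, month).
monthly = conn.execute(
    "SELECT year, month, SUM(sales_amount) FROM fact_sales "
    "GROUP BY year, month ORDER BY year, month"
).fetchall()
```

The third input row is dropped during the transform step because its amount cannot be parsed. A production pipeline would more likely log or quarantine such rows than discard them silently.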

OLAP implementation
When implementing an OLAP system, there are a few key considerations to keep
in mind:

1. Data Model Design: Carefully design the data model to align with the analytical
requirements of the organization. This includes defining dimensions, hierarchies, and
measures.

2. Data Integration: Ensure seamless integration of data from various sources into
the OLAP database. This may involve data extraction, transformation, and loading
(ETL) processes.

3. Scalability and Performance: Plan for scalability and performance optimizations as
the volume of data and user queries increase over time. This may involve partitioning
data, optimizing aggregations, and using caching mechanisms.

4. Vision and strategy development

To choose and implement the most suitable solution, your team needs to define
its business objectives first. This is one of the most important steps, because only
a clear understanding of what you need and how to get it can lead to success. The
next step is to identify the strategy for reaching those objectives.

5. Data preparation

First of all, it is important to learn as much as possible about the system.
Before preparing your data to be transferred into the new system, check the
characteristics of OLAP data:

▪ OLAP data is summarized;
▪ OLAP data is more departmentalized than data in a data warehouse,
which serves corporate-wide needs;
▪ The system stores and uses less data than a data warehouse.

Make sure that these conditions suit you, and then start your data preparation.

6. Vendor and platform choice

Taking all the information and your requirements together, it is time to choose
a vendor and an OLAP system. The choice should take into account the kind of
system:

▪ ROLAP
▪ MOLAP
▪ HOLAP

7. Review : Summarize all the requirements, data, and steps before implementation.

OLAP implementation steps :

Step 1 : dimensional modeling

Step 2 : selecting the data to be moved into the OLAP system

Step 3 : data extraction for the OLAP system

Step 4 : loading data to the OLAP server

Step 5 : data aggregation and derived data computation

Step 6 : implementation of the OLAP application on the desktop

Step 7 : organizing user training

OLAP Software
OLAP (Online Analytical Processing) software is a critical component in the
field of data warehousing, enabling complex analytical and ad-hoc queries with a
rapid execution time.

Here’s an overview of OLAP software in the context of data warehouses:

Key Concepts

Multidimensional Data Models:

Cubes: OLAP organizes data into cubes instead of traditional tables. A cube is
a multi-dimensional array of data, and each dimension represents a different
attribute (e.g., time, geography, product).

Dimensions and Measures: Dimensions are the perspectives or entities with
respect to which an organization wants to keep records, and measures are the
numerical data being tracked (e.g., sales amount, profit).

Key Features:

▪ Fast Query Performance: OLAP systems are optimized for quick query
execution to support real-time analysis.
▪ Complex Calculations: Supports complex calculations and aggregations, such
as SUM, AVG, COUNT, etc.
▪ Data Drilling: Allows users to drill down into details or roll up to higher-level
summaries.
▪ Slicing and Dicing: Enables data to be viewed from different perspectives by
slicing along dimensions or dicing subsets of the cube.

Benefits

• Enhanced Data Analysis

• Improved Decision Making
• Flexibility

OLAP software in data warehousing provides a powerful framework for analyzing
complex datasets, facilitating advanced reporting, and supporting strategic
decision-making processes in organizations.

18 pages
Resume-Senior Data Engineer-Etihad Airways-Kashish Suri
No ratings yet
Resume-Senior Data Engineer-Etihad Airways-Kashish Suri
4 pages
Data Governance Plan PDF
100% (4)
Data Governance Plan PDF
36 pages
Database And Computer Management: SERIES 1, #3
From Everand
Database And Computer Management: SERIES 1, #3
Elias Mutegi
No ratings yet
The InfluxDB Handbook: Deploying, Optimizing, and Scaling Time Series Data
From Everand
The InfluxDB Handbook: Deploying, Optimizing, and Scaling Time Series Data
Robert Johnson
No ratings yet
Introduction to Microsoft SQL Server
From Everand
Introduction to Microsoft SQL Server
Eric Frick
No ratings yet
Databases: System Concepts, Designs, Management, and Implementation
From Everand
Databases: System Concepts, Designs, Management, and Implementation
Jonathan Rigdon
No ratings yet
Learn Data Warehousing in 24 Hours
From Everand
Learn Data Warehousing in 24 Hours
Alex Nordeen
No ratings yet

