Unit 3

This document provides an overview of Dimensional Data Warehousing, detailing the structure and components such as fact tables, dimension tables, and surrogate keys. It explains the importance of dimensional models in Business Intelligence and discusses different types of fact tables and measures. Additionally, it introduces Multidimensional OLAP (MOLAP, ROLAP, HOLAP) and their respective advantages and disadvantages in data analysis.

University of MASCARA SAHRAOUI Mustapha

Unit 3: Dimensional Data Warehouse

CONTENTS
Objectives
Introduction

3.1 Dimensional Model

3.2 Facts Table
3.2.1 Types of Measure
3.2.2 Types of Fact Table
3.3 Dimension Tables
3.4 Surrogate Keys and Alternative Table Structure
3.4.1 Advantages of Surrogate Keys
3.4.2 Disadvantages of Surrogate Keys
3.4.3 Alternative Tables used in Data Warehousing
3.5 Multidimensional OLAP
3.5.1 MOLAP
3.5.2 ROLAP
3.5.3 HOLAP
3.6 Summary
3.7 Keywords
3.8 Review Questions
3.9 Further Readings

Objectives

After studying this unit, you will be able to:

• Describe the dimensional model
• Construct fact tables
• Demonstrate dimension tables
• Discuss surrogate keys and alternative table structures
• Explain multidimensional OLAP

Introduction

Dimensions are a common way of analysing data. A dimensional model comprises a fact table and several
dimension tables and is used for analysing summarized data. Dimensional data modelling is the preferred modelling
technique in a BI environment. Knowing the basics of data warehousing and dimensions helps you design a data
warehouse that fits your reporting needs. This unit explains the importance of dimensions and dimension granularity and
stresses the importance of flattening hierarchies, with the goal of making data more accessible and useful to users. It
also covers fact and dimension tables.

3.1 Dimensional Model

A dimensional model comprises a fact table and several dimension tables and is used for analysing summarized data.
Since Business Intelligence reports are used to assess facts (aggregates) across various dimensions, dimensional
data modelling is the preferred modelling technique in a BI environment.

Facts are normally numeric measures such as dollar value, sales or income. They are the subject of a decision-support
analysis.

Dimensions define the axis of enquiry of a fact.

Example: Product, Region and Time are the axes of enquiry of the Sales detail.

One such enquiry could be a scenario where the user wants to see the Sales (in dollars)
for a specific item in a specific market over a specific span of time. In this case, we are calculating the
fact (Sales) over three dimensions (Product, Region and Time). Thus we can say that dimensions
give different views of the facts. They give structure to the otherwise unstructured facts.

A dimension table typically contains the attributes for the SQL answer set. Figure 3.1 shows an example of
a dimensional model.

Figure 3.1: Example of Dimensional Model
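
The Sales enquiry described above can be sketched as a small star schema. The following is only a minimal illustration using Python's built-in sqlite3 module; the table names, column names and sample rows are assumptions made for this sketch and are not taken from Figure 3.1.

```python
import sqlite3

# In-memory database holding one fact table and three dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE dim_region  (region_key  INTEGER PRIMARY KEY, region_name  TEXT);
CREATE TABLE dim_time    (time_key    INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE fact_sales (
    product_key  INTEGER REFERENCES dim_product(product_key),
    region_key   INTEGER REFERENCES dim_region(region_key),
    time_key     INTEGER REFERENCES dim_time(time_key),
    sales_amount REAL
);
INSERT INTO dim_product VALUES (1, 'Pen'), (2, 'Notebook');
INSERT INTO dim_region  VALUES (1, 'North'), (2, 'South');
INSERT INTO dim_time    VALUES (1, 2024, 1), (2, 2024, 2), (3, 2024, 3);
INSERT INTO fact_sales VALUES
    (1, 1, 1, 100.0), (1, 1, 2, 150.0), (1, 2, 1, 80.0),
    (2, 1, 1, 40.0),  (1, 1, 3, 70.0);
""")

def sales_for(product, region, year, first_month, last_month):
    """Sum the Sales fact for one product, one region and a span of months."""
    row = conn.execute("""
        SELECT COALESCE(SUM(f.sales_amount), 0)
        FROM fact_sales f
        JOIN dim_product p ON p.product_key = f.product_key
        JOIN dim_region  r ON r.region_key  = f.region_key
        JOIN dim_time    t ON t.time_key    = f.time_key
        WHERE p.product_name = ? AND r.region_name = ?
          AND t.year = ? AND t.month BETWEEN ? AND ?
    """, (product, region, year, first_month, last_month)).fetchone()
    return row[0]

print(sales_for("Pen", "North", 2024, 1, 2))
```

Each dimension contributes one filter to the WHERE clause, while the fact table supplies the number that is aggregated.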

Self Assessment

Fill in the blanks:

1. A dimensional model comprises a .............................. and several dimension tables
and is used for analysing summarized data.
2. .................................. define the axis of enquiry of a fact.

3.2 Facts Table

A fact table generally represents a process or reporting environment that is of value to the
organization. It is important to determine the identity of the fact table and specify exactly what
it represents. A fact table typically corresponds to an associative entity in the E-R model.
The measures must be listed in the logical fact table. Each measure has its own aggregation rule, such as
ADD (sum), AVG, MIN or MAX. Aggregation rules define the way in which the business would like to
compare values of a measure.

Facts are the measurements associated with fact table records at fact table
granularity.
Figure 3.2 shows how the Sales detail table is connected to the dimension tables through
one-to-many relationships.

Figure 3.2: Sales Details Table One-to-many Relationship

Source: [Link]
tOdQBrMG3FU/s1600/Star_Model.JPG

3.2.1 Types of Measure

Various types of measure in a fact table are:

Additive: Measures that can be added across all dimensions are additive measures.
Semi-additive: Measures that can be added across only some dimensions are semi-additive.
Non-additive: Measures that cannot be added across any dimension are non-additive.
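
The three kinds of measure can be illustrated with a small sketch. The bank-account rows below are hypothetical and chosen only for this illustration: deposits behave as an additive measure, the balance as a semi-additive one (it can be summed across accounts but not across days), and the interest rate as a non-additive one.

```python
from collections import defaultdict

# Hypothetical daily rows: (day, account, deposits, balance, interest_rate_percent)
rows = [
    ("Mon", "A", 100.0, 100.0, 2.0),
    ("Mon", "B",  50.0,  50.0, 3.0),
    ("Tue", "A",  20.0, 120.0, 2.0),
    ("Tue", "B",   0.0,  50.0, 3.0),
]

# Additive: deposits can be summed across every dimension (day and account).
total_deposits = sum(r[2] for r in rows)

# Semi-additive: the balance can be summed across accounts for a single day...
balance_by_day = defaultdict(float)
for day, _account, _deposits, balance, _rate in rows:
    balance_by_day[day] += balance

# ...but not across days. Over time we keep the last snapshot per account instead.
closing_balance = {}
for _day, account, _deposits, balance, _rate in rows:  # rows are in day order
    closing_balance[account] = balance

# Non-additive: summing rates is meaningless, so another rule (here AVG) is used.
average_rate = sum(r[4] for r in rows) / len(rows)

print(total_deposits, dict(balance_by_day), closing_balance, average_rate)
```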

3.2.2 Types of Fact Table

There are basically three types of fact tables:

Transactional: A transactional table is the most basic and fundamental type of fact table.
The grain associated with a transactional fact table is usually specified as one row per line
in a transaction, e.g., every line on a receipt becomes one row.

Periodic Snapshots: A periodic snapshot takes a picture of the moment, where the moment could be anything,
such as a performance summary of a salesman over the previous 3 months. A periodic snapshot
table is dependent on the transactional table.

Accumulating Snapshots: This type of fact table shows the activity of a process that
has a well-defined beginning and end.

Example: The processing of an order, where an order moves through specific steps until
it is completed.

As steps towards fulfilling the order are completed, the row associated with the order is
updated in the fact table. This type of table often has multiple date columns, each
representing the completion of a step in the process. Therefore, it is important to have an entry in
the date dimension that represents an unknown date, as many of the milestone completion
dates are unknown at the time the row is created.
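
The accumulating snapshot described above can be sketched as follows. This is a minimal illustration only: the milestone names, the date keys and the use of key 0 for the "unknown date" entry are assumptions made for the example.

```python
# Hypothetical date dimension; key 0 is the row that stands for "unknown date".
UNKNOWN_DATE_KEY = 0
date_dim = {
    UNKNOWN_DATE_KEY: "Unknown",
    20240105: "2024-01-05",
    20240107: "2024-01-07",
}

STEPS = ("ordered", "shipped", "delivered")  # assumed milestones of the order process

def new_order_row(order_id, ordered_date_key):
    """Create the fact row when the order is placed; later milestones are unknown."""
    row = {"order_id": order_id}
    for step in STEPS:
        row[f"{step}_date_key"] = UNKNOWN_DATE_KEY
    row["ordered_date_key"] = ordered_date_key
    return row

def complete_step(row, step, date_key):
    """Update the existing row in place when a milestone is reached."""
    if step not in STEPS:
        raise ValueError(f"unknown step: {step}")
    row[f"{step}_date_key"] = date_key

def describe(row):
    """Resolve every date key through the date dimension."""
    return {step: date_dim[row[f"{step}_date_key"]] for step in STEPS}

order = new_order_row(order_id=1, ordered_date_key=20240105)
before = describe(order)
complete_step(order, "shipped", 20240107)
after = describe(order)
print(before)
print(after)
```

Because every date column always points at some row of the date dimension, joins to that dimension never lose the order row, even while milestones are still open.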

Self Assessment

Fill in the blanks:

3. A fact table typically corresponds to an associative entity in the ..............................

4. Measures that can be added across only some dimensions are ..............................

5. .............................. take a picture of the moment, where the moment could be anything.

6. The .............................. type of fact table often has multiple date columns, each representing
the completion of a step in the process.

3.3 Dimension Tables

Dimension tables consist of attributes that describe fact records in the fact table. Some of these
attributes provide descriptive information; others are used to specify how fact table data should
be summarized to provide useful information to the person who is analysing the information.
Every dimension has a set of descriptive attributes. Dimension tables contain attributes that
describe business entities.

Example: The Client dimension can contain attributes like C_No., Area, State,
Country etc.

Did u know? In a dimensional table, columns can be used to categorize
the information into hierarchical levels.

For example, a dimension table for stores in the Standard-mart sample database includes the
following columns:

Table 3.1: Sample Dimension Table

Column          Description
store_country   Specifies the country in which the store is located. This is the country level of the hierarchy.
store_state     Specifies the state in which the store is located. This is the state level of the hierarchy.
store_city      Specifies the city or province in which the store is located. This is the city level of the hierarchy.
store_id        Specifies the individual store. This is the lowest level of the hierarchy. This field contains the
                primary key of the store dimension table and is used to join the dimension table to the fact table.
store_name      Specifies the name of the store. The values in this column are used to identify the store to users
                in a readable form.

Source: [Link]
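
To show how the hierarchy columns of Table 3.1 are used to summarize fact data, here is a minimal sketch. The store rows and sales amounts are invented for the illustration; only the column names come from the table above.

```python
from collections import defaultdict

# Hypothetical store dimension rows keyed by store_id (the primary key).
store_dim = {
    1: {"store_country": "USA", "store_state": "CA", "store_city": "Los Angeles", "store_name": "Store 1"},
    2: {"store_country": "USA", "store_state": "CA", "store_city": "San Francisco", "store_name": "Store 2"},
    3: {"store_country": "USA", "store_state": "WA", "store_city": "Seattle", "store_name": "Store 3"},
    4: {"store_country": "Canada", "store_state": "BC", "store_city": "Vancouver", "store_name": "Store 4"},
}

# Hypothetical fact rows: (store_id, sales_amount)
sales_facts = [(1, 10.0), (2, 20.0), (3, 5.0), (4, 7.0), (1, 3.0)]

def roll_up(levels):
    """Total the sales at the requested hierarchy levels, e.g. country or country + state."""
    totals = defaultdict(float)
    for store_id, amount in sales_facts:
        store = store_dim[store_id]          # join the fact row to its dimension row
        key = tuple(store[level] for level in levels)
        totals[key] += amount
    return dict(totals)

print(roll_up(["store_country"]))
print(roll_up(["store_country", "store_state"]))
```

The same fact rows produce a coarser or finer summary depending only on which dimension columns are chosen as the grouping key.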

Self Assessment

Fill in the blanks:

7. Dimension tables consist of attributes that describe ....................... in the fact table.
8. .......................... contain attributes that describe business entities.

3.4 Surrogate Keys and Alternative Table Structure

A surrogate key in a database is a unique identifier for either an entity in the modelled world or
an object in the database. The surrogate key is not derived from application data. Surrogate keys
are keys that are maintained within the data warehouse instead of keys taken from source data
systems.

Example: Say that for the employee ‘Emp1’ the business unit changes from ‘B1’ to ‘B2’. If
you use the natural primary key ‘Emp1’ for your employees within your data warehouse, then
everything would be allocated to business unit ‘B2’, even what actually belongs to ‘B1’.
If you use surrogate keys, you could create, on the day of the change, a new record for the employee
‘Emp1’ in your Employee dimension with a new surrogate key.

Figure 3.3: Surrogate Key Example

This way, in your fact table, the old data (i.e. from before the day you added the record) carries the
surrogate ID (SID) of the employee ‘Emp1’ >> ‘B1’. All new data (i.e. from after that day) takes the SID
of the employee ‘Emp1’ >> ‘B2’.
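
The employee example can be sketched in a few lines. This is a simplified illustration under stated assumptions: the function names, the in-memory dictionaries and the amounts are hypothetical, and a real warehouse would do the same work in its ETL process.

```python
import itertools

_next_sid = itertools.count(1)   # surrogate IDs are generated inside the warehouse
employee_dim = {}                # sid -> dimension row
current_sid = {}                 # natural key -> sid of the current version
fact_rows = []

def upsert_employee(natural_key, business_unit):
    """Add a new dimension row (with a new SID) whenever the business unit changes."""
    sid = current_sid.get(natural_key)
    if sid is not None and employee_dim[sid]["business_unit"] == business_unit:
        return sid
    sid = next(_next_sid)
    employee_dim[sid] = {"emp_id": natural_key, "business_unit": business_unit}
    current_sid[natural_key] = sid
    return sid

def record_fact(natural_key, amount):
    """Store the fact against the SID that is current at load time."""
    fact_rows.append({"employee_sid": current_sid[natural_key], "amount": amount})

def total_by_business_unit():
    totals = {}
    for fact in fact_rows:
        unit = employee_dim[fact["employee_sid"]]["business_unit"]
        totals[unit] = totals.get(unit, 0) + fact["amount"]
    return totals

upsert_employee("Emp1", "B1")
record_fact("Emp1", 100)          # old data stays linked to Emp1 >> B1
upsert_employee("Emp1", "B2")     # the business unit changes
record_fact("Emp1", 40)           # new data is linked to Emp1 >> B2
print(total_by_business_unit())
```

With the natural key alone, both facts would have been reported under the employee's latest business unit.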

3.4.1 Advantages of Surrogate Keys

Immutability: Surrogate keys do not change while the row exists. Thus applications cannot
lose their reference to a row in the database.

Change in Requirements: Attributes that uniquely identify an entity might change over
time, which might invalidate the suitability of natural or compound keys.

Example: An employee’s network username is chosen as a natural key. If the company merges
with another company, new employees must be inserted. Some of the new user names
may then conflict, because the two sets of user names were created independently.

In these cases, a new attribute usually has to be added to the natural key (for example, an
old_company column). In the case of a surrogate key, only the table that defines the
surrogate key must be altered. But in the case of natural keys, all tables that use the natural
key will have to change.

Performance: Surrogate keys tend to be a compact data type, such as a four-byte integer.
This allows the database to query the single key column faster than it could multiple
columns.

Uniformity: When every table has a uniform surrogate key, some tasks can be easily
automated by writing the code in a table-independent way.

Validation: It is possible to design key values that follow a well-known
pattern which can be automatically verified.

Example: The keys that are intended to be used in some column of some table might be
designed to “look different from” those that are intended to be used in another column or
table, thereby simplifying the detection of application errors in which the keys have been
misplaced.

3.4.2 Disadvantages of Surrogate Keys

Surrogate keys also come with some disadvantages. The values of surrogate keys have no
relationship with the real-world meaning of the data held in a row. Therefore overuse of
surrogate keys leads to the problem of disassociation and creates unnecessary ETL burden and
performance degradation.

Query optimization also becomes difficult when the surrogate key is disassociated from the
natural key. When the surrogate key takes the place of the primary key, the unique index is
applied on that column. A query that filters on the natural key can then lead to a full table scan, because
it cannot take advantage of the unique index on the surrogate key.

Referential Integrity: Referential integrity must be maintained between all dimension
tables and the fact table. Each fact record contains foreign keys which are related to primary
keys in the dimension tables.


Caution Every fact record must have a related record in every dimension table used with
that particular fact table.

Shared Dimensions: To maintain consistency, shared dimension tables are created.
These tables are used by all components and data marts in the data warehouse.
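
The referential integrity rule stated above can be enforced by the database itself. The sketch below uses Python's built-in sqlite3 module as one possible way to do this; the table and column names are assumptions made for the example. Note that SQLite only checks foreign keys after `PRAGMA foreign_keys = ON` has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE dim_store (store_key INTEGER PRIMARY KEY, store_name TEXT);
CREATE TABLE fact_sales (
    store_key    INTEGER NOT NULL REFERENCES dim_store(store_key),
    sales_amount REAL
);
INSERT INTO dim_store VALUES (1, 'Store 1');
""")

def insert_fact(store_key, amount):
    """Return True if the fact row was stored, False if it has no matching dimension row."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO fact_sales VALUES (?, ?)", (store_key, amount))
        return True
    except sqlite3.IntegrityError:
        return False

valid_inserted = insert_fact(1, 10.0)    # store 1 exists in the dimension
orphan_inserted = insert_fact(99, 5.0)   # store 99 does not exist
fact_count = conn.execute("SELECT COUNT(*) FROM fact_sales").fetchone()[0]
print(valid_inserted, orphan_inserted, fact_count)
```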

3.4.3 Alternative Tables used in Data Warehousing

Auxiliary Table

An auxiliary table is created with the SQL statement CREATE AUXILIARY TABLE and is used to hold
the data for a column that is defined in a base table.

Base Table

The most common type of table is the base table. You can create a base table with the SQL CREATE
TABLE statement. All programs and users that refer to this type of table refer to the same
description of the table and to the same instance of the table.

Clone Table

A table that is structurally identical to a base table is known as a clone table. You can create a clone
table by using an ALTER TABLE statement for the base table that includes an ADD CLONE
clause.

Example: In the DB2 catalogue, [Link] indicates that a clone table exists.

Empty Table

A table with zero rows is an empty table.

History Table

A history table is used by the database to store historical versions of rows from the associated
system-period temporal table.

Materialized Query Table

Materialized query tables are useful for complex queries that run on large amounts of data.
They are commonly used in data warehousing and business intelligence applications.

Result Table

A table that contains a set of rows that a database selects or generates, directly or indirectly, from
one or more base tables in response to an SQL statement is known as a result table. A result table
is not an object that you can define using a CREATE statement.

Temporal Table

A temporal table is a table that records the period of time when a row is valid.

Temporary Table

A table that is defined by the SQL statement CREATE GLOBAL TEMPORARY TABLE or DECLARE
GLOBAL TEMPORARY TABLE is a temporary table. It is used to hold data temporarily.

XML Table

It is a special table that holds only XML data. When you create a table with an XML column,
the database implicitly creates an XML table space and an XML table to store the XML data.

Self Assessment

State whether the following statements are true or false:

9. The surrogate key is derived from application data.

10. Surrogate keys change while the row exists.

11. In the case of natural keys, all tables that use the natural key will have to change.

12. When every table has a uniform surrogate key, some tasks can be easily automated by
composing the code in a table-independent way.
13. The values of surrogate keys have a relationship with the real-world meaning of the data
held in a row.
14. The most common type of table is base table.

15. A table with zero rows is an empty table.

16. A temporal table is a table that records the period of time when a row is valid.
17. XML table is used to hold data temporarily.

3.5 Multidimensional OLAP

OLAP stands for On-Line Analytical Processing. In computing, OLAP is an approach to answering
Multi-Dimensional Analytical (MDA) queries swiftly. OLAP is part of the broader category of
business intelligence, which also includes relational databases, report writing and data mining.
Depending on the underlying technology used, OLAP can be broadly divided into MOLAP and
ROLAP.

Did u know? In the OLAP world, there are mainly two different types: Multidimensional
OLAP (MOLAP) and Relational OLAP (ROLAP). Hybrid OLAP (HOLAP) is a combination of
MOLAP and ROLAP.

3.5.1 MOLAP

In MOLAP, data is stored in a multidimensional cube. It fulfils the requirements of an analytic
application in which you need to access only summary-level data. The storage is not in the
relational database, but in proprietary formats. Figure 3.4 shows physical multi-dimensional
cubes.


Figure 3.4: Physical Multi-dimensional Cubes

This method stores the data in multi-dimensional arrays, which differ from the two-dimensional
relational structure.

Advantages:
MOLAP cubes are built for fast data retrieval and are thus optimal for slicing and dicing operations.
MOLAP can perform complex calculations quickly.
Disadvantages:
MOLAP is limited in the amount of data it can handle, because all the calculations are
performed when the cube is built.
Cube technology generally does not already exist in the organization; therefore, adopting
MOLAP technology is likely to require additional investment in the form of human and capital
resources.
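
The trade-off described above (fast retrieval, but all calculations done when the cube is built) can be sketched with a toy cube. This is only an illustration in plain Python, not how any MOLAP product stores its data; the fact rows and the `ALL` marker are assumptions made for the example.

```python
from itertools import product

ALL = "*"  # hypothetical marker meaning "every member of this dimension"

# Hypothetical detail rows: (product, region, quarter, sales_amount)
facts = [
    ("Pen", "North", "Q1", 100.0),
    ("Pen", "South", "Q1", 80.0),
    ("Pen", "North", "Q2", 150.0),
    ("Notebook", "North", "Q1", 40.0),
]

def build_cube(rows):
    """Pre-compute the total for every combination of members and ALL markers."""
    cube = {}
    for prod, region, quarter, amount in rows:
        # Each detail row contributes to 2 x 2 x 2 = 8 summary cells.
        for cell in product((prod, ALL), (region, ALL), (quarter, ALL)):
            cube[cell] = cube.get(cell, 0.0) + amount
    return cube

def lookup(cube, product_name=ALL, region=ALL, quarter=ALL):
    """Answering a query is a single dictionary lookup; nothing is computed here."""
    return cube.get((product_name, region, quarter), 0.0)

cube = build_cube(facts)
print(lookup(cube, product_name="Pen"))
print(lookup(cube, region="North", quarter="Q1"))
print(lookup(cube))
```

The number of stored cells grows with every dimension and member that is added, which is one way to see why the amount of data a cube can handle is limited.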

3.5.2 ROLAP

This methodology depends on manipulating the data stored in the relational database. Detail-level
values are kept in the relational data warehouse.

Advantages:
ROLAP can handle large amounts of data.
ROLAP can leverage functionalities inherent in the relational database, as it sits on top of
the relational database.
Disadvantages:
Performance can be slow. Because a ROLAP report is essentially an
SQL query on the relational database, the query time can be long if the underlying data
size is large.
ROLAP can be limited by SQL functionality. ROLAP technology relies mainly on
SQL statements, and SQL statements do not fit all needs (for example, complex calculations are
difficult to express in SQL), so what ROLAP can do is traditionally limited by what SQL can do.
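
Since a ROLAP report is essentially an SQL query, a slice-and-dice request can be sketched as a function that generates a GROUP BY statement. The example below is a minimal illustration using Python's built-in sqlite3 module; the `sales` table, its columns and the `rolap_query` helper are hypothetical names introduced for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (product TEXT, region TEXT, quarter TEXT, amount REAL);
INSERT INTO sales VALUES
    ('Pen', 'North', 'Q1', 100.0),
    ('Pen', 'South', 'Q1', 80.0),
    ('Pen', 'North', 'Q2', 150.0),
    ('Notebook', 'North', 'Q1', 40.0);
""")

DIMENSIONS = ("product", "region", "quarter")

def rolap_query(group_by, filters=None):
    """Translate a slice-and-dice request into SQL and run it on the detail rows."""
    filters = filters or {}
    if not group_by:
        raise ValueError("at least one grouping dimension is required")
    # Only known column names are placed in the SQL text; values are bound as parameters.
    for name in list(group_by) + list(filters):
        if name not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {name}")
    columns = ", ".join(group_by)
    sql = f"SELECT {columns}, SUM(amount) FROM sales"
    if filters:
        sql += " WHERE " + " AND ".join(f"{name} = ?" for name in filters)
    sql += f" GROUP BY {columns} ORDER BY {columns}"
    return conn.execute(sql, tuple(filters.values())).fetchall()

print(rolap_query(["region"]))
print(rolap_query(["product"], {"quarter": "Q1"}))
```

Every request scans the detail rows at query time, which is why response time depends on the size of the underlying data.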

3.5.3 HOLAP

HOLAP technologies combine the advantages of MOLAP and ROLAP. The first product to provide
HOLAP storage was Holos, but over time the technology also became available in other
commercial products such as Microsoft Analysis Services (MAS), Oracle Database OLAP Option,
MicroStrategy etc.

Task: Compare and contrast MOLAP and ROLAP.

Self Assessment

Fill in the blanks:

18. OLAP stands for .......................................
19. OLAP can be broadly divided into ....................... and ..................................
20. In MOLAP, data is stored in a ..............................
21. ............................ can leverage functionalities inherent in the relational database as they sit
on top of the relational database.
22. .................................... technologies combine the advantages of MOLAP and ROLAP.

Case Study Lolopop: Automated Data Warehouse

The essential concept of a data warehouse is to provide the ability to gather data into
optimized databases without regard for the generating applications or platforms.
Data warehousing can be formally defined as “the coordinated, architected, and
periodic copying of data from various sources into an environment optimized for analytical
and informational processing”.

The Challenge

Meaningful analysis of data requires us to unite information from many sources in many
forms, including: images; text; audio/video recordings; databases; forms, etc. The
information sources may never have been intended to be used for data analysis purposes.
These sources may have different formats, contain inaccurate or outdated information, be
of low transcription quality, be mislabelled or be incompatible.

New sources of information may be needed periodically and some elements of information
may be one time only artefacts.
A data warehouse system designed for analysis must be capable of assimilating these data
elements from many disparate sources into a common form. Correctly labelling and
describing search keys and transcribing data in a form for analysis is critical. Qualifying
the accuracy of the data against its original source of authority is imperative. Any such
system must also be able to: apply policy and procedure for comparing information from
multiple sources to select the most accurate source for a data element; correct data elements
as needed; and check inconsistencies amongst the data. It must accomplish this while
maintaining a complete data history of every element before and after every change with
attribution of the change to person, time and place. It must be possible to apply policy or
procedure within specific periods of time by processing date or event data to assure
comparability of data within a calendar or a processing time horizon. When data originates
from a source where different policies and procedures are applied, it must be possible to
reapply new policies and procedures. Where quality of transcription is low, qualifying the
data through verification or sampling against original source documents and media is
required. Finally, it must be possible to recreate the exact state of all data at any date by
processing time horizon or by event horizon.

The analytical system applied to a data warehouse must be applicable to all data and
combinations of data. It must take into account whether sufficient data exists at the necessary
quality level to make conclusions at the desired significance level. Where possible it must
facilitate remediation of data from original primary source(s) of authority.
When new data is acquired from new sources, it must be possible to input and register the
data automatically. Processing must be flexible enough to process these new sources
according to their own unique requirements and yet consistently apply policy and
procedure so that data from new sources is comparable to existing data.
When decisions are made to change the way data is processed, edited, or how policy and
procedure is applied, it must be possible to exactly determine the point in time that this
change was made. It must be possible to apply old policies and procedures for comparison
to old analyses, and new policy and procedure for new analyses.

Defining Data Warehouse Issues

The Lolopop partners served as principals in a data warehouse effort with objectives that
are shared by most users of data warehouses. During business analysis and requirements
gathering phase, we found that high quality was cited as the number one objective. Many
other objectives were actually quality objectives, as well. Based on our experiences, Lolopop
defines the generalized objectives in order of importance as:

Quality information to create data and/or combine with other data sources

In this case, only about one in eight events could be used for analysis across databases.
Stakeholders said that reporting of the same data from the same incoming information
varied wildly when re-reported at a later date or when it came from another organization’s
analysis of the same data. Frequently the data in computer databases was demonstrably
not contained in the original documents from which they were transcribed. Conflicting
applications of policy and procedure by departments with different objectives, prejudices
and perspectives were applied inconsistently without recording the changes or their
sources, leaving the data for any given event a slave to who last interpreted it.

Timely response to requests for data

Here, the data was processed in time period batches. In some instances, it could take up to
four years to finalize a data period. Organizations requiring data for analysis simply went
to the reporting source and got their own copies for analysis, entirely bypassing the
official data warehouse and analytical sources.

Consistent relating of information

An issue as simple as a name — the information that could be used to connect data events
to histories for individuals or other uniting objects — had no consistent method to
standardize or simplify naming conventions. Another example, Geographical Information
System (GIS) location information had an extravagant infrastructure that was constantly
changing. This made comparisons of data from two different time periods extremely
difficult.

Easy access to information

Often data warehouse technologies assume or demand a sophisticated understanding of
relational databases and statistical analysis. This prevents ordinary stakeholders from
using data effectively and with confidence. In some instances, the personnel responsible
for analysis lack the professional and technical skills to develop effective solutions. This
issue can restrict reporting to a few kinds of reports and variants that have been
programmed over time, and reduces data selection for the analyses to a kind of magic
applied by clerical personnel responsible for generating reports.

Unleash management to formulate and uniformly apply policy and procedure

We found that management decisions and mandates could be hindered by an inability to
effectively capture, store, retrieve and analyse data.

In this particular instance, no management controls existed to analyse: source of low
quality; work rates; work effort to remediate (or even a concept of remediation);
effectiveness of procedures; effectiveness of work effort; etc.

Remediation is a good case in point. Management experienced difficulty with the concept
of remedying data transcription from past paper forms — even though the forms existed
in images that could be automatically routed. The perception was that quantity of data,
not quality, was the objective and that no one would ever attempt to fix data by verifying
it or comparing it to original documents.

Manage incoming data from non-integrated sources

Data from multiple, unrelated sources requires a plan to convert electronic data, manage
imaging and documents inputs, manage workflow and manage the analysis of data. In
this case, every interface required manual intervention. Since there was no system
awareness at the beginning of the capture process as to what was needed for analysis at the
end, it was very difficult to make rapid and time effective changes to accommodate changing
stakeholder needs.

Reproducible Reporting Results

We found that reporting of data was not reproducible and the reasons for differences in
reporting were not retrievable, undermining confidence in the data, analysis and reporting.
One may essentially summarize these objectives as quality challenges that require a basic
systems engineering approach for resolution.

Questions:

1. What were the challenges of the Lolopop automated data warehouse?

2. What were the data warehouse issues?

3.6 Summary

A dimensional model comprises a fact table and several dimension tables and is
used for analysing summarized data.
A fact table generally represents a process or reporting environment that is of value to the
organization.
A fact table typically corresponds to an associative entity in the E-R model.
The types of measure in a fact table are: additive, semi-additive and non-additive.
There are basically three types of fact tables: transactional, periodic snapshots and
accumulating snapshots.
Dimension tables consist of attributes that describe fact records in the fact table.
A surrogate key in a database is a unique identifier for either an entity in the modelled
world or an object in the database.
Attributes that uniquely identify an entity might change over time, which might
invalidate the suitability of natural or compound keys.
Surrogate keys also come with some disadvantages. The values of surrogate keys have
no relationship with the real-world meaning of the data held in a row.
Referential integrity must be maintained between all dimension tables and the fact table.
The most common type of table is the base table. You can create a base table with the SQL
CREATE TABLE statement.
A table that contains a set of rows that a database selects or generates, directly or indirectly,
from one or more base tables in response to an SQL statement is known as a result table.
OLAP stands for On-Line Analytical Processing. In computing, OLAP is an approach to
answering multi-dimensional analytical (MDA) queries swiftly.
In MOLAP, data is stored in a multidimensional cube. It fulfils the requirements of an
analytic application in which you need to access only summary-level data.
HOLAP technologies combine the advantages of MOLAP and ROLAP.

3.7 Keywords

Accumulating Snapshots: In this type of fact table the activity of a process is shown such that it
has a well-defined beginning and end.
Auxiliary Table: This table is created with the SQL statement CREATE AUXILIARY TABLE and
is used to hold the data for a column that is defined in a base table.
Dimension Tables: Dimension tables consist of attributes that describe fact records in the fact
table.
Dimensional Model: Dimensional Modelling (DM) is the name of a set of techniques and concepts
used in data warehouse design. It is considered to be different from entity-relationship modelling
(ER).
Empty Table: A table with zero rows is an empty table.
E-R Model: In software engineering, an Entity-relationship model (ER model) is a data model
for describing a database in an abstract way.

Fact Table: A fact table generally represents a process or reporting environment that is of value to
the organization.
HOLAP: HOLAP (Hybrid Online Analytical Processing) is a combination of ROLAP (Relational
OLAP) and MOLAP (Multidimensional OLAP), which are other possible implementations of
OLAP.
Multidimensional Online Analytical Processing (MOLAP): This is the more traditional way of
OLAP analysis. In MOLAP, data is stored in a multidimensional cube. The storage is not in the
relational database, but in proprietary formats.
Result Table: A table that contains a set of rows that a database selects or generates, directly or
indirectly, from one or more base tables in response to an SQL statement is known as result
table.
ROLAP: This methodology relies on manipulating the data stored in the relational database to
give the appearance of traditional OLAP's slicing and dicing functionality.
Surrogate Key: A surrogate key in a database is a unique identifier for either an entity in the
modelled world or an object in the database.
Temporal Table: A temporal table is a table that records the period of time when a row is valid.
Transactional Table: The grain associated with a transactional fact table is usually specified as
one row per line in a transaction.
XML Table: It is a special table that holds only XML data.

3.8 Review Questions

1. What is a dimension?
2. Explain the dimensional model.
3. “Dimensions define the axis of enquiry of a fact.” Elucidate.
4. Give examples of dimensional models.
5. What do you understand by a fact table?
6. Explain the types of measure and fact table.
7. “Dimension tables consist of attributes that describe fact records in the fact table”. Discuss.
8. Define the concept of a surrogate key. Also write down its advantages and disadvantages.
9. What do you understand by alternative tables used in data warehousing?
10. Briefly explain multidimensional OLAP.

Answers: Self Assessment

1. Fact table
2. Dimensions
3. E-R model
4. Semi additive
5. Periodic snapshots
6. Accumulating snapshots
7. Fact records
8. Dimension tables
9. False
10. False
11. True
12. True
13. False
14. True
15. True
16. True
17. False
18. On-Line Analytical Processing
19. MOLAP, ROLAP
20. Multidimensional cube
21. ROLAP
22. HOLAP

3.9 Further Readings

Books

Carlo Vercellis (2011). “Business Intelligence: Data Mining and Optimization for Decision Making”. John Wiley & Sons.
David Loshin (2012). “Business Intelligence: The Savvy Manager’s Guide”. Newnes.
Elizabeth Vitt, Michael Luckevich, Stacia Misner (2010). “Business Intelligence”. O’Reilly Media, Inc.
Rajiv Sabherwal, Irma Becerra-Fernandez (2010). “Business Intelligence”. John Wiley & Sons.
Swain Scheps (2013). “Business Intelligence for Dummies”. Wiley.

Online links

[Link]
[Link]/en-us/library/aa905979(v=sql.80).aspx?
[Link]/view/757?
[Link]/99papers/[Link]?

42
Business Intelligence

