CHAPTER 2.
DIMENSIONAL
DESIGN PROCESS
FOUR STEP DIMENSIONAL PROCESS
1. Select the business process to model
• Performance measurements result from business measurement processes
• By focusing on business processes rather than on business departments,
consistent information can be delivered more economically throughout the
organization: the data only needs to be published once into the data mart,
which avoids duplicate data with different labels and terminology
• The best way to select a business process is to listen to business users'
needs and to understand the business requirements in light of the available data
2. Declare the grain of the business process
• Granularity represents the degree of detail associated with the fact table
measurements. The most atomic grain captures the most detailed information
that can be reached by drilling down, and it gives the most flexible
slice-and-dice analysis
• The less granular the model, the more vulnerable it is to unexpected user
requests to drill down into the details, which hinders the analysis users need
• The grain statement can be reached through discussion with business users and
an understanding of the analyses they might perform
• Examples of grain: time period by year, month, day, hour; product by brand,
category, item
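The trade-off between an atomic and a coarser grain can be sketched in Python. The rows, column names, and the `roll_up` helper below are hypothetical, invented only for illustration. Rows kept at the atomic grain (one row per day and item) can be summed to any coarser grain, whereas rows stored only at month and category grain could not be drilled back down to items.

```python
from collections import defaultdict

# Hypothetical atomic-grain rows: one row per day per item.
atomic_sales = [
    {"date": "2024-01-05", "category": "Snacks", "item": "Chips", "quantity": 3},
    {"date": "2024-01-05", "category": "Snacks", "item": "Nuts", "quantity": 1},
    {"date": "2024-01-20", "category": "Snacks", "item": "Chips", "quantity": 2},
    {"date": "2024-02-02", "category": "Drinks", "item": "Cola", "quantity": 4},
]

def roll_up(rows, key_func):
    """Sum quantity over a coarser grain derived from the atomic rows."""
    totals = defaultdict(int)
    for row in rows:
        totals[key_func(row)] += row["quantity"]
    return dict(totals)

# Month + category grain, derived from the atomic rows.
by_month_category = roll_up(atomic_sales, lambda r: (r["date"][:7], r["category"]))

# Item grain is still available because the detail was kept.
by_item = roll_up(atomic_sales, lambda r: r["item"])
```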
3. Choose the Dimensions
• The grain statement determines the primary dimensionality of the fact table
• It is possible to add more dimensions to the fact table, provided they
naturally take on only one value under each combination of the primary dimensions
• If an additional dimension violates the grain by causing additional rows to be
generated, then the grain statement must be revised to accommodate this
dimension
• Common dimensions include date, product, customer, and transaction type
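The "only one value under each combination of the primary dimensions" rule can be checked mechanically. The following is a minimal sketch, assuming hypothetical source rows and a made-up helper named `violates_grain`; it is not a prescribed method, just one way to test a candidate dimension against a declared grain.

```python
from collections import defaultdict

def violates_grain(rows, primary_dims, candidate_dim):
    """Return the primary-dimension combinations for which the candidate
    dimension takes more than one value (an empty list means it fits)."""
    seen = defaultdict(set)
    for row in rows:
        key = tuple(row[d] for d in primary_dims)
        seen[key].add(row[candidate_dim])
    return [key for key, values in seen.items() if len(values) > 1]

# Hypothetical source rows; the declared grain is one row per date, product and store.
rows = [
    {"date": "2024-01-05", "product": "Chips", "store": "S1", "promotion": "None", "cashier": "Ann"},
    {"date": "2024-01-05", "product": "Chips", "store": "S1", "promotion": "None", "cashier": "Bob"},
    {"date": "2024-01-05", "product": "Cola", "store": "S1", "promotion": "Flyer", "cashier": "Ann"},
]
primary = ["date", "product", "store"]

promotion_conflicts = violates_grain(rows, primary, "promotion")  # fits the grain
cashier_conflicts = violates_grain(rows, primary, "cashier")      # would add rows
```

In this made-up data, promotion fits the grain, while cashier would force extra rows, so the grain statement would have to be revised before cashier could be added.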
4. Identify the Facts
• The grain statement helps determine the measurements that appear in
the fact table; the measurements must be true to the grain
• A fact table cannot have multiple granularities, which means that
different grain statements lead to different fact tables
• Calculated facts should be stored physically in the database
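As a small illustration of storing a calculated fact, the sketch below computes a value once at load time and keeps it in the row next to the facts it is derived from. The choice of gross profit (sales amount minus cost amount) and all column names are assumptions made for the example; the text above does not name a specific calculated fact.

```python
# Hypothetical fact rows at the grain of one row per POS transaction line.
fact_rows = [
    {"product_key": 1, "sales_amount": 10.0, "cost_amount": 6.0},
    {"product_key": 2, "sales_amount": 4.0, "cost_amount": 3.0},
]

def add_gross_profit(row):
    """Compute the calculated fact once at load time and store it in the row."""
    stored = dict(row)
    stored["gross_profit"] = row["sales_amount"] - row["cost_amount"]
    return stored

loaded = [add_gross_profit(r) for r in fact_rows]

# Every query now sums the stored column instead of repeating the formula.
total_gross_profit = sum(r["gross_profit"] for r in loaded)
```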
DEGENERATE KEY
• A dimension key in the fact table with no corresponding dimension table
• A degenerate key is usually a natural key; it typically occurs when we have
a fast-growing fact table
• Used for grouping data in the fact table and provides business meaning
• Example of a degenerate key: Point of Sale (POS) transaction number
[Figure: Example of Retail Sales Schema. Legend: PK = Primary Key, FK = Foreign Key, DD = Degenerate Key]
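The grouping role of a degenerate key can be sketched as follows. The fact rows and column names are hypothetical; the point is only that the POS transaction number lives in the fact table itself and still lets us reassemble each basket.

```python
from collections import defaultdict

# Hypothetical fact rows: the POS transaction number sits in the fact table
# itself and has no dimension table behind it.
sales_fact = [
    {"pos_txn_number": "T1001", "product_key": 1, "sales_amount": 5.0},
    {"pos_txn_number": "T1001", "product_key": 2, "sales_amount": 2.5},
    {"pos_txn_number": "T1002", "product_key": 1, "sales_amount": 5.0},
]

basket_totals = defaultdict(float)   # transaction number -> total amount
basket_lines = defaultdict(int)      # transaction number -> number of line items
for row in sales_fact:
    basket_totals[row["pos_txn_number"]] += row["sales_amount"]
    basket_lines[row["pos_txn_number"]] += 1
```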
FACTLESS FACT TABLE
• A fact table without measurement metrics
• Provides additional information about what did not happen in another
fact table
• A factless fact table has a different grain than the fact table it complements
• Example: a factless fact table covering promoted items makes it possible to
find promoted items that did not sell, because the actual sales fact table
contains only the items that were sold
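The "what did not happen" question can be answered with a set difference between the factless table and the sales fact table. This is a minimal sketch with hypothetical table contents and names; a real warehouse would express the same idea as a query over the two tables.

```python
# Hypothetical factless fact table: one row per (date, product, promotion)
# that was on promotion. It holds keys only, no measurements.
promotion_coverage = {
    ("2024-01-05", "Chips", "Flyer"),
    ("2024-01-05", "Cola", "Flyer"),
    ("2024-01-05", "Nuts", "Flyer"),
}

# Hypothetical sales fact rows: only items that actually sold appear here.
sales_fact = [
    {"date": "2024-01-05", "product": "Chips", "promotion": "Flyer", "quantity": 3},
    {"date": "2024-01-05", "product": "Cola", "promotion": "Flyer", "quantity": 1},
]

sold = {(r["date"], r["product"], r["promotion"]) for r in sales_fact}

# Promoted items with no matching sales row.
promoted_but_not_sold = sorted(promotion_coverage - sold)
```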
SOME CONCERNS TO ADDRESS
SNOWFLAKING
• The multitude of snowflaked tables makes for a much more complex
presentation. Simplicity is one of the primary objectives of a
denormalized dimensional model
• Slows down the performance of database optimizers
• Slows down the user's ability to browse within a dimension
• Penalizes cross-attribute browsing and prevents the use of bitmap
indexes
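To make the contrast concrete, the sketch below flattens a small snowflaked hierarchy into one denormalized dimension. The three lookup tables, their column names, and the `flatten_product_dimension` helper are hypothetical. After flattening, browsing across attributes is a scan of a single table rather than a chain of joins.

```python
# Hypothetical snowflaked tables: product -> category -> department.
department = {10: {"department_name": "Grocery"}}
category = {100: {"category_name": "Snacks", "department_id": 10}}
product = {
    1: {"product_name": "Chips", "category_id": 100},
    2: {"product_name": "Nuts", "category_id": 100},
}

def flatten_product_dimension(product, category, department):
    """Denormalize the hierarchy into one wide row per product."""
    flat = {}
    for product_key, p in product.items():
        c = category[p["category_id"]]
        d = department[c["department_id"]]
        flat[product_key] = {
            "product_name": p["product_name"],
            "category_name": c["category_name"],
            "department_name": d["department_name"],
        }
    return flat

product_dim = flatten_product_dimension(product, category, department)

# Cross-attribute browsing touches one table, with no joins.
snack_products = [r["product_name"] for r in product_dim.values()
                  if r["category_name"] == "Snacks"]
```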
CENTIPEDE DESIGN
• A centipede design is a dimensional schema in which a fact table has too
many dimension tables
• Most business processes can be represented with about 15 dimension tables
or fewer. If your design has more than that, consider ways to combine
correlated dimensions into a single dimension
[Figure: Example of a centipede fact table]
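One way to combine correlated dimensions is to build a single dimension from the attribute combinations that actually occur and to keep only one key in the fact table. The sketch below assumes hypothetical fact rows that carry product, brand, and category separately; the names and the key numbering scheme are illustrative only.

```python
# Hypothetical fact rows carrying three correlated attributes (a centipede symptom).
fact_rows = [
    {"product": "Chips", "brand": "Crunchy", "category": "Snacks", "quantity": 3},
    {"product": "Nuts", "brand": "Crunchy", "category": "Snacks", "quantity": 1},
    {"product": "Chips", "brand": "Crunchy", "category": "Snacks", "quantity": 2},
]

correlated = ("product", "brand", "category")
combined_dim = {}   # attribute combination -> single dimension key
slim_fact = []      # fact rows with one key instead of three

for row in fact_rows:
    combo = tuple(row[a] for a in correlated)
    # Reuse the key if this combination was seen before, else issue the next integer.
    key = combined_dim.setdefault(combo, len(combined_dim) + 1)
    slim_fact.append({"product_key": key, "quantity": row["quantity"]})
```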
SURROGATE KEYS
• Surrogate keys are integers that are assigned sequentially as needed to populate a
dimension
• Their purpose is to ease the joining of dimension tables to the fact table
• Surrogate keys buffer the data warehouse environment from operational changes
• Surrogate keys provide a mechanism to differentiate between historical operational codes that
are being retained and operational codes that have been reassigned after a period of
dormancy
• Surrogate keys allow integration of data from multiple operational source systems even if they
lack consistent source keys, especially in the case of an acquisition or consolidation of data
• Surrogate keys are needed to support handling changes to dimension table attributes
• If the degenerate dimensions are not unique, consider assigning
surrogate keys. For example, if the POS system does not assign unique
transaction numbers across stores, then assigning surrogate keys is a must.
Once that is done, however, they are no longer degenerate dimensions
• It is not advisable to generate surrogate keys by simply gluing together
several natural keys or by combining a natural key with a timestamp
• There are different ways of generating surrogate keys. Whichever one you
choose, document the method and use it consistently across the whole data
warehouse
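The surrogate key points above can be sketched with a small in-memory key assigner. The class name `SurrogateKeyAssigner`, its `upsert` method, and the source system names are hypothetical; this is one possible approach, not a prescribed one. It hands out meaningless sequential integers, issues a new key when a row's attributes change, keeps identical codes from different source systems apart, and resolves POS transaction numbers that repeat across stores through a lookup rather than by concatenating natural keys.

```python
import itertools

class SurrogateKeyAssigner:
    """Hands out sequential integer keys for dimension rows (hypothetical helper)."""

    def __init__(self):
        self._next_key = itertools.count(1)
        self.rows = {}       # surrogate key -> dimension row
        self._current = {}   # (source_system, natural_key) -> current surrogate key

    def upsert(self, source_system, natural_key, attributes):
        """Return the surrogate key for this source row, issuing a new key
        when the row is new or its attributes have changed."""
        lookup = (source_system, natural_key)
        key = self._current.get(lookup)
        if key is not None and self.rows[key]["attributes"] == attributes:
            return key
        key = next(self._next_key)
        self.rows[key] = {"source_system": source_system,
                          "natural_key": natural_key,
                          "attributes": dict(attributes)}
        self._current[lookup] = key
        return key

product_keys = SurrogateKeyAssigner()
k1 = product_keys.upsert("erp", "P-100", {"name": "Chips", "size": "100g"})
k2 = product_keys.upsert("erp", "P-100", {"name": "Chips", "size": "100g"})  # unchanged
k3 = product_keys.upsert("erp", "P-100", {"name": "Chips", "size": "120g"})  # attribute changed
k4 = product_keys.upsert("acquired_co", "P-100", {"name": "Cola", "size": "330ml"})  # same code, other system

# POS transaction numbers that restart in every store: the lookup uses
# (store, transaction number), but the key handed out is a plain integer.
txn_keys = SurrogateKeyAssigner()
t1 = txn_keys.upsert("pos", ("S1", 1), {})
t2 = txn_keys.upsert("pos", ("S2", 1), {})
t3 = txn_keys.upsert("pos", ("S1", 1), {})
```

Fact rows loaded with the earlier key keep pointing at the older version of the dimension row, which is what allows history to be preserved when an attribute changes.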
MARKET BASKET ANALYSIS
• An example of an applied grain statement is a market basket fact
table, which provides affinity grouping data (meaningful combinations
of data that complement one another)
• The grain statement is produced by pruning combinations of data from
the highest to the lowest level of the hierarchy
[Figure: Example of a market basket (affinity grouping) fact table transformed from the purchase transactions fact table]
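A rough sketch of how a market basket table could be derived from purchase transactions is shown below. The transaction rows and names are hypothetical, and the sketch only counts product pairs at the item level; it does not implement the pruning across hierarchy levels described above.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical purchase transaction fact rows (one row per product per ticket).
purchases = [
    {"txn": "T1", "product": "Chips"},
    {"txn": "T1", "product": "Cola"},
    {"txn": "T2", "product": "Chips"},
    {"txn": "T2", "product": "Cola"},
    {"txn": "T2", "product": "Nuts"},
    {"txn": "T3", "product": "Nuts"},
]

# Reassemble each basket from the transaction rows.
baskets = defaultdict(set)
for row in purchases:
    baskets[row["txn"]].add(row["product"])

# One market basket row per product pair, counting the baskets that contain both.
pair_counts = Counter()
for products in baskets.values():
    for pair in combinations(sorted(products), 2):
        pair_counts[pair] += 1
```

Sorting each basket before pairing keeps (Chips, Cola) and (Cola, Chips) from being counted as two different combinations.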