Data Warehouse: Meaning, Features, Applications, Architecture, Functions, Terminology
The term "Data Warehouse" was first coined by Bill Inmon in 1990. According to Inmon, a
data warehouse is a subject-oriented, integrated, time-variant, and non-volatile collection
of data.
This data helps analysts make informed decisions in an organization. An operational
database changes frequently, often many times a day, because of the transactions
that take place. Suppose a business executive wants to analyse earlier feedback on
a product, a supplier, or a consumer. The operational database alone cannot support
that analysis, because the earlier data has been overwritten by later transactions.
A data warehouse keeps that history.
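The difference between an operational store and a warehouse can be sketched in a few lines of Python. This is only an illustration: the product ID, prices, dates, and the record_price helper are all made up, and a real warehouse would use a database rather than in-memory structures.

```python
from datetime import date

# Operational store: one row per product, overwritten by each transaction.
operational = {"P100": {"price": 20.0}}

# Warehouse: append-only history, every row stamped with its effective date.
warehouse = []

def record_price(product_id, price, as_of):
    """Apply a price change to both stores (hypothetical helper)."""
    operational[product_id] = {"price": price}   # the old value is lost here
    warehouse.append({"product_id": product_id,  # old rows are kept here
                      "price": price,
                      "as_of": as_of})

record_price("P100", 20.0, date(2023, 1, 1))
record_price("P100", 22.5, date(2023, 6, 1))
record_price("P100", 19.0, date(2024, 1, 1))

# The operational store only knows the latest value.
current_price = operational["P100"]["price"]

# The warehouse can still answer questions about the past.
price_history = [(r["as_of"].isoformat(), r["price"])
                 for r in warehouse if r["product_id"] == "P100"]

print(current_price)
print(price_history)
```

The append-only list mirrors the non-volatile and time-variant properties from Inmon's definition: rows are added with a date, never updated in place.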
A data warehouse provides generalized and consolidated data in a multidimensional
view. Along with this view, a data warehouse also provides Online Analytical
Processing (OLAP) tools. These tools support interactive and effective analysis of
data in a multidimensional space.
In other words, a data warehouse helps business executives organize, analyze, and use
their data for decision making. Data warehouses are widely used in the following fields:
Financial services
Banking services
Consumer goods
Retail sectors
Controlled manufacturing
Data Warehouse Concept:
Decision support technologies help utilize the data available in a data warehouse.
These technologies help executives use the warehouse quickly and effectively.
They can gather data, analyse it, and make decisions based on the information present in the
warehouse. The information gathered in a warehouse can be used in any of the following
domains:
Tuning Production Strategies - Product strategies can be tuned by repositioning products
and managing product portfolios, based on quarterly or yearly sales comparisons.
Customer Analysis - Customer analysis is done by analysing the customer's buying preferences,
buying time, budget cycles, etc.
Operations Analysis - Data warehousing also helps in customer relationship management and
in making environmental corrections. The information also allows us to analyse business
operations.
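As an illustration of the quarterly sales comparison mentioned under Tuning Production Strategies, the following Python sketch sums revenue per quarter for one product. The sales rows and the quarterly_revenue helper are invented for the example.

```python
from collections import defaultdict
from datetime import date

# Made-up sales facts: (sale date, product, revenue).
sales = [
    (date(2024, 1, 15), "phone", 1000.0),
    (date(2024, 2, 20), "phone", 1500.0),
    (date(2024, 4, 10), "phone", 2000.0),
    (date(2024, 5, 5), "laptop", 3000.0),
    (date(2024, 6, 30), "phone", 1000.0),
]

def quarterly_revenue(rows, product):
    """Sum revenue per (year, quarter) for one product."""
    totals = defaultdict(float)
    for day, item, revenue in rows:
        if item == product:
            quarter = (day.month - 1) // 3 + 1  # months 1-3 -> Q1, 4-6 -> Q2, ...
            totals[(day.year, quarter)] += revenue
    return dict(totals)

phone = quarterly_revenue(sales, "phone")

# Compare two consecutive quarters for the same product.
growth = phone[(2024, 2)] - phone[(2024, 1)]
print(phone)
print(growth)
```

A comparison like this only works if the warehouse has kept dated history for every sale, which is why it appears as a warehouse use case rather than an operational one.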
ARCHITECTURE
Functions, Tools and Utilities:
The following are the functions of data warehouse tools and utilities:
Data Extraction - Involves gathering data from multiple heterogeneous sources.
Data Cleaning - Involves finding and correcting the errors in data.
Data Transformation - Involves converting the data from legacy format to warehouse
format.
Data Loading - Involves sorting, summarizing, consolidating, checking integrity, and
building indices and partitions.
Refreshing - Involves updating from data sources to warehouse.
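The functions above can be sketched as a small pipeline in Python using the standard sqlite3 module. Everything here is an assumption made for illustration: the two source layouts, the table names, and the cleaning rule (drop rows with an unparseable amount). Refreshing is not shown; it would amount to running the same steps again on newly arrived source rows.

```python
import sqlite3

# Hypothetical source extracts: two systems with different layouts.
crm_rows = [{"cust": " Alice ", "amount": "120.50", "day": "2024-01-05"},
            {"cust": "Bob", "amount": "n/a", "day": "2024-01-06"}]
pos_rows = [("alice", 30.0, "2024-01-05"), ("Carol", 75.25, "2024-01-07")]

def extract():
    """Data Extraction: gather rows from heterogeneous sources into one shape."""
    rows = [(r["cust"], r["amount"], r["day"]) for r in crm_rows]
    rows += [(c, a, d) for c, a, d in pos_rows]
    return rows

def clean(rows):
    """Data Cleaning: drop rows whose amount cannot be parsed as a number."""
    good = []
    for cust, amount, day in rows:
        try:
            good.append((cust, float(amount), day))
        except (TypeError, ValueError):
            continue
    return good

def transform(rows):
    """Data Transformation: normalise customer names to one warehouse format."""
    return [(cust.strip().title(), amount, day) for cust, amount, day in rows]

def load(conn, rows):
    """Data Loading: sort, insert, build an index, and build a summary table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales "
                 "(customer TEXT, amount REAL, day TEXT)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)",
                     sorted(rows, key=lambda r: r[2]))
    conn.execute("CREATE INDEX IF NOT EXISTS idx_sales_customer "
                 "ON sales (customer)")
    conn.execute("DROP TABLE IF EXISTS sales_by_customer")
    conn.execute("CREATE TABLE sales_by_customer AS "
                 "SELECT customer, SUM(amount) AS total "
                 "FROM sales GROUP BY customer")
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, transform(clean(extract())))
totals = dict(conn.execute("SELECT customer, total FROM sales_by_customer"))
print(totals)
```

Cleaning runs before transformation here so that later steps can assume numeric amounts. The row for Bob is discarded, and the two spellings of Alice are merged into one customer.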
DWH Terminology
Metadata
Metadata is defined as data about data: data that is used to represent other data.
For example, the index of a book serves as metadata for the contents of the book.
In other words, metadata is summarized data that leads us to the detailed data.
In terms of a data warehouse, we can define metadata as follows:
Metadata is a roadmap to data warehouse.
Metadata in data warehouse defines the warehouse objects.
Metadata acts as a directory. This directory helps the decision support system to locate
the contents of a data warehouse.
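The directory role of metadata can be sketched as a small catalog in Python. The table names, column types, sources, and the locate helper are hypothetical; the point is only that the catalog describes warehouse objects without holding any business data itself.

```python
# Hypothetical metadata catalog: describes warehouse objects, holds no business data.
metadata = {
    "sales": {
        "kind": "fact table",
        "columns": {"item_id": "INTEGER", "customer_id": "INTEGER", "amount": "REAL"},
        "source": "point-of-sale system",
        "refreshed": "daily",
    },
    "customers": {
        "kind": "dimension table",
        "columns": {"customer_id": "INTEGER", "name": "TEXT", "region": "TEXT"},
        "source": "CRM system",
        "refreshed": "weekly",
    },
}

def locate(column):
    """Return the sorted names of warehouse objects that contain the given column."""
    return sorted(name for name, info in metadata.items()
                  if column in info["columns"])

tables_with_customer = locate("customer_id")
print(tables_with_customer)  # ['customers', 'sales']
```

A decision support system could consult such a catalog to find which tables to query before touching the detailed data.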
Data Cube
A data cube helps us represent data in multiple dimensions. It is defined by dimensions and
facts. The dimensions are the entities with respect to which an enterprise preserves its
records. The facts are the numeric measures recorded for each combination of dimension values.
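A minimal sketch of a three-dimensional cube in Python follows, assuming the dimensions item, location, and quarter with units sold as the fact. The values and the roll_up and slice_cube helpers are made up for illustration; OLAP tools provide these operations over much larger data.

```python
from collections import defaultdict

# Dimensions: (item, location, quarter). Fact: units sold. All values are made up.
DIMENSIONS = ("item", "location", "quarter")
cube = {
    ("phone", "Delhi", "Q1"): 120,
    ("phone", "Delhi", "Q2"): 150,
    ("phone", "Mumbai", "Q1"): 90,
    ("laptop", "Delhi", "Q1"): 40,
    ("laptop", "Mumbai", "Q2"): 60,
}

def roll_up(cube, keep):
    """Aggregate the fact over every dimension not listed in `keep`."""
    idx = [DIMENSIONS.index(d) for d in keep]
    out = defaultdict(int)
    for key, units in cube.items():
        out[tuple(key[i] for i in idx)] += units
    return dict(out)

def slice_cube(cube, dimension, value):
    """Fix one dimension to a single value and return the matching cells."""
    i = DIMENSIONS.index(dimension)
    return {k: v for k, v in cube.items() if k[i] == value}

by_item = roll_up(cube, ["item"])
by_item_quarter = roll_up(cube, ["item", "quarter"])
q1_only = slice_cube(cube, "quarter", "Q1")

print(by_item)
print(q1_only)
```

Each key of the dictionary is one cell of the cube, addressed by a value for every dimension. Rolling up collapses dimensions, while slicing fixes one of them.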
The 3-D table can be represented as a 3-D data cube, as shown in the following figure:
Data Mart
Data marts contain a subset of organization-wide data that is valuable to specific groups of
people in an organization. In other words, a data mart contains only the data that is specific
to a particular group. For example, the marketing data mart may contain only data related to
items, customers, and sales. Data marts are confined to subjects.
The implementation cycle of a data mart is measured in short periods of time, i.e., in weeks
rather than months or years.
The life cycle of data marts may be complex in the long run, if their planning and design are not
organization-wide.
Data marts are small in size.
Data marts are customized by department.
Data marts are flexible.
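The idea of a data mart as a subject-specific subset can be sketched in Python as a filter over warehouse rows. The rows, subject labels, and the build_mart helper are assumptions for the example, following the marketing case above (items, customers, and sales).

```python
# Hypothetical organization-wide warehouse rows covering several subjects.
warehouse = [
    {"subject": "sales", "item": "phone", "customer": "Alice", "amount": 300.0},
    {"subject": "sales", "item": "laptop", "customer": "Bob", "amount": 900.0},
    {"subject": "payroll", "employee": "E17", "salary": 5200.0},
    {"subject": "inventory", "item": "phone", "stock": 42},
]

def build_mart(rows, subjects, columns):
    """Keep only rows for the given subjects, projected onto the given columns."""
    return [{c: r[c] for c in columns if c in r}
            for r in rows if r["subject"] in subjects]

# The marketing mart sees items, customers, and sales amounts, nothing else.
marketing_mart = build_mart(warehouse, {"sales"}, ["item", "customer", "amount"])
print(marketing_mart)
```

Because the mart is derived from the warehouse by subject and column, it stays small and can be tailored per department.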