
Data Warehouse Architecture

Data warehouse architecture involves organizing data from multiple sources into a central repository for analysis. There are three main architectures: single-tier, two-tier, and three-tier. The two-tier architecture stages data between source systems and the data warehouse for cleansing. The three-tier architecture adds a reconciled layer between sources and the warehouse. Data warehouse construction also uses top-down or bottom-up approaches - top-down builds data marts from a central warehouse while bottom-up constructs individual marts integrated later into a warehouse.

Uploaded by

Binay Yadav
Data Warehouse Architecture


Data warehouse architecture is complex because a data warehouse is an information system that
contains historical and cumulative data from multiple sources.
Data warehouse architecture is a method of defining the overall architecture of data
communication, processing, and presentation that exists for end-client computing within the
enterprise. Each data warehouse is different, but all are characterized by standard vital
components.

Single-tier Data Warehouse Architecture


Single-tier architecture is rarely used in practice. Its purpose is to minimize the amount of data
stored; to reach this goal, it removes data redundancies.
Because only one physical copy of the data exists, source data is converted on request into a
format suitable for analysis, and analytical queries run against the same systems that handle
operational processing.
The single-tier architecture has three layers:
• A source layer
• A data warehouse layer
• An analysis layer (presentation)
In the single-tier architecture, only the source layer is physical. The data warehouse layer is
virtual and provides data in a multidimensional view, created by an intermediate processing
layer.
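
As a minimal sketch of that idea, assume a hypothetical in-memory list `source_orders` stands in for the physical source layer. The snippet below builds the multidimensional view on request instead of storing a second copy of the data. The names `source_orders` and `virtual_cube` are illustrative assumptions, not part of any product.

```python
from collections import defaultdict

# Hypothetical source layer: the only physical copy of the data.
source_orders = [
    {"region": "East", "year": 2023, "amount": 120.0},
    {"region": "East", "year": 2024, "amount": 80.0},
    {"region": "West", "year": 2023, "amount": 200.0},
    {"region": "East", "year": 2023, "amount": 30.0},
]

def virtual_cube(rows, dimensions, measure):
    """Aggregate `measure` by `dimensions` on request; nothing is persisted."""
    cube = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        cube[key] += row[measure]
    return dict(cube)

# Analysis layer: each query recomputes the view from the source rows.
by_region_year = virtual_cube(source_orders, ("region", "year"), "amount")
by_region = virtual_cube(source_orders, ("region",), "amount")
```

Every call rescans the source rows, which illustrates the trade-off of this design: no redundant storage, but analytical work competes with the source systems.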

Er. Binay Yadav Page 1



Two-tier Data Warehouse Architecture


Two-tier architecture includes a staging area for all data sources, before the data warehouse
layer. By adding a staging area between the sources and the storage repository, you ensure all
data loaded into the warehouse is cleansed and in the appropriate format.

Most businesses that serve analysis from data marts use the two-tier data warehouse
architecture, which is made up of two tiers:
1. The Data Tier
This is the layer where actual data is stored after various ETL processes have been used to load
data into the data warehouse.
The data tier is itself made up of three layers:
• A source layer
• A data staging layer
• A data warehouse layer
Source layer: A data warehouse system draws on heterogeneous sources of data. That data
is initially stored in corporate relational databases or legacy databases, or it may come
from an information system outside the corporate walls.
Data Staging: The data stored in the sources should be extracted, cleansed to remove
inconsistencies and fill gaps, and integrated to merge heterogeneous sources into one
standard schema. So-called Extraction, Transformation, and Loading (ETL) tools can
combine heterogeneous schemata and extract, transform, cleanse, validate, filter, and load
source data into a data warehouse.
Data Warehouse layer: Information is saved to one logically centralized
repository: a data warehouse. The data warehouse can be accessed directly, but it can
also be used as a source for creating data marts, which partially replicate data warehouse
contents and are designed for specific enterprise departments. Metadata repositories
store information on sources, access procedures, data staging, users, data mart schemas,
and so on.
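
The staging step described above can be sketched as follows. This is a minimal illustration, not a real ETL tool: the two source lists, their field names (`cust_name`, `NAME`, `CTRY`), and the target schema (`name`, `country`, `source`) are all assumptions made for the example.

```python
# Hypothetical heterogeneous sources with different field names and quality.
crm_rows = [{"cust_name": " Alice ", "country": "np"}]
legacy_rows = [{"NAME": "BOB", "CTRY": None}]

def extract():
    """Extract: tag each raw row with the system it came from."""
    return [("crm", r) for r in crm_rows] + [("legacy", r) for r in legacy_rows]

def transform(tagged_rows):
    """Cleanse and integrate rows into one standard schema."""
    field_map = {"crm": ("cust_name", "country"), "legacy": ("NAME", "CTRY")}
    staged = []
    for system, row in tagged_rows:
        name_key, country_key = field_map[system]
        name = (row.get(name_key) or "").strip().title()
        # Fill gaps: a missing country becomes the placeholder "UNKNOWN".
        country = (row.get(country_key) or "UNKNOWN").strip().upper()
        if name:  # validate: drop rows with no usable name
            staged.append({"name": name, "country": country, "source": system})
    return staged

def load(staged_rows, warehouse):
    """Load: append the staged rows to the warehouse table."""
    warehouse.extend(staged_rows)
    return warehouse

warehouse_customers = load(transform(extract()), [])
```

Keeping the `source` field on each loaded row is one simple way to record where the data came from, the kind of information a metadata repository would track.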

2. The Client Tier


This layer is where clients can use data stored in the data warehouse to generate insights for
making informed, data-driven decisions. You can modify or transform this layer based on the data
trends that you discover from your analysis reports.
It is made up of a single layer:
• An analysis layer


Analysis Layer (Presentation Layer): In this layer, integrated data is efficiently and flexibly
accessed to issue reports, dynamically analyze information, and simulate hypothetical business
scenarios. It should feature aggregate information navigators, complex query optimizers, and
customer-friendly GUIs.

Three-tier Data Warehouse Architecture

The three-tier approach is the most widely used architecture for data warehouse systems, and it
is well suited to extensive, enterprise-wide systems. It solves the connectivity problems that the
two-tier architecture commonly faces.
The three-tier architecture is made up of:
• A source layer (containing multiple source systems)
• A reconciled layer
• A data warehouse layer (containing both data warehouses and data marts)
The reconciled layer sits between the source data and the data warehouse.

The main advantage of the reconciled layer is that it creates a standard reference data model
for a whole enterprise. At the same time, it separates the problems of source data extraction and
integration from those of data warehouse population.
The three-tier architecture can also be described in terms of three tiers:
1. The bottom tier is the database of the warehouse, where the cleansed and transformed
data is loaded.
2. The middle tier is the application layer, which gives an abstracted view of the database. It
arranges the data to make it more suitable for analysis.

3. The top tier is where the user accesses and interacts with the data. It represents the
front-end client layer, where you can use reporting, query, analysis, or data mining
tools.
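
The separation between the three tiers can be illustrated with a small sketch that uses Python's standard `sqlite3` module as a stand-in for the warehouse database. The table `sales` and the functions `total_by` and `report` are hypothetical names chosen for this example only.

```python
import sqlite3

# Bottom tier: the warehouse database holding cleansed, transformed data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("East", "pen", 10.0), ("East", "ink", 5.0), ("West", "pen", 7.5)],
)

# Middle tier: an application layer that hides the SQL behind an abstracted view.
def total_by(dimension):
    if dimension not in {"region", "product"}:  # whitelist column names
        raise ValueError(f"unknown dimension: {dimension}")
    query = (
        f"SELECT {dimension}, SUM(amount) FROM sales "
        f"GROUP BY {dimension} ORDER BY {dimension}"
    )
    return conn.execute(query).fetchall()

# Top tier: a front-end client that only talks to the middle tier.
def report(dimension):
    return [f"{key}: {total:.2f}" for key, total in total_by(dimension)]

region_report = report("region")
```

The client never issues SQL itself, so the bottom tier's schema could change without affecting the top tier as long as `total_by` keeps its contract.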

Data Warehouse Architecture


A data warehouse is a heterogeneous collection of different data sources organized under a
unified schema. There are two approaches to constructing a data warehouse: the top-down
approach and the bottom-up approach, both explained below.


1. Top-down approach:

The essential components are discussed below:


1. External Sources –
An external source is any source from which data is collected, irrespective of the type of data.
Data can be structured, semi-structured, or unstructured.
2. Stage Area –
Since the data extracted from the external sources does not follow a particular format, it
must be validated before it is loaded into the data warehouse. For this purpose, an ETL tool
is recommended.
• E (Extract): Data is extracted from the external data source.

• T (Transform): Data is transformed into the standard format.

• L (Load): Data is loaded into the data warehouse after it has been transformed into the standard format.

3. Data Warehouse –
After cleansing, the data is stored in the data warehouse, which acts as the central repository.
It stores the metadata along with the actual data, and the data marts are populated from it.
Note that the data warehouse stores the data in its purest form in this top-down approach.

4. Data Marts –
A data mart is also part of the storage component. It stores the information of a particular
function of an organization that is handled by a single authority. An organization can have as
many data marts as it has functions. We can also say that a data mart contains a subset of
the data stored in the data warehouse.

5. Data Mining –
Data mining is the practice of analyzing the large volumes of data held in the data warehouse.
It is used to find hidden patterns in the database or data warehouse with the help of data
mining algorithms.
This approach is defined by Inmon: the data warehouse is a central repository for the
complete organization, and data marts are created from it after the complete data warehouse
has been created.
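
The ordering in the top-down approach (warehouse first, marts derived from it) can be sketched with Python's standard `sqlite3` module. The table `warehouse_facts`, the `mart_<department>` naming scheme, and the function `create_mart` are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Step 1: the central warehouse is built first and holds the detailed data.
conn.execute("CREATE TABLE warehouse_facts (department TEXT, item TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO warehouse_facts VALUES (?, ?, ?)",
    [("sales", "order-1", 100.0), ("sales", "order-2", 50.0), ("hr", "payroll-1", 900.0)],
)

# Step 2: each data mart is derived from the warehouse as a departmental subset.
def create_mart(department):
    # The whitelist guards the table name that is interpolated below.
    if department not in {"sales", "hr"}:
        raise ValueError(f"unknown department: {department}")
    conn.execute(f"CREATE TABLE mart_{department} (item TEXT, amount REAL)")
    conn.execute(
        f"INSERT INTO mart_{department} "
        "SELECT item, amount FROM warehouse_facts WHERE department = ?",
        (department,),
    )

create_mart("sales")
create_mart("hr")
sales_rows = conn.execute("SELECT item, amount FROM mart_sales ORDER BY item").fetchall()
```

Because every mart is filled from the same `warehouse_facts` table, the marts cannot disagree with one another about the underlying values.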


Advantages of Top-Down Approach –


1. Since the data marts are created from the data warehouse, this approach provides a
consistent dimensional view across data marts.

2. This model is also considered the most robust in the face of business changes. That is why
big organizations prefer to follow this approach.

3. Creating a data mart from the data warehouse is easy.

Disadvantages of Top-Down Approach –


1. The cost and time required for design and maintenance are very high.

2. Bottom-up approach:

1. First, the data is extracted from external sources (the same as in the top-down approach).

2. Then, the data goes through the staging area (as explained above) and is loaded into data
marts instead of the data warehouse. The data marts are created first and provide reporting
capability. Each data mart addresses a single business area.

3. These data marts are then integrated into the data warehouse.

4. This approach is attributed to Kimball: data marts are created first and provide a thin view
for analysis, and the data warehouse is created after the data marts are complete.

Advantages of Bottom-Up Approach –


1. Because the data marts are created first, reports can be generated quickly.

2. More data marts can be added over time, and in this way the data warehouse can
be extended.

3. The cost and time required to design this model are comparatively low.

Disadvantage of Bottom-Up Approach –


1. This model is not as strong as the top-down approach, because the dimensional view of the
data marts is not as consistent as it is in the top-down approach.
