ETL Testing Interview Questions and Answers
ETL Testing Interview Questions and Answers
I will explain different ETL Testing Interview Questions and answers in next section :
Answer :
As the name suggested ETL stands for Extract Transform and Load.Just like the name applies ETL
tool Extracts data from the source.Transforms the data while in transit and then it loads the data in to
Specified database.
Definition 1 :
ETL is nothing but Extract,Transform and Loading of the data from multiple heterogeneous data
sources to a single or multiple sources.
Definition 2 :
Answer :
Extraction :
Take data from an external source and move it to the warehouse pre-processor database.
Transformation:
Transform data task allows point-to-point generating, modifying and transforming data.
Loading:
Question 3 : What is difference between Manual Testing and ETL Testing?(90% asked ETL
Testing Interview Questions)
Answer :
1.The main difference between manual testing and ETL testing is manual testing is related to the
functionality of the program and ETL testing is related to the databases and its count.
2.ETL is the automated testing process where you don’t need any technical knowledge other than the
software. Also, ETL testing is extremely faster, systematic and assure top results as needed by the
businesses.
3.Manual testing is highly time-consuming where you need technical knowledge to write the test cases
and the scripts. It is slow, needs efforts, and highly prone to errors.
Question 4 : Explain Need Of ETL Testing.(100% asked ETL Testing Interview Questions)
Answer :
Now a days we are migrating the tons of systems from old technology to new technology. At the time
of migration activities user also needs to migrate the data as well from old DBMS to latest DBMS.So
there is huge need to test that data is correct from target side.The following are some bullet points
where i have explained the necessity of ETL Testing :
To keep a check on the Data which are being transferred from one system (Old system) to the
other (New system).
To keep a track on the efficiency and speed of the process.
To be well acquainted with the ETL process before it gets implemented into your business and
production.
Question 5 : Where user can use ETL concepts. Give some examples.
Answer :
Before ETL tool user needs to write a long code for data transformation to data loading
ETL makes the life simple and one tool will manage all the scenarios of transformation and
loading of the data
There are following examples where we are using the ETL :
The data migrations are difficult efforts if you use the PL SQL or T-SQL development to do. If you
want to migrate the data using simple way use different ETL tools.
Now a days lot of companies are merging in to different MNCs. To move the data from one company
to another company the ETL concepts is been used.
Question 6 : Explain how ETL is used in third party data management.(100% asked ETL
Testing Interview Questions)
Answer :
The big organizations always gives different application development to different kind of
vendors.Means not a single vendor is managing everything. Lets take example of Telecommunication
project where billing is managed by one company and CRM is managed by other company.If CRM
company needs some data from the company who is managing the Billing. That company will receive
a data feed from the other company. To load the data from the feed ETL process is used.
Answer :
The most common example of ETL is ETL is used in Data warehousing.User needs to fetch the
historical data as well as current data for developing data warehouse. The Data warehouse data is
nothing but combination of historical data as well as transactional data. Its data sources might be
different.User needs to fetch the data from multiple heterogeneous systems and load it in to single
target system which is also called as data warehouse.
As The ETL definition suggests that ETL is nothing but Extract,Transform and loading of the
data;This process needs to be used in data warehousing widely. The simple example of this is
managing sales data in shopping mall. If user wants the historical data as well as current data in the
shopping mall first step is always user needs to follow the ETL process.Then that data will be used for
reporting purpose.
Answer :
An ETL tool is used to extract data from different data sources, transform the data, and load it into a
DW system. In contrast, a BI tool is used to generate interactive and adhoc reports for end-users,
dashboard for senior management, data visualizations for monthly, quarterly, and annual board
meetings.
Most common ETL tools include − SAP BO Data Services (BODS), Informatica, Microsoft – SSIS,
Oracle Data Integrator ODI, Talend Open Studio, Clover ETL Open source, etc.
Most common BI tools include − SAP Business Objects, SAP Lumira, IBM Cognos, JasperSoft,
Microsoft BI Platform, Tableau, Oracle Business Intelligence Enterprise Edition, etc.
Question 9 : What is difference between ETL Testing and Database Testing?(80% asked ETL
Testing Interview Questions)
Answer :
Following are different difference points between ETL testing and database testing :
Question 10 :What are different characteristics of Data Warehouse?(100% asked ETL Testing
Interview Questions)
Answer :
1. Data warehouse is a database which is separate from operational database which stores
historical information also.
2. Data warehouse database contains transactional as well as analytical data.
3. Data warehouse helps higher management to take stratagic as well as tactical decisions using
historical or current data.
4. Data warehouse helps consolidated historical data analysis.
5. Data warehouse helps business user to see the current trends to run the business.
6. Data warehouse is used for reporting and data analysis purpose.
Answer :
1. Data Mart
2. Online Analytical Processing (OLAP)
3. Online Transactional Processing
4. Predictive Analysis
Question 12 : Which are different steps of ETL testing process?(100% asked ETL Testing
Interview Questions)
Answer :
Based on the inputs from the estimation, an ETL environment is planned and worked out.
Step 4 :Test Data preparation and Execution :
Data for the test is prepared and executed as per the requirement.
Step 5 : Summary Report:
Upon the completion of the test run, a brief summary report is prepared for improvising and
concluding.
Question 13 : How ETL is used in Data migration projects. Explain with example.(60%
asked ETL Testing Interview Questions)
Answer :
ETL tools are widely used in data migration projects. If the organization is managing the data in oracle
10 g previously and now organization wants to go for SQL server cloud database then there is need to
migrate the data from Source to Target.To do this kind of migration the ETL tools are very useful. If
user wants to write the code of ETL it is very time consuming process. To make this simple the ETL
tools are very useful in which the coding is simple as compare to PL SQL or T-SQL code.So ETL
process i very useful in Data migration projects.
Question 14 : Explain multiple steps to choose ETL tool .(90% asked ETL Testing Interview
Questions)
Answer :
The choosing the ETL tool is very difficult thing. You need to consider lot of factors while choosing
the correct ETL tool according to the project.Choosing the ETL tool for specific project is very
stratagic move even you need it for a small project.Make sure that ETL tool migrations are no small
efforts.In this section i would like to give you some bullet points to consider while choosing your ETL
tool.
1.Data Connectivity :
ETL tool should be communicate with any source of data no matter where it comes from.This is very
critical.
2.Performance :
Moving and changing a data requires some serious processing power. So you need to check the
performance factors.
3.Transformation Flexibility :
Matching,Merging and changing the data is very critical.ETL data should provide these and many
transformation packages which allow modifications to the data in transformation phase with simple
drag and drop.
4.Data Quality :
Your data is not clean. The only way to leverage your data when your data is consistent and clean.
Once the ETL is ready you need to check that ETL will work on previous data as well as new coming
data.
You are playing with the organization data while doing the ETL process.So Choose vendor who is
very well known in the industry and whose support is really great.
Question 16 : Name some important ETL bugs.(70% asked ETL Testing Interview Questions)
Answer :
1.Source bugs
2.Calculation bugs
Answer :
Answer ;
The ETL mapping sheet contains all necessary information from the source file and stores the details
in rows and columns. This mapping sheet helps experts in writing SQL queries to speed up the testing
process.
The Database designer is responsible to create the mapping sheet.
Question 19 : What is fact and what are its types.(100% asked ETL Testing Interview
Questions)
Answer ;
Additive: A measure can participate arithmetic calculations using all or any dimensions.
Semi additive: A measure can participate arithmetic calculations using some dimensions.
Ex: temperature
Question 20 : Explain Data extraction phase in ETL with its type.(90% asked ETL Testing
Interview Questions)
Answer :
The data extraction is nothing but extracting the data from multiple heterogeneous sources with using
ETL tools.
1.Full Extraction : All the data from source systems or operational systems gets extracted to staging
area. (Initial Load)
2.Partial Extraction : Sometimes we get notification from the source system to update specific date.
It is called as Delta load.
Source System Performance : The Extraction strategies should not affect source system
performance.
Question 21 : What is Dimensions? Explain with example.
Answer :
Dimension table is table which describes the business entities of an enterprise which describes the
objects in a fact table.Dimension table has primary key which uniquely identifies each dimension
row.Dimension table is sometimes called as lookup or reference table.The primary key of dimension
table is used to associate relationship between fact table which contains foreign key.Dimension tables
are normally in de-normalized form because these tables are only used to analyse the data and not used
to execute transactions.
1. Query Constrainting
2. Grouping /Filtering
3. Report labeling
1.Time
2.Location
3.Item
4.Branch
Answer :
Data Extracted from source system is in to Raw format. We need to transform it before loading
in to target server.
Data has to be cleaned, mapped and transformed
There are following steps in transformation:
Answer:
1.Standardizing data : Data is fetched from multiple sources so it needs to be standardized as per the
target system.
2.Character set conversion : Need to transform the character sets as per the target systems.
(Firstname and last name example).
3.Calculated and derived values: In source system there is first val and second val and in target we
need the calculation of first val and second val.
4.Data Conversion in different formats : If in source system date in in DDMMYY format and in
target the date is in DDMONYYYY format then this transformation needs to be done at
transformation phase.
Answer :
The transactions are always needed to be divided for the better performance. The same processes are
known as Partitioning. It simply makes sure that the server can directly access the sources through
multiple connections.
Answer:
Data loading phase loads the prepared data from staging tables to main tables.
Initial load : Populating all the data tables from source system and loads it in to data warehouse
table.
Incremental Load : Applying the ongoing changes as necessary in periodic manner.
Full Refresh : Completely erases the data from one or more tables and reload the fresh data.
Answer :
Informatica
IBM Datastage
Abnitio
Clover ETL
1.Pentaho
2.Kettle