Azure Interview Questions

What is the use of Azure Active Directory?

Azure Active Directory is an identity and access management system. It is very similar to on-premises Active Directory, and it allows you to grant your employees access to specific products and services within the network.
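
For instance, an application can authenticate against Azure Active Directory and obtain an access token without storing credentials in code. A minimal sketch using the azure-identity Python package (the scope URL shown is the standard Azure Resource Manager scope; use the scope of whatever service your application actually calls):

# pip install azure-identity
from azure.identity import DefaultAzureCredential

# DefaultAzureCredential tries environment variables, a managed identity,
# the Azure CLI login, etc., so the same code works locally and in Azure.
credential = DefaultAzureCredential()

# Request an Azure AD access token for the Azure Resource Manager API.
token = credential.get_token("https://management.azure.com/.default")
print(token.expires_on)  # expiry of the issued token (Unix timestamp)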

Briefly describe the purpose of the ADF Service

ADF is used mainly to orchestrate data copying between different relational and non-relational data sources, hosted in the cloud or locally in your data centres. In addition, ADF can be used for transforming the ingested data to meet your business requirements. It is the ETL or ELT tool for data ingestion in most Big Data solutions.

Data Factory consists of a number of components. Mention these components briefly

 Pipeline: The logical container for activities
 Activity: An execution step in the Data Factory pipeline that can be used for data ingestion and transformation
 Mapping Data Flow: A visually designed data transformation logic
 Dataset: A pointer to the data used in the pipeline activities
 Linked Service: A descriptive connection string for the data sources used in the pipeline activities
 Trigger: Specifies when the pipeline will be executed
 Control flow: Controls the execution flow of the pipeline activities

What is the difference between the Dataset and Linked Service in Data Factory?

Linked Service is a description of the connection string that is used to connect to the data stores. For example, when ingesting data from a SQL Server instance, the linked service contains the name of the SQL Server instance and the credentials used to connect to that instance.
Dataset is a reference to the data store that is described by the linked service. When ingesting data from a SQL Server instance, the dataset points to the name of the table that contains the target data or the query that returns data from different tables.
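
As a rough illustration with the azure-mgmt-datafactory Python SDK (the subscription, resource group, factory, table, and connection string below are placeholders, not values from this document), the linked service carries only connection information while the dataset points at a concrete table exposed through it:

# pip install azure-identity azure-mgmt-datafactory
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    LinkedServiceResource, AzureSqlDatabaseLinkedService,
    DatasetResource, AzureSqlTableDataset, LinkedServiceReference,
)

# Placeholder identifiers -- substitute your own subscription, resource group,
# and data factory names.
sub_id, rg, factory = "<subscription-id>", "<resource-group>", "<factory-name>"
adf = DataFactoryManagementClient(DefaultAzureCredential(), sub_id)

# Linked Service: only the connection information, nothing about specific data.
sql_ls = LinkedServiceResource(properties=AzureSqlDatabaseLinkedService(
    connection_string="Server=tcp:<server>.database.windows.net;Database=<db>;..."))
adf.linked_services.create_or_update(rg, factory, "SqlServerLinkedService", sql_ls)

# Dataset: points to a concrete table exposed through that linked service.
cars_ds = DatasetResource(properties=AzureSqlTableDataset(
    linked_service_name=LinkedServiceReference(
        type="LinkedServiceReference", reference_name="SqlServerLinkedService"),
    table_name="dbo.Cars"))
adf.datasets.create_or_update(rg, factory, "CarsTableDataset", cars_ds)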

What is Data Factory Integration Runtime?

Integration Runtime is a secure compute infrastructure that is used by Data Factory to provide the data integration capabilities across the different network environments and make sure that these activities will be executed in the closest possible region to the data store.
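
As an illustrative sketch only (all names are placeholders), a self-hosted Integration Runtime can be registered through the same Python management SDK; the on-premises machine is then joined to it using the authentication keys generated for that runtime:

# pip install azure-identity azure-mgmt-datafactory
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    IntegrationRuntimeResource, SelfHostedIntegrationRuntime,
)

sub_id, rg, factory = "<subscription-id>", "<resource-group>", "<factory-name>"
adf = DataFactoryManagementClient(DefaultAzureCredential(), sub_id)

# Register a self-hosted Integration Runtime; linked services that must reach
# on-premises data stores reference this runtime instead of the default one.
ir = IntegrationRuntimeResource(properties=SelfHostedIntegrationRuntime(
    description="Runs activities close to the on-premises data store"))
adf.integration_runtimes.create_or_update(rg, factory, "OnPremSelfHostedIR", ir)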

What is the difference between the Mapping data flow and Wrangling data flow transformation activities in Data Factory?

Mapping data flow activity is a visually designed data transformation activity that allows us to design graphical data transformation logic without the need to be an expert developer, and it is executed as an activity within the ADF pipeline on an ADF fully managed, scaled-out Spark cluster.

Wrangling data flow activity is a code-free data preparation activity that integrates with Power Query Online in order to make the Power Query M functions available for data wrangling using Spark execution.

What is blob storage in Azure?
What is the difference between Azure Data Lake Store and Blob storage?
What are the steps for creating an ETL process in Azure Data Factory?
What is the difference between HDInsight & Azure Data Lake Analytics?
What are the top-level concepts of Azure Data Factory?
Explain triggers in ADF.

Steps for Creating ETL

 Create a Linked Service for the source data store, which is a SQL Server database
 Assume that we have a cars dataset
 Create a Linked Service for the destination data store, which is Azure Data Lake Store
 Create a Dataset for saving the data to the destination
 Create the pipeline and add a copy activity
 Schedule the pipeline by adding a trigger (a sketch of these steps with the Python SDK follows this list)
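
The following sketch shows the pipeline and trigger steps with the azure-mgmt-datafactory Python SDK; every name is a placeholder, and the "CarsTableDataset" and "CarsLakeDataset" datasets (with their SQL Server and Azure Data Lake Store linked services) are assumed to have been created as in the earlier steps:

# pip install azure-identity azure-mgmt-datafactory
from datetime import datetime, timedelta
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    PipelineResource, CopyActivity, DatasetReference,
    AzureSqlSource, AzureDataLakeStoreSink,
    TriggerResource, ScheduleTrigger, ScheduleTriggerRecurrence,
    TriggerPipelineReference, PipelineReference,
)

sub_id, rg, factory = "<subscription-id>", "<resource-group>", "<factory-name>"
adf = DataFactoryManagementClient(DefaultAzureCredential(), sub_id)

# The copy activity wires the source dataset to the destination dataset.
copy = CopyActivity(
    name="CopyCarsToLake",
    inputs=[DatasetReference(type="DatasetReference",
                             reference_name="CarsTableDataset")],
    outputs=[DatasetReference(type="DatasetReference",
                              reference_name="CarsLakeDataset")],
    source=AzureSqlSource(),
    sink=AzureDataLakeStoreSink(),
)
adf.pipelines.create_or_update(
    rg, factory, "CopyCarsPipeline", PipelineResource(activities=[copy]))

# Schedule the pipeline by attaching a daily trigger to it.
trigger = TriggerResource(properties=ScheduleTrigger(
    recurrence=ScheduleTriggerRecurrence(
        frequency="Day", interval=1,
        start_time=datetime.utcnow() + timedelta(minutes=5), time_zone="UTC"),
    pipelines=[TriggerPipelineReference(pipeline_reference=PipelineReference(
        type="PipelineReference", reference_name="CopyCarsPipeline"))],
))
adf.triggers.create_or_update(rg, factory, "DailyCopyTrigger", trigger)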

What are the top-level concepts of Azure Data Factory?

 Pipeline: It acts as a carrier in which various processes take place; each individual process is an activity.
 Activities: Activities represent the processing steps in a pipeline. A pipeline can have one or multiple activities. An activity can be any process, such as querying a dataset or moving the dataset from one source to another.
 Datasets: Sources of data. In simple words, a data structure that holds our data.
 Linked services: These store the information that is needed to connect to an external source. For example, to connect to a SQL Server instance you need a connection string; you also need to mention the source and the destination of your data.

How to create and connect to Azure SQL Database?

First, we need to log into the Azure Portal with our Azure credentials. Then we need to create an Azure SQL database in the portal. Click on “Create a resource” in the left-side menu and it will open the “Azure Marketplace”, where we can see the list of services. Click “Databases”, then click on “SQL Database”.
After clicking “SQL Database”, it will open another section where we need to provide the basic information about our database, such as the database name, storage space, server name, etc.
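
Once the database exists, you can connect to it from client code. A minimal sketch with the pyodbc driver (server, database, and credentials are placeholders; the server firewall must also allow your client IP):

# pip install pyodbc  (also requires the Microsoft ODBC Driver for SQL Server)
import pyodbc

conn_str = (
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=<your-server>.database.windows.net,1433;"
    "DATABASE=<your-database>;"
    "UID=<your-user>;PWD=<your-password>;"
    "Encrypt=yes;TrustServerCertificate=no;"
)

with pyodbc.connect(conn_str) as conn:
    cursor = conn.cursor()
    cursor.execute("SELECT @@VERSION")
    print(cursor.fetchone()[0])  # prints the SQL Server engine version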

What is Azure Blob Storage?

Azure Storage is one of the cloud computing PaaS (Platform as a Service) services provided by the Microsoft Azure team. It provides cloud storage that is highly available, secure, durable, scalable, and redundant. It is massively scalable and elastic. It can store and process hundreds of terabytes of data, or you can store the small amounts of data required for a small business website.

What is Blob?
Blob stands for “Binary Large Object”. Blob storage is a service for storing large amounts of unstructured data that can be accessed from anywhere in the world via HTTP or HTTPS. It is designed to store large amounts of unstructured text or binary data such as virtual hard disks, videos, images, or even log files.
The data can be exposed to the public or stored privately. It scales up or down as your needs change. We no longer have to manage it; we only pay for what we use.
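
A minimal sketch with the azure-storage-blob Python package (the connection string, container, and blob names are placeholders) that uploads unstructured data and reads it back:

# pip install azure-storage-blob
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string("<storage-account-connection-string>")
container = service.get_container_client("logs")  # assumes the "logs" container already exists

# Upload unstructured data (text, images, videos, log files, ...) as a blob.
container.upload_blob(name="app/2024-01-01.log", data=b"first log line\n", overwrite=True)

# Read it back; blobs are also addressable via HTTP(S) URLs.
blob = container.get_blob_client("app/2024-01-01.log")
print(blob.download_blob().readall())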
Control flows and scale
To support the diverse integration flows and patterns in the modern
data warehouse, Data Factory enables flexible data pipeline modeling.
This entails full control flow programming paradigms, which include
conditional execution, branching in data pipelines, and the ability to
explicitly pass parameters within and across these flows. Control flow
also encompasses transforming data through activity dispatch to
external execution engines and data flow capabilities, including data
movement at scale, via the Copy activity.
Data Factory provides freedom to model any flow style that's required
for data integration and that can be dispatched on demand or
repeatedly on a schedule. A few common flows that this model
enables are:
Control flows:
Activities can be chained together in a sequence within a pipeline.
Activities can be branched within a pipeline.
Parameters:
Parameters can be defined at the pipeline level and arguments can be
passed while you invoke the pipeline on demand or from a trigger.
Activities can consume the arguments that are passed to the pipeline.
Custom state passing:
Activity outputs, including state, can be consumed by a subsequent
activity in the pipeline.
Looping containers:
The ForEach activity will iterate over a specified collection and execute the specified activities in a loop.
Trigger-based flows:
Pipelines can be triggered on demand, by wall-clock time, or in response to Event Grid topics.
Delta flows:
Parameters can be used to define your high-water mark for delta copy while moving dimension or reference tables from a relational store, either on-premises or in the cloud, to load the data into the lake (see the sketch after this list).
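
As an example, pipeline parameters can be supplied as arguments when a pipeline is invoked on demand. The sketch below uses the azure-mgmt-datafactory Python SDK; the names are placeholders, and the pipeline is assumed to declare a "watermark" parameter that its activities consume, e.g. in a source query that selects only rows newer than the high-water mark:

# pip install azure-identity azure-mgmt-datafactory
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

sub_id, rg, factory = "<subscription-id>", "<resource-group>", "<factory-name>"
adf = DataFactoryManagementClient(DefaultAzureCredential(), sub_id)

# Invoke the pipeline on demand, passing an argument for its "watermark"
# parameter (hypothetical parameter name, used here for a delta copy).
run = adf.pipelines.create_run(
    rg, factory, "CopyCarsPipeline",
    parameters={"watermark": "2024-01-01T00:00:00Z"},
)

# Poll the run status using the run ID returned above.
status = adf.pipeline_runs.get(rg, factory, run.run_id)
print(status.status)  # e.g. InProgress, Succeeded, Failed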

What are the top-level concepts of Azure Data Factory?

An Azure subscription can have one or more Azure Data Factory
instances (or data factories). Azure Data Factory contains four key
components that work together as a platform on which you can
compose data-driven workflows with steps to move and transform
data.
Pipelines
A data factory can have one or more pipelines. A pipeline is a logical
grouping of activities to perform a unit of work. Together, the
activities in a pipeline perform a task. For example, a pipeline can
contain a group of activities that ingest data from an Azure blob and
then run a Hive query on an HDInsight cluster to partition the data.
The benefit is that you can use a pipeline to manage the activities as a
set instead of having to manage each activity individually. You can
chain together the activities in a pipeline to operate them sequentially,
or you can operate them independently, in parallel.
Data flows
Data flows are objects that you build visually in Data Factory which
transform data at scale on backend Spark services. You do not need to
understand programming or Spark internals. Just design your data
transformation intent using graphs (Mapping) or spreadsheets
(Wrangling).
Activities
Activities represent a processing step in a pipeline. For example, you
can use a Copy activity to copy data from one data store to another
data store. Similarly, you can use a Hive activity, which runs a Hive
query on an Azure HDInsight cluster to transform or analyze your
data. Data Factory supports three types of activities: data movement
activities, data transformation activities, and control activities.
Datasets
Datasets represent data structures within the data stores, which simply
point to or reference the data you want to use in your activities as
inputs or outputs.
Linked services
Linked services are much like connection strings, which define the
connection information needed for Data Factory to connect to external
resources. Think of it this way: A linked service defines the
connection to the data source, and a dataset represents the structure of
the data. For example, an Azure Storage linked service specifies the
connection string to connect to the Azure Storage account. And an
Azure blob dataset specifies the blob container and the folder that
contains the data.
Linked services have two purposes in Data Factory:
To represent a data store that includes, but is not limited to, a SQL
Server instance, an Oracle database instance, a file share, or an Azure
Blob storage account. For a list of supported data stores, see Copy
Activity in Azure Data Factory.
To represent a compute resource that can host the execution of an
activity. For example, the HDInsight Hive activity runs on an
HDInsight Hadoop cluster. For a list of transformation activities and
supported compute environments, see Transform data in Azure Data
Factory.
