Apache Airflow Developer Training
Course Content
Training Duration: 5 Days
*** Two modules are covered each day: Module 1 before the lunch break and Module 2
after it.
*** Every module includes a practical component.
Prerequisite: Python
*** 90% hands-on, 10% theory
Day 1 ::: Apache Airflow Introduction
Big Data vs. Normal ETL Pipelines
Why Do We Need Airflow?
First Approach to Airflow
Introduction to Airflow
Airflow Architecture
Working Model of Airflow
Installing Airflow
Quick Tour of Airflow UI
Quick Tour of Airflow CLI
Setting Environment Variables and Starting the Web Server
Setting Up Encryption to Secure Connection Secrets
Configuration Option: Maximum Active Runs
Day 2 :::
Airflow Configuration Overview
Configuration Options: ORM Configuration
Configuration Option: Maximum Active Runs Explained
Configuration Option: Maximum Active Runs Explained (Continued)
Configuration Options: Additional Configuration Settings
Coding Your First Data Pipeline with Airflow
DAG Explanation
Time to Code Your First DAG (Python)
Day 3 :::
Operators
Let's use Operators Practically
Operator Relationships and Bitshift Composition
Adding Dependencies
How the Scheduler Works
A Quick Play With Backfill and Catchup
Workflow Description
Developing Data Pipeline
Hands-on: Project Setup
Hands-on: Data Retrieval from File System
Hands-on: Merging DataFrames
Hands-on: Aggregation Using Pandas
Hands-on: Database Connectivity (PostgreSQL / MySQL)
Hands-on: Creating DAGs
Day 4 :::
Databases and Executors
Introduction: Sequential Executor with SQLite
Local Executor with PostgreSQL
Configure a DAG with Local Executor and PostgreSQL
Celery Executor with PostgreSQL and RabbitMQ
[Practice] Configure a DAG with Celery Executor, PostgreSQL and RabbitMQ
Implementing Advanced Concepts in Airflow
Introduction
Minimising Repetitive Patterns With SubDAGs
Minimising a DAG with SubDAGs
How to Interact With External Sources Using Hooks
Getting Results From PostgreSQL Using Hooks
How to Share Data Between Your Tasks With XComs
Sharing Your First Messages Using XComs
How to Execute Tasks According To Criteria Using Branching
Make Your First Conditional Task Using Branching
Control Your Tasks With SLAs
Defining an SLA in a DAG
Airflow Sensors
Day 5 :::
Creating Airflow Plugins with Elasticsearch and PostgreSQL
Adding Functionalities to Apache Airflow
Creating a Hook to Interact With Elasticsearch
Creating a Transfer Operator PostgresqlToElasticsearch
Adding a View to Apache Airflow UI
Data Profiling in Airflow
Ad Hoc Queries
Querying Metadata Tables
Charts in Airflow
Executors
Configure Local Executor
Configure Celery Executor
Service Level Agreements (SLAs)
Security: Authentication, Roles, Encryption
Write Logs to a Remote Location
Monitor Airflow with StatsD, Prometheus and Grafana
Managed Airflow Services