HARSHAL GARG
+91-9977816300
[email protected]
LinkedIn - linkedin.com/in/harshalgarg
PROFESSIONAL SUMMARY
- Databricks and Azure Certified Data Engineer with ~6 years in IT, experienced in end-to-end data migration projects. Interacted with clients, assisted the team with requirement gathering and timeline creation, and worked with Azure, Databricks, ETL pipelines, Apache Airflow, AWS, Power BI, SQL, Python and more.
- Worked on a cloud platform that deploys infrastructure for multiple tenants to create their own ETL service on Azure, with a DevOps tech stack that includes Azure DevOps, CI/CD pipelines, Docker, Kubernetes and Terraform.

EDUCATION
B.E. – Electrical and Electronics, UIT-RGPV, Bhopal (2014 – 2018)

SKILLS
Azure, Data Factory, Databricks, Apache Airflow
Python, SQL, Power BI
Azure DevOps, CI/CD Pipelines, Docker, Kubernetes, Terraform

CERTIFICATION
Databricks Certified Data Engineer Associate
Azure Certified Data Engineer Associate
Azure Fundamentals Certified
AWS Certified
Scrum Master Certified
Servian Toastmasters – Vice President Education

HOBBIES
Books
Cycling
Trekking
Pool & Snooker

WORK EXPERIENCE

Tiger Analytics, Remote - Data Engineer (July 2024 – current)

Database Migration (On-Prem Systems to Azure SQL Cloud) (Ongoing)
- Resource gathering and project planning stage.

Cognizant Servian, Bengaluru - Associate (Oct 2021 – July 2024)

ETL Infra Migration (Legacy Systems to Azure Infrastructure – ETL Migration)
- Built Azure Data Factory (ADF) ETL pipelines that transferred data from legacy systems to cloud storage, transformed it, and loaded it into different on-prem databases.
- Used ADF for minor data transformations and Databricks PySpark scripts for complex transformations.
- Used Apache Airflow as the orchestration framework to trigger and schedule the ADF pipelines, integrated with BMC Control-M and IBM DataStage (a minimal trigger sketch follows this list).
- Triggered Service Bus, Event Grid and Event Hub for the required pipelines.
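
For illustration only, a minimal sketch of how such an Airflow-to-ADF trigger can look, assuming Airflow 2.x with the apache-airflow-providers-microsoft-azure package; the connection, pipeline, resource group and factory names are hypothetical placeholders, not the project's actual configuration.

```python
# Minimal sketch: an Airflow DAG that triggers and monitors an ADF pipeline.
# Assumes Airflow 2.x with apache-airflow-providers-microsoft-azure installed.
# All names below are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.microsoft.azure.operators.data_factory import (
    AzureDataFactoryRunPipelineOperator,
)

with DAG(
    dag_id="legacy_to_azure_etl",
    start_date=datetime(2022, 1, 1),
    schedule_interval="@daily",  # nightly trigger of the ADF pipeline
    catchup=False,
) as dag:
    run_copy_pipeline = AzureDataFactoryRunPipelineOperator(
        task_id="run_copy_pipeline",
        azure_data_factory_conn_id="adf_default",  # Airflow connection (placeholder)
        pipeline_name="legacy_to_cloud_copy",      # ADF pipeline (placeholder)
        resource_group_name="rg-etl",              # placeholder
        factory_name="adf-etl",                    # placeholder
        wait_for_termination=True,                 # wait for the ADF run to finish
    )
```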

Platform Creation (Multi-tenant Cloud Platform)
- Creating a federated multi-tenant cloud ETL infrastructure platform that enables customers to build point-to-point integrations, data flows and pipelines.
- Each tenant is provided with its own custom, interconnected set of resources, created from Terraform templates and deployed using Azure DevOps Pipelines.
- The ETL infrastructure includes Docker containers managed via Azure Kubernetes Service, plus Storage Accounts, Data Factory, Databricks and Azure Functions, connected via managed identities and Azure user-assigned identities (UAI); see the identity sketch after this list.
- Created ADF templates that help users build data pipelines quickly and easily.
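
As a sketch of the identity-based wiring mentioned above (not the platform's actual code), a tenant workload can reach its Storage Account through a user-assigned identity instead of connection strings. Assumes the azure-identity and azure-storage-blob packages; the client ID and account URL are hypothetical placeholders.

```python
# Minimal sketch: accessing a tenant's Storage Account via a user-assigned
# managed identity (UAI), so no secrets are stored in the workload.
# Assumes azure-identity and azure-storage-blob; all names are placeholders.
from azure.identity import ManagedIdentityCredential
from azure.storage.blob import BlobServiceClient

# Authenticate as the tenant's user-assigned identity (placeholder client ID).
credential = ManagedIdentityCredential(
    client_id="00000000-0000-0000-0000-000000000000"
)

blob_service = BlobServiceClient(
    account_url="https://tenantstorage.blob.core.windows.net",  # placeholder
    credential=credential,
)

# Quick check that the identity can see the tenant's containers.
for container in blob_service.list_containers():
    print(container.name)
```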

Infosys, Pune - Data Engineer (Nov 2018 – Sep 2021)

Data Migration and Report Recreation (SAP BW to AWS Redshift – DB Migration)
- Designed the architecture, created the mapping document, and wrote DDL and DML Redshift SQL scripts to create 20+ tables and ingest the data into the new database.
- Designed and built 15 interactive Power BI reports and dashboards, wrote DAX queries, and created a workspace to deploy them, with auto-refresh set up via a gateway connection.
- Wrote a Python script running on AWS Glue that reads AWS Redshift data from an S3 bucket and generates thousands of HTML files (a minimal sketch follows this list).
- Used Azure DevOps Pipelines to automate the database refresh and HTML file creation, and Azure Boards to maintain the Agile workflow.
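
A minimal sketch of that kind of Glue job, assuming the Redshift data has been unloaded to S3 as CSV and boto3 is available; the bucket, key and column names are hypothetical placeholders, not the project's actual schema.

```python
# Minimal sketch: read Redshift data unloaded to S3 as CSV and emit one
# HTML page per row back to S3. Assumes boto3; all names are placeholders.
import csv
import io

import boto3

s3 = boto3.client("s3")

# Fetch one exported CSV object from the landing bucket (placeholder names).
obj = s3.get_object(Bucket="redshift-export", Key="reports/products.csv")
text = obj["Body"].read().decode("utf-8")

# Render a small HTML page for each row and upload it.
for row in csv.DictReader(io.StringIO(text)):
    html = (
        "<html><body>"
        f"<h1>{row['product_id']}</h1>"
        f"<p>{row['description']}</p>"
        "</body></html>"
    )
    s3.put_object(
        Bucket="report-pages",  # placeholder
        Key=f"pages/{row['product_id']}.html",
        Body=html.encode("utf-8"),
        ContentType="text/html",
    )
```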