
Kiran

GCP/Hive/Python/Spark/Informatica/Sqoop and Tableau


Data Engineer

Professional Summary
 Innovative and experienced developer with 14+ years of experience in Data Warehouse projects.
 Strong experience with major components of the Big Data ecosystem, including Hadoop, Hive, Sqoop, Spark with Python, Google Dataproc, Google BigQuery, and other GCP components.
 Experience in data management and implementation of Big Data applications using Airflow; developed custom operators by extending existing Airflow operators.
 Certified GCP Professional Data Engineer and GCP Associate Cloud Engineer.
 Developed data pipelines and migrated existing legacy pipelines to GCP Composer
(Airflow).
 Migrated SAS programs to Airflow & BigQuery.
 Experience in spinning up ephemeral GCP Dataproc clusters for PySpark jobs.
 Configured and managed IAM policies at the resource level using Deployment Manager.
 Deployed GCP components (Cloud Functions, BigQuery datasets, Dataproc) using GCP Deployment Manager.
 Worked on building a data lake, extracting data from traditional databases into the HDFS environment and performing data transformations using Hive.
 Experience in importing and exporting data using Sqoop from RDBMS to HDFS and
vice-versa.
 In-depth understanding of Hadoop Architecture including YARN and various
components such as HDFS, Resource Manager, Node Manager, Name Node, Data Node.
 Developed Spark Streaming applications using Kafka (see the sketch after this summary).
 Experienced in developing visualization reports using Tableau.
 Handled different data formats such as Avro, Parquet, and ORC.
 Experienced in Design, Development, Testing and Maintenance of various Data
Warehousing and Business Intelligence (BI) applications in complex business environments
using Informatica.
 Well versed in Conceptual, Logical/Physical, Relational, and Multi-dimensional
modeling, Data analysis and Data Transformation (ETL).
 Extensively worked on ETL mappings, analysis, and documentation of OLAP report requirements. Solid understanding of OLAP concepts and challenges, especially with large data sets.
 Implemented complex business rules by developing robust mappings/mapplets
using various transformations like Unconnected and Connected lookups, Normalizer,
Source Qualifier, Router, Filter, Expression, Aggregator, Joiner, Update Strategy etc.
 Proficient in developing Entity-Relationship diagrams, Star/Snowflake Schema
Designs, and expert in modeling Transactional Databases and Data Warehouse.
 Excellent technical, logical, code debugging and problem-solving capabilities.
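
For illustration, below is a minimal PySpark Structured Streaming sketch of the Kafka-based streaming pattern referenced in this summary. The broker address, topic name, event schema, and output paths are assumed placeholders rather than details from any specific project, and the job assumes the spark-sql-kafka connector is available.

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import DoubleType, StringType, StructType

# Minimal sketch: consume a Kafka topic with Structured Streaming and persist
# the parsed events as Parquet. All names and paths are hypothetical.
spark = SparkSession.builder.appName("sales-events-stream").getOrCreate()

event_schema = (StructType()
                .add("order_id", StringType())
                .add("store_id", StringType())
                .add("amount", DoubleType()))

raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker-1:9092")  # assumed broker
       .option("subscribe", "sales_events")                  # assumed topic
       .option("startingOffsets", "latest")
       .load())

events = (raw
          .select(from_json(col("value").cast("string"), event_schema).alias("e"))
          .select("e.*"))

query = (events.writeStream
         .format("parquet")
         .option("path", "/data/streams/sales_events")                 # assumed output path
         .option("checkpointLocation", "/data/checkpoints/sales_events")
         .outputMode("append")
         .start())

query.awaitTermination()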

Technical Skills Summary


 Big Data: GCP BigQuery, Cloud Functions, Spark (PySpark),
Airflow, Hive, Sqoop.
 Programming: Python, Shell scripting, PL/SQL.
 RDBMS: Oracle 10g & 11g, Teradata.
 ETL: Informatica, DataStage.
 Visualization Tools: Tableau.

WORK EXPERIENCE
08/2020 – Present
Data Engineer

Client: Verizon
November 2021 to Present
PROJECT DESCRIPTION:
Developing data pipelines to process sales, offers, and customer data to build analytical tables.

Responsibilities:
 Worked closely with Product Owners (POs) on requirements, Jira stories, and source-to-target mappings (STMs); analyzed STMs against the source databases.
 Developed custom Airflow operators by extending existing Airflow operators to perform validations and execute multiple BigQuery SQL statements with a single operator (see the sketch after this list).
 Created Airflow operators (in Python) to fetch data via API calls into GCS buckets and then load the data into BigQuery.
 Developed common reusable python functions to utilize in Airflow DAGs.
 Developing Airflow (GCP Composer) jobs to process BigQuery data.
 Migrating Teradata Stored Procedures to Airflow.
 Migrated oozie workflows to GCP Airflow.
 Migrated Hive scripts to BigQuery.
 Implemented BigQuery policy tags to restrict access to PII fields.
 Processed privacy data and built analytical tables for privacy teams.
 Reviewed and approved code changes developed by peers.
 Prepared design documents and documented all work and changes.
 Prepared scripts to automatically validate data and identify data quality issues.
 Developed Airflow DAGs to get the data from APIs to GCS (Google Cloud Storage)
buckets.
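
Below is a minimal sketch of the custom-operator pattern described in this list: an operator that runs several BigQuery SQL statements and an optional validation query in a single task. The class name, parameters, and validation rule are hypothetical illustrations, not the project's actual code.

from airflow.models.baseoperator import BaseOperator
from google.cloud import bigquery


class MultiBigQuerySqlOperator(BaseOperator):
    """Sketch: run a list of BigQuery SQL statements, then an optional validation query."""

    def __init__(self, sql_statements, validation_sql=None, project_id=None, **kwargs):
        super().__init__(**kwargs)
        self.sql_statements = sql_statements
        self.validation_sql = validation_sql
        self.project_id = project_id

    def execute(self, context):
        client = bigquery.Client(project=self.project_id)
        for sql in self.sql_statements:
            self.log.info("Running BigQuery SQL: %s", sql)
            client.query(sql).result()  # block until the query job finishes
        if self.validation_sql:
            rows = list(client.query(self.validation_sql).result())
            if not rows:
                raise ValueError("Validation query returned no rows")
            self.log.info("Validation passed with %d row(s)", len(rows))

In a DAG, such an operator would be given a task_id, the list of SQL statements, and an optional validation query, replacing several chained BigQuery tasks with one.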

Environment: Google Cloud Platform (GCP) BigQuery, Airflow, GitHub, Python.

Client: BestBuy
August 2020 to November 2021
PROJECT DESCRIPTION:
Developing data pipelines to ingest data using API calls and process the data to build the feature
store for AI/ML needs.
Responsibilities:
 Developed PySpark jobs to process sales and offers data and prepare tables for analyzing campaigns and their impact.
 Developed Cloud Functions (Python) to validate feature-definition files and then push them into GCS buckets (see the sketch after this list).
 Worked closely with Data Scientists to build the feature store.
 Created Airflow operators (in Python) to fetch data using API calls, load it into GCS buckets, and then into BigQuery.
 Developed logging for Airflow DAG executions using Cloud Functions and Pub/Sub topics.
 Migrated SAS programs to Airflow DAGs and BigQuery.
 Prepared deployment scripts to deploy and manage GCP components using GCP Deployment Manager.
 Built Tableau reports per the business team's requirements.
 Developed Airflow DAGs to get the data from APIs to GCS (Google Cloud Storage)
buckets.
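
Below is a minimal sketch of the file-validation Cloud Function described in this list: a background function triggered by an upload to a staging bucket that checks a feature-definition CSV and copies valid files to a curated bucket. The bucket names, required columns, and trigger shape are assumptions for illustration.

import csv
import io

from google.cloud import storage

REQUIRED_COLUMNS = {"feature_name", "data_type", "source_table"}  # assumed schema
CURATED_BUCKET = "feature-store-curated"                          # assumed bucket


def validate_feature_file(event, context):
    """Sketch of a GCS-triggered background Cloud Function (1st-gen event signature)."""
    client = storage.Client()
    src_bucket = client.bucket(event["bucket"])
    blob = src_bucket.blob(event["name"])

    # Read only the header row of the uploaded CSV and check the required columns.
    header = next(csv.reader(io.StringIO(blob.download_as_text())), [])
    missing = REQUIRED_COLUMNS - {h.strip() for h in header}
    if missing:
        # Leave the bad file in staging; the error surfaces in the function logs.
        raise ValueError(f"Feature definition {event['name']} missing columns: {missing}")

    # Copy the validated file into the curated bucket for downstream pipelines.
    src_bucket.copy_blob(blob, client.bucket(CURATED_BUCKET), new_name=event["name"])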

Environment: Google Cloud Platform (GCP) BigQuery, Airflow, GitHub, Python.

WIPRO TECHNOLOGIES 11/2012 – 07/2020


Data Engineer

Client: Kohl’s, CA, USA.


February 2018 to July 2020
PROJECT DESCRIPTION:
The scope of this project is to process sales and customer data to analyze customer behavior (Customer 360), offer impacts (loyalty programs), and to generate data for campaigns.
Responsibilities:

 Drove end-to-end data analytics solutions, including gathering business requirements, implementing solutions, developing reports, and deploying and monitoring data pipelines.
 Analyzed and processed customer data to provide a Customer 360 view for the marketing team using Hive.
 Processed transaction data sourced from the Mosaic (e-com) and SalesHub systems to generate analytical sales reports using Spark and Python.
 Processed the demand, verified, and offers data to analyze the various loyalty programs using Spark.
 Implemented an incremental model for multiple databases to ingest data from GCP MySQL sources into GCS buckets using Sqoop.
 Developed applications that spin up ephemeral Google Dataproc clusters for running PySpark jobs (see the sketch after this list).
 Generated Tableau reports from the aggregated data to provide visual representations.
 Migrated the aggregated data from Hive to BigQuery for faster query responses for the business teams.
 Performed testing and led the QA team.
 Analyzed and fixed code to optimize flows, avoiding long processing times and wasted cluster resources.
 Analyzed YARN to understand the major bottlenecks.
 Developed POCs on Airflow and then migrated the Data Pipelines to Airflow.
 Collaborated with the infrastructure, network, database, application, and BI teams
to ensure data quality and availability.
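
Below is a minimal sketch of the ephemeral-cluster pattern referenced in this list: create a short-lived Dataproc cluster, submit one PySpark job, and delete the cluster afterwards. The project ID, region, machine types, and job URI are placeholder assumptions.

from google.cloud import dataproc_v1

PROJECT = "my-project"                                  # assumed project
REGION = "us-central1"                                  # assumed region
CLUSTER_NAME = "ephemeral-pyspark"
PYSPARK_URI = "gs://my-bucket/jobs/process_sales.py"    # assumed job file

endpoint = {"api_endpoint": f"{REGION}-dataproc.googleapis.com:443"}
cluster_client = dataproc_v1.ClusterControllerClient(client_options=endpoint)
job_client = dataproc_v1.JobControllerClient(client_options=endpoint)

cluster = {
    "project_id": PROJECT,
    "cluster_name": CLUSTER_NAME,
    "config": {
        "master_config": {"num_instances": 1, "machine_type_uri": "n1-standard-4"},
        "worker_config": {"num_instances": 2, "machine_type_uri": "n1-standard-4"},
    },
}

# Create the cluster and wait until it is ready.
cluster_client.create_cluster(
    request={"project_id": PROJECT, "region": REGION, "cluster": cluster}
).result()

try:
    # Submit the PySpark job and wait for it to complete.
    job = {
        "placement": {"cluster_name": CLUSTER_NAME},
        "pyspark_job": {"main_python_file_uri": PYSPARK_URI},
    }
    job_client.submit_job_as_operation(
        request={"project_id": PROJECT, "region": REGION, "job": job}
    ).result()
finally:
    # Always tear the cluster down so it stays ephemeral.
    cluster_client.delete_cluster(
        request={"project_id": PROJECT, "region": REGION, "cluster_name": CLUSTER_NAME}
    ).result()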

Environment: Google Cloud Platform (GCP) BigQuery, HDFS, Hive, Sqoop, Bitbucket, Jenkins, Linux shell, Teradata, Python, Agile and Scrum model.

Client: Walmart, AR, USA.


January 2016 to January 2018
PROJECT DESCRIPTION:
The scope of this project is to move data from RDBMS to a Hadoop cluster and convert existing ETL jobs to Hive scripts and Spark applications for data analysis.
Responsibilities:
 Used Sqoop to import data from RDBMS (Informix, Oracle, and Teradata) into HDFS and later analyzed the data using various Hadoop components; also automated the steps to import data from the various databases.
 Extensively worked on creating Hive external and internal tables and then applied
HiveQL to aggregate the data.
 Migrated ETL jobs to Hive scripts for transformations, joins, aggregations.
 Implemented partitioning, dynamic partitions, and buckets in Hive to increase performance and organize data logically.
 Involved in converting Hive/SQL queries into Spark transformations using Spark SQL (see the sketch after this list).
 Handled importing of data from various data sources, performed data control checks
using Spark and loaded data into HDFS.
 Developed basic reports in Tableau.
 Worked with datasets in Hive, creating, loading, and saving datasets using different dataset operations.
 Worked with JIRA and Git.
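
Below is a minimal sketch of the Hive-to-Spark conversion mentioned in this list: the kind of aggregation a HiveQL script would run, expressed through Spark SQL with Hive support and written back as a partitioned table. Database, table, and column names are assumed for illustration.

from pyspark.sql import SparkSession

# Sketch: run a HiveQL-style aggregation through Spark SQL and save the result
# as a partitioned Hive table. All object names are hypothetical.
spark = (SparkSession.builder
         .appName("hive-to-spark-sql")
         .enableHiveSupport()
         .getOrCreate())

daily_sales = spark.sql("""
    SELECT store_id,
           sales_date,
           SUM(amount)              AS total_amount,
           COUNT(DISTINCT order_id) AS order_count
    FROM   retail.sales_transactions
    GROUP BY store_id, sales_date
""")

(daily_sales.write
    .mode("overwrite")
    .partitionBy("sales_date")
    .saveAsTable("retail.daily_sales_agg"))
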
Environment: HDFS, Sqoop, Hive, Informix, Oracle, pySpark, Tableau.

Client: Schneider Electric, Hyderabad, India.


November 2012 to December 2015
PROJECT DESCRIPTION:
The scope of this project is to migrate data from the legacy systems to SAP. As part of the migration, the data has to be prepared per the SAP structure and validated before being loaded into the SAP system.
RESPONSIBILITIES:
As part of data preparation (using DataStage), I fulfilled the following responsibilities:
 Prepared UNIX scripts to validate the source files and move them to their respective source folders.
 Scheduled the UNIX scripts to run the required jobs at the scheduled times.
 Developed PL/SQL programs to perform validations at the database level.
 Tuned the PL/SQL programs to improve performance.
 Designed and developed jobs using DataStage Designer to load data from different source files into the target database.
 Extensively worked on performance tuning of DataStage jobs and sessions.
 Ran DataStage jobs to process the data extracted (biweekly) from the legacy systems and make it suitable for loading into SAP systems.
 Monitored and validated the load process for each source extract and fixed any issues that arose.
 Archived files and maintained the development environment using shell scripts.

As part of data validation (using Informatica), I fulfilled the following responsibilities:
 Developed PL/SQL procedures to perform file validation checks.
 Involved in Informatica development, administration, and fixing production issues.
 Designed and developed Informatica 8.x mappings to extract, transform, and load data into Oracle 10g target tables.
 Worked on Informatica PowerCenter client tools such as Designer, Workflow Manager, Workflow Monitor, and Repository Manager.
 Implemented the slowly changing dimensions (SCD) methodology and developed mappings to keep track of historical data.
 Involved in performance tuning of the Informatica mappings.
 Expertise in using TOAD and SQL for accessing the Oracle database.
 Used TOAD to run SQL queries and validate the data in Data warehouse and Data
mart.

Environment: Informatica PowerCenter v9.1, DataStage v8.5, flat files, Oracle 10g & 11g, PL/SQL, UNIX shell programming.
Tata Consultancy Services Ltd 09/2011 – 10/2012
Senior Software Engineer

Client: Agilent Technologies


September 2011 to October 2012
PROJECT DESCRIPTION:
The project comprises three data marts. As part of the data mart team, the main responsibility is to work on user requests to enhance the functionality of Informatica mappings. It also involves new development of mappings and enhancement of the existing mappings as per the client's requirements.
RESPONSIBILITIES:
 Designed and developed Informatica mappings to extract, transform, and load data; both source and target were based on Oracle.
 Used various transformations such as Source Qualifier, Aggregator, Expression, Joiner, Connected and Unconnected Lookup, Filter, Sequence Generator, Router, Update Strategy, Union, and Stored Procedure to develop the mappings.
 Developed several Mappings and Mapplets using corresponding Source, Targets and
Transformations.
 Performed Unit Testing and wrote various test cases and precise documentation to
outline the dataflow for the mappings.
 Created various DDL Scripts for creating the tables with indexes and partitions.
 Created PL/SQL packages, Stored Procedures and Triggers for data transformation
on the data warehouse.
 Effectively worked on the performance tuning of the mappings for better
performance. Followed standard rules for performance tuning.
 Migrated the mappings across environments: development, testing, UAT, and production.
 Used parameter files to provide the details of the source and target databases and other parameters.
 Prepared daily, weekly, and monthly reports.

Environment: Informatica v8.x, CSV files, Oracle 9i, PL/SQL programming.

Infosys Technologies 07/2008 – 08/2011
Senior System Engineer

Client: Microsoft
July 2008 to August 2011
PROJECT DESCRIPTION:
The scope of this project is to integrate various source systems and develop the Datastage jobs to
implement the Business rules while loading data into the target systems.
RESPONSIBILITIES:
 Automated repetitive tasks using shell scripts.
 Design and development of jobs using DataStage Designer to load data from
different heterogeneous source files to target databases.
 Implemented many transformation activities in DataStage before loading the data
into various dimensions and fact tables.
 Used Transformer, Aggregator, Merge, Sequential File, and Sort stages in designing jobs.
 Used Row Generator and Peek stages while testing the job designs.
 Worked with Local and Shared Containers.
 Used parallel processing capabilities, Session-Partitioning and Target Table
partitioning utilities.
 Created Reusable Transformations using Shared Containers.
 Developed PL/SQL stored procedures.
 Identified bottlenecks and performance tuned PL/SQL programs.

Environment: DataStage v7.x, flat files, Oracle 9i, PL/SQL, UNIX shell scripting.

Education
Bachelor of Engineering 2004 - 2008
CBIT, Osmania University, India
