Profile Summary
7+ years of IT experience in software analysis, design, development, testing and implementation of Python/PySpark, SQL and Azure technologies.
Experience working with cloud services such as Azure Data Factory, Azure Synapse Analytics, Azure Databricks, Azure Logic Apps and Azure Automation Accounts.
Implemented Lambda architectures using Azure Data platform capabilities like Azure Data Lake, Azure SQL Server,
Azure ML and Power BI.
Experience in working with Developer Toolkits like Anaconda, Jupyter Notebooks and PyCharm.
Experience in the Software Development Life Cycle (SDLC) phases which include Analysis, Design, Implementation,
Testing and Maintenance.
Extensively used Microsoft Fabric, migrating notebooks and pipelines and building and maintaining the data warehouse in OneLake.
Extensively used Azure Data Lake and worked with different file formats: CSV, Parquet, ORC and Delta.
Produced business, process and data-flow documentation for multiple data systems using Lucidchart.
Designed data models on multiple database systems; provided customer support, troubleshooting and database modifications to meet customer requirements.
Enhanced existing Terraform scripts to automate AWS services including CloudFront distributions, RDS, EC2 and S3 buckets.
Expertise in Snowflake, creating and maintaining tables and views.
Hands-on experience with ML/AI algorithms such as neural networks, linear and logistic regression, support vector machines, decision trees and Naive Bayes, using R and Python libraries (Pandas, NumPy, Matplotlib, Whoosh, PySpark) to perform predictive analytics on various datasets.
Working knowledge of Azure IaaS and PaaS services, including Azure SQL and Azure Storage.
Good working experience using Spark SQL to manipulate DataFrames in Python.
Good knowledge of NoSQL databases including Cassandra and MongoDB.
Expertise in data warehouses such as Azure Synapse and Snowflake.
Worked extensively with AWS cloud services such as EC2, SNS, SQS, S3, EBS, RDS, Amazon Redshift, Lambda and CloudWatch Logs.
Good experience with Hive partitioning, bucketing and different types of joins on Hive tables, and with implementing Hive SerDes such as JSON and Avro (see the sketch at the end of this summary).
Working experience in Development, Production and QA Environments.
Experience with NoSQL column-oriented databases like HBase and their integration with Hadoop clusters.
Experience with Azure Cosmos DB for highly reliable, low-maintenance query workloads.
Developed ETL mappings for various sources (.TXT, .CSV) and loaded data from these sources into relational tables using Azure Data Factory and Data Flows.
Experience working with Azure Blob and Data Lake Storage and loading data into Azure Synapse Analytics.
Experience with Amazon Web Services, the AWS Command Line Interface and AWS Data Pipeline.
Experience writing SQL and PL/SQL queries and stored procedures for accessing and managing databases such as Oracle, MySQL, Azure Synapse and IBM DB2.
Developed Power BI reports & effective dashboards after gathering and translating end user requirements.
Used various sources to pull data into Power BI, such as SQL Server, Excel, Oracle and Azure SQL.
Experience in scripting and strong working knowledge of Python.
Expertise in developing Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
Experience with the Apache Spark ecosystem using Spark SQL, DataFrames and RDDs (Resilient Distributed Datasets), and knowledge of Spark MLlib.
Hands-on experience with Big Data ecosystems including Hadoop, MapReduce, Pig, Hive, Impala, Sqoop, Flume, NiFi, Oozie, MongoDB, ZooKeeper, Kafka, Maven, Spark, Scala, HBase and Cassandra.
Experience in installation, configuration, and deployment of Big Data solutions.
Extensive experience developing stored procedures, functions, views and triggers, and complex queries using SQL Server, T-SQL and Oracle PL/SQL.
Experience with data cleansing, data profiling and data analysis, and with SQL and PL/SQL coding.
Conducted comprehensive code reviews for software development projects, providing constructive feedback on
code quality, best practices, and performance optimizations.
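For illustration, a minimal sketch of the kind of Hive partitioning, bucketing and SerDe work described above, run through PySpark; the table names, columns and paths are hypothetical, not from any actual project:

    from pyspark.sql import SparkSession

    # Hive-enabled Spark session (assumes a configured Hive metastore)
    spark = SparkSession.builder.appName("hive-ddl").enableHiveSupport().getOrCreate()

    # Partitioned and bucketed table; bucketing on user_id also helps map-side joins
    spark.sql("""
        CREATE TABLE IF NOT EXISTS events (user_id BIGINT, action STRING)
        PARTITIONED BY (event_date STRING)
        CLUSTERED BY (user_id) INTO 32 BUCKETS
        STORED AS ORC
    """)

    # External table over raw JSON files using the Hive JSON SerDe
    spark.sql("""
        CREATE EXTERNAL TABLE IF NOT EXISTS raw_events (user_id BIGINT, action STRING)
        ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
        LOCATION '/data/raw_events'
    """)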
Technical Skills
● Cloud Platforms: Azure Data Factory, Azure Synapse Analytics, Azure Databricks, Azure Data Lake, Azure Blob Storage, Azure SQL Database, Azure SQL Data Warehouse, Azure Analysis Services; AWS (EC2, S3, Lambda, SNS, SQS, Amazon Redshift, RDS).
● Programming Languages: Python 3.6, PySpark, Scala, SQL, PL/SQL, T-SQL
● Databases: SQL Server, Oracle, IBM DB2
● Operating Systems: Windows 10/8/7, Linux
● IDE and Tools: Anaconda, Jupyter Notebook
● SDLC Methodologies: Agile, Waterfall
● Version Control: Git, Azure DevOps
● Reporting Tools: Power BI, Tableau
● Other Tools: Visual Studio 2010, Business Intelligence Development Studio 2008, SQL Server Integration Services (SSIS) 2005/2008, SQL Server Reporting Services (SSRS) 2008, SQL Server 2008 R2, Postman, PowerShell.
Work Experience
Role: Senior Data Engineer
Client: Mizuho – Iselin, New Jersey, United States
May 2024 – Present
Responsibilities
• Develop Synapse and ADF data copy pipelines, data flows and PySpark notebooks using Databricks and Synapse Analytics.
• Develop warehouses, data copy pipelines and PySpark notebooks using Microsoft Fabric.
• Created multiple scripts to automate ETL/ELT processes from multiple sources using PySpark.
• Develop stored procedures for data warehouse management and scheduling.
• Developed code in Spark SQL to implement business logic, with Python as the programming language.
• Involved in end-to-end development and automation of new ETL pipelines using SQL and PySpark.
• Worked on sequence files, map-side joins, bucketing and partitioning for Hive performance and storage improvements.
• Designing and architecting an advanced commercial banking application that pushes files from SFTP to the GDWH (Global Data Warehouse) with all necessary business applications and transformations.
• Writing PySpark code for all transformations and creating Databricks workflows for scheduling.
• Maintaining sensitive trading data, applying the necessary masking or hashing techniques, and debugging data errors by validating the calculations applied in business scenarios and resolving them (see the sketch at the end of this section).
• Maintaining trading and commodities data, transforming it, and storing it in the G-DWH on Snowflake at the corporate head office in Tokyo.
• Creating end-to-end pipelines in Azure Data Factory for orchestration.
• Extensively worked on Delta Live Tables (DLT) pipelines and Auto Loader ingestion in Databricks (sketched at the end of this section).
• Created and managed Unity Catalog in Databricks.
• Created the mapping documentation for implementing the medallion architecture.
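For illustration, a minimal sketch of the masking/hashing applied to sensitive trading data, assuming PySpark; trades_df and the column names are hypothetical:

    from pyspark.sql import functions as F

    # trades_df: hypothetical DataFrame of raw trade records
    masked = (trades_df
        # one-way SHA-256 hash of the account identifier
        .withColumn("account_id", F.sha2(F.col("account_id").cast("string"), 256))
        # mask all but the last four characters of the counterparty name
        .withColumn("counterparty", F.regexp_replace("counterparty", r".(?=.{4})", "*")))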
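And a minimal sketch of the Databricks Auto Loader ingestion mentioned above; the paths and table name are placeholders:

    # Incrementally pick up new files as they land (Auto Loader / cloudFiles)
    bronze = (spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "json")
        .option("cloudFiles.schemaLocation", "/mnt/meta/schemas/trades")
        .load("/mnt/landing/trades"))

    # Write to a bronze Delta table, processing all available files, then stop
    (bronze.writeStream
        .option("checkpointLocation", "/mnt/meta/checkpoints/trades")
        .trigger(availableNow=True)
        .toTable("bronze.trades"))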
Role: Senior Data Engineer
Client: Symphony Technology Group – California, USA
June 2023 – May 2024
Responsibilities
Develop Synapse and ADF data copy pipelines, data flows and PySpark notebooks using Databricks and Synapse Analytics.
Created multiple scripts to automate ETL/ELT processes from multiple sources using PySpark.
Develop stored procedures for data warehouse management and scheduling.
Developed notebooks to read data from Salesforce and SugarCRM using SOQL/SOSL for data warehouse management.
Developed ETL pipelines to copy data from Business Central and the EDH (Enterprise Data Hub) using REST API GET and POST methods (see the sketch at the end of this section).
Developed Azure Logic Apps to copy Tempo data from Jira to Business Central, scheduled to run on a bi-weekly basis.
Created and maintained Power BI reports for Accounting and Financials.
Developed PySpark scripts using SQL and RDDs in Spark for data analysis, storing results back into Azure Blob Storage and S3.
Performed data migration from AWS S3 to Snowflake using COPY INTO and performed data validation (sketched at the end of this section).
Used the Snowpark Python API to streamline data with Pandas and performed various data-wrangling operations.
Performed Data Transformation using Snowflake and AWS Glue.
Developed ETL pipelines in and out of the data warehouse using Python, writing complex custom SQL.
Responsible for creating a new architecture for reading data from Jira Cloud and generating Power BI reports.
Developed code in Spark SQL to implement business logic, with Python as the programming language.
Involved in end-to-end development and automation of new ETL pipelines using SQL and PySpark.
Worked on sequence files, map-side joins, bucketing and partitioning for Hive performance and storage improvements.
Developing and maintaining AWS Lambda functions, DynamoDB tables, Secrets Manager secrets and SQS queues.
Coordinating with the Finance team on new development requests, allocating work to contractors or developing it in-house, and deploying changes to production.
Compiled and executed programs as necessary using Apache Spark in Scala to perform ETL jobs with ingested data.
Worked with Data Analysts on building purposeful analytics tables in DBT for cleaner schemas.
Developed file-cleaning utilities using Python libraries.
Utilized Python libraries such as Boto3 and NumPy for AWS work.
Developing CI/CD pipelines for code merging and version control.
Implemented automation to send individual Slack messages for Jira time reporting to individual contributors and team leads (sketched at the end of this section).
Data extraction, aggregation and consolidation of sales data within the Azure environment using PySpark.
Developed PySpark code for AWS Glue jobs and for EMR.
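For illustration, a minimal sketch of the REST GET/POST copy pattern used for Business Central and the EDH; the endpoint, parameters and payload shape are hypothetical:

    import requests

    BASE = "https://example.invalid/api"  # placeholder endpoint

    # GET changed records from the source system
    resp = requests.get(f"{BASE}/records",
                        params={"modifiedSince": "2024-01-01"}, timeout=30)
    resp.raise_for_status()

    # POST each record to the target system
    for record in resp.json():
        requests.post(f"{BASE}/target", json=record, timeout=30).raise_for_status()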
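A minimal sketch of the S3-to-Snowflake load and validation, assuming the Snowflake Python connector and an existing external stage; all names and credentials are placeholders:

    import snowflake.connector

    conn = snowflake.connector.connect(account="...", user="...", password="...",
                                       warehouse="ETL_WH", database="DW", schema="STAGE")
    cur = conn.cursor()

    # Bulk-load staged S3 files, then run a simple row-count validation
    cur.execute("COPY INTO stage.orders FROM @s3_orders_stage "
                "FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)")
    cur.execute("SELECT COUNT(*) FROM stage.orders")
    print(cur.fetchone()[0])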
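And a minimal sketch of the Slack time-reporting reminder, assuming an incoming-webhook integration; the webhook URL and message text are illustrative:

    import requests

    SLACK_WEBHOOK = "https://hooks.slack.com/services/T000/B000/XXXX"  # placeholder

    def remind(user: str, hours_logged: float) -> None:
        # Per-user reminder driven by a Jira time-report check
        text = f"Reminder for {user}: only {hours_logged:.1f}h logged in Jira this week."
        requests.post(SLACK_WEBHOOK, json={"text": text}, timeout=10).raise_for_status()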
Role: Azure Data Engineer
Client: Microsoft (Capgemini) – Bangalore, India
Feb 2019 – June 2023
Responsibilities
Extract, transform and load data from source systems to Azure data storage services using a combination of Azure Data Factory, T-SQL, Spark SQL and U-SQL (Azure Data Lake Analytics); ingested data into one or more Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and processed it in Azure Databricks.
Architected and designed the data flow for collapsing four legacy data warehouses into ADLS.
Designed the ELT strategy into an analytics sandbox and the business intelligence facility (EDW).
Worked on migrating the on-prem data warehouse to Azure Synapse.
Built ETL processes entirely in T-SQL and used SSIS for the different loads.
Developing data pipelines in Azure Data Factory using SAS or PySpark (Spark 2.2 and Spark SQL).
Designed and built a reusable ETL process by developing Azure Synapse pipelines to move data from on-prem to Azure Data Lake.
Managed and optimized Fabric data workflows, ensuring high performance and minimal downtime in data
processing tasks.
Coding, testing, debugging, documenting and maintaining SAS and Python programs.
Extracted data from databases using SAS SQL procedures and Spark Core libraries to manipulate, aggregate and merge datasets.
Designing reports and developing insightful visualizations using SAS Visual Analytics and Power BI.
Wrote and utilized PowerShell scripts and Azure CLI scripts to automate workloads.
Automated processes using PowerShell and Python scripts integrating with third-party APIs.
Created automated PowerShell scripts for managing recurring activity on the SQL database.
Extract, transform and load data from source systems to Azure data storage services using a combination of Azure Data Factory, Palantir Foundry, Spark SQL and U-SQL (Azure Data Lake Analytics).
Loaded and transformed large sets of semi-structured data such as XML, JSON, Avro and Parquet.
Experienced in creating elastic pool databases and scheduling elastic jobs to execute T-SQL procedures.
Integrated Custom Visuals based on business requirements using Power BI desktop.
Instituted a process of continuous iterative exploration and investigation of business metrics to identify patterns
and key performance indicators for data driven strategic decisions using AI, ML, and data mining techniques
(predictive modeling, natural language processing).
Conducted performance tuning and optimization of data transformations in Microsoft Fabric, improving
processing speed and resource utilization.
Developed complex SQL queries using stored procedures, common table expressions (CTEs) and temporary tables to support Power BI and SSRS reports, following ANSI SQL standards.
Created interactive report applications using the R Shiny package.
Wrote templates for Azure infrastructure as code using Terraform to build staging and production environments; integrated Azure Log Analytics with Azure VMs to monitor log files, store them and track metrics, and used Terraform to manage various cloud infrastructure resources.
Built and deployed data pipelines on Azure to enable AI and ML capabilities.
Used SSIS variables to capture start time, end time and failure of a package and execute the package dynamically.
Designed SSIS packages to transfer data from DB2, flat files and Excel documents to the staging area in SQL Server 2012 and 2014.
Developed complex calculated measures using Data Analysis Expression language (DAX).
Used the Python modules urllib, urllib2 and Requests for web crawling. Experience using ML techniques including clustering, regression, classification and graphical models.
Embedded Power BI reports in a SharePoint portal page and managed access to reports and data for individual users using roles.
Designed cloud-based solutions in Azure by creating Azure SQL databases, setting up elastic pool jobs and designing tabular models in Azure Analysis Services.
Excellent knowledge of all the phases of Software Development Life Cycle (SDLC).
Experienced in migrating SQL databases to Azure Data Lake, Azure Data Lake Analytics, Azure SQL Database, Databricks and Azure SQL Data Warehouse; controlling and granting database access; and migrating on-premises databases to Azure Data Lake Store using Azure Data Factory.
Developed Spark applications in Python (PySpark) on a distributed environment to load large numbers of CSV files with different schemas into Hive ORC tables (see the sketch at the end of this section).
Experienced in developing Spark applications using Spark SQL in Databricks for data extraction, transformation and aggregation from multiple file formats, analyzing and transforming the data to uncover insights into customer usage patterns.
Good understanding of Spark architecture, including Spark Core, Spark SQL, DataFrames, Spark Streaming, driver and worker nodes, stages, executors and tasks.
Created the required Hive tables, loaded data and wrote Hive queries.
Created Databricks notebooks using ANSI SQL and Python, and automated notebook runs.
Encoded and decoded JSON objects using PySpark to create and modify DataFrames in Apache Spark (sketched at the end of this section).
Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and PySpark.
Involved in data migration from on-prem to the Azure cloud.
Provided 24x7 production support to ensure timely and error-free delivery of reports.
Created pipelines in ADF using linked services, datasets and pipelines to extract, transform and load data between sources such as Azure SQL, Blob Storage and Azure SQL Data Warehouse, including write-back.
Connected Azure SQL to Power BI and built Power BI reports using SQL.
Developed and maintained multiple Power BI dashboards/reports and content packs.
Prepared the necessary pipelines to trigger containers with Azure Data Factory.
Experienced in performance tuning of Spark applications: setting the right batch interval, choosing the correct level of parallelism and tuning memory (sketched at the end of this section).
Managed storage in Azure Data Lake Storage (ADLS), delivering processed data using highly optimized storage techniques.
Developed JSON definitions for deploying pipelines in Azure Data Factory (ADF) that process data using the SQL activity.
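For illustration, a minimal sketch of loading CSV files with differing schemas into a Hive ORC table with PySpark; the paths, table and column names are hypothetical:

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.enableHiveSupport().getOrCreate()

    # Read a batch of CSVs, inferring the schema, and tag each row with its load date
    df = (spark.read.option("header", True).option("inferSchema", True)
          .csv("/landing/feed/*.csv")
          .withColumn("load_date", F.current_date()))

    # Land the result in a partitioned Hive ORC table
    (df.write.format("orc")
       .mode("append")
       .partitionBy("load_date")
       .saveAsTable("dw.feed_orc"))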
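A minimal sketch of the JSON encode/decode pattern in PySpark; raw_df (with a payload_json string column) and the schema are illustrative:

    from pyspark.sql import functions as F
    from pyspark.sql.types import StructType, StructField, StringType, DoubleType

    schema = StructType([StructField("symbol", StringType()),
                         StructField("price", DoubleType())])

    # Decode a JSON string column into a struct, modify a field, then re-encode
    decoded = raw_df.withColumn("payload", F.from_json("payload_json", schema))
    modified = decoded.withColumn("payload",
                                  F.col("payload").withField("symbol", F.upper("payload.symbol")))
    encoded = modified.withColumn("payload_json", F.to_json("payload"))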
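And a minimal sketch of the Spark tuning knobs referenced above (batch interval, parallelism, memory); the values shown are illustrative, not prescriptive:

    from pyspark.sql import SparkSession
    from pyspark.streaming import StreamingContext

    spark = (SparkSession.builder
        .appName("tuned-job")
        .config("spark.sql.shuffle.partitions", "200")  # level of parallelism for shuffles
        .config("spark.executor.memory", "8g")          # per-executor memory
        .getOrCreate())

    # Legacy DStream jobs fix the batch interval when the context is created
    ssc = StreamingContext(spark.sparkContext, batchDuration=30)  # 30-second batches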
Role: Data Analyst
Client: Varroc Group – Bangalore, India
May 2018 – Jan 2019
Responsibilities
Wrote complex SQL queries, stored procedures, triggers, views and indexes using DML and DDL commands and user-defined functions to implement the business logic.
Worked on SQL joins, subqueries and Common Table Expressions (CTEs).
Worked on designing databases and relationships between tables.
Wrote T-SQL queries and procedures to generate DML scripts that modified database objects dynamically based on inputs.
Created SSIS packages to import and export data from various CSV files, flat files, Excel spreadsheets and SQL Server.
Designed and developed different types of reports, such as matrix, tabular and chart reports, using SSRS.
Developed various automated scripts for DI (data ingestion) and DL (data loading) using Python and Java MapReduce.
Worked with different file formats such as JSON, Avro and Parquet, and compression techniques like Snappy.
Worked closely with various teams across the company to identify and solve business challenges.
Gathered and analyzed client requirements, understanding and converting them into database solutions, with solid exposure to architecture, analysis, design, testing, development, deployment, documentation and implementation.
Imported data from different data sources to create Power BI reports and dashboards.
Role: Junior Research Engineer
Client: DRDO (Defence Research and Development Organisation) – Hyderabad, India
Aug 2017 – April 2018
Responsibilities:
Assist in developing and maintaining software applications used for defense-related purposes, such as command
and control systems, simulation software, or data analysis tools.
Collaborate with senior developers and engineers to design, implement, and test software solutions.
Write code, debug programs, and troubleshoot software issues under the guidance of experienced team
members.
Conduct research on emerging technologies, software frameworks, and best practices relevant to defense
applications.
Analyze data, algorithms, and software architectures to identify potential improvements or optimizations for
existing systems.
Support system integration efforts by configuring software components, deploying applications, and ensuring
compatibility with hardware systems.
Participate in software testing activities, including unit testing, integration testing, and system validation, to verify
software functionality and performance.
Document test cases, test results, and software configurations to support quality assurance and compliance
requirements.
Learn and apply principles of cybersecurity and information assurance to ensure the security and integrity of
defense-related software systems.
Assist in identifying and mitigating security vulnerabilities, conducting code reviews, and implementing security
controls to protect sensitive data and systems.
Education Details:
Bachelor's in Mechanical Engineering, JNTU Anantapur, India, 2018