Data Analyst Resume: Skills & Experience

Arun Kumar Dara is a Data Analyst with over 3 years of experience in delivering data solutions and insights across various sectors, specializing in data collection systems and statistical techniques. He holds a Master's degree in Data Analytics and has worked with companies like AIG and Hexaware, where he developed ETL pipelines, predictive models, and interactive dashboards. His technical skills include proficiency in Python, SQL, and various data visualization tools, alongside certifications in Power BI.


Arun Kumar Dara

(773) 453-5312 | daraarunkumar7@[Link] | [Link] | [Link]/Arundara7


SUMMARY
Data Analyst with 3+ years of experience delivering real-time insights and scalable data solutions across diverse sectors. Adept at interpreting data,
developing robust data collection systems, and applying statistical techniques to identify trends. Proven track record in building ETL pipelines, predictive
models, and interactive dashboards to drive actionable insights. Skilled at cross-functional collaboration to ensure data consistency and accuracy.
EDUCATION & CERTIFICATIONS
Indiana Wesleyan University, USA May 2025
Master of Science in Management, Data Analytics Specialization
TKR College of Engineering & Technology, India Aug 2022
Bachelor of Technology, Engineering
EXPERIENCE
AIG Jun 2025 - Present
Data Analyst USA
• Engineered real-time claims ingestion pipelines using Google Pub/Sub, Dataflow (Apache Beam), and BigQuery to interpret incoming FNOL data,
ensuring regulatory compliance with Google Cloud DLP.
• Designed and implemented a robust claims data warehouse in BigQuery with advanced SQL transformations and Soda Core validations to achieve 99%
data accuracy for effective reporting and analysis.
• Developed an anomaly detection model in Python (CatBoost) that analyzed policy, exposure, and FNOL text features, improving fraud detection precision
by 12% and reducing false positives by 23%.
• Automated and deployed anomaly scoring through Vertex AI Endpoints, integrating real-time alerts with Slack Webhooks to support data-driven decision
making by SIU teams, reducing fraud detection time by 35 minutes.
• Created and maintained interactive dashboards in Power BI, embedding row-level security and drill-down capabilities to generate accurate reports and
insightful visualizations, cutting SIU triage cycles by 14%.
• Established data governance practices by documenting lineage in Data Catalog and monitoring model drift with Evidently AI, ensuring the consistency
and integrity of data across business units.
AIG Jan 2025 - May 2025
Data Analyst Intern USA
• Assisted in designing a real-time claims ingestion pipeline using Google Pub/Sub, Dataflow, and BigQuery to ensure accurate capture and interpretation
of FNOL data with Google Cloud DLP.
• Supported the development of a claims warehouse model in BigQuery by writing SQL transformations and performing Soda Core data quality checks
to enhance data reliability and facilitate analytical reporting.
• Contributed to building an anomaly detection model in Python (CatBoost) by preparing training datasets and conducting feature engineering, helping
analyze data patterns to identify fraudulent claims more effectively.
• Collaborated on the deployment of real-time scoring in Vertex AI and the integration of fraud alerts via Slack Webhooks to streamline monitoring efforts
and deliver timely insights.
• Created interactive dashboards in Looker under mentorship to visualize claims anomalies and trends, providing actionable insights to SIU teams and
supporting data-driven decision making.
Mphasis Aug 2022 - Jul 2023
Data Analyst India
• Engineered scalable data pipelines using SQL, Apache Airflow, and dbt to integrate cross-border banking transactions into Snowflake, enhancing AML anomaly detection by 30% and reducing compliance risks through timely data interpretation.
• Developed and optimized predictive models for customer lifetime value (CLV) using Python (LightGBM, SHAP) on Databricks, consolidating structured and semi-structured CRM data to drive improved upsell conversions and insurance policy renewals.
• Standardized healthcare claims data by designing normalization engines with PySpark and FHIR/HL7 frameworks, shortening patient risk scoring cycles from 2 weeks to 3 days and enabling effective precision-medicine analytics.
• Synthesized and visualized multimodal logistics datasets via GraphQL APIs and Neo4j, constructing Tableau dashboards that supported inventory rebalancing and reduced warehouse stockouts.
• Instituted robust data quality frameworks with Great Expectations and AWS Lake Formation, automating validation checks for sensitive banking data to ensure accuracy and reduce manual exception handling by 40%.
• Designed and evaluated scenario-based A/B experiments with R (ggplot2, Statsmodels), generating interactive dashboards to present insights that optimized digital loan origination funnels and increased conversion rates.
• Accelerated advanced analytics initiatives by streamlining feature engineering for high-frequency trading signals using Python (Pandas, Dask) and Redis, reducing model training cycles by 50% and fostering wider adoption of data storytelling practices.
Hexaware Technologies Jan 2021 - Jul 2022
Data Analyst India
• Architected churn prediction models with Python (Scikit-learn, XGBoost) and AWS SageMaker, analyzing customer data to reduce attrition by 18% in the banking sector.
• Transformed healthcare and retail datasets using advanced SQL and Azure Synapse to accelerate actuarial risk assessments by 5x, ensuring data consistency and enabling strategic insights that secured $2.5M in new business and boosted cross-sell revenue by 22%.
• Engineered healthcare data lakes with Delta Lake (FHIR/HL7) and centralized Snowflake marts, streamlining compliance audits and enhancing data integrity for underwriting decisions.
• Built and visualized interactive analytics solutions by integrating manufacturing telemetry (PySpark, GraphQL APIs) and financial risk models (R Shiny, Monte Carlo simulations, ggplot2) into Power BI and Plotly Dash dashboards for real-time defect tracking.
• Strengthened governance by implementing Great Expectations and Collibra, automating financial data quality checks for SOX compliance and ensuring consistent, high-quality data for reporting.
• Enhanced credit scoring models by optimizing Databricks feature stores (utilizing UMAP and PCA) to improve precision by 15%, and led workshops that promoted the adoption of self-service analytics across teams.
PROJECTS
Patient Condition Classification Using Drug Reviews
• Designed an end-to-end NLP pipeline using Python (Pandas, NumPy, and Scikit-learn) to process 10,000+ patient drug reviews, extracting key side effects
and sentiment indicators.
• Built and optimized classification models (Logistic Regression, Random Forest, XGBoost) with GridSearchCV/RandomizedSearchCV, achieving 85%
accuracy in predicting patient conditions.
• Automated preprocessing workflows (tokenization, lemmatization, TF-IDF, sentiment scoring), cutting manual data cleaning by 40% and enabling
scalable clinical insights.
• Integrated demographic and health data to enrich predictions and deployed REST APIs via Flask for real-time drug effectiveness checks by healthcare
practitioners.
• Delivered actionable insights through interactive dashboards (Power BI, Tableau) and structured reports, improving treatment personalization and clinical
decision-making.
Stock Market Analysis for Reliance Industries
• Analyzed 8 years of stock data via Python & Yahoo Finance API, uncovering historical trends to guide trading strategies.
• Developed ARIMA, LSTM, and Random Forest forecasting models with technical indicators (MACD, RSI, and Bollinger Bands), achieving robust price
predictions.
• Automated data pipelines for continuous updates and integrated real-time dashboards in Streamlit, enabling investors to act on live forecasts.
• Designed portfolio optimization and quantitative trading algorithms, balancing returns with risk in volatile markets.
• Delivered detailed risk & performance reports, helping stakeholders evaluate scenarios and make data-driven investment decisions.
Book Recommendation System with Chatbot Integration
• Built a hybrid recommendation engine combining collaborative filtering and NLP-based content models to deliver personalized book suggestions.
• Integrated a conversational AI chatbot for seamless discovery, with A/B-tested dialogue flows that boosted user satisfaction.
• Deployed the solution via Flask & Streamlit with a PostgreSQL-backed preference database for adaptive learning of user tastes.
• Leveraged external APIs for real-time metadata enrichment and presented engagement insights through interactive dashboards.
• Enhanced UX by analyzing behavior metrics and implementing improvements, increasing user retention by 30%.
TECHNICAL SKILLS
• Programming Languages: Python, R, SQL (advanced CTEs, window functions, materialized views)
• Libraries & Frameworks: Pandas, NumPy, Scikit-learn, XGBoost, LightGBM, CatBoost, SHAP, Statsmodels, Dask, UMAP, PCA
• Visualization Tools: Power BI, Tableau, Plotly Dash, R Shiny, Looker, ggplot2
• Databases & ETL: Snowflake, Delta Lake, BigQuery, Redis, ETL pipeline design
• Cloud & Deployment: AWS SageMaker, AWS Lake Formation, Azure Synapse, Google Cloud (BigQuery, Vertex AI, Pub/Sub, Dataflow, Data Catalog,
DLP), Databricks, Apache Airflow, dbt
• Big Data Technologies: PySpark, Apache Beam, Databricks, Redis
• Collaboration & Tools: Neo4j, GraphQL APIs, Slack Webhooks, Notion, Confluence, JIRA, Excel
• Methodologies: Agile/Scrum, A/B Testing, Monte Carlo Simulation, Feature Engineering, Data Storytelling, Cross-functional Collaboration
• Data Governance & Compliance: Great Expectations, Collibra, AWS Lake Formation, Soda Core, Google Cloud DLP, SOX Compliance, FHIR/HL7
Standards
CERTIFICATIONS
• Microsoft Certified: Power BI Data Analyst Associate (PL-300), June 2025
