Harmanan Gurvinder Kohli
[email protected] | +919819527168
github.com/Harmanankohli | linkedin.com/in/harmanankohli/
Skills
Programming: Python, SQL
Libraries: Pandas, NumPy, scikit-learn, TensorFlow, LangChain, Keras, Matplotlib, Seaborn
Domains: Data Science, Machine Learning, NLP, Generative AI, Explainable AI
Cloud: Azure (Certified), GCP (Beginner)
Tools: Tableau, Excel, Streamlit, Selenium, SAP
Work Experience
Sterlite Technologies, Mumbai Oct 2024 - Apr 2025
Data Scientist
• Designed a safety stock optimization model (Z-score, AMC, lead time, SD), reducing inventory holding costs and
improving reorder accuracy.
• Applied Counterfactual Explanations (XAI) to RFQ classifications, identifying actionable parameter shifts that improved
deal conversion rates.
• Led initiatives in Generative AI and Retrieval-Augmented Generation (RAG), building internal prototypes for semantic
document search and knowledge access.
Tata Consultancy Services, Mumbai Jan 2020 - Jan 2024
Systems Engineer (SAP Consultant)
• Improved master data maintenance in SAP BPC, reducing processing time from several days to minutes by
implementing efficient data handling techniques.
• Enhanced forecasting and planning capabilities by refining depreciation and amortization calculations, cash flow
projections, and cost center allocation, leading to more accurate financial predictions.
• Streamlined financial reporting by developing input forms and reports in SAP BPC, enabling stakeholders to access
real-time insights.
• Integrated SAP Planning data with SAP S/4HANA to facilitate systematic comparisons of actual vs. planned data,
supporting strategic decisions.
• Collaborated with the SAP technical team to integrate GIS with the SAP Real Estate module (RE-FX), generating
actionable insights from spatial data.
• Assisted in the creation of SAP Fiori Apps in the Real Estate domain, collaborating with UI/UX and SAP technical teams
to enhance user experience and data accessibility.
Research Experience
Launchpad.ai | Fellowship.ai, Remote Jan 2024 - Mar 2024
Data Science Fellow
• Developed proof-of-concept projects with models such as Gemini and Phi-2, using Retrieval-Augmented Generation
(RAG) to combine LLMs with a vector database and return relevant results.
• Built a data retrieval system by integrating an LLM with a vector database using LangChain, enhancing data
accessibility and usability.
• Collaborated with team members to build agents that integrate LLMs with external tools, adding web search
capabilities.
• Designed and implemented a web scraper using Python and Selenium to collect and process product data, facilitating
data-driven decision-making.
Master’s Dissertation: Explainable AI for Predicting Malware Infection Nov 2023 - Apr 2024
Graduate Researcher
• Developed a machine learning model to predict device susceptibility to malware infections, targeting the prevention of
data breaches and financial losses.
• Focused on creating a robust, interpretable model to enhance cybersecurity by predicting and explaining device
vulnerabilities to malware.
• Utilized technical skills including Python, Data Cleaning, Exploratory Data Analysis, Natural Language Processing,
Machine Learning, Pandas, and Explainable AI.
Project Work
• Lyrics Generation using RNN / LSTM: Performed end-to-end data analysis on a lyrics dataset and trained an LSTM
model that generates lyrics from words given as input. Technologies: Python, Data Cleaning, Exploratory Data
Analysis, Deep Learning (RNN / LSTM), Natural Language Processing, TensorFlow, Pandas.
• Automated Classification of Electrical Product PDFs: Built an XGBoost model that classifies product PDFs into
electrical categories (Lighting, Fuses, Cables, or Others) based on their extracted text, and deployed it on Streamlit.
Technologies: Python, Data Cleaning, Exploratory Data Analysis, Natural Language Processing, XGBoost, Pandas,
Streamlit, Apache Tika.
Volunteer Experience
Open-Source Contributor, scikit-learn
• Issue: Ensuring that every function has a usage example in its documentation (#27892).
• Outcome: Added usage examples for f_regression() and silhouette_score().
• Pull requests: f_regression() - #28104 and silhouette_score() - #28125
Education
BITS Pilani WILP Apr 2022 - May 2024
M.Tech. in Data Science and Engineering CGPA: 8.14/10
Relevant Coursework: Data Structures and Algorithms, Machine Learning, Data Mining, Deep Learning, Natural Language
Processing
SRM University, Haryana Jul 2014 - May 2018
B.Tech. in Computer Science and Engineering CGPA: 8.14/10
Relevant Coursework: Database Management System, Data Structures and Algorithms, Software Engineering,
Data Mining, Artificial Intelligence and Expert System
Awards and Honors
• Acknowledged Contributor, The Hundred-Page Language Models Book by Andriy Burkov Jan 2025
Contributed feedback and insights during the book’s development phase. Recognized in the author’s
acknowledgments for volunteer participation.
• Tata Consultancy Services – Multiple internal recognitions including: 2021–2023
Best Team Award (2x), On-the-Spot Awards (5x), Special Initiative Award
• Semi-Finalist, TechGig Code Gladiators May 2018
Selected for semi-finals in one of India’s largest competitive programming contests. Successfully cleared multiple
algorithmic rounds and demonstrated problem-solving skills under time pressure.
Certifications
• Microsoft Certified: Azure AI Engineer Associate Jun 2025
Issued by Microsoft
• Machine Learning A-Z: AI, Python & R + ChatGPT Bonus [2023] Aug 2023
Issued by Udemy
• Data Scientist with Python Track Jul 2022
Issued by DataCamp