ARPITA KESHARWANI
LinkedIn | GitHub | GitLab | Medium | Portfolio
EXPERIENCE
• Indraprastha Institute of Information Technology March 2025 - Present
Machine Learning Researcher New Delhi, India
◦ Reviewed risk calculators in Multiple Myeloma (MM).
◦ Development and validation of a novel risk stratification model for MM.
• Cluster Innovation Centre February 2024 - August 2024
Undergraduate Mathematics Researcher New Delhi, India
◦ Developed a high-fidelity novel pollution model with variable flow rates using partial differential
equations, focused on River Yamuna’s data.
◦ Applied curve fitting and graphical data extraction for accurate parameter estimation from real-world
data, achieving high precision and improved accuracy and diversity compared to existing
models.
• Major League Hacking September 2023 - December 2023
Open Source Software Engineer Link Remote
◦ Configured and documented processes and commands for Apache Airflow while working in the
SWE-23.Fall.A.1 pod, increasing the number of OSS contributions by 20%.
◦ Optimized an Apache Airflow workflow by rewriting Bash scripts in Python, improving
efficiency and reducing complexity in collaboration with the Royal Bank of Canada.
• Indian Academy of Sciences May 2023 - August 2023
AI Researcher Link New Delhi, India
◦ Collected and curated large-scale fundoscopy image data from US and Shanghai datasets covering a total
of 600 patients, applying extensive data preprocessing and augmentation techniques using TensorFlow,
Keras, Scikit-learn, OpenCV, NumPy, and Pandas.
◦ Developed a deep learning model using the MobileNet architecture combined with dense blocks and an SVM
classifier for hypertensive retinopathy detection, achieving 96% accuracy (a 10% improvement over
baseline models), with Specificity of 96%, Sensitivity of 83%, and an
F1-score of 88%.
• BeyondExams June 2022 - November 2022
ML Intern Link New Delhi, India
◦ Engineered a rule-based Machine Learning model for automating the classification of 150,000 videos
from 50 diverse YouTube channels, utilizing Python and Jupyter; reduced manual tagging time by 75%,
enhancing team efficiency.
◦ Leveraged Natural Language Processing (NLP) techniques to extract video metadata from YouTube and
built a custom data pipeline, achieving a 93% classification accuracy.
EDUCATION
• University of L’Aquila September 2025 - July 2027
InterMaths Erasmus Mundus Joint Masters L’Aquila, Italy
• Cluster Innovation Centre, University of Delhi November 2021 - July 2025
Bachelor of Technology (Information Technology and Mathematical Innovations) with minor in Management New Delhi, India
◦ GPA: 9.46/10.00
◦ Coursework: Data Structures and Algorithms (C), Prob & Stat (Py, R), Intro to CS (C), Object Oriented Programming
(C++, Java), Discrete Mathematics, Linear Algebra w/Computational Applications (MATLAB), Partial
Differential Equations, Complex Analysis and Game Theory, AI and Machine Learning (Py), Software
Development, Computational Social Systems
PUBLICATIONS C=CONFERENCE, J=JOURNAL, S=IN SUBMISSION, W=CURRENTLY WRITING
[S.1] Arpita Kesharwani and Prof. Shobha Bagai. (2024). Revisiting the Lake Pollution Model with a variable flow
rate: A case study on the river Yamuna in the Delhi NCR region. Manuscript submitted for publication in
International Journal of Mathematical Education in Science and Technology.
[S.2] Arpita Kesharwani, Abhishek Bhardwaj and Prof. Mahima Kaushik. (2025). Computational Framework for
Predicting 2D and 3D Structures of Aptamers: Advancing Biomolecular Design for DNA/RNA
Therapeutics. Manuscript submitted for publication in Journal of Molecular Graphics and Modelling.
OPEN SOURCE WORK
Wikimedia (Link) December 2024
Bulk User Information Fetcher
• Developing a tool to fetch the Wikimedia contributions of editors and technical contributors across all wiki
projects, namespaces, and time periods, along with their user rights and working status.
Scribe-Org October 2024 - November 2024
Data Extraction, Collection, and Software Development
• Developed and expanded SPARQL queries to support data extraction in 10+ languages, facilitating multilingual data
analysis, with an average of approximately 2.63 million data items per language and a total of over 31 million data
items covered.
• Authored unit testing guidelines that increased code coverage by 20% and added CLI documentation for 3 primary
commands, enhancing the usability and developer experience of Scribe-Data CLI.
Hacktoberfest October 2024
Data Extraction and Collection
• Contributed 38 pull requests (34 accepted), refactoring 2,000+ lines of code and authoring 5 documentation updates
to aid 40+ contributors. Developed SPARQL queries supporting 10+ languages, implemented unit testing guidelines
to boost coverage by 20%, and created CLI documentation for 3 key commands, enhancing usability for Scribe-Org.
Bokeh, Outreachy March 2023 - May 2023
Data Visualization
• Performed data visualization on the New York taxi trip dataset.
PROJECTS
• SoleSuite (Link) September 2024 - November 2023
Tools: [HTML, CSS, JavaScript, SQL]
◦ Developed a customizable ERP software tailored for the local shoe industry in Agra, optimized for medium
enterprises by avoiding superfluous features like those in Tally.
◦ Enhanced operational efficiency by focusing on relevant business functions and ease of use.
• Serenity Pathways (Link) January 2024 - April 2024
Tools: [React, Node.js, MongoDB]
◦ Designed and developed a website dedicated to spreading awareness about substance abuse, providing resources
and educational materials.
◦ Focused on an engaging, user-friendly interface to maximize accessibility and impact.
• Cricketer’s Retirement Age Prediction (Link) January 2023 - May 2023
Tools: [Neural Networks, Jupyter GitLab, TensorFlow, Python]
◦ Developed an LSTM-based approach to predict the retirement age of cricket batters using historical Test match
performance data for 150 batsmen.
◦ Built two LSTM models: one to predict the total number of innings played in a career and another to predict the
retirement age using the output of the first model. The models achieved accuracies of 85% and 94%,
respectively.
• Frieze Pattern: Generation and Recognition (Link) January 2023 - May 2023
Tools: [OpenCV, Jupyter GitLab, Python]
◦ Pioneered an image-based recognition system for the 7 Frieze patterns, implementing novel algorithms for
automatic generation and recognition, contributing to the field of mathematical patterns in nature, and
deployed it as a web app.
◦ For pattern recognition, implemented image processing techniques and Canny Edge Detection to extract motifs
and identify the symmetry type of Frieze patterns from any input photograph.
◦ For pattern generation, developed an algorithm to take input images and apply specific symmetry rules to
concatenate and form consistent Frieze Patterns with 100% accuracy and precision.
• FenderShoesWebsite (Link) September 2022 - November 2022
Tools: [HTML, CSS, JavaScript, MySQL]
◦ Created a marketplace website enabling direct customer purchase from the manufacturer, streamlining the supply
chain for Agra’s local shoe business.
◦ Improved business reach and customer experience through an intuitive online platform.
• Land-Use and Land-Cover Classification (Link) August 2022 - December 2022
Tools: [ArcGIS, QGIS]
◦ Analyzed the land-use and land-cover classification of Wayanad and Nilgiri and adjoining districts for the years
2014 and 2019. Also classified the area in terms of vegetation, built-up area, water bodies, and wetlands.
• Game-X (Link) April 2022 - May 2022
Tools: [Python, Object-Oriented Programming]
◦ Developed a comprehensive gaming platform featuring five classic games: Flappy Bird, Doodle Jump, Fruit Ninja,
Tic-Tac-Toe, and Battleship.
◦ Utilized object-oriented programming principles to create modular, maintainable, and efficient game code.
◦ Enhanced gameplay mechanics to ensure smooth user experiences across all game modules.
• Facial Recognition Using MATLAB (Link) June 2022 - July 2022
Tools and concepts: [MATLAB, PCA]
◦ Developed a facial recognition system in MATLAB using Principal Component Analysis (PCA), utilizing MATLAB’s
image processing and machine learning tools to reduce computational complexity, resulting in 90%
improved recognition accuracy.
SKILLS
• Programming Languages: Python (6 years), Java, C, R
• Web Technologies (2 years): HTML, CSS, JavaScript, Full Stack Development
• Database Systems: MySQL (6 years), SPARQL
• Data Science & Machine Learning (2 years): ArcGIS, Image processing, NLP, Computer Vision
• Bioinformatics (1 year): Molecular simulation, Computational biology, Nucleotide structure prediction, LAMMPS,
AutoDock Tools and Vina, PyMOL
• DevOps & Version Control (4 years): Git, GitHub, Bootstrap, Software Development, Apache Airflow,
Jupyter GitLab
• Mathematics & Statistics (3 years): MATLAB, Mathematica, Desmos, Mathematical modelling
• Research Skills (4 years): AI/ML/DL, Data Science, Data analysis
AWARDS AND ACHIEVEMENTS
• Selected as one of 30 people to receive a scholarship for the Annual Indic Wikimedia Hackathon.
• Accepted into the Fall 2023 cohort of Major League Hacking as a Software Engineering Fellow, with an acceptance
rate of less than 2.5%.
• One of the 10 students nominated by Delhi University for the Australian National University’s Future Research Talent
Awards out of 114,494 undergraduates and 17,941 postgraduates.
• Secured All India Rank 16 in the Delhi University Entrance Test (currently known as the Central University Entrance Test)
out of more than 10,000 applicants.