Brijesh Rai
DATA SCIENTIST
[email protected] 9108870900 | 7996662290 Bengaluru, KARNATAKA
linkedin.com/in/brijesh-rai-a5bb12147
SUMMARY
Data scientist with strong statistical and analytical skills and the ability to work across varied data environments
and quantitative fields. Looking to apply around 4 years of data science and machine learning experience in
a fast-paced, dynamic firm to create data science/AI algorithms that generate actionable insights. Brings
good programming skills and experience with a range of data science tools.
SKILLS
Statistical Programming Languages and Databases (RDBMS and NoSQL) : Python, SQL, MongoDB (NoSQL database).
Advanced Statistical Concepts, Techniques, and Modeling : Predictive, descriptive, and inferential
analysis (sampling), univariate/multivariate statistics, hypothesis and A/B
testing, ANOVA, correlation, probability, distributions, confidence intervals, measures of central tendency, sample size
estimation, statistical and quantitative analysis, feature engineering, exploratory data analysis (EDA).
Machine Learning or Statistical Algorithms : Supervised algorithms - decision trees (CART, CHAID), random
forest (bagging), regression algorithms (linear/logistic regression, L1/L2 regularization), SVM, K-nearest
neighbours (KNN), naive Bayes, boosting algorithms (AdaBoost, gradient boosting, XGBoost), stacking classifiers. Unsupervised
algorithms - segmentation or clustering algorithms (K-means, hierarchical, DBSCAN, OPTICS). Time series
forecasting/analysis, ARIMA.
Machine Learning Metrics : Confusion matrix, recall, precision, area under the
curve (AUC), Davies-Bouldin score, silhouette score, R2 score, F1 score, RMSE, hyperparameter tuning/optimisation, grid
search CV, cross-validation methods.
Data Visualization or Business Intelligence Tools : Tableau, Power BI, matplotlib, and seaborn.
Model Deployment : AWS cloud, Heroku.
Artificial Neural Networks (ANN) and Deep Learning : Feedforward neural networks (FNN), convolutional neural networks (CNN),
backpropagation, gradient descent, activation/loss functions, data augmentation, dropout, max pooling, image
preprocessing.
NLP : Natural language generation (NLG) and understanding (NLU), count vectorizer, TF-IDF, word embeddings, text
preprocessing and normalization techniques, tokenization, stemming, lemmatization, text similarity, information retrieval,
topic modeling, LDA, and entity extraction.
Big Data Technologies and Frameworks : Hadoop, HDFS, MapReduce, Apache Sqoop, Hive.
Python and Machine Learning Libraries :
NumPy, pandas, scikit-learn (sklearn), SciPy, NLTK, TextBlob, TensorFlow, Keras, regex, matplotlib, seaborn.
Interpersonal Skills : Critical/analytical thinking, multi-tasking, time management, data storytelling, strong
verbal/written communication, proactive problem-solving skills, intellectual curiosity, decision making, eagerness to
learn new approaches/methodologies, and business sense.
EXPERIENCE
Junior Data Scientist
Mastec Quadgen Solutions Sep 2018 - Sep 2019
Developed predictive machine learning classification models that achieved 90% accuracy with good recall
in predicting employee attrition; the resulting insights helped reduce the organization's attrition rate
to below 15% and increase productivity by 30% over previous outcomes.
Communicated and coordinated with the HR department to understand business objectives, business metrics, and
business analytical problems, and to collect data on 2,940 employees.
Volunteered to lead data analysis, data mining, data cleansing, data transformation/manipulation, data
processing/exploration, and model building/evaluation; identified patterns, trends, and outliers, and recommended
improvement strategies for KPIs across the organization to control the attrition rate.
Data Scientist
Mastec Quadgen Solutions Oct 2019 - May 2021
Built machine learning/AI solutions that predicted telecom churn with 80% accuracy and high
throughput, which helped reduce churn, retain customers, and increase revenue
initially by 25% to 30%.
Led collaboration with the respective departments on project planning, data gathering, and data manipulation
from a database, which resulted in collecting data on 7,043 clients.
Was a key player in building end-to-end solutions and instrumental in implementing ensemble modeling; was
responsible for providing reports and insightful recommendations to business leaders.
Data Scientist
Learnbay AI and Freelancer Jun 2021 - Present
Took on the challenge of working on real-world projects like Bank Marketing/Term Deposit and Retail Customer
Segmentation.
Implemented scalable predictive models that achieved 91% validation accuracy in predicting which clients will
subscribe to a term deposit, which helped increase business by 20% to 25%.
Developed clustering algorithms based on deep-dive statistical analysis and data modeling to segment existing
retail customers and target marketing campaigns at ideal customers, resulting in a 15% increase in sales compared with
previous outcomes.
Mined relevant data on 5,41,909 customers from a database, extracting and aggregating it
with advanced querying and analytical tools, and presented growth portfolios through performance dashboards and
visualizations, cluster profiling, and key trends in the domain.
PERSONAL PROJECTS
Information Retrieval on Articles Jun 2021 - Dec 2021
Built NLP information retrieval systems using the K-nearest neighbors (KNN) model to retrieve similar medical
articles based on TF-IDF scores of the articles' title and abstract text.
Image Classification using CNN Jan 2022 - Aug 2022
Developed optimized image classifiers using convolutional neural networks (CNN) with minimized loss, achieving
an accuracy of around 80% in classifying images of flowers with the TensorFlow Keras packages.
Fake News Detector Sep 2022 - Feb 2023
Created document classification NLP models using logistic regression, with text mining and text preprocessing
done using NLTK packages; achieved an accuracy of around 95% in predicting fake news.
Topic Modeling on Emails Mar 2023 - Present
Developing probabilistic NLP models using the LDA (Latent Dirichlet Allocation) algorithm to perform topic modeling, i.e.,
tagging emails with related topics and visualizing the extracted topics with their keywords.
EDUCATION
Bachelor of Engineering in Electronics and Communication (ECE)
Nitte Meenakshi University (NMIT) Aug 2015 - May 2018
Diploma in Electronics and Communication Engineering (ECE)
Vivekananda Jul 2012 - May 2015
CERTIFICATION
Data Science and AI IBM Certification.
HOBBIES
Playing cricket, volleyball, and kabaddi. Travelling and exploring places and foods.
Video Gaming