Introduction to Data Science
What is it?
“Goal is in extracting meaning from data and creating data products and
seeks to use all available and relevant data to effectively tell a story that
can be easily understood by non-practitioners.”
- From Wikipedia (not on latest version of the page)
What is it? (Contd.)
General Tasks of a Data Scientist
Get a little domain understanding
Define the problem statement well
Pre-process data to fix data issues like duplicates, missing values, etc.
Visualize data to the extent possible for better understanding and to see basic
patterns
Identify what kind of a problem it is (Prediction/Forecasting, Classification,
Optimization and/or Managing Big Data)
Identify appropriate modeling techniques and build models
Analyze results and iterate, as needed; DO NOT trust software outputs blindly
Remember: Garbage In, Garbage Out
Visualize outputs and Communicate
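The pre-processing step in the task list above (duplicates, missing values) can be sketched in a few lines. This is only an illustrative sketch: the function name `preprocess`, the sample rows, and the choice to fill gaps with the column mean are assumptions made for the example, not a method prescribed by these slides.

```python
from statistics import mean


def preprocess(rows):
    """Drop exact duplicate rows, then fill missing (None) values.

    Assumption for illustration: missing numeric values are replaced
    with the mean of the values present in the same column.
    """
    # Remove duplicates while preserving the original row order.
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(list(row))

    # Fill each missing value with the mean of its column.
    n_cols = len(unique[0]) if unique else 0
    for col in range(n_cols):
        present = [r[col] for r in unique if r[col] is not None]
        fill = mean(present) if present else None
        for r in unique:
            if r[col] is None:
                r[col] = fill
    return unique


# Hypothetical sample data: [age, income]
raw = [
    [25, 50000.0],
    [25, 50000.0],  # exact duplicate of the first row
    [40, None],     # missing income
    [31, 70000.0],
]
clean = preprocess(raw)
print(clean)
```

Mean imputation is only one option; depending on the domain, dropping the row or using the median may be more appropriate, which is why the results should still be inspected rather than trusted blindly.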
General Tasks (Contd.)
Classification: Supervised and Unsupervised
Prediction/Forecasting: finding the line or plane that lies closest to all the data points
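One common way to make "closest to all points" precise is ordinary least squares, which picks the line that minimizes the sum of squared vertical distances to the points. The sketch below assumes that interpretation; the function name `fit_line` and the sample data are hypothetical and chosen only for illustration.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations that roughly follow y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)

# Use the fitted line to predict y for an unseen x value.
prediction = slope * 6 + intercept
print(f"slope={slope:.2f}, intercept={intercept:.2f}, prediction={prediction:.2f}")
```

With more than one input variable the same idea fits a plane (or hyperplane) instead of a line, which is typically done with a library rather than by hand.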
Forecasting
Optimization
Analysis
Churn can be heartbreaking
Retail Analytics
How your shopping habits
reveal even the most powerful
and private information
Recommendation Engine
Text Mining
Natural Language Processing
Sentiment Analysis
Information Retrieval Systems
Other Important Applications
Pharmaceutical - Fraud detection in clinical trials; Drug development process
Healthcare - Non-compliance in taking prescription drugs
Banking and Insurance - Fraud detection; Credit scoring; Cross-selling and upselling products; Detecting money laundering; Forecasting stock prices
Travel and Hospitality - Improve customer experience
Politics - Predict winners; Identify fence-sitters
Other Important Applications (Contd.)
Retail and Telecommunications - Customer retention; Enhancing supply chain efficiency; Improving customer service quality; Planning store locations; Cross-selling and upselling; Recommendation systems; Sales forecasting
Government - Policy planning; Effective use of resources; Security against terrorist attacks; Effective policing by understanding crime patterns; Weather predictions; Calamity predictions
How to Build a Career in Data Science
Statistics
Machine Learning
Communication
Understanding customer domain - Be inquisitive
Asking the right questions of the data
Finally
Data is everywhere; you can’t escape it.
You can make better decisions using Data Science / Big Data Analytics.
Thank you