Data Science for Decision Makers: Enhance your leadership skills with data science and AI expertise

Ebook, 751 pages, 4 hours

Name: Data Science for Decision Makers: Enhance your leadership skills with data science and AI expertise
Author: Jon Howells
ISBN: 9781837638345

Language: English

Publisher: Packt Publishing

Release date: Jul 26, 2024

Author

Jon Howells

Jon Howells is a seasoned AI and Data Science professional with a decade of experience in the field. He runs an AI consultancy called Qualifai and has worked with various companies, including Unilever, Permira and Capgemini, developing and deploying data science services and solutions. He holds a Master's degree in Computational Statistics & Machine Learning from UCL. Jon is particularly interested in the application of Large Language Models (LLMs) in consumer-focused businesses, such as using LLMs for consumer research and feedback analysis, personalized content generation, and enhanced customer support, ultimately helping businesses better understand and engage with their customers.

Book preview

Data Science for Decision Makers

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Group Product Manager: Ali Abidi

Publishing Product Manager: Tejashwini R

Book Project Manager: Hemangi Lotlikar

Content Development Editor: Joseph Sunil

Technical Editor: Rahul Limbachiya

Copy Editor: Safis Editing

Proofreader: Joseph Sunil

Indexer: Rekha Nair

Production Designer: Ponraj Dhandapani

DevRel Marketing Coordinator: Vinishka Kalra

First published: June 2024

Production reference: 1190624

Published by Packt Publishing Ltd.

Grosvenor House, 11 St Paul’s Square, Birmingham, B3 1RB, UK

ISBN 978-1-83763-729-4

www.packtpub.com

To my mother and father, Caroline and Robert, for instilling in me the values of education and constant curiosity. To my partner, Yeshica, for your unwavering support, and to my sister, Felicity, for your keen eye in reviewing and shaping this book.

– Jon Howells

Contributors

About the author

Jon Howells, director of AI consultancy QualifAI, is an experienced professional in data science and machine learning, with over a decade of experience in the consumer goods, market research, and public sectors. He has worked within consultancies including KPMG and Capgemini and with multinational clients such as Unilever and Permira, as well as public sector bodies such as the UK Home Office and the US Food and Drug Administration (FDA).

With an MSc in computational statistics and machine learning from UCL, Jon specializes in applying large language models (LLMs) to consumer-focused businesses, leveraging them for consumer research, personalized content generation, and enhanced customer support. His expertise helps businesses better understand and engage with their customers, driving innovation and unlocking the potential of data-driven decision-making.

About the reviewer

As a principal architect at T-Mobile, Tanmaya Gaur has more than 10 years of web development experience and a passion for delivering technical and architectural leadership for key technology initiatives and business capabilities. In the latest chapter of his professional career, he has been instrumental in shaping the architecture of T-Mobile’s primary CRM solution, which is built using modular micro-frontend architecture and enhances the digital experience for their care representatives and customers.

His expertise in web, infrastructure, and microservices enables him to design and deliver scalable solutions that are performant, secure, and resilient. He works closely with other business and IT partner teams in a highly collaborative environment and is committed to driving the best customer experience across mobile, desktop, point-of-sale, and other emerging devices.

Table of Contents

Preface

Part 1: Understanding Data Science and Its Foundations

Introducing Data Science

Data science, AI, and ML – what’s the difference?

The mathematical and statistical underpinnings of data science

Statistics and data science

What is statistics?

Descriptive and inferential statistics

Sampling strategies

Probability

Probability distribution

Conditional probability

Describing our samples

Measures of central tendency

Measures of dispersion

Degrees of freedom

Correlation, causation, and covariance

The shape of data

Probability distributions

Discrete probability distributions

Continuous probability distributions

Summary

Characterizing and Collecting Data

What are the key criteria to consider when evaluating datasets?

Data quantity

Data velocity

Data variety

Data quality

First-, second-, and third-party data

First-party data – the treasure trove within

Second-party data – building bridges through collaboration

Third-party data – broadening horizons with external expertise

Structured, unstructured, and semi-structured data

Structured data

Unstructured data

Semi-structured data

Methods for collecting data

Storing and processing data

Cloud, on-premises, and hybrid solutions – navigating the data storage and analysis landscape

Cloud computing – scalable services in the cloud

On-premises – maintaining control within your walls

Hybrid – the best of both worlds?

Data processing

Summary

Exploratory Data Analysis

Getting started with Google Colab

What is Google Colab?

A step-by-step guide to setting up Google Colab

Understanding the data you have

EDA techniques and tools

Descriptive statistics

Data visualization

Histograms

Density curves

Boxplots

Heatmaps

Dimensionality reduction

Correlation analysis

Outlier detection

Summary

The Significance of Significance

The idea of testing hypotheses

What is a hypothesis?

How does hypothesis testing work?

Formulating null and alternative hypotheses

Determining the significance level

Understanding errors

Getting to grips with p-values

Significance tests for a population proportion – making informed decisions about proportions

The z-test – comparing a sample proportion to a population proportion

Z-test example made easy

Significance tests for a population average (mean)

Writing hypotheses for a significance test about a mean

Conditions for a t-test about a mean

When to use z or t statistics in significance tests

Example – calculating the t-statistic for a test about a mean

Using a table to estimate the p-value from the t-statistic

Comparing the p-value from the t-statistic to the significance level

One-tailed and two-tailed tests

Walking through a case study

Summary

Understanding Regression

How can I benefit from understanding regression?

Introduction to trend lines

Fitting a trend line to data

Estimating the line of best fit

Calculating the equations of the lines of best fit

Interpreting the slope of a regression line

Interpreting the intercept of a regression line

Understanding residuals

Evaluating the goodness of fit in least-squares regression

Summary

Part 2: Machine Learning – Concepts, Applications, and Pitfalls

Introducing Machine Learning

From statistics to machine learning

What is machine learning?

How does machine learning relate to statistics?

Why is machine learning important?

Customer personalization and segmentation

Fraud detection and security

Supply chain and inventory optimization

Predictive maintenance

Healthcare diagnostics and treatment

The different types of machine learning

Supervised learning

Unsupervised learning

Semi-supervised learning

Reinforcement learning

Transfer learning

Popular machine learning algorithms

Linear regression

Logistic regression

Decision trees

Random forests

Support vector machines

k-nearest neighbors

Neural networks

The machine learning process

Training a supervised machine learning model

Validation of a supervised machine learning model

Testing a supervised machine learning model

Evaluating machine learning models

Risks and limitations of machine learning

Overfitting and underfitting

Bias and variance

Balanced dataset

Models are approximations of reality

Machine learning on unstructured data

Natural language processing (NLP)

Computer vision

Deep learning and artificial intelligence

Artificial intelligence

Deep learning

Summary

Supervised Machine Learning

Defining supervised learning

Applications of supervised learning

The two types of supervised learning

Key factors in supervised learning

Steps within supervised learning

Data preparation – laying the foundation

Algorithm selection – choosing the right tool

Model training – learning from data

Model evaluation – assessing performance

Prediction and deployment – putting the model to work

Characteristics of regression and classification algorithms

Regression algorithms

Classification algorithms

Key considerations in supervised learning

Evaluation metrics

Applications of supervised learning

Consumer goods

Retail

Manufacturing

Summary

Unsupervised Machine Learning

Defining UL

Practical examples of UL

Steps in UL

Step 1 – Data collection

Step 2 – Data preprocessing

Step 3 – Choosing the right model

Step 4 – Training the model

Step 5 – Interpretation and evaluation

In summary

Clustering – unveiling hidden patterns in your data

What is clustering?

How does clustering work?

k-means clustering

Practical applications of clustering

Evaluation metrics for clustering

In summary

Association rule learning

What is association rule learning?

The Apriori algorithm – a practical example

Evaluation metrics

In summary

Applications of UL

Market segmentation

Anomaly detection

Feature extraction

Summary

Interpreting and Evaluating Machine Learning Models

How do I know whether this model will be accurate?

Evaluating on test (holdout) data

Understanding evaluation metrics

Evaluating regression models

R-squared

Root mean squared error

Mean absolute error

When and how to use each metric

Practical evaluation strategies

Summarizing the evaluation of regression models

Evaluating classification models

Classification model evaluation metrics

Precision, recall, and F1-Score

Recall

F1-score

Methods for explaining machine learning models

Making sense of regression models – the power of coefficients

Decoding classification models – unveiling feature importance

Beyond specific models – universal insights using SHAP values

Summary

Common Pitfalls in Machine Learning

Understanding the complexity

Dirty data, damaged models – how data quantity and quality impact ML

The importance of adequate training data

Dealing with poor data quality

Conclusion

Overcoming overfitting and underfitting

Navigating training-serving skew and model drift

Ensuring fairness

Mastering overfitting and underfitting for optimal model performance

Overfitting – when your model is too specific

Underfitting – when your model is too simplistic

Spotting the problem

Conclusion

Training-serving skew and model drift

Training-serving skew

Model drift

Key takeaways

Bias and fairness

Understanding bias

Understanding fairness

Mitigating bias and ensuring fairness

Key takeaways

Summary

Part 3: Leading Successful Data Science Projects and Teams

The Structure of a Data Science Project

The various types of data science projects

Data products

Reports and analytics

Research and methodology

The stages of a data product

Identifying use cases

Evaluating use cases

Planning the data product

Developing a data product

Data preparation and exploratory analysis

Model design and development

Evaluation and testing

Deploying and monitoring a data product

General best practices for data product development

Evaluating impact

Predictive maintenance in manufacturing

Fraud detection in banking

Customer churn prediction in telecom

Demand forecasting in retail

Personalized recommendations in e-commerce

Predictive maintenance in energy

Workforce optimization in quick service restaurants

Chatbot-assisted customer support

Summary

The Data Science Team

Assembling your data science team – key roles and considerations

Data scientists

Machine learning engineers

Data engineers

MLOps engineers

Analytics engineers

Software engineers (full stack, frontend, backend)

Product managers

Business analysts

Data storytellers/visualization experts

Considerations when assembling your team

Data science teams within larger organizations

The hub and spoke model

What is the hub and spoke model?

Practical applications of the hub and spoke model

Building a hub and spoke model

The art of recruitment

Where to find technical talent

How high-performing data science teams operate

Cross-functional collaboration is essential

Diversity of perspectives drives innovation

Start with the right problem to solve

Invest in tooling, infrastructure, and workflow

Continuous adaption and learning are a must

Focus ruthlessly on outcomes over activity

Summary

Managing the Data Science Team

Day-to-day management of a data science team

Enabling rapid experimentation and innovation

Managing inherent uncertainty

Balancing research and application

Communicating effectively in data science and artificial intelligence

Fostering a culture of curiosity and continuous learning

Embracing peer review and collaboration

Common challenges in managing a data science team

Challenge 1 – recruiting and retaining top talent

Challenge 2 – aligning projects with business goals

Challenge 3 – managing inherent uncertainty

Challenge 4 – scaling and operationalizing models

Challenge 5 – deploying robust, reliable, fair models ethically

Empowering and motivating your data science team

Working with other teams and external stakeholders and empowering them to use data

Summary

Continuing Your Journey as a Data Science Leader

Navigating the landscape of emerging technologies

Specializing in an industry

Specializing in a field

Embracing continuous learning

Online courses

Cloud certifications

Technical tutorials and documentation

Learning plan framework

Staying up to date with current DS/ML/AI news and trends

Promoting data-driven thinking within your organization

Host internal learning sessions

Collaborate on cross-functional projects

Share success stories and lessons learned

Mentor and upskill colleagues

Establish a data science community of practice

Networking beyond your organization

Attend industry conferences and events

Join online communities and forums

Engage with local meetups and user groups

Collaborate on side projects or research

Offer mentorship or seek mentors

Summary

Index

Other Books You May Enjoy

Preface

Data science, machine learning, and artificial intelligence (AI) are transforming the business landscape.

Organizations in every industry are harnessing these powerful tools to uncover insights, make predictions, and gain a competitive edge. This trend has only accelerated with the rise of large language models and generative AI.

But decision makers without a data science background, and those stepping up from being a data scientist to leading data teams, face a myriad of challenges. It can be difficult to understand the underlying concepts of statistics, machine learning, and AI; to manage data teams effectively; and, most importantly, to translate complex models into tangible business outcomes that deliver real, bottom-line value to an organization, not just vanity metrics and shiny demos.

This book is your guide. In Data Science for Decision Makers, you’ll gain the essential knowledge and skills to lead in the age of AI. Through clear explanations and practical examples, you’ll learn how to interpret machine learning models, identify valuable use cases, and drive measurable results. Step by step, you’ll learn the foundations of statistics and machine learning. You’ll discover how to plan and execute successful data science initiatives from start to finish.

Along the way, you’ll pick up best practices for building and empowering high-performing teams. Most importantly, you’ll learn how to bridge the gap between the technical world of data science and the business needs of your organization. Whether you’re an executive, a manager, or a data scientist moving into leadership, this book will help you leverage data-driven insights to inform your decisions and propel your company forward.

Who this book is for

Are you an executive seeking to harness the power of data science and AI? A manager eager to lead data-driven teams to success? Or perhaps a data scientist ready to step into a leadership role? If so, this book is for you.

Data Science for Decision Makers is designed for leaders who want to leverage data insights effectively. You don’t need a formal background in statistics or machine learning. What you do need is a desire to understand these concepts, ask the right questions, and make informed decisions.

If you work with data scientists and machine learning engineers, this book will help you interpret their models with confidence. You’ll learn how to recognize valuable opportunities for AI and plan projects that deliver real business value.

Executives will gain a solid foundation in data science methods. Managers will discover how to build and guide high-performing teams. Data scientists will develop the skills to become influential leaders. Wherever you are in your career, this book will help you succeed in the age of AI.

What this book covers

This book is structured into three parts. First, we cover data science and its foundations in statistics. Then, we cover machine learning as it relates to data science, including core machine learning concepts, applications, and pitfalls to avoid. Finally, we cover how to lead successful data science projects and teams. If you are already familiar with the foundations of data science and the core statistical concepts covered in Part 1, you may wish to skip ahead to Part 2, or read Part 1 as a refresher.

Part 1: Understanding Data Science and Its Foundations

Chapter 1, Introducing Data Science, will provide you with a foundational understanding of data science, its relationship to AI and machine learning, and key statistical concepts. It explores descriptive and inferential statistics, probability, and data distributions, establishing a common language for readers.

Chapter 2, Characterizing and Collecting Data, explains how to distinguish between different types of data, including first-, second-, and third-party data, as well as structured, unstructured, and semi-structured data. It explores technologies and methods for collecting, storing, and processing data, and provides guidance on navigating the landscape of data-focused solutions, including cloud, on-premises, and hybrid solutions.

Chapter 3, Exploratory Data Analysis, introduces the process of exploratory data analysis (EDA) and its importance in understanding data, developing hypotheses, and building better models. The chapter provides hands-on code examples in Python to reinforce the concepts, with step-by-step explanations suitable for readers with no prior experience in Python.

Chapter 4, The Significance of Significance, explores the concept of statistical significance and its importance in making data-driven decisions. It covers hypothesis testing, also known as significance testing, and provides practical examples to illustrate its application in business scenarios, such as reducing customer churn and evaluating machine learning model improvements.

Chapter 5, Understanding Regression, introduces regression as a powerful statistical tool for uncovering patterns and relationships within data. It explores various use cases for regression in a business context. The chapter begins with the foundational concept of trend lines before delving into the complexities of regression analysis.

Part 2: Machine Learning – Concepts, Applications, and Pitfalls

Chapter 6, Introducing Machine Learning, provides an overview of machine learning and its importance in data-driven decision-making. It covers the progression from traditional statistics to machine learning, the various types of machine learning techniques, and the process of training, validating, and testing models.

Chapter 7, Supervised Machine Learning, focuses on one of the most utilized and beneficial subfields of machine learning. It discusses the steps involved in training and deploying supervised machine learning models and core supervised learning algorithms, as well as factors to consider when training and evaluating these models and their applications.

Chapter 8, Unsupervised Machine Learning, explores the field of unsupervised learning, where algorithms discover hidden patterns and insights from unlabeled data. The chapter covers practical examples of unsupervised learning, the key steps involved, and techniques such as clustering, anomaly detection, dimensionality reduction, and association rule learning. It emphasizes the distinct nature of unsupervised learning compared to supervised learning and highlights its potential for uncovering valuable information in data without prior training.

Chapter 9, Interpreting and Evaluating Machine Learning Models, equips you with the skills needed to assess the accuracy and reliability of machine learning models. You will learn how to use evaluation metrics to measure model performance and understand the importance of using holdout (test) data for unbiased evaluation. The chapter explains how evaluation metrics differ between regression and classification models, so that you can interpret and validate the quality of a model before it is implemented in real-world scenarios.

Chapter 10, Common Pitfalls in Machine Learning, provides readers with the knowledge to identify and address common challenges in developing and deploying machine learning models. It covers issues such as inadequate or poor-quality training data, overfitting and underfitting, training-serving skew, model drift, and bias and fairness. You will learn practical strategies to mitigate these pitfalls, ensuring your models are reliable, accurate, and equitable, ultimately leading to better business decisions and outcomes.

Part 3: Leading Successful Data Science Projects and Teams

Chapter 11, The Structure of a Data Science Project, provides a comprehensive framework for planning and executing data science projects, focusing on delivering impactful data products. You will learn how to identify, evaluate, and prioritize use cases that align with your organization’s goals and have the potential to drive real business value. The chapter covers the key stages of data product development, from data preparation to model design, evaluation, and deployment. You will also learn how to evaluate the business impact of your data products by selecting relevant metrics and KPIs, enabling you to demonstrate the tangible value and ROI of your initiatives and secure ongoing support for your projects.

Chapter 12, The Data Science Team, looks at the art and science of assembling a high-performing data science team. You will learn about the key roles that make up a successful team, including data scientists, machine learning engineers, and data engineers, along with the skills and expertise each role brings to the table. The chapter explores different operating models for structuring data science teams within larger organizations.

Chapter 13, Managing the Data Science Team, explores the unique challenges and best practices for leading data science teams effectively. It covers strategies for enabling rapid experimentation, managing uncertainty, balancing research and production work, communicating effectively, fostering continuous learning, and promoting collaboration. The chapter also discusses common challenges such as aligning projects with business goals, scaling and deploying models, ensuring fairness and ethics, and driving the adoption of data science solutions.

Chapter 14, Continuing Your Journey as a Data Science Leader, provides guidance on navigating the rapidly evolving landscape of data science, machine learning, and AI. It explores strategies for staying current with emerging technologies, specializing in specific industries or fields, and embracing continuous learning. The chapter also discusses the importance of staying informed about the latest trends and news and how data science leaders can promote data-driven thinking within their organizations.

To get the most out of this book, some familiarity with basic mathematical concepts such as algebra, probability, and statistics is helpful but not required. The real prerequisites are curiosity, a willingness to learn, and a drive to use data for the good of your organization. If you bring those qualities, this book will supply the knowledge and practical skills you need. Step by step, you’ll learn to wield the tools of data science and AI with clarity, confidence, and purpose.

Setup instructions will be provided in the chapters where there are code exercises.

Conventions used

There are a number of text conventions used throughout this book.

Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: Click on the cell to activate it, type print("Hello, world!"), and then click the play button to run the code.

A block of code is set as follows:

# Calculate median (middle value)
median_sales = sales_data_year1.median()
print(f"The median monthly sales, a typical sales month, is {round(median_sales)} units.")

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

# Calculate standard deviation (measure of the amount of variation)
std_dev_sales = sales_data_year1.std()
print(f"The standard deviation, showing the typical variation from the mean sales, is {round(std_dev_sales)} units.")

Bold: Indicates a new term, an important word, or words that you see onscreen. For instance, words in menus or dialog boxes appear in bold. Here is an example: Click File, then choose New Notebook from the dropdown.

Tips or important notes

Appear like this.
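The two sample snippets above refer to sales_data_year1 without defining it. As a minimal sketch, assuming it is a pandas Series of twelve monthly unit-sales figures (the values below are invented purely for illustration and do not come from the book), the snippets can be run end to end like this:

```python
import pandas as pd

# Hypothetical monthly unit sales for one year (illustrative values only)
sales_data_year1 = pd.Series(
    [200, 220, 250, 210, 230, 240, 260, 270, 225, 236, 245, 255],
    name="units_sold",
)

# Calculate median (middle value)
median_sales = sales_data_year1.median()

# Calculate standard deviation (measure of the amount of variation);
# pandas uses the sample standard deviation (ddof=1) by default
std_dev_sales = sales_data_year1.std()

print(f"The median monthly sales, a typical sales month, is {round(median_sales)} units.")
print(f"The standard deviation, showing the typical variation from the mean sales, is {round(std_dev_sales)} units.")
```

Note that the f prefix and the surrounding quotes are both required: without them, Python would try to interpret the sentence itself as code and raise a syntax error.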

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, email us at [email protected] and mention the book title in the subject of your message.

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.

Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address
