Data Science for Decision Makers: Enhance your leadership skills with data science and AI expertise
By Jon Howells
Jon Howells
Jon Howells is a seasoned AI and Data Science professional with a decade of experience in the field. He runs an AI consultancy called Qualifai and has worked with various companies, including Unilever, Permira and Capgemini, developing and deploying data science services and solutions. He holds a Master's degree in Computational Statistics & Machine Learning from UCL. Jon is particularly interested in the application of Large Language Models (LLMs) in consumer-focused businesses, such as using LLMs for consumer research and feedback analysis, personalized content generation, and enhanced customer support, ultimately helping businesses better understand and engage with their customers.
Data Science for Decision Makers
Copyright © 2024 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Group Product Manager: Ali Abidi
Publishing Product Manager: Tejashwini R
Book Project Manager: Hemangi Lotlikar
Content Development Editor: Joseph Sunil
Technical Editor: Rahul Limbachiya
Copy Editor: Safis Editing
Proofreader: Joseph Sunil
Indexer: Rekha Nair
Production Designer: Ponraj Dhandapani
DevRel Marketing Coordinator: Vinishka Kalra
First published: June 2024
Production reference: 1190624
Published by Packt Publishing Ltd.
Grosvenor House 11 St Paul’s Square Birmingham B3 1RB, UK
ISBN 978-1-83763-729-4
www.packtpub.com
To my mother and father, Caroline and Robert, for instilling in me the values of education and constant curiosity. To my partner, Yeshica, for your unwavering support, and to my sister, Felicity, for your keen eye in reviewing and shaping this book.
– Jon Howells
Contributors
About the author
Jon Howells, director of AI consultancy QualifAI, is an experienced professional in data science and machine learning, with over a decade of experience in the consumer goods, market research, and public sectors. He has worked within consultancies including KPMG and Capgemini and with multinational clients such as Unilever and Permira, as well as public sector bodies such as the UK Home Office and the US Food and Drug Administration (FDA).
With an MSc in computational statistics and machine learning from UCL, Jon specializes in applying large language models (LLMs) to consumer-focused businesses, leveraging them for consumer research, personalized content generation, and enhanced customer support. His expertise helps businesses better understand and engage with their customers, driving innovation and unlocking the potential of data-driven decision-making.
About the reviewer
As a principal architect at T-Mobile, Tanmaya Gaur has more than 10 years of web development experience and a passion for delivering technical and architectural leadership for key technology initiatives and business capabilities. In the latest chapter of his professional career, he has been instrumental in shaping the architecture of T-Mobile’s primary CRM solution, which is built using modular micro-frontend architecture and enhances the digital experience for their care representatives and customers.
His expertise in web, infrastructure, and microservices enables him to design and deliver scalable solutions that are performant, secure, and resilient. He works closely with other business and IT partner teams in a highly collaborative environment and is committed to driving the best customer experience across mobile, desktop, point-of-sale, and other emerging devices.
Table of Contents
Preface
Part 1: Understanding Data Science and Its Foundations
1
Introducing Data Science
Data science, AI, and ML – what’s the difference?
The mathematical and statistical underpinnings of data science
Statistics and data science
What is statistics?
Descriptive and inferential statistics
Sampling strategies
Probability
Probability distribution
Conditional probability
Describing our samples
Measures of central tendency
Measures of dispersion
Degrees of freedom
Correlation, causation, and covariance
The shape of data
Probability distributions
Discrete probability distributions
Continuous probability distributions
Summary
2
Characterizing and Collecting Data
What are the key criteria to consider when evaluating datasets?
Data quantity
Data velocity
Data variety
Data quality
First-, second-, and third-party data
First-party data – the treasure trove within
Second-party data – building bridges through collaboration
Third-party data – broadening horizons with external expertise
Structured, unstructured, and semi-structured data
Structured data
Unstructured data
Semi-structured data
Methods for collecting data
Storing and processing data
Cloud, on-premises, and hybrid solutions – navigating the data storage and analysis landscape
Cloud computing – scalable services in the cloud
On-premises – maintaining control within your walls
Hybrid – the best of both worlds?
Data processing
Summary
3
Exploratory Data Analysis
Getting started with Google Colab
What is Google Colab?
A step-by-step guide to setting up Google Colab
Understanding the data you have
EDA techniques and tools
Descriptive statistics
Data visualization
Histograms
Density curves
Boxplots
Heatmaps
Dimensionality reduction
Correlation analysis
Outlier detection
Summary
4
The Significance of Significance
The idea of testing hypotheses
What is a hypothesis?
How does hypothesis testing work?
Formulating null and alternative hypotheses
Determining the significance level
Understanding errors
Getting to grips with p-values
Significance tests for a population proportion – making informed decisions about proportions
The z-test – comparing a sample proportion to a population proportion
Z-test example made easy
Significance tests for a population average (mean)
Writing hypotheses for a significance test about a mean
Conditions for a t-test about a mean
When to use z or t statistics in significance tests
Example – calculating the t-statistic for a test about a mean
Using a table to estimate the p-value from the t-statistic
Comparing the p-value from the t-statistic to the significance level
One-tailed and two-tailed tests
Walking through a case study
Summary
5
Understanding Regression
How can I benefit from understanding regression?
Introduction to trend lines
Fitting a trend line to data
Estimating the line of best fit
Calculating the equations of the lines of best fit
Interpreting the slope of a regression line
Interpreting the intercept of a regression line
Understanding residuals
Evaluating the goodness of fit in least-squares regression
Summary
Part 2: Machine Learning – Concepts, Applications, and Pitfalls
6
Introducing Machine Learning
From statistics to machine learning
What is machine learning?
How does machine learning relate to statistics?
Why is machine learning important?
Customer personalization and segmentation
Fraud detection and security
Supply chain and inventory optimization
Predictive maintenance
Healthcare diagnostics and treatment
The different types of machine learning
Supervised learning
Unsupervised learning
Semi-supervised learning
Reinforcement learning
Transfer learning
Popular machine learning algorithms
Linear regression
Logistic regression
Decision trees
Random forests
Support vector machines
k-nearest neighbors
Neural networks
The machine learning process
Training a supervised machine learning model
Validation of a supervised machine learning model
Testing a supervised machine learning model
Evaluating machine learning models
Risks and limitations of machine learning
Overfitting and underfitting
Bias and variance
Balanced dataset
Models are approximations of reality
Machine learning on unstructured data
Natural language processing (NLP)
Computer vision
Deep learning and artificial intelligence
Artificial intelligence
Deep learning
Summary
7
Supervised Machine Learning
Defining supervised learning
Applications of supervised learning
The two types of supervised learning
Key factors in supervised learning
Steps within supervised learning
Data preparation – laying the foundation
Algorithm selection – choosing the right tool
Model training – learning from data
Model evaluation – assessing performance
Prediction and deployment – putting the model to work
Characteristics of regression and classification algorithms
Regression algorithms
Classification algorithms
Key considerations in supervised learning
Evaluation metrics
Applications of supervised learning
Consumer goods
Retail
Manufacturing
Summary
8
Unsupervised Machine Learning
Defining UL
Practical examples of UL
Steps in UL
Step 1 – Data collection
Step 2 – Data preprocessing
Step 3 – Choosing the right model
Step 4 – Training the model
Step 5 – Interpretation and evaluation
In summary
Clustering – unveiling hidden patterns in your data
What is clustering?
How does clustering work?
k-means clustering
Practical applications of clustering
Evaluation metrics for clustering
In summary
Association rule learning
What is association rule learning?
The Apriori algorithm – a practical example
Evaluation metrics
In summary
Applications of UL
Market segmentation
Anomaly detection
Feature extraction
Summary
9
Interpreting and Evaluating Machine Learning Models
How do I know whether this model will be accurate?
Evaluating on test (holdout) data
Understanding evaluation metrics
Evaluating regression models
R-squared
Root mean squared error
Mean absolute error
When and how to use each metric
Practical evaluation strategies
Summarizing the evaluation of regression models
Evaluating classification models
Classification model evaluation metrics
Precision, recall, and F1-Score
Recall
F1-score
Methods for explaining machine learning models
Making sense of regression models – the power of coefficients
Decoding classification models – unveiling feature importance
Beyond specific models – universal insights using SHAP values
Summary
10
Common Pitfalls in Machine Learning
Understanding the complexity
Dirty data, damaged models – how data quantity and quality impact ML
The importance of adequate training data
Dealing with poor data quality
Conclusion
Overcoming overfitting and underfitting
Navigating training-serving skew and model drift
Ensuring fairness
Mastering overfitting and underfitting for optimal model performance
Overfitting – when your model is too specific
Underfitting – when your model is too simplistic
Spotting the problem
Conclusion
Training-serving skew and model drift
Training-serving skew
Model drift
Key takeaways
Bias and fairness
Understanding bias
Understanding fairness
Mitigating bias and ensuring fairness
Key takeaways
Summary
Part 3: Leading Successful Data Science Projects and Teams
11
The Structure of a Data Science Project
The various types of data science projects
Data products
Reports and analytics
Research and methodology
The stages of a data product
Identifying use cases
Evaluating use cases
Planning the data product
Developing a data product
Data preparation and exploratory analysis
Model design and development
Evaluation and testing
Deploying and monitoring a data product
General best practices for data product development
Evaluating impact
Predictive maintenance in manufacturing
Fraud detection in banking
Customer churn prediction in telecom
Demand forecasting in retail
Personalized recommendations in e-commerce
Predictive maintenance in energy
Workforce optimization in quick service restaurants
Chatbot-assisted customer support
Summary
12
The Data Science Team
Assembling your data science team – key roles and considerations
Data scientists
Machine learning engineers
Data engineers
MLOps engineers
Analytics engineers
Software engineers (full stack, frontend, backend)
Product managers
Business analysts
Data storytellers/visualization experts
Considerations when assembling your team
Data science teams within larger organizations
The hub and spoke model
What is the hub and spoke model?
Practical applications of the hub and spoke model
Building a hub and spoke model
The art of recruitment
Where to find technical talent
How high-performing data science teams operate
Cross-functional collaboration is essential
Diversity of perspectives drives innovation
Start with the right problem to solve
Invest in tooling, infrastructure, and workflow
Continuous adaption and learning are a must
Focus ruthlessly on outcomes over activity
Summary
13
Managing the Data Science Team
Day-to-day management of a data science team
Enabling rapid experimentation and innovation
Managing inherent uncertainty
Balancing research and application
Communicating effectively in data science and artificial intelligence
Fostering a culture of curiosity and continuous learning
Embracing peer review and collaboration
Common challenges in managing a data science team
Challenge 1 – recruiting and retaining top talent
Challenge 2 – aligning projects with business goals
Challenge 3 – managing inherent uncertainty
Challenge 4 – scaling and operationalizing models
Challenge 5 – deploying robust, reliable, fair models ethically
Empowering and motivating your data science team
Working with other teams and external stakeholders and empowering them to use data
Summary
14
Continuing Your Journey as a Data Science Leader
Navigating the landscape of emerging technologies
Specializing in an industry
Specializing in a field
Embracing continuous learning
Online courses
Cloud certifications
Technical tutorials and documentation
Learning plan framework
Staying up to date with current DS/ML/AI news and trends
Promoting data-driven thinking within your organization
Host internal learning sessions
Collaborate on cross-functional projects
Share success stories and lessons learned
Mentor and upskill colleagues
Establish a data science community of practice
Networking beyond your organization
Attend industry conferences and events
Join online communities and forums
Engage with local meetups and user groups
Collaborate on side projects or research
Offer mentorship or seek mentors
Summary
Index
Other Books You May Enjoy
Preface
Data science, machine learning, and artificial intelligence (AI) are transforming the business landscape.
Organizations in every industry are harnessing these powerful tools to uncover insights, make predictions, and gain a competitive edge. This trend has only accelerated with the rise in large language models and Generative AI.
But decision makers without a data science background, and those stepping up from being a data scientist to leading data teams, face a myriad of challenges. It can be difficult to understand the underlying concepts of statistics, machine learning, and AI; manage data teams effectively; and, most importantly, translate complex models into tangible business outcomes – business outcomes that deliver real, bottom-line value to an organization, not just vanity metrics and shiny demos.
This book is your guide. In Data Science for Decision Makers, you’ll gain the essential knowledge and skills to lead in the age of AI. Through clear explanations and practical examples, you’ll learn how to interpret machine learning models, identify valuable use cases, and drive measurable results. Step by step, you’ll learn the foundations of statistics and machine learning. You’ll discover how to plan and execute successful data science initiatives from start to finish.
Along the way, you’ll pick up best practices for building and empowering high-performing teams. Most importantly, you’ll learn how to bridge the gap between the technical world of data science and the business needs of your organization. Whether you’re an executive, a manager, or a data scientist moving into leadership, this book will help you leverage data-driven insights to inform your decisions and propel your company forward.
Who this book is for
Are you an executive seeking to harness the power of data science and AI? A manager eager to lead data-driven teams to success? Or perhaps a data scientist ready to step into a leadership role? If so, this book is for you.
Data Science for Decision Makers is designed for leaders who want to leverage data insights effectively. You don’t need a formal background in statistics or machine learning. What you do need is a desire to understand these concepts, ask the right questions, and make informed decisions.
If you work with data scientists and machine learning engineers, this book will help you interpret their models with confidence. You’ll learn how to recognize valuable opportunities for AI and plan projects that deliver real business value.
Executives will gain a solid foundation in data science methods. Managers will discover how to build and guide high-performing teams. Data scientists will develop the skills to become influential leaders. Wherever you are in your career, this book will help you succeed in the age of AI.
What this book covers
This book is structured into three parts. Firstly, we cover data science and its foundations in statistics. Then, we cover machine learning as it relates to data science, including core machine learning concepts, applications, and pitfalls to avoid. Finally, we cover how to lead successful data science projects and teams. If you are already familiar with the foundations of data science and the core statistical concepts covered in Part 1, you may wish to skip ahead to Part 2 or refresh your knowledge.
Part 1: Understanding Data Science and Its Foundations
Chapter 1, Introducing Data Science, will provide you with a foundational understanding of data science, its relationship to AI and machine learning, and key statistical concepts. It explores descriptive and inferential statistics, probability, and data distributions, establishing a common language for readers.
Chapter 2, Characterizing and Collecting Data, will give you the knowledge of how to distinguish between different types of data, including first-, second-, and third-party data, as well as structured, unstructured, and semi-structured data. It explores technologies and methods for collecting, storing, and processing data, and provides guidance on navigating the landscape of data-focused solutions, including cloud, on-premises, and hybrid solutions.
Chapter 3, Exploratory Data Analysis, introduces the process of exploratory data analysis (EDA) and its importance in understanding data, developing hypotheses, and building better models. The chapter provides hands-on code examples in Python to reinforce the concepts, with step-by-step explanations suitable for readers with no prior experience in Python.
Chapter 4, The Significance of Significance, explores the concept of statistical significance and its importance in making data-driven decisions. It covers hypothesis testing, also known as significance testing, and provides practical examples to illustrate its application in business scenarios, such as reducing customer churn and evaluating machine learning model improvements.
Chapter 5, Understanding Regression, introduces regression as a powerful statistical tool for uncovering patterns and relationships within data. It explores various use cases for regression in a business context. The chapter begins with the foundational concept of trend lines before delving into the complexities of regression analysis.
Part 2: Machine Learning – Concepts, Applications, and Pitfalls
Chapter 6, Introducing Machine Learning, provides an overview of machine learning and its importance in data-driven decision-making. It covers the progression from traditional statistics to machine learning, the various types of machine learning techniques, and the process of training, validating, and testing models.
Chapter 7, Supervised Machine Learning, focuses on one of the most utilized and beneficial subfields of machine learning. It discusses the steps involved in training and deploying supervised machine learning models and core supervised learning algorithms, as well as factors to consider when training and evaluating these models and their applications.
Chapter 8, Unsupervised Machine Learning, explores the field of unsupervised learning, where algorithms discover hidden patterns and insights from unlabeled data. The chapter covers practical examples of unsupervised learning, the key steps involved, and techniques such as clustering, anomaly detection, dimensionality reduction, and association rule learning. It emphasizes the distinct nature of unsupervised learning compared to supervised learning and highlights its potential for uncovering valuable information in data without prior training.
Chapter 9, Interpreting and Evaluating Machine Learning Models, equips readers with the skills needed to assess the accuracy and reliability of machine learning models. You will learn how to use evaluation metrics to measure model performance and understand the importance of using holdout (test) data for unbiased evaluation. The chapter provides insights into the differences between evaluation metrics for regression and classification models, enabling readers to effectively interpret and validate the quality of machine learning models, ensuring their successful implementation in real-world scenarios.
Chapter 10, Common Pitfalls in Machine Learning, provides readers with the knowledge to identify and address common challenges in developing and deploying machine learning models. It covers issues such as inadequate or poor-quality training data, overfitting and underfitting, training-serving skew, model drift, and bias and fairness. You will learn practical strategies to mitigate these pitfalls, ensuring your models are reliable, accurate, and equitable, ultimately leading to better business decisions and outcomes.
Part 3: Leading Successful Data Science Projects and Teams
Chapter 11, The Structure of a Data Science Project, provides a comprehensive framework for planning and executing data science projects, focusing on delivering impactful data products. You will learn how to identify, evaluate, and prioritize use cases that align with your organization’s goals and have the potential to drive real business value. The chapter covers the key stages of data product development, from data preparation to model design, evaluation, and deployment. You will also learn how to evaluate the business impact of your data products by selecting relevant metrics and KPIs, enabling you to demonstrate the tangible value and ROI of your initiatives and secure ongoing support for your projects.
Chapter 12, The Data Science Team, looks at the art and science of assembling a high-performing data science team. You will learn about the key roles that make up a successful team, including data scientists, machine learning engineers, and data engineers, along with the skills and expertise each role brings to the table. The chapter explores different operating models for structuring data science teams within larger organizations.
Chapter 13, Managing the Data Science Team, explores the unique challenges and best practices for leading data science teams effectively. It covers strategies for enabling rapid experimentation, managing uncertainty, balancing research and production work, communicating effectively, fostering continuous learning, and promoting collaboration. The chapter also discusses common challenges such as aligning projects with business goals, scaling and deploying models, ensuring fairness and ethics, and driving the adoption of data science solutions.
Chapter 14, Continuing Your Journey as a Data Science Leader, provides guidance on navigating the rapidly evolving landscape of data science, machine learning, and AI. It explores strategies for staying current with emerging technologies, specializing in specific industries or fields, and embracing continuous learning. The chapter also discusses the importance of staying informed about the latest trends and news and how data science leaders can promote data-driven thinking within their organizations.
To get the most out of this book, some familiarity with basic mathematical concepts such as algebra, probability, and statistics is helpful but not required. The real prerequisites are curiosity, a willingness to learn, and a drive to use data for the good of your organization. If you bring those qualities, this book will supply the knowledge and practical skills you need. Step by step, you’ll learn to wield the tools of data science and AI with clarity, confidence, and purpose.
Setup instructions will be provided in the chapters where there are code exercises.
Conventions used
There are a number of text conventions used throughout this book.
Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: Click on the cell to activate it, type print("Hello, world!"), and then click the play button to run the code.
A block of code is set as follows:
# Calculate median (middle value)
median_sales = sales_data_year1.median()
print(f"The median monthly sales, a typical sales month, is {round(median_sales)} units.")
When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:
# Calculate standard deviation (measure of the amount of variation)
std_dev_sales = sales_data_year1.std()
print(f"The standard deviation, showing the typical variation from the mean sales, is {round(std_dev_sales)} units.")
Bold: Indicates a new term, an important word, or words that you see onscreen. For instance, words in menus or dialog boxes appear in bold. Here is an example: Click File, then choose New Notebook from the dropdown.
Tips or important notes
Appear like this.
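The two code samples above call .median() and .std() on sales_data_year1, which is not defined in this section. The following is a minimal, self-contained sketch of the same two calculations, assuming sales_data_year1 holds twelve monthly unit-sales figures. The values are invented purely for illustration, and Python's standard statistics module stands in for whatever data library the later chapters use.

```python
import statistics

# Hypothetical monthly unit sales for one year.
# These numbers are illustrative only and do not come from the book.
sales_data_year1 = [120, 135, 128, 150, 142, 160, 155, 149, 138, 131, 145, 170]

# Calculate median (middle value of the sorted data)
median_sales = statistics.median(sales_data_year1)
print(f"The median monthly sales, a typical sales month, is {round(median_sales)} units.")

# Calculate sample standard deviation (measure of the amount of variation)
std_dev_sales = statistics.stdev(sales_data_year1)
print(f"The standard deviation, showing the typical variation from the mean sales, is {round(std_dev_sales)} units.")
```

If sales_data_year1 were a pandas Series instead, its .std() method would also return the sample standard deviation by default, so the two approaches should agree on the same data.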
Get in touch
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, email us at [email protected] and mention the book title in the subject of your message.
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.
Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address