ML
Principles and Practices
Chapter 1: Introduction to Machine Learning
Machine learning is a subfield of artificial intelligence (AI) that focuses on developing
algorithms and statistical models that enable computers to perform tasks without explicit
instructions, relying instead on patterns and inference. The core idea is to let computers
learn from data, identify patterns, and make decisions with minimal human intervention.
Training a machine learning model involves feeding it data and allowing it to learn the
relationships between the input features and the target outputs. The model's performance is
evaluated on a separate test dataset to check that it generalizes to unseen data. Common
evaluation metrics for classification include accuracy, precision, recall, and F1-score.
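The four metrics named above can be computed directly from the counts of true/false positives and negatives. The following is a minimal sketch for the binary case; the function name `classification_metrics` and the label lists are illustrative assumptions, not part of the original text.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical held-out test labels and model predictions.
y_test = [1, 0, 1, 1, 0, 0, 1, 0]
y_hat = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_test, y_hat)
print(metrics)
```

Here the model makes one false positive and one false negative out of eight predictions, so all four metrics come out to 0.75. On imbalanced data the metrics typically diverge, which is why accuracy alone is rarely reported.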
Feature engineering is the process of selecting, modifying, or creating new input features to
improve model performance. It involves techniques such as normalization, encoding categorical
variables, and creating interaction terms. Effective feature engineering can significantly enhance
a model's predictive power.
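The three techniques mentioned above can be sketched in a few lines each. This example assumes min-max scaling as the normalization method and one-hot encoding for the categorical variable; the text does not specify either, and the helper names and toy data are hypothetical.

```python
def min_max_normalize(values):
    """Rescale numeric values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column carries no information
    return [(v - lo) / (hi - lo) for v in values]


def one_hot_encode(values):
    """Map each category to a 0/1 indicator vector (categories sorted by name)."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, encoded


def interaction(a, b):
    """Create an interaction term as the elementwise product of two features."""
    return [x * y for x, y in zip(a, b)]


# Hypothetical raw features for three records.
ages = [20, 30, 40]
incomes = [30000, 60000, 90000]
cities = ["paris", "tokyo", "paris"]

age_scaled = min_max_normalize(ages)
income_scaled = min_max_normalize(incomes)
categories, city_onehot = one_hot_encode(cities)
age_x_income = interaction(age_scaled, income_scaled)

print(age_scaled, categories, city_onehot, age_x_income)
```

In practice the scaling parameters (here `lo` and `hi`) should be computed on the training data only and then reused on the test data, so that no information leaks from the test set.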
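$$y = \beta_0 + \beta_1 x + \epsilon$$
where $y$ is the dependent variable, $x$ is the independent variable, $\beta_0$ is the y-intercept, $\beta_1$ is the slope, and $\epsilon$ is the error term.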
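To make the roles of the intercept and slope concrete, the sketch below estimates both from data using ordinary least squares. The text does not state an estimation method, so least squares is an assumption here, as are the function name and the toy data.

```python
def fit_simple_linear_regression(xs, ys):
    """Estimate (beta0, beta1) for y = beta0 + beta1 * x by ordinary least squares."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Slope = covariance(x, y) / variance(x); intercept makes the line pass
    # through the point of means.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    beta1 = sxy / sxx
    beta0 = y_mean - beta1 * x_mean
    return beta0, beta1


# Hypothetical noisy observations lying roughly on y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

beta0, beta1 = fit_simple_linear_regression(xs, ys)
# The residuals are the fitted counterpart of the error term.
residuals = [y - (beta0 + beta1 * x) for x, y in zip(xs, ys)]
print(round(beta0, 2), round(beta1, 2))
```

For this data the fitted intercept is about 1.06 and the slope about 1.97. With an intercept in the model, the least-squares residuals sum to zero up to rounding.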
K-means clustering is one of the simplest and most popular unsupervised learning algorithms. It
partitions the data into $k$ clusters, where each data point belongs to the cluster with the nearest mean. The algorithm
iteratively updates the cluster centroids and reassigns data points until convergence. The choice
of $k$ is crucial and can be determined using methods like the elbow method or silhouette analysis.
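The assign-then-update loop described above can be sketched as follows. This is a minimal illustration, assuming squared Euclidean distance and random initialization from the data points; the function names, the seed, and the toy data are hypothetical.

```python
import random


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, max_iters=100, seed=0):
    """Return (centroids, assignments) after iterating to convergence."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialize centroids from the data
    assignments = None
    for _ in range(max_iters):
        # Assignment step: each point joins the cluster with the nearest centroid.
        new_assignments = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = mean_point(members)
    return centroids, assignments


# Two hypothetical, well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (8.5, 9.0), (9.0, 8.0)]
centroids, assignments = k_means(points, k=2)
print(centroids, assignments)
```

On data this cleanly separated the first three points end up in one cluster and the last three in the other. On less tidy data the result can depend on the initialization, so it is common to run the algorithm several times with different seeds.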
3.1.2 Hierarchical Clustering
Hierarchical clustering builds a tree-like structure of nested clusters, known as a dendrogram. It
can be agglomerative (bottom-up) or divisive (top-down). In agglomerative clustering, each data
point starts as its own cluster, and pairs of clusters are merged as one moves up the hierarchy. In
divisive clustering, the process starts with a single cluster and splits recursively. The choice of
linkage criteria (e.g., single, complete, average) affects the shape of the dendrogram.
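The agglomerative procedure and the effect of the linkage criterion can be illustrated with a naive implementation that records each merge and the distance at which it happens, which is the information a dendrogram displays. This is a sketch for small inputs, not an efficient algorithm; the function names and the toy points are assumptions.

```python
import math


def linkage_distance(a, b, points, method):
    """Distance between clusters a and b (tuples of point indices)."""
    dists = [math.dist(points[i], points[j]) for i in a for j in b]
    if method == "single":
        return min(dists)  # closest pair
    if method == "complete":
        return max(dists)  # farthest pair
    if method == "average":
        return sum(dists) / len(dists)
    raise ValueError(f"unknown linkage method: {method}")


def agglomerative(points, method="single"):
    """Return the merge history as (cluster_a, cluster_b, distance) tuples."""
    clusters = [(i,) for i in range(len(points))]  # every point starts alone
    merges = []
    while len(clusters) > 1:
        # Find the closest pair of clusters under the chosen linkage.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage_distance(clusters[i], clusters[j], points, method)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical 1-D points, written as 1-tuples.
points = [(0.0,), (1.0,), (5.0,), (6.5,)]
for method in ("single", "complete", "average"):
    print(method, agglomerative(points, method))
```

All three linkages merge the same pairs in the same order on this data, but the height of the final merge differs (4.0 for single, 6.5 for complete, 5.25 for average), which changes the shape of the resulting dendrogram.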
Q-learning is a model-free reinforcement learning algorithm that seeks to find the best action to
take given the current state. It learns a function $Q(s, a)$ that estimates the expected cumulative reward of taking action $a$ in state
$s$ and following the optimal policy thereafter. The Q-values are updated iteratively using the
Bellman equation.
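A tabular version of the iterative update can be sketched on a tiny environment. The update used below is the standard rule $Q(s,a) \leftarrow Q(s,a) + \alpha\,[r + \gamma \max_{a'} Q(s',a') - Q(s,a)]$. The five-state chain environment, the epsilon-greedy exploration, and all hyperparameter values are assumptions made for illustration.

```python
import random

N_STATES = 5      # states 0..4; state 4 is the terminal goal (hypothetical chain)
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Deterministic chain dynamics: reward 1.0 only on reaching the goal."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Bellman target: immediate reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q = q_learning()
greedy_policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
print(greedy_policy)
```

After training, the greedy policy moves right in every non-terminal state, and the learned value of moving right shrinks by roughly a factor of `gamma` for each step further from the goal.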
Deep Q-Networks (DQNs) combine Q-learning with deep neural networks to handle high-dimensional
state spaces. A DQN uses a neural network to approximate the Q-value function, allowing it to
learn directly from raw pixel inputs in environments like video games. Techniques such as
experience replay and target networks are used to stabilize training.
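The two stabilization techniques can be sketched without a deep learning library. In the example below, the replay buffer stores past transitions and samples them at random, and the training targets are computed from a separate, frozen target function. The neural networks themselves are replaced by a plain Python callable, so this shows only the data flow, not a working DQN; the class and function names are hypothetical.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of transitions; the oldest entries are evicted first."""

    def __init__(self, capacity, seed=0):
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform random sampling breaks the correlation between consecutive steps.
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def td_targets(batch, target_q, gamma=0.99):
    """Compute r + gamma * max_a' Q_target(s', a') for each sampled transition.

    target_q(state) returns one Q-value per action and stands in for the
    target network, which is held fixed between periodic synchronizations.
    """
    targets = []
    for _, _, reward, next_state, done in batch:
        bootstrap = 0.0 if done else gamma * max(target_q(next_state))
        targets.append(reward + bootstrap)
    return targets


def target_q(state):
    # Hypothetical stand-in for a frozen target network with two actions.
    return [0.5 * state, 1.0]


buffer = ReplayBuffer(capacity=3)
for t in range(5):
    buffer.push(state=t, action=t % 2, reward=1.0, next_state=t + 1, done=(t == 4))

batch = buffer.sample(2)
print(len(buffer), td_targets(batch, target_q, gamma=0.9))
```

In a full DQN, the online network would be trained to match these targets on each sampled batch, and its weights would be copied into the target network every fixed number of steps.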
Feedforward neural networks are the simplest type of artificial neural network. Information moves
in one direction—from input nodes, through hidden nodes (if any), to output nodes. There are no
cycles or loops in the network.
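The one-directional flow of information can be seen in a forward pass written as a simple loop over layers, where each layer's output becomes the next layer's input. The layer sizes, weights, and activation functions below are arbitrary values chosen for illustration.

```python
import math


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W @ inputs + b), one row per unit."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(x, layers):
    """Propagate x through the layers in order; nothing flows backwards."""
    for weights, biases, activation in layers:
        x = dense(x, weights, biases, activation)
    return x


# Hypothetical network: 2 inputs -> 2 hidden units (ReLU) -> 1 output (sigmoid).
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0], relu),
    ([[1.0, 1.0]], [0.0], sigmoid),
]

hidden = dense([2.0, 1.0], *layers[0])
output = forward([2.0, 1.0], layers)
print(hidden, output)
```

For the input `[2.0, 1.0]` the hidden layer produces `[1.0, 1.5]`, and the output unit applies the sigmoid to their sum, 2.5.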