
MAHARANA PRATAP GROUP OF INSTITUTIONS

KOTHI MANDHANA, KANPUR


(Approved by AICTE, New Delhi and Affiliated to Dr. AKTU, Lucknow)

Digital Notes
[Department of Computer Application]
Subject Name : Artificial Intelligence
Subject Code : KCA301
Course : MCA
Branch : -
Semester : IIIrd
Prepared by : Mr. Narendra Kumar Sharma

Reference No./MCA/NARENDRA/KCA301/2/3
Unit – 5
PATTERN RECOGNITION

1. Patterns are everywhere in this digital world. A pattern can either be seen physically or it can be observed mathematically by applying algorithms.

Example: the colors on clothes, speech patterns, etc. In computer science, a pattern is represented using a vector of feature values.

1.1 What is Pattern Recognition?


Pattern recognition is the process of recognizing patterns by using machine learning algorithms. It can be defined as the classification of data based on knowledge already gained or on statistical information extracted from patterns and/or their representation. One of the important aspects of pattern recognition is its application potential.

Examples: speech recognition, speaker identification, multimedia document recognition (MDR), automatic medical diagnosis.

In a typical pattern recognition application, the raw data is processed and converted into a form that is amenable for a machine to use.

1.2 Pattern recognition involves classification and clustering of patterns.

 In classification, an appropriate class label is assigned to a pattern based on an abstraction that is generated using a set of training patterns or domain knowledge. Classification is used in supervised learning.
 Clustering generates a partition of the data that helps decision making, the specific decision-making activity of interest to us. Clustering is used in unsupervised learning.

1.3 Features may be represented as continuous, discrete or discrete binary variables. A feature is a function of one or more measurements, computed so that it quantifies some significant characteristic of the object.

Example: if we consider a face, then the eyes, ears, nose, etc. are features of the face.
A set of features taken together forms a feature vector.

Example: In the above example of the face, if all the features (eyes, ears, nose, etc.) are taken together, the sequence forms a feature vector ([eyes, ears, nose]). A feature vector is a sequence of features represented as a d-dimensional column vector. In the case of speech, MFCCs (Mel-Frequency Cepstral Coefficients) are the spectral features of the speech, and the sequence of the first 13 coefficients forms a feature vector.

1.4 Pattern recognition possesses the following features:

 A pattern recognition system should recognise familiar patterns quickly and accurately
 Recognize and classify unfamiliar objects
 Accurately recognize shapes and objects from different angles
 Identify patterns and objects even when partly hidden
 Recognize patterns quickly, with ease and with automaticity

1.5 Training and Learning in Pattern Recognition


Learning is the phenomenon through which a system gets trained and becomes adaptable enough to give results in an accurate manner. Learning is the most important phase, as how well the system performs on the data provided depends on which algorithms are applied to that data. The entire dataset is divided into two categories: one used for training the model, i.e. the training set, and the other used for testing the model after training, i.e. the testing set.

 Training set:
The training set is used to build the model. It consists of the set of examples (e.g. images) that are used to train the system. The training rules and algorithms give relevant information on how to associate input data with output decisions. The system is trained by applying these algorithms to the dataset; all the relevant information is extracted from the data and results are obtained. Generally, 80% of the dataset is taken as training data.

 Testing set:
Testing data is used to test the system. It is the set of data used to verify whether the system produces the correct output after being trained. Generally, 20% of the dataset is used for testing. Testing data is used to measure the accuracy of the system. Example: if a system that identifies which category a particular flower belongs to classifies seven out of ten flowers correctly and the rest wrongly, then its accuracy is 70%.
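As an illustration of the 80%/20% split and the accuracy measurement described above, the following is a minimal sketch using scikit-learn; the choice of the Iris flower dataset and of a k-nearest-neighbour classifier is an assumption made for illustration, not part of these notes.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)            # feature vectors and class labels

# Split the dataset: 80% for training, 20% for testing.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = KNeighborsClassifier(n_neighbors=3)  # any classifier could be used here
model.fit(X_train, y_train)                  # training phase

y_pred = model.predict(X_test)               # testing phase
print("Accuracy:", accuracy_score(y_test, y_pred))  # e.g. 7 correct out of 10 would give 0.70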

1.6 Real-time Examples and Explanations:


A pattern is a physical object or an abstract notion. While talking about the classes of animals, a description of an animal would be a pattern. While talking about various types of balls, a description of a ball is a pattern. In the case where balls are considered as patterns, the classes could be football, cricket ball, table tennis ball, etc. Given a new pattern, the class of the pattern is to be determined. The choice of attributes and the representation of patterns is a very important step in pattern classification. A good representation is one which makes use of discriminating attributes and also reduces the computational burden of pattern classification.

An obvious representation of a pattern is a vector. Each element of the vector can represent one attribute of the pattern. The first element of the vector will contain the value of the first attribute for the pattern being considered.

Example: While representing spherical objects, (25, 1) may represent a spherical object with 25 units of weight and 1 unit of diameter. The class label can form a part of the vector. If spherical objects belong to class 1, the vector would be (25, 1, 1), where the first element represents the weight of the object, the second element the diameter of the object and the third element the class of the object.
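A minimal sketch of this representation in code: each pattern is stored as a row whose last entry is the class label. The values other than (25, 1, 1) are invented purely for illustration.

import numpy as np

# Each row is one pattern: (weight, diameter, class label).
patterns = np.array([
    [25.0, 1.0, 1],    # the spherical object of class 1 from the example above
    [450.0, 22.0, 2],  # a hypothetical larger object of class 2
    [2.7, 4.0, 3],     # a hypothetical small object of class 3
])

features = patterns[:, :2]  # the feature vectors (weight, diameter)
labels = patterns[:, 2]     # the class labels
print(features.shape, labels)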

1.7 Advantages:
 Pattern recognition solves classification problems.
 Pattern recognition solves the problem of fake biometric detection.
 It is useful for cloth pattern recognition for visually impaired people.
 It helps in speaker diarization.
 We can recognise a particular object from different angles.

1.8 Disadvantages:
 The syntactic pattern recognition approach is complex to implement and is a very slow process.
 Sometimes a larger dataset is required to obtain better accuracy.
 It cannot explain why a particular object is recognized.
Example: my face vs my friend's face.

1.9 Applications:

 Image processing, segmentation and analysis
Pattern recognition is used to give human-like recognition intelligence to machines, which is required in image processing.

 Computer vision
Pattern recognition is used to extract meaningful features from given image/video samples and is used in computer vision for various applications such as biological and biomedical imaging.

 Seismic analysis
The pattern recognition approach is used for the discovery, imaging and interpretation of temporal patterns in seismic array recordings. Statistical pattern recognition is implemented and used in different types of seismic analysis models.

 Radar signal classification/analysis
Pattern recognition and signal processing methods are used in various applications of radar signal classification, such as AP mine detection and identification.

 Speech recognition
The greatest success in speech recognition has been obtained using pattern recognition paradigms. It is used in various speech recognition algorithms which try to avoid the problems of using a phoneme-level description and treat larger units, such as words, as patterns.

 Fingerprint identification
The fingerprint recognition technique is a dominant technology in the biometric market. A number of recognition methods have been used to perform fingerprint matching, out of which pattern recognition approaches are widely used.

2. Basics and Design Principles


2.1 Pattern Recognition System
Patterns are everywhere in this digital world. A pattern can either be seen physically or it can be observed mathematically by applying algorithms.

In pattern recognition, a pattern comprises the following two fundamental things:

 A collection of observations
 The concept behind the observations

2.2 Feature Vector:

The collection of observations is also known as a feature vector. A feature is a distinctive, measurable characteristic of an object that sets it apart from other objects. A feature vector is the combination of n features arranged as an n-dimensional column vector. Different classes may have different feature values, while patterns of the same class tend to have similar feature values.

Example considerations: differentiating between good and bad features; feature properties.

2.3 Classifier and Decision Boundaries:


1. In a statistical classification problem, a decision boundary is a hypersurface that partitions the underlying vector space into two sets. A decision boundary is the region of a problem space in which the output label of a classifier is ambiguous. A classifier is a hypothesis or discrete-valued function that is used to assign (categorical) class labels to particular data points.
2. A classifier is used to partition the feature space into class-labelled decision regions, while decision boundaries are the borders between decision regions.
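As a small sketch of these ideas, a linear classifier trained on two-dimensional points learns a boundary of the form w1*x1 + w2*x2 + b = 0; points on either side of this line fall into different decision regions. The data and the choice of logistic regression below are assumptions made for illustration.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Two toy classes in a 2-D feature space (invented data).
X = np.array([[1, 1], [2, 1], [1, 2],    # class 0
              [5, 5], [6, 5], [5, 6]])   # class 1
y = np.array([0, 0, 0, 1, 1, 1])

clf = LogisticRegression().fit(X, y)

# The learned decision boundary is the line w1*x1 + w2*x2 + b = 0.
w, b = clf.coef_[0], clf.intercept_[0]
print("boundary coefficients:", w, b)
print("predicted region for (3, 3):", clf.predict([[3, 3]])[0])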

2.4 Components in Pattern Recognition System:
A pattern recognition system can be partitioned into components. There are five typical components for various pattern recognition systems. These are as follows:

 A Sensor: A sensor is a device used to measure a property, such as pressure, position, temperature, or acceleration, and respond with feedback.

 A Preprocessing Mechanism: Segmentation is used here; it is the process of partitioning data into multiple segments. It can also be defined as the technique of dividing or partitioning data into parts called segments.

 A Feature Extraction Mechanism: Feature extraction starts from an initial set of measured data and builds derived values (features) intended to be informative and non-redundant, facilitating the subsequent learning and generalization steps and, in some cases, leading to better human interpretations. It can be manual or automated.

 A Description Algorithm: Pattern recognition algorithms generally aim to provide a reasonable answer for all possible inputs and to perform “most likely” matching of the inputs, taking into account their statistical variation.

 A Training Set: Training data is a certain percentage of an overall dataset, the remainder forming the testing set. As a rule, the better the training data, the better the algorithm or classifier performs.
2.5 Design Principles of Pattern Recognition
In a pattern recognition system, two basic approaches are used for recognizing the pattern or structure, and each can be implemented using different techniques. These are –
 Statistical Approach and
 Structural Approach

1. Statistical Approach:
Statistical methods are mathematical formulas, models and techniques that are used in the statistical analysis of raw data. The application of statistical methods extracts information from the data and provides different ways to assess the robustness of the results.
Two main kinds of statistical methods are used:
i. Descriptive Statistics: summarizes data from a sample using indexes such as the mean or standard deviation.
ii. Inferential Statistics: draws conclusions from data that are subject to random variation.

2. Structural Approach:
The structural (syntactic) approach describes a pattern in terms of simpler sub-patterns, called primitives, and the rules by which they are combined, much as a sentence is built up from words according to a grammar. The different arrangements of primitives, in one accepted style or another, form the structures that are recognized.
Types of structures (by analogy with language):
 Sentence patterns
 Phrase patterns
 Formulas
 Idioms

2.6 Difference Between Statistical Approach and Structural Approach:

Sr. No.  Statistical Approach                Structural Approach
1        Statistical decision theory.        Human perception and cognition.
2        Quantitative features.              Morphological primitives.
3        Fixed number of features.           Variable number of primitives.
4        Ignores feature relationships.      Captures relationships among primitives.
5        Semantics from feature position.    Semantics from primitive encoding.
6        Statistical classifiers.            Syntactic grammars.

3. Dimension Reduction
In pattern recognition, dimension reduction is defined as-
 It is a process of converting a data set having vast dimensions into a data set with fewer dimensions.
 It ensures that the converted data set conveys similar information concisely.

Example-
Consider the following example-
 The following graph shows two dimensions, x1 and x2.
 x1 represents the measurement of several objects in cm.
 x2 represents the measurement of several objects in inches.

In machine learning,
 Using both these dimensions conveys similar information.
 They also introduce a lot of noise into the system.
 So, it is better to use just one dimension.

3.1 Using dimension reduction techniques-

 We convert the dimensions of the data from 2 dimensions (x1 and x2) to 1 dimension (z1).
 This makes the data relatively easier to explain.

3.2 Benefits-
Dimension reduction offers several benefits, such as-
 It compresses the data and thus reduces the storage space requirements.
 It reduces the time required for computation, since fewer dimensions require less computation.
 It eliminates redundant features.
 It improves the model performance.

3.3 Dimension Reduction Techniques-


The two popular and well-known dimension reduction techniques are-

1. Principal Component Analysis (PCA)


2. Linear Discriminant Analysis (LDA)

1. Principal Component Analysis (PCA)-

 Principal Component Analysis is a well-known dimension reduction technique.
 It transforms the variables into a new set of variables called principal components.
 These principal components are linear combinations of the original variables and are orthogonal.
 The first principal component accounts for most of the possible variation in the original data.
 The second principal component does its best to capture the remaining variance in the data.
 There can be only two principal components for a two-dimensional data set.

PCA Algorithm-
The steps involved in PCA Algorithm are as follows-

Step-01: Get data.

Step-02: Compute the mean vector (µ).

Step-03: Subtract mean from the given data.

Step-04: Calculate the covariance matrix.

Step-05: Calculate the eigenvectors and eigenvalues of the covariance matrix.

Step-06: Choose components and form a feature vector.

Step-07: Derive the new data set.
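A minimal NumPy sketch of these steps is given below; the small data matrix X (one row per observation) and the decision to keep a single principal component are assumptions made for illustration.

import numpy as np

# Step-01: get data (rows = observations, columns = original dimensions x1, x2).
X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9],
              [1.9, 2.2], [3.1, 3.0], [2.3, 2.7]])

# Step-02 and Step-03: compute the mean vector and subtract it from the data.
mu = X.mean(axis=0)
X_centered = X - mu

# Step-04: calculate the covariance matrix.
cov = np.cov(X_centered, rowvar=False)

# Step-05: calculate the eigenvectors and eigenvalues of the covariance matrix.
eig_vals, eig_vecs = np.linalg.eigh(cov)

# Step-06: choose components (largest eigenvalues first) and form the feature vector.
order = np.argsort(eig_vals)[::-1]
components = eig_vecs[:, order[:1]]   # keep one principal component

# Step-07: derive the new data set by projecting onto the chosen components.
Z = X_centered @ components           # the reduced, one-dimensional representation z1
print(Z.ravel())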

2. Linear Discriminant Analysis (LDA)-


Linear Discriminant Analysis, also called Normal Discriminant Analysis or Discriminant Function Analysis, is a dimensionality reduction technique commonly used for supervised classification problems. It is used for modelling differences between groups, i.e. separating two or more classes, and it projects the features from a higher-dimensional space into a lower-dimensional space.
For example, suppose we have two classes and we need to separate them efficiently. Classes can have multiple features. Using only a single feature to classify them may result in some overlapping, as shown in the figure below. So, we keep on increasing the number of features for proper classification.

Example:
Suppose we have two sets of data points belonging to two different classes that we want to classify. As shown in the given 2D graph, when the data points are plotted on the 2D plane, there is no straight line that can separate the two classes of data points completely. Hence, in this case, LDA (Linear Discriminant Analysis) is used, which reduces the 2D graph into a 1D graph in order to maximize the separability between the two classes.

Here, Linear Discriminant Analysis uses both the axes (X and Y) to create a new axis and projects the data onto this new axis in a way that maximizes the separation of the two categories, hence reducing the 2D graph into a 1D graph.
Two criteria are used by LDA to create a new axis:

1. Maximize the distance between means of the two classes.


2. Minimize the variation within each class.

In the above graph, it can be seen that a new axis (in red) is generated and plotted in the 2D graph such that it maximizes the distance between the means of the two classes and minimizes the variation within each class. In simple terms, this newly generated axis increases the separation between the data points of the two classes. After generating this new axis using the above-mentioned criteria, all the data points of the classes are plotted on this new axis, as shown in the figure given below.

But Linear Discriminant Analysis fails when the means of the distributions are shared, as it becomes impossible for LDA to find a new axis that makes both classes linearly separable. In such cases, we use non-linear discriminant analysis.
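A minimal sketch of this 2D-to-1D projection using scikit-learn's LinearDiscriminantAnalysis; the two small clusters of points below are invented for illustration.

import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Two classes of 2-D points (made-up data).
X = np.array([[1.0, 2.0], [1.5, 1.8], [2.0, 2.2],   # class 0
              [6.0, 7.0], [6.5, 6.8], [7.0, 7.5]])  # class 1
y = np.array([0, 0, 0, 1, 1, 1])

lda = LinearDiscriminantAnalysis(n_components=1)
Z = lda.fit_transform(X, y)   # project the 2-D points onto the new 1-D axis

print(Z.ravel())              # 1-D coordinates that maximize class separation
print(lda.predict([[2.0, 2.0], [6.5, 7.0]]))  # LDA can also classify new points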

Extensions to LDA:
1. Quadratic Discriminant Analysis (QDA): Each class uses its own estimate of variance (or covariance when there are multiple input variables).
2. Flexible Discriminant Analysis (FDA): Non-linear combinations of inputs are used, such as splines.
3. Regularized Discriminant Analysis (RDA): Introduces regularization into the estimate of the variance (actually the covariance), moderating the influence of different variables on LDA.

LDA Applications:
1. Face Recognition: In the field of computer vision, face recognition is a very popular application in which each face is represented by a very large number of pixel values. Linear discriminant analysis (LDA) is used here to reduce the number of features to a more manageable number before the process of classification. Each of the new dimensions generated is a linear combination of pixel values, which forms a template. The linear combinations obtained using Fisher's linear discriminant are called Fisher faces.
2. Medical: In this field, linear discriminant analysis (LDA) is used to classify a patient's disease state as mild, moderate or severe based on the patient's various parameters and the medical treatment he or she is undergoing. This helps the doctors to intensify or reduce the pace of the treatment.
3. Customer Identification: Suppose we want to identify the type of customers most likely to buy a particular product in a shopping mall. By doing a simple question-and-answer survey, we can gather all the features of the customers. Here, linear discriminant analysis helps us to identify and select the features which can describe the characteristics of the group of customers that are most likely to buy that particular product in the shopping mall.

4. K-Nearest Neighbours
K-Nearest Neighbours is one of the most basic yet essential classification algorithms in machine learning. It belongs to the supervised learning domain and finds intense application in pattern recognition, data mining and intrusion detection.

It is widely applicable in real-life scenarios since it is non-parametric, meaning it does not make any underlying assumptions about the distribution of the data (as opposed to other algorithms such as GMM, which assume a Gaussian distribution of the given data).
We are given some prior data (also called training data), which classifies coordinates into groups identified by an attribute.

As an example, consider the following table of data points containing two features:

Now, given another set of data points (also called testing data), allocate these points to a group by analysing the training set. Note that the unclassified points are marked as 'White'.

4.1 Intuition
If we plot these points on a graph, we may be able to locate some clusters or groups. Now, given an unclassified point, we can assign it to a group by observing what group its nearest neighbors belong to. This means a point close to a cluster of points classified as 'Red' has a higher probability of getting classified as 'Red'.
Intuitively, we can see that the first point (2.5, 7) should be classified as 'Green' and the second point (5.5, 4.5) should be classified as 'Red'.

4.2 Algorithm

Let m be the number of training data samples. Let p be an unknown point.

1. Store the training samples in an array of data points arr[], where each element represents a tuple (x, y).
2. For i = 0 to m-1:
3.     Calculate the Euclidean distance d(arr[i], p).
4. Make a set S of the K smallest distances obtained. Each of these distances corresponds to an already classified data point.
5. Return the majority label among S.
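A minimal Python sketch of this procedure is shown below; the training points, the query points and the choice K = 3 are invented for illustration.

import math
from collections import Counter

def knn_classify(training, p, k=3):
    """training: list of ((x, y), label) pairs; p: unknown point (x, y)."""
    # Steps 2-3: compute the Euclidean distance from p to every training sample.
    distances = [(math.dist(point, p), label) for point, label in training]
    # Step 4: keep the K smallest distances and their labels.
    k_nearest = sorted(distances)[:k]
    # Step 5: return the majority label among the K nearest neighbors.
    labels = [label for _, label in k_nearest]
    return Counter(labels).most_common(1)[0][0]

# Made-up training data: two groups of points labelled 'Green' and 'Red'.
training = [((2, 7), 'Green'), ((3, 8), 'Green'), ((2, 6), 'Green'),
            ((5, 4), 'Red'), ((6, 5), 'Red'), ((5.5, 5), 'Red')]

print(knn_classify(training, (2.5, 7)))    # expected: 'Green'
print(knn_classify(training, (5.5, 4.5)))  # expected: 'Red'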
5. Naive Bayes
Naive Bayes is a machine learning model that is used for large volumes of data; even if you are working with data that has millions of records, the recommended approach is Naive Bayes. It gives very good results when it comes to NLP tasks such as sentiment analysis. It is a fast and uncomplicated classification algorithm.

5.1 Bayes Theorem


It is a theorem that works on conditional probability. Conditional probability is the probability that something will happen, given that something else has already occurred. Conditional probability can give us the probability of an event using its prior knowledge.

Bayes' theorem, for a hypothesis H and evidence E:

P(H|E) = P(E|H) * P(H) / P(E)

Where,
P(H): The probability of hypothesis H being true. This is known as the prior probability.
P(E): The probability of the evidence.
P(E|H): The probability of the evidence given that the hypothesis is true.
P(H|E): The probability of the hypothesis given that the evidence is true (the posterior probability).
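A small worked example of this formula (the numbers are invented): suppose 1% of emails are spam, a certain keyword appears in 90% of spam emails and in 5% of non-spam emails; then the probability that an email containing the keyword is spam can be computed as follows.

# Hypothetical numbers: P(H) = P(spam) = 0.01,
# P(E|H) = P(keyword | spam) = 0.90, P(E|not H) = P(keyword | not spam) = 0.05.
p_h = 0.01
p_e_given_h = 0.90
p_e_given_not_h = 0.05

# Total probability of the evidence: P(E) = P(E|H)P(H) + P(E|not H)P(not H).
p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)

# Bayes' theorem: P(H|E) = P(E|H) * P(H) / P(E).
p_h_given_e = p_e_given_h * p_h / p_e
print(round(p_h_given_e, 3))   # about 0.154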

5.2 Naive Bayes Classifier


A classifier is a machine learning model that segregates different objects on the basis of certain features or variables.

The Naive Bayes classifier is a kind of classifier that works on Bayes' theorem. A prediction of membership probability is made for every class, i.e. the probability that a data point is associated with a particular class.

The class having the maximum probability is taken as the most suitable class. This is also referred to as Maximum A Posteriori (MAP).

 The MAP estimate for a hypothesis H given evidence E is:

 MAP(H) = max P(H|E)
 MAP(H) = max (P(E|H) * P(H) / P(E))
 MAP(H) = max (P(E|H) * P(H))
 P(E) is the evidence probability and is used to normalize the result; the result is not affected by removing P(E).

Naive Bayes classifiers assume that all the variables or features are independent of one another: the existence or absence of one variable does not impact the existence or absence of any other variable. For example,
 A fruit may be observed to be an apple if it is red, round, and about 4″ in diameter.
 In this case, even if all the features are interrelated, a Naive Bayes classifier will treat each of them as contributing independently to the probability that the fruit is an apple.

Types Of Naive Bayes Algorithms:

1. Gaussian Naïve Bayes: When feature values are continuous in nature, an assumption is made that the values linked with each class are distributed according to a Gaussian, that is, Normal, distribution.

2. Multinomial Naïve Bayes: Multinomial Naive Bayes is favoured for data that is multinomially distributed. It is widely used in text classification in NLP, where each event constitutes the presence of a word in a document.

3. Bernoulli Naïve Bayes: When data is distributed according to multivariate Bernoulli distributions, Bernoulli Naive Bayes is used. That means there exist multiple features, but each one is assumed to contain a binary value. So it requires features to be binary-valued.
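A minimal sketch of the Gaussian variant using scikit-learn's GaussianNB; the choice of the Iris dataset is an assumption for illustration, and any data with continuous features would do.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

nb = GaussianNB()        # assumes Gaussian-distributed feature values within each class
nb.fit(X_train, y_train)

# predict_proba gives the membership probability for every class;
# predict returns the class with the maximum posterior probability (MAP).
print(nb.predict_proba(X_test[:1]))
print(nb.predict(X_test[:1]))
print("Accuracy:", nb.score(X_test, y_test))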

5.3 Applications of Naive Bayes Algorithms


 Real-time Prediction: Being a fast learning algorithm, it can be used to make predictions in real time as well.
 Multi-class Classification: It can be used for multi-class classification problems as well.
 Text Classification: As it has shown good results in multi-class text problems, it has a higher success rate here than many other algorithms. As a result, it is widely used in sentiment analysis and spam detection.

6. Introduction to SVM

Support vector machines (SVMs) are powerful yet flexible supervised machine learning algorithms which are used both for classification and regression, but generally they are used for classification problems. SVMs were first introduced in the 1960s and were later refined in the 1990s. SVMs have their own unique way of implementation compared to other machine learning algorithms. Lately, they have become extremely popular because of their ability to handle multiple continuous and categorical variables.

6.1 Working of SVM

An SVM model is basically a representation of different classes by a hyperplane in multidimensional space. The hyperplane is generated in an iterative manner by SVM so that the error can be minimized. The goal of SVM is to divide the datasets into classes by finding a maximum marginal hyperplane (MMH).

The following are important concepts in SVM −

 Support Vectors − Data points that are closest to the hyperplane are called support vectors. The separating line is defined with the help of these data points.

 Hyperplane − As can be seen in the diagram above, it is a decision plane or space which is divided between a set of objects having different classes.

 Margin − It may be defined as the gap between two lines on the closest data points of different classes. It can be calculated as the perpendicular distance from the line to the support vectors. A large margin is considered a good margin and a small margin is considered a bad margin.

The main goal of SVM is to divide the datasets into classes by finding a maximum marginal hyperplane (MMH), and this is done in the following two steps −

 First, SVM generates hyperplanes iteratively that segregate the classes in the best way.

 Then, it chooses the hyperplane that separates the classes correctly with the maximum margin.
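A minimal sketch using scikit-learn's SVC with a linear kernel; the toy data are invented for illustration, and the support vectors reported are the points closest to the separating hyperplane.

import numpy as np
from sklearn.svm import SVC

# Two linearly separable classes in 2-D (made-up data).
X = np.array([[1, 2], [2, 3], [2, 1],    # class 0
              [6, 5], [7, 7], [8, 6]])   # class 1
y = np.array([0, 0, 0, 1, 1, 1])

svm = SVC(kernel='linear', C=1.0)
svm.fit(X, y)

print("support vectors:\n", svm.support_vectors_)   # points that define the margin
print("prediction for (4, 4):", svm.predict([[4, 4]])[0])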

7. Clustering
Clustering is the task of dividing the population or data points into a number of groups such that data points in the same group are more similar to other data points in that group and dissimilar to data points in other groups. It is basically a collection of objects grouped on the basis of similarity and dissimilarity between them.
For example, the data points in the graph below that are clustered together can be classified into one single group. We can distinguish the clusters, and we can identify that there are 3 clusters in the picture below.

It is not necessary for clusters to be spherical, for example:

DBSCAN: Density-Based Spatial Clustering of Applications with Noise

These data points are clustered by using the basic concept that a data point belongs to a cluster if it lies within a given distance of a sufficiently dense neighbourhood of points, rather than within a fixed distance of a cluster centre. Various distance measures and techniques are used for identifying the outliers (noise points).
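A minimal sketch using scikit-learn's DBSCAN; the eps and min_samples values and the toy data are assumptions chosen for illustration.

import numpy as np
from sklearn.cluster import DBSCAN

# Two dense groups of points plus one isolated outlier (made-up data).
X = np.array([[1.0, 1.0], [1.2, 1.1], [0.9, 1.3],
              [8.0, 8.0], [8.1, 8.2], [7.9, 8.1],
              [4.5, 0.0]])                 # far from both groups

db = DBSCAN(eps=0.5, min_samples=2).fit(X)

# Points in dense regions get cluster labels 0, 1, ...; outliers get the label -1.
print(db.labels_)   # e.g. [0 0 0 1 1 1 -1]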

7.1 Why Clustering?

Clustering is very important as it determines the intrinsic grouping among the unlabelled data present. There are no universal criteria for a good clustering; it depends on the user and on what criteria satisfy their needs. For instance, we could be interested in finding representatives for homogeneous groups (data reduction), in finding “natural clusters” and describing their unknown properties (“natural” data types), in finding useful and suitable groupings (“useful” data classes), or in finding unusual data objects (outlier detection). A clustering algorithm must make some assumptions about what constitutes the similarity of points, and each assumption yields different, equally valid clusters.

7.2 Clustering Methods:

 Density-Based Methods: These methods consider the clusters as dense regions having some similarity, different from the less dense regions of the space. These methods have good accuracy and the ability to merge two clusters. Examples: DBSCAN (Density-Based Spatial Clustering of Applications with Noise), OPTICS (Ordering Points To Identify Clustering Structure), etc.

 Hierarchical Based Methods: The clusters formed in this method form a tree-type structure based on the hierarchy. New clusters are formed using the previously formed ones. It is divided into two categories:
 Agglomerative (bottom-up approach)
 Divisive (top-down approach)
Examples: CURE (Clustering Using Representatives), BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies), etc.

 Partitioning Methods: These methods partition the objects into k clusters and each partition forms one cluster. This method is used to optimize an objective criterion similarity function, such as when distance is a major parameter. Examples: K-means, CLARANS (Clustering Large Applications based upon Randomized Search), etc.

 Grid-based Methods: In this method the data space is formulated into a finite number of cells that form a grid-like structure. All the clustering operations done on these grids are fast and independent of the number of data objects. Examples: STING (Statistical Information Grid), WaveCluster, CLIQUE (CLustering In QUEst), etc.

7.3 K-means Clustering Algorithm:

The K-means clustering algorithm is the simplest unsupervised learning algorithm that solves the clustering problem. The K-means algorithm partitions n observations into k clusters, where each observation belongs to the cluster with the nearest mean, which serves as a prototype of the cluster.

The algorithm categorizes the items into k groups of similarity. To calculate that similarity, we use the Euclidean distance as the measurement.

The algorithm works as follows:

1. First we initialize k points, called means, randomly.

2. We categorize each item to its closest mean and we update that mean's coordinates, which are the averages of the items categorized to that mean so far.

3. We repeat the process for a given number of iterations and, at the end, we have our clusters.
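A minimal NumPy sketch of these three steps; the data, the choice k = 2 and the fixed number of iterations are assumptions made for illustration.

import numpy as np

rng = np.random.default_rng(0)

# Made-up data: two loose groups of 2-D points.
X = np.array([[1.0, 1.0], [1.5, 2.0], [1.2, 0.8],
              [8.0, 8.0], [8.5, 7.5], [7.8, 8.2]])
k = 2

# Step 1: initialize k means by picking k random data points.
means = X[rng.choice(len(X), size=k, replace=False)]

for _ in range(10):   # Step 3: repeat for a given number of iterations
    # Step 2a: categorize each item to its closest mean (Euclidean distance).
    distances = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)
    labels = distances.argmin(axis=1)
    # Step 2b: update each mean to the average of the items categorized to it.
    means = np.array([X[labels == j].mean(axis=0) for j in range(k)])

print("cluster assignments:", labels)
print("cluster means:\n", means)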

Applications of Clustering in different fields

 Marketing: It can be used to characterize and discover customer segments for marketing purposes.
 Biology: It can be used for classification among different species of plants and animals.
 Libraries: It is used to cluster different books on the basis of topics and information.
 Insurance: It is used to understand customers and their policies, and to identify frauds.

References
1. Stuart Russell, Peter Norvig, “Artificial Intelligence – A Modern Approach”, Pearson
Education
2. Elaine Rich and Kevin Knight, “Artificial Intelligence”, McGraw-Hill
3. E Charniak and D McDermott, “Introduction to Artificial Intelligence”, Pearson Education
4. Dan W. Patterson, “Artificial Intelligence and Expert Systems”, Prentice Hall of India
5. https://www.geeksforgeeks.org/pattern-recognition-introduction/
6. https://www.geeksforgeeks.org/pattern-recognition-basics-and-design-principles/
7. https://www.gatevidyalay.com/principal-component-analysis-dimension-reduction/
8. https://www.geeksforgeeks.org/k-means-clustering-introduction/
9. https://www.javatpoint.com/k-means-clustering-algorithm-in-machine-learning

