
Assign Codes from Code Book to Observations using SciPy Cluster VQ Module
In k-means workflows, the scipy.cluster.vq.vq(obs, code_book, check_finite = True) function is used to assign a code from a code book to each observation. It compares each observation vector in the ‘M’ by ‘N’ obs array with the centroids in the code book and assigns each observation the code of its closest centroid. The features in the obs array should have unit variance, which we can achieve by passing them through the scipy.cluster.vq.whiten(obs, check_finite = True) function.
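The whiten-then-assign flow described above can be sketched as follows. This is only a minimal illustration: the raw_obs values and the hand-picked code book are made up for the example and are not part of the SciPy documentation.

```python
import numpy as np
from scipy.cluster.vq import whiten, vq

# Hypothetical raw observations: 4 observations, 2 features on very different scales
raw_obs = np.array([[1.0, 100.0],
                    [2.0, 300.0],
                    [3.0, 200.0],
                    [4.0, 400.0]])

# whiten() divides each column by its standard deviation,
# so every feature ends up with unit variance
obs = whiten(raw_obs)

# Illustrative code book: reuse the first and last whitened observations as codes
code_book = obs[[0, 3]]

# Assign each observation the index of its nearest code
code, dist = vq(obs, code_book)
print(code)
print(dist)
```

Because the first and last observations are themselves the codes here, they are assigned to codes 0 and 1 with a distance of zero.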
Parameters
The parameters of the function scipy.cluster.vq.vq(obs, code_book, check_finite = True) are given below −
obs− ndarray
It is an ‘M’ by ‘N’ array where each row is an observation, and the columns are the features seen during each observation. The example is given below −
obs = [[ 1., 1., 1.], [ 2., 2., 2.], [ 3., 3., 3.], [ 4., 4., 4.]]
code_book− ndarray
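It is a ‘k’ by ‘N’ array, usually generated by using the k-means algorithm, where each row holds a different code (centroid), and the columns are the features of that code. It must have the same number of columns as the obs array.
The example is given below. Here each of the three codes has four features, so this code book would pair with an obs array that has four columns −
code_book = [ [ 1., 2., 3., 4.], [ 1., 2., 3., 4.], [ 1., 2., 3., 4.]]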
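As a rough sketch of how such a code book might be generated, the snippet below runs scipy.cluster.vq.kmeans on whitened data and then passes the result to vq. The sample points and the choice of initial centroids are assumptions made for illustration. An explicit initial guess is passed instead of a cluster count so that the run is deterministic.

```python
import numpy as np
from scipy.cluster.vq import whiten, kmeans, vq

# Hypothetical data: two well-separated groups of three points each
raw_obs = np.array([[1.0, 1.0], [1.2, 0.8], [0.8, 1.1],
                    [5.0, 5.0], [5.1, 4.9], [4.8, 5.2]])
obs = whiten(raw_obs)

# Use one point from each group as the initial centroids (an illustrative choice)
initial_guess = obs[[0, 3]]

# kmeans returns the code book (k by N) and the mean distortion
code_book, distortion = kmeans(obs, initial_guess)

# Assign every observation to its nearest centroid
code, dist = vq(obs, code_book)
print(code_book.shape)
print(code)
```

With this data, each row of code_book is the mean of one group, and code holds 0 for the first three observations and 1 for the last three.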
check_finite− bool, optional
This parameter controls whether the input arrays are checked to contain only finite numbers. Disabling the check may give you a performance gain, but it may also result in problems such as crashes or non-termination if the inputs do contain infinities or NaNs. The default value of this parameter is True.
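A small sketch of the default behaviour, assuming a made-up observation array that contains an infinity: with check_finite left at True, the call is expected to fail with a ValueError rather than proceed.

```python
import numpy as np
from scipy.cluster.vq import vq

code_book = np.array([[1., 1., 1.], [2., 2., 2.]])

# Hypothetical observations with one non-finite entry
bad_obs = np.array([[2.9, 1.3, 1.9],
                    [np.inf, 3.2, 1.1]])

try:
    vq(bad_obs, code_book)  # check_finite defaults to True
    rejected = False
except ValueError as err:
    rejected = True
    print("Input rejected:", err)
```

Passing check_finite=False skips this validation, so it is only reasonable when the inputs are already known to be finite.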
Returns
code− ndarray
It returns a one-dimensional array of length ‘M’ which holds the code book index for each observation.
dist− ndarray
It also returns the distortion, that is, the distance between each observation and its nearest code.
Example
import numpy as np
from scipy.cluster.vq import vq

code_book = np.array([[1., 1., 1.], [2., 2., 2.]])
observations = np.array([[2.9, 1.3, 1.9], [1.7, 3.2, 1.1], [1.0, 0.2, 1.7]])
vq(observations, code_book)
Output
(array([1, 1, 0]), array([1.14455231, 1.52970585, 1.06301458]))
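The output above can be cross-checked by hand. The sketch below computes the Euclidean distance from every observation to every code with plain NumPy and then picks the nearest one. The names manual_code and manual_dist are introduced here only for illustration.

```python
import numpy as np
from scipy.cluster.vq import vq

code_book = np.array([[1., 1., 1.], [2., 2., 2.]])
observations = np.array([[2.9, 1.3, 1.9], [1.7, 3.2, 1.1], [1.0, 0.2, 1.7]])

# Pairwise differences, shape (M, k, N), via broadcasting
diffs = observations[:, np.newaxis, :] - code_book[np.newaxis, :, :]

# Euclidean distance from each observation to each code, shape (M, k)
all_dists = np.sqrt((diffs ** 2).sum(axis=2))

# Index of the nearest code and the distance to it
manual_code = all_dists.argmin(axis=1)
manual_dist = all_dists.min(axis=1)

code, dist = vq(observations, code_book)
print(np.array_equal(code, manual_code))  # True
print(np.allclose(dist, manual_dist))     # True
```

For example, the first observation lies about 2.12 from the code [1, 1, 1] and about 1.14 from [2, 2, 2], so it is assigned index 1.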