ML Clustering
- kmeans
- dbscan
--------------------------
row = record = tuple = observation = instance = datapoint
k=2
- divide the above records into k groups
1. pick any k rows at random as the centers: Cr1, Cr2
2. find the distance from Dp1 to Cr1
and from Dp1 to Cr2
Dp1 to G1 if Cr1 is nearer
Dp1 to G2 if Cr2 is nearer
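points: (4,5) and (7,9), i.e. (x1,y1) and (x2,y2)
x difference: 4 - 7 = -3
y difference: 5 - 9 = -4
dist = sqrt((-3)**2 + (-4)**2) = sqrt(9 + 16) = 5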
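A small sketch of the distance step for the two points above. The notes do not name the distance measure, so straight-line (Euclidean) distance is assumed here; the function name euclidean is made up for this example.

```python
import math


def euclidean(p, q):
    """Straight-line distance between two 2-D points given as (x, y)."""
    (x1, y1), (x2, y2) = p, q
    return math.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2)


d = euclidean((4, 5), (7, 9))
print(d)  # 5.0
```

The sign of each difference does not matter because it is squared.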
compare the two distances a and b:
if a > b is true, e.g. 5 > 3, the point is closer to the second center
(18 - 12)**2 + (24 - 31)**2 + (10 - 11)**2
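The same sum of squared differences for rows with three columns can be sketched as below. The values are the ones from the notes. Taking the square root at the end, to get a Euclidean distance, is an assumption, since the notes stop at the squares.

```python
import math

a = (18, 24, 10)
b = (12, 31, 11)

# Subtract column by column, square each difference, add them up.
squared = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
distance = math.sqrt(squared)

print(squared)  # 86
```

For picking the nearest center the square root can be skipped: the smaller squared sum is also the smaller distance.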
(11+14+17+11+12+14) / 6 = 79 / 6 ≈ 13.17
(21+10+24+10+28+23) / 6 = 116 / 6 ≈ 19.33
(15+20+10+10+15+30) / 6 = 100 / 6 ≈ 16.67
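A sketch of the mean step. It assumes the three sums above are the three columns of the same six rows in one group, so each row here is rebuilt by reading the sums position by position. That pairing is a guess, not something the notes state.

```python
# Six rows of one group, three columns each (pairing is an assumption).
group = [
    (11, 21, 15),
    (14, 10, 20),
    (17, 24, 10),
    (11, 10, 10),
    (12, 28, 15),
    (14, 23, 30),
]

# zip(*group) turns rows into columns; the mean of each column is one
# coordinate of the new center.
centroid = tuple(sum(col) / len(col) for col in zip(*group))

print([round(c, 2) for c in centroid])  # [13.17, 19.33, 16.67]
```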
k = 3 (number of clusters)
pick k centroids at random:
d1 d2 d3
d4 to d1 5
d4 to d2 7
d4 to d3 3
d4 belongs to d3
d5 to d1 3
d5 to d2 7
d5 to d3 5
d5 belongs to d1
d6
...
d100
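The "belongs to" step is just picking the center with the smallest distance. A sketch using the distances written above for d4 and d5; the dictionary layout is only for illustration.

```python
# Distance from each point to each of the three chosen centers (from the notes).
distances = {
    "d4": {"d1": 5, "d2": 7, "d3": 3},
    "d5": {"d1": 3, "d2": 7, "d3": 5},
}

# For every point, keep the center whose distance is smallest.
assignment = {
    point: min(to_centers, key=to_centers.get)
    for point, to_centers in distances.items()
}

print(assignment)  # {'d4': 'd3', 'd5': 'd1'}
```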
-----------------------------------
c1: d1 d3* d9 ..
c2: d2 d5 d7 ..
c3: d4 d6 d8 ...
calc the mean of the c1 data points;
that mean becomes the new center of c1
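Putting the steps together: pick k rows as centers, assign each row to its nearest center, replace each center by the mean of its group, and repeat until the centers stop moving. The sketch below is one plain-Python way to do that, not a reference implementation. The names kmeans, dist and mean_point, the seed and max_iter parameters, and the six toy rows are all made up for this example, and Euclidean distance is assumed.

```python
import math
import random


def dist(p, q):
    """Euclidean distance between two points of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def mean_point(points):
    """Column-wise mean of a non-empty list of points."""
    return tuple(sum(col) / len(col) for col in zip(*points))


def kmeans(rows, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(rows, k)  # step 1: k random rows as centers
    groups = []
    for _ in range(max_iter):
        # step 2: put each row in the group of its nearest center
        groups = [[] for _ in range(k)]
        for row in rows:
            nearest = min(range(k), key=lambda i: dist(row, centers[i]))
            groups[nearest].append(row)
        # step 3: new center = mean of the group
        # (an empty group keeps its old center)
        new_centers = [
            mean_point(g) if g else centers[i] for i, g in enumerate(groups)
        ]
        if new_centers == centers:  # centers did not move: stop
            break
        centers = new_centers
    return centers, groups


# Toy data: two obvious groups (hypothetical values).
rows = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centers, groups = kmeans(rows, k=2)
print(sorted(sorted(g) for g in groups))
```

On this toy data the two groups come out as the three small points and the three large points. Which group is listed first depends on the random start.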