KMeans algorithm in C++
Date: 2025-03-11 17:28:34
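### C++ Code Implementing the KMeans Algorithm
To implement K-means clustering (KMeans) in C++, you can define a `KMeans` class that encapsulates the main functionality. Here is a simple implementation of such a class: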
```cpp
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

class Point {
public:
    std::vector<double> coordinates;

    // Euclidean distance; both points are assumed to have the same dimension.
    double distance(const Point& other) const {
        double sum = 0.0;
        for (std::size_t i = 0; i < coordinates.size(); ++i) {
            const double diff = coordinates[i] - other.coordinates[i];
            sum += diff * diff;
        }
        return std::sqrt(sum);
    }
};

class KMeans {
private:
    int numClusters;
    std::vector<Point> centroids;
    bool initialized = false;

    // Index of the centroid nearest to p.
    int nearestCentroid(const Point& p) const {
        int closestCluster = 0;
        double minDistance = p.distance(centroids[0]);
        for (int cIndex = 1; cIndex < numClusters; ++cIndex) {
            const double dist = p.distance(centroids[cIndex]);
            if (dist < minDistance) {
                minDistance = dist;
                closestCluster = cIndex;
            }
        }
        return closestCluster;
    }

public:
    explicit KMeans(int k) : numClusters(k), centroids(k) {}

    // Pick k random input points as starting centroids (duplicates are possible).
    void initializeCentroids(const std::vector<Point>& points) {
        if (points.empty()) return;
        std::mt19937 rng(std::random_device{}());
        std::uniform_int_distribution<std::size_t> pick(0, points.size() - 1);
        for (int i = 0; i < numClusters; ++i) {
            centroids[i] = points[pick(rng)];
        }
        initialized = true;
    }

    std::vector<int> fitPredict(const std::vector<Point>& points) {
        if (points.empty() || numClusters <= 0) return {};
        // Without this, the default-constructed centroids would have no coordinates.
        if (!initialized) initializeCentroids(points);
        const std::size_t dim = points[0].coordinates.size();

        bool changed = true;
        while (changed) {
            changed = false;
            std::vector<std::vector<std::size_t>> clusters(numClusters);

            // Assign each point to the nearest centroid.
            for (std::size_t pIndex = 0; pIndex < points.size(); ++pIndex) {
                clusters[nearestCentroid(points[pIndex])].push_back(pIndex);
            }

            // Recalculate every centroid as the mean of its assigned points.
            for (int clusterId = 0; clusterId < numClusters; ++clusterId) {
                const auto& pointsInCluster = clusters[clusterId];
                if (pointsInCluster.empty()) continue;  // keep the old centroid

                Point newCentroid;
                newCentroid.coordinates.assign(dim, 0.0);
                for (std::size_t index : pointsInCluster) {
                    for (std::size_t coordIdx = 0; coordIdx < dim; ++coordIdx) {
                        newCentroid.coordinates[coordIdx] += points[index].coordinates[coordIdx];
                    }
                }
                for (std::size_t coordIdx = 0; coordIdx < dim; ++coordIdx) {
                    newCentroid.coordinates[coordIdx] /= static_cast<double>(pointsInCluster.size());
                }

                // A shift of at least 0.001 means the run has not converged yet.
                if (newCentroid.distance(centroids[clusterId]) >= 0.001) {
                    centroids[clusterId] = newCentroid;
                    changed = true;
                }
            }
        }

        // After convergence, label each point with its nearest centroid.
        std::vector<int> labels(points.size());
        for (std::size_t pIndex = 0; pIndex < points.size(); ++pIndex) {
            labels[pIndex] = nearestCentroid(points[pIndex]);
        }
        return labels;
    }
};
```
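This code implements the basic K-means clustering procedure: initializing the centroids, assigning each data point to its nearest cluster, and recomputing the centroid positions until convergence[^1]. Convergence here means that no centroid moves by 0.001 or more between iterations.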
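
To make the assign-then-update cycle easier to see in isolation, here is a minimal sketch of a single iteration written as a free function. The names `Vec`, `squaredDistance`, and `kmeansStep` are hypothetical helpers introduced only for this illustration, and the sketch assumes all points and centroids share the same dimension. It compares squared distances, which selects the same nearest centroid as the Euclidean distance while skipping the square root.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Vec = std::vector<double>;

// Squared Euclidean distance; a and b are assumed to have the same length.
double squaredDistance(const Vec& a, const Vec& b) {
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double diff = a[i] - b[i];
        sum += diff * diff;
    }
    return sum;
}

// One K-means iteration: assign every point to its nearest centroid, then
// move each centroid to the mean of its points. Returns the assignments.
std::vector<int> kmeansStep(const std::vector<Vec>& points, std::vector<Vec>& centroids) {
    assert(!centroids.empty());
    const std::size_t k = centroids.size();
    const std::size_t dim = centroids[0].size();
    std::vector<int> labels(points.size(), 0);
    std::vector<Vec> sums(k, Vec(dim, 0.0));   // per-cluster coordinate sums
    std::vector<std::size_t> counts(k, 0);     // per-cluster point counts

    for (std::size_t p = 0; p < points.size(); ++p) {
        double best = std::numeric_limits<double>::max();
        for (std::size_t c = 0; c < k; ++c) {
            const double d = squaredDistance(points[p], centroids[c]);
            if (d < best) {
                best = d;
                labels[p] = static_cast<int>(c);
            }
        }
        for (std::size_t i = 0; i < dim; ++i) sums[labels[p]][i] += points[p][i];
        ++counts[labels[p]];
    }

    for (std::size_t c = 0; c < k; ++c) {
        if (counts[c] == 0) continue;  // empty cluster: leave the centroid in place
        for (std::size_t i = 0; i < dim; ++i) {
            centroids[c][i] = sums[c][i] / static_cast<double>(counts[c]);
        }
    }
    return labels;
}
```

Calling `kmeansStep` repeatedly until the centroids stop moving gives the same overall loop as the class; accumulating sums and counts avoids storing a list of indices per cluster.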
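For image compression, the number of unique colors in the color space can be reduced so that only the most frequently occurring colors are kept[^2]. The implementation above, however, is a general-purpose KMeans and has not been optimized for any particular application.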
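
One common way to apply this idea is color quantization: cluster the pixel colors, then store for each pixel only the index of its nearest palette color. The sketch below covers just that last mapping step and assumes the palette has already been obtained (for example, from K-means centroids rounded to 8-bit RGB). `Rgb`, `squaredColorDistance`, and `quantize` are hypothetical names for this illustration, not part of the class above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Rgb = std::array<std::uint8_t, 3>;

// Squared distance between two colors in RGB space.
int squaredColorDistance(const Rgb& a, const Rgb& b) {
    int sum = 0;
    for (std::size_t i = 0; i < 3; ++i) {
        const int diff = static_cast<int>(a[i]) - static_cast<int>(b[i]);
        sum += diff * diff;
    }
    return sum;
}

// Map each pixel to the index of its nearest palette color.
// Assumes a non-empty palette of at most 256 entries so an index fits in one byte.
std::vector<std::uint8_t> quantize(const std::vector<Rgb>& pixels,
                                   const std::vector<Rgb>& palette) {
    assert(!palette.empty() && palette.size() <= 256);
    std::vector<std::uint8_t> indices(pixels.size(), 0);
    for (std::size_t p = 0; p < pixels.size(); ++p) {
        int best = squaredColorDistance(pixels[p], palette[0]);
        for (std::size_t c = 1; c < palette.size(); ++c) {
            const int d = squaredColorDistance(pixels[p], palette[c]);
            if (d < best) {
                best = d;
                indices[p] = static_cast<std::uint8_t>(c);
            }
        }
    }
    return indices;
}
```

With a palette of k colors, each pixel needs one small index plus a shared palette table instead of three full color bytes, which is where the size reduction comes from.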