C++ KMeans
Date: 2025-01-17 11:29:36
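### Implementing the KMeans Clustering Algorithm in C++
KMeans is a common unsupervised learning algorithm that automatically groups similar data points together[^2]. The following is a C++ implementation of KMeans clustering:
#### Initialization
During initialization, `k` data points are chosen at random as the initial centroids. The code below defines the `Point` type and the `KMeans` class interface; the random selection itself happens at the start of `fit`.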
```cpp
#include <vector>
#include <cmath>
#include <cstdlib>

struct Point {
    double x;
    double y;

    // Euclidean distance between two points.
    static double distance(const Point& p1, const Point& p2) {
        return std::sqrt(std::pow(p1.x - p2.x, 2) + std::pow(p1.y - p2.y, 2));
    }
};

class KMeans {
public:
    // Clusters `points` into `k` groups and stores the centroids in centroids_.
    void fit(const std::vector<Point>& points, int k);

private:
    std::vector<Point> centroids_;
};
```
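The random initialization described above can be sketched on its own. The `fit` implementation draws indices with `rand()`, which may select the same point more than once. As one possible alternative, the sketch below shuffles the indices and takes the first `k`, so the chosen indices are distinct (the point values can still coincide if the data contains duplicates). The helper name `sampleInitialCentroids` and the use of `std::mt19937` are assumptions for illustration, not part of the original class.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

struct Point {
    double x;
    double y;
};

// Hypothetical helper: returns k input points with pairwise distinct indices.
std::vector<Point> sampleInitialCentroids(const std::vector<Point>& points,
                                          std::size_t k,
                                          std::mt19937& rng) {
    assert(k <= points.size() && "need at least k points");

    // Build the index list 0, 1, ..., n-1 and shuffle it.
    std::vector<std::size_t> indices(points.size());
    std::iota(indices.begin(), indices.end(), std::size_t{0});
    std::shuffle(indices.begin(), indices.end(), rng);

    // The first k shuffled indices are a sample without replacement.
    std::vector<Point> centroids;
    centroids.reserve(k);
    for (std::size_t i = 0; i < k; ++i) {
        centroids.push_back(points[indices[i]]);
    }
    return centroids;
}
```

A caller would seed the generator once, for example with `std::mt19937 rng(std::random_device{}());`, and pass a fixed seed instead when reproducible runs are needed.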
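#### Iterative update
The centroids are adjusted iteratively until the convergence condition is met, which here means that no centroid changed during an iteration. Each iteration has two steps: assigning a cluster label to every point, then recomputing each centroid from the points assigned to it.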
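```cpp
#include <cstddef>  // std::size_t
#include <ctime>    // std::time
#include <limits>   // std::numeric_limits
#include <utility>  // std::pair

void KMeans::fit(const std::vector<Point>& points, int k) {
    // Nothing to cluster without points or with a non-positive k.
    if (points.empty() || k <= 0) {
        return;
    }
    centroids_.clear();  // drop centroids left over from an earlier call

    // Randomly pick k samples as the initial centroids (duplicates are possible).
    std::srand(static_cast<unsigned>(std::time(nullptr)));
    for (int i = 0; i < k; ++i) {
        std::size_t index = std::rand() % points.size();
        centroids_.push_back(points[index]);
    }

    // Safety cap (an assumed value): with fewer distinct points than k, the
    // empty-cluster re-seeding below would otherwise never let the loop end.
    const int kMaxIterations = 100;
    bool converged = false;
    for (int iter = 0; iter < kMaxIterations && !converged; ++iter) {
        std::vector<int> labels(points.size(), 0);

        // Step 1: assign each point to its nearest centroid.
        for (std::size_t i = 0; i < points.size(); ++i) {
            double minDist = std::numeric_limits<double>::infinity();
            for (std::size_t j = 0; j < centroids_.size(); ++j) {
                double dist = Point::distance(points[i], centroids_[j]);
                if (dist < minDist) {
                    minDist = dist;
                    labels[i] = static_cast<int>(j);
                }
            }
        }

        // Step 2: recompute each centroid as the mean of its assigned points.
        std::vector<std::pair<double, double>> sum(k, {0.0, 0.0});
        std::vector<int> count(k, 0);
        for (std::size_t i = 0; i < points.size(); ++i) {
            int label = labels[i];
            sum[label].first += points[i].x;
            sum[label].second += points[i].y;
            count[label]++;
        }

        bool changed = false;
        for (std::size_t i = 0; i < centroids_.size(); ++i) {
            if (count[i] > 0) {
                const Point oldCentroid = centroids_[i];
                centroids_[i].x = sum[i].first / count[i];
                centroids_[i].y = sum[i].second / count[i];
                if (oldCentroid.x != centroids_[i].x || oldCentroid.y != centroids_[i].y) {
                    changed = true;
                }
            } else {
                // Empty cluster: re-seed its centroid with a random point.
                std::size_t newIndex = std::rand() % points.size();
                centroids_[i] = points[newIndex];
                changed = true;
            }
        }
        converged = !changed;
    }
}
```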
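The two steps of the loop above can be traced by hand on a tiny data set. The sketch below is a standalone illustration rather than the class itself: `assignLabels`, `updateCentroids`, and `maxShift` are hypothetical helper names, and an empty cluster simply keeps its previous centroid here, which is a simplification compared with the random re-seeding used in `fit`. `maxShift` reports the largest centroid movement, so a caller could stop once it drops below a chosen tolerance instead of requiring exact equality.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

struct Point {
    double x;
    double y;
};

// Step 1: index of the nearest centroid for every point.
std::vector<std::size_t> assignLabels(const std::vector<Point>& points,
                                      const std::vector<Point>& centroids) {
    assert(!centroids.empty());
    std::vector<std::size_t> labels(points.size(), 0);
    for (std::size_t i = 0; i < points.size(); ++i) {
        double best = std::numeric_limits<double>::infinity();
        for (std::size_t j = 0; j < centroids.size(); ++j) {
            double d = std::hypot(points[i].x - centroids[j].x,
                                  points[i].y - centroids[j].y);
            if (d < best) {
                best = d;
                labels[i] = j;
            }
        }
    }
    return labels;
}

// Step 2: mean of the points assigned to each cluster.
// Simplification: an empty cluster keeps its previous centroid.
std::vector<Point> updateCentroids(const std::vector<Point>& points,
                                   const std::vector<std::size_t>& labels,
                                   const std::vector<Point>& previous) {
    assert(labels.size() == points.size());
    std::vector<Point> sums(previous.size(), Point{0.0, 0.0});
    std::vector<std::size_t> counts(previous.size(), 0);
    for (std::size_t i = 0; i < points.size(); ++i) {
        sums[labels[i]].x += points[i].x;
        sums[labels[i]].y += points[i].y;
        ++counts[labels[i]];
    }
    std::vector<Point> next = previous;
    for (std::size_t c = 0; c < previous.size(); ++c) {
        if (counts[c] > 0) {
            const double n = static_cast<double>(counts[c]);
            next[c] = Point{sums[c].x / n, sums[c].y / n};
        }
    }
    return next;
}

// Largest distance any centroid moved between two iterations.
double maxShift(const std::vector<Point>& before, const std::vector<Point>& after) {
    assert(before.size() == after.size());
    double shift = 0.0;
    for (std::size_t c = 0; c < before.size(); ++c) {
        shift = std::max(shift, std::hypot(after[c].x - before[c].x,
                                           after[c].y - before[c].y));
    }
    return shift;
}
```

With the points (0,0), (0,2), (10,0), (10,2) and starting centroids (0,0) and (10,0), one iteration assigns the first two points to cluster 0 and the last two to cluster 1, moves the centroids to (0,1) and (10,1), and `maxShift` reports 1.0. A second iteration leaves the centroids where they are, so the shift is 0.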
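This code shows how to implement the basic KMeans logic in C++[^3]. Note that practical use may require attention to further details, such as handling exceptional cases and improving performance.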
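As one example of handling exceptional cases, the inputs could be checked before clustering starts. The sketch below is a minimal illustration under assumptions: the function name `validateKMeansInput`, the specific checks, and the choice of `std::invalid_argument` are not taken from the original code.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

struct Point {
    double x;
    double y;
};

// Hypothetical pre-check: throws std::invalid_argument on unusable input.
void validateKMeansInput(const std::vector<Point>& points, int k) {
    if (points.empty()) {
        throw std::invalid_argument("points must not be empty");
    }
    if (k <= 0) {
        throw std::invalid_argument("k must be positive");
    }
    if (static_cast<std::size_t>(k) > points.size()) {
        throw std::invalid_argument("k must not exceed the number of points");
    }
    // NaN or infinite coordinates would make distance comparisons meaningless.
    for (const Point& p : points) {
        if (!std::isfinite(p.x) || !std::isfinite(p.y)) {
            throw std::invalid_argument("coordinates must be finite");
        }
    }
}
```

Calling such a check at the top of `fit` would turn silent misbehavior, for instance a modulo by zero when `points` is empty, into an explicit error the caller can handle.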