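1. K-means
- Input: data x (shape [L, dim]), the number of clusters K, the maximum number of iterations epochs, and a tolerance tol used to decide whether the clustering has converged
- Procedure: randomly pick K data points as the initial centroids → [ compute the distance from every point to each of the K centroids (dist) → label each point with its nearest centroid (labels) → recompute the centroids (new_centroids) → check for convergence (break if converged) ] × epochs
- Code: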
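import torch

def kmeans(x, K, epochs, tol):
    L, dim = x.shape
    # Pick K distinct points as the initial centroids (randperm avoids picking the same point twice)
    centroids = x[torch.randperm(L)[:K]]
    for epoch in range(epochs):
        # dist: [L, K]; same as torch.norm(x.unsqueeze(1) - centroids.unsqueeze(0), dim=-1)
        dist = torch.cdist(x, centroids)
        labels = torch.argmin(dist, dim=1)  # index of the nearest centroid for each point, [L]
        # Mean of each cluster; an empty cluster keeps its previous centroid instead of becoming NaN
        new_centroids = torch.stack([
            x[labels == k].mean(dim=0) if (labels == k).any() else centroids[k]
            for k in range(K)
        ])
        converged = torch.norm(new_centroids - centroids) < tol
        centroids = new_centroids
        if converged:
            break
    return centroids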
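The comment on `torch.cdist` states that it matches the broadcast form of the distance computation. A minimal sketch that checks this on random data; the tensor sizes and the seed are arbitrary assumptions chosen for illustration:

```python
import torch

torch.manual_seed(0)            # arbitrary seed, only for reproducibility
x = torch.randn(6, 3)           # 6 points, dim = 3 (illustrative sizes)
centroids = torch.randn(2, 3)   # K = 2 centroids

# Built-in pairwise Euclidean distance: [L, K]
dist_cdist = torch.cdist(x, centroids)

# Broadcast form: [L, 1, dim] - [1, K, dim] -> [L, K, dim], then the norm over dim
dist_broadcast = torch.norm(x.unsqueeze(1) - centroids.unsqueeze(0), dim=-1)

same = torch.allclose(dist_cdist, dist_broadcast, atol=1e-5)
labels = torch.argmin(dist_cdist, dim=1)  # nearest-centroid label for each point, [L]
```

The broadcast form materializes an [L, K, dim] intermediate tensor, so `torch.cdist` is usually the more memory-friendly choice for large inputs.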
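2. IoU of two axis-aligned boxes
- Input: box1 = [x_l1, y_l1, x_r1, y_r1]; box2 = [x_l2, y_l2, x_r2, y_r2]
(with x_r1 > x_l1 and y_r1 > y_l1, and likewise for box2: in the image coordinate system, whose origin is the top-left corner, the bottom-right corner always has larger coordinates than the top-left corner)
- IoU = overlap area / (area of box1 + area of box2 - overlap area)
- Overlap region:
  - Top-left corner: lt_x = max(x_l1, x_l2); lt_y = max(y_l1, y_l2)
  - Bottom-right corner: rb_x = min(x_r1, x_r2); rb_y = min(y_r1, y_r2)
  - Overlap area = h × w = (rb_y - lt_y) × (rb_x - lt_x)
  - The boxes overlap only if h > 0 and w > 0; otherwise the overlap area is 0
- Code: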
def iou_2box(box1, box2):
    # Top-left and bottom-right corners of the overlap region
    lt_x, lt_y = max(box1[0], box2[0]), max(box1[1], box2[1])
    rb_x, rb_y = min(box1[2], box2[2]), min(box1[3], box2[3])
    h, w = rb_y - lt_y, rb_x - lt_x
    if h <= 0 or w <= 0:  # no overlap
        return 0.0
    inter_area = h * w
    # Box area = width × height = (x_r - x_l) × (y_r - y_l)
    area1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    area2 = (box2[2] - box2[0]) * (box2[3] - box2[1])
    return inter_area / (area1 + area2 - inter_area)
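As a worked check of the formula, take two 2×2 boxes offset by 1 in each direction: the overlap is a 1×1 square, so IoU = 1 / (4 + 4 - 1) = 1/7. The sketch below uses a hypothetical helper `iou_xyxy` (the name is not from the original note) and assumes boxes in [x_l, y_l, x_r, y_r] order:

```python
def iou_xyxy(box1, box2):
    """IoU of two axis-aligned boxes given as [x_l, y_l, x_r, y_r] (hypothetical helper)."""
    w = min(box1[2], box2[2]) - max(box1[0], box2[0])  # overlap width
    h = min(box1[3], box2[3]) - max(box1[1], box2[1])  # overlap height
    if w <= 0 or h <= 0:
        return 0.0
    inter = w * h
    area1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    area2 = (box2[2] - box2[0]) * (box2[3] - box2[1])
    return inter / (area1 + area2 - inter)

overlapping = iou_xyxy([0, 0, 2, 2], [1, 1, 3, 3])  # 1x1 overlap, union 7 -> 1/7
disjoint = iou_xyxy([0, 0, 1, 1], [2, 2, 3, 3])     # no overlap -> 0.0
touching = iou_xyxy([0, 0, 1, 1], [1, 0, 2, 1])     # shared edge only, w == 0 -> 0.0
```

The `touching` case shows why the check is `<= 0` rather than `< 0`: boxes that only share an edge have zero overlap area.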
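3. IoU of multiple axis-aligned boxes
- Same method as the two-box case above, but written as matrix operations
- box1 has shape [N, 4] and box2 has shape [M, 4] (box1 holds N detection boxes, box2 holds M)
- To compute the overlap regions, expand both box1 and box2 to [N, M, 4], so that every box in box1 is paired with every box in box2 (N*M pairs in total)
- Code: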
import torch

def iou_multibox(box1, box2):
    N, M = box1.shape[0], box2.shape[0]  # box1: [N, 4], box2: [M, 4]
    # Corners of the overlap region for every (box1[i], box2[j]) pair, [N, M, 2]
    lt = torch.max(box1[:, :2].unsqueeze(1).expand(N, M, 2), box2[:, :2].unsqueeze(0).expand(N, M, 2))
    rb = torch.min(box1[:, 2:].unsqueeze(1).expand(N, M, 2), box2[:, 2:].unsqueeze(0).expand(N, M, 2))
    wh = (rb - lt).clamp(min=0)  # [N, M, 2]; w <= 0 or h <= 0 means no overlap, so the area becomes 0
    inter = wh[:, :, 0] * wh[:, :, 1]  # [N, M]
    area1 = (box1[:, 2] - box1[:, 0]) * (box1[:, 3] - box1[:, 1])  # [N]
    area2 = (box2[:, 2] - box2[:, 0]) * (box2[:, 3] - box2[:, 1])  # [M]
    area1 = area1.unsqueeze(1).expand(N, M)  # [N, M]
    area2 = area2.unsqueeze(0).expand(N, M)  # [N, M]
    return inter / (area1 + area2 - inter)
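The explicit `expand` calls are not strictly required, because `torch.max` and `torch.min` broadcast their arguments. A minimal sketch of the same computation that relies on broadcasting; the name `iou_pairwise` and the sample boxes are illustrative assumptions:

```python
import torch

def iou_pairwise(box1, box2):
    """Pairwise IoU via broadcasting; box1: [N, 4], box2: [M, 4] -> [N, M] (illustrative name)."""
    lt = torch.max(box1[:, None, :2], box2[None, :, :2])  # [N, M, 2]
    rb = torch.min(box1[:, None, 2:], box2[None, :, 2:])  # [N, M, 2]
    wh = (rb - lt).clamp(min=0)                           # negative extents mean no overlap
    inter = wh[..., 0] * wh[..., 1]                       # [N, M]
    area1 = (box1[:, 2] - box1[:, 0]) * (box1[:, 3] - box1[:, 1])  # [N]
    area2 = (box2[:, 2] - box2[:, 0]) * (box2[:, 3] - box2[:, 1])  # [M]
    return inter / (area1[:, None] + area2[None, :] - inter)

box1 = torch.tensor([[0., 0., 2., 2.]])                    # N = 1
box2 = torch.tensor([[1., 1., 3., 3.], [5., 5., 6., 6.]])  # M = 2
ious = iou_pairwise(box1, box2)  # shape [1, 2]: 1/7 for the first pair, 0 for the second
```

Indexing with `None` inserts a size-1 axis, which plays the same role as `unsqueeze` followed by `expand`.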
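4. NMS (non-maximum suppression)
- Purpose: in object detection, a model usually outputs many heavily overlapping candidate boxes. NMS removes the redundant ones and, within each group of overlapping boxes, keeps only the box with the highest confidence
- Steps: sort (order the boxes by confidence, highest first) → [ select (move the highest-confidence remaining box into the result list) → compute IoU (between the selected box and every other remaining box) → suppress (drop the boxes whose IoU with the selected box exceeds the threshold) ] × N (until every box has been processed) → output the result list
- Code: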
import torch

def nms(bboxes, scores, thresh=0.5):
    # bboxes: all detection boxes, [N, 4]; scores: confidences, [N]; thresh: IoU threshold
    x1, y1 = bboxes[:, 0], bboxes[:, 1]
    x2, y2 = bboxes[:, 2], bboxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)  # area of every box, [N]
    _, order = scores.sort(0, descending=True)  # box indices sorted by confidence, highest first
    keep = []  # indices of the boxes that are kept
    while order.numel() > 0:
        i = order[0].item()  # highest-confidence box among those still remaining
        keep.append(i)
        if order.numel() == 1:  # last remaining box: nothing left to compare against
            break
        rest = order[1:]
        # Overlap corners: element-wise max/min of each remaining box's coordinates with box i's
        xx1 = x1[rest].clamp(min=x1[i])
        yy1 = y1[rest].clamp(min=y1[i])
        xx2 = x2[rest].clamp(max=x2[i])
        yy2 = y2[rest].clamp(max=y2[i])
        inter = (xx2 - xx1).clamp(min=0) * (yy2 - yy1).clamp(min=0)
        union = areas[i] + areas[rest] - inter
        IoU = inter / union
        # Keep only the boxes whose IoU with box i does not exceed the threshold
        order = rest[IoU <= thresh]
    return torch.tensor(keep, dtype=torch.long)
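To trace the steps on a concrete case, here is a minimal pure-Python sketch of the same greedy loop. The helper names `iou` and `nms_simple` and the three sample boxes are assumptions made for illustration: boxes 0 and 1 overlap heavily (IoU = 81/119 ≈ 0.68), while box 2 is far from both.

```python
def iou(a, b):
    """IoU of two boxes in [x1, y1, x2, y2] form (illustrative helper)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    if w <= 0 or h <= 0:
        return 0.0
    inter = w * h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union

def nms_simple(boxes, scores, thresh=0.5):
    """Greedy NMS on plain lists; returns kept indices, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        i = order.pop(0)  # highest-scoring remaining box
        keep.append(i)
        # Drop every remaining box that overlaps box i by more than thresh
        order = [j for j in order if iou(boxes[i], boxes[j]) <= thresh]
    return keep

boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]]
scores = [0.9, 0.8, 0.7]
kept = nms_simple(boxes, scores)             # box 1 is suppressed by box 0
kept_loose = nms_simple(boxes, scores, 0.7)  # 0.68 <= 0.7, so box 1 survives
```

With the default threshold of 0.5 the result is [0, 2]; raising the threshold to 0.7 keeps all three boxes, which shows how the threshold controls how aggressive the suppression is.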