First, the code on GitHub: https://github.com/KaiyangZhou/pytorch-center-loss
The main code is as follows:
import torch
import torch.nn as nn


class CenterLoss(nn.Module):
    """Center loss.

    Reference:
    Wen et al. A Discriminative Feature Learning Approach for Deep Face Recognition. ECCV 2016.

    Args:
        num_classes (int): number of classes.
        feat_dim (int): feature dimension.
    """
    def __init__(self, num_classes=10, feat_dim=2, use_gpu=True):
        super(CenterLoss, self).__init__()
        self.num_classes = num_classes
        self.feat_dim = feat_dim
        self.use_gpu = use_gpu

        # One learnable center per class, shape (num_classes, feat_dim).
        if self.use_gpu:
            self.centers = nn.Parameter(torch.randn(self.num_classes, self.feat_dim).cuda())
        else:
            self.centers = nn.Parameter(torch.randn(self.num_classes, self.feat_dim))

    def forward(self, x, labels):
        """
        Args:
            x: feature matrix with shape (batch_size, feat_dim).
            labels: ground truth labels with shape (batch_size).
        """
        batch_size = x.size(0)
        # ||x_i||^2 + ||c_j||^2 for every (sample, class) pair, shape (batch_size, num_classes).
        distmat = torch.pow(x, 2).sum(dim=1, keepdim=True).expand(batch_size, self.num_classes) + \
                  torch.pow(self.centers, 2).sum(dim=1, keepdim=True).expand(self.num_classes, batch_size).t()
        # Subtract 2 * x_i . c_j in place, giving ||x_i - c_j||^2.
        # Keyword form; the positional addmm_(1, -2, x, ...) signature is deprecated.
        distmat.addmm_(x, self.centers.t(), beta=1, alpha=-2)

        classes = torch.arange(self.num_classes).long()
        if self.use_gpu: classes = classes.cuda()
        labels = labels.unsqueeze(1).expand(batch_size, self.num_classes)
        # mask[i, j] is True only where j equals the label of sample i.
        mask = labels.eq(classes.expand(batch_size, self.num_classes))

        dist = distmat * mask.float()
        loss = dist.clamp(min=1e-12, max=1e+12).sum() / batch_size

        return loss
Parameters:
num_classes: number of classes in the dataset
feat_dim: dimension of the feature vector
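Walk-through of the forward code:
The code is explained here with an example. Assume num_classes=6, i.e. the labels run from 0 to 5.
The current mini-batch has batch size=3, label=0,4,2, and feat_dim=5.
x: [B, feat_dim]=[3, 5]
Let S0, S4, S2 denote the three features in the input x, where each Si is a 5-dimensional vector:
$$x = \begin{bmatrix} S_0 \\ S_4 \\ S_2 \end{bmatrix}$$
Similarly, centers: [num_classes, feat_dim]=[6, 5] can be written as follows, where each Ci is a 5-dimensional vector:
$$\text{centers} = \begin{bmatrix} C_0 \\ C_1 \\ \vdots \\ C_5 \end{bmatrix}$$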
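The result of running the following statement is worked out step by step:
distmat = torch.pow(x, 2).sum(dim=1, keepdim=True).expand(batch_size, self.num_classes) + \
          torch.pow(self.centers, 2).sum(dim=1, keepdim=True).expand(self.num_classes, batch_size).t()
This statement first squares x element-wise, then sums over the feature dimension, and finally expands the result to [B, num_classes]=[3, 6], so every row repeats one value six times:
$$\begin{bmatrix} \lVert S_0 \rVert^2 & \lVert S_0 \rVert^2 & \cdots & \lVert S_0 \rVert^2 \\ \lVert S_4 \rVert^2 & \lVert S_4 \rVert^2 & \cdots & \lVert S_4 \rVert^2 \\ \lVert S_2 \rVert^2 & \lVert S_2 \rVert^2 & \cdots & \lVert S_2 \rVert^2 \end{bmatrix}$$
centers is handled the same way, except that it is expanded to [6, 3] and then transposed to [3, 6]. The resulting distmat is:
$$\text{distmat} = \begin{bmatrix} \lVert S_0 \rVert^2 + \lVert C_0 \rVert^2 & \lVert S_0 \rVert^2 + \lVert C_1 \rVert^2 & \cdots & \lVert S_0 \rVert^2 + \lVert C_5 \rVert^2 \\ \lVert S_4 \rVert^2 + \lVert C_0 \rVert^2 & \lVert S_4 \rVert^2 + \lVert C_1 \rVert^2 & \cdots & \lVert S_4 \rVert^2 + \lVert C_5 \rVert^2 \\ \lVert S_2 \rVert^2 + \lVert C_0 \rVert^2 & \lVert S_2 \rVert^2 + \lVert C_1 \rVert^2 & \cdots & \lVert S_2 \rVert^2 + \lVert C_5 \rVert^2 \end{bmatrix}$$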
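The two expanded terms can be checked on the toy shapes from the example. This is a minimal sketch, assuming random features and centers with a fixed seed; the values are arbitrary and not taken from the original post.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only to make the sketch reproducible
batch_size, num_classes, feat_dim = 3, 6, 5
x = torch.randn(batch_size, feat_dim)         # rows stand in for S0, S4, S2
centers = torch.randn(num_classes, feat_dim)  # rows stand in for C0..C5

# Squared norm of each feature, repeated along the class axis: (3, 1) -> (3, 6).
x_term = torch.pow(x, 2).sum(dim=1, keepdim=True).expand(batch_size, num_classes)
# Squared norm of each center, repeated along the batch axis, then transposed: (6, 3) -> (3, 6).
c_term = torch.pow(centers, 2).sum(dim=1, keepdim=True).expand(num_classes, batch_size).t()

distmat = x_term + c_term
print(distmat.shape)  # (batch_size, num_classes)
```

Each row of x_term is constant and each column of c_term is constant, so entry (i, j) of the sum depends only on sample i and class j.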
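distmat.addmm_(x, self.centers.t(), beta=1, alpha=-2)
This call (written positionally as addmm_(1, -2, x, self.centers.t()) in older PyTorch code, a form that is now deprecated) updates distmat in place as distmat = 1 * distmat + (-2) * x * centers^T. Since $\lVert S_i \rVert^2 + \lVert C_j \rVert^2 - 2\, S_i^\top C_j = \lVert S_i - C_j \rVert^2$, the result is:
$$\text{distmat} = \begin{bmatrix} \lVert S_0 - C_0 \rVert^2 & \lVert S_0 - C_1 \rVert^2 & \cdots & \lVert S_0 - C_5 \rVert^2 \\ \lVert S_4 - C_0 \rVert^2 & \lVert S_4 - C_1 \rVert^2 & \cdots & \lVert S_4 - C_5 \rVert^2 \\ \lVert S_2 - C_0 \rVert^2 & \lVert S_2 - C_1 \rVert^2 & \cdots & \lVert S_2 - C_5 \rVert^2 \end{bmatrix}$$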
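To confirm that the in-place addmm_ step really yields squared Euclidean distances, the result can be compared with a direct computation. This is a sketch under assumed random inputs; the name direct and the use of broadcasting for the first two terms are choices made here for brevity, not part of the original code.

```python
import torch

torch.manual_seed(0)  # arbitrary seed for reproducibility
x = torch.randn(3, 5)        # (batch_size, feat_dim)
centers = torch.randn(6, 5)  # (num_classes, feat_dim)

# ||x_i||^2 + ||c_j||^2 via broadcasting: (3, 1) + (6,) -> (3, 6).
distmat = (x ** 2).sum(dim=1, keepdim=True) + (centers ** 2).sum(dim=1)
# In place: distmat = 1 * distmat + (-2) * (x @ centers.t()).
distmat.addmm_(x, centers.t(), beta=1, alpha=-2)

# Direct pairwise squared distances, for comparison only.
direct = ((x.unsqueeze(1) - centers.unsqueeze(0)) ** 2).sum(dim=2)

print(torch.allclose(distmat, direct, atol=1e-4))
```

The expanded form avoids building the intermediate (batch_size, num_classes, feat_dim) tensor that the direct computation needs.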
classes = torch.arange(self.num_classes).long()
labels = labels.unsqueeze(1).expand(batch_size, self.num_classes)
mask = labels.eq(classes.expand(batch_size, self.num_classes))
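This code does the following: labels is expanded from [3] to [3, 6] by repeating each label along its row, classes is expanded to [3, 6] by repeating 0..5 in every row, and eq compares the two element-wise, so mask is True only in the column that matches each sample's label:
$$\text{labels} = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 4 & 4 & 4 & 4 & 4 & 4 \\ 2 & 2 & 2 & 2 & 2 & 2 \end{bmatrix}, \quad \text{classes} = \begin{bmatrix} 0 & 1 & 2 & 3 & 4 & 5 \\ 0 & 1 & 2 & 3 & 4 & 5 \\ 0 & 1 & 2 & 3 & 4 & 5 \end{bmatrix}, \quad \text{mask} = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \end{bmatrix}$$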
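The mask for the example labels 0, 4, 2 can be reproduced directly. A small sketch using the same three statements outside the class (so self.num_classes becomes a local variable here):

```python
import torch

batch_size, num_classes = 3, 6
labels = torch.tensor([0, 4, 2])

classes = torch.arange(num_classes).long()                         # [0, 1, 2, 3, 4, 5]
labels_exp = labels.unsqueeze(1).expand(batch_size, num_classes)   # each label repeated 6 times
mask = labels_exp.eq(classes.expand(batch_size, num_classes))      # True where column == label

print(mask.int())
# Rows: [1,0,0,0,0,0], [0,0,0,0,1,0], [0,0,1,0,0,0]
```

The mask is effectively a one-hot encoding of the labels, with one True entry per row.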
dist = distmat * mask.float()
loss = dist.clamp(min=1e-12, max=1e+12).sum() / batch_size
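distmat * mask keeps exactly the squared distances between the three input features and their own class centers and sets every other entry to zero, so the sum of the retained distances is:
$$\lVert S_0 - C_0 \rVert^2 + \lVert S_4 - C_4 \rVert^2 + \lVert S_2 - C_2 \rVert^2$$
Dividing this sum by batch_size=3 gives the loss.
Before the sum, clamp restricts every entry of dist to the range [1e-12, 1e+12]. Because clamp acts element-wise, the masked-out zeros are also raised to 1e-12, which adds only a negligible constant to the loss.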
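Putting the steps together, the masked-and-clamped loss can be compared with a direct per-sample computation. This is a sketch with assumed random inputs; reference is a name introduced here, and indexing centers[labels] is an alternative formulation used only as a cross-check, not the repository's method.

```python
import torch

torch.manual_seed(0)  # arbitrary seed for reproducibility
batch_size, num_classes, feat_dim = 3, 6, 5
x = torch.randn(batch_size, feat_dim)
centers = torch.randn(num_classes, feat_dim)
labels = torch.tensor([0, 4, 2])

# Pairwise squared distances, shape (3, 6).
distmat = ((x.unsqueeze(1) - centers.unsqueeze(0)) ** 2).sum(dim=2)
# (3, 1) labels compared against (6,) class indices -> (3, 6) boolean mask.
mask = labels.unsqueeze(1).eq(torch.arange(num_classes))

dist = distmat * mask.float()  # zero everywhere except each sample's own class
loss = dist.clamp(min=1e-12, max=1e+12).sum() / batch_size

# Cross-check: pick each sample's center by indexing and average the squared distances.
reference = ((x - centers[labels]) ** 2).sum() / batch_size

print(torch.allclose(loss, reference, atol=1e-4))
print(int((dist == 0).sum()))  # number of masked-out entries that clamp will raise to 1e-12
```

With 3 samples and 6 classes, 15 entries are masked out, so the clamp contributes about 15e-12 / 3 to the loss, far below float32 precision for typical distance values.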