YOLO training image augmentation
Date: 2025-02-21 12:29:12
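### Image augmentation methods in YOLO training
When training a YOLO model, well-chosen image augmentation can improve both accuracy and generalization. Common augmentations include, but are not limited to, random cropping, color jitter, and horizontal flipping. For detection, any geometric augmentation must also be applied to the bounding-box labels.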
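As a rough illustration of two of the basic augmentations named above, here is a minimal NumPy sketch of a horizontal flip that keeps YOLO-style labels consistent, plus a simple brightness jitter. The function names `hflip_with_boxes` and `jitter_brightness` are hypothetical helpers written for this example, and the label layout `[class, x_center, y_center, w, h]` with normalized coordinates is an assumption about the dataset format.

```python
import numpy as np


def hflip_with_boxes(img, labels):
    """Flip an HxWxC image left-right and mirror normalized xywh boxes.

    labels: array of shape (n, 5) holding [class, x_center, y_center, w, h],
    with coordinates normalized to [0, 1] (assumed label layout).
    """
    flipped = np.fliplr(img).copy()
    out = labels.copy()
    if out.size:
        out[:, 1] = 1.0 - out[:, 1]  # only the x center changes under a horizontal flip
    return flipped, out


def jitter_brightness(img, gain):
    """Scale pixel intensities by `gain` and clip back to the uint8 range."""
    return np.clip(img.astype(np.float32) * gain, 0, 255).astype(np.uint8)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)
    boxes = np.array([[0, 0.25, 0.5, 0.2, 0.4]], dtype=np.float32)
    flipped_image, flipped_boxes = hflip_with_boxes(image, boxes)
    brighter = jitter_brightness(flipped_image, gain=1.2)
    print(flipped_boxes, brighter.shape)
```

In practice the flip and the jitter gain would each be applied with some probability per sample, so the model sees both the original and the augmented variants.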
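For the YOLO family of models, the official documentation describes a set of effective augmentation strategies[^2]. For example:
- **Mosaic data augmentation**: stitches four different images into one new training sample. Each sample then mixes objects and backgrounds from several images, so different classes appear together more often within a batch, which can help bounding-box regression and reduce the risk of overfitting.
- **MixUp augmentation**: creates a synthetic sample by linearly interpolating two input images and combining their labels, which encourages the network to learn features that stay stable under such perturbations.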
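The MixUp idea can be sketched in a few lines of NumPy. This is a simplified illustration, not the library's implementation: `mixup_sketch` is a hypothetical name, both images are assumed to have the same shape, and the labels are assumed to be box arrays that are concatenated (the detection-style variant, where both sets of boxes stay valid in the blended image) rather than interpolated as in classification MixUp.

```python
import numpy as np


def mixup_sketch(img1, labels1, img2, labels2, alpha=32.0, rng=None):
    """Blend two same-sized uint8 images and merge their box labels.

    The mixing ratio is drawn from Beta(alpha, alpha); a large alpha keeps
    the ratio close to 0.5, so both images stay visible.
    """
    if rng is None:
        rng = np.random.default_rng()
    r = rng.beta(alpha, alpha)  # mixing ratio in (0, 1)
    mixed = (img1.astype(np.float32) * r + img2.astype(np.float32) * (1.0 - r)).astype(np.uint8)
    labels = np.concatenate((labels1, labels2), axis=0)  # keep boxes from both images
    return mixed, labels, r


if __name__ == "__main__":
    a = np.full((4, 4, 3), 100, dtype=np.uint8)
    b = np.full((4, 4, 3), 200, dtype=np.uint8)
    la = np.array([[0, 0.5, 0.5, 0.2, 0.2]])
    lb = np.array([[1, 0.3, 0.3, 0.1, 0.1]])
    mixed, labels, ratio = mixup_sketch(a, la, b, lb, rng=np.random.default_rng(0))
    print(mixed[0, 0], labels.shape, ratio)
```

Concatenating the labels is a design choice suited to detection: a blended image still contains every object from both sources, so all boxes remain targets.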
To show concretely how one of these techniques, the `Mosaic` augmentation, can be implemented, the following Python snippet is given for reference:
```python
import random

import numpy as np

# Import paths assume a YOLOv5 checkout importable as the `yolov5` package; adjust to your layout.
# Recent YOLOv5 releases expose the affine step as `random_perspective` (there is no `random_affine`).
from yolov5.utils.augmentations import random_perspective
from yolov5.utils.general import xywhn2xyxy


def load_mosaic(self, index):
    # Method of the YOLOv5 dataset class: builds a 4-image mosaic from `index` plus 3 random images
    labels4 = []
    s = self.img_size
    yc, xc = (int(random.uniform(-x, 2 * s + x)) for x in self.mosaic_border)  # mosaic center y, x
    indices = [index] + random.choices(self.indices, k=3)  # 3 additional image indices
    for i, index in enumerate(indices):
        img, _, (h, w) = self.load_image(index)

        # Place img in img4; the "a" coords index the large canvas, the "b" coords index the tile
        if i == 0:  # top left
            img4 = np.full((s * 2, s * 2, img.shape[2]), 114, dtype=np.uint8)  # base image with 4 tiles
            x1a, y1a, x2a, y2a = max(xc - w, 0), max(yc - h, 0), xc, yc  # xmin, ymin, xmax, ymax (large image)
            x1b, y1b, x2b, y2b = w - (x2a - x1a), h - (y2a - y1a), w, h  # xmin, ymin, xmax, ymax (small image)
        elif i == 1:  # top right
            x1a, y1a, x2a, y2a = xc, max(yc - h, 0), min(xc + w, s * 2), yc
            x1b, y1b, x2b, y2b = 0, h - (y2a - y1a), min(w, x2a - x1a), h
        elif i == 2:  # bottom left
            x1a, y1a, x2a, y2a = max(xc - w, 0), yc, xc, min(s * 2, yc + h)
            x1b, y1b, x2b, y2b = w - (x2a - x1a), 0, w, min(y2a - y1a, h)
        elif i == 3:  # bottom right
            x1a, y1a, x2a, y2a = xc, yc, min(xc + w, s * 2), min(s * 2, yc + h)
            x1b, y1b, x2b, y2b = 0, 0, min(w, x2a - x1a), min(y2a - y1a, h)

        img4[y1a:y2a, x1a:x2a] = img[y1b:y2b, x1b:x2b]  # img4[ymin:ymax, xmin:xmax]
        padw = x1a - x1b  # horizontal offset of this tile inside the canvas
        padh = y1a - y1b  # vertical offset of this tile inside the canvas

        # Labels
        labels = self.labels[index].copy()
        if labels.size:
            labels[:, 1:] = xywhn2xyxy(labels[:, 1:], w, h, padw, padh)  # normalized xywh to pixel xyxy format
        labels4.append(labels)

    # Concat/clip labels
    if len(labels4):
        labels4 = np.concatenate(labels4, 0)
        np.clip(labels4[:, 1:], 0, 2 * s, out=labels4[:, 1:])  # use with caution, clip boxes outside of image

    # Augment: random affine/perspective warp that also crops the 2s x 2s canvas back to s x s
    img4, labels4 = random_perspective(img4,
                                       labels4,
                                       degrees=self.hyp['degrees'],
                                       translate=self.hyp['translate'],
                                       scale=self.hyp['scale'],
                                       shear=self.hyp['shear'],
                                       perspective=self.hyp['perspective'],
                                       border=self.mosaic_border)  # border to remove
    return img4, labels4
```
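This code loads four separate images and combines them into a single mosaic image for training. It then applies a random affine transformation to introduce further variation, making the resulting training data more diverse.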
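The label step in the snippet above shifts each tile's boxes by `padw` and `padh` so that they line up with the tile's position on the mosaic canvas. A minimal sketch of that conversion is shown below; `xywhn_to_xyxy` is a hypothetical stand-in written for illustration and is assumed to mirror what the `xywhn2xyxy` helper does, taking normalized `[x_center, y_center, w, h]` boxes and returning pixel `[x1, y1, x2, y2]` boxes.

```python
import numpy as np


def xywhn_to_xyxy(boxes, w, h, padw=0, padh=0):
    """Convert normalized [xc, yc, bw, bh] boxes to pixel [x1, y1, x2, y2].

    w, h are the tile's width and height in pixels; padw, padh are the
    tile's offsets inside the larger mosaic canvas.
    """
    boxes = np.asarray(boxes, dtype=np.float32)
    out = np.empty_like(boxes)
    out[:, 0] = w * (boxes[:, 0] - boxes[:, 2] / 2) + padw  # top-left x
    out[:, 1] = h * (boxes[:, 1] - boxes[:, 3] / 2) + padh  # top-left y
    out[:, 2] = w * (boxes[:, 0] + boxes[:, 2] / 2) + padw  # bottom-right x
    out[:, 3] = h * (boxes[:, 1] + boxes[:, 3] / 2) + padh  # bottom-right y
    return out


if __name__ == "__main__":
    # A centered box covering half of a 100x200 (w x h) tile placed at offset (10, 20)
    print(xywhn_to_xyxy([[0.5, 0.5, 0.5, 0.5]], w=100, h=200, padw=10, padh=20))
```

Because the offsets can be negative when a tile is cropped at the canvas edge, the converted boxes may extend outside the canvas, which is why the mosaic code clips them afterwards.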