YOLOv8 Transfer Learning: Implementation and Training Data Workflow

知来者逆

Published 2025-04-02 16:47:25

Category: YOLO. Tags: computer vision, YOLO, YOLOv8, object detection, deep learning, transfer learning

Article link: https://blog.csdn.net/matt45m/article/details/146954151

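Overview

YOLO (You Only Look Once) is a very capable object detection network, which makes it a strong candidate for all kinds of detection tasks, including objects that the original network was never trained on.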

1. Project Background

Here we use the Ultralytics YOLOv8 model to detect white blood cells in images. The data comes from the Blood Cell Images dataset on Kaggle.
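The dataset was first modified as follows:

The number of images was reduced to only 40, keeping only images that contain a single white blood cell.
The dataset's original labels were not used. Instead, a second dataset was made that contains only cropped images of the white blood cells.
These cropped images are used to create the labels ourselves.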

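The project consists of three main steps:

Process the original images and the cropped images to create a dataset suitable for YOLOv8.
Train the YOLOv8 model using transfer learning.
Run predictions and save the results.

Most of the code is part of a single class that serves as a wrapper around the original YOLOv8 implementation.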

2. Code Implementation

2.1 Initializing the YOLOv8 Model

import warnings
from shutil import copy, rmtree
from pathlib import Path
import numpy as np
import cv2
from ultralytics import YOLO
from sklearn.model_selection import train_test_split
import pandas as pd
import torch
import matplotlib.pyplot as plt

class YoloWrapper:
    def __init__(self, model_weights: str) -> None:
        """
        Initialize a YOLOv8 model from weights.
        Args:
            model_weights (str): one of the following:
                - 'nano': use the pretrained YOLOv8 nano model
                - 'small': use the pretrained YOLOv8 small model
                - 'medium': use the pretrained YOLOv8 medium model
                - a path to a .pt file containing weights saved from a previous training run
        """
        if model_weights == 'nano':
            model_weights = 'yolov8n.pt'
        elif model_weights == 'small':
            model_weights = 'yolov8s.pt'
        elif model_weights == 'medium':
            model_weights = 'yolov8m.pt'
        elif (not Path(model_weights).exists()) or (Path(model_weights).suffix != '.pt'):
            raise ValueError('model_weights must be "nano", "small", "medium", or a path to a .pt file')

        self.model = YOLO(model_weights)

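In the class's __init__ method we initialize the YOLOv8 model from the given weights. You can choose the pretrained nano, small, or medium model, or load weights saved from your own training run.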
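
The same alias lookup can be written as a small table-driven function, which makes it easy to see which inputs are accepted. This is only a sketch: the names `PRETRAINED` and `resolve_weights` are hypothetical and do not appear in the wrapper class.

```python
from pathlib import Path

# Aliases named in the article, mapped to the pretrained checkpoint files
PRETRAINED = {"nano": "yolov8n.pt", "small": "yolov8s.pt", "medium": "yolov8m.pt"}


def resolve_weights(model_weights):
    """Return a checkpoint name for an alias, or the path of an existing .pt file."""
    if model_weights in PRETRAINED:
        return PRETRAINED[model_weights]
    path = Path(model_weights)
    if path.suffix == ".pt" and path.exists():
        return str(path)
    raise ValueError(f"expected one of {sorted(PRETRAINED)} or a path to an existing .pt file")


print(resolve_weights("nano"))  # yolov8n.pt
```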

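2.2 Creating YOLO-Format Labels

We start from a dataset with two folders: full_images and crops. The full_images folder contains 40 blood cell images, each with a single white blood cell, and the crops folder contains 40 cropped images, each showing the white blood cell from the corresponding original image. An original image and its crop share the same file name.

Our goal is to create YOLO-format labels for the dataset that describe the bounding box of the white blood cell. A YOLO label file is a text file in which each line describes one object, in the following format: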

<object-class> <x_center> <y_center> <width> <height>

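The coordinates are normalized by the image width and height, so (0,0) is the top-left corner of the image and (1,1) is the bottom-right corner. For each image we create a text file with the same name that contains one line in the format above. In our case each image contains a single object (a white blood cell) that belongs to a single class (ID 0).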
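
To make the normalization concrete, here is a minimal sketch that converts a pixel-space box into a label line and back again. The helper names `to_yolo_line` and `from_yolo_line` are hypothetical and are not part of the wrapper class.

```python
def to_yolo_line(class_id, left, top, box_w, box_h, img_w, img_h):
    """Format a pixel-space box (top-left corner plus size) as a YOLO label line."""
    x_center = (left + box_w / 2) / img_w
    y_center = (top + box_h / 2) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {box_w / img_w:.6f} {box_h / img_h:.6f}"


def from_yolo_line(line, img_w, img_h):
    """Parse a YOLO label line back into (class_id, left, top, box_w, box_h) in pixels."""
    class_id, xc, yc, w, h = line.split()
    box_w, box_h = float(w) * img_w, float(h) * img_h
    left = float(xc) * img_w - box_w / 2
    top = float(yc) * img_h - box_h / 2
    return int(class_id), round(left), round(top), round(box_w), round(box_h)


# A 320x240 box whose top-left corner is at (160, 120) in a 640x480 image
line = to_yolo_line(0, left=160, top=120, box_w=320, box_h=240, img_w=640, img_h=480)
print(line)                            # 0 0.500000 0.500000 0.500000 0.500000
print(from_yolo_line(line, 640, 480))  # (0, 160, 120, 320, 240)
```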

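To find where a crop sits inside its original image (that is, the bounding box coordinates), we can use a technique called template matching. It is a relatively simple method: the crop is slid across the original image, and the position where the image best matches the crop is returned. OpenCV provides a function that implements template matching, so we can use it directly.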
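
The sliding-window idea can be illustrated without OpenCV. The sketch below is a pure-Python stand-in that scores each offset by the sum of squared differences (lower is better), which is a different score from the normalized correlation coefficient used later in this article (where higher is better). The function name `match_template_ssd` is hypothetical.

```python
def match_template_ssd(image, template):
    """Return (x, y) of the top-left corner where `template` best matches `image`.

    Both arguments are 2D lists of grayscale values. The score is the sum of
    squared differences, evaluated at every offset where the template fits.
    """
    img_h, img_w = len(image), len(image[0])
    tpl_h, tpl_w = len(template), len(template[0])
    best_score, best_loc = None, None
    for y in range(img_h - tpl_h + 1):
        for x in range(img_w - tpl_w + 1):
            score = sum(
                (image[y + dy][x + dx] - template[dy][dx]) ** 2
                for dy in range(tpl_h)
                for dx in range(tpl_w)
            )
            if best_score is None or score < best_score:
                best_score, best_loc = score, (x, y)
    return best_loc


image = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 8, 0],
    [0, 0, 7, 6, 0],
    [0, 0, 0, 0, 0],
]
template = [[9, 8], [7, 6]]
print(match_template_ssd(image, template))  # (2, 1)
```

The returned location is the top-left corner of the match, so the bounding box follows by adding the template's width and height.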

We add a static method to the class that creates the labels:

@staticmethod
def create_yolo_labels_from_crop(images_path: str | Path, crops_path: str | Path,
                                 labels_path: str | Path | None = None) -> None:
    """
    Create YOLO-format labels from cropped images.
    Args:
        images_path (str|Path): path to the folder of original images
        crops_path (str|Path): path to the folder of cropped images
        labels_path (str|Path|None): where to save the labels. If None, a folder named 'labels' is created next to the original images folder
    """
    if labels_path is None:
        labels_path = Path(images_path).parent / 'labels'
    labels_path = Path(labels_path)  # accept plain strings as well as Path objects
    labels_path.mkdir(parents=True, exist_ok=True)

    # Sort both file lists so that each image lines up with its crop (they share file names)
    image_files = sorted(Path(images_path).glob('*.jpg'))
    crop_files = sorted(Path(crops_path).glob('*.jpg'))

    for image_file, crop_file in zip(image_files, crop_files):
        image = cv2.imread(str(image_file))
        crop = cv2.imread(str(crop_file))

        # Locate the crop inside the original image with OpenCV template matching
        result = cv2.matchTemplate(image, crop, cv2.TM_CCOEFF_NORMED)
        min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(result)

        # With TM_CCOEFF_NORMED the best match is the maximum; max_loc is (x, y) of the top-left corner
        top_left = max_loc

        # Normalize the box center and size to [0, 1]; shape is (height, width, channels)
        x_center = (top_left[0] + crop.shape[1] / 2) / image.shape[1]
        y_center = (top_left[1] + crop.shape[0] / 2) / image.shape[0]
        width = crop.shape[1] / image.shape[1]
        height = crop.shape[0] / image.shape[0]

        # Write the label file in YOLO format (class 0 is the white blood cell)
        label_file = labels_path / f"{image_file.stem}.txt"
        with open(label_file, 'w') as f:
            f.write(f"0 {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}\n")

2.3 Creating the YOLO Dataset Structure

A YOLO dataset has the following structure:

dataset
  - images
    - train
    - val
  - labels
    - train
    - val

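The data is organized under a root folder (for example, dataset) that contains two folders, images and labels, and the data in each of them is split into a training set and a validation set. We also need a configuration file that tells YOLO where the data is and which classes the dataset contains.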
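
As a rough illustration of what such a configuration file contains, the sketch below builds its text for the layout shown above. The helper name `build_config_text` is hypothetical. The keys it writes (path, train, val, nc, names) are the ones this article's own config writer emits.

```python
from pathlib import Path


def build_config_text(dataset_root, class_names):
    """Return the text of a dataset config for the images/labels layout above."""
    names = ", ".join(f"'{name}'" for name in class_names)
    return (
        f"path: {Path(dataset_root).as_posix()}\n"  # dataset root folder
        "train: images/train\n"                     # relative to the root
        "val: images/val\n"
        f"nc: {len(class_names)}\n"                 # number of classes
        f"names: [{names}]\n"
    )


print(build_config_text("data/yolo_dataset", ["white_cell"]))
```

Only the image folders are listed. YOLO finds the labels by looking in the matching labels/train and labels/val folders.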

We add two static methods to the class for these tasks:

@staticmethod
def create_dataset(images_path: str | Path, labels_path: str | Path = None, result_path: str | Path = None,
                   train_size: float = 0.9) -> None:
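    """
    Create a YOLO dataset from a folder of images and a folder of labels.
    Args:
        images_path (str|Path): path to the images folder
        labels_path (str|Path): path to the labels folder. If None, the 'labels' folder next to the images folder is used
        result_path (str|Path): where to save the result. If None, a folder named 'data' is created next to the images folder
        train_size (float): fraction of the data used for training, 0.9 by default
    """
    if labels_path is None:
        labels_path = Path(images_path).parent / 'labels'
    if result_path is None:
        result_path = Path(images_path).parent / 'data'
    result_path = Path(result_path)  # accept plain strings as well as Path objects

    result_path.mkdir(parents=True, exist_ok=True)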
    (result_path / 'images' / 'train').mkdir(parents=True, exist_ok=True)
    (result_path / 'images' / 'val').mkdir(parents=True, exist_ok=True)
    (result_path / 'labels' / 'train').mkdir(parents=True, exist_ok=True)
    (result_path / 'labels' / 'val').mkdir(parents=True, exist_ok=True)

    image_files = sorted(Path(images_path).glob('*.jpg'))
    label_files = sorted(Path(labels_path).glob('*.txt'))

    train_images, val_images, train_labels, val_labels = train_test_split(image_files, label_files, train_size=train_size, random_state=42)

    for image_file, label_file in zip(train_images, train_labels):
        copy(image_file, result_path / 'images' / 'train' / image_file.name)
        copy(label_file, result_path / 'labels' / 'train' / label_file.name)

    for image_file, label_file in zip(val_images, val_labels):
        copy(image_file, result_path / 'images' / 'val' / image_file.name)
        copy(label_file, result_path / 'labels' / 'val' / label_file.name)
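
The split above relies on the two sorted file lists lining up one to one. A small check like the following sketch can confirm that every image has a label with the same stem before splitting. The helper name `pair_by_stem` is hypothetical and not part of the wrapper class.

```python
from pathlib import Path


def pair_by_stem(image_paths, label_paths):
    """Pair each image with the label file that shares its stem.

    Raises ValueError if any image has no label or any label has no image.
    """
    images = {Path(p).stem: Path(p) for p in image_paths}
    labels = {Path(p).stem: Path(p) for p in label_paths}
    unmatched = sorted(set(images) ^ set(labels))  # stems present on one side only
    if unmatched:
        raise ValueError(f"files without a counterpart: {unmatched}")
    return [(images[stem], labels[stem]) for stem in sorted(images)]


pairs = pair_by_stem(["img/b.jpg", "img/a.jpg"], ["lab/a.txt", "lab/b.txt"])
print([(i.name, l.name) for i, l in pairs])  # [('a.jpg', 'a.txt'), ('b.jpg', 'b.txt')]
```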


@staticmethod
def create_config_file(parent_data_path: str | Path, class_names: list[str], path_to_save: str = None) -> None:
    """
    Create a YOLOv8 configuration file.
    Args:
        parent_data_path (str|Path): path to the dataset root that contains the images and labels folders
        class_names (list[str]): list of class names
        path_to_save (str): where to save the configuration file. If None, a file named 'config.yaml' is created in the current working directory
    """
    if path_to_save is None:
        path_to_save = 'config.yaml'
    elif Path(path_to_save).is_dir():
        path_to_save = Path(path_to_save) / 'config.yaml'

    with open(path_to_save, 'w') as f:
        # Write an absolute root so the location does not depend on how relative paths are resolved
        f.write(f"path: {Path(parent_data_path).resolve()}\n")
        f.write("train: images/train\n")
        f.write("val: images/val\n")
        f.write(f"nc: {len(class_names)}\n")
        f.write("names: [")
        f.write(", ".join(f"'{name}'" for name in class_names))
        f.write("]\n")

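2.4 Training the Model

The Ultralytics YOLO class already provides a training method that takes care of data augmentation, validation metrics, and everything else. We add a simpler wrapper method around it: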

def train(self, config: str, epochs: int = 100, name: str = None) -> None:
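    """
    Train the model.
    Args:
        config (str): path to the configuration file, which holds the data path, the relative paths of the training and validation sets, the number of classes, and the class names
        epochs (int): number of training epochs, 100 by default
        name (str): name of the results folder. If None, the default name 'train #' is used
    """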
    self.model.train(data=config, epochs=epochs, name=name)

The weights and the validation results are saved under runs/detect/ in the project folder.
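
Ultralytics writes the checkpoints of a run into a weights subfolder as best.pt and last.pt. The sketch below builds the expected path of the best checkpoint, assuming the default runs/detect location and that the run folder name was not already taken (if it was, Ultralytics appends a number to the folder name, so the real path may differ). The helper name `best_weights_path` is hypothetical.

```python
from pathlib import Path


def best_weights_path(run_name, project_dir="runs/detect"):
    """Build the expected path of the best checkpoint for a training run."""
    return Path(project_dir) / run_name / "weights" / "best.pt"


weights = best_weights_path("blood_cell")
print(weights.as_posix())  # runs/detect/blood_cell/weights/best.pt
```

Because the wrapper's constructor accepts a path to an existing .pt file, a path like this can be passed to YoloWrapper to reload the trained model later.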

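2.5 Making Predictions and Saving Results

Now that we have a trained model, we can run predictions. We add a method that predicts the bounding boxes for an image and draws them on it: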

def predict_and_show(self, image: str | np.ndarray, threshold: float = 0.25) -> None:
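    """
    Predict the bounding boxes for a single image and show them with their confidence.
    Args:
        image (str|np.ndarray): image path or a numpy array in BGR format
        threshold (float): confidence threshold, 0.25 by default
    """
    # Load the image if a path was given, so that there is an array to draw on
    if isinstance(image, (str, Path)):
        image = cv2.imread(str(image))
    # Ultralytics returns one Results object per image; conf filters out low-confidence boxes
    results = self.model(image, conf=threshold)
    boxes = results[0].boxes
    for xyxy, conf in zip(boxes.xyxy.tolist(), boxes.conf.tolist()):
        x1, y1, x2, y2 = map(int, xyxy)
        cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 2)
        cv2.putText(image, f"{conf:.2f}", (x1, y1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
    cv2.imshow("Image", image)
    cv2.waitKey(0)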

We can also add a method that predicts the bounding boxes for a batch of images and saves the results to a CSV file:

def predict_and_save_to_csv(self, images: list[str] | list[Path] | list[np.ndarray], image_ids: list[str] = None,
                             path_to_save_csv: str | Path = '', threshold: float = 0.25, minimum_size: int = 100,
                             only_most_conf=True) -> None:
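    """
    Predict the bounding boxes for a batch of images and save the results to a CSV file.
    Args:
        images (list[str]|list[Path]|list[np.ndarray]): list of image paths or of numpy arrays in BGR format
        image_ids (list[str]): list of image IDs. If None, file stems (or list indices for arrays) are used
        path_to_save_csv (str|Path): where to save the CSV file
        threshold (float): confidence threshold, 0.25 by default
        minimum_size (int): minimum width and height of a bounding box, 100 by default
        only_most_conf (bool): whether to keep only the most confident box per image, True by default
    """
    if image_ids is None:
        image_ids = [Path(img).stem if isinstance(img, (str, Path)) else str(i)
                     for i, img in enumerate(images)]
    # Ultralytics returns one Results object per image; conf filters out low-confidence boxes
    results = self.model(images, conf=threshold)
    data = []
    for image_id, result in zip(image_ids, results):
        boxes = result.boxes
        if len(boxes) == 0:
            continue  # nothing detected above the threshold
        xyxy, conf = boxes.xyxy.cpu(), boxes.conf.cpu()
        if only_most_conf:
            xyxy = xyxy[torch.argmax(conf)].unsqueeze(0)
        for x1, y1, x2, y2 in xyxy.int().tolist():
            width = max(x2 - x1, minimum_size)
            height = max(y2 - y1, minimum_size)
            # Grow small boxes evenly on both sides, without moving past the image's top-left edge
            x1 = max(x1 - (width - (x2 - x1)) // 2, 0)
            y1 = max(y1 - (height - (y2 - y1)) // 2, 0)
            data.append([image_id, x1, y1, width, height])
    df = pd.DataFrame(data, columns=['image_id', 'x_top_left', 'y_top_left', 'width', 'height'])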
    df.to_csv(path_to_save_csv, index=False)
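
The minimum-size step is easier to follow on concrete numbers. The sketch below isolates that arithmetic in a standalone function (the name `expand_box` is hypothetical) and returns the box in the same x_top_left, y_top_left, width, height form that is written to the CSV file.

```python
def expand_box(x1, y1, x2, y2, minimum_size=100):
    """Grow a box to at least minimum_size per side, keeping it roughly centered.

    Returns (x_top_left, y_top_left, width, height). The top-left corner is
    clamped at 0 so the box never starts outside the image.
    """
    width = max(x2 - x1, minimum_size)
    height = max(y2 - y1, minimum_size)
    x_top_left = max(x1 - (width - (x2 - x1)) // 2, 0)
    y_top_left = max(y1 - (height - (y2 - y1)) // 2, 0)
    return x_top_left, y_top_left, width, height


# 40 px wide, 140 px tall: only the width is padded, by 30 px on each side
print(expand_box(50, 60, 90, 200))  # (20, 60, 100, 140)
# A small box near the corner: the shift is clamped at the image border
print(expand_box(10, 10, 30, 30))   # (0, 0, 100, 100)
```

Note that only the top-left corner is clamped. A box near the right or bottom edge can still extend past the image, because the image size is not known at this point.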

Finally, we can add a function that draws bounding boxes on an image from the CSV file:

@staticmethod
def draw_bbox_from_csv(image: str | Path | np.ndarray, csv_path: str, image_id: str = None) -> None:
    """
    Draw bounding boxes on an image from a CSV file.
    Args:
        image (str|Path|np.ndarray): image path or a numpy array in BGR format
        csv_path (str): path to the CSV file
        image_id (str): image ID. If None, all bounding boxes in the CSV file are drawn
    """
    df = pd.read_csv(csv_path)
    if image_id is not None:
        # Compare as strings: pandas may parse purely numeric IDs as integers
        df = df[df['image_id'].astype(str) == str(image_id)]

    image = cv2.imread(str(image)) if isinstance(image, (str, Path)) else image
    for _, row in df.iterrows():
        # Cast to plain Python ints before passing pixel coordinates to OpenCV
        x1, y1 = int(row['x_top_left']), int(row['y_top_left'])
        width, height = int(row['width']), int(row['height'])
        cv2.rectangle(image, (x1, y1), (x1 + width, y1 + height), (0, 255, 0), 2)
    cv2.imshow("Image", image)
    cv2.waitKey(0)

2.6 Example Script

The following complete script shows how to use the methods above:

from pathlib import Path
import cv2
from yolo_wrapper import YoloWrapper

# Data paths
dataset_path = Path('data/yolo_dataset')  # the YOLO dataset will be created here
large_field_images_path = Path('data/raw_data/full_image')  # where the original images are stored
cropped_images_path = Path('data/raw_data/crops')  # where the cropped images are stored
labels_path = Path('data/labels')  # where the labels will be saved

# Create the YOLO-format labels
YoloWrapper.create_yolo_labels_from_crop(large_field_images_path, cropped_images_path, labels_path)

# Create the YOLO dataset structure
YoloWrapper.create_dataset(large_field_images_path, labels_path, dataset_path)

# Create the YOLO configuration file
config_path = 'blood_cell_config.yaml'
YoloWrapper.create_config_file(dataset_path, ['white_cell'], config_path)

# Create a pretrained YOLO model and train it with transfer learning
model = YoloWrapper('nano')
model.train(config_path, epochs=200, name='blood_cell')

# Predict on the validation set
data_to_predict_path = dataset_path / 'images' / 'val'
val_image_list = list(data_to_predict_path.glob('*.jpg'))

# Save the predictions to a CSV file, enforcing a minimum bounding box size.
# The file stems are passed as image IDs so the boxes can be looked up by image.stem below.
model.predict_and_save_to_csv(val_image_list, image_ids=[p.stem for p in val_image_list],
                              path_to_save_csv='nano_blood_cell.csv', minimum_size=100, threshold=0.25,
                              only_most_conf=True)

# Draw the bounding boxes from the CSV file
for image in val_image_list:
    model.draw_bbox_from_csv(image, 'nano_blood_cell.csv', image.stem)