活动介绍
file-type

YOLOv4纯TensorFlow实现教程:训练自定义数据集

下载需积分: 0 | 2.27MB | 更新于2024-11-24 | 188 浏览量 | 2 下载量 举报 收藏
download 立即下载
YOLOv4是一种先进、高效且广泛使用的实时目标检测系统。本项目的重点是用纯TensorFlow(即不借助任何外部框架)实现YOLOv4,从而使得开发者可以在自己的数据集上训练模型。 项目名称中的YOLOv4指的是该算法的第四个版本,其通过引入多尺度预测、路径聚合网络(PANet)、自注意力机制等技术,大大提升了目标检测的准确性和速度。tensorflow指的是使用Google的开源机器学习框架TensorFlow,tf1则明确指出该项目使用的是TensorFlow的1.x版本。Python是该项目的开发语言。 描述部分给出了一个基础的使用指南,包括如何运行项目中的验证脚本(val_voc.py),以及如何将预训练的YOLOv4权重文件(yolov4.weights)转换为项目兼容的格式。转换权重的过程需要使用convert_weight.py脚本,并通过test_yolo_weights.py脚本验证权重是否正确加载。 此外,项目中还包括了tf.data.pipline的使用,这是TensorFlow 1.x版本中的一个重要特性,用于提高数据输入管道的效率,它能够快速地加载、预处理数据并喂入神经网络中进行训练。 标签部分表明该项目与TensorFlow、tf1以及YOLOv4紧密相关,同时也指出了该项目使用Python语言编写。 压缩包文件名称列表中的"YOLOv4_tensorflow-master"表明这是一个包含所有必要文件和目录的项目主目录压缩包,用户可以通过解压这个压缩包来访问和使用该项目的所有资源。" 知识点汇总如下: 1. TensorFlow框架:一个开源的机器学习库,主要用于数值计算和数据流编程。TensorFlow 1.x版本在本项目中被使用,它支持构建和训练深度学习模型。 2. YOLOv4目标检测:YOLOv4是一种流行的目标检测算法,以其高精度和速度著称。它通过对图像进行一次前向传播来进行目标检测,将目标检测过程简化为一个回归问题。 3. 纯TensorFlow实现:意味着本项目并未借助其他高级框架或库,如Keras,而是使用TensorFlow的基础API来构建YOLOv4模型。 4. 模型训练:开发者能够使用自己的数据集,在YOLOv4模型的基础上进行训练和优化,以适应特定的目标检测任务。 5. 权重转换:项目提供了将预训练的YOLOv4权重文件转换为TensorFlow兼容格式的工具,这使得在自定义数据集上训练变得可能。 6. tf.data.pipline:是TensorFlow 1.x版本中的一个重要特性,用于构建高效的数据输入管道,这能够提高数据加载和处理的速度,从而加快训练过程。 7. Python编程语言:作为最广泛使用的高级编程语言之一,Python在该项目中扮演了重要角色,是编写和执行TensorFlow代码的工具。 8. VOC2007数据集:指的是PASCAL VOC挑战赛在2007年使用的数据集,它包含用于训练和测试目标检测模型的标准化图像集合。项目可能使用了该数据集的部分内容或作为示例数据。 9. 权重文件(weights_name.txt):包含网络所有层名称的文件,这对于加载和管理模型权重非常重要。 通过上述知识点,开发者可以更好地理解YOLOv4_tensorflow项目的运行机制、优势以及如何将其应用到具体的机器学习任务中。

相关推荐

filetype

Part 2: Image Classification Project (50 marks) - Submission All of your dataset, code (Python files and ipynb files) should be a package in a single ZIP file, with a PDF of your report (notebook with output cells, analysis, and answers). INCLUDE your dataset in the zip file. For this project, you will develop an image classification model to recognize objects in your custom dataset. The project involves the following steps: Step 1: Dataset Creation (10 marks) • Task: You can choose an existing dataset such as FashionMNIST and add one more class. • Deliverable: Include the dataset (images and labels) in the ZIP file submission. Step 2: Data Loading and Exploration (10 marks) • Task: Organize the dataset into training, validation, and test sets. Display dataset statistics and visualize samples with their labels. For instance, show the number of data entries, the number of classes, the number of data entries for each class, the shape of the image size, and randomly plot 5 images in the training set with their corresponding labels. Step 3: Model Implementation (10 marks) • Task: Implement a classification model, using frameworks like TensorFlow or PyTorch. Existing models like EfficientNet are allowed. Provide details on model parameters. Step 4: Model Training (10 marks) • Task: Train the model with proper settings (e.g., epochs, optimizer, learning rate). Include visualizations of training and validation performance (loss and accuracy). Step 5: Model Evaluation and Testing (10 marks) • Task: Evaluate the model on the test set. Display sample predictions, calculate accuracy, and generate a confusion matrix.

filetype

class Unet(nn.Module): def __init__(self, num_classes): def forward(self, x): return out 我需要的输出结果是这样的,图片按照代码和题目要求输出,包括Original Image Ground Truth Prediction三部分,都要有对应的输出,并且参与测试的图片都要输出,需要补全上述代码,英文输出: 输出结果: Starting training... Epoch 1/20: 100%|██████████| 46/46 [00:15<00:00, 3.04it/s, loss=2.49]Epoch 1/20, Training Loss: 2.8437 Validation Loss: 2.4612 New best model with validation loss: 2.4612 Epoch 2/20: 100%|██████████| 46/46 [00:15<00:00, 3.00it/s, loss=1.59]Epoch 2/20, Training Loss: 2.0684 Validation Loss: 1.5868 New best model with validation loss: 1.5868 Epoch 3/20: 100%|██████████| 46/46 [00:15<00:00, 3.00it/s, loss=1.26]Epoch 3/20, Training Loss: 1.3412 Validation Loss: 1.1896 New best model with validation loss: 1.1896 Epoch 4/20: 100%|██████████| 46/46 [00:15<00:00, 3.02it/s, loss=1.16]Epoch 4/20, Training Loss: 1.0508 Validation Loss: 1.0617 New best model with validation loss: 1.0617 Epoch 5/20: 100%|██████████| 46/46 [00:15<00:00, 2.99it/s, loss=0.812] Epoch 5/20, Training Loss: 0.9584 Validation Loss: 1.0257 New best model with validation loss: 1.0257 Epoch 6/20: 100%|██████████| 46/46 [00:15<00:00, 2.96it/s, loss=0.841]Epoch 6/20, Training Loss: 0.9038 Validation Loss: 1.0027 New best model with validation loss: 1.0027 Epoch 7/20: 100%|██████████| 46/46 [00:16<00:00, 2.84it/s, loss=0.77]Epoch 7/20, Training Loss: 0.8736 Validation Loss: 0.9764 New best model with validation loss: 0.9764 Epoch 8/20: 100%|██████████| 46/46 [00:16<00:00, 2.87it/s, loss=0.809]Epoch 8/20, Training Loss: 0.8373 Validation Loss: 0.9694 New best model with validation loss: 0.9694 Epoch 9/20: 100%|██████████| 46/46 [00:15<00:00, 2.99it/s, loss=1.04]Epoch 9/20, Training Loss: 0.8129 Validation Loss: 0.9442 New best model with validation loss: 0.9442 Epoch 10/20: 100%|██████████| 46/46 [00:15<00:00, 3.00it/s, loss=0.838]Epoch 10/20, Training Loss: 0.7859 Validation Loss: 0.9309 New best model with validation loss: 0.9309 Epoch 11/20: 100%|██████████| 46/46 [00:15<00:00, 3.01it/s, loss=0.799]Epoch 11/20, Training Loss: 0.7673 Validation Loss: 0.9087 New best model with validation loss: 0.9087 Epoch 12/20: 100%|██████████| 46/46 [00:15<00:00, 3.02it/s, loss=0.673]Epoch 12/20, Training Loss: 0.7386 Validation Loss: 0.9185 Epoch 13/20: 100%|██████████| 46/46 [00:15<00:00, 3.00it/s, loss=0.638]Epoch 13/20, Training Loss: 0.6899 Validation Loss: 0.8576 New best model with validation loss: 0.8576 Epoch 14/20: 100%|██████████| 46/46 [00:15<00:00, 3.01it/s, loss=0.553]Epoch 14/20, Training Loss: 0.6538 Validation Loss: 0.8267 New best model with validation loss: 0.8267 Epoch 15/20: 100%|██████████| 46/46 [00:14<00:00, 3.07it/s, loss=0.765] Epoch 15/20, Training Loss: 0.6342 Validation Loss: 0.8240 New best model with validation loss: 0.8240 Epoch 16/20: 100%|██████████| 46/46 [00:15<00:00, 2.99it/s, loss=0.688]Epoch 16/20, Training Loss: 0.6203 Validation Loss: 0.8336 Epoch 17/20: 100%|██████████| 46/46 [00:15<00:00, 2.99it/s, loss=0.518]Epoch 17/20, Training Loss: 0.6099 Validation Loss: 0.8014 New best model with validation loss: 0.8014 Epoch 18/20: 100%|██████████| 46/46 [00:15<00:00, 2.93it/s, loss=0.444]Epoch 18/20, Training Loss: 0.6023 Validation Loss: 0.8169 Epoch 19/20: 100%|██████████| 46/46 [00:15<00:00, 2.98it/s, loss=0.822]Epoch 19/20, Training Loss: 0.5885 Validation Loss: 0.8045 Epoch 20/20: 100%|██████████| 46/46 [00:15<00:00, 2.90it/s, loss=0.425] Epoch 20/20, Training Loss: 0.5659 Validation Loss: 0.7840 New best model with validation loss: 0.7840 Training finished! <ipython-input-5-1f21aef180ff>:213: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature. model.load_state_dict(torch.load("best_segmentation_model.pth")) Model saved to simple_segmentation_model.pth Visualizing model predictions: 而且还要满足题目要求: Task 1. Implement Unet and train it on the PASCAL VOC dataset The Unet paper is here: https://arxiv.org/pdf/1505.04597 Use any number of tricks that you can You cannot use pretrained models, though (until we learn about transfer learning) You must achieve > 15 mean IOU (the code for evaluation is in the end of the notebook) Grading rubric: mean IOU > 15, 10 points mean 12 < IOU <= 15, 8 points mean 10 <= IOU <= 12, 5 points mean IOU < 10, 0 points Important: you need to achieve 10 and more IOU using all 21 classes from PASCAL VOC In the end of the notebook you must execute the last cell and pass the tests, otherwise you will receive 0. 其中不可修改的代码要保证全部正常输出: import os import numpy as np import matplotlib.pyplot as plt from PIL import Image import torch import torch.nn as nn import torch.nn.functional as F import torch.optim as optim from torch.utils.data import Dataset, DataLoader import torchvision.transforms as transforms import torchvision.transforms.functional as TF import torchvision.models as models from torchvision.datasets import VOCSegmentation from tqdm import tqdm torch.manual_seed(42) device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') print(f"Using device: {device}") DATA_DIR = "./data" BATCH_SIZE = 32 NUM_EPOCHS = 20 # Increased to get better results LEARNING_RATE = 0.0001 # Lowered to improve stability IMAGE_SIZE = (224, 224) # PASCAL VOC has 21 classes (including background) NUM_CLASSES = 21 # PASCAL VOC class labels for visualization VOC_CLASSES = [ 'background', 'aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike', 'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor' ] # Color map for visualization VOC_COLORMAP = [ [0, 0, 0], [128, 0, 0], [0, 128, 0], [128, 128, 0], [0, 0, 128], [128, 0, 128], [0, 128, 128], [128, 128, 128], [64, 0, 0], [192, 0, 0], [64, 128, 0], [192, 128, 0], [64, 0, 128], [192, 0, 128], [64, 128, 128], [192, 128, 128], [0, 64, 0], [128, 64, 0], [0, 192, 0], [128, 192, 0], [0, 64, 128] ] class SegmentationTransform: def __init__(self, size, is_train=False): self.size = size self.is_train = is_train def __call__(self, image, mask): if self.is_train and np.random.random() > 0.5: image = TF.hflip(image) mask = TF.hflip(mask) if self.is_train and np.random.random() > 0.7: angle = np.random.randint(-10, 10) image = TF.rotate(image, angle, interpolation=Image.BILINEAR) mask = TF.rotate(mask, angle, interpolation=Image.NEAREST) if self.is_train and np.random.random() > 0.7: brightness_factor = np.random.uniform(0.8, 1.2) contrast_factor = np.random.uniform(0.8, 1.2) image = TF.adjust_brightness(image, brightness_factor) image = TF.adjust_contrast(image, contrast_factor) image = TF.resize(image, self.size, interpolation=Image.BILINEAR) mask = TF.resize(mask, self.size, interpolation=Image.NEAREST) image = TF.to_tensor(image) mask_array = np.array(mask) mask_array[mask_array == 255] = 0 # Set ignore pixels to background mask = torch.from_numpy(mask_array).long() # Normalize image image = TF.normalize(image, mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]) return image, mask class VOCDatasetWrapper(Dataset): def __init__(self, dataset, transform=None): self.dataset = dataset self.transform = transform def __len__(self): return len(self.dataset) def __getitem__(self, idx): image, mask = self.dataset[idx] if self.transform: image, mask = self.transform(image, mask) return image, mask voc_train = VOCSegmentation(root=DATA_DIR, year='2012', image_set='train', download=True) voc_val = VOCSegmentation(root=DATA_DIR, year='2012', image_set='val', download=True) train_transform = SegmentationTransform(IMAGE_SIZE, is_train=True) val_transform = SegmentationTransform(IMAGE_SIZE, is_train=False) train_dataset = VOCDatasetWrapper(voc_train, train_transform) val_dataset = VOCDatasetWrapper(voc_val, val_transform) train_loader = DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True, num_workers=2) # Reduced workers val_loader = DataLoader(val_dataset, batch_size=BATCH_SIZE, shuffle=False, num_workers=2) # Reduced workers # Display some examples from the dataset def visualize_examples(dataset, num_examples=3): fig, axes = plt.subplots(num_examples, 2, figsize=(12, 4 * num_examples)) for i in range(num_examples): # Get a sample idx = np.random.randint(0, len(dataset)) image, mask = dataset.dataset[idx] # Original image axes[i, 0].imshow(image) axes[i, 0].set_title(f"Original Image {idx}") axes[i, 0].axis('off') # Colored mask colored_mask = np.zeros((mask.size[1], mask.size[0], 3), dtype=np.uint8) mask_array = np.array(mask) for class_idx, color in enumerate(VOC_COLORMAP): colored_mask[mask_array == class_idx] = color axes[i, 1].imshow(colored_mask) axes[i, 1].set_title(f"Segmentation Mask {idx}") axes[i, 1].axis('off') plt.tight_layout() plt.show() # Visualize examples before training print("Displaying dataset examples:") visualize_examples(train_dataset) import torch def evaluate_segmentation(model, val_loader, num_classes, device='cuda'): model.eval() confusion_matrix = torch.zeros(num_classes, num_classes, dtype=torch.long, device=device) ignore_index = 255 with torch.no_grad(): for images, masks in val_loader: images = images.to(device) masks = masks.to(device) outputs = model(images) preds = torch.argmax(outputs, dim=1) # [B, H, W] preds = preds.view(-1) masks = masks.view(-1) # Filter out ignore pixels valid_mask = (masks != ignore_index) preds = preds[valid_mask] gt = masks[valid_mask] # Vectorized confusion matrix update indices = gt * num_classes + preds # also on the GPU bins = torch.bincount(indices, minlength=num_classes*num_classes) confusion_matrix += bins.reshape(num_classes, num_classes) # Move confusion matrix back to CPU if you need .item() or numpy confusion_matrix = confusion_matrix.cpu() # Compute IoU class_iou = [] for c in range(num_classes): TP = confusion_matrix[c, c].item() FN = confusion_matrix[c, :].sum().item() - TP FP = confusion_matrix[:, c].sum().item() - TP denom = TP + FP + FN if denom == 0: iou_c = float('nan') else: iou_c = TP / denom class_iou.append(iou_c) # mean_iou valid_iou = [x for x in class_iou if not np.isnan(x)] mean_iou = float(np.mean(valid_iou)) if len(valid_iou) > 0 else 0.0 return class_iou, mean_iou class_iou, mean_iou = evaluate_segmentation( model=trained_model, val_loader=val_loader, num_classes=NUM_CLASSES, device=device ) # Print results for i, iou_val in enumerate(class_iou): print(f"Class {i} IoU = {iou_val:.4f}") print(f"Mean IoU over {len(class_iou)} classes = {mean_iou:.4f}") 尤其是这部分一定要保证可以正常输出但不能更改代码: assert mean_iou > 0.10, 'Your IOU must be larger than 10 to get the grade' if mean_iou > 0.15: print('Full grade, 10 points') elif 0.12 < mean_iou <= 0.15: print('Partial grade, 8 points') elif 0.10 < mean_iou <= 0.12: print('Partial grade, 5 points') else: print('IOU is less than 10, 0 points') print('All tests pass!')

司幽幽
  • 粉丝: 44
上传资源 快速赚钱