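1. About YOLO
YOLO (You Only Look Once) is a popular object detection and image segmentation model developed by Joseph Redmon and Ali Farhadi at the University of Washington. Launched in 2015, YOLO quickly gained popularity for its high speed and accuracy.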
- Docs: https://docs.ultralytics.com
- Solutions: https://docs.ultralytics.com/solutions/
- Community: https://community.ultralytics.com
- GitHub: https://github.com/ultralytics/ultralytics
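Note: Flash-Attention officially requires a CUDA environment, and macOS does not support NVIDIA CUDA, so it cannot be installed on a Mac.
This tutorial uses a Mac Mini M4 as the example machine.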
2. Installation
Install torch
https://pytorch.org/get-started/locally/
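Because CUDA is unavailable on macOS, it can help to check which backend PyTorch can actually use once torch is installed. The following is a minimal sketch rather than part of the official instructions: `pick_device` is a hypothetical helper, and it assumes that Apple's Metal backend (`mps`) is the preferred fallback on an M-series Mac.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return a device string, preferring CUDA, then Apple MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


try:
    import torch

    # torch.backends.mps may be absent on very old torch builds.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = bool(mps_backend is not None and mps_backend.is_available())
    device = pick_device(torch.cuda.is_available(), mps_ok)
    print("torch", torch.__version__, "->", device)
except ImportError:
    # torch is not installed yet; install it from the link above first.
    print("torch is not installed")
```

If this prints `mps`, the same string can be tried as `device=mps` in the yolo commands used later; when no device is given, the predict log further down shows the run falling back to the CPU.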
Install ultralytics
pip install ultralytics
Verify the yolo installation by checking the version
yolo --version
Output:
$ yolo --version
Creating new Ultralytics Settings v0.0.6 file ✅
View Ultralytics Settings with 'yolo settings' or at '/Users/es/Library/Application Support/Ultralytics/settings.json'
Update Settings with 'yolo settings key=value', i.e. 'yolo settings runs_dir=path/to/dir'. For help see https://docs.ultralytics.com/quickstart/#ultralytics-settings.
WARNING ⚠️ argument '--version' does not require leading dashes '--', updating to 'version'.
8.3.158
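The same check can be done from Python without calling the yolo command. This sketch uses only the standard library; `installed_version` is a hypothetical helper name, and it reports a missing package as `None` instead of raising.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of a package, or None if missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report both packages installed in this section.
for name in ("torch", "ultralytics"):
    print(name, installed_version(name) or "not installed")
```

On the machine used in this tutorial, the ultralytics line should match the 8.3.158 printed by the command above.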
3. Command-line usage
Common commands
yolo help
yolo checks
yolo version
yolo settings
yolo copy-cfg
yolo cfg
yolo solutions help
Syntax:
yolo TASK MODE ARGS
- TASK (optional) is one of ['classify', 'segment', 'pose', 'detect', 'obb']
- MODE (required) is one of ['benchmark', 'track', 'export', 'val', 'train', 'predict']
- ARGS (optional) are any number of custom 'arg=value' pairs like 'imgsz=320' that override defaults.
For the full list of arguments, see: https://docs.ultralytics.com/usage/cfg
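The TASK MODE ARGS layout above can be sketched as a small command builder. This is only an illustration: `build_yolo_command` is a hypothetical helper, not part of ultralytics, and the TASK and MODE lists are copied from the syntax description above.

```python
import shlex

TASKS = ("classify", "segment", "pose", "detect", "obb")
MODES = ("benchmark", "track", "export", "val", "train", "predict")


def build_yolo_command(mode, task=None, **args):
    """Compose a `yolo TASK MODE ARGS` command as a list of tokens."""
    if mode not in MODES:
        raise ValueError(f"unknown MODE: {mode!r}")
    if task is not None and task not in TASKS:
        raise ValueError(f"unknown TASK: {task!r}")
    tokens = ["yolo"]
    if task is not None:  # TASK is optional
        tokens.append(task)
    tokens.append(mode)  # MODE is required
    # ARGS are free-form 'arg=value' pairs that override the defaults.
    tokens.extend(f"{key}={value}" for key, value in args.items())
    return tokens


cmd = build_yolo_command("predict", task="detect", model="yolo11n.pt", imgsz=320)
print(shlex.join(cmd))
```

The token list can be handed to `subprocess.run(cmd, check=True)` to execute the command from a script.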
Official examples
Train a detection model for 10 epochs with an initial learning_rate of 0.01
yolo train data=coco8.yaml model=yolo11n.pt epochs=10 lr0=0.01
Predict a YouTube video using a pretrained segmentation model at image size 320:
yolo predict model=yolo11n-seg.pt source='https://youtu.be/LNwODJXcvt4' imgsz=320
Val a pretrained detection model at batch-size 1 and image size 640:
yolo val model=yolo11n.pt data=coco8.yaml batch=1 imgsz=640
Export a YOLO11n classification model to ONNX format at image size 224 by 128 (no TASK required)
yolo export model=yolo11n-cls.pt format=onnx imgsz=224,128
Ultralytics solutions usage
yolo solutions count or in ['crop', 'blur', 'workout', 'heatmap', 'isegment', 'visioneye', 'speed', 'queue', 'analytics', 'inference', 'trackzone'] source="path/to/video.mp4"
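The train, val and export examples above also have Python counterparts. The sketch below mirrors their arguments as keyword dictionaries; `run_examples` is a hypothetical wrapper, and it imports ultralytics lazily so that nothing is downloaded or trained until it is called. It assumes the `train`, `val` and `export` methods accept the same argument names as the command line.

```python
# Keyword arguments mirroring the command-line examples above.
TRAIN_ARGS = {"data": "coco8.yaml", "epochs": 10, "lr0": 0.01}
VAL_ARGS = {"data": "coco8.yaml", "batch": 1, "imgsz": 640}
EXPORT_ARGS = {"format": "onnx", "imgsz": [224, 128]}


def run_examples():
    """Run the train, val and export examples through the Python API."""
    from ultralytics import YOLO  # imported lazily: heavy dependency

    YOLO("yolo11n.pt").train(**TRAIN_ARGS)  # 10 epochs, initial lr 0.01
    YOLO("yolo11n.pt").val(**VAL_ARGS)  # batch size 1, image size 640
    YOLO("yolo11n-cls.pt").export(**EXPORT_ARGS)  # ONNX at 224 by 128
```

Calling `run_examples()` downloads the weights and the coco8 dataset on first use, so it is best run in a scratch directory.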
yolo cfg
$ yolo cfg
Printing '/opt/miniconda3/lib/python3.13/site-packages/ultralytics/cfg/default.yaml'
task: detect
mode: train
model: null
data: null
epochs: 100
time: null
patience: 100
batch: 16
imgsz: 640
save: true
save_period: -1
cache: false
device: null
workers: 8
project: null
name: null
exist_ok: false
pretrained: true
optimizer: auto
verbose: true
seed: 0
deterministic: true
single_cls: false
rect: false
cos_lr: false
close_mosaic: 10
resume: false
amp: true
fraction: 1.0
profile: false
freeze: null
multi_scale: false
overlap_mask: true
mask_ratio: 4
dropout: 0.0
val: true
split: val
save_json: false
conf: null
iou: 0.7
max_det: 300
half: false
dnn: false
plots: true
source: null
vid_stride: 1
stream_buffer: false
visualize: false
augment: false
agnostic_nms: false
classes: null
retina_masks: false
embed: null
show: false
save_frames: false
save_txt: false
save_conf: false
save_crop: false
show_labels: true
show_conf: true
show_boxes: true
line_width: null
format: torchscript
keras: false
optimize: false
int8: false
dynamic: false
simplify: true
opset: null
workspace: null
nms: false
lr0: 0.01
lrf: 0.01
momentum: 0.937
weight_decay: 0.0005
warmup_epochs: 3.0
warmup_momentum: 0.8
warmup_bias_lr: 0.1
box: 7.5
cls: 0.5
dfl: 1.5
pose: 12.0
kobj: 1.0
nbs: 64
hsv_h: 0.015
hsv_s: 0.7
hsv_v: 0.4
degrees: 0.0
translate: 0.1
scale: 0.5
shear: 0.0
perspective: 0.0
flipud: 0.0
fliplr: 0.5
bgr: 0.0
mosaic: 1.0
mixup: 0.0
cutmix: 0.0
copy_paste: 0.0
copy_paste_mode: flip
auto_augment: randaugment
erasing: 0.4
cfg: null
tracker: botsort.yaml
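Every key in this output is a default that an 'arg=value' pair on the command line can override. The sketch below illustrates that merge on a handful of the defaults; it is not the parser ultralytics uses, and `parse_value` and `apply_overrides` are hypothetical helpers written for this example.

```python
import ast

# A few defaults copied from the `yolo cfg` output above.
DEFAULTS = {"imgsz": 640, "conf": None, "batch": 16, "half": False, "optimizer": "auto"}


def parse_value(text):
    """Convert the value part of 'arg=value' into a Python value."""
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    if lowered in ("null", "none"):
        return None
    try:
        return ast.literal_eval(text)  # numbers, tuples such as 224,128
    except (ValueError, SyntaxError, TypeError):
        return text  # keep anything else as a plain string


def apply_overrides(defaults, pairs):
    """Return a copy of defaults with 'arg=value' pairs applied."""
    merged = dict(defaults)
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"expected arg=value, got {pair!r}")
        if key not in merged:
            raise KeyError(f"unknown argument: {key!r}")
        merged[key] = parse_value(value)
    return merged


print(apply_overrides(DEFAULTS, ["imgsz=320", "conf=0.25"]))
```

Rejecting unknown keys is a design choice of this sketch; it catches typos such as `imgz=320` early.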
yolo predict
$ yolo predict model=yolo11n.pt imgsz=640 conf=0.25
Downloading https://github.com/ultralytics/assets/releases/download/v8.3.0/yolo11n.pt to 'yolo11n.pt'...
100%|████████████████████████████████████████████| 5.35M/5.35M [00:08<00:00, 655kB/s]
WARNING ⚠️ 'source' argument is missing. Using default 'source=/opt/miniconda3/lib/python3.13/site-packages/ultralytics/assets'.
Ultralytics 8.3.158 🚀 Python-3.13.2 torch-2.7.1 CPU (Apple M4)
YOLO11n summary (fused): 100 layers, 2,616,248 parameters, 0 gradients, 6.5 GFLOPs
image 1/2 /opt/miniconda3/lib/python3.13/site-packages/ultralytics/assets/bus.jpg: 640x480 4 persons, 1 bus, 28.8ms
image 2/2 /opt/miniconda3/lib/python3.13/site-packages/ultralytics/assets/zidane.jpg: 384x640 2 persons, 1 tie, 20.5ms
Speed: 1.9ms preprocess, 24.6ms inference, 2.2ms postprocess per image at shape (1, 3, 384, 640)
Results saved to runs/detect/predict
💡 Learn more at https://docs.ultralytics.com/modes/predict
Notes on the run log
- The model is downloaded from the releases page, https://github.com/ultralytics/assets/releases, and saved in the directory the command was run from.
- No source was given, so the sample images bundled with the ultralytics package are used by default: /opt/miniconda3/lib/python3.13/site-packages/ultralytics/assets
- The detection results are written to the runs/detect/predict folder.
- For more on predict, see: https://docs.ultralytics.com/modes/predict
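To inspect what the run wrote, the results folder can be listed from Python. This is a minimal sketch using only the standard library; `list_prediction_images` is a hypothetical helper, the default path is the one reported in the log above, and the set of image suffixes is an assumption.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def list_prediction_images(save_dir="runs/detect/predict"):
    """Return the image files in a results folder, sorted by name."""
    folder = Path(save_dir)
    if not folder.is_dir():
        return []  # nothing has been predicted into this folder yet
    return sorted(
        path
        for path in folder.iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )


for image in list_prediction_images():
    print(image)
```

For the run shown above, this should list the annotated copies of bus.jpg and zidane.jpg.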
4. Calling from Python
from ultralytics import YOLO

# Load a model
model = YOLO("yolo11n-seg.pt")  # load an official model

model_path = 'yolo11n-seg.pt'
# model_path = ''  # set this to the path of your own weights
model = YOLO(model_path)  # load a custom model

# Predict with the model
results = model("https://ultralytics.com/images/bus.jpg")  # predict on an image

# Access the results
for result in results:
    if result.masks is None:  # nothing was segmented in this image
        continue
    xy = result.masks.xy  # mask in polygon format
    xyn = result.masks.xyn  # normalized
    masks = result.masks.data  # mask in matrix format (num_objects x H x W)

print('\n-- results : ', results)
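Besides masks, each result carries the detected boxes and the class-name table. The sketch below counts detections per class name; `count_classes` is a hypothetical helper, and the stand-in objects exist only so the example runs without downloading a model. It assumes `result.boxes.cls` holds one class id per detection and `result.names` maps ids to names.

```python
from collections import Counter
from types import SimpleNamespace


def count_classes(result):
    """Count detections per class name for one Results-like object."""
    if result.boxes is None:
        return Counter()
    class_ids = [int(c) for c in result.boxes.cls.tolist()]
    return Counter(result.names[i] for i in class_ids)


class FakeTensor(list):
    """Stand-in for a torch tensor: only provides tolist()."""

    def tolist(self):
        return list(self)


# Stand-in result with made-up class ids, used instead of a real prediction.
fake = SimpleNamespace(
    boxes=SimpleNamespace(cls=FakeTensor([0.0, 0.0, 5.0])),
    names={0: "person", 5: "bus"},
)
print(count_classes(fake))
```

With a real prediction, call `count_classes(results[0])` on the list returned by the model.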
The results data
Here the results list contains a single element.
[ultralytics.engine.results.Results object with attributes:
boxes: ultralytics.engine.results.Boxes object
keypoints: None
masks: ultralytics.engine.results.Masks object
names: {0: 'person', 1: 'bicycle', 2: 'car', 3: 'motorcycle', 4: 'airplane', 5: 'bus', 6: 'train', 7: 'truck', 8: 'boat', 9: 'traffic light', 10: 'fire hydrant', 11: 'stop sign', 12: 'parking meter', 13: 'bench', 14: 'bird', 15: 'cat', 16: 'dog', 17: 'horse', 18: 'sheep', 19: 'cow', 20: 'elephant', 21: 'bear', 22: 'zebra', 23: 'giraffe', 24: 'backpack', 25: 'umbrella', 26: 'handbag', 27: 'tie', 28: 'suitcase', 29: 'frisbee', 30: 'skis', 31: 'snowboard', 32: 'sports ball', 33: 'kite', 34: 'baseball bat', 35: 'baseball glove', 36: 'skateboard', 37: 'surfboard', 38: 'tennis racket', 39: 'bottle', 40: 'wine glass', 41: 'cup', 42: 'fork', 43: 'knife', 44: 'spoon', 45: 'bowl', 46: 'banana', 47: 'apple', 48: 'sandwich', 49: 'orange', 50: 'broccoli', 51: 'carrot', 52: 'hot dog', 53: 'pizza', 54: 'donut', 55: 'cake', 56: 'chair', 57: 'couch', 58: 'potted plant', 59: 'bed', 60: 'dining table', 61: 'toilet', 62: 'tv', 63: 'laptop', 64: 'mouse', 65: 'remote', 66: 'keyboard', 67: 'cell phone', 68: 'microwave', 69: 'oven', 70: 'toaster', 71: 'sink', 72: 'refrigerator', 73: 'book', 74: 'clock', 75: 'vase', 76: 'scissors', 77: 'teddy bear', 78: 'hair drier', 79: 'toothbrush'}
obb: None
orig_img: array([[[119, 146, 172],
[121, 148, 174],
[122, 152, 177],
...,
[161, 171, 188],
[160, 170, 187],
[160, 170, 187]],
[[120, 147, 173],
[122, 149, 175],
[123, 153, 178],
...,
[161, 171, 188],
[160, 170, 187],
[160, 170, 187]],
[[123, 150, 176],
[124, 151, 177],
[125, 155, 180],
...,
[161, 171, 188],
[160, 170, 187],
[160, 170, 187]],
...,
[[183, 182, 186],
[179, 178, 182],
[180, 179, 183],
...,
[121, 111, 117],
[113, 103, 109],
[115, 105, 111]],
[[165, 164, 168],
[173, 172, 176],
[187, 186, 190],
...,
[102, 92, 98],
[101, 91, 97],
[103, 93, 99]],
[[123, 122, 126],
[145, 144, 148],
[176, 175, 179],
...,
[ 95, 85, 91],
[ 96, 86, 92],
[ 98, 88, 94]]], dtype=uint8)
orig_shape: (1080, 810)
path: '/Users/es/Documents/code/code24/bus.jpg'
probs: None
save_dir: 'runs/segment/predict'
speed: {'preprocess': 2.8683749842457473, 'inference': 56.7722920095548, 'postprocess': 5.611875036265701}]
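The `speed` field gives per-image timings in milliseconds, so the three stages can be summed into a rough throughput figure. The values below are copied from the output above; the variable names are chosen for this example only.

```python
# Per-image timings in milliseconds, copied from the `speed` field above.
speed = {
    "preprocess": 2.8683749842457473,
    "inference": 56.7722920095548,
    "postprocess": 5.611875036265701,
}

total_ms = sum(speed.values())  # time for one image through all stages
fps = 1000.0 / total_ms  # images per second at that rate

print(f"total: {total_ms:.1f} ms per image, about {fps:.1f} images per second")
```

For this run the total is roughly 65 ms per image, or about 15 images per second, most of it spent in inference.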