Using a computer webcam for video with YOLOv5
Date: 2025-06-28 12:17:32
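### Real-time video object detection with YOLOv5 and a computer webcam
To run object detection on live video, you can use OpenCV's `cv2.VideoCapture()` in Python to read the stream from the computer's camera and pass each frame to a pretrained YOLOv5 model for prediction. The simple example below shows how to set up the environment and run this process[^1].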
#### Install dependencies
First, install the required packages: PyTorch, OpenCV, and the tools provided in the ultralytics/yolov5 repository:
```bash
# CUDA 11.3 wheels; adjust the index URL to match your CUDA setup
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
# cv2.imshow needs a GUI-enabled build, so install opencv-python rather than opencv-python-headless
pip install opencv-python
git clone https://github.com/ultralytics/yolov5.git
cd yolov5
pip install -r requirements.txt
```
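Before moving on, it can help to confirm that the packages are importable in the active environment. The following is a minimal sketch using only the standard library; the helper name `missing_modules` is hypothetical, and the import names `torch`, `cv2`, and `numpy` are assumed to correspond to the packages installed above.

```python
import importlib.util


def missing_modules(module_names):
    """Return the names from module_names that the import system cannot find."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed top-level import names for PyTorch, OpenCV, and NumPy
    required = ["torch", "cv2", "numpy"]
    missing = missing_modules(required)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

This only checks that the modules can be located; it does not verify versions or that CUDA is usable.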
#### Write the code
Next, write a script that reads frames from the camera and runs YOLOv5 inference on each one:
```python
import random

import cv2
import numpy as np
import torch

# Run this script from the root of the cloned yolov5 repository. These imports
# follow the module layout of older YOLOv5 releases; newer checkouts have moved
# or renamed some of these helpers.
from models.experimental import attempt_load
from utils.datasets import letterbox
from utils.general import check_img_size, non_max_suppression, scale_coords
from utils.plots import plot_one_box
from utils.torch_utils import select_device


def detect_objects(model, img0, device, imgsz, names, colors):
    """Run YOLOv5 on one BGR frame, draw the boxes on it, and return (frame, detections)."""
    # Padded resize
    img = letterbox(img0, new_shape=imgsz)[0]

    # Convert
    img = img[:, :, ::-1].transpose(2, 0, 1)  # BGR to RGB, HWC to CHW
    img = np.ascontiguousarray(img)
    img = torch.from_numpy(img).to(device)
    img = img.float()  # uint8 to fp32
    img /= 255.0  # 0 - 255 to 0.0 - 1.0
    if img.ndimension() == 3:
        img = img.unsqueeze(0)  # add batch dimension

    with torch.no_grad():  # inference only, no gradients needed
        pred = model(img, augment=False)[0]

    # Apply NMS
    pred = non_max_suppression(pred, conf_thres=0.25, iou_thres=0.45, classes=None, agnostic=False)

    detections = []
    for det in pred:  # detections per image
        if len(det):
            # Rescale boxes from the letterboxed size to the original frame size
            det[:, :4] = scale_coords(img.shape[2:], det[:, :4], img0.shape).round()
            for *xyxy, conf, cls in reversed(det):
                label = f'{names[int(cls)]} {conf:.2f}'
                plot_one_box(xyxy, img0, label=label, color=colors[int(cls)], line_thickness=3)
                detections.append({
                    "label": names[int(cls)],
                    "confidence": float(conf),
                    "bounding_box": [int(x.item()) for x in xyxy],
                })
    return img0, detections


if __name__ == '__main__':
    weights_path = 'yolov5s.pt'  # path/to/custom_weights.pt or yolov5s.pt etc.
    device = select_device('')
    model = attempt_load(weights_path, map_location=device)  # load FP32 model
    stride = int(model.stride.max())  # model stride
    imgsz = check_img_size(640, s=stride)  # check image size
    names = model.module.names if hasattr(model, 'module') else model.names  # get class names
    # Pick the colors once so each class keeps the same color across frames
    colors = [[random.randint(0, 255) for _ in range(3)] for _ in names]

    cap = cv2.VideoCapture(0)
    if not cap.isOpened():
        raise RuntimeError('Could not open camera 0')
    while True:
        ret, frame = cap.read()
        if not ret:  # camera disconnected or no frame available
            break
        result_frame, results = detect_objects(model, frame.copy(), device, imgsz, names, colors)
        cv2.imshow('Detection', result_frame)
        key = cv2.waitKey(1) & 0xFF
        if key == ord("q"):
            break
    cap.release()
    cv2.destroyAllWindows()
```
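This script opens the default webcam (usually device index 0), continuously captures frames, and runs object detection on each one by calling `detect_objects`. The annotated frames are shown in a window on screen until the Q key is pressed.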
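To make the coordinate handling in the script more concrete, here is a minimal pure-Python sketch of the arithmetic behind a padded ("letterbox") resize and the reverse mapping of a box back to the original frame. The function names `letterbox_geometry` and `unletterbox_box` are hypothetical and are not part of the yolov5 repository. The repository's `letterbox` and `scale_coords` additionally handle details such as stride-aligned padding and clipping to the image bounds, which this sketch leaves out.

```python
def letterbox_geometry(src_hw, dst_hw=(640, 640)):
    """Return (scale, pad_x, pad_y) for fitting a frame into dst_hw while keeping aspect ratio."""
    src_h, src_w = src_hw
    dst_h, dst_w = dst_hw
    # Use the smaller ratio so the whole frame fits inside the target
    scale = min(dst_h / src_h, dst_w / src_w)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    # Split the leftover space evenly between the two sides
    pad_x = (dst_w - new_w) / 2
    pad_y = (dst_h - new_h) / 2
    return scale, pad_x, pad_y


def unletterbox_box(box_xyxy, scale, pad_x, pad_y):
    """Map an [x1, y1, x2, y2] box from letterboxed coordinates back to the original frame."""
    x1, y1, x2, y2 = box_xyxy
    return [
        (x1 - pad_x) / scale,
        (y1 - pad_y) / scale,
        (x2 - pad_x) / scale,
        (y2 - pad_y) / scale,
    ]


if __name__ == "__main__":
    # A 720x1280 (height x width) frame fitted into a 640x640 input
    scale, pad_x, pad_y = letterbox_geometry((720, 1280))
    print(scale, pad_x, pad_y)
    print(unletterbox_box([100, 140, 300, 500], scale, pad_x, pad_y))
```

Removing the padding first and then dividing by the scale is what lets the boxes predicted on the resized input line up with the frame that is drawn and displayed.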