MiDaS Monocular Depth Estimation
Date: 2025-01-31 09:45:36
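### MiDaS Monocular Depth Estimation: Implementation and Usage
MiDaS is a monocular depth estimation model that predicts a depth map for a scene from a single 2D image. Its output is relative inverse depth rather than metric distance, so larger values indicate surfaces closer to the camera. The model is a neural network, and it performs well on multiple public datasets.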
#### Installing dependencies
To run the MiDaS model, install the required Python libraries:
```bash
pip install torch torchvision timm
pip install opencv-python-headless matplotlib numpy
```
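Before loading the model, it can help to confirm that the packages are importable. The following is a minimal sketch using only the standard library. The `REQUIRED_MODULES` list and the `missing_modules` helper are illustrative names, not part of MiDaS. Note that the list uses import names, which can differ from pip package names (`opencv-python-headless` is imported as `cv2`).

```python
import importlib.util

# Import names (assumed from the install commands above), not pip names:
# opencv-python-headless is imported as cv2.
REQUIRED_MODULES = ["torch", "torchvision", "cv2", "matplotlib", "numpy"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

If the script reports a missing module, install the corresponding pip package and run it again.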
#### Loading the pretrained model and running inference
The following complete example shows how to load MiDaS and run it on an input image[^1]:
```python
import urllib.request

import cv2
import matplotlib.pyplot as plt
import torch


def load_model(model_type="MiDaS_small"):
    """Load a MiDaS model and its matching preprocessing transform from PyTorch Hub."""
    midas = torch.hub.load("intel-isl/MiDaS", model_type)
    device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
    midas.to(device)
    midas.eval()

    # Use the transforms published with the model. They accept an RGB NumPy
    # image, resize and normalize it, and return a batched (1, 3, H, W) tensor.
    midas_transforms = torch.hub.load("intel-isl/MiDaS", "transforms")
    if model_type in ("DPT_Large", "DPT_Hybrid"):
        transform = midas_transforms.dpt_transform
    else:
        transform = midas_transforms.small_transform
    return midas, transform, device


# Download a test image.
url, filename = "http://images.cocodataset.org/val2017/000000039769.jpg", "cat.jpg"
urllib.request.urlretrieve(url, filename)

# OpenCV loads images as BGR; the model expects RGB.
img = cv2.imread(filename)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

midas, transform, device = load_model()
input_batch = transform(img).to(device)  # already batched, no unsqueeze needed

with torch.no_grad():
    prediction = midas(input_batch)
    # Resize the prediction back to the original image resolution.
    depth_map = torch.nn.functional.interpolate(
        prediction.unsqueeze(1),
        size=img.shape[:2],
        mode="bicubic",
        align_corners=False,
    ).squeeze().cpu().numpy()

plt.imshow(depth_map)
plt.show()
```
This script downloads a test image and uses the small MiDaS model to compute its depth map[^1].
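The raw prediction has an arbitrary value range, so it is usually rescaled before being saved or displayed as an image. The following is a minimal sketch of min-max normalization to 8-bit. The `normalize_depth` helper is a hypothetical name introduced here for illustration, and the small array stands in for a real model output.

```python
import numpy as np


def normalize_depth(depth_map: np.ndarray) -> np.ndarray:
    """Rescale a depth map to uint8 in [0, 255] using min-max normalization."""
    depth = depth_map.astype(np.float64)
    d_min, d_max = depth.min(), depth.max()
    if d_max - d_min < 1e-12:
        # Constant map: avoid division by zero and return all zeros.
        return np.zeros(depth.shape, dtype=np.uint8)
    scaled = (depth - d_min) / (d_max - d_min)
    return np.round(scaled * 255.0).astype(np.uint8)


# Synthetic stand-in for a model output (not real MiDaS data).
demo = np.array([[0.0, 4.0], [12.0, 20.0]])
demo_u8 = normalize_depth(demo)
```

With the `depth_map` from the script, the result could be written to disk with `cv2.imwrite("depth.png", normalize_depth(depth_map))`. Because MiDaS predicts relative inverse depth, brighter pixels in the saved image correspond to closer surfaces.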