This article shows how to train a traffic light detection network with the TensorFlow Object Detection API. Using the LISA dataset, we convert the data to the TFRecord format with a simple script, train the model both on a local GPU and on a Cloud TPU provided by Google Cloud, export the trained network as protobuf weights, and finally validate the model in a Jupyter notebook.
First of all, thanks to the Google TensorFlow Research Cloud team for providing free Cloud TPUs. In this article I use a Cloud TPU v2, which has 8 cores and 64 GB of memory in total (analogous to GPU memory) and delivers up to 180 TFLOPS of compute.
Environment Setup
Local GPU Environment Setup
Let's start with the setup for training on a local GPU. Make sure your machine has a reasonably powerful GPU (a GTX 1060 or better is recommended), runs Ubuntu or another Linux distribution, and has CUDA and cuDNN installed. Then install tensorflow-gpu and the other dependencies:
pip install tensorflow-gpu
sudo apt-get install protobuf-compiler python-pil python-lxml python-tk
pip install --user Cython
pip install --user contextlib2
pip install --user jupyter
pip install --user matplotlib
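Before moving on, it is worth confirming that TensorFlow can actually see the GPU. A minimal check (assuming the TF 1.x API used throughout this article) looks like this:
import tensorflow as tf
# Print the TensorFlow version and whether a GPU device is visible.
print(tf.__version__)
print(tf.test.is_gpu_available())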
Install the COCO API:
git clone https://github.com/cocodataset/cocoapi.git
cd cocoapi/PythonAPI
make
cp -r pycocotools <path_to_tensorflow>/models/research/
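To verify that the COCO API was copied correctly, a quick import check (a minimal sketch, run from the models/research/ directory) is:
# Verify that pycocotools can be imported.
from pycocotools import coco
print(coco.__file__)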
Clone the models repository:
git clone https://github.com/tensorflow/models.git
cd into the models/research/ directory and run:
protoc object_detection/protos/*.proto --python_out=.
Add the current directory (research) to the Python path:
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
The steps above follow the official installation guide: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/installation.md
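The same guide also describes how to verify the installation by running the unit tests that ship with the Object Detection API; if everything is set up correctly, the following command should pass:
python object_detection/builders/model_builder_test.py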
At this point the local environment is ready. Later in this article we will go through how to set up the same environment on Google Cloud.
Preparing the Traffic Light Detection Dataset
First, download the traffic light dataset. This article uses the LISA traffic light dataset, available at:
http://cvrr.ucsd.edu/vivachallenge/index.php/traffic-light/traffic-light-detection/
We use the Day Train Set (12.4 GB) and the Night Train Set (0.8 GB) for training. After downloading and extracting them, you get the raw images (stored in the frames folders) and the annotation CSV files (named frameAnnotationsBOX.csv).
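Before converting anything, it helps to peek at the annotation format. A small snippet like the following (a sketch; the path is illustrative and the semicolon delimiter is an assumption about the LISA CSVs) prints the first few annotation rows:
import csv
import itertools

# Print the header and the first few annotation rows of a LISA CSV.
with open('dayTrain/frameAnnotationsBOX.csv') as f:
    reader = csv.reader(f, delimiter=';')
    for row in itertools.islice(reader, 5):
        print(row)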
Write a script named create_lisa_tf_record.py that converts the LISA dataset to the TFRecord format. The code is as follows:
#!/usr/bin/env python
import os
import csv
import io
import itertools
import hashlib
import random
import PIL.Image
import tensorflow as tf
from object_detection.utils import label_map_util
from object_detection.utils import dataset_util
import contextlib2
from object_detection.dataset_tools import tf_record_creation_util
ANNOTATION = 'frameAnnotationsBOX.csv'
FRAMES = 'frames'
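# Collapse the LISA annotation tags into the three classes used for detection.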
MAP = {
    'go': 'green',
    'goLeft': 'green',
    'stop': 'red',
    'stopLeft': 'red',
    'warning': 'yellow',
    'warningLeft': 'yellow'
}
flags = tf.app.flags
flags.DEFINE_string('data_dir', '', 'Root directory to LISA dataset.')
flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
flags.DEFINE_string('label_map_path', 'data/lisa_label_map.pbtxt',
                    'Path to label map proto')
FLAGS = flags.FLAGS
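# The label map pointed to by --label_map_path is assumed to contain one item
# per class, for example:
#   item { id: 1 name: 'green' }
#   item { id: 2 name: 'yellow' }
#   item { id: 3 name: 'red' }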
width = None
height = None
def process_frame(label_map_dict, frame):
    global width
    global height

    filename, xmin, ymin, xmax, ymax, classes = frame

    if not os.path.exists(filename):
        tf.logging.error("File %s not found", filename)
        return

    # Read the encoded PNG once and decode it with PIL to check its size.
    with tf.gfile.GFile(filename, 'rb') as img:
        encoded_png = img.read()
    png = PIL.Image.open(io.BytesIO(encoded_png))

    if png.format != 'PNG':
        tf.logging.error("File %s has unexpected image format '%s'", filename, png.format)
        return

    # All frames are expected to share the size of the first frame.
    if width is None and height is None:
        width = png.width
        height = png.height
        tf.logging.info('Expected image size: %dx%d', width, height)

    if width != png.width or height != png.height:
        tf.logging.error('File %s has unexpected size', filename)
        return

    print(filename)
    print(classes)

    key = hashlib.sha256(encoded_png).hexdigest()

    # Map class names to label ids and normalize box coordinates to [0, 1].
    labels = [ label_map_dict[c] for c in classes ]
    xmin = [ float(x)/width for x in xmin ]
    xmax = [ float(x)/width for x in xmax ]
    ymin = [ float(y)/height for y in ymin ]
    ymax = [ float(y)/height for y in ymax ]
    classes = [ c.encode(