This project provides quantization aware training (QAT) code for computer vision models. The examples show how to apply the Model Optimization Toolkit's quantization aware training API. Compared to post-training quantization (PTQ), QAT minimizes the quality loss from quantization while still achieving the speed-up of integer quantization, which makes it the preferable technique when there are strict requirements on both model latency and quality. Please see our blog post for more details.
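For background, the snippet below is a minimal, standalone sketch of the Model Optimization Toolkit QAT API applied to a toy Keras model. It is illustrative only and is not this project's task-specific training pipeline; the model and hyperparameters are hypothetical.

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

# A hypothetical toy model; this project applies QAT to the vision models
# listed below instead.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224, 224, 3)),
    tf.keras.layers.Conv2D(16, 3, activation='relu'),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10),
])

# Wrap the model so fake-quantization ops are inserted during training.
qat_model = tfmot.quantization.keras.quantize_model(model)

qat_model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'])

# Fine-tune with quantization emulated in the forward pass
# (train_images / train_labels are placeholders for your data).
# qat_model.fit(train_images, train_labels, epochs=1)
```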
Currently, we support a limited number of vision tasks and models, and we will keep adding support for more tasks and models in upcoming releases.
You can follow this Colab notebook to try QAT.
- First release of vision models covering the image classification and semantic segmentation tasks. Supports ResNet, MobileNetV2, MobileNetV3 Large, Multi-hardware MobileNet, and DeepLabV3/V3+.
- Released support for the object detection task (RetinaNet).
- Jaehong Kim (Xhark)
- Fang Yang (fyangf)
- Shixin Luo (luotigerlsx)
Models are trained on the ImageNet-1K training set and evaluated on the validation set.
Model | Resolution | Top-1 Accuracy (FP32) | Top-1 Accuracy (INT8) | Top-1 Accuracy (QAT INT8) | Config | Download |
---|---|---|---|---|---|---|
MobileNetV2 | 224x224 | 72.78 | 72.39 | 72.79 | config | TFLite(Int8/QAT) |
ResNet50 | 224x224 | 76.71 | 76.42 | 77.20 | config | TFLite(Int8/QAT) |
MobileNetV3.5 MultiAVG | 224x224 | 75.21 | 74.12 | 75.13 | config | TFLite(Int8/QAT) |
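The TFLite files linked in the Download column can be run directly with the TFLite interpreter. Below is a minimal sketch of single-image inference, assuming a locally downloaded file named `model_int8.tflite` and standard 224x224 preprocessing; the file name and preprocessing here are illustrative, not part of the release.

```python
import numpy as np
import tensorflow as tf

# Hypothetical local path to one of the downloaded TFLite models.
interpreter = tf.lite.Interpreter(model_path='model_int8.tflite')
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]

# A random array stands in for a preprocessed ImageNet validation image.
image = np.random.rand(1, 224, 224, 3).astype(np.float32)

# If the model expects quantized inputs (e.g. int8/uint8), quantize accordingly.
if input_details['dtype'] != np.float32:
    scale, zero_point = input_details['quantization']
    image = (image / scale + zero_point).astype(input_details['dtype'])

interpreter.set_tensor(input_details['index'], image)
interpreter.invoke()
logits = interpreter.get_tensor(output_details['index'])
print('Predicted class:', int(np.argmax(logits)))
```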
Models are trained from scratch on the COCO training set and evaluated on the COCO validation set.
model | resolution | mAP | mAP (FP32) | mAP (INT8) | mAP (QAT INT8) | download |
---|---|---|---|---|---|---|
MobileNet v2 + RetinaNet | 256x256 | 23.3 | 23.3 | 0.04 | 21.7 | ckpt \| tensorboard \| FP32 \| INT8 \| QAT INT8 |
Models are pretrained on the COCO training set. Two datasets, the Pascal VOC segmentation dataset and the Cityscapes dataset, are used to train and evaluate the models.
model | resolution | mIoU | mIoU (FP32) | mIoU (INT8) | mIoU (QAT INT8) | download (tflite) |
---|---|---|---|---|---|---|
MobileNet v2 + DeepLab v3 | 512x512 | 75.27 | 75.30 | 73.95 | 74.68 | FP32 \| INT8 \| QAT INT8 |
MobileNet v2 + DeepLab v3+ | 1024x2048 | 73.82 | 73.84 | 72.33 | 73.49 | FP32 \| INT8 \| QAT INT8 |
Training can run on Google Cloud Platform using Cloud TPU. See the Cloud TPU instructions for setting up a Cloud TPU, then launch training as in the following example for object detection:
```shell
# First download the pre-trained floating point model, since QAT fine-tunes it.
mkdir -p /tmp/qat
gsutil cp gs://tf_model_garden/vision/qat/mobilenetv2_ssd_coco/mobilenetv2_ssd_i256_ckpt.tar.gz /tmp/qat/

# Extract the checkpoint.
tar -xvzf /tmp/qat/mobilenetv2_ssd_i256_ckpt.tar.gz -C /tmp/qat/

# Launch training. Note that we override the checkpoint path in the config file
# via "params_override" to supply the correct checkpoint.
PARAMS_OVERRIDE="task.quantization.pretrained_original_checkpoint=/tmp/qat/mobilenetv2_ssd_i256_ckpt"
EXPERIMENT=retinanet_mobile_coco_qat  # Change this for your run, for example, 'mobilenet_imagenet_qat'.
CONFIG_FILE=xxx  # Change this for your run, for example, the path of coco_mobilenetv2_qat_tpu_e2e.yaml.
TPU_NAME="<tpu-name>"  # The name assigned while creating the Cloud TPU.
MODEL_DIR="gs://<path-to-model-directory>"  # Change this for your run, for example, /tmp/model_dir.

python3 train.py \
  --experiment=${EXPERIMENT} \
  --config_file=${CONFIG_FILE} \
  --model_dir=${MODEL_DIR} \
  --tpu=${TPU_NAME} \
  --params_override=${PARAMS_OVERRIDE} \
  --mode=train
```
Run the following command for evaluation.
```shell
EXPERIMENT=retinanet_mobile_coco  # Change this for your run, for example, 'mobilenet_imagenet_qat'.
CONFIG_FILE=xxx  # Change this for your run, for example, the path of coco_mobilenetv2_qat_tpu_e2e.yaml.
TPU_NAME="<tpu-name>"  # The name assigned while creating the Cloud TPU.
MODEL_DIR="gs://<path-to-model-directory>"  # Change this for your run, for example, /tmp/model_dir.

python3 train.py \
  --experiment=${EXPERIMENT} \
  --config_file=${CONFIG_FILE} \
  --model_dir=${MODEL_DIR} \
  --tpu=${TPU_NAME} \
  --mode=eval
```
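The result tables above link to INT8 TFLite files produced from the trained models. As a rough illustration only, and not this project's export tooling, the snippet below sketches how a QAT-trained Keras SavedModel can typically be converted to an integer TFLite model with the TFLite converter; `saved_model_dir` and the output path are placeholders.

```python
import tensorflow as tf

# Hypothetical path to a QAT-trained Keras SavedModel.
saved_model_dir = '/tmp/qat_saved_model'

converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
# DEFAULT optimization lets the converter use the quantization ranges
# learned during QAT to emit an integer-quantized model.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open('/tmp/model_qat_int8.tflite', 'wb') as f:
    f.write(tflite_model)
```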
This project is licensed under the terms of the Apache License 2.0.