YOLOv4 on Edge TPU
Prepare int8 tflite
Supported Operation: https://coral.ai/docs/edgetpu/models-intro/#supported-operations
- Use YOLOv4-Tiny because YOLOv4 model size is too large.
- Use relu for activation. YOLOv4-Tiny uses leaky-relu, but this is an unsupported operation on the Edge TPU.
- Set the desired input_size before converting the model to tflite.
Training
Ref: https://wiki.loliot.net/docs/lang/python/libraries/yolov4/python-yolov4-training#full-script
- v3
- v2
from tensorflow.keras import callbacks, optimizers
from yolov4.tf import SaveWeightsCallback, YOLOv4
import time
yolo = YOLOv4(tiny=True)
yolo.classes = "/content/drive/My Drive/Hard_Soft/NN/coco/coco.names"
yolo.input_size = 608
yolo.batch_size = 32
yolo.make_model(activation1="relu")
...
Convert to tflite
- v3
- v2
- 2021-02-21
- OS: Ubuntu 20.04
- TF: v2.4.1
- yolov4: v3.1.0
danger
Don't confuse yolov4-iny-relu.cfg
and yolov4-tiny-relu-tpu.cfg
from yolov4.tf import YOLOv4, YOLODataset, save_as_tflite
yolo = YOLOv4()
yolo.config.parse_names("test/coco.names")
yolo.config.parse_cfg("config/yolov4-tiny-relu-tpu.cfg")
yolo.make_model()
yolo.load_weights(
"/home/hhk7734/NN/yolov4-tiny-relu.weights", weights_type="yolo"
)
dataset = YOLODataset(
config=yolo.config,
dataset_list="/home/hhk7734/NN/val2017.txt",
image_path_prefix="/home/hhk7734/NN/val2017",
training=False,
)
save_as_tflite(
model=yolo.model,
tflite_path="yolov4-tiny-relu-int8.tflite",
quantization="full_int8",
dataset=dataset,
)
$ edgetpu_compiler -sa yolov4-tiny-relu-int8.tflite
Edge TPU Compiler version 15.0.340273435
Model compiled successfully in 758 ms.
Input model: yolov4-tiny-relu-int8.tflite
Input size: 5.91MiB
Output model: yolov4-tiny-relu-int8_edgetpu.tflite
Output size: 5.98MiB
On-chip memory used for caching model parameters: 5.83MiB
On-chip memory remaining for caching model parameters: 1.78MiB
Off-chip memory used for streaming uncached model parameters: 0.00B
Number of Edge TPU subgraphs: 2
Total number of operations: 51
Operation log: yolov4-tiny-relu-int8_edgetpu.log
Model successfully compiled but not all operations are supported by the Edge TPU. A percentage of the model will instead run on the CPU, which is slower. If possible, consider updating your model to use only operations supported by the Edge TPU. For details, visit g.co/coral/model-reqs.
Number of operations that will run on Edge TPU: 46
Number of operations that will run on CPU: 5
Operator Count Status
RESIZE_BILINEAR 1 Operation version not supported
DEQUANTIZE 4 Operation is working on an unsupported data type
CONV_2D 21 Mapped to Edge TPU
QUANTIZE 8 Mapped to Edge TPU
PAD 2 Mapped to Edge TPU
CONCATENATION 7 Mapped to Edge TPU
LOGISTIC 2 Mapped to Edge TPU
SPLIT 3 Mapped to Edge TPU
MAX_POOL_2D 3 Mapped to Edge TPU
rsync yolov4-tiny-relu-int8_edgetpu.tflite coco.names yolov4-tiny-relu-tpu.cfg \
mendel@<tpu ip>:~
from yolov4.tf import YOLOv4
yolo = YOLOv4(tiny=True, tpu=True)
yolo.classes = "coco.names"
yolo.input_size = (512, 384) # width, height
yolo.make_model(activation1="relu")
yolo.load_weights("yolov4-tiny-relu.weights", weights_type="yolo")
dataset = yolo.load_dataset(
"train2017.txt",
training=False,
image_path_prefix="/home/hhk7734/NN/train2017"
)
yolo.save_as_tflite(
"yolov4-tiny-relu-int8.tflite",
quantization="full_int8",
data_set=dataset,
num_calibration_steps=400
)
$ edgetpu_compiler -sa yolov4-tiny-relu-int8.tflite
Edge TPU Compiler version 15.0.340273435
Model compiled successfully in 1211 ms.
Input model: yolov4-tiny-relu-int8.tflite
Input size: 5.94MiB
Output model: yolov4-tiny-relu-int8_edgetpu.tflite
Output size: 6.33MiB
On-chip memory used for caching model parameters: 6.06MiB
On-chip memory remaining for caching model parameters: 1.66MiB
Off-chip memory used for streaming uncached model parameters: 3.38KiB
Number of Edge TPU subgraphs: 3
Total number of operations: 132
Operation log: yolov4-tiny-relu-int8_edgetpu.log
Model successfully compiled but not all operations are supported by the Edge TPU. A percentage of the model will instead run on the CPU, which is slower. If possible, consider updating your model to use only operations supported by the Edge TPU. For details, visit g.co/coral/model-reqs.
Number of operations that will run on Edge TPU: 101
Number of operations that will run on CPU: 31
Operator Count Status
PAD 2 Mapped to Edge TPU
SPLIT 7 Mapped to Edge TPU
ADD 6 Mapped to Edge TPU
SUB 6 Mapped to Edge TPU
QUANTIZE 27 Mapped to Edge TPU
QUANTIZE 6 Operation is otherwise supported, but not mapped due to some unspecified limitation
DEQUANTIZE 6 Operation is working on an unsupported data type
EXP 6 Operation is working on an unsupported data type
CONCATENATION 9 Mapped to Edge TPU
MAX_POOL_2D 3 Mapped to Edge TPU
CONV_2D 21 Mapped to Edge TPU
LOGISTIC 2 Mapped to Edge TPU
MUL 18 Mapped to Edge TPU
RESIZE_BILINEAR 1 Operation version not supported
SPLIT_V 12 Operation not supported
rsync yolov4-tiny-relu-int8_edgetpu.tflite coco.names mendel@<tpu ip>:~
Run on Edge TPU
Install the Edge TPU runtime
Ref: https://coral.ai/docs/accelerator/get-started/#1-install-the-edge-tpu-runtime
Install just the TensorFlow Lite interpreter
Ref: https://www.tensorflow.org/lite/guide/python#install_just_the_tensorflow_lite_interpreter
Run example script
- v3
- v2
edge_yolov4_tiny_video_test.py
import cv2
from yolov4.tflite import YOLOv4
yolo = YOLOv4()
yolo.config.parse_names("coco.names")
yolo.config.parse_cfg("yolov4-tiny-relu-tpu.cfg")
yolo.summary()
yolo.load_tflite("yolov4-tiny-relu-int8_edgetpu.tflite")
yolo.inference(
"/dev/video1",
is_image=False,
cv_apiPreference=cv2.CAP_V4L2,
cv_frame_size=(640, 480),
cv_fourcc="YUYV",
)
edge_yolov4_tiny_video_test.py
from yolov4.tflite import YOLOv4
import cv2
yolo = YOLOv4(tiny=True, tpu=True)
yolo.classes = "coco.names"
yolo.load_tflite("yolov4-tiny-relu-int8_edgetpu.tflite")
yolo.inference(
"/dev/video1",
is_image=False,
cv_apiPreference=cv2.CAP_V4L2,
cv_frame_size=(640, 480),
cv_fourcc="YUYV",
)
python3 edge_yolov4_tiny_video_test.py