Unverified commit decb25a7 authored by wangxinxin08, committed by GitHub

Add det models (#5578)

* add PP-YOLO/PP-YOLOE/PP-YOLOE+/PP-Picodet models

* add PP-YOLO/PP-YOLOE/PP-YOLOE+/PP_PicoDet

* modify info.yml of det models

* Update info.yaml

* Update info.yaml

* Update info.yaml

* Update info.yaml
Co-authored-by: liuTINA0907 <65896652+liuTINA0907@users.noreply.github.com>
Parent f5c7f8b1
import gradio as gr
import numpy as np
import os
from src.detection import Detector
# UGC: Define the inference fn() for your models
def model_inference(image):
image, json_out = Detector('PP-PicoDet')(image)
return image, json_out
def clear_all():
return None, None, None
with gr.Blocks() as demo:
gr.Markdown("Objective Detection")
with gr.Column(scale=1, min_width=100):
img_in = gr.Image(label="Input").style(height=200)
with gr.Row():
btn1 = gr.Button("Clear")
btn2 = gr.Button("Submit")
img_out = gr.Image(label="Output").style(height=200)
json_out = gr.JSON(label="jsonOutput")
btn2.click(fn=model_inference, inputs=img_in, outputs=[img_out, json_out])
btn1.click(fn=clear_all, inputs=None, outputs=[img_in, img_out, json_out])
demo.launch()
【PP-PicoDet-App-YAML】
APP_Info:
title: PP-PicoDet-App
colorFrom: blue
colorTo: yellow
sdk: gradio
sdk_version: 3.9.1
app_file: app.py
license: apache-2.0
device: cpu
mode: paddle
draw_threshold: 0.5
metric: COCO
use_dynamic_shape: false
arch: GFL
min_subgraph_size: 3
param_path: paddlecv://models/picodet_s_320_coco_lcnet/model.pdiparams
model_path: paddlecv://models/picodet_s_320_coco_lcnet/model.pdmodel
Preprocess:
- interp: 2
keep_ratio: false
target_size:
- 320
- 320
type: Resize
- is_scale: true
mean:
- 0.485
- 0.456
- 0.406
std:
- 0.229
- 0.224
- 0.225
type: NormalizeImage
- type: Permute
label_list:
- person
- bicycle
- car
- motorcycle
- airplane
- bus
- train
- truck
- boat
- traffic light
- fire hydrant
- stop sign
- parking meter
- bench
- bird
- cat
- dog
- horse
- sheep
- cow
- elephant
- bear
- zebra
- giraffe
- backpack
- umbrella
- handbag
- tie
- suitcase
- frisbee
- skis
- snowboard
- sports ball
- kite
- baseball bat
- baseball glove
- skateboard
- surfboard
- tennis racket
- bottle
- wine glass
- cup
- fork
- knife
- spoon
- bowl
- banana
- apple
- sandwich
- orange
- broccoli
- carrot
- hot dog
- pizza
- donut
- cake
- chair
- couch
- potted plant
- bed
- dining table
- toilet
- tv
- laptop
- mouse
- remote
- keyboard
- cell phone
- microwave
- oven
- toaster
- sink
- refrigerator
- book
- clock
- vase
- scissors
- teddy bear
- hair drier
- toothbrush
gradio
opencv-python
paddlepaddle
PyYAML
shapely
scipy
Cython
numpy
setuptools
pillow
import cv2
import os
import numpy as np
import yaml
from paddle.inference import Config, create_predictor, PrecisionType
from PIL import Image
from .download import get_model_path
from .preprocess import preprocess, Resize, NormalizeImage, Permute, PadStride, decode_image
from .visualize import draw_det
class Detector(object):
def __init__(self, model_name):
parent_path = os.path.abspath(os.path.join(__file__, *(['..'] * 2)))
yml_file = os.path.join(parent_path, 'configs/{}.yml'.format(model_name))
with open(yml_file, 'r') as f:
yml_conf = yaml.safe_load(f)
infer_model = get_model_path(yml_conf['model_path'])
infer_params = get_model_path(yml_conf['param_path'])
config = Config(infer_model, infer_params)
device = yml_conf.get('device', 'CPU')
run_mode = yml_conf.get('mode', 'paddle')
cpu_threads = yml_conf.get('cpu_threads', 1)
# default batch size for the TensorRT settings below (batch_size was undefined)
batch_size = yml_conf.get('batch_size', 1)
if device == 'CPU':
config.disable_gpu()
config.set_cpu_math_library_num_threads(cpu_threads)
elif device == 'GPU':
# initial GPU memory(M), device ID
config.enable_use_gpu(200, 0)
# optimize graph and fuse op
config.switch_ir_optim(True)
precision_map = {
'trt_int8': Config.Precision.Int8,
'trt_fp32': Config.Precision.Float32,
'trt_fp16': Config.Precision.Half
}
if run_mode in precision_map.keys():
config.enable_tensorrt_engine(
workspace_size=(1 << 25) * batch_size,
max_batch_size=batch_size,
min_subgraph_size=yml_conf['min_subgraph_size'],
precision_mode=precision_map[run_mode],
use_static=True,
use_calib_mode=False)
if yml_conf['use_dynamic_shape']:
min_input_shape = {
'image': [batch_size, 3, 640, 640],
'scale_factor': [batch_size, 2]
}
max_input_shape = {
'image': [batch_size, 3, 1280, 1280],
'scale_factor': [batch_size, 2]
}
opt_input_shape = {
'image': [batch_size, 3, 1024, 1024],
'scale_factor': [batch_size, 2]
}
config.set_trt_dynamic_shape_info(min_input_shape, max_input_shape,
opt_input_shape)
# disable print log when predict
config.disable_glog_info()
# enable memory optimization
config.enable_memory_optim()
# disable feed and fetch ops, required by zero-copy run
config.switch_use_feed_fetch_ops(False)
self.predictor = create_predictor(config)
self.yml_conf = yml_conf
self.preprocess_ops = self.create_preprocess_ops(yml_conf)
self.input_names = self.predictor.get_input_names()
self.output_names = self.predictor.get_output_names()
self.draw_threshold = yml_conf.get('draw_threshold', 0.5)
self.class_names = yml_conf['label_list']
def create_preprocess_ops(self, yml_conf):
preprocess_ops = []
for op_info in yml_conf['Preprocess']:
new_op_info = op_info.copy()
op_type = new_op_info.pop('type')
preprocess_ops.append(eval(op_type)(**new_op_info))
return preprocess_ops
def create_inputs(self, image_files):
inputs = dict()
im_list, im_info_list = [], []
for im_path in image_files:
im, im_info = preprocess(im_path, self.preprocess_ops)
im_list.append(im)
im_info_list.append(im_info)
inputs['im_shape'] = np.stack([e['im_shape'] for e in im_info_list], axis=0).astype('float32')
inputs['scale_factor'] = np.stack([e['scale_factor'] for e in im_info_list], axis=0).astype('float32')
inputs['image'] = np.stack(im_list, axis=0).astype('float32')
return inputs
def __call__(self, image_file):
inputs = self.create_inputs([image_file])
for name in self.input_names:
input_tensor = self.predictor.get_input_handle(name)
input_tensor.copy_from_cpu(inputs[name])
self.predictor.run()
boxes_tensor = self.predictor.get_output_handle(self.output_names[0])
np_boxes = boxes_tensor.copy_to_cpu()
boxes_num = self.predictor.get_output_handle(self.output_names[1])
np_boxes_num = boxes_num.copy_to_cpu()
if np_boxes_num.sum() <= 0:
np_boxes = np.zeros([0, 6])
if isinstance(image_file, str):
# decode to an np.ndarray so draw_det can call Image.fromarray on it
image = np.array(Image.open(image_file).convert('RGB'))
elif isinstance(image_file, np.ndarray):
image = image_file
expect_boxes = (np_boxes[:, 1] > self.draw_threshold) & (np_boxes[:, 0] > -1)
np_boxes = np_boxes[expect_boxes, :]
image = draw_det(image, np_boxes, self.class_names)
return image, {'bboxes': np_boxes.tolist()}
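For quick reference, here is a minimal usage sketch of the `Detector` class above, assuming the module layout used by the Gradio app (`src/detection.py` plus a `configs/PP-PicoDet.yml`); `demo.jpg` is a placeholder path:

```python
import numpy as np
from src.detection import Detector  # assumes the layout used by app.py above

detector = Detector('PP-PicoDet')     # resolves paddlecv:// weights on first call
vis, result = detector('demo.jpg')    # accepts an image path or an RGB np.ndarray
# each row of result['bboxes'] is [class_id, score, xmin, ymin, xmax, ymax]
print(len(result['bboxes']), np.asarray(vis).shape)
```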
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import os.path as osp
import sys
import yaml
import time
import shutil
import requests
import tqdm
import hashlib
import base64
import binascii
import tarfile
import zipfile
__all__ = [
'get_model_path',
'get_config_path',
'get_dict_path',
]
WEIGHTS_HOME = osp.expanduser("~/.cache/paddlecv/models")
CONFIGS_HOME = osp.expanduser("~/.cache/paddlecv/configs")
DICTS_HOME = osp.expanduser("~/.cache/paddlecv/dicts")
# dict of {dataset_name: (download_info, sub_dirs)}
# download info: [(url, md5sum)]
DOWNLOAD_RETRY_LIMIT = 3
PMP_DOWNLOAD_URL_PREFIX = 'https://bj.bcebos.com/v1/paddle-model-ecology/paddlecv/'
def is_url(path):
"""
Whether path is URL.
Args:
path (string): URL string or not.
"""
return path.startswith('http://') \
or path.startswith('https://') \
or path.startswith('paddlecv://')
def parse_url(url):
url = url.replace("paddlecv://", PMP_DOWNLOAD_URL_PREFIX)
return url
def get_model_path(path):
"""Get model path from WEIGHTS_HOME, if not exists,
download it from url.
"""
if not is_url(path):
return path
url = parse_url(path)
path, _ = get_path(url, WEIGHTS_HOME, path_depth=2)
return path
def get_config_path(path):
"""Get config path from CONFIGS_HOME, if not exists,
download it from url.
"""
if not is_url(path):
return path
url = parse_url(path)
path, _ = get_path(url, CONFIGS_HOME)
return path
def get_dict_path(path):
"""Get config path from CONFIGS_HOME, if not exists,
download it from url.
"""
if not is_url(path):
return path
url = parse_url(path)
path, _ = get_path(url, DICTS_HOME)
return path
def map_path(url, root_dir, path_depth=1):
# parse path after download to decompress under root_dir
assert path_depth > 0, "path_depth should be a positive integer"
dirname = url
for _ in range(path_depth):
dirname = osp.dirname(dirname)
fpath = osp.relpath(url, dirname)
path = osp.join(root_dir, fpath)
dirname = osp.dirname(path)
return path, dirname
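To make the mapping concrete, here is a small trace of how a `paddlecv://` model URI resolves to a cache path via `parse_url` and `map_path` (values worked out by hand from the code above):

```python
url = parse_url('paddlecv://models/picodet_s_320_coco_lcnet/model.pdmodel')
# -> 'https://bj.bcebos.com/v1/paddle-model-ecology/paddlecv/models/picodet_s_320_coco_lcnet/model.pdmodel'
path, dirname = map_path(url, WEIGHTS_HOME, path_depth=2)
# path_depth=2 keeps the last two URL components, so (with ~ expanded):
# path    -> '~/.cache/paddlecv/models/picodet_s_320_coco_lcnet/model.pdmodel'
# dirname -> '~/.cache/paddlecv/models/picodet_s_320_coco_lcnet'
```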
def get_path(url, root_dir, md5sum=None, check_exist=True, path_depth=1):
""" Download from given url to root_dir.
If the file or directory specified by url exists under
root_dir, return the path directly; otherwise download
it from url and return the path.
url (str): download url
root_dir (str): root dir for downloading, it should be
WEIGHTS_HOME
md5sum (str): md5 sum of download package
"""
# parse path after download to decompress under root_dir
fullpath, dirname = map_path(url, root_dir, path_depth)
if osp.exists(fullpath) and check_exist:
if not osp.isfile(fullpath) or \
_check_exist_file_md5(fullpath, md5sum, url):
return fullpath, True
else:
os.remove(fullpath)
fullname = _download(url, dirname, md5sum)
return fullpath, False
def _download(url, path, md5sum=None):
"""
Download from url, save to path.
url (str): download url
path (str): download to given path
"""
if not osp.exists(path):
os.makedirs(path)
fname = osp.split(url)[-1]
fullname = osp.join(path, fname)
retry_cnt = 0
while not (osp.exists(fullname) and _check_exist_file_md5(fullname, md5sum,
url)):
if retry_cnt < DOWNLOAD_RETRY_LIMIT:
retry_cnt += 1
else:
raise RuntimeError("Download from {} failed. "
"Retry limit reached".format(url))
# NOTE: windows path join may incur \, which is invalid in url
if sys.platform == "win32":
url = url.replace('\\', '/')
req = requests.get(url, stream=True)
if req.status_code != 200:
raise RuntimeError("Downloading from {} failed with code "
"{}!".format(url, req.status_code))
# To protect against interrupted downloads, download to
# tmp_fullname first, then move tmp_fullname to fullname
# after the download finishes
tmp_fullname = fullname + "_tmp"
total_size = req.headers.get('content-length')
with open(tmp_fullname, 'wb') as f:
if total_size:
for chunk in tqdm.tqdm(
req.iter_content(chunk_size=1024),
total=(int(total_size) + 1023) // 1024,
unit='KB'):
f.write(chunk)
else:
for chunk in req.iter_content(chunk_size=1024):
if chunk:
f.write(chunk)
shutil.move(tmp_fullname, fullname)
return fullname
def _check_exist_file_md5(filename, md5sum, url):
# if md5sum is None and the file to check is a model file,
# read the md5sum from the url header and check it; otherwise check md5sum directly
return _md5check_from_url(filename, url) if md5sum is None \
and filename.endswith('pdparams') \
else _md5check(filename, md5sum)
def _md5check_from_url(filename, url):
# For models hosted at bcebos URLs, the MD5 value is contained
# in the response header as 'content-md5'
req = requests.get(url, stream=True)
content_md5 = req.headers.get('content-md5')
req.close()
if not content_md5 or _md5check(
filename,
binascii.hexlify(base64.b64decode(content_md5.strip('"'))).decode(
)):
return True
else:
return False
def _md5check(fullname, md5sum=None):
if md5sum is None:
return True
md5 = hashlib.md5()
with open(fullname, 'rb') as f:
for chunk in iter(lambda: f.read(4096), b""):
md5.update(chunk)
calc_md5sum = md5.hexdigest()
if calc_md5sum != md5sum:
return False
return True
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import cv2
import numpy as np
def decode_image(im_file, im_info):
"""read rgb image
Args:
im_file (str|np.ndarray): input can be image path or np.ndarray
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
if isinstance(im_file, str):
with open(im_file, 'rb') as f:
im_read = f.read()
data = np.frombuffer(im_read, dtype='uint8')
im = cv2.imdecode(data, 1) # BGR mode, but need RGB mode
im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
else:
im = im_file
im_info['im_shape'] = np.array(im.shape[:2], dtype=np.float32)
im_info['scale_factor'] = np.array([1., 1.], dtype=np.float32)
return im, im_info
class Resize(object):
"""resize image by target_size and max_size
Args:
target_size (int): the target size of image
keep_ratio (bool): whether keep_ratio or not, default true
interp (int): method of resize
"""
def __init__(self, target_size, keep_ratio=True, interp=cv2.INTER_LINEAR):
if isinstance(target_size, int):
target_size = [target_size, target_size]
self.target_size = target_size
self.keep_ratio = keep_ratio
self.interp = interp
def __call__(self, im, im_info):
"""
Args:
im (np.ndarray): image (np.ndarray)
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
assert len(self.target_size) == 2
assert self.target_size[0] > 0 and self.target_size[1] > 0
im_channel = im.shape[2]
im_scale_y, im_scale_x = self.generate_scale(im)
im = cv2.resize(
im,
None,
None,
fx=im_scale_x,
fy=im_scale_y,
interpolation=self.interp)
im_info['im_shape'] = np.array(im.shape[:2]).astype('float32')
im_info['scale_factor'] = np.array(
[im_scale_y, im_scale_x]).astype('float32')
return im, im_info
def generate_scale(self, im):
"""
Args:
im (np.ndarray): image (np.ndarray)
Returns:
im_scale_x: the resize ratio of X
im_scale_y: the resize ratio of Y
"""
origin_shape = im.shape[:2]
im_c = im.shape[2]
if self.keep_ratio:
im_size_min = np.min(origin_shape)
im_size_max = np.max(origin_shape)
target_size_min = np.min(self.target_size)
target_size_max = np.max(self.target_size)
im_scale = float(target_size_min) / float(im_size_min)
if np.round(im_scale * im_size_max) > target_size_max:
im_scale = float(target_size_max) / float(im_size_max)
im_scale_x = im_scale
im_scale_y = im_scale
else:
resize_h, resize_w = self.target_size
im_scale_y = resize_h / float(origin_shape[0])
im_scale_x = resize_w / float(origin_shape[1])
return im_scale_y, im_scale_x
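A quick sanity check of the `keep_ratio` branch (a sketch, not part of the original file): for a 480x640 image with `target_size=320`, the short-side scale 320/480 would push the long side past 320, so the long-side scale is chosen instead.

```python
import numpy as np

r = Resize(target_size=320, keep_ratio=True)
im = np.zeros((480, 640, 3), dtype=np.uint8)
print(r.generate_scale(im))   # (0.5, 0.5)
```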
class NormalizeImage(object):
"""normalize image
Args:
mean (list): per-channel mean subtracted from the image
std (list): per-channel std the image is divided by
is_scale (bool): whether to scale the image by 1/255 first
norm_type (str): type in ['mean_std', 'none']
"""
def __init__(self, mean, std, is_scale=True, norm_type='mean_std'):
self.mean = mean
self.std = std
self.is_scale = is_scale
self.norm_type = norm_type
def __call__(self, im, im_info):
"""
Args:
im (np.ndarray): image (np.ndarray)
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
im = im.astype(np.float32, copy=False)
if self.is_scale:
scale = 1.0 / 255.0
im *= scale
if self.norm_type == 'mean_std':
mean = np.array(self.mean)[np.newaxis, np.newaxis, :]
std = np.array(self.std)[np.newaxis, np.newaxis, :]
im -= mean
im /= std
return im, im_info
class Permute(object):
"""permute image
Args:
to_bgr (bool): whether convert RGB to BGR
channel_first (bool): whether convert HWC to CHW
"""
def __init__(self):
super(Permute, self).__init__()
def __call__(self, im, im_info):
"""
Args:
im (np.ndarray): image (np.ndarray)
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
im = im.transpose((2, 0, 1)).copy()
return im, im_info
class PadStride(object):
""" padding image for model with FPN, instead PadBatch(pad_to_stride) in original config
Args:
stride (bool): model with FPN need image shape % stride == 0
"""
def __init__(self, stride=0):
self.coarsest_stride = stride
def __call__(self, im, im_info):
"""
Args:
im (np.ndarray): image (np.ndarray)
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
coarsest_stride = self.coarsest_stride
if coarsest_stride <= 0:
return im, im_info
im_c, im_h, im_w = im.shape
pad_h = int(np.ceil(float(im_h) / coarsest_stride) * coarsest_stride)
pad_w = int(np.ceil(float(im_w) / coarsest_stride) * coarsest_stride)
padding_im = np.zeros((im_c, pad_h, pad_w), dtype=np.float32)
padding_im[:, :im_h, :im_w] = im
return padding_im, im_info
def preprocess(im, preprocess_ops):
# process image by preprocess_ops
im_info = {
'scale_factor': np.array(
[1., 1.], dtype=np.float32),
'im_shape': None,
}
im, im_info = decode_image(im, im_info)
for operator in preprocess_ops:
im, im_info = operator(im, im_info)
return im, im_info
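The snippet below sketches how these ops chain together outside the `Detector`, mirroring the `Preprocess` section of the PP-PicoDet YAML config earlier in this commit (the random image is purely illustrative):

```python
import numpy as np

ops = [
    Resize(target_size=[320, 320], keep_ratio=False, interp=2),
    NormalizeImage(mean=[0.485, 0.456, 0.406],
                   std=[0.229, 0.224, 0.225], is_scale=True),
    Permute(),
]
dummy = np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)
im, im_info = preprocess(dummy, ops)
print(im.shape)                  # (3, 320, 320)
print(im_info['scale_factor'])   # [0.6666667 0.5], i.e. [scale_y, scale_x]
```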
import numpy as np
from PIL import Image, ImageDraw, ImageFile
def get_color_map_list(num_classes):
"""
Args:
num_classes (int): number of class
Returns:
color_map (list): RGB color list
"""
color_map = num_classes * [0, 0, 0]
for i in range(0, num_classes):
j = 0
lab = i
while lab:
color_map[i * 3] |= (((lab >> 0) & 1) << (7 - j))
color_map[i * 3 + 1] |= (((lab >> 1) & 1) << (7 - j))
color_map[i * 3 + 2] |= (((lab >> 2) & 1) << (7 - j))
j += 1
lab >>= 3
color_map = [color_map[i:i + 3] for i in range(0, len(color_map), 3)]
return color_map
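The bit-interleaving above reproduces the PASCAL-VOC-style palette, so each class id gets a stable color; for example:

```python
print(get_color_map_list(3))   # [[0, 0, 0], [128, 0, 0], [0, 128, 0]]
```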
def draw_det(image, dt_bboxes, name_set):
im = Image.fromarray(image)
draw_thickness = max(1, min(im.size) // 320)  # at least 1 px for small images
draw = ImageDraw.Draw(im)
color_list = get_color_map_list(len(name_set))
for (cls_id, score, xmin, ymin, xmax, ymax) in dt_bboxes:
cls_id = int(cls_id)
name = name_set[cls_id]
color = tuple(color_list[cls_id])
# draw bbox
draw.line(
[(xmin, ymin), (xmin, ymax), (xmax, ymax), (xmax, ymin),
(xmin, ymin)],
width=draw_thickness,
fill=color)
# draw label
text = "{} {:.4f}".format(name, score)
box = draw.textbbox((xmin, ymin), text, anchor='lt')
draw.rectangle(box, fill=color)
draw.text((box[0], box[1]), text, fill=(255, 255, 255))
image = np.array(im)
return image
## Benchmark
| Model | Input size | mAP<sup>val<br>0.5:0.95 | mAP<sup>val<br>0.5 | Params<br><sup>(M) | FLOPS<br><sup>(G) | Latency<sup><small>[CPU](#latency)</small><sup><br><sup>(ms) | Latency<sup><small>[Lite](#latency)</small><sup><br><sup>(ms) |
| :-------- | :--------: | :---------------------: | :----------------: | :----------------: | :---------------: | :-----------------------------: | :-----------------------------: |
| PicoDet-XS | 320*320 | 23.5 | 36.1 | 0.70 | 0.67 | 3.9ms | 7.81ms |
| PicoDet-XS | 416*416 | 26.2 | 39.3 | 0.70 | 1.13 | 6.1ms | 12.38ms |
| PicoDet-S | 320*320 | 29.1 | 43.4 | 1.18 | 0.97 | 4.8ms | 9.56ms |
| PicoDet-S | 416*416 | 32.5 | 47.6 | 1.18 | 1.65 | 6.6ms | 15.20ms |
| PicoDet-M | 320*320 | 34.4 | 50.0 | 3.46 | 2.57 | 8.2ms | 17.68ms |
| PicoDet-M | 416*416 | 37.5 | 53.4 | 3.46 | 4.34 | 12.7ms | 28.39ms |
| PicoDet-L | 320*320 | 36.1 | 52.0 | 5.80 | 4.20 | 11.5ms | 25.21ms |
| PicoDet-L | 416*416 | 39.4 | 55.7 | 5.80 | 7.10 | 20.7ms | 42.23ms |
| PicoDet-L | 640*640 | 42.6 | 59.2 | 5.80 | 16.81 | 62.5ms | 108.1ms |
**Notes:**
- Latency tests: all models were tested on an `Intel Core i7 10750H` CPU and a `Snapdragon 865 (4xA77+4xA55)` ARM CPU (4 threads, FP16 inference). In the table above, entries marked `CPU` were tested with OpenVINO and entries marked `Lite` with [Paddle Lite](https://github.com/PaddlePaddle/Paddle-Lite).
- PicoDet is trained on COCO train2017 and validated on COCO val2017, using 4 GPUs; all pretrained models in the table above were trained with the released default configs.
- Benchmark tests: when benchmarking speed, post-processing is not included in the exported network; set `-o export.benchmark=True` or manually modify [runtime.yml](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/configs/runtime.yml#L12).
## Benchmark
| Model | Input size | mAP<sup>val<br>0.5:0.95 | mAP<sup>val<br>0.5 | Params<br><sup>(M) | FLOPS<br><sup>(G) | Latency<sup><small>[CPU](#latency)</small><sup><br><sup>(ms) | Latency<sup><small>[Lite](#latency)</small><sup><br><sup>(ms) |
| :-------- | :--------: | :---------------------: | :----------------: | :----------------: | :---------------: | :-----------------------------: | :-----------------------------: |
| PicoDet-XS | 320*320 | 23.5 | 36.1 | 0.70 | 0.67 | 3.9ms | 7.81ms |
| PicoDet-XS | 416*416 | 26.2 | 39.3 | 0.70 | 1.13 | 6.1ms | 12.38ms |
| PicoDet-S | 320*320 | 29.1 | 43.4 | 1.18 | 0.97 | 4.8ms | 9.56ms |
| PicoDet-S | 416*416 | 32.5 | 47.6 | 1.18 | 1.65 | 6.6ms | 15.20ms |
| PicoDet-M | 320*320 | 34.4 | 50.0 | 3.46 | 2.57 | 8.2ms | 17.68ms |
| PicoDet-M | 416*416 | 37.5 | 53.4 | 3.46 | 4.34 | 12.7ms | 28.39ms |
| PicoDet-L | 320*320 | 36.1 | 52.0 | 5.80 | 4.20 | 11.5ms | 25.21ms |
| PicoDet-L | 416*416 | 39.4 | 55.7 | 5.80 | 7.10 | 20.7ms | 42.23ms |
| PicoDet-L | 640*640 | 42.6 | 59.2 | 5.80 | 16.81 | 62.5ms | 108.1ms |
**Notes:**
- Latency: all models are tested on an `Intel Core i7 10750H` CPU with MKL-DNN (12 threads) and on a `Qualcomm Snapdragon 865 (4xA77+4xA55)` ARM v8 CPU (4 threads, FP16). In the table above, CPU latency is measured with Paddle Inference and mobile (`Lite`) latency with [Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite).
- PicoDet is trained on the COCO train2017 dataset and evaluated on COCO val2017. All PicoDet checkpoints are trained on 4 GPUs with the default settings and hyperparameters.
- Benchmark test: post-processing is not included in the exported model when testing the speed benchmark; you need to set `-o export.benchmark=True` or manually modify [runtime.yml](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/configs/runtime.yml#L12).
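For reference, a hedged sketch of the export command the last note refers to, following the flag style of PaddleDetection's `tools/export_model.py` (run inside the PaddleDetection repo; the weights URL is one of the checkpoints listed in the model zoo below):

```python
# notebook-style cell (drop the leading ! in a terminal); export.benchmark=True
# excludes post-processing from the exported graph, as required for benchmarking
!python tools/export_model.py -c configs/picodet/picodet_s_320_coco_lcnet.yml \
        -o weights=https://paddledet.bj.bcebos.com/models/picodet_s_320_coco_lcnet.pdparams \
        export.benchmark=True
```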
## Model Zoo
### Base Model Zoo
| Model | Input size | Weight | Config | Inference Model |
| :-- | :-----: | :-----: | :----: | :-----: |
| PicoDet-XS | 320*320 | [model](https://paddledet.bj.bcebos.com/models/picodet_xs_320_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_xs_320_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_xs_320_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_xs_320_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_xs_320_coco_lcnet_non_postprocess.tar) |
| PicoDet-XS | 416*416 | [model](https://paddledet.bj.bcebos.com/models/picodet_xs_416_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_xs_416_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_xs_416_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_xs_416_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_xs_416_coco_lcnet_non_postprocess.tar) |
| PicoDet-S | 320*320 | [model](https://paddledet.bj.bcebos.com/models/picodet_s_320_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_s_320_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_s_320_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_320_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_320_coco_lcnet_non_postprocess.tar) |
| PicoDet-S | 416*416 | [model](https://paddledet.bj.bcebos.com/models/picodet_s_416_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_s_416_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_s_416_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_416_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_416_coco_lcnet_non_postprocess.tar) |
| PicoDet-M | 320*320 | [model](https://paddledet.bj.bcebos.com/models/picodet_m_320_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_m_320_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_m_320_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_m_320_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_m_320_coco_lcnet_non_postprocess.tar) |
| PicoDet-M | 416*416 | [model](https://paddledet.bj.bcebos.com/models/picodet_m_416_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_m_416_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_m_416_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_m_416_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_m_416_coco_lcnet_non_postprocess.tar) |
| PicoDet-L | 320*320 | [model](https://paddledet.bj.bcebos.com/models/picodet_l_320_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_l_320_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_l_320_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_320_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_320_coco_lcnet_non_postprocess.tar) |
| PicoDet-L | 416*416 | [model](https://paddledet.bj.bcebos.com/models/picodet_l_416_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_l_416_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_l_416_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_416_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_416_coco_lcnet_non_postprocess.tar) |
| PicoDet-L | 640*640 | [model](https://paddledet.bj.bcebos.com/models/picodet_l_640_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_l_640_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_l_640_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_640_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_640_coco_lcnet_non_postprocess.tar) |
### Model Zoo for Deployment
| Model | Input size | ONNX | Paddle Lite(fp32) | Paddle Lite(fp16) |
| :-------- | :--------: | :---------------------: | :----------------: | :----------------: |
| PicoDet-XS | 320*320 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_xs_320_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_xs_320_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_xs_320_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_xs_320_coco_lcnet_fp16.tar) |
| PicoDet-XS | 416*416 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_xs_416_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_xs_416_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_xs_416_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_xs_416_coco_lcnet_fp16.tar) |
| PicoDet-S | 320*320 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_320_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_320_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_320_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_320_coco_lcnet_fp16.tar) |
| PicoDet-S | 416*416 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_416_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_416_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_416_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_416_coco_lcnet_fp16.tar) |
| PicoDet-M | 320*320 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_320_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_320_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_320_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_320_coco_lcnet_fp16.tar) |
| PicoDet-M | 416*416 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_416_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_416_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_416_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_416_coco_lcnet_fp16.tar) |
| PicoDet-L | 320*320 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_320_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_320_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_320_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_320_coco_lcnet_fp16.tar) |
| PicoDet-L | 416*416 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_416_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_416_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_416_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_416_coco_lcnet_fp16.tar) |
| PicoDet-L | 640*640 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_640_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_640_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_640_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_640_coco_lcnet_fp16.tar) |
## Model Zoo
### Base Model Zoo
| Model | Input size | Weight | Config | Inference Model |
| :-------- | :--------: | :----: | :-----: | :-------------: |
| PicoDet-XS | 320*320 | [model](https://paddledet.bj.bcebos.com/models/picodet_xs_320_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_xs_320_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_xs_320_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_xs_320_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_xs_320_coco_lcnet_non_postprocess.tar) |
| PicoDet-XS | 416*416 | [model](https://paddledet.bj.bcebos.com/models/picodet_xs_416_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_xs_416_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_xs_416_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_xs_416_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_xs_416_coco_lcnet_non_postprocess.tar) |
| PicoDet-S | 320*320 | [model](https://paddledet.bj.bcebos.com/models/picodet_s_320_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_s_320_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_s_320_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_320_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_320_coco_lcnet_non_postprocess.tar) |
| PicoDet-S | 416*416 | [model](https://paddledet.bj.bcebos.com/models/picodet_s_416_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_s_416_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_s_416_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_416_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_416_coco_lcnet_non_postprocess.tar) |
| PicoDet-M | 320*320 | [model](https://paddledet.bj.bcebos.com/models/picodet_m_320_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_m_320_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_m_320_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_m_320_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_m_320_coco_lcnet_non_postprocess.tar) |
| PicoDet-M | 416*416 | [model](https://paddledet.bj.bcebos.com/models/picodet_m_416_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_m_416_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_m_416_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_m_416_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_m_416_coco_lcnet_non_postprocess.tar) |
| PicoDet-L | 320*320 | [model](https://paddledet.bj.bcebos.com/models/picodet_l_320_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_l_320_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_l_320_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_320_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_320_coco_lcnet_non_postprocess.tar) |
| PicoDet-L | 416*416 | [model](https://paddledet.bj.bcebos.com/models/picodet_l_416_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_l_416_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_l_416_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_416_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_416_coco_lcnet_non_postprocess.tar) |
| PicoDet-L | 640*640 | [model](https://paddledet.bj.bcebos.com/models/picodet_l_640_coco_lcnet.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_l_640_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_l_640_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_640_coco_lcnet.tar) &#124; [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_640_coco_lcnet_non_postprocess.tar) |
### Model Zoo for deployment
| Model | Input size | ONNX | Paddle Lite(fp32) | Paddle Lite(fp16) |
| :-------- | :--------: | :---------------------: | :----------------: | :----------------: |
| PicoDet-XS | 320*320 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_xs_320_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_xs_320_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_xs_320_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_xs_320_coco_lcnet_fp16.tar) |
| PicoDet-XS | 416*416 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_xs_416_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_xs_416_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_xs_416_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_xs_416_coco_lcnet_fp16.tar) |
| PicoDet-S | 320*320 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_320_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_320_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_320_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_320_coco_lcnet_fp16.tar) |
| PicoDet-S | 416*416 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_416_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_416_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_416_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_416_coco_lcnet_fp16.tar) |
| PicoDet-M | 320*320 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_320_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_320_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_320_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_320_coco_lcnet_fp16.tar) |
| PicoDet-M | 416*416 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_416_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_416_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_416_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_416_coco_lcnet_fp16.tar) |
| PicoDet-L | 320*320 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_320_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_320_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_320_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_320_coco_lcnet_fp16.tar) |
| PicoDet-L | 416*416 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_416_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_416_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_416_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_416_coco_lcnet_fp16.tar) |
| PicoDet-L | 640*640 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_640_lcnet_postprocessed.onnx) &#124; [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_640_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_640_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_640_coco_lcnet_fp16.tar) |
---
Model_Info:
name: "PP-PicoDet"
description: "PP-PicoDet是轻量级系列模型,在移动端具有卓越的性能"
description_en: "PP-PicoDet has a series of lightweight models, which are very suitable\
\ for deployment on mobile or CPU"
icon: "@后续UE统一设计之后,会存到bos上某个位置"
from_repo: "PaddleDetection"
Task:
- tag_en: "Computer Vision"
tag: "计算机视觉"
sub_tag_en: "Object Detection"
sub_tag: "目标检测"
Example:
- tag_en: "Intelligent City"
tag: "智慧城市"
sub_tag_en: "Road Hazard Detection"
title: "路面病害检测"
sub_tag: "路面危害检测"
url: "https://aistudio.baidu.com/aistudio/projectdetail/3898651"
Datasets: "COCO test-dev2017, COCO train2017, COCO val2017, Pascal VOC"
Pulisher: "Baidu"
License: "apache.2.0"
Paper:
- title: "PP-PicoDet: A Better Real-Time Object Detector on Mobile Devices"
url: "https://arxiv.org/abs/2111.00902"
IfTraining: 1
IfOnlineDemo: 1
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. 模型简介\n",
"PaddleDetection中提出了全新的轻量级系列模型`PP-PicoDet`,在移动端具有卓越的性能,成为全新SOTA轻量级模型。\n",
"\n",
"PP-PicoDet模型有如下特点:\n",
"\n",
"- 🌟 更高的mAP: 第一个在1M参数量之内`mAP(0.5:0.95)`超越**30+**(输入416像素时)。\n",
"- 🚀 更快的预测速度: 网络预测在ARM CPU下可达150FPS。\n",
"- 😊 部署友好: 支持PaddleLite/MNN/NCNN/OpenVINO等预测库,支持转出ONNX,提供了C++/Python/Android的demo。\n",
"- 😍 先进的算法: 我们在现有SOTA算法中进行了创新, 包括:ESNet, CSP-PAN, SimOTA等等。\n",
"\n",
"关于PP-Picodet的更多细节可以参考我们的[官方文档](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/configs/picodet/README.md)。"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. 模型效果\n",
"PP-Picodet与其他轻量级模型的精度速度对比图如下所示:\n",
"<div align=\"center\">\n",
" <img src=\"https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/docs/images/picodet_map.png\" width=500 />\n",
"</div>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. 模型如何使用\n",
"首先克隆PaddleDetection,并将数据集存放在`dataset/coco/`目录下面"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"%cd ~/work\n",
"!git clone https://gitee.com/paddlepaddle/PaddleDetection\n",
"%cd PaddleDetection"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 3.1 训练\n",
"执行以下命令训练PP-Picodet"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"# 单卡训练\n",
"!CUDA_VISIBLE_DEVICES=0 python tools/train.py -c configs/picodet/picodet_s_320_coco_lcnet.yml --eval\n",
"\n",
"# 多卡训练\n",
"!CUDA_VISIBLE_DEVICES=0,1,2,3 python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/picodet/picodet_s_320_coco_lcnet.yml --eval"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**注意:** \n",
"- PicoDet所有模型均由4卡GPU训练得到,如果改变训练GPU卡数,需要按线性比例缩放学习率`base_lr`。"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 3.2 部署\n",
"PP-Picodet支持多种方式部署,具体可以参考[PP-Picodet部署](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/configs/picodet/README.md#%E9%83%A8%E7%BD%B2)。"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. 模型原理\n",
"PP-Picodet的整体结构图如下所示:\n",
"<div align=\"center\">\n",
" <img src=\"https://bj.bcebos.com/v1/paddledet/modelcenter/images/PP-Picodet-arch.png\" width=70% />\n",
"</div>\n",
"PP-Picodet由以下方法组成:\n",
"- 增强的ShuffleNet-ESNet\n",
"- CSP-PAN\n",
"- SimOTA匹配策略\n",
"\n",
"更多细节可以参考我们的技术报告:https://arxiv.org/abs/2111.00902"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5. 注意事项\n",
"**所有的命令默认运行在AI Studio的`jupyter`上, 如果运行在终端上,去掉命令开头的符号%或!**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6. 相关论文及引用信息\n",
"```\n",
"@article{yu2021pp,\n",
" title={PP-PicoDet: A Better Real-Time Object Detector on Mobile Devices},\n",
" author={Yu, Guanghua and Chang, Qinyao and Lv, Wenyu and Xu, Chang and Cui, Cheng and Ji, Wei and Dang, Qingqing and Deng, Kaipeng and Wang, Guanzhong and Du, Yuning and others},\n",
" journal={arXiv preprint arXiv:2111.00902},\n",
" year={2021}\n",
"}\n",
"```"
]
}
],
"metadata": {
"language_info": {
"name": "python"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Introduction\n",
"We developed a series of lightweight models, named PP-PicoDet. Because of the excellent performance, our models are very suitable for deployment on mobile or CPU.\n",
"\n",
"The PP-PicoDet model has the following characteristics:\n",
"- 🌟 Higher mAP: the **first** object detectors that surpass mAP(0.5:0.95) **30+** within 1M parameters when the input size is 416.\n",
"- 🚀 Faster latency: 150FPS on mobile ARM CPU.\n",
"- 😊 Deploy friendly: support PaddleLite/MNN/NCNN/OpenVINO and provide C++/Python/Android implementation.\n",
"- 😍 Advanced algorithm: use the most advanced algorithms and offer innovation, such as ESNet, CSP-PAN, SimOTA with VFL, etc.\n",
"\n",
"For more details, please refer to [official documentation](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/configs/picodet/README_en.md)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Model Effects\n",
"The accuracy and speed comparison of PP-Picodet and other lightweight models is shown below:\n",
"<div align=\"center\">\n",
" <img src=\"https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/docs/images/picodet_map.png\" width=500 />\n",
"</div>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. How to use the model\n",
"Clone PaddleDetection firstly and put the COCO-style dataset in `dataset/coco`"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"%cd ~/work\n",
"!git clone https://gitee.com/paddlepaddle/PaddleDetection\n",
"%cd PaddleDetection"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 3.1 Training\n",
"Training PP-Picodet with following command"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"# training with single GPU\n",
"!CUDA_VISIBLE_DEVICES=0 python tools/train.py -c configs/picodet/picodet_s_320_coco_lcnet.yml --eval\n",
"\n",
"# training with mutiple GPUs\n",
"!CUDA_VISIBLE_DEVICES=0,1,2,3 python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/picodet/picodet_s_320_coco_lcnet.yml --eval"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Notes:**\n",
"- All models of PicoDet are trained by 4 GPUs. If the number of GPUs is changed, the learning rate `base_lr` needs to be scaled linearly."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 3.2 Deployment\n",
"PP-Picodet supports multiple deployment methods, please refer to [PP-Picodet deployment](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/configs/picodet/README_en.md#deployment) for details."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. Model principle\n",
"The overall structure of PP-Picodet is shown below:\n",
"<div align=\"center\">\n",
" <img src=\"https://bj.bcebos.com/v1/paddledet/modelcenter/images/PP-Picodet-arch.png\" width=70% />\n",
"</div>\n",
"PP-Picodet is composed of following methods:\n",
"- Enhanced ShuffleNet-ESNet\n",
"- CSP-PAN\n",
"- SimOTA label assignment\n",
"\n",
"For more details, please refer to our technical report: https://arxiv.org/abs/2111.00902"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5. Attention\n",
"**All commands run on AI Studio's `jupyter` by default. If running on a terminal, remove the % or ! at the beginning of the command.**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6. Related papers and citations\n",
"```\n",
"@article{yu2021pp,\n",
" title={PP-PicoDet: A Better Real-Time Object Detector on Mobile Devices},\n",
" author={Yu, Guanghua and Chang, Qinyao and Lv, Wenyu and Xu, Chang and Cui, Cheng and Ji, Wei and Dang, Qingqing and Deng, Kaipeng and Wang, Guanzhong and Du, Yuning and others},\n",
" journal={arXiv preprint arXiv:2111.00902},\n",
" year={2021}\n",
"}\n",
"```"
]
}
],
"metadata": {
"language_info": {
"name": "python"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}
import gradio as gr
import numpy as np
import os
from src.detection import Detector
# UGC: Define the inference fn() for your models
def model_inference(image):
image, json_out = Detector('PP-YOLO')(image)
return image, json_out
def clear_all():
return None, None, None
with gr.Blocks() as demo:
gr.Markdown("Objective Detection")
with gr.Column(scale=1, min_width=100):
img_in = gr.Image(label="Input").style(height=200)
with gr.Row():
btn1 = gr.Button("Clear")
btn2 = gr.Button("Submit")
img_out = gr.Image(label="Output").style(height=200)
json_out = gr.JSON(label="jsonOutput")
btn2.click(fn=model_inference, inputs=img_in, outputs=[img_out, json_out])
btn1.click(fn=clear_all, inputs=None, outputs=[img_in, img_out, json_out])
demo.launch()
【PP-YOLO-App-YAML】
APP_Info:
title: PP-YOLO-App
colorFrom: blue
colorTo: yellow
sdk: gradio
sdk_version: 3.9.1
app_file: app.py
license: apache-2.0
device: cpu
mode: paddle
draw_threshold: 0.5
metric: COCO
use_dynamic_shape: false
arch: YOLO
min_subgraph_size: 3
param_path: paddlecv://models/ppyolo_r50vd_dcn_2x_coco/model.pdiparams
model_path: paddlecv://models/ppyolo_r50vd_dcn_2x_coco/model.pdmodel
Preprocess:
- interp: 2
keep_ratio: false
target_size:
- 608
- 608
type: Resize
- is_scale: true
mean:
- 0.485
- 0.456
- 0.406
std:
- 0.229
- 0.224
- 0.225
type: NormalizeImage
- type: Permute
label_list:
- person
- bicycle
- car
- motorcycle
- airplane
- bus
- train
- truck
- boat
- traffic light
- fire hydrant
- stop sign
- parking meter
- bench
- bird
- cat
- dog
- horse
- sheep
- cow
- elephant
- bear
- zebra
- giraffe
- backpack
- umbrella
- handbag
- tie
- suitcase
- frisbee
- skis
- snowboard
- sports ball
- kite
- baseball bat
- baseball glove
- skateboard
- surfboard
- tennis racket
- bottle
- wine glass
- cup
- fork
- knife
- spoon
- bowl
- banana
- apple
- sandwich
- orange
- broccoli
- carrot
- hot dog
- pizza
- donut
- cake
- chair
- couch
- potted plant
- bed
- dining table
- toilet
- tv
- laptop
- mouse
- remote
- keyboard
- cell phone
- microwave
- oven
- toaster
- sink
- refrigerator
- book
- clock
- vase
- scissors
- teddy bear
- hair drier
- toothbrush
gradio
opencv-python
paddlepaddle
PyYAML
shapely
scipy
Cython
numpy
setuptools
pillow
import cv2
import os
import numpy as np
import yaml
from paddle.inference import Config, create_predictor, PrecisionType
from PIL import Image
from .download import get_model_path
from .preprocess import preprocess, Resize, NormalizeImage, Permute, PadStride, decode_image
from .visualize import draw_det
class Detector(object):
def __init__(self, model_name):
parent_path = os.path.abspath(os.path.join(__file__, *(['..'] * 2)))
yml_file = os.path.join(parent_path, 'configs/{}.yml'.format(model_name))
with open(yml_file, 'r') as f:
yml_conf = yaml.safe_load(f)
infer_model = get_model_path(yml_conf['model_path'])
infer_params = get_model_path(yml_conf['param_path'])
config = Config(infer_model, infer_params)
device = yml_conf.get('device', 'CPU')
run_mode = yml_conf.get('mode', 'paddle')
cpu_threads = yml_conf.get('cpu_threads', 1)
# default batch size for the TensorRT settings below (batch_size was undefined)
batch_size = yml_conf.get('batch_size', 1)
if device == 'CPU':
config.disable_gpu()
config.set_cpu_math_library_num_threads(cpu_threads)
elif device == 'GPU':
# initial GPU memory(M), device ID
config.enable_use_gpu(200, 0)
# optimize graph and fuse op
config.switch_ir_optim(True)
precision_map = {
'trt_int8': Config.Precision.Int8,
'trt_fp32': Config.Precision.Float32,
'trt_fp16': Config.Precision.Half
}
if run_mode in precision_map.keys():
config.enable_tensorrt_engine(
workspace_size=(1 << 25) * batch_size,
max_batch_size=batch_size,
min_subgraph_size=yml_conf['min_subgraph_size'],
precision_mode=precision_map[run_mode],
use_static=True,
use_calib_mode=False)
if yml_conf['use_dynamic_shape']:
min_input_shape = {
'image': [batch_size, 3, 640, 640],
'scale_factor': [batch_size, 2]
}
max_input_shape = {
'image': [batch_size, 3, 1280, 1280],
'scale_factor': [batch_size, 2]
}
opt_input_shape = {
'image': [batch_size, 3, 1024, 1024],
'scale_factor': [batch_size, 2]
}
config.set_trt_dynamic_shape_info(min_input_shape, max_input_shape,
opt_input_shape)
# disable print log when predict
config.disable_glog_info()
# enable memory optimization
config.enable_memory_optim()
# disable feed and fetch ops, required by zero-copy run
config.switch_use_feed_fetch_ops(False)
self.predictor = create_predictor(config)
self.yml_conf = yml_conf
self.preprocess_ops = self.create_preprocess_ops(yml_conf)
self.input_names = self.predictor.get_input_names()
self.output_names = self.predictor.get_output_names()
self.draw_threshold = yml_conf.get('draw_threshold', 0.5)
self.class_names = yml_conf['label_list']
def create_preprocess_ops(self, yml_conf):
preprocess_ops = []
for op_info in yml_conf['Preprocess']:
new_op_info = op_info.copy()
op_type = new_op_info.pop('type')
preprocess_ops.append(eval(op_type)(**new_op_info))
return preprocess_ops
def create_inputs(self, image_files):
inputs = dict()
im_list, im_info_list = [], []
for im_path in image_files:
im, im_info = preprocess(im_path, self.preprocess_ops)
im_list.append(im)
im_info_list.append(im_info)
inputs['im_shape'] = np.stack([e['im_shape'] for e in im_info_list], axis=0).astype('float32')
inputs['scale_factor'] = np.stack([e['scale_factor'] for e in im_info_list], axis=0).astype('float32')
inputs['image'] = np.stack(im_list, axis=0).astype('float32')
return inputs
def __call__(self, image_file):
inputs = self.create_inputs([image_file])
for name in self.input_names:
input_tensor = self.predictor.get_input_handle(name)
input_tensor.copy_from_cpu(inputs[name])
self.predictor.run()
boxes_tensor = self.predictor.get_output_handle(self.output_names[0])
np_boxes = boxes_tensor.copy_to_cpu()
boxes_num = self.predictor.get_output_handle(self.output_names[1])
np_boxes_num = boxes_num.copy_to_cpu()
if np_boxes_num.sum() <= 0:
np_boxes = np.zeros([0, 6])
if isinstance(image_file, str):
# decode to an np.ndarray so draw_det can call Image.fromarray on it
image = np.array(Image.open(image_file).convert('RGB'))
elif isinstance(image_file, np.ndarray):
image = image_file
expect_boxes = (np_boxes[:, 1] > self.draw_threshold) & (np_boxes[:, 0] > -1)
np_boxes = np_boxes[expect_boxes, :]
image = draw_det(image, np_boxes, self.class_names)
return image, {'bboxes': np_boxes.tolist()}
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import os.path as osp
import sys
import yaml
import time
import shutil
import requests
import tqdm
import hashlib
import base64
import binascii
import tarfile
import zipfile
__all__ = [
'get_model_path',
'get_config_path',
'get_dict_path',
]
WEIGHTS_HOME = osp.expanduser("~/.cache/paddlecv/models")
CONFIGS_HOME = osp.expanduser("~/.cache/paddlecv/configs")
DICTS_HOME = osp.expanduser("~/.cache/paddlecv/dicts")
# dict of {dataset_name: (download_info, sub_dirs)}
# download info: [(url, md5sum)]
DOWNLOAD_RETRY_LIMIT = 3
PMP_DOWNLOAD_URL_PREFIX = 'https://bj.bcebos.com/v1/paddle-model-ecology/paddlecv/'
def is_url(path):
"""
Whether path is URL.
Args:
path (string): URL string or not.
"""
return path.startswith('http://') \
or path.startswith('https://') \
or path.startswith('paddlecv://')
def parse_url(url):
url = url.replace("paddlecv://", PMP_DOWNLOAD_URL_PREFIX)
return url
def get_model_path(path):
"""Get model path from WEIGHTS_HOME, if not exists,
download it from url.
"""
if not is_url(path):
return path
url = parse_url(path)
path, _ = get_path(url, WEIGHTS_HOME, path_depth=2)
return path
def get_config_path(path):
"""Get config path from CONFIGS_HOME, if not exists,
download it from url.
"""
if not is_url(path):
return path
url = parse_url(path)
path, _ = get_path(url, CONFIGS_HOME)
return path
def get_dict_path(path):
"""Get config path from CONFIGS_HOME, if not exists,
download it from url.
"""
if not is_url(path):
return path
url = parse_url(path)
path, _ = get_path(url, DICTS_HOME)
return path
def map_path(url, root_dir, path_depth=1):
# parse path after download to decompress under root_dir
assert path_depth > 0, "path_depth should be a positive integer"
dirname = url
for _ in range(path_depth):
dirname = osp.dirname(dirname)
fpath = osp.relpath(url, dirname)
path = osp.join(root_dir, fpath)
dirname = osp.dirname(path)
return path, dirname
def get_path(url, root_dir, md5sum=None, check_exist=True, path_depth=1):
""" Download from given url to root_dir.
if file or directory specified by url is exists under
root_dir, return the path directly, otherwise download
from url, return the path.
url (str): download url
root_dir (str): root dir for downloading, it should be
WEIGHTS_HOME
md5sum (str): md5 sum of download package
"""
# parse path after download to decompress under root_dir
fullpath, dirname = map_path(url, root_dir, path_depth)
if osp.exists(fullpath) and check_exist:
if not osp.isfile(fullpath) or \
_check_exist_file_md5(fullpath, md5sum, url):
return fullpath, True
else:
os.remove(fullpath)
fullname = _download(url, dirname, md5sum)
return fullpath, False
def _download(url, path, md5sum=None):
"""
Download from url, save to path.
url (str): download url
path (str): download to given path
"""
if not osp.exists(path):
os.makedirs(path)
fname = osp.split(url)[-1]
fullname = osp.join(path, fname)
retry_cnt = 0
while not (osp.exists(fullname) and _check_exist_file_md5(fullname, md5sum,
url)):
if retry_cnt < DOWNLOAD_RETRY_LIMIT:
retry_cnt += 1
else:
raise RuntimeError("Download from {} failed. "
"Retry limit reached".format(url))
# NOTE: windows path join may incur \, which is invalid in url
if sys.platform == "win32":
url = url.replace('\\', '/')
req = requests.get(url, stream=True)
if req.status_code != 200:
raise RuntimeError("Downloading from {} failed with code "
"{}!".format(url, req.status_code))
        # To protect against an interrupted download, write to
        # tmp_fullname first, then move tmp_fullname to fullname
        # once the download has finished
tmp_fullname = fullname + "_tmp"
total_size = req.headers.get('content-length')
with open(tmp_fullname, 'wb') as f:
if total_size:
for chunk in tqdm.tqdm(
req.iter_content(chunk_size=1024),
total=(int(total_size) + 1023) // 1024,
unit='KB'):
f.write(chunk)
else:
for chunk in req.iter_content(chunk_size=1024):
if chunk:
f.write(chunk)
shutil.move(tmp_fullname, fullname)
return fullname
def _check_exist_file_md5(filename, md5sum, url):
    # if md5sum is None and the file to check is a model file,
    # read the md5sum from the url header and check it; otherwise
    # check against the given md5sum directly
return _md5check_from_url(filename, url) if md5sum is None \
and filename.endswith('pdparams') \
else _md5check(filename, md5sum)
def _md5check_from_url(filename, url):
# For model in bcebos URLs, MD5 value is contained
# in request header as 'content_md5'
req = requests.get(url, stream=True)
content_md5 = req.headers.get('content-md5')
req.close()
if not content_md5 or _md5check(
filename,
binascii.hexlify(base64.b64decode(content_md5.strip('"'))).decode(
)):
return True
else:
return False
def _md5check(fullname, md5sum=None):
if md5sum is None:
return True
md5 = hashlib.md5()
with open(fullname, 'rb') as f:
for chunk in iter(lambda: f.read(4096), b""):
md5.update(chunk)
calc_md5sum = md5.hexdigest()
if calc_md5sum != md5sum:
return False
return True
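# --- Hedged sketch (not part of the original file) ---------------------------
# parse_url and map_path are pure functions, so it is safe to show how a
# paddlecv:// URI resolves to a local cache path; get_model_path would
# additionally download the file and verify its md5 via the helpers above.
if __name__ == '__main__':
    url = parse_url('paddlecv://models/picodet_s_320_coco_lcnet/model.pdmodel')
    cached_path, _ = map_path(url, WEIGHTS_HOME, path_depth=2)
    print(cached_path)  # ~/.cache/paddlecv/models/picodet_s_320_coco_lcnet/model.pdmodel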
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import cv2
import numpy as np
def decode_image(im_file, im_info):
"""read rgb image
Args:
im_file (str|np.ndarray): input can be image path or np.ndarray
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
if isinstance(im_file, str):
with open(im_file, 'rb') as f:
im_read = f.read()
data = np.frombuffer(im_read, dtype='uint8')
im = cv2.imdecode(data, 1) # BGR mode, but need RGB mode
im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
else:
im = im_file
im_info['im_shape'] = np.array(im.shape[:2], dtype=np.float32)
im_info['scale_factor'] = np.array([1., 1.], dtype=np.float32)
return im, im_info
class Resize(object):
"""resize image by target_size and max_size
Args:
target_size (int): the target size of image
keep_ratio (bool): whether keep_ratio or not, default true
interp (int): method of resize
"""
def __init__(self, target_size, keep_ratio=True, interp=cv2.INTER_LINEAR):
if isinstance(target_size, int):
target_size = [target_size, target_size]
self.target_size = target_size
self.keep_ratio = keep_ratio
self.interp = interp
def __call__(self, im, im_info):
"""
Args:
im (np.ndarray): image (np.ndarray)
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
assert len(self.target_size) == 2
assert self.target_size[0] > 0 and self.target_size[1] > 0
im_channel = im.shape[2]
im_scale_y, im_scale_x = self.generate_scale(im)
im = cv2.resize(
im,
None,
None,
fx=im_scale_x,
fy=im_scale_y,
interpolation=self.interp)
im_info['im_shape'] = np.array(im.shape[:2]).astype('float32')
im_info['scale_factor'] = np.array(
[im_scale_y, im_scale_x]).astype('float32')
return im, im_info
def generate_scale(self, im):
"""
Args:
im (np.ndarray): image (np.ndarray)
Returns:
im_scale_x: the resize ratio of X
im_scale_y: the resize ratio of Y
"""
origin_shape = im.shape[:2]
im_c = im.shape[2]
if self.keep_ratio:
im_size_min = np.min(origin_shape)
im_size_max = np.max(origin_shape)
target_size_min = np.min(self.target_size)
target_size_max = np.max(self.target_size)
im_scale = float(target_size_min) / float(im_size_min)
if np.round(im_scale * im_size_max) > target_size_max:
im_scale = float(target_size_max) / float(im_size_max)
im_scale_x = im_scale
im_scale_y = im_scale
else:
resize_h, resize_w = self.target_size
im_scale_y = resize_h / float(origin_shape[0])
im_scale_x = resize_w / float(origin_shape[1])
return im_scale_y, im_scale_x
class NormalizeImage(object):
"""normalize image
Args:
mean (list): im - mean
std (list): im / std
is_scale (bool): whether need im / 255
norm_type (str): type in ['mean_std', 'none']
"""
def __init__(self, mean, std, is_scale=True, norm_type='mean_std'):
self.mean = mean
self.std = std
self.is_scale = is_scale
self.norm_type = norm_type
def __call__(self, im, im_info):
"""
Args:
im (np.ndarray): image (np.ndarray)
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
im = im.astype(np.float32, copy=False)
if self.is_scale:
scale = 1.0 / 255.0
im *= scale
if self.norm_type == 'mean_std':
mean = np.array(self.mean)[np.newaxis, np.newaxis, :]
std = np.array(self.std)[np.newaxis, np.newaxis, :]
im -= mean
im /= std
return im, im_info
class Permute(object):
    """permute image layout from HWC to CHW
    """
def __init__(self, ):
super(Permute, self).__init__()
def __call__(self, im, im_info):
"""
Args:
im (np.ndarray): image (np.ndarray)
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
im = im.transpose((2, 0, 1)).copy()
return im, im_info
class PadStride(object):
""" padding image for model with FPN, instead PadBatch(pad_to_stride) in original config
Args:
stride (bool): model with FPN need image shape % stride == 0
"""
def __init__(self, stride=0):
self.coarsest_stride = stride
def __call__(self, im, im_info):
"""
Args:
im (np.ndarray): image (np.ndarray)
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
coarsest_stride = self.coarsest_stride
if coarsest_stride <= 0:
return im, im_info
im_c, im_h, im_w = im.shape
pad_h = int(np.ceil(float(im_h) / coarsest_stride) * coarsest_stride)
pad_w = int(np.ceil(float(im_w) / coarsest_stride) * coarsest_stride)
padding_im = np.zeros((im_c, pad_h, pad_w), dtype=np.float32)
padding_im[:, :im_h, :im_w] = im
return padding_im, im_info
def preprocess(im, preprocess_ops):
# process image by preprocess_ops
im_info = {
'scale_factor': np.array(
[1., 1.], dtype=np.float32),
'im_shape': None,
}
im, im_info = decode_image(im, im_info)
for operator in preprocess_ops:
im, im_info = operator(im, im_info)
return im, im_info
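# --- Hedged sketch (not part of the original file) ---------------------------
# Composes the ops that the bundled PP-PicoDet config declares (Resize to
# 320x320 without keeping the aspect ratio, ImageNet mean/std, HWC->CHW) and
# runs them on a random dummy image.
if __name__ == '__main__':
    ops = [
        Resize(target_size=[320, 320], keep_ratio=False, interp=2),
        NormalizeImage(mean=[0.485, 0.456, 0.406],
                       std=[0.229, 0.224, 0.225], is_scale=True),
        Permute(),
    ]
    dummy = np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)
    im, info = preprocess(dummy, ops)
    print(im.shape, info['scale_factor'])  # (3, 320, 320) [0.6666667 0.5]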
import numpy as np
from PIL import Image, ImageDraw, ImageFile
def get_color_map_list(num_classes):
"""
Args:
num_classes (int): number of class
Returns:
color_map (list): RGB color list
"""
color_map = num_classes * [0, 0, 0]
for i in range(0, num_classes):
j = 0
lab = i
while lab:
color_map[i * 3] |= (((lab >> 0) & 1) << (7 - j))
color_map[i * 3 + 1] |= (((lab >> 1) & 1) << (7 - j))
color_map[i * 3 + 2] |= (((lab >> 2) & 1) << (7 - j))
j += 1
lab >>= 3
color_map = [color_map[i:i + 3] for i in range(0, len(color_map), 3)]
return color_map
def draw_det(image, dt_bboxes, name_set):
im = Image.fromarray(image)
draw_thickness = min(im.size) // 320
draw = ImageDraw.Draw(im)
clsid2color = {}
color_list = get_color_map_list(len(name_set))
for (cls_id, score, xmin, ymin, xmax, ymax) in dt_bboxes:
cls_id = int(cls_id)
name = name_set[cls_id]
color = tuple(color_list[cls_id])
# draw bbox
draw.line(
[(xmin, ymin), (xmin, ymax), (xmax, ymax), (xmax, ymin),
(xmin, ymin)],
width=draw_thickness,
fill=color)
# draw label
text = "{} {:.4f}".format(name, score)
box = draw.textbbox((xmin, ymin), text, anchor='lt')
draw.rectangle(box, fill=color)
draw.text((box[0], box[1]), text, fill=(255, 255, 255))
image = np.array(im)
return image
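# --- Hedged sketch (not part of the original file) ---------------------------
# Draws a single fabricated detection on a blank RGB canvas; the class list
# and box values are made up for illustration.
if __name__ == '__main__':
    canvas = np.zeros((320, 320, 3), dtype=np.uint8)
    fake_boxes = np.array([[0, 0.87, 40., 40., 200., 260.]])  # cls, score, x0, y0, x1, y1
    out = draw_det(canvas, fake_boxes, ['person'])
    print(out.shape)  # (320, 320, 3)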
## Benchmark
### PP-YOLO Models
| Model | GPU number | images/GPU | Backbone | input shape | Box AP<sup>val</sup> | Box AP<sup>test</sup> | V100 FP32(FPS) | V100 TensorRT FP16(FPS) |
|:--------------------:|:-------:|:-------------:|:----------:| :-------:| :-------------: | :-------------: | :------------: | :---------------------: |
| PP-YOLO | 8 | 24 | ResNet50vd | 608 | 44.8 | 45.2 | 72.9 | 155.6 |
| PP-YOLO | 8 | 24 | ResNet50vd | 512 | 43.9 | 44.4 | 89.9 | 188.4 |
| PP-YOLO | 8 | 24 | ResNet50vd | 416 | 42.1 | 42.5 | 109.1 | 215.4 |
| PP-YOLO | 8 | 24 | ResNet50vd | 320 | 38.9 | 39.3 | 132.2 | 242.2 |
| PP-YOLO_2x | 8 | 24 | ResNet50vd | 608 | 45.3 | 45.9 | 72.9 | 155.6 |
| PP-YOLO_2x | 8 | 24 | ResNet50vd | 512 | 44.4 | 45.0 | 89.9 | 188.4 |
| PP-YOLO_2x | 8 | 24 | ResNet50vd | 416 | 42.7 | 43.2 | 109.1 | 215.4 |
| PP-YOLO_2x | 8 | 24 | ResNet50vd | 320 | 39.5 | 40.1 | 132.2 | 242.2 |
| PP-YOLO | 4 | 32 | ResNet18vd | 512 | 29.2 | 29.5 | 357.1 | 657.9 |
| PP-YOLO | 4 | 32 | ResNet18vd | 416 | 28.6 | 28.9 | 409.8 | 719.4 |
| PP-YOLO | 4 | 32 | ResNet18vd | 320 | 26.2 | 26.4 | 480.7 | 763.4 |
**Notes:**
- PP-YOLO models are trained on COCO train2017 and evaluated on val2017 and test-dev2017; Box AP<sup>test</sup> is the `mAP(IoU=0.5:0.95)` evaluation result.
- PP-YOLO models are trained on 8 GPUs with a batch size of 24 per GPU; if the number of GPUs or the batch size changes, adjust the learning rate and number of iterations according to the [FAQ](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/docs/tutorials/FAQ).
- PP-YOLO inference speed is tested on a single V100 with batch size 1, CUDA 10.2 and CUDNN 7.5.1; the TensorRT speed test uses TensorRT 5.1.2.2.
- The FP32 speed numbers are measured with the Paddle inference library on models exported by `tools/export_model.py`, using the `--run_benchmark` option of `deploy/python/infer.py`; all numbers exclude data preprocessing and model post-processing (NMS), consistent with the test method of [YOLOv4(AlexyAB)](https://github.com/AlexeyAB/darknet).
- The TensorRT FP16 speed test additionally excludes the `yolo_box` (bbox decoding) time compared with FP32, i.e. data preprocessing, bbox decoding and NMS are all excluded (again consistent with [YOLOv4(AlexyAB)](https://github.com/AlexeyAB/darknet)).
### PP-YOLO Lightweight Models
| Model | GPU number | images/GPU | Model size | input shape | Box AP<sup>val</sup> | Box AP50<sup>val</sup> | Kirin 990 1xCore (FPS) |
|:------:|:-------:|:----------:|:--------:| :-------:| :------------------: | :--------------------: | :--------------------: |
| PP-YOLO_MobileNetV3_large | 4 | 32 | 28MB | 320 | 23.2 | 42.6 | 14.1 |
| PP-YOLO_MobileNetV3_small | 4 | 32 | 16MB | 320 | 17.2 | 33.8 | 21.5 |
| PP-YOLO tiny | 8 | 32 | 4.2MB | 320 | 20.6 | - | 92.3 |
| PP-YOLO tiny | 8 | 32 | 4.2MB | 416 | 22.7 | - | 65.4 |
**Notes:**
- PP-YOLO lightweight models are trained on COCO train2017 and evaluated on val2017; Box AP<sup>val</sup> is the `mAP(IoU=0.5:0.95)` result and Box AP50<sup>val</sup> the `mAP(IoU=0.5)` result (AP50 is not reported for PP-YOLO tiny).
- PP-YOLO_MobileNetV3 models are trained on 4 GPUs with a batch size of 32 per GPU, PP-YOLO-tiny on 8 GPUs with a batch size of 32 per GPU; if the number of GPUs or the batch size changes, adjust the learning rate and number of iterations according to the [FAQ](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/docs/tutorials/FAQ/README.md).
- PP-YOLO_MobileNetV3 inference speed is tested on a Kirin 990 chip with a single thread; PP-YOLO-tiny is tested on a Kirin 990 chip with 4 threads (arm8).
## Benchmark
### PP-YOLO
| Model | GPU number | images/GPU | Backbone | input shape | Box AP<sup>val</sup> | Box AP<sup>test</sup> | V100 FP32(FPS) | V100 TensorRT FP16(FPS) |
|:------------------------:|:-------:|:-------------:|:----------:| :-------:| :------------------: | :-------------------: | :------------: | :---------------------: |
| PP-YOLO | 8 | 24 | ResNet50vd | 608 | 44.8 | 45.2 | 72.9 | 155.6 |
| PP-YOLO | 8 | 24 | ResNet50vd | 512 | 43.9 | 44.4 | 89.9 | 188.4 |
| PP-YOLO | 8 | 24 | ResNet50vd | 416 | 42.1 | 42.5 | 109.1 | 215.4 |
| PP-YOLO | 8 | 24 | ResNet50vd | 320 | 38.9 | 39.3 | 132.2 | 242.2 |
| PP-YOLO_2x | 8 | 24 | ResNet50vd | 608 | 45.3 | 45.9 | 72.9 | 155.6 |
| PP-YOLO_2x | 8 | 24 | ResNet50vd | 512 | 44.4 | 45.0 | 89.9 | 188.4 |
| PP-YOLO_2x | 8 | 24 | ResNet50vd | 416 | 42.7 | 43.2 | 109.1 | 215.4 |
| PP-YOLO_2x | 8 | 24 | ResNet50vd | 320 | 39.5 | 40.1 | 132.2 | 242.2 |
| PP-YOLO | 4 | 32 | ResNet18vd | 512 | 29.2 | 29.5 | 357.1 | 657.9 |
| PP-YOLO | 4 | 32 | ResNet18vd | 416 | 28.6 | 28.9 | 409.8 | 719.4 |
| PP-YOLO | 4 | 32 | ResNet18vd | 320 | 26.2 | 26.4 | 480.7 | 763.4 |
**Notes:**
- PP-YOLO is trained on the COCO train2017 dataset and evaluated on the val2017 & test-dev2017 datasets; Box AP<sup>test</sup> is the evaluation result of `mAP(IoU=0.5:0.95)`.
- PP-YOLO is trained on 8 GPUs with a mini-batch size of 24 per GPU; if the GPU number or mini-batch size changes, the learning rate and number of iterations should be adjusted according to the [FAQ](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/docs/tutorials/FAQ).
- PP-YOLO inference speed is tested on a single Tesla V100 with batch size 1, CUDA 10.2, CUDNN 7.5.1, and TensorRT 5.1.2.2 in TensorRT mode.
- PP-YOLO FP32 inference speed testing uses the inference model exported by `tools/export_model.py`, benchmarked by running `deploy/python/infer.py` with `--run_benchmark`. All results exclude the time cost of data reading and post-processing (NMS), the same test method as [YOLOv4(AlexyAB)](https://github.com/AlexeyAB/darknet).
- TensorRT FP16 inference speed testing additionally excludes the time cost of bounding-box decoding (`yolo_box`) compared with the FP32 testing above, i.e. data reading, bounding-box decoding and post-processing (NMS) are all excluded (test method again the same as [YOLOv4(AlexyAB)](https://github.com/AlexeyAB/darknet)).
- If you set `--run_benchmark=True`, first install these dependencies: `pip install pynvml psutil GPUtil`.
### PP-YOLO for mobile
| Model | GPU number | images/GPU | Model size | input shape | Box AP<sup>val</sup> | Box AP50<sup>val</sup> | Kirin 990 1xCore (FPS) |
|:------:|:-------:|:----------:|:--------:| :-------:| :------------------: | :--------------------: | :--------------------: |
| PP-YOLO_MobileNetV3_large | 4 | 32 | 28MB | 320 | 23.2 | 42.6 | 14.1 |
| PP-YOLO_MobileNetV3_small | 4 | 32 | 16MB | 320 | 17.2 | 33.8 | 21.5 |
| PP-YOLO tiny | 8 | 32 | 4.2MB | 320 | 20.6 | - | 92.3 |
| PP-YOLO tiny | 8 | 32 | 4.2MB | 416 | 22.7 | - | 65.4 |
**Notes:**
- PP-YOLO for mobile is trained on the COCO train2017 dataset and evaluated on the val2017 dataset; Box AP<sup>val</sup> is the evaluation result of `mAP(IoU=0.5:0.95)`, and Box AP50<sup>val</sup> is the evaluation result of `mAP(IoU=0.5)`.
- PP-YOLO-tiny is trained on 8 GPUs with a mini-batch size of 32 per GPU. PP-YOLO_MobileNetV3 is trained on 4 GPUs with a mini-batch size of 32 per GPU. If the GPU number or mini-batch size changes, the learning rate and number of iterations should be adjusted according to the [FAQ](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/docs/tutorials/FAQ).
- PP-YOLO_MobileNetV3 inference speed is tested on Kirin 990 with 1 thread. PP-YOLO-tiny inference speed is tested on Kirin 990 with 4 threads (arm8).
## Model Zoo
### PP-YOLO
| Model | Backbone | Download | Config |
|:-----------:|:-------:|:-------:|:-------:|
| PP-YOLO | ResNet50vd | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5/configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml) |
| PP-YOLO_2x | ResNet50vd | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_2x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5/configs/ppyolo/ppyolo_r50vd_dcn_2x_coco.yml) |
| PP-YOLO | ResNet18vd | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r18vd_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5/configs/ppyolo/ppyolo_r18vd_coco.yml) |
| PP-YOLO | MobileNetV3 Large | [model](https://paddledet.bj.bcebos.com/models/ppyolo_mbv3_large_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5/configs/ppyolo/ppyolo_mbv3_large_coco.yml) |
| PP-YOLO | MobileNetV3 Small | [model](https://paddledet.bj.bcebos.com/models/ppyolo_mbv3_small_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5/configs/ppyolo/ppyolo_mbv3_small_coco.yml) |
### PP-YOLO tiny
| Model | Model Size | Post Quant Model Size | Download | Config | Quantized Model |
|:-----------:|:-------:|:-------:|:-------:|:-------:|:-------:|
| PP-YOLO tiny | 4.2MB | 1.3MB | [model](https://paddledet.bj.bcebos.com/models/ppyolo_tiny_650e_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5/configs/ppyolo/ppyolo_tiny_650e_coco.yml) | [inference model](https://paddledet.bj.bcebos.com/models/ppyolo_tiny_quant.tar) |
## Model Zoo
### PP-YOLO
| Model | Backbone | Download | Config |
|:-----------:|:-------:|:-------:|:-------:|
| PP-YOLO | ResNet50vd | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5/configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml) |
| PP-YOLO_2x | ResNet50vd | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_2x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5/configs/ppyolo/ppyolo_r50vd_dcn_2x_coco.yml) |
| PP-YOLO | ResNet18vd | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r18vd_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5/configs/ppyolo/ppyolo_r18vd_coco.yml) |
| PP-YOLO | MobileNetV3 Large | [model](https://paddledet.bj.bcebos.com/models/ppyolo_mbv3_large_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5/configs/ppyolo/ppyolo_mbv3_large_coco.yml) |
| PP-YOLO | MobileNetV3 Small | [model](https://paddledet.bj.bcebos.com/models/ppyolo_mbv3_small_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5/configs/ppyolo/ppyolo_mbv3_small_coco.yml) |
### PP-YOLO tiny
| Model | Model Size | Post Quant Model Size | Download | Config | post quant model |
|:-----------:|:-------:|:-------:|:-------:|:-------:|:-------:|
| PP-YOLO tiny | 4.2MB | 1.3MB | [model](https://paddledet.bj.bcebos.com/models/ppyolo_tiny_650e_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5/configs/ppyolo/ppyolo_tiny_650e_coco.yml) | [inference model](https://paddledet.bj.bcebos.com/models/ppyolo_tiny_quant.tar) |
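The download links above are plain HTTPS URLs. As a minimal sketch (network access required; the import path is assumed from the app layout, where the download helpers live in `src/download.py`), a checkpoint can be fetched into the local cache with the repo's own helper:

```python
from src.download import get_model_path

# downloads into ~/.cache/paddlecv/models/... and verifies the md5 taken from
# the bcebos response header (see _md5check_from_url)
local = get_model_path('https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams')
print(local)
```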
---
Model_Info:
name: "PP-YOLO"
description: "PP-YOLO是PaddleDetection优化和改进的YOLOv3的模型"
  description_en: "PP-YOLO is an optimized model based on YOLOv3 in PaddleDetection"
icon: "@后续UE统一设计之后,会存到bos上某个位置"
from_repo: "PaddleDetection"
Task:
- tag_en: "Computer Vision"
tag: "计算机视觉"
sub_tag_en: "Object Detection"
sub_tag: "目标检测"
Example:
- tag_en: "Industrial/Energy"
tag: "工业/能源"
sub_tag_en: "meter reading"
title: "基于OpenVINO的工业指针型表计读数"
sub_tag: "表计读数"
url: "https://aistudio.baidu.com/aistudio/projectdetail/3975848?contributionType=1"
Datasets: "COCO test-dev2017, COCO train2017, COCO val2017, Pascal VOC"
Publisher: "Baidu"
License: "apache.2.0"
Paper:
- title: "PP-YOLO: An Effective and Efficient Implementation of Object Detector"
url: "https://arxiv.org/pdf/2007.12099v3.pdf"
IfTraining: 1
IfOnlineDemo: 1
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. 模型简介\n",
"[PP-YOLO](https://arxiv.org/abs/2007.12099)是PaddleDetection优化和改进的YOLOv3的模型,其精度(COCO数据集mAP)和推理速度均优于[YOLOv4](https://arxiv.org/abs/2004.10934)模型。关于PP-YOLO的更多细节可以参考[官方文档](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/configs/ppyolo/README_cn.md)。"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. 模型效果\n",
"PP-YOLO在[COCO](http://cocodataset.org) test-dev2017数据集上精度达到45.9%,在单卡V100上FP32推理速度为72.9 FPS, V100上开启TensorRT下FP16推理速度为155.6 FPS。\n",
"\n",
"<div align=\"center\">\n",
" <img src=\"https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/docs/images/ppyolo_map_fps.png\" width=500 />\n",
"</div>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. 模型如何使用\n",
"首先克隆PaddleDetection,并将数据集存放在`dataset/coco/`目录下面"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"%cd ~/work\n",
"!git clone https://gitee.com/paddlepaddle/PaddleDetection\n",
"%cd PaddleDetection"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 3.1 训练\n",
"使用8GPU通过如下命令一键式启动训练(以下命令均默认在PaddleDetection根目录运行), 通过`--eval`参数开启训练中交替评估。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"!python -m paddle.distributed.launch --log_dir=./ppyolo_dygraph/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml &>ppyolo_dygraph.log 2>&1 &"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 3.2 推理部署\n",
"PP-YOLO模型部署及推理benchmark需要通过`tools/export_model.py`导出模型后使用Paddle预测库进行部署和推理,可通过如下命令一键式启动。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"# 导出模型,默认存储于output/ppyolo目录\n",
"!python tools/export_model.py -c configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams\n",
"\n",
"# 预测库推理\n",
"!CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyolo_r50vd_dcn_1x_coco --image_file=demo/000000014439_640x640.jpg --device=GPU"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. 模型原理\n",
"PP-YOLO的整体结构图如下图所示:\n",
"<div align=\"center\">\n",
" <img src=\"https://paddledet.bj.bcebos.com/modelcenter/images/PP-YOLO-Arch.png\" width=500 />\n",
"</div>\n",
"\n",
"PP-YOLO从如下方面优化和提升YOLOv3模型的精度和速度:\n",
"\n",
"- 更优的骨干网络: ResNet50vd-DCN\n",
"- 更大的训练batch size: 8 GPUs,每GPU batch_size=24,对应调整学习率和迭代轮数\n",
"- [Drop Block](https://arxiv.org/abs/1810.12890)\n",
"- [Exponential Moving Average](https://www.investopedia.com/terms/e/ema.asp)\n",
"- [IoU Loss](https://arxiv.org/pdf/1902.09630.pdf)\n",
"- [Grid Sensitive](https://arxiv.org/abs/2004.10934)\n",
"- [Matrix NMS](https://arxiv.org/pdf/2003.10152.pdf)\n",
"- [CoordConv](https://arxiv.org/abs/1807.03247)\n",
"- [Spatial Pyramid Pooling](https://arxiv.org/abs/1406.4729)\n",
"\n",
"更多细节可以参考我们的技术报告:https://arxiv.org/abs/2007.12099"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5. 注意事项\n",
"**所有的命令默认运行在AI Studio的`jupyter`上, 如果运行在终端上,去掉命令开头的符号%或!**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6. 相关论文及引用信息\n",
"```\n",
"@misc{long2020ppyolo,\n",
"title={PP-YOLO: An Effective and Efficient Implementation of Object Detector},\n",
"author={Xiang Long and Kaipeng Deng and Guanzhong Wang and Yang Zhang and Qingqing Dang and Yuan Gao and Hui Shen and Jianguo Ren and Shumin Han and Errui Ding and Shilei Wen},\n",
"year={2020},\n",
"eprint={2007.12099},\n",
"archivePrefix={arXiv},\n",
"primaryClass={cs.CV}\n",
"}\n",
"```"
]
}
],
"metadata": {
"language_info": {
"name": "python"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Introduction\n",
"[PP-YOLO](https://arxiv.org/abs/2007.12099) is a optimized model based on YOLOv3 in PaddleDetection,whose performance(mAP on COCO) and inference spped are better than [YOLOv4](https://arxiv.org/abs/2004.10934). For more details, refer to [official documentation](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/configs/ppyolo/README.md)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Model Effects\n",
"PP-YOLO reached mmAP(IoU=0.5:0.95) as 45.9% on COCO test-dev2017 dataset, and inference speed of FP32 on single V100 is 72.9 FPS, inference speed of FP16 with TensorRT on single V100 is 155.6 FPS.\n",
"\n",
"<div align=\"center\">\n",
" <img src=\"https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/docs/images/ppyolo_map_fps.png\" width=500 />\n",
"</div>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. How to use the model\n",
"Clone PaddleDetection firstly and put the COCO-style dataset in `dataset/coco`"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"%cd ~/work\n",
"!git clone https://gitee.com/paddlepaddle/PaddleDetection\n",
"%cd PaddleDetection"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 3.1 Training\n",
"Training PP-YOLO on 8 GPUs with following command(all commands should be run under PaddleDetection dygraph directory as default)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"!python -m paddle.distributed.launch --log_dir=./ppyolo_dygraph/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml &>ppyolo_dygraph.log 2>&1 &"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 3.2 Inference\n",
"For inference deployment or benchmark, model exported with `tools/export_model.py` should be used and perform inference with Paddle inference library with following commands:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"# export model, model will be save in output/ppyolo as default\n",
"python tools/export_model.py -c configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams\n",
"\n",
"# inference with Paddle Inference library\n",
"CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyolo_r50vd_dcn_1x_coco --image_file=demo/000000014439_640x640.jpg --device=GPU"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. Model principle\n",
"The overall archtecture of PP-YOLO is shown as follows:\n",
"<div align=\"center\">\n",
" <img src=\"https://paddledet.bj.bcebos.com/modelcenter/images/PP-YOLO-Arch.png\" width=500 />\n",
"</div>\n",
"\n",
"PP-YOLO improved performance and speed of YOLOv3 with following methods:\n",
"\n",
"- Better backbone: ResNet50vd-DCN\n",
"- Larger training batch size: 8 GPUs and mini-batch size as 24 on each GPU\n",
"- [Drop Block](https://arxiv.org/abs/1810.12890)\n",
"- [Exponential Moving Average](https://www.investopedia.com/terms/e/ema.asp)\n",
"- [IoU Loss](https://arxiv.org/pdf/1902.09630.pdf)\n",
"- [Grid Sensitive](https://arxiv.org/abs/2004.10934)\n",
"- [Matrix NMS](https://arxiv.org/pdf/2003.10152.pdf)\n",
"- [CoordConv](https://arxiv.org/abs/1807.03247)\n",
"- [Spatial Pyramid Pooling](https://arxiv.org/abs/1406.4729)\n",
"\n",
"For more details, please refer to our technical report:https://arxiv.org/abs/2007.12099"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5. Attention\n",
"**All commands run on AI Studio's `jupyter` by default. If running on a terminal, remove the % or ! at the beginning of the command.**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6. Related papers and citations\n",
"```\n",
"@misc{long2020ppyolo,\n",
"title={PP-YOLO: An Effective and Efficient Implementation of Object Detector},\n",
"author={Xiang Long and Kaipeng Deng and Guanzhong Wang and Yang Zhang and Qingqing Dang and Yuan Gao and Hui Shen and Jianguo Ren and Shumin Han and Errui Ding and Shilei Wen},\n",
"year={2020},\n",
"eprint={2007.12099},\n",
"archivePrefix={arXiv},\n",
"primaryClass={cs.CV}\n",
"}\n",
"```"
]
}
],
"metadata": {
"language_info": {
"name": "python"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}
import gradio as gr
import numpy as np
import os
from src.detection import Detector
# UGC: Define the inference fn() for your models
def model_inference(image):
image, json_out = Detector('PP-YOLOE+')(image)
return image, json_out
def clear_all():
return None, None, None
with gr.Blocks() as demo:
gr.Markdown("Objective Detection")
with gr.Column(scale=1, min_width=100):
img_in = gr.Image(label="Input").style(height=200)
with gr.Row():
btn1 = gr.Button("Clear")
btn2 = gr.Button("Submit")
img_out = gr.Image(label="Output").style(height=200)
json_out = gr.JSON(label="jsonOutput")
btn2.click(fn=model_inference, inputs=img_in, outputs=[img_out, json_out])
btn1.click(fn=clear_all, inputs=None, outputs=[img_in, img_out, json_out])
demo.launch()
【PP-YOLOE+-App-YAML】
APP_Info:
title: PP-YOLOE+-App
colorFrom: blue
colorTo: yellow
sdk: gradio
sdk_version: 3.9.1
app_file: app.py
license: apache-2.0
device: cpu
mode: paddle
draw_threshold: 0.5
metric: COCO
use_dynamic_shape: false
arch: YOLO
min_subgraph_size: 3
param_path: paddlecv://models/ppyoloe_plus_crn_s_80e_coco/model.pdiparams
model_path: paddlecv://models/ppyoloe_plus_crn_s_80e_coco/model.pdmodel
Preprocess:
- interp: 2
keep_ratio: false
target_size:
- 640
- 640
type: Resize
- is_scale: true
mean:
- 0.
- 0.
- 0.
std:
- 1.
- 1.
- 1.
norm_type: null
type: NormalizeImage
- type: Permute
label_list:
- person
- bicycle
- car
- motorcycle
- airplane
- bus
- train
- truck
- boat
- traffic light
- fire hydrant
- stop sign
- parking meter
- bench
- bird
- cat
- dog
- horse
- sheep
- cow
- elephant
- bear
- zebra
- giraffe
- backpack
- umbrella
- handbag
- tie
- suitcase
- frisbee
- skis
- snowboard
- sports ball
- kite
- baseball bat
- baseball glove
- skateboard
- surfboard
- tennis racket
- bottle
- wine glass
- cup
- fork
- knife
- spoon
- bowl
- banana
- apple
- sandwich
- orange
- broccoli
- carrot
- hot dog
- pizza
- donut
- cake
- chair
- couch
- potted plant
- bed
- dining table
- toilet
- tv
- laptop
- mouse
- remote
- keyboard
- cell phone
- microwave
- oven
- toaster
- sink
- refrigerator
- book
- clock
- vase
- scissors
- teddy bear
- hair drier
- toothbrush
gradio
opencv-python
paddlepaddle
PyYAML
shapely
scipy
Cython
numpy
setuptools
pillow
import cv2
import os
import numpy as np
import yaml
from paddle.inference import Config, create_predictor, PrecisionType
from PIL import Image
from .download import get_model_path
from .preprocess import preprocess, Resize, NormalizeImage, Permute, PadStride, decode_image
from .visualize import draw_det
class Detector(object):
def __init__(self, model_name):
parent_path = os.path.abspath(os.path.join(__file__, *(['..'] * 2)))
yml_file = os.path.join(parent_path, 'configs/{}.yml'.format(model_name))
with open(yml_file, 'r') as f:
yml_conf = yaml.safe_load(f)
infer_model = get_model_path(yml_conf['model_path'])
infer_params = get_model_path(yml_conf['param_path'])
config = Config(infer_model, infer_params)
device = yml_conf.get('device', 'CPU')
        run_mode = yml_conf.get('mode', 'paddle')
        cpu_threads = yml_conf.get('cpu_threads', 1)
        # batch_size is not set in the bundled configs; default to 1 so the
        # TensorRT branch below does not reference an undefined name
        batch_size = yml_conf.get('batch_size', 1)
if device == 'CPU':
config.disable_gpu()
config.set_cpu_math_library_num_threads(cpu_threads)
elif device == 'GPU':
# initial GPU memory(M), device ID
config.enable_use_gpu(200, 0)
# optimize graph and fuse op
config.switch_ir_optim(True)
precision_map = {
'trt_int8': Config.Precision.Int8,
'trt_fp32': Config.Precision.Float32,
'trt_fp16': Config.Precision.Half
}
if run_mode in precision_map.keys():
config.enable_tensorrt_engine(
workspace_size=(1 << 25) * batch_size,
max_batch_size=batch_size,
min_subgraph_size=yml_conf['min_subgraph_size'],
precision_mode=precision_map[run_mode],
use_static=True,
use_calib_mode=False)
if yml_conf['use_dynamic_shape']:
min_input_shape = {
'image': [batch_size, 3, 640, 640],
'scale_factor': [batch_size, 2]
}
max_input_shape = {
'image': [batch_size, 3, 1280, 1280],
'scale_factor': [batch_size, 2]
}
opt_input_shape = {
'image': [batch_size, 3, 1024, 1024],
'scale_factor': [batch_size, 2]
}
config.set_trt_dynamic_shape_info(min_input_shape, max_input_shape,
opt_input_shape)
# disable print log when predict
config.disable_glog_info()
# enable shared memory
config.enable_memory_optim()
# disable feed, fetch OP, needed by zero_copy_run
config.switch_use_feed_fetch_ops(False)
self.predictor = create_predictor(config)
self.yml_conf = yml_conf
self.preprocess_ops = self.create_preprocess_ops(yml_conf)
self.input_names = self.predictor.get_input_names()
self.output_names = self.predictor.get_output_names()
self.draw_threshold = yml_conf.get('draw_threshold', 0.5)
self.class_names = yml_conf['label_list']
def create_preprocess_ops(self, yml_conf):
preprocess_ops = []
for op_info in yml_conf['Preprocess']:
new_op_info = op_info.copy()
op_type = new_op_info.pop('type')
preprocess_ops.append(eval(op_type)(**new_op_info))
return preprocess_ops
def create_inputs(self, image_files):
inputs = dict()
im_list, im_info_list = [], []
for im_path in image_files:
im, im_info = preprocess(im_path, self.preprocess_ops)
im_list.append(im)
im_info_list.append(im_info)
inputs['im_shape'] = np.stack([e['im_shape'] for e in im_info_list], axis=0).astype('float32')
inputs['scale_factor'] = np.stack([e['scale_factor'] for e in im_info_list], axis=0).astype('float32')
inputs['image'] = np.stack(im_list, axis=0).astype('float32')
return inputs
def __call__(self, image_file):
inputs = self.create_inputs([image_file])
for name in self.input_names:
input_tensor = self.predictor.get_input_handle(name)
input_tensor.copy_from_cpu(inputs[name])
self.predictor.run()
boxes_tensor = self.predictor.get_output_handle(self.output_names[0])
np_boxes = boxes_tensor.copy_to_cpu()
boxes_num = self.predictor.get_output_handle(self.output_names[1])
np_boxes_num = boxes_num.copy_to_cpu()
        if np_boxes_num.sum() <= 0:
            # no detections: use an empty (0, 6) array so the filtering below is a no-op
            np_boxes = np.zeros([0, 6])
        if isinstance(image_file, str):
            image = Image.open(image_file).convert('RGB')
        elif isinstance(image_file, np.ndarray):
            image = image_file
        else:
            raise ValueError('image_file must be an image path or an np.ndarray')
expect_boxes = (np_boxes[:, 1] > self.draw_threshold) & (np_boxes[:, 0] > -1)
np_boxes = np_boxes[expect_boxes, :]
image = draw_det(image, np_boxes, self.class_names)
return image, {'bboxes': np_boxes.tolist()}
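# --- Hedged usage sketch (not part of the original file) ---------------------
# The Detector also accepts a decoded RGB np.ndarray; the image path below is
# purely hypothetical and the config name matches configs/PP-YOLOE+.yml.
if __name__ == '__main__':
    rgb = cv2.cvtColor(cv2.imread('demo.jpg'), cv2.COLOR_BGR2RGB)  # hypothetical path
    vis_image, result = Detector('PP-YOLOE+')(rgb)
    for cls_id, score, x0, y0, x1, y1 in result['bboxes']:
        print(int(cls_id), round(score, 3), (x0, y0, x1, y1))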
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import os.path as osp
import sys
import yaml
import time
import shutil
import requests
import tqdm
import hashlib
import base64
import binascii
import tarfile
import zipfile
__all__ = [
'get_model_path',
'get_config_path',
'get_dict_path',
]
WEIGHTS_HOME = osp.expanduser("~/.cache/paddlecv/models")
CONFIGS_HOME = osp.expanduser("~/.cache/paddlecv/configs")
DICTS_HOME = osp.expanduser("~/.cache/paddlecv/dicts")
# dict of {dataset_name: (download_info, sub_dirs)}
# download info: [(url, md5sum)]
DOWNLOAD_RETRY_LIMIT = 3
PMP_DOWNLOAD_URL_PREFIX = 'https://bj.bcebos.com/v1/paddle-model-ecology/paddlecv/'
def is_url(path):
"""
Whether path is URL.
Args:
path (string): URL string or not.
"""
return path.startswith('http://') \
or path.startswith('https://') \
or path.startswith('paddlecv://')
def parse_url(url):
url = url.replace("paddlecv://", PMP_DOWNLOAD_URL_PREFIX)
return url
def get_model_path(path):
"""Get model path from WEIGHTS_HOME, if not exists,
download it from url.
"""
if not is_url(path):
return path
url = parse_url(path)
path, _ = get_path(url, WEIGHTS_HOME, path_depth=2)
return path
def get_config_path(path):
"""Get config path from CONFIGS_HOME, if not exists,
download it from url.
"""
if not is_url(path):
return path
url = parse_url(path)
path, _ = get_path(url, CONFIGS_HOME)
return path
def get_dict_path(path):
    """Get dict path from DICTS_HOME, if not exists,
    download it from url.
    """
if not is_url(path):
return path
url = parse_url(path)
path, _ = get_path(url, DICTS_HOME)
return path
def map_path(url, root_dir, path_depth=1):
# parse path after download to decompress under root_dir
assert path_depth > 0, "path_depth should be a positive integer"
dirname = url
for _ in range(path_depth):
dirname = osp.dirname(dirname)
fpath = osp.relpath(url, dirname)
path = osp.join(root_dir, fpath)
dirname = osp.dirname(path)
return path, dirname
def get_path(url, root_dir, md5sum=None, check_exist=True, path_depth=1):
""" Download from given url to root_dir.
if file or directory specified by url is exists under
root_dir, return the path directly, otherwise download
from url, return the path.
url (str): download url
root_dir (str): root dir for downloading, it should be
WEIGHTS_HOME
md5sum (str): md5 sum of download package
"""
# parse path after download to decompress under root_dir
fullpath, dirname = map_path(url, root_dir, path_depth)
if osp.exists(fullpath) and check_exist:
if not osp.isfile(fullpath) or \
_check_exist_file_md5(fullpath, md5sum, url):
return fullpath, True
else:
os.remove(fullpath)
fullname = _download(url, dirname, md5sum)
return fullpath, False
def _download(url, path, md5sum=None):
"""
Download from url, save to path.
url (str): download url
path (str): download to given path
"""
if not osp.exists(path):
os.makedirs(path)
fname = osp.split(url)[-1]
fullname = osp.join(path, fname)
retry_cnt = 0
while not (osp.exists(fullname) and _check_exist_file_md5(fullname, md5sum,
url)):
if retry_cnt < DOWNLOAD_RETRY_LIMIT:
retry_cnt += 1
else:
raise RuntimeError("Download from {} failed. "
"Retry limit reached".format(url))
# NOTE: windows path join may incur \, which is invalid in url
if sys.platform == "win32":
url = url.replace('\\', '/')
req = requests.get(url, stream=True)
if req.status_code != 200:
raise RuntimeError("Downloading from {} failed with code "
"{}!".format(url, req.status_code))
        # To protect against an interrupted download, write to
        # tmp_fullname first, then move tmp_fullname to fullname
        # once the download has finished
tmp_fullname = fullname + "_tmp"
total_size = req.headers.get('content-length')
with open(tmp_fullname, 'wb') as f:
if total_size:
for chunk in tqdm.tqdm(
req.iter_content(chunk_size=1024),
total=(int(total_size) + 1023) // 1024,
unit='KB'):
f.write(chunk)
else:
for chunk in req.iter_content(chunk_size=1024):
if chunk:
f.write(chunk)
shutil.move(tmp_fullname, fullname)
return fullname
def _check_exist_file_md5(filename, md5sum, url):
    # if md5sum is None and the file to check is a model file,
    # read the md5sum from the url header and check it; otherwise
    # check against the given md5sum directly
return _md5check_from_url(filename, url) if md5sum is None \
and filename.endswith('pdparams') \
else _md5check(filename, md5sum)
def _md5check_from_url(filename, url):
# For model in bcebos URLs, MD5 value is contained
# in request header as 'content_md5'
req = requests.get(url, stream=True)
content_md5 = req.headers.get('content-md5')
req.close()
if not content_md5 or _md5check(
filename,
binascii.hexlify(base64.b64decode(content_md5.strip('"'))).decode(
)):
return True
else:
return False
def _md5check(fullname, md5sum=None):
if md5sum is None:
return True
md5 = hashlib.md5()
with open(fullname, 'rb') as f:
for chunk in iter(lambda: f.read(4096), b""):
md5.update(chunk)
calc_md5sum = md5.hexdigest()
if calc_md5sum != md5sum:
return False
return True
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import cv2
import numpy as np
def decode_image(im_file, im_info):
"""read rgb image
Args:
im_file (str|np.ndarray): input can be image path or np.ndarray
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
if isinstance(im_file, str):
with open(im_file, 'rb') as f:
im_read = f.read()
data = np.frombuffer(im_read, dtype='uint8')
im = cv2.imdecode(data, 1) # BGR mode, but need RGB mode
im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
else:
im = im_file
im_info['im_shape'] = np.array(im.shape[:2], dtype=np.float32)
im_info['scale_factor'] = np.array([1., 1.], dtype=np.float32)
return im, im_info
class Resize(object):
"""resize image by target_size and max_size
Args:
target_size (int): the target size of image
keep_ratio (bool): whether keep_ratio or not, default true
interp (int): method of resize
"""
def __init__(self, target_size, keep_ratio=True, interp=cv2.INTER_LINEAR):
if isinstance(target_size, int):
target_size = [target_size, target_size]
self.target_size = target_size
self.keep_ratio = keep_ratio
self.interp = interp
def __call__(self, im, im_info):
"""
Args:
im (np.ndarray): image (np.ndarray)
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
assert len(self.target_size) == 2
assert self.target_size[0] > 0 and self.target_size[1] > 0
im_channel = im.shape[2]
im_scale_y, im_scale_x = self.generate_scale(im)
im = cv2.resize(
im,
None,
None,
fx=im_scale_x,
fy=im_scale_y,
interpolation=self.interp)
im_info['im_shape'] = np.array(im.shape[:2]).astype('float32')
im_info['scale_factor'] = np.array(
[im_scale_y, im_scale_x]).astype('float32')
return im, im_info
def generate_scale(self, im):
"""
Args:
im (np.ndarray): image (np.ndarray)
Returns:
im_scale_x: the resize ratio of X
im_scale_y: the resize ratio of Y
"""
origin_shape = im.shape[:2]
im_c = im.shape[2]
if self.keep_ratio:
im_size_min = np.min(origin_shape)
im_size_max = np.max(origin_shape)
target_size_min = np.min(self.target_size)
target_size_max = np.max(self.target_size)
im_scale = float(target_size_min) / float(im_size_min)
if np.round(im_scale * im_size_max) > target_size_max:
im_scale = float(target_size_max) / float(im_size_max)
im_scale_x = im_scale
im_scale_y = im_scale
else:
resize_h, resize_w = self.target_size
im_scale_y = resize_h / float(origin_shape[0])
im_scale_x = resize_w / float(origin_shape[1])
return im_scale_y, im_scale_x
class NormalizeImage(object):
"""normalize image
Args:
mean (list): im - mean
std (list): im / std
is_scale (bool): whether need im / 255
norm_type (str): type in ['mean_std', 'none']
"""
def __init__(self, mean, std, is_scale=True, norm_type='mean_std'):
self.mean = mean
self.std = std
self.is_scale = is_scale
self.norm_type = norm_type
def __call__(self, im, im_info):
"""
Args:
im (np.ndarray): image (np.ndarray)
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
im = im.astype(np.float32, copy=False)
if self.is_scale:
scale = 1.0 / 255.0
im *= scale
if self.norm_type == 'mean_std':
mean = np.array(self.mean)[np.newaxis, np.newaxis, :]
std = np.array(self.std)[np.newaxis, np.newaxis, :]
im -= mean
im /= std
return im, im_info
class Permute(object):
    """permute image layout from HWC to CHW
    """
def __init__(self, ):
super(Permute, self).__init__()
def __call__(self, im, im_info):
"""
Args:
im (np.ndarray): image (np.ndarray)
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
im = im.transpose((2, 0, 1)).copy()
return im, im_info
class PadStride(object):
""" padding image for model with FPN, instead PadBatch(pad_to_stride) in original config
Args:
stride (bool): model with FPN need image shape % stride == 0
"""
def __init__(self, stride=0):
self.coarsest_stride = stride
def __call__(self, im, im_info):
"""
Args:
im (np.ndarray): image (np.ndarray)
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
coarsest_stride = self.coarsest_stride
if coarsest_stride <= 0:
return im, im_info
im_c, im_h, im_w = im.shape
pad_h = int(np.ceil(float(im_h) / coarsest_stride) * coarsest_stride)
pad_w = int(np.ceil(float(im_w) / coarsest_stride) * coarsest_stride)
padding_im = np.zeros((im_c, pad_h, pad_w), dtype=np.float32)
padding_im[:, :im_h, :im_w] = im
return padding_im, im_info
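# --- Hedged sketch (not part of the original file) ---------------------------
# PadStride pads a CHW image so that H and W become multiples of the stride,
# as FPN models require; the shapes here are chosen only for illustration.
if __name__ == '__main__':
    chw = np.zeros((3, 300, 500), dtype=np.float32)
    padded, _ = PadStride(stride=32)(chw, {})
    print(padded.shape)  # (3, 320, 512)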
def preprocess(im, preprocess_ops):
# process image by preprocess_ops
im_info = {
'scale_factor': np.array(
[1., 1.], dtype=np.float32),
'im_shape': None,
}
im, im_info = decode_image(im, im_info)
for operator in preprocess_ops:
im, im_info = operator(im, im_info)
return im, im_info
import numpy as np
from PIL import Image, ImageDraw, ImageFile
def get_color_map_list(num_classes):
"""
Args:
num_classes (int): number of class
Returns:
color_map (list): RGB color list
"""
color_map = num_classes * [0, 0, 0]
for i in range(0, num_classes):
j = 0
lab = i
while lab:
color_map[i * 3] |= (((lab >> 0) & 1) << (7 - j))
color_map[i * 3 + 1] |= (((lab >> 1) & 1) << (7 - j))
color_map[i * 3 + 2] |= (((lab >> 2) & 1) << (7 - j))
j += 1
lab >>= 3
color_map = [color_map[i:i + 3] for i in range(0, len(color_map), 3)]
return color_map
def draw_det(image, dt_bboxes, name_set):
im = Image.fromarray(image)
draw_thickness = min(im.size) // 320
draw = ImageDraw.Draw(im)
clsid2color = {}
color_list = get_color_map_list(len(name_set))
for (cls_id, score, xmin, ymin, xmax, ymax) in dt_bboxes:
cls_id = int(cls_id)
name = name_set[cls_id]
color = tuple(color_list[cls_id])
# draw bbox
draw.line(
[(xmin, ymin), (xmin, ymax), (xmax, ymax), (xmax, ymin),
(xmin, ymin)],
width=draw_thickness,
fill=color)
# draw label
text = "{} {:.4f}".format(name, score)
box = draw.textbbox((xmin, ymin), text, anchor='lt')
draw.rectangle(box, fill=color)
draw.text((box[0], box[1]), text, fill=(255, 255, 255))
image = np.array(im)
return image
\ No newline at end of file
## Benchmark
| Model | Epoch | GPU number | images/GPU | Backbone | input shape | Box AP<sup>val<br>0.5:0.95 | Box AP<sup>test<br>0.5:0.95 | Params(M) | FLOPs(G) | V100 FP32(FPS) | V100 TensorRT FP16(FPS) |
|:--------------:|:-----:|:-------:|:----------:|:----------:| :-------:|:--------------------------:|:---------------------------:|:---------:|:--------:|:---------------:| :---------------------: |
| PP-YOLOE+-s | 300 | 8 | 32 | cspresnet-s | 640 | 43.7 | 43.9 | 7.93 | 17.36 | 208.3 | 333.3 |
| PP-YOLOE+-m | 300 | 8 | 28 | cspresnet-m | 640 | 49.8 | 50.0 | 23.43 | 49.91 | 123.4 | 208.3 |
| PP-YOLOE+-l | 300 | 8 | 20 | cspresnet-l | 640 | 52.9 | 53.3 | 52.20 | 110.07 | 78.1 | 149.2 |
| PP-YOLOE+-x | 300 | 8 | 16 | cspresnet-x | 640 | 54.7 | 54.9 | 98.42 | 206.59 | 45.0 | 95.2 |
**Notes:**
- PP-YOLOE+ models are trained on COCO train2017 and evaluated on val2017 & test-dev2017.
- PP-YOLOE+ models are trained on 8 GPUs with mixed precision; if the **GPU number** or **batch size** changes, adjust the learning rate with **lr<sub>new</sub> = lr<sub>default</sub> * (batch_size<sub>new</sub> * GPU_number<sub>new</sub>) / (batch_size<sub>default</sub> * GPU_number<sub>default</sub>)**.
- PP-YOLOE+ inference speed is tested on a single V100 with batch size 1, **CUDA 10.2** and **CUDNN 7.6.5**; the TensorRT speed test uses **TensorRT 6.0.1.8**.
- Refer to [Speed testing](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/configs/ppyoloe/README_cn.md#%E9%80%9F%E5%BA%A6%E6%B5%8B%E8%AF%95) to reproduce the PP-YOLOE+ speed test results.
- If you set `--run_benchmark=True`, first install these dependencies: `pip install pynvml psutil GPUtil`
## Benchmark
| Model | Epoch | GPU number | images/GPU | backbone | input shape | Box AP<sup>val<br>0.5:0.95 | Box AP<sup>test<br>0.5:0.95 | Params(M) | FLOPs(G) | V100 FP32(FPS) | V100 TensorRT FP16(FPS) |
|:--------------:|:-----:|:-------:|:----------:|:----------:| :-------:|:--------------------------:|:---------------------------:|:---------:|:--------:|:---------------:| :---------------------: |
| PP-YOLOE+-s | 300 | 8 | 32 | cspresnet-s | 640 | 43.7 | 43.9 | 7.93 | 17.36 | 208.3 | 333.3 |
| PP-YOLOE+-m | 300 | 8 | 28 | cspresnet-m | 640 | 49.8 | 50.0 | 23.43 | 49.91 | 123.4 | 208.3 |
| PP-YOLOE+-l | 300 | 8 | 20 | cspresnet-l | 640 | 52.9 | 53.3 | 52.20 | 110.07 | 78.1 | 149.2 |
| PP-YOLOE+-x | 300 | 8 | 16 | cspresnet-x | 640 | 54.7 | 54.9 | 98.42 | 206.59 | 45.0 | 95.2 |
**Notes:**
- PP-YOLOE+ is trained on COCO train2017 dataset and evaluated on val2017 & test-dev2017 dataset.
- PP-YOLOE+ used 8 GPUs for mixed precision training; if the **GPU number** or **mini-batch size** changes, the **learning rate** should be adjusted according to the formula **lr<sub>new</sub> = lr<sub>default</sub> * (batch_size<sub>new</sub> * GPU_number<sub>new</sub>) / (batch_size<sub>default</sub> * GPU_number<sub>default</sub>)** (see the sketch after these notes).
- PP-YOLOE+ inference speed is tested on a single Tesla V100 with batch size 1, **CUDA 10.2**, **CUDNN 7.6.5**, **TensorRT 6.0.1.8** in TensorRT mode.
- Refer to [Speed testing](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/configs/ppyoloe/README.md#speed-testing) to reproduce the speed testing results of PP-YOLOE+.
- If you set `--run_benchmark=True`, first install these dependencies: `pip install pynvml psutil GPUtil`.
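The linear scaling rule quoted above is easy to apply directly; a minimal sketch, assuming the PP-YOLOE+-s defaults from the table (8 GPUs, 32 images/GPU) and a purely illustrative base learning rate:

```python
def scale_lr(lr_default, batch_size_new, gpus_new,
             batch_size_default=32, gpus_default=8):
    # lr_new = lr_default * (bs_new * gpu_new) / (bs_default * gpu_default)
    return lr_default * (batch_size_new * gpus_new) / (batch_size_default * gpus_default)

# e.g. moving from 8 GPUs x 32 images to 4 GPUs x 16 images
# (0.001 is an assumed base lr, not the value from the config):
print(scale_lr(0.001, batch_size_new=16, gpus_new=4))  # 0.00025
```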
## Model Zoo
| Model | Backbone | Download | Config |
|:-----------:|:-------:|:-------:|:-------:|
| PP-YOLOE+-s | cspresnet-s | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_s_80e_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5/configs/ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml) |
| PP-YOLOE+-m | cspresnet-m | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_m_80e_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5/configs/ppyoloe/ppyoloe_plus_crn_m_80e_coco.yml) |
| PP-YOLOE+-l | cspresnet-l | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5/configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml) |
| PP-YOLOE+-x | cspresnet-x | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_x_80e_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5/configs/ppyoloe/ppyoloe_plus_crn_x_80e_coco.yml) |
## Model Zoo
| Model | Backbone | Download | Config |
|:-----------:|:-------:|:-------:|:-------:|
| PP-YOLOE+-s | cspresnet-s | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_s_80e_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5/configs/ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml) |
| PP-YOLOE+-m | cspresnet-m | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_m_80e_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5/configs/ppyoloe/ppyoloe_plus_crn_m_80e_coco.yml) |
| PP-YOLOE+-l | cspresnet-l | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5/configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml) |
| PP-YOLOE+-x | cspresnet-x | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_x_80e_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5/configs/ppyoloe/ppyoloe_plus_crn_x_80e_coco.yml) |
---
Model_Info:
name: "PP-YOLOE+"
description: "PP-YOLOE+是PP-YOLOE的升级版"
description_en: "PP-YOLOE+ is an upgraded version of PP-YOLOE"
icon: "@后续UE统一设计之后,会存到bos上某个位置"
from_repo: "PaddleDetection"
Task:
- tag_en: "Computer Vision"
tag: "计算机视觉"
sub_tag_en: "Object Detection"
sub_tag: "目标检测"
Example:
- tag_en: "Intelligent Transportation"
tag: "智慧交通"
sub_tag_en: "Object Detection"
title: "基于PP-YOLOE的雾天行人车辆目标检测"
sub_tag: "目标检测"
url: "https://aistudio.baidu.com/aistudio/projectdetail/4228391"
Datasets: "COCO test-dev2017, COCO train2017, COCO val2017, Pascal VOC"
Publisher: "Baidu"
License: "apache.2.0"
Paper:
- title: "PP-YOLOE: An evolved version of YOLO"
url: "https://arxiv.org/pdf/2203.16250v2.pdf"
IfTraining: 1
IfOnlineDemo: 1
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. 模型简介\n",
"PP-YOLOE+是PP-YOLOE的升级版本,从大规模的obj365目标检测预训练模型入手,在大幅提升收敛速度的同时,提升了模型在COCO数据集上的速度。同时,PP-YOLOE+大幅提升了包括数据预处理在内的端到端的预测速度。关于PP-YOLOE+的更多细节可以参考我们的[官方文档](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/configs/ppyoloe/README_cn.md)。"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. 模型效果\n",
"PP-YOLOE+_l在COCO test-dev2017达到了53.3的mAP, 同时其速度在Tesla V100上达到了78.1 FPS。如下图所示,PP-YOLOE+_s/m/x同样具有卓越的精度速度性价比。\n",
"<div align=\"center\">\n",
" <img src=\"https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/docs/images/ppyoloe_plus_map_fps.png\" width=500 />\n",
"</div>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. 模型如何使用\n",
"首先克隆PaddleDetection,并将数据集存放在`dataset/coco/`目录下面"
]
},
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "vscode": {
     "languageId": "plaintext"
    }
   },
   "outputs": [],
   "source": [
    "%cd ~/work\n",
    "!git clone https://gitee.com/paddlepaddle/PaddleDetection\n",
    "%cd PaddleDetection"
   ]
  },
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 3.1 训练\n",
"执行以下指令训练PP-YOLOE+"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"# 单卡训练\n",
"!python tools/train.py -c configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml --eval --amp\n",
"\n",
"# 多卡训练\n",
"!python -m paddle.distributed.launch --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml --eval --amp"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**注意:**\n",
"- 如果需要边训练边评估,请添加`--eval`.\n",
"- PP-YOLOE支持混合精度训练,请添加`--amp`."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 3.2 推理部署\n",
"#### 3.2.1 使用Paddle-TRT进行推理部署\n",
"接下来,我们将介绍PP-YOLOE如何使用Paddle Inference在TensorRT FP16模式下部署\n",
"\n",
"首先,参考[Paddle Inference文档](https://www.paddlepaddle.org.cn/inference/master/user_guides/download_lib.html#python),下载并安装与你的CUDA, CUDNN和TensorRT相应的wheel包。\n",
"\n",
"然后,运行以下命令导出模型"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"!python tools/export_model.py -c configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams trt=True"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"最后,使用TensorRT FP16进行推理"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"# 推理单张图片\n",
"CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_plus_crn_l_80e_coco --image_file=demo/000000014439_640x640.jpg --device=gpu --run_mode=trt_fp16\n",
"\n",
"# 推理文件夹下的所有图片\n",
"CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_plus_crn_l_80e_coco --image_dir=demo/ --device=gpu --run_mode=trt_fp16"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**注意:**\n",
"- TensorRT会根据网络的定义,执行针对当前硬件平台的优化,生成推理引擎并序列化为文件。该推理引擎只适用于当前软硬件平台。如果你的软硬件平台没有发生变化,你可以设置[enable_tensorrt_engine](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/deploy/python/infer.py#L745)的参数`use_static=True`,这样生成的序列化文件将会保存在`output_inference`文件夹下,下次执行TensorRT时将加载保存的序列化文件。\n",
"- PaddleDetection release/2.4及其之后的版本将支持NMS调用TensorRT,需要依赖PaddlePaddle release/2.3及其之后的版本。"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 3.2.2 使用PaddleInference进行推理部署\n",
"对于一些不支持TensorRT的AI加速硬件,我们可以直接使用PaddleInference进行部署。首先运行以下命令导出模型"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"!python tools/export_model.py -c configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"直接使用PaddleInference进行推理"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"# 推理单张图片\n",
"CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_plus_crn_l_80e_coco --image_file=demo/000000014439_640x640.jpg --device=gpu --run_mode=paddle\n",
"\n",
"# 推理文件夹下的所有图片\n",
"CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_plus_crn_l_80e_coco --image_dir=demo/ --device=gpu --run_mode=paddle"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 3.2.3 使用ONNX-TensorRT进行速度测试\n",
"**使用 ONNX 和 TensorRT** 进行测速,执行以下命令:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"# 导出模型\n",
"!python tools/export_model.py -c configs/ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_s_80e_coco.pdparams exclude_nms=True trt=True\n",
"\n",
"# 转化成ONNX格式\n",
"!paddle2onnx --model_dir output_inference/ppyoloe_plus_crn_s_80e_coco --model_filename model.pdmodel --params_filename model.pdiparams --opset_version 12 --save_file ppyoloe_plus_crn_s_80e_coco.onnx\n",
"\n",
"# 测试速度,半精度,batch_size=1\n",
"!trtexec --onnx=./ppyoloe_plus_crn_s_80e_coco.onnx --saveEngine=./ppyoloe_s_bs1.engine --workspace=1024 --avgRuns=1000 --shapes=image:1x3x640x640,scale_factor:1x2 --fp16\n",
"\n",
"# 测试速度,半精度,batch_size=32\n",
"!trtexec --onnx=./ppyoloe_plus_crn_s_80e_coco.onnx --saveEngine=./ppyoloe_s_bs32.engine --workspace=1024 --avgRuns=1000 --shapes=image:32x3x640x640,scale_factor:32x2 --fp16\n",
"\n",
"# 使用上边的脚本, 在T4 和 TensorRT 7.2的环境下,PPYOLOE-plus-s模型速度如下\n",
"# batch_size=1, 2.80ms, 357fps\n",
"# batch_size=32, 67.69ms, 472fps"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. 模型原理\n",
"PP-YOLOE+和PP-YOLOE的整体结构基本一致,其相对于PP-YOLOE的改进点如下:\n",
"- 使用大规模数据集obj365预训练模型\n",
"- 在backbone中block分支中增加alpha参数\n",
"- 优化端到端推理速度,提升训练收敛速度"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5.注意事项\n",
"**所有的命令默认运行在AI Studio的`jupyter`上, 如果运行在终端上,去掉命令开头的符号%或!**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6. 相关论文及引用信息\n",
"```\n",
"@article{xu2022pp,\n",
" title={PP-YOLOE: An evolved version of YOLO},\n",
" author={Xu, Shangliang and Wang, Xinxin and Lv, Wenyu and Chang, Qinyao and Cui, Cheng and Deng, Kaipeng and Wang, Guanzhong and Dang, Qingqing and Wei, Shengyu and Du, Yuning and others},\n",
" journal={arXiv preprint arXiv:2203.16250},\n",
" year={2022}\n",
"}\n",
"```"
]
}
],
"metadata": {
"language_info": {
"name": "python"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Introduction\n",
"PP-YOLOE+ is a upgraded version of PP-YOLOE. Starting with a large-scale obj365 object detection pretraining model, PP-YOLOE+ not only greatly improves the convergence speed, but also improves the speed of the model on the COCO dataset. At the same time, PP-YOLOE+ greatly improves the end-to-end prediction speed including data preprocessing. For more details, please refer to [official documentation](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5/configs/ppyoloe/README.md)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Model Effects\n",
"PP-YOLOE+_l achieves 53.3 mAP on COCO test-dev2017 dataset with 78.1 FPS on Tesla V100. While using TensorRT FP16, PP-YOLOE+_l can be further accelerated to 149.2 FPS. PP-YOLOE+_s/m/x also have excellent accuracy and speed performance as shown below.\n",
"<div align=\"center\">\n",
" <img src=\"https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/docs/images/ppyoloe_plus_map_fps.png\" width=500 />\n",
"</div>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. How to use the model\n",
"Clone PaddleDetection firstly and put the COCO-style dataset in `dataset/coco`"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"%cd ~/work\n",
"!git clone https://gitee.com/paddlepaddle/PaddleDetection\n",
"%cd PaddleDetection"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 3.1 Training\n",
"Training PP-YOLOE+ on 8 GPUs with following command"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"# training with single GPU\n",
"!python tools/train.py -c configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml --eval --amp\n",
"\n",
"# training with mutiple GPUs\n",
"!python -m paddle.distributed.launch --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml --eval --amp"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Notes:**\n",
"- If you need to evaluate while training, please add `--eval`.\n",
"- PP-YOLOE supports mixed precision training, please add `--amp`."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 3.2 Inference\n",
"#### 3.2.1 Inference with Paddle-TRT\n",
"Next, we will introduce how to use Paddle Inference to deploy PP-YOLOE+ models in TensorRT FP16 mode.\n",
"\n",
"First, refer to [Paddle Inference Docs](https://www.paddlepaddle.org.cn/inference/master/user_guides/download_lib.html#python), download and install packages corresponding to CUDA, CUDNN and TensorRT version.\n",
"\n",
"Then, Exporting PP-YOLOE+ for Paddle Inference **with TensorRT**, use following command."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"!python tools/export_model.py -c configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams trt=True"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally, inference in TensorRT FP16 mode."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"# inference single image\n",
"CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_plus_crn_l_80e_coco --image_file=demo/000000014439_640x640.jpg --device=gpu --run_mode=trt_fp16\n",
"\n",
"# inference all images in the directory\n",
"CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_plus_crn_l_80e_coco --image_dir=demo/ --device=gpu --run_mode=trt_fp16"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"*Notes:**\n",
"- TensorRT will perform optimization for the current hardware platform according to the definition of the network, generate an inference engine and serialize it into a file. This inference engine is only applicable to the current hardware hardware platform. If your hardware and software platform has not changed, you can set `use_static=True` in [enable_tensorrt_engine](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/deploy/python/infer.py#L745). In this way, the serialized file generated will be saved in the `output_inference` folder, and the saved serialized file will be loaded the next time when TensorRT is executed.\n",
"- PaddleDetection release/2.4 and later versions will support NMS calling TensorRT, which requires PaddlePaddle release/2.3 and later versions."
]
},
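  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For reference, the relevant Paddle Inference call looks like the following sketch (it mirrors `src/detection.py` elsewhere in this commit with a batch size of 1, not the exact deploy script):\n",
    "```\n",
    "config.enable_tensorrt_engine(\n",
    "    workspace_size=(1 << 25),\n",
    "    max_batch_size=1,\n",
    "    min_subgraph_size=3,\n",
    "    precision_mode=Config.Precision.Half,\n",
    "    use_static=True,\n",
    "    use_calib_mode=False)\n",
    "```"
   ]
  },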
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 3.2.2 Inference with PaddleInference\n",
"For some AI acceleration hardware that does not support TensorRT, we can directly use PaddleInference to deploy. First run the following command to export the model"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"!python tools/export_model.py -c configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"Inference with PaddleInference directly."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"# inference single image\n",
"CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_plus_crn_l_80e_coco --image_file=demo/000000014439_640x640.jpg --device=gpu --run_mode=paddle\n",
"\n",
"# inference all images in the directory\n",
"CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_plus_crn_l_80e_coco --image_dir=demo/ --device=gpu --run_mode=paddle"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 3.2.3 Speed testing with ONNX-TensorRT\n",
"**Using TensorRT Inference with ONNX** to test speed, run following command"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"# export inference model with trt=True\n",
"!python tools/export_model.py -c configs/ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_s_80e_coco.pdparams exclude_nms=True trt=True\n",
"\n",
"# convert to onnx\n",
"!paddle2onnx --model_dir output_inference/ppyoloe_plus_crn_s_80e_coco --model_filename model.pdmodel --params_filename model.pdiparams --opset_version 12 --save_file ppyoloe_plus_crn_s_80e_coco.onnx\n",
"\n",
"# trt inference using fp16 and batch_size=1\n",
"!trtexec --onnx=./ppyoloe_plus_crn_s_80e_coco.onnx --saveEngine=./ppyoloe_s_bs1.engine --workspace=1024 --avgRuns=1000 --shapes=image:1x3x640x640,scale_factor:1x2 --fp16\n",
"\n",
"# trt inference using fp16 and batch_size=32\n",
"!trtexec --onnx=./ppyoloe_plus_crn_s_80e_coco.onnx --saveEngine=./ppyoloe_s_bs32.engine --workspace=1024 --avgRuns=1000 --shapes=image:32x3x640x640,scale_factor:32x2 --fp16\n",
"\n",
"# Using the above script, T4 and tensorrt 7.2 machine, the speed of PPYOLOE-s model is as follows,\n",
"\n",
"# batch_size=1, 2.80ms, 357fps\n",
"# batch_size=32, 67.69ms, 472fps"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. Model principle\n",
"The overall structure of PP-YOLOE+ is roughly the same as that of PP-YOLOE. PP-YOLOE+ improves the preformance with the following methods:\n",
"- Pre training model using large-scale data set obj365\n",
"- In the backbone, add the alpha parameter to the block branch\n",
"- Optimize the end-to-end inference speed and improve the training convergence speed\n"
]
},
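  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The snippet below is an illustrative sketch only (an assumption, not the official implementation): the alpha parameter above can be read as a learnable scale on a block branch, i.e. y = f(x) + alpha * x."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "vscode": {
     "languageId": "plaintext"
    }
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "def alpha_residual(x, f, alpha=1.0):\n",
    "    # toy residual connection; alpha is a learnable scalar in the real model\n",
    "    return f(x) + alpha * x\n",
    "\n",
    "print(alpha_residual(np.ones(3), lambda t: 0.5 * t))  # [1.5 1.5 1.5]"
   ]
  },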
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5. Attention\n",
"**All commands run on AI Studio's `jupyter` by default. If running on a terminal, remove the % or ! at the beginning of the command.**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6. Related papers and citations\n",
"```\n",
"@article{xu2022pp,\n",
" title={PP-YOLOE: An evolved version of YOLO},\n",
" author={Xu, Shangliang and Wang, Xinxin and Lv, Wenyu and Chang, Qinyao and Cui, Cheng and Deng, Kaipeng and Wang, Guanzhong and Dang, Qingqing and Wei, Shengyu and Du, Yuning and others},\n",
" journal={arXiv preprint arXiv:2203.16250},\n",
" year={2022}\n",
"}\n",
"```"
]
}
],
"metadata": {
"language_info": {
"name": "python"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}
import gradio as gr
import numpy as np
import os
from src.detection import Detector
# UGC: Define the inference fn() for your models
def model_inference(image):
image, json_out = Detector('PP-YOLOE')(image)
return image, json_out
def clear_all():
return None, None, None
with gr.Blocks() as demo:
gr.Markdown("Objective Detection")
with gr.Column(scale=1, min_width=100):
img_in = gr.Image(label="Input").style(height=200)
with gr.Row():
btn1 = gr.Button("Clear")
btn2 = gr.Button("Submit")
img_out = gr.Image(label="Output").style(height=200)
json_out = gr.JSON(label="jsonOutput")
btn2.click(fn=model_inference, inputs=img_in, outputs=[img_out, json_out])
btn1.click(fn=clear_all, inputs=None, outputs=[img_in, img_out, json_out])
demo.launch()
【PP-YOLOE-App-YAML】
APP_Info:
title: PP-YOLOE-App
colorFrom: blue
colorTo: yellow
sdk: gradio
sdk_version: 3.9.1
app_file: app.py
license: apache-2.0
device: cpu
mode: paddle
draw_threshold: 0.5
metric: COCO
use_dynamic_shape: false
arch: YOLO
min_subgraph_size: 3
param_path: paddlecv://models/ppyoloe_crn_s_300e_coco/model.pdiparams
model_path: paddlecv://models/ppyoloe_crn_s_300e_coco/model.pdmodel
Preprocess:
- interp: 2
keep_ratio: false
target_size:
- 640
- 640
type: Resize
- is_scale: true
mean:
- 0.485
- 0.456
- 0.406
std:
- 0.229
- 0.224
- 0.225
type: NormalizeImage
- type: Permute
label_list:
- person
- bicycle
- car
- motorcycle
- airplane
- bus
- train
- truck
- boat
- traffic light
- fire hydrant
- stop sign
- parking meter
- bench
- bird
- cat
- dog
- horse
- sheep
- cow
- elephant
- bear
- zebra
- giraffe
- backpack
- umbrella
- handbag
- tie
- suitcase
- frisbee
- skis
- snowboard
- sports ball
- kite
- baseball bat
- baseball glove
- skateboard
- surfboard
- tennis racket
- bottle
- wine glass
- cup
- fork
- knife
- spoon
- bowl
- banana
- apple
- sandwich
- orange
- broccoli
- carrot
- hot dog
- pizza
- donut
- cake
- chair
- couch
- potted plant
- bed
- dining table
- toilet
- tv
- laptop
- mouse
- remote
- keyboard
- cell phone
- microwave
- oven
- toaster
- sink
- refrigerator
- book
- clock
- vase
- scissors
- teddy bear
- hair drier
- toothbrush
gradio
opencv-python
paddlepaddle
PyYAML
shapely
scipy
Cython
numpy
setuptools
pillow
import cv2
import os
import numpy as np
import yaml
from paddle.inference import Config, create_predictor, PrecisionType
from PIL import Image
from .download import get_model_path
from .preprocess import preprocess, Resize, NormalizeImage, Permute, PadStride, decode_image
from .visualize import draw_det
class Detector(object):
def __init__(self, model_name):
parent_path = os.path.abspath(os.path.join(__file__, *(['..'] * 2)))
yml_file = os.path.join(parent_path, 'configs/{}.yml'.format(model_name))
with open(yml_file, 'r') as f:
yml_conf = yaml.safe_load(f)
infer_model = get_model_path(yml_conf['model_path'])
infer_params = get_model_path(yml_conf['param_path'])
config = Config(infer_model, infer_params)
device = yml_conf.get('device', 'CPU')
run_mode = yml_conf.get('mode', 'paddle')
        cpu_threads = yml_conf.get('cpu_threads', 1)
        batch_size = yml_conf.get('batch_size', 1)  # referenced by the TensorRT branch below
if device == 'CPU':
config.disable_gpu()
config.set_cpu_math_library_num_threads(cpu_threads)
elif device == 'GPU':
# initial GPU memory(M), device ID
config.enable_use_gpu(200, 0)
# optimize graph and fuse op
config.switch_ir_optim(True)
precision_map = {
'trt_int8': Config.Precision.Int8,
'trt_fp32': Config.Precision.Float32,
'trt_fp16': Config.Precision.Half
}
if run_mode in precision_map.keys():
config.enable_tensorrt_engine(
workspace_size=(1 << 25) * batch_size,
max_batch_size=batch_size,
min_subgraph_size=yml_conf['min_subgraph_size'],
precision_mode=precision_map[run_mode],
use_static=True,
use_calib_mode=False)
if yml_conf['use_dynamic_shape']:
min_input_shape = {
'image': [batch_size, 3, 640, 640],
'scale_factor': [batch_size, 2]
}
max_input_shape = {
'image': [batch_size, 3, 1280, 1280],
'scale_factor': [batch_size, 2]
}
opt_input_shape = {
'image': [batch_size, 3, 1024, 1024],
'scale_factor': [batch_size, 2]
}
config.set_trt_dynamic_shape_info(min_input_shape, max_input_shape,
opt_input_shape)
# disable print log when predict
config.disable_glog_info()
# enable shared memory
config.enable_memory_optim()
# disable feed, fetch OP, needed by zero_copy_run
config.switch_use_feed_fetch_ops(False)
self.predictor = create_predictor(config)
self.yml_conf = yml_conf
self.preprocess_ops = self.create_preprocess_ops(yml_conf)
self.input_names = self.predictor.get_input_names()
self.output_names = self.predictor.get_output_names()
self.draw_threshold = yml_conf.get('draw_threshold', 0.5)
self.class_names = yml_conf['label_list']
def create_preprocess_ops(self, yml_conf):
preprocess_ops = []
for op_info in yml_conf['Preprocess']:
new_op_info = op_info.copy()
op_type = new_op_info.pop('type')
preprocess_ops.append(eval(op_type)(**new_op_info))
return preprocess_ops
def create_inputs(self, image_files):
inputs = dict()
im_list, im_info_list = [], []
for im_path in image_files:
im, im_info = preprocess(im_path, self.preprocess_ops)
im_list.append(im)
im_info_list.append(im_info)
inputs['im_shape'] = np.stack([e['im_shape'] for e in im_info_list], axis=0).astype('float32')
inputs['scale_factor'] = np.stack([e['scale_factor'] for e in im_info_list], axis=0).astype('float32')
inputs['image'] = np.stack(im_list, axis=0).astype('float32')
return inputs
def __call__(self, image_file):
inputs = self.create_inputs([image_file])
for name in self.input_names:
input_tensor = self.predictor.get_input_handle(name)
input_tensor.copy_from_cpu(inputs[name])
self.predictor.run()
boxes_tensor = self.predictor.get_output_handle(self.output_names[0])
np_boxes = boxes_tensor.copy_to_cpu()
boxes_num = self.predictor.get_output_handle(self.output_names[1])
np_boxes_num = boxes_num.copy_to_cpu()
if np_boxes_num.sum() <= 0:
np_boxes = np.zeros([0, 6])
        if isinstance(image_file, str):
            # convert to an ndarray so draw_det can call Image.fromarray on it
            image = np.array(Image.open(image_file).convert('RGB'))
elif isinstance(image_file, np.ndarray):
image = image_file
expect_boxes = (np_boxes[:, 1] > self.draw_threshold) & (np_boxes[:, 0] > -1)
np_boxes = np_boxes[expect_boxes, :]
image = draw_det(image, np_boxes, self.class_names)
return image, {'bboxes': np_boxes.tolist()}
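
# --- Usage sketch (illustrative addition, not part of the original file) ---
# Assumes a config such as configs/PP-YOLOE.yml from this commit and a local
# test image (hypothetical path); Detector.__call__ accepts an image path or
# an RGB ndarray and returns the visualized image plus a dict of bboxes.
if __name__ == '__main__':
    detector = Detector('PP-YOLOE')
    vis_image, result = detector('demo/000000014439_640x640.jpg')
    print(result['bboxes'][:3])  # each bbox: [class_id, score, xmin, ymin, xmax, ymax]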
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import os.path as osp
import sys
import yaml
import time
import shutil
import requests
import tqdm
import hashlib
import base64
import binascii
import tarfile
import zipfile
__all__ = [
'get_model_path',
'get_config_path',
'get_dict_path',
]
WEIGHTS_HOME = osp.expanduser("~/.cache/paddlecv/models")
CONFIGS_HOME = osp.expanduser("~/.cache/paddlecv/configs")
DICTS_HOME = osp.expanduser("~/.cache/paddlecv/dicts")
# dict of {dataset_name: (download_info, sub_dirs)}
# download info: [(url, md5sum)]
DOWNLOAD_RETRY_LIMIT = 3
PMP_DOWNLOAD_URL_PREFIX = 'https://bj.bcebos.com/v1/paddle-model-ecology/paddlecv/'
def is_url(path):
"""
Whether path is URL.
Args:
path (string): URL string or not.
"""
return path.startswith('http://') \
or path.startswith('https://') \
or path.startswith('paddlecv://')
def parse_url(url):
url = url.replace("paddlecv://", PMP_DOWNLOAD_URL_PREFIX)
return url
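# e.g. parse_url('paddlecv://models/x/model.pdmodel')
#   -> 'https://bj.bcebos.com/v1/paddle-model-ecology/paddlecv/models/x/model.pdmodel'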
def get_model_path(path):
"""Get model path from WEIGHTS_HOME, if not exists,
download it from url.
"""
if not is_url(path):
return path
url = parse_url(path)
path, _ = get_path(url, WEIGHTS_HOME, path_depth=2)
return path
def get_config_path(path):
"""Get config path from CONFIGS_HOME, if not exists,
download it from url.
"""
if not is_url(path):
return path
url = parse_url(path)
path, _ = get_path(url, CONFIGS_HOME)
return path
def get_dict_path(path):
"""Get config path from CONFIGS_HOME, if not exists,
download it from url.
"""
if not is_url(path):
return path
url = parse_url(path)
path, _ = get_path(url, DICTS_HOME)
return path
def map_path(url, root_dir, path_depth=1):
# parse path after download to decompress under root_dir
assert path_depth > 0, "path_depth should be a positive integer"
dirname = url
for _ in range(path_depth):
dirname = osp.dirname(dirname)
fpath = osp.relpath(url, dirname)
path = osp.join(root_dir, fpath)
dirname = osp.dirname(path)
return path, dirname
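# e.g. map_path('https://host/a/b/c.pdparams', '/cache', path_depth=2)
#   -> ('/cache/b/c.pdparams', '/cache/b')  (illustrative URL and cache dir)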
def get_path(url, root_dir, md5sum=None, check_exist=True, path_depth=1):
""" Download from given url to root_dir.
if file or directory specified by url is exists under
root_dir, return the path directly, otherwise download
from url, return the path.
url (str): download url
root_dir (str): root dir for downloading, it should be
WEIGHTS_HOME
md5sum (str): md5 sum of download package
"""
# parse path after download to decompress under root_dir
fullpath, dirname = map_path(url, root_dir, path_depth)
if osp.exists(fullpath) and check_exist:
if not osp.isfile(fullpath) or \
_check_exist_file_md5(fullpath, md5sum, url):
return fullpath, True
else:
os.remove(fullpath)
fullname = _download(url, dirname, md5sum)
return fullpath, False
def _download(url, path, md5sum=None):
"""
Download from url, save to path.
url (str): download url
path (str): download to given path
"""
if not osp.exists(path):
os.makedirs(path)
fname = osp.split(url)[-1]
fullname = osp.join(path, fname)
retry_cnt = 0
while not (osp.exists(fullname) and _check_exist_file_md5(fullname, md5sum,
url)):
if retry_cnt < DOWNLOAD_RETRY_LIMIT:
retry_cnt += 1
else:
raise RuntimeError("Download from {} failed. "
"Retry limit reached".format(url))
# NOTE: windows path join may incur \, which is invalid in url
if sys.platform == "win32":
url = url.replace('\\', '/')
req = requests.get(url, stream=True)
if req.status_code != 200:
raise RuntimeError("Downloading from {} failed with code "
"{}!".format(url, req.status_code))
# For protecting download interupted, download to
# tmp_fullname firstly, move tmp_fullname to fullname
# after download finished
tmp_fullname = fullname + "_tmp"
total_size = req.headers.get('content-length')
with open(tmp_fullname, 'wb') as f:
if total_size:
for chunk in tqdm.tqdm(
req.iter_content(chunk_size=1024),
total=(int(total_size) + 1023) // 1024,
unit='KB'):
f.write(chunk)
else:
for chunk in req.iter_content(chunk_size=1024):
if chunk:
f.write(chunk)
shutil.move(tmp_fullname, fullname)
return fullname
def _check_exist_file_md5(filename, md5sum, url):
# if md5sum is None, and file to check is model file,
# read md5um from url and check, else check md5sum directly
return _md5check_from_url(filename, url) if md5sum is None \
and filename.endswith('pdparams') \
else _md5check(filename, md5sum)
def _md5check_from_url(filename, url):
# For model in bcebos URLs, MD5 value is contained
# in request header as 'content_md5'
req = requests.get(url, stream=True)
content_md5 = req.headers.get('content-md5')
req.close()
if not content_md5 or _md5check(
filename,
binascii.hexlify(base64.b64decode(content_md5.strip('"'))).decode(
)):
return True
else:
return False
def _md5check(fullname, md5sum=None):
if md5sum is None:
return True
md5 = hashlib.md5()
with open(fullname, 'rb') as f:
for chunk in iter(lambda: f.read(4096), b""):
md5.update(chunk)
calc_md5sum = md5.hexdigest()
if calc_md5sum != md5sum:
return False
return True
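
# --- Usage sketch (illustrative addition, not part of the original file) ---
# get_model_path resolves a paddlecv:// URL to a locally cached file,
# downloading it into ~/.cache/paddlecv/models on first use, e.g.:
# local_path = get_model_path('paddlecv://models/ppyoloe_crn_s_300e_coco/model.pdmodel')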
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import cv2
import numpy as np
def decode_image(im_file, im_info):
"""read rgb image
Args:
im_file (str|np.ndarray): input can be image path or np.ndarray
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
if isinstance(im_file, str):
with open(im_file, 'rb') as f:
im_read = f.read()
data = np.frombuffer(im_read, dtype='uint8')
im = cv2.imdecode(data, 1) # BGR mode, but need RGB mode
im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
else:
im = im_file
im_info['im_shape'] = np.array(im.shape[:2], dtype=np.float32)
im_info['scale_factor'] = np.array([1., 1.], dtype=np.float32)
return im, im_info
class Resize(object):
"""resize image by target_size and max_size
Args:
target_size (int): the target size of image
keep_ratio (bool): whether keep_ratio or not, default true
interp (int): method of resize
"""
def __init__(self, target_size, keep_ratio=True, interp=cv2.INTER_LINEAR):
if isinstance(target_size, int):
target_size = [target_size, target_size]
self.target_size = target_size
self.keep_ratio = keep_ratio
self.interp = interp
def __call__(self, im, im_info):
"""
Args:
im (np.ndarray): image (np.ndarray)
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
assert len(self.target_size) == 2
assert self.target_size[0] > 0 and self.target_size[1] > 0
im_channel = im.shape[2]
im_scale_y, im_scale_x = self.generate_scale(im)
im = cv2.resize(
im,
None,
None,
fx=im_scale_x,
fy=im_scale_y,
interpolation=self.interp)
im_info['im_shape'] = np.array(im.shape[:2]).astype('float32')
im_info['scale_factor'] = np.array(
[im_scale_y, im_scale_x]).astype('float32')
return im, im_info
def generate_scale(self, im):
"""
Args:
im (np.ndarray): image (np.ndarray)
Returns:
im_scale_x: the resize ratio of X
im_scale_y: the resize ratio of Y
"""
origin_shape = im.shape[:2]
im_c = im.shape[2]
if self.keep_ratio:
im_size_min = np.min(origin_shape)
im_size_max = np.max(origin_shape)
target_size_min = np.min(self.target_size)
target_size_max = np.max(self.target_size)
im_scale = float(target_size_min) / float(im_size_min)
if np.round(im_scale * im_size_max) > target_size_max:
im_scale = float(target_size_max) / float(im_size_max)
im_scale_x = im_scale
im_scale_y = im_scale
else:
resize_h, resize_w = self.target_size
im_scale_y = resize_h / float(origin_shape[0])
im_scale_x = resize_w / float(origin_shape[1])
return im_scale_y, im_scale_x
class NormalizeImage(object):
"""normalize image
Args:
mean (list): im - mean
std (list): im / std
is_scale (bool): whether need im / 255
norm_type (str): type in ['mean_std', 'none']
"""
def __init__(self, mean, std, is_scale=True, norm_type='mean_std'):
self.mean = mean
self.std = std
self.is_scale = is_scale
self.norm_type = norm_type
def __call__(self, im, im_info):
"""
Args:
im (np.ndarray): image (np.ndarray)
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
im = im.astype(np.float32, copy=False)
if self.is_scale:
scale = 1.0 / 255.0
im *= scale
if self.norm_type == 'mean_std':
mean = np.array(self.mean)[np.newaxis, np.newaxis, :]
std = np.array(self.std)[np.newaxis, np.newaxis, :]
im -= mean
im /= std
return im, im_info
class Permute(object):
"""permute image
Args:
to_bgr (bool): whether convert RGB to BGR
channel_first (bool): whether convert HWC to CHW
"""
def __init__(self, ):
super(Permute, self).__init__()
def __call__(self, im, im_info):
"""
Args:
im (np.ndarray): image (np.ndarray)
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
im = im.transpose((2, 0, 1)).copy()
return im, im_info
class PadStride(object):
""" padding image for model with FPN, instead PadBatch(pad_to_stride) in original config
Args:
stride (bool): model with FPN need image shape % stride == 0
"""
def __init__(self, stride=0):
self.coarsest_stride = stride
def __call__(self, im, im_info):
"""
Args:
im (np.ndarray): image (np.ndarray)
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
coarsest_stride = self.coarsest_stride
if coarsest_stride <= 0:
return im, im_info
im_c, im_h, im_w = im.shape
pad_h = int(np.ceil(float(im_h) / coarsest_stride) * coarsest_stride)
pad_w = int(np.ceil(float(im_w) / coarsest_stride) * coarsest_stride)
padding_im = np.zeros((im_c, pad_h, pad_w), dtype=np.float32)
padding_im[:, :im_h, :im_w] = im
return padding_im, im_info
def preprocess(im, preprocess_ops):
# process image by preprocess_ops
im_info = {
'scale_factor': np.array(
[1., 1.], dtype=np.float32),
'im_shape': None,
}
im, im_info = decode_image(im, im_info)
for operator in preprocess_ops:
im, im_info = operator(im, im_info)
return im, im_info
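
# --- Usage sketch (illustrative addition; mirrors Detector.create_preprocess_ops
# building the pipeline from the YAML `Preprocess` section shown earlier) ---
if __name__ == '__main__':
    dummy = np.zeros((480, 640, 3), dtype=np.uint8)  # synthetic RGB image
    ops = [
        Resize(target_size=[640, 640], keep_ratio=False, interp=2),
        NormalizeImage(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225], is_scale=True),
        Permute(),
    ]
    im, im_info = preprocess(dummy, ops)
    print(im.shape, im_info['scale_factor'])  # (3, 640, 640) [1.3333334 1.]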
import numpy as np
from PIL import Image, ImageDraw, ImageFile
def get_color_map_list(num_classes):
"""
Args:
num_classes (int): number of class
Returns:
color_map (list): RGB color list
"""
color_map = num_classes * [0, 0, 0]
for i in range(0, num_classes):
j = 0
lab = i
while lab:
color_map[i * 3] |= (((lab >> 0) & 1) << (7 - j))
color_map[i * 3 + 1] |= (((lab >> 1) & 1) << (7 - j))
color_map[i * 3 + 2] |= (((lab >> 2) & 1) << (7 - j))
j += 1
lab >>= 3
color_map = [color_map[i:i + 3] for i in range(0, len(color_map), 3)]
return color_map
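# e.g. get_color_map_list(3) -> [[0, 0, 0], [128, 0, 0], [0, 128, 0]]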
def draw_det(image, dt_bboxes, name_set):
im = Image.fromarray(image)
draw_thickness = min(im.size) // 320
draw = ImageDraw.Draw(im)
clsid2color = {}
color_list = get_color_map_list(len(name_set))
for (cls_id, score, xmin, ymin, xmax, ymax) in dt_bboxes:
cls_id = int(cls_id)
name = name_set[cls_id]
color = tuple(color_list[cls_id])
# draw bbox
draw.line(
[(xmin, ymin), (xmin, ymax), (xmax, ymax), (xmax, ymin),
(xmin, ymin)],
width=draw_thickness,
fill=color)
# draw label
text = "{} {:.4f}".format(name, score)
box = draw.textbbox((xmin, ymin), text, anchor='lt')
draw.rectangle(box, fill=color)
draw.text((box[0], box[1]), text, fill=(255, 255, 255))
image = np.array(im)
return image
## Benchmark
| Model | Epoch | GPU number | images/GPU | Backbone | input shape | Box AP<sup>val<br>0.5:0.95 | Box AP<sup>test<br>0.5:0.95 | Params(M) | FLOPs(G) | V100 FP32(FPS) | V100 TensorRT FP16(FPS) |
|:--------------:|:-----:|:-------:|:----------:|:----------:| :-------:|:--------------------------:|:---------------------------:|:---------:|:--------:|:---------------:| :---------------------: |
| PP-YOLOE-s | 300 | 8 | 32 | cspresnet-s | 640 | 43.0 | 43.2 | 7.93 | 17.36 | 208.3 | 333.3 |
| PP-YOLOE-m | 300 | 8 | 28 | cspresnet-m | 640 | 49.0 | 49.1 | 23.43 | 49.91 | 123.4 | 208.3 |
| PP-YOLOE-l | 300 | 8 | 20 | cspresnet-l | 640 | 51.4 | 51.6 | 52.20 | 110.07 | 78.1 | 149.2 |
| PP-YOLOE-x | 300 | 8 | 16 | cspresnet-x | 640 | 52.3 | 52.4 | 98.42 | 206.59 | 45.0 | 95.2 |
**Notes:**
- PP-YOLOE models are trained on train2017 of the COCO dataset and evaluated on val2017 and test-dev2017.
- PP-YOLOE is trained on 8 GPUs with mixed precision. If the **GPU number** or **batch size** changes, adjust the learning rate according to the formula **lr<sub>new</sub> = lr<sub>default</sub> * (batch_size<sub>new</sub> * GPU_number<sub>new</sub>) / (batch_size<sub>default</sub> * GPU_number<sub>default</sub>)**.
- Inference speed is measured on a single V100 with batch size 1, **CUDA 10.2** and **CUDNN 7.6.5**; TensorRT speed tests use **TensorRT 6.0.1.8**.
- Refer to [Speed testing](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/configs/ppyoloe/README_cn.md#%E9%80%9F%E5%BA%A6%E6%B5%8B%E8%AF%95) to reproduce the PP-YOLOE speed-test results.
- If you set `--run_benchmark=True`, first install the dependencies: `pip install pynvml psutil GPUtil`.
## Benchmark
| Model | Epoch | GPU number | images/GPU | backbone | input shape | Box AP<sup>val<br>0.5:0.95 | Box AP<sup>test<br>0.5:0.95 | Params(M) | FLOPs(G) | V100 FP32(FPS) | V100 TensorRT FP16(FPS) |
|:--------------:|:-----:|:-------:|:----------:|:----------:| :-------:|:--------------------------:|:---------------------------:|:---------:|:--------:|:---------------:| :---------------------: |
| PP-YOLOE-s | 300 | 8 | 32 | cspresnet-s | 640 | 43.0 | 43.2 | 7.93 | 17.36 | 208.3 | 333.3 |
| PP-YOLOE-m | 300 | 8 | 28 | cspresnet-m | 640 | 49.0 | 49.1 | 23.43 | 49.91 | 123.4 | 208.3 |
| PP-YOLOE-l | 300 | 8 | 20 | cspresnet-l | 640 | 51.4 | 51.6 | 52.20 | 110.07 | 78.1 | 149.2 |
| PP-YOLOE-x | 300 | 8 | 16 | cspresnet-x | 640 | 52.3 | 52.4 | 98.42 | 206.59 | 45.0 | 95.2 |
**Notes:**
- PP-YOLOE is trained on the COCO train2017 dataset and evaluated on the val2017 and test-dev2017 datasets.
- PP-YOLOE is trained on 8 GPUs with mixed precision. If the **GPU number** or **mini-batch size** changes, the **learning rate** should be adjusted according to the formula **lr<sub>new</sub> = lr<sub>default</sub> * (batch_size<sub>new</sub> * GPU_number<sub>new</sub>) / (batch_size<sub>default</sub> * GPU_number<sub>default</sub>)**.
- PP-YOLOE inference speed is tested on a single Tesla V100 with batch size 1, **CUDA 10.2**, **CUDNN 7.6.5**, and **TensorRT 6.0.1.8** in TensorRT mode.
- Refer to [Speed testing](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/configs/ppyoloe/README.md#speed-testing) to reproduce the speed testing results of PP-YOLOE.
- If you set `--run_benchmark=True`, first install the dependencies: `pip install pynvml psutil GPUtil`.
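
The scaling rule above can be applied mechanically. Below is a minimal sketch; the defaults of 8 GPUs and 20 images/GPU match the PP-YOLOE-l row above, while `lr_default=0.01` is a hypothetical value for illustration only:

```
def scale_lr(lr_default, bs_new, gpus_new, bs_default=20, gpus_default=8):
    # lr_new = lr_default * (bs_new * gpus_new) / (bs_default * gpus_default)
    return lr_default * (bs_new * gpus_new) / (bs_default * gpus_default)

print(scale_lr(0.01, bs_new=8, gpus_new=4))  # 0.002
```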
## Model Zoo
| Model | Backbone | Download | Config |
|:-----------:|:-------:|:-------:|:-------:|
| PP-YOLOE-s | cspresnet-s | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_s_300e_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5/configs/ppyoloe/ppyoloe_crn_s_300e_coco.yml) |
| PP-YOLOE-m | cspresnet-m | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_m_300e_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5/configs/ppyoloe/ppyoloe_crn_m_300e_coco.yml) |
| PP-YOLOE-l | cspresnet-l | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5/configs/ppyoloe/ppyoloe_crn_l_300e_coco.yml) |
| PP-YOLOE-x | cspresnet-x | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_x_300e_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5/configs/ppyoloe/ppyoloe_crn_x_300e_coco.yml) |
## Model Zoo
| Model | Backbone | Download | Config |
|:-----------:|:-------:|:-------:|:-------:|
| PP-YOLOE-s | cspresnet-s | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_s_300e_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5/configs/ppyoloe/ppyoloe_crn_s_300e_coco.yml) |
| PP-YOLOE-m | cspresnet-m | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_m_300e_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5/configs/ppyoloe/ppyoloe_crn_m_300e_coco.yml) |
| PP-YOLOE-l | cspresnet-l | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5/configs/ppyoloe/ppyoloe_crn_l_300e_coco.yml) |
| PP-YOLOE-x | cspresnet-x | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_x_300e_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5/configs/ppyoloe/ppyoloe_crn_x_300e_coco.yml) |
---
Model_Info:
name: "PP-YOLOE"
description: "PP-YOLOE是基于PP-YOLOv2的卓越的单阶段Anchor-free模型,超越了多种流行的YOLO模型"
description_en: "PP-YOLOE is an excellent single-stage anchor-free model based on\
\ PP-YOLOv2, surpassing a variety of popular YOLO models"
icon: "@后续UE统一设计之后,会存到bos上某个位置"
from_repo: "PaddleDetection"
Task:
- tag_en: "Computer Vision"
tag: "计算机视觉"
sub_tag_en: "Object Detection"
sub_tag: "目标检测"
Example:
- tag_en: "Intelligent Transportation"
tag: "智慧交通"
sub_tag_en: "Object Detection"
title: "基于PP-YOLOE的雾天行人车辆目标检测"
sub_tag: "目标检测"
url: "https://aistudio.baidu.com/aistudio/projectdetail/4228391"
Datasets: "COCO test-dev2017, COCO train2017, COCO val2017, Pascal VOC"
Publisher: "Baidu"
License: "apache-2.0"
Paper:
- title: "PP-YOLOE: An evolved version of YOLO"
url: "https://arxiv.org/pdf/2203.16250v2.pdf"
IfTraining: 1
IfOnlineDemo: 1
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. 模型简介\n",
"PP-YOLOE是基于PP-YOLOv2的卓越的单阶段Anchor-free模型,超越了多种流行的YOLO模型。PP-YOLOE有一系列的模型,即s/m/l/x,可以通过width multiplier和depth multiplier配置。PP-YOLOE避免了使用诸如Deformable Convolution或者Matrix NMS之类的特殊算子,以使其能轻松地部署在多种多样的硬件上。关于PP-YOLOE的更多细节可以参考我们的[官方文档](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/configs/ppyoloe/README_cn.md)。"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. 模型效果\n",
"PP-YOLOE-l在COCO test-dev2017达到了51.6的mAP, 同时其速度在Tesla V100上达到了78.1 FPS。如下图所示,PP-YOLOE-s/m/x同样具有卓越的精度速度性价比。\n",
"<div align=\"center\">\n",
" <img src=\"https://paddledet.bj.bcebos.com/modelcenter/images/ppyoloe_map_fps.png\" width=500 />\n",
"</div>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. 模型如何使用\n",
"首先克隆PaddleDetection,并将数据集存放在`dataset/coco/`目录下面"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"%cd ~/work\n",
"!git clone https://gitee.com/paddlepaddle/PaddleDetection\n",
"%cd PaddleDetection"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 3.1 训练\n",
"执行以下指令训练PP-YOLOE"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"# 单卡训练\n",
"!python tools/train.py -c configs/ppyoloe/ppyoloe_crn_l_300e_coco.yml --eval --amp\n",
"\n",
"# 多卡训练\n",
"!python -m paddle.distributed.launch --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/ppyoloe/ppyoloe_crn_l_300e_coco.yml --eval --amp"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**注意:**\n",
"- 如果需要边训练边评估,请添加`--eval`.\n",
"- PP-YOLOE支持混合精度训练,请添加`--amp`."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 3.2 推理部署\n",
"#### 3.2.1 使用Paddle-TRT进行推理部署\n",
"接下来,我们将介绍PP-YOLOE如何使用Paddle Inference在TensorRT FP16模式下部署\n",
"\n",
"首先,参考[Paddle Inference文档](https://www.paddlepaddle.org.cn/inference/master/user_guides/download_lib.html#python),下载并安装与你的CUDA, CUDNN和TensorRT相应的wheel包。\n",
"\n",
"然后,运行以下命令导出模型"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"!python tools/export_model.py -c configs/ppyoloe/ppyoloe_crn_l_300e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams trt=True"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"最后,使用TensorRT FP16进行推理"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"# 推理单张图片\n",
"!CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_crn_l_300e_coco --image_file=demo/000000014439_640x640.jpg --device=gpu --run_mode=trt_fp16\n",
"\n",
"# 推理文件夹下的所有图片\n",
"!CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_crn_l_300e_coco --image_dir=demo/ --device=gpu --run_mode=trt_fp16"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**注意:**\n",
"- TensorRT会根据网络的定义,执行针对当前硬件平台的优化,生成推理引擎并序列化为文件。该推理引擎只适用于当前软硬件平台。如果你的软硬件平台没有发生变化,你可以设置[enable_tensorrt_engine](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/deploy/python/infer.py#L857)的参数`use_static=True`,这样生成的序列化文件将会保存在`output_inference`文件夹下,下次执行TensorRT时将加载保存的序列化文件。\n",
"- PaddleDetection release/2.4及其之后的版本将支持NMS调用TensorRT,需要依赖PaddlePaddle release/2.3及其之后的版本。"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 3.2.2 使用PaddleInference进行推理部署\n",
"对于一些不支持TensorRT的AI加速硬件,我们可以直接使用PaddleInference进行部署。首先运行以下命令导出模型"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"!python tools/export_model.py -c configs/ppyoloe/ppyoloe_crn_l_300e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"直接使用PaddleInference进行推理"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"# 推理单张图片\n",
"!CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_crn_l_300e_coco --image_file=demo/000000014439_640x640.jpg --device=gpu --run_mode=paddle\n",
"\n",
"# 推理文件夹下的所有图片\n",
"!CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_crn_l_300e_coco --image_dir=demo/ --device=gpu --run_mode=paddle"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. 模型原理\n",
"PP-YOLOE的整体结构图如下所示:\n",
"<div align=\"center\">\n",
" <img src=\"https://paddledet.bj.bcebos.com/modelcenter/images/PP-YOLOE-Arch.png\" width=70% />\n",
"</div>\n",
"PP-YOLOE由以下方法组成:\n",
"\n",
"- 可扩展的backbone和neck\n",
"- [Task Alignment Learning](https://arxiv.org/abs/2108.07755)\n",
"- Efficient Task-aligned head with [DFL](https://arxiv.org/abs/2006.04388)和[VFL](https://arxiv.org/abs/2008.13367)\n",
"- [SiLU(Swish)激活函数](https://arxiv.org/abs/1710.05941)\n",
"\n",
"更多细节可以参考我们的技术报告:https://arxiv.org/abs/2203.16250"
]
},
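  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick illustration of the last item (an added sketch, not from the original notebook), SiLU is simply x * sigmoid(x):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "vscode": {
     "languageId": "plaintext"
    }
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "def silu(x):\n",
    "    # SiLU / Swish: x * sigmoid(x) == x / (1 + exp(-x))\n",
    "    return x / (1.0 + np.exp(-x))\n",
    "\n",
    "print(silu(np.array([-1.0, 0.0, 1.0])))  # [-0.26894142  0.          0.73105858]"
   ]
  },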
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5. 注意事项\n",
"**所有的命令默认运行在AI Studio的`jupyter`上, 如果运行在终端上,去掉命令开头的符号%或!**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6. 相关论文及引用信息\n",
"```\n",
"@article{xu2022pp,\n",
" title={PP-YOLOE: An evolved version of YOLO},\n",
" author={Xu, Shangliang and Wang, Xinxin and Lv, Wenyu and Chang, Qinyao and Cui, Cheng and Deng, Kaipeng and Wang, Guanzhong and Dang, Qingqing and Wei, Shengyu and Du, Yuning and others},\n",
" journal={arXiv preprint arXiv:2203.16250},\n",
" year={2022}\n",
"}\n",
"```"
]
}
],
"metadata": {
"language_info": {
"name": "python"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}
import gradio as gr
import numpy as np
import os
from src.detection import Detector

# UGC: Define the inference fn() for your models
def model_inference(image):
    image, json_out = Detector('PP-YOLOv2')(image)
    return image, json_out

...

with gr.Blocks() as demo:
    with gr.Column(scale=1, min_width=100):
        img_in = gr.Image(label="Input").style(height=200)
        with gr.Row():
            btn1 = gr.Button("Clear")
            btn2 = gr.Button("Submit")
        img_out = gr.Image(label="Output").style(height=200)
        json_out = gr.JSON(label="jsonOutput")
    btn2.click(fn=model_inference, inputs=img_in, outputs=[img_out, json_out])

...
APP_Info:
  colorFrom: blue
  colorTo: yellow
  sdk: gradio
  sdk_version: 3.9.1
  app_file: app.py
  license: apache-2.0
  device: cpu
mode: paddle
draw_threshold: 0.5
metric: COCO
use_dynamic_shape: false
arch: YOLO
min_subgraph_size: 3
param_path: paddlecv://models/ppyolov2_r50vd_dcn_365e_coco/model.pdiparams
model_path: paddlecv://models/ppyolov2_r50vd_dcn_365e_coco/model.pdmodel
Preprocess:
- interp: 2
keep_ratio: false
target_size:
- 640
- 640
type: Resize
- is_scale: true
mean:
- 0.485
- 0.456
- 0.406
std:
- 0.229
- 0.224
- 0.225
type: NormalizeImage
- type: Permute
label_list:
- person
- bicycle
- car
- motorcycle
- airplane
- bus
- train
- truck
- boat
- traffic light
- fire hydrant
- stop sign
- parking meter
- bench
- bird
- cat
- dog
- horse
- sheep
- cow
- elephant
- bear
- zebra
- giraffe
- backpack
- umbrella
- handbag
- tie
- suitcase
- frisbee
- skis
- snowboard
- sports ball
- kite
- baseball bat
- baseball glove
- skateboard
- surfboard
- tennis racket
- bottle
- wine glass
- cup
- fork
- knife
- spoon
- bowl
- banana
- apple
- sandwich
- orange
- broccoli
- carrot
- hot dog
- pizza
- donut
- cake
- chair
- couch
- potted plant
- bed
- dining table
- toilet
- tv
- laptop
- mouse
- remote
- keyboard
- cell phone
- microwave
- oven
- toaster
- sink
- refrigerator
- book
- clock
- vase
- scissors
- teddy bear
- hair drier
- toothbrush
gradio
opencv-python
paddlepaddle
PyYAML
shapely
scipy
Cython
numpy
setuptools
pillow
...

### 1.3 Benchmark

| Model | GPU number | images/GPU | Backbone | input shape | Box AP<sup>val</sup> | Box AP<sup>test</sup> |
|:--------------------:|:-------:|:-------------:|:----------:| :-------:| :-------------: | :-------------: |
| PP-YOLOv2 | 8 | 12 | ResNet50vd | 640 | 49.1 | 49.5 |
| PP-YOLOv2 | 8 | 12 | ResNet101vd | 640 | 49.7 | 50.3 |

## 2. Inference Benchmark

...

The PP-YOLO model uses train2017 of the COCO dataset as the training set and val2017 and test-dev2017 as the test sets.

### 2.3 Benchmark

| Model | Backbone | input shape | V100 FP32(FPS) | V100 TensorRT FP16(FPS) |
|:--------------------:|:-------:|:-------------:|:----------:| :-------:|
| PP-YOLOv2 | ResNet50vd | 640 | 68.9 | 106.5 |
| PP-YOLOv2 | ResNet101vd | 640 | 49.5 | 87.0 |

On the COCO test set, PP-YOLOv2 (R50) mAP rises from 45.9% to 49.5%, an increase of 3.6 percentage points over v1. It reaches 68.9 FPS in FP32 and 106.5 FPS in FP16, surpassing YOLOv4 and even YOLOv5. With ResNet101 as the backbone, PP-YOLOv2 (R101) reaches 50.3% mAP, 15.9% faster than YOLOv5x at the same accuracy.

![](https://raw.githubusercontent.com/PaddlePaddle/PaddleDetection/release/2.4/docs/images/ppyolo_map_fps.png)

...