Unverified commit 04a71f22, authored by wangguanzhong, committed by GitHub

add paddlecv (#5604)

Parent e75f74fb
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
include LICENSE
include README.md
include ppcv/model_zoo/MODEL_ZOO
include paddlecv.py
recursive-include configs/ *.*
recursive-include ppcv/ *.py
recursive-include tools/ *.py
recursive-include tests/ *.py
# PaddleCV: the PaddlePaddle Vision Model Library

PaddleCV is a unified inference, deployment, and model-pipelining system for vision models built on PaddlePaddle. It covers the mainstream vision tasks of image classification, object detection, image segmentation, and OCR, and includes the PP-series models from the PaddlePaddle model libraries, such as PP-LCNet, PP-YOLOE, PP-OCR, and PP-LiteSeg. Users can run inference quickly by installing the package, and flexible secondary development is also supported.
## <img src="https://user-images.githubusercontent.com/48054808/157827140-03ffaff7-7d14-48b4-9440-c38986ea378c.png" width="20"/> Supported Models

| Single Model / Pipeline | Task | Model Name |
|:------:|:-----:|:------:|
| Single Model | Image Classification | PP-LCNet |
| Single Model | Image Classification | PP-LCNet v2 |
| Single Model | Image Classification | PP-HGNet |
| Single Model | Object Detection | PP-YOLO |
| Single Model | Object Detection | PP-YOLO v2 |
| Single Model | Object Detection | PP-YOLOE |
| Single Model | Object Detection | PP-YOLOE+ |
| Single Model | Object Detection | PP-PicoDet |
| Single Model | Image Segmentation | PP-HumanSeg v2 |
| Single Model | Image Segmentation | PP-LiteSeg |
| Single Model | Image Segmentation | PP-Matting v1 |
| Pipeline | OCR | PP-OCR v2 |
| Pipeline | OCR | PP-OCR v3 |
| Pipeline | OCR | PP-Structure |
| Pipeline | Image Recognition | PP-ShiTu |
| Pipeline | Image Recognition | PP-ShiTu v2 |
| Pipeline | Pedestrian Analysis | PP-Human |
| Pipeline | Vehicle Analysis | PP-Vehicle |
## <img src="https://user-images.githubusercontent.com/48054808/157828296-d5eb0ccb-23ea-40f5-9957-29853d7d13a9.png" width="20"/> Documentation

- [Installation](docs/INSTALL.md)
- [Getting Started](docs/GETTING_STARTED.md)
- [Configuration File Reference](docs/config_anno.md)
- Secondary Development
  - [System Design](docs/system_design.md)
  - [Adding a New Op](docs/how_to_add_new_op.md)
  - [External Custom Ops](docs/custom_ops.md)

## <img title="" src="https://user-images.githubusercontent.com/48054808/157835345-f5d24128-abaf-4813-b793-d2e5bdc70e5a.png" alt="" width="20"> License

This project is released under the [Apache 2.0 license](LICENSE).
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from .paddlecv import *
__version__ = paddlecv.VERSION
ENV:
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: True
save_res: True
return_res: True
MODEL:
- ClassificationOp:
name: cls
param_path: paddlecv://models/PPHGNet_small_infer/inference.pdiparams
model_path: paddlecv://models/PPHGNet_small_infer/inference.pdmodel
batch_size: 8
PreProcess:
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 0.00392157
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
channel_num: 3
- ToCHWImage:
- ExpandDim:
axis: 0
PostProcess:
- Topk:
topk: 5
class_id_map_file: "paddlecv://dict/classification/imagenet1k_label_list.txt"
Inputs:
- input.image
- ClasOutput:
name: vis
Inputs:
- input.fn
- input.image
- cls.class_ids
- cls.scores
- cls.label_names
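In configs like the one above, each `Inputs` entry is an `<op_name>.<output_key>` reference: `input.*` names the pipeline's raw input, and `cls.class_ids` names the `class_ids` output of the op named `cls`. The following is a minimal, hypothetical sketch of that resolution rule (not the shipped ppcv implementation); the `results` values are made-up placeholders:

```python
def resolve_inputs(input_specs, results):
    """Look up each "op_name.output_key" reference in the accumulated results."""
    resolved = []
    for spec in input_specs:
        op_name, key = spec.split(".", 1)
        resolved.append(results[op_name][key])
    return resolved

# Simulated pipeline state after the "cls" op has run on one image.
results = {
    "input": {"fn": "demo.jpg", "image": "<decoded image>"},
    "cls": {"class_ids": [8], "scores": [0.91], "label_names": ["hen"]},
}

# The ClasOutput op above declares references of exactly this shape:
specs = ["input.fn", "cls.class_ids", "cls.scores", "cls.label_names"]
print(resolve_inputs(specs, results))
```

This is why op `name` values must be unique within a config: they form the namespace that downstream ops resolve against.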
ENV:
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: True
save_res: True
return_res: True
MODEL:
- SegmentationOp:
name: seg
param_path: paddlecv://models/PP_HumanSegV2_256x144_with_Softmax/model.pdiparams
model_path: paddlecv://models/PP_HumanSegV2_256x144_with_Softmax/model.pdmodel
batch_size: 8
PreProcess:
- Resize:
target_size: [256, 144]
- Normalize:
scale: 0.00392157
mean: [0.5, 0.5, 0.5]
std: [0.5, 0.5, 0.5]
order: ''
- ToCHWImage
- ExpandDim
PostProcess:
- SegPostProcess
Inputs:
- input.image
- HumanSegOutput:
name: out
Inputs:
- input.fn
- input.image
- seg.seg_map
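The `SegPostProcess` step above reduces the network's per-class score map to the per-pixel `seg_map` consumed by `HumanSegOutput`. A self-contained sketch of the assumed reduction (argmax over the class channel; the real op lives in ppcv and also handles resizing back to the input size):

```python
def argmax_seg(score_map):
    """score_map: [num_classes][H][W] list -> per-pixel label map [H][W]."""
    num_classes = len(score_map)
    h, w = len(score_map[0]), len(score_map[0][0])
    return [[max(range(num_classes), key=lambda c: score_map[c][y][x])
             for x in range(w)]
            for y in range(h)]

# Two classes (background, person) on a tiny 1x2 "image".
scores = [[[0.9, 0.2]], [[0.1, 0.8]]]
print(argmax_seg(scores))
```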
ENV:
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: True
save_res: True
return_res: True
MODEL:
- ClassificationOp:
name: cls
param_path: paddlecv://models/PPLCNet_x1_0_infer/inference.pdiparams
model_path: paddlecv://models/PPLCNet_x1_0_infer/inference.pdmodel
batch_size: 8
PreProcess:
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 0.00392157
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
channel_num: 3
- ToCHWImage:
- ExpandDim:
axis: 0
PostProcess:
- Topk:
topk: 5
class_id_map_file: "paddlecv://dict/classification/imagenet1k_label_list.txt"
Inputs:
- input.image
- ClasOutput:
name: vis
Inputs:
- input.fn
- input.image
- cls.class_ids
- cls.scores
- cls.label_names
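The `Topk` post-process above (with `topk: 5`) turns the model's score vector into the `class_ids`, `scores`, and `label_names` outputs referenced by `ClasOutput`. A pure-Python sketch of that selection, shown on a toy 3-class score vector:

```python
def topk(scores, labels, k=5):
    """Return the indices, scores, and labels of the k highest scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return ([i for i in order],
            [scores[i] for i in order],
            [labels[i] for i in order])

print(topk([0.1, 0.7, 0.2], ["cat", "dog", "bird"], k=2))
```

In the real pipeline the labels come from `class_id_map_file` (the ImageNet-1k label list) rather than an inline list.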
ENV:
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: True
save_res: True
return_res: True
MODEL:
- ClassificationOp:
name: cls
param_path: paddlecv://models/PPLCNetV2_base_infer/inference.pdiparams
model_path: paddlecv://models/PPLCNetV2_base_infer/inference.pdmodel
batch_size: 8
PreProcess:
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 0.00392157
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
channel_num: 3
- ToCHWImage:
- ExpandDim:
axis: 0
PostProcess:
- Topk:
topk: 5
class_id_map_file: "paddlecv://dict/classification/imagenet1k_label_list.txt"
Inputs:
- input.image
- ClasOutput:
name: vis
Inputs:
- input.fn
- input.image
- cls.class_ids
- cls.scores
- cls.label_names
ENV:
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: True
save_res: True
return_res: True
MODEL:
- SegmentationOp:
name: seg
param_path: paddlecv://models/PP_LiteSeg_Cityscapes/model.pdiparams
model_path: paddlecv://models/PP_LiteSeg_Cityscapes/model.pdmodel
batch_size: 8
PreProcess:
- Normalize:
scale: 0.00392157
mean: [0.5, 0.5, 0.5]
std: [0.5, 0.5, 0.5]
order: ''
- ToCHWImage
- ExpandDim
PostProcess:
- SegPostProcess
Inputs:
- input.image
- SegOutput:
name: out
Inputs:
- input.fn
- input.image
- seg.seg_map
ENV:
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: True
save_res: True
return_res: True
MODEL:
- SegmentationOp:
name: seg
param_path: paddlecv://models/PP_MattingV1/model.pdiparams
model_path: paddlecv://models/PP_MattingV1/model.pdmodel
batch_size: 8
PreProcess:
- ResizeByShort:
resize_short: 512
size_divisor: 32
- Normalize:
scale: 0.00392157
mean: [0.5, 0.5, 0.5]
std: [0.5, 0.5, 0.5]
order: ''
- ToCHWImage
- ExpandDim
PostProcess:
- SegPostProcess
Inputs:
- input.image
- MattingOutput:
name: out
Inputs:
- input.fn
- input.image
- seg.seg_map
image_shape: &image_shape 320
ENV:
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: True
save_res: True
return_res: True
MODEL:
- DetectionOp:
name: det
param_path: paddlecv://models/picodet_s_320_coco_lcnet/model.pdiparams
model_path: paddlecv://models/picodet_s_320_coco_lcnet/model.pdmodel
batch_size: 2
image_shape: [3, *image_shape, *image_shape]
PreProcess:
- Resize:
interp: 2
keep_ratio: false
target_size: [*image_shape, *image_shape]
- NormalizeImage:
is_scale: true
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
- Permute:
PostProcess:
- ParserDetResults:
label_list: paddlecv://dict/detection/coco_label_list.json
threshold: 0.5
Inputs:
- input.image
- DetOutput:
name: vis
Inputs:
- input.fn
- input.image
- det.dt_bboxes
- det.dt_scores
- det.dt_cls_names
image_shape: &image_shape 608
ENV:
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: True
save_res: True
return_res: True
MODEL:
- DetectionOp:
name: det
param_path: paddlecv://models/ppyolo_r50vd_dcn_2x_coco/model.pdiparams
model_path: paddlecv://models/ppyolo_r50vd_dcn_2x_coco/model.pdmodel
batch_size: 1
image_shape: [3, *image_shape, *image_shape]
PreProcess:
- Resize:
interp: 2
keep_ratio: false
target_size: [*image_shape, *image_shape]
- NormalizeImage:
is_scale: true
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
- Permute:
PostProcess:
- ParserDetResults:
label_list: paddlecv://dict/detection/coco_label_list.json
threshold: 0.5
Inputs:
- input.image
- DetOutput:
name: vis
Inputs:
- input.fn
- input.image
- det.dt_bboxes
- det.dt_scores
- det.dt_cls_names
image_shape: &image_shape 640
ENV:
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: True
save_res: True
return_res: True
MODEL:
- DetectionOp:
name: det
param_path: paddlecv://models/ppyoloe_plus_crn_l_80e_coco/model.pdiparams
model_path: paddlecv://models/ppyoloe_plus_crn_l_80e_coco/model.pdmodel
batch_size: 1
image_shape: [3, *image_shape, *image_shape]
PreProcess:
- Resize:
interp: 2
keep_ratio: false
target_size: [*image_shape, *image_shape]
- NormalizeImage:
is_scale: true
mean: [0., 0., 0.]
std: [1., 1., 1.]
norm_type: null
- Permute:
PostProcess:
- ParserDetResults:
label_list: paddlecv://dict/detection/coco_label_list.json
threshold: 0.5
Inputs:
- input.image
- DetOutput:
name: vis
Inputs:
- input.fn
- input.image
- det.dt_bboxes
- det.dt_scores
- det.dt_cls_names
image_shape: &image_shape 640
ENV:
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: True
save_res: True
return_res: True
MODEL:
- DetectionOp:
name: det
param_path: paddlecv://models/ppyoloe_crn_l_300e_coco/model.pdiparams
model_path: paddlecv://models/ppyoloe_crn_l_300e_coco/model.pdmodel
batch_size: 1
image_shape: [3, *image_shape, *image_shape]
PreProcess:
- Resize:
interp: 2
keep_ratio: false
target_size: [*image_shape, *image_shape]
- NormalizeImage:
is_scale: true
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
- Permute:
PostProcess:
- ParserDetResults:
label_list: paddlecv://dict/detection/coco_label_list.json
threshold: 0.5
Inputs:
- input.image
- DetOutput:
name: vis
Inputs:
- input.fn
- input.image
- det.dt_bboxes
- det.dt_scores
- det.dt_cls_names
image_shape: &image_shape 640
ENV:
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: True
save_res: True
return_res: True
MODEL:
- DetectionOp:
name: det
param_path: paddlecv://models/ppyolov2_r50vd_dcn_365e_coco/model.pdiparams
model_path: paddlecv://models/ppyolov2_r50vd_dcn_365e_coco/model.pdmodel
batch_size: 1
image_shape: [3, *image_shape, *image_shape]
PreProcess:
- Resize:
interp: 2
keep_ratio: false
target_size: [*image_shape, *image_shape]
- NormalizeImage:
is_scale: true
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
- Permute:
PostProcess:
- ParserDetResults:
label_list: paddlecv://dict/detection/coco_label_list.json
threshold: 0.5
Inputs:
- input.image
- DetOutput:
name: vis
Inputs:
- input.fn
- input.image
- det.dt_bboxes
- det.dt_scores
- det.dt_cls_names
image_shape: &image_shape 640
ENV:
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: True
save_res: False
return_res: True
MODEL:
- DetectionOp:
name: det
param_path: paddlecv://models/mot_ppyoloe_l_36e_pphuman/model.pdiparams
model_path: paddlecv://models/mot_ppyoloe_l_36e_pphuman/model.pdmodel
batch_size: 1
image_shape: [3, *image_shape, *image_shape]
PreProcess:
- Resize:
interp: 2
keep_ratio: false
target_size: [*image_shape, *image_shape]
- Permute:
PostProcess:
- ParserDetResults:
label_list:
- pedestrian
threshold: 0.1
Inputs:
- input.image
- TrackerOP:
name: tracker
type: OCSORTTracker
tracker_configs:
det_thresh: 0.4
max_age: 30
min_hits: 3
iou_threshold: 0.3
delta_t: 3
inertia: 0.2
vertical_ratio: 0
min_box_area: 0
use_byte: False
PostProcess:
- ParserTrackerResults:
label_list:
- pedestrian
Inputs:
- det.dt_bboxes
- det.dt_scores
- det.dt_class_ids
- TrackerOutput:
name: vis
Inputs:
- input.fn
- input.image
- tracker.tk_bboxes
- tracker.tk_scores
- tracker.tk_ids
- tracker.tk_cls_ids
- tracker.tk_cls_names
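The OC-SORT tracker above gates detection-to-track association with `iou_threshold: 0.3`. The IoU it thresholds is the standard intersection-over-union of axis-aligned boxes; a standalone sketch:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Two boxes sharing half their width: IoU is 50 / 150 = 1/3, above the gate.
print(iou([0, 0, 10, 10], [5, 0, 15, 10]))
```

Pairs below the threshold are left unmatched, which is where `max_age` (frames a track survives without a match) and `min_hits` (matches before a track is reported) come into play.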
ENV:
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: True
save_res: True
return_res: True
MODEL:
- OcrDbDetOp:
name: det
param_path: paddlecv://models/ch_PP-OCRv2_det_infer/inference.pdiparams
model_path: paddlecv://models/ch_PP-OCRv2_det_infer/inference.pdmodel
batch_size: 1
PreProcess:
- RGB2BGR:
- DetResizeForTest:
limit_side_len: 960
limit_type: "max"
- NormalizeImage:
std: [0.229, 0.224, 0.225]
mean: [0.485, 0.456, 0.406]
scale: '1./255.'
order: 'hwc'
- ToCHWImage:
- ExpandDim:
axis: 0
- KeepKeys:
keep_keys: ['image', 'shape']
PostProcess:
- DBPostProcess:
thresh: 0.3
box_thresh: 0.6
max_candidates: 1000
unclip_ratio: 1.5
use_dilation: False
score_mode: "fast"
box_type: "quad"
Inputs:
- input.image
- PolyCropOp:
name: crop
Inputs:
- input.image
- det.dt_polys
- OcrCrnnRecOp:
name: rec
param_path: paddlecv://models/ch_PP-OCRv2_rec_infer/inference.pdiparams
model_path: paddlecv://models/ch_PP-OCRv2_rec_infer/inference.pdmodel
batch_size: 6
PreProcess:
- RGB2BGR:
- ReisizeNormImg:
rec_image_shape: [3, 32, 320]
PostProcess:
- CTCLabelDecode:
character_dict_path: "paddlecv://dict/ocr/ch_dict.txt"
use_space_char: true
Inputs:
- crop.crop_image
- OCROutput:
name: vis
font_path: paddlecv://fonts/simfang.ttf
Inputs:
- input.fn
- input.image
- det.dt_polys
- rec.rec_text
- rec.rec_score
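The detector's `NormalizeImage` step above applies `scale: '1./255.'` and then standardizes each channel with the ImageNet mean/std, in `hwc` order. A pure-Python sketch of the arithmetic on a single pixel (the real op works on whole arrays):

```python
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]

def normalize_pixel(pixel):
    """Per channel: scale to [0, 1] by 1/255, then (x - mean) / std."""
    return [((v / 255.0) - m) / s for v, m, s in zip(pixel, MEAN, STD)]

print(normalize_pixel([255, 255, 255]))  # a white pixel
```

`ToCHWImage` and `ExpandDim` then transpose the normalized HWC array to CHW and add the batch axis the model expects.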
ENV:
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: True
save_res: True
return_res: True
MODEL:
- OcrDbDetOp:
name: det
param_path: paddlecv://models/ch_PP-OCRv3_det_infer/inference.pdiparams
model_path: paddlecv://models/ch_PP-OCRv3_det_infer/inference.pdmodel
batch_size: 1
PreProcess:
- RGB2BGR:
- DetResizeForTest:
limit_side_len: 960
limit_type: "max"
- NormalizeImage:
std: [0.229, 0.224, 0.225]
mean: [0.485, 0.456, 0.406]
scale: '1./255.'
order: 'hwc'
- ToCHWImage:
- ExpandDim:
axis: 0
- KeepKeys:
keep_keys: ['image', 'shape']
PostProcess:
- DBPostProcess:
thresh: 0.3
box_thresh: 0.6
max_candidates: 1000
unclip_ratio: 1.5
use_dilation: False
score_mode: "fast"
box_type: "quad"
Inputs:
- input.image
- PolyCropOp:
name: crop
Inputs:
- input.image
- det.dt_polys
- OcrCrnnRecOp:
name: rec
param_path: paddlecv://models/ch_PP-OCRv3_rec_infer/inference.pdiparams
model_path: paddlecv://models/ch_PP-OCRv3_rec_infer/inference.pdmodel
batch_size: 6
PreProcess:
- RGB2BGR:
- ReisizeNormImg:
rec_image_shape: [3, 48, 320]
PostProcess:
- CTCLabelDecode:
character_dict_path: "paddlecv://dict/ocr/ch_dict.txt"
use_space_char: true
Inputs:
- crop.crop_image
- OCROutput:
name: vis
font_path: paddlecv://fonts/simfang.ttf
Inputs:
- input.fn
- input.image
- det.dt_polys
- rec.rec_text
- rec.rec_score
ENV:
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: True
save_res: True
return_res: True
print_res: True
MODEL:
- DetectionOp:
name: det
param_path: paddlecv://models/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer/inference.pdiparams
model_path: paddlecv://models/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer/inference.pdmodel
batch_size: 2
image_shape: [3, 640, 640]
PreProcess:
- Resize:
interp: 2
keep_ratio: false
target_size: [640, 640]
- NormalizeImage:
is_scale: true
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
- Permute:
PostProcess:
- ParserDetResults:
label_list:
- foreground
threshold: 0.2
max_det_results: 5
Inputs:
- input.image
- BboxCropOp:
name: crop
Inputs:
- input.image
- det.dt_bboxes
- FeatureExtractionOp:
name: feature
param_path: paddlecv://models/general_PPLCNet_x2_5_lite_v1.0_infer/inference.pdiparams
model_path: paddlecv://models/general_PPLCNet_x2_5_lite_v1.0_infer/inference.pdmodel
batch_size: 2
PreProcess:
- ResizeImage:
size: [224, 224]
return_numpy: False
interpolation: bilinear
backend: cv2
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: hwc
- ToCHWImage:
- ExpandDim:
axis: 0
PostProcess:
- NormalizeFeature:
- Index:
index_method: "HNSW32" # supported: HNSW32, IVF, Flat
dist_type: "IP"
index_dir: "./drink_dataset_v1.0/index"
score_thres: 0.5
- NMS4Rec:
thresh: 0.05
Inputs:
- input.image
- crop.crop_image
- det.dt_bboxes
- FeatureOutput:
name: print
Inputs:
- input.fn
- feature.dt_bboxes
- feature.rec_score
- feature.rec_doc
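In the PP-ShiTu config above, `NormalizeFeature` L2-normalizes the extracted features and `Index` searches with `dist_type: "IP"` (inner product); on unit vectors the inner product equals cosine similarity, which is why `score_thres: 0.5` behaves as a cosine threshold. A self-contained sketch with made-up 2-D vectors:

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]

def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))

query = l2_normalize([3.0, 4.0])
gallery = [l2_normalize([6.0, 8.0]),   # same direction as the query
           l2_normalize([4.0, -3.0])]  # orthogonal to the query
print([inner_product(query, g) for g in gallery])
```

The FAISS-style `index_method` values noted in the config (`HNSW32`, `IVF`, `Flat`) only change how candidates are found, not this scoring rule.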
ENV:
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: True
save_res: True
return_res: True
print_res: True
MODEL:
- DetectionOp:
name: det
param_path: paddlecv://models/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer/inference.pdiparams
model_path: paddlecv://models/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer/inference.pdmodel
batch_size: 2
image_shape: [3, 640, 640]
PreProcess:
- Resize:
interp: 2
keep_ratio: false
target_size: [640, 640]
- NormalizeImage:
is_scale: true
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
- Permute:
PostProcess:
- ParserDetResults:
label_list:
- foreground
threshold: 0.2
max_det_results: 5
Inputs:
- input.image
- BboxCropOp:
name: crop
Inputs:
- input.image
- det.dt_bboxes
- FeatureExtractionOp:
name: feature
param_path: paddlecv://models/general_PPLCNetV2_base_pretrained_v1.0_infer/inference.pdiparams
model_path: paddlecv://models/general_PPLCNetV2_base_pretrained_v1.0_infer/inference.pdmodel
batch_size: 2
PreProcess:
- ResizeImage:
size: [224, 224]
return_numpy: False
interpolation: bilinear
backend: cv2
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: hwc
- ToCHWImage:
- ExpandDim:
axis: 0
PostProcess:
- NormalizeFeature:
- Index:
index_method: "HNSW32" # supported: HNSW32, IVF, Flat
dist_type: "IP"
index_dir: "./drink_dataset_v2.0/index"
score_thres: 0.5
- NMS4Rec:
thresh: 0.05
Inputs:
- input.image
- crop.crop_image
- det.dt_bboxes
- FeatureOutput:
name: print
Inputs:
- input.fn
- feature.dt_bboxes
- feature.rec_score
- feature.rec_doc
ENV:
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: True
save_res: True
return_res: True
MODEL:
- DetectionOp:
name: layout
param_path: paddlecv://models/ch_PP-StructureV2_picodet_lcnet_x1_0_fgd_layout_infer/inference.pdiparams
model_path: paddlecv://models/ch_PP-StructureV2_picodet_lcnet_x1_0_fgd_layout_infer/inference.pdmodel
batch_size: 1
image_shape: [ 3, 800, 608 ]
PreProcess:
- Resize:
interp: 2
keep_ratio: false
target_size: [ 800, 608 ]
- NormalizeImage:
is_scale: true
mean: [ 0.485, 0.456, 0.406 ]
std: [ 0.229, 0.224, 0.225 ]
- RGB2BGR:
- Permute:
PostProcess:
- ParserDetResults:
label_list: paddlecv://dict/ocr/layout_publaynet_dict.txt
threshold: 0.5
Inputs:
- input.image
- BboxCropOp:
name: bbox_crop
Inputs:
- input.image
- layout.dt_bboxes
- OcrDbDetOp:
name: det
param_path: paddlecv://models/ch_PP-OCRv3_det_infer/inference.pdiparams
model_path: paddlecv://models/ch_PP-OCRv3_det_infer/inference.pdmodel
batch_size: 1
PreProcess:
- RGB2BGR:
- DetResizeForTest:
limit_side_len: 960
limit_type: "max"
- NormalizeImage:
std: [0.229, 0.224, 0.225]
mean: [0.485, 0.456, 0.406]
scale: '1./255.'
order: 'hwc'
- ToCHWImage:
- ExpandDim:
axis: 0
- KeepKeys:
keep_keys: ['image', 'shape']
PostProcess:
- DBPostProcess:
thresh: 0.3
box_thresh: 0.6
max_candidates: 1000
unclip_ratio: 1.5
use_dilation: False
score_mode: "fast"
box_type: "quad"
Inputs:
- bbox_crop.crop_image
- PolyCropOp:
name: crop
Inputs:
- bbox_crop.crop_image
- det.dt_polys
- OcrCrnnRecOp:
name: rec
param_path: paddlecv://models/ch_PP-OCRv3_rec_infer/inference.pdiparams
model_path: paddlecv://models/ch_PP-OCRv3_rec_infer/inference.pdmodel
batch_size: 6
PreProcess:
- RGB2BGR:
- ReisizeNormImg:
rec_image_shape: [3, 48, 320]
PostProcess:
- CTCLabelDecode:
character_dict_path: paddlecv://dict/ocr/ch_dict.txt
use_space_char: true
Inputs:
- crop.crop_image
- PPStructureFilterOp:
keep_keys: [table]
name: filter_table
Inputs:
- layout.dt_cls_names
- bbox_crop.crop_image
- det.dt_polys
- rec.rec_text
- PPStructureFilterOp:
keep_keys: [ text, title, list, figure ]
name: filter_txts
Inputs:
- layout.dt_cls_names
- bbox_crop.crop_image
- det.dt_polys
- rec.rec_text
- PPStructureTableStructureOp:
name: table
param_path: paddlecv://models/ch_PP-StructureV2_SLANet_infer/inference.pdiparams
model_path: paddlecv://models/ch_PP-StructureV2_SLANet_infer/inference.pdmodel
batch_size: 1
PreProcess:
- RGB2BGR:
- ResizeTableImage:
max_len: 488
- NormalizeImage:
scale: 1./255.
mean: [ 0.485, 0.456, 0.406 ]
std: [ 0.229, 0.224, 0.225 ]
order: 'hwc'
- PaddingTableImage:
size: [ 488, 488 ]
- ToCHWImage:
- ExpandDim:
axis: 0
PostProcess:
- TableLabelDecode:
character_dict_path: paddlecv://dict/ocr/table_structure_dict_ch.txt
merge_no_span_structure: true
Inputs:
- filter_table.image
- TableMatcherOp:
name: Matcher
Inputs:
- table.dt_bboxes
- table.structures
- filter_table.dt_polys
- filter_table.rec_text
- PPStructureResultConcatOp:
name: concat
Inputs:
- table.structures
- Matcher.html
- layout.dt_bboxes
- table.dt_bboxes
- filter_table.dt_polys
- filter_table.rec_text
- filter_txts.dt_polys
- filter_txts.rec_text
- PPStructureOutput:
name: vis
Inputs:
- input.fn
- input.image
- concat.dt_polys
- concat.rec_text
- concat.dt_bboxes
- concat.html
- concat.cell_bbox
- concat.structures
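The `TableMatcherOp` above pairs the recognized text lines (`filter_table.dt_polys` / `rec_text`) with the table cells predicted by SLANet (`table.dt_bboxes`) before the HTML is assembled. A minimal sketch of one plausible matching rule (an assumption for illustration, not the shipped matcher): assign each text box to the cell it overlaps most.

```python
def overlap_area(a, b):
    """Intersection area of two [x1, y1, x2, y2] boxes."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0, w) * max(0, h)

def match_text_to_cells(text_boxes, cell_boxes):
    """For each text box, return the index of the best-overlapping cell."""
    assignment = []
    for tb in text_boxes:
        overlaps = [overlap_area(tb, cb) for cb in cell_boxes]
        assignment.append(max(range(len(cell_boxes)), key=overlaps.__getitem__))
    return assignment

cells = [[0, 0, 50, 20], [50, 0, 100, 20]]   # two cells in one table row
texts = [[5, 5, 30, 15], [60, 2, 90, 18]]    # one text line per cell
print(match_text_to_cells(texts, cells))
```

`PPStructureResultConcatOp` then merges the matched table HTML with the non-table text regions kept by `filter_txts`.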
ENV:
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: true
save_res: true
return_res: False
MODEL:
- OcrDbDetOp:
name: det
param_path: paddlecv://models/ch_PP-OCRv3_det_infer/inference.pdiparams
model_path: paddlecv://models/ch_PP-OCRv3_det_infer/inference.pdmodel
batch_size: 1
PreProcess:
- DetResizeForTest:
limit_side_len: 960
limit_type: "max"
- NormalizeImage:
std: [ 0.229, 0.224, 0.225 ]
mean: [ 0.485, 0.456, 0.406 ]
scale: '1./255.'
order: 'hwc'
- ToCHWImage:
- ExpandDim:
axis: 0
- KeepKeys:
keep_keys: [ 'image', 'shape' ]
PostProcess:
- DBPostProcess:
thresh: 0.3
box_thresh: 0.6
max_candidates: 1000
unclip_ratio: 1.5
use_dilation: False
score_mode: "fast"
box_type: "quad"
Inputs:
- input.image
- PolyCropOp:
name: crop
Inputs:
- input.image
- det.dt_polys
- OcrCrnnRecOp:
name: rec
param_path: paddlecv://models/ch_PP-OCRv3_rec_infer/inference.pdiparams
model_path: paddlecv://models/ch_PP-OCRv3_rec_infer/inference.pdmodel
batch_size: 6
PreProcess:
- ReisizeNormImg:
rec_image_shape: [ 3, 48, 320 ]
PostProcess:
- CTCLabelDecode:
character_dict_path: paddlecv://dict/ocr/ch_dict.txt
use_space_char: true
Inputs:
- crop.crop_image
- PPStructureKieSerOp:
name: ser
param_path: paddlecv://models/PP-Structure_ser_vi_layoutxlm_xfund_infer/inference.pdiparams
model_path: paddlecv://models/PP-Structure_ser_vi_layoutxlm_xfund_infer/inference.pdmodel
batch_size: 1
use_visual_backbone: False
PreProcess:
- VQATokenLabelEncode:
algorithm: LayoutXLM
class_path: paddlecv://dict/ocr/class_list_xfun.txt
contains_re: False
order_method: tb-yx
- VQATokenPad:
max_seq_len: 512
return_attention_mask: true
- VQASerTokenChunk:
max_seq_len: 512
return_attention_mask: true
- Resize:
size: [224, 224]
- NormalizeImage:
scale: 1.
mean: [123.675, 116.28, 103.53]
std: [58.395, 57.12, 57.375]
order: 'hwc'
- ToCHWImage:
- KeepKeys:
keep_keys: [ 'input_ids', 'bbox', 'attention_mask', 'token_type_ids', 'image', 'labels', 'segment_offset_id', 'ocr_info', 'entities']
PostProcess:
- VQASerTokenLayoutLMPostProcess:
class_path: paddlecv://dict/ocr/class_list_xfun.txt
Inputs:
- input.image
- det.dt_polys
- rec.rec_text
- PPStructureKieReOp:
name: re
param_path: paddlecv://models/PP-Structure_re_vi_layoutxlm_xfund_infer/inference.pdiparams
model_path: paddlecv://models/PP-Structure_re_vi_layoutxlm_xfund_infer/inference.pdmodel
batch_size: 1
use_visual_backbone: False
delete_pass: [ simplify_with_basic_ops_pass ]
PreProcess:
- ReInput:
entities_labels: {'HEADER': 0, 'QUESTION': 1, 'ANSWER': 2}
PostProcess:
- VQAReTokenLayoutLMPostProcess:
Inputs:
- input.image
- ser.pred_id
- ser.pred
- ser.dt_polys
- ser.rec_text
- ser.inputs
- PPStructureReOutput:
name: vis
font_path: paddlecv://fonts/simfang.ttf
Inputs:
- input.fn
- input.image
- re.head
- re.tail
ENV:
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: true
save_res: true
return_res: False
MODEL:
- OcrDbDetOp:
name: det
param_path: paddlecv://models/ch_PP-OCRv3_det_infer/inference.pdiparams
model_path: paddlecv://models/ch_PP-OCRv3_det_infer/inference.pdmodel
batch_size: 1
PreProcess:
- DetResizeForTest:
limit_side_len: 960
limit_type: "max"
- NormalizeImage:
std: [ 0.229, 0.224, 0.225 ]
mean: [ 0.485, 0.456, 0.406 ]
scale: '1./255.'
order: 'hwc'
- ToCHWImage:
- ExpandDim:
axis: 0
- KeepKeys:
keep_keys: [ 'image', 'shape' ]
PostProcess:
- DBPostProcess:
thresh: 0.3
box_thresh: 0.6
max_candidates: 1000
unclip_ratio: 1.5
use_dilation: False
score_mode: "fast"
box_type: "quad"
Inputs:
- input.image
- PolyCropOp:
name: crop
Inputs:
- input.image
- det.dt_polys
- OcrCrnnRecOp:
name: rec
param_path: paddlecv://models/ch_PP-OCRv3_rec_infer/inference.pdiparams
model_path: paddlecv://models/ch_PP-OCRv3_rec_infer/inference.pdmodel
batch_size: 6
PreProcess:
- ReisizeNormImg:
rec_image_shape: [ 3, 48, 320 ]
PostProcess:
- CTCLabelDecode:
character_dict_path: paddlecv://dict/ocr/ch_dict.txt
use_space_char: true
Inputs:
- crop.crop_image
- PPStructureKieSerOp:
name: ser
param_path: paddlecv://models/PP-Structure_ser_vi_layoutxlm_xfund_infer/inference.pdiparams
model_path: paddlecv://models/PP-Structure_ser_vi_layoutxlm_xfund_infer/inference.pdmodel
batch_size: 1
use_visual_backbone: False
PreProcess:
- VQATokenLabelEncode:
algorithm: LayoutXLM
class_path: paddlecv://dict/ocr/class_list_xfun.txt
contains_re: False
order_method: tb-yx
- VQATokenPad:
max_seq_len: 512
return_attention_mask: true
- VQASerTokenChunk:
max_seq_len: 512
return_attention_mask: true
- Resize:
size: [224, 224]
- NormalizeImage:
scale: 1.
mean: [123.675, 116.28, 103.53]
std: [58.395, 57.12, 57.375]
order: 'hwc'
- ToCHWImage:
- KeepKeys:
keep_keys: [ 'input_ids', 'bbox', 'attention_mask', 'token_type_ids', 'image', 'labels', 'segment_offset_id', 'ocr_info', 'entities']
PostProcess:
- VQASerTokenLayoutLMPostProcess:
class_path: paddlecv://dict/ocr/class_list_xfun.txt
Inputs:
- input.image
- det.dt_polys
- rec.rec_text
- PPStructureSerOutput:
name: vis
font_path: paddlecv://fonts/simfang.ttf
Inputs:
- input.fn
- input.image
- ser.pred_id
- ser.pred
- ser.dt_polys
- ser.rec_text
ENV:
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: True
save_res: True
return_res: True
MODEL:
- OcrDbDetOp:
name: det
param_path: paddlecv://models/ch_PP-OCRv3_det_infer/inference.pdiparams
model_path: paddlecv://models/ch_PP-OCRv3_det_infer/inference.pdmodel
batch_size: 1
PreProcess:
- RGB2BGR:
- DetResizeForTest:
limit_side_len: 960
limit_type: "max"
- NormalizeImage:
std: [0.229, 0.224, 0.225]
mean: [0.485, 0.456, 0.406]
scale: '1./255.'
order: 'hwc'
- ToCHWImage:
- ExpandDim:
axis: 0
- KeepKeys:
keep_keys: ['image', 'shape']
PostProcess:
- DBPostProcess:
thresh: 0.3
box_thresh: 0.6
max_candidates: 1000
unclip_ratio: 1.5
use_dilation: False
score_mode: "fast"
box_type: "quad"
Inputs:
- input.image
- PolyCropOp:
name: crop
Inputs:
- input.image
- det.dt_polys
- OcrCrnnRecOp:
name: rec
param_path: paddlecv://models/ch_PP-OCRv3_rec_infer/inference.pdiparams
model_path: paddlecv://models/ch_PP-OCRv3_rec_infer/inference.pdmodel
batch_size: 6
PreProcess:
- RGB2BGR:
- ReisizeNormImg:
rec_image_shape: [3, 48, 320]
PostProcess:
- CTCLabelDecode:
character_dict_path: "paddlecv://dict/ocr/ch_dict.txt"
use_space_char: true
Inputs:
- crop.crop_image
- PPStructureTableStructureOp:
name: table
param_path: paddlecv://models/ch_PP-StructureV2_SLANet_infer/inference.pdiparams
model_path: paddlecv://models/ch_PP-StructureV2_SLANet_infer/inference.pdmodel
batch_size: 1
PreProcess:
- RGB2BGR:
- ResizeTableImage:
max_len: 488
- NormalizeImage:
scale: 1./255.
mean: [ 0.485, 0.456, 0.406 ]
std: [ 0.229, 0.224, 0.225 ]
order: 'hwc'
- PaddingTableImage:
size: [ 488, 488 ]
- ToCHWImage:
- ExpandDim:
axis: 0
PostProcess:
- TableLabelDecode:
character_dict_path: "paddlecv://dict/ocr/table_structure_dict_ch.txt"
merge_no_span_structure: true
Inputs:
- input.image
- TableMatcherOp:
name: Matcher
Inputs:
- table.dt_bboxes
- table.structures
- det.dt_polys
- rec.rec_text
- OCRTableOutput:
name: vis
Inputs:
- input.fn
- input.image
- table.dt_bboxes
- table.structures
- table.scores
- Matcher.html
ENV:
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: True
save_res: True
return_res: True
MODEL:
- ClassificationOp:
name: cls
param_path: paddlecv://models/text_image_orientation_infer/inference.pdiparams
model_path: paddlecv://models/text_image_orientation_infer/inference.pdmodel
batch_size: 8
PreProcess:
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 0.00392157
mean: [ 0.485, 0.456, 0.406 ]
std: [ 0.229, 0.224, 0.225 ]
order: ''
channel_num: 3
- ToCHWImage:
- ExpandDim:
axis: 0
PostProcess:
- Topk:
topk: 2
class_id_map_file: paddlecv://models/text_image_orientation_infer/text_image_orientation_label_list.txt
Inputs:
- input.image
- OCRRotateOp:
name: rotate
rotate_map: {'90': 2, '180': 1, '270': 0}
Inputs:
- input.image
- cls.label_names
- cls.scores
- DetectionOp:
name: layout
param_path: paddlecv://models/ch_PP-StructureV2_picodet_lcnet_x1_0_fgd_layout_infer/inference.pdiparams
model_path: paddlecv://models/ch_PP-StructureV2_picodet_lcnet_x1_0_fgd_layout_infer/inference.pdmodel
batch_size: 1
image_shape: [ 3, 800, 608 ]
PreProcess:
- Resize:
interp: 2
keep_ratio: false
target_size: [ 800, 608 ]
- NormalizeImage:
is_scale: true
mean: [ 0.485, 0.456, 0.406 ]
std: [ 0.229, 0.224, 0.225 ]
- RGB2BGR:
- Permute:
PostProcess:
- ParserDetResults:
label_list: paddlecv://dict/ocr/layout_publaynet_dict.txt
threshold: 0.5
Inputs:
- rotate.image
- BboxCropOp:
name: bbox_crop
Inputs:
- rotate.image
- layout.dt_bboxes
- OcrDbDetOp:
name: det
param_path: paddlecv://models/ch_PP-OCRv3_det_infer/inference.pdiparams
model_path: paddlecv://models/ch_PP-OCRv3_det_infer/inference.pdmodel
batch_size: 1
PreProcess:
- RGB2BGR:
- DetResizeForTest:
limit_side_len: 960
limit_type: "max"
- NormalizeImage:
std: [0.229, 0.224, 0.225]
mean: [0.485, 0.456, 0.406]
scale: '1./255.'
order: 'hwc'
- ToCHWImage:
- ExpandDim:
axis: 0
- KeepKeys:
keep_keys: ['image', 'shape']
PostProcess:
- DBPostProcess:
thresh: 0.3
box_thresh: 0.6
max_candidates: 1000
unclip_ratio: 1.5
use_dilation: False
score_mode: "fast"
box_type: "quad"
Inputs:
- bbox_crop.crop_image
- PolyCropOp:
name: crop
Inputs:
- bbox_crop.crop_image
- det.dt_polys
- OcrCrnnRecOp:
name: rec
param_path: paddlecv://models/ch_PP-OCRv3_rec_infer/inference.pdiparams
model_path: paddlecv://models/ch_PP-OCRv3_rec_infer/inference.pdmodel
batch_size: 6
PreProcess:
- RGB2BGR:
- ReisizeNormImg:
rec_image_shape: [3, 48, 320]
PostProcess:
- CTCLabelDecode:
character_dict_path: paddlecv://dict/ocr/ch_dict.txt
use_space_char: true
Inputs:
- crop.crop_image
- PPStructureFilterOp:
keep_keys: [table]
name: filter_table
Inputs:
- layout.dt_cls_names
- bbox_crop.crop_image
- det.dt_polys
- rec.rec_text
- PPStructureFilterOp:
keep_keys: [ text, title, list, figure ]
name: filter_txts
Inputs:
- layout.dt_cls_names
- bbox_crop.crop_image
- det.dt_polys
- rec.rec_text
- PPStructureTableStructureOp:
name: table
param_path: paddlecv://models/ch_PP-StructureV2_SLANet_infer/inference.pdiparams
model_path: paddlecv://models/ch_PP-StructureV2_SLANet_infer/inference.pdmodel
batch_size: 1
PreProcess:
- RGB2BGR:
- ResizeTableImage:
max_len: 488
- NormalizeImage:
scale: 1./255.
mean: [ 0.485, 0.456, 0.406 ]
std: [ 0.229, 0.224, 0.225 ]
order: 'hwc'
- PaddingTableImage:
size: [ 488, 488 ]
- ToCHWImage:
- ExpandDim:
axis: 0
PostProcess:
- TableLabelDecode:
character_dict_path: paddlecv://dict/ocr/table_structure_dict_ch.txt
merge_no_span_structure: true
Inputs:
- filter_table.image
- TableMatcherOp:
name: Matcher
Inputs:
- table.dt_bboxes
- table.structures
- filter_table.dt_polys
- filter_table.rec_text
- PPStructureResultConcatOp:
name: concat
Inputs:
- table.structures
- Matcher.html
- layout.dt_bboxes
- table.dt_bboxes
- filter_table.dt_polys
- filter_table.rec_text
- filter_txts.dt_polys
- filter_txts.rec_text
- PPStructureOutput:
name: vis
Inputs:
- input.fn
- rotate.image
- concat.dt_polys
- concat.rec_text
- concat.dt_bboxes
- concat.html
- concat.cell_bbox
- concat.structures
det_image_shape: &det_image_shape 320
kpt_image_height: &kpt_image_height 128
kpt_image_width: &kpt_image_width 96
kpt_image_shape: &kpt_image_shape [*kpt_image_width, *kpt_image_height]
ENV:
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: True
save_res: True
return_res: True
MODEL:
- DetectionOp:
name: det
param_path: paddlecv://models/picodet_s_320_lcnet_pedestrian/model.pdiparams
model_path: paddlecv://models/picodet_s_320_lcnet_pedestrian/model.pdmodel
batch_size: 1
image_shape: [3, *det_image_shape, *det_image_shape]
PreProcess:
- Resize:
interp: 2
keep_ratio: false
target_size: [*det_image_shape, *det_image_shape]
- Permute:
PostProcess:
- ParserDetResults:
label_list: paddlecv://dict/detection/coco_label_list.json
threshold: 0.5
keep_cls_ids: [0]
Inputs:
- input.image
- BboxExpandCropOp:
name: crop
Inputs:
- input.image
- det.dt_bboxes
- KeypointOp:
name: kpt
param_path: paddlecv://models/tinypose_128x96/inference.pdiparams
model_path: paddlecv://models/tinypose_128x96/inference.pdmodel
batch_size: 2
image_shape: [3, *kpt_image_height, *kpt_image_width]
PreProcess:
- TopDownEvalAffine:
trainsize: *kpt_image_shape
- NormalizeImage:
is_scale: true
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
- Permute:
PostProcess:
- HRNetPostProcess:
use_dark: True
Inputs:
- crop.crop_image
- crop.tl_point
- KptOutput:
name: vis
Inputs:
- input.fn
- input.image
- kpt.keypoints
- kpt.kpt_scores
image_shape: &image_shape 640
ENV:
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: True
save_res: True
return_res: True
MODEL:
- DetectionOp:
name: det
param_path: paddlecv://models/mot_ppyoloe_l_36e_ppvehicle/model.pdiparams
model_path: paddlecv://models/mot_ppyoloe_l_36e_ppvehicle/model.pdmodel
batch_size: 1
image_shape: [3, *image_shape, *image_shape]
PreProcess:
- Resize:
interp: 2
keep_ratio: false
target_size: [*image_shape, *image_shape]
- Permute:
PostProcess:
- ParserDetResults:
label_list:
- vehicle
threshold: 0.1
Inputs:
- input.image
- TrackerOP:
name: tracker
type: OCSORTTracker
tracker_configs:
det_thresh: 0.4
max_age: 30
min_hits: 3
iou_threshold: 0.3
delta_t: 3
inertia: 0.2
vertical_ratio: 0
min_box_area: 0
use_byte: False
PostProcess:
- ParserTrackerResults:
label_list:
- vehicle
Inputs:
- det.dt_bboxes
- det.dt_scores
- det.dt_class_ids
- TrackerOutput:
name: vis
Inputs:
- input.fn
- input.image
- tracker.tk_bboxes
- tracker.tk_scores
- tracker.tk_ids
- tracker.tk_cls_ids
- tracker.tk_cls_names
ENV:
run_mode: paddle
device: GPU
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
MODEL:
- BboxCropOp:
name: bbox_crop
Inputs:
- input.image
- input.bbox
ENV:
run_mode: paddle
device: GPU
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: True
save_res: True
return_res: True
MODEL:
- ClassificationOp:
name: cls
param_path: paddlecv://models/PPLCNet_x1_0_infer/inference.pdiparams
model_path: paddlecv://models/PPLCNet_x1_0_infer/inference.pdmodel
batch_size: 8
PreProcess:
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 0.00392157
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
channel_num: 3
- ToCHWImage:
- ExpandDim:
axis: 0
PostProcess:
- Topk:
topk: 5
class_id_map_file: "paddlecv://dict/classification/imagenet1k_label_list.txt"
Inputs:
- input.image
- ClasOutput:
name: vis
Inputs:
- input.fn
- input.image
- cls.class_ids
- cls.scores
- cls.label_names
ENV:
run_mode: paddle
device: GPU
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
MODEL:
- ClsCorrectionOp:
name: cls_corr
class_num: 4
threshold: 0.9
Inputs:
- input.image
- input.class_ids
- input.scores
ENV:
run_mode: paddle
device: GPU
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: False
save_res: False
return_res: False
MODEL:
- BlankOp:
name: blank
param_path: paddlecv://models/PPLCNet_x1_0_infer/inference.pdiparams
model_path: paddlecv://models/PPLCNet_x1_0_infer/inference.pdmodel
Inputs:
- input.image
ENV:
run_mode: paddle
device: GPU
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: True
save_res: True
return_res: True
MODEL:
- DetectionOp:
name: det
param_path: paddlecv://models/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer/inference.pdiparams
model_path: paddlecv://models/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer/inference.pdmodel
batch_size: 2
image_shape: [3, 640, 640]
PreProcess:
- Resize:
interp: 2
keep_ratio: false
target_size: [640, 640]
- NormalizeImage:
is_scale: true
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
- Permute:
PostProcess:
- ParserDetResults:
label_list: paddlecv://dict/detection/coco_label_list.json
threshold: 0.5
Inputs:
- input.image
- DetOutput:
name: vis
Inputs:
- input.fn
- input.image
- det.dt_bboxes
- det.dt_scores
- det.dt_cls_names
ENV:
run_mode: paddle
device: GPU
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
return_res: True
print_res: True
MODEL:
- FeatureExtractionOp:
name: feature
param_path: paddlecv://models/general_PPLCNetV2_base_pretrained_v1.0_infer/inference.pdiparams
model_path: paddlecv://models/general_PPLCNetV2_base_pretrained_v1.0_infer/inference.pdmodel
batch_size: 1
PreProcess:
- ResizeImage:
size: [224, 224]
return_numpy: False
interpolation: bilinear
backend: cv2
- NormalizeImage:
scale: 1.0/255.0
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: hwc
- ToCHWImage:
- ExpandDim:
axis: 0
PostProcess:
- NormalizeFeature:
- Index:
index_method: "HNSW32" # supported: HNSW32, IVF, Flat
dist_type: "IP"
index_dir: "./drink_dataset_v2.0/index"
score_thres: 0.5
Inputs:
- input.image
- FeatureOutput:
name: print
Inputs:
- input.fn
- feature.rec_score
- feature.rec_doc
ENV:
run_mode: paddle
device: GPU
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
MODEL:
- FragmentCompositionOp:
name: text_merge
Inputs:
- input.text
ENV:
run_mode: paddle
device: GPU
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
MODEL:
- KeyFrameExtractionOp:
name: kfe
algo: "luv_diff" # currently only luv_diff is supported
params:
thresh: null # you can also set it to a value such as 0.6
use_top_order: False
use_local_maxima: True
num_top_frames: 50
window_len: 50
Inputs:
- input.video_path
train_height: &train_height 128
train_width: &train_width 96
trainsize: &trainsize [*train_width, *train_height]
ENV:
run_mode: paddle
device: GPU
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: True
save_res: True
return_res: True
MODEL:
- KeypointOp:
name: kpt
param_path: paddlecv://models/tinypose_128x96/inference.pdiparams
model_path: paddlecv://models/tinypose_128x96/inference.pdmodel
batch_size: 2
image_shape: [3, *train_height, *train_width]
PreProcess:
- TopDownEvalAffine:
trainsize: *trainsize
- NormalizeImage:
is_scale: true
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
- Permute:
PostProcess:
- HRNetPostProcess:
use_dark: True
Inputs:
- input.image
- KptOutput:
name: vis
Inputs:
- input.fn
- input.image
- kpt.keypoints
- kpt.kpt_scores
ENV:
run_mode: paddle
device: GPU
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: False
save_res: False
return_res: False
MODEL:
- OcrCrnnRecOp:
name: rec
param_path: paddlecv://models/ch_PP-OCRv3_rec_infer/inference.pdiparams
model_path: paddlecv://models/ch_PP-OCRv3_rec_infer/inference.pdmodel
batch_size: 6
PreProcess:
- RGB2BGR:
- ReisizeNormImg:
rec_image_shape: [3, 48, 320]
PostProcess:
- CTCLabelDecode:
character_dict_path: "paddlecv://dict/ocr/ch_dict.txt"
use_space_char: true
Inputs:
- input.image
ENV:
run_mode: paddle
device: GPU
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: true
save_res: true
return_res: true
MODEL:
- OcrDbDetOp:
name: det
param_path: paddlecv://models/ch_PP-OCRv3_det_infer/inference.pdiparams
model_path: paddlecv://models/ch_PP-OCRv3_det_infer/inference.pdmodel
batch_size: 1
PreProcess:
- RGB2BGR:
- DetResizeForTest:
limit_side_len: 960
limit_type: "max"
- NormalizeImage:
std: [0.229, 0.224, 0.225]
mean: [0.485, 0.456, 0.406]
scale: '1./255.'
order: 'hwc'
- ToCHWImage:
- ExpandDim:
axis: 0
- KeepKeys:
keep_keys: ['image', 'shape']
PostProcess:
- DBPostProcess:
thresh: 0.3
box_thresh: 0.6
max_candidates: 1000
unclip_ratio: 1.5
use_dilation: False
score_mode: "fast"
box_type: "quad"
Inputs:
- input.image
- OCROutput:
name: vis
Inputs:
- input.fn
- input.image
- det.dt_polys
ENV:
run_mode: paddle
device: GPU
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: True
save_res: True
return_res: True
MODEL:
- DetectionOp:
name: det
param_path: paddlecv://models/ch_PP-StructureV2_picodet_lcnet_x1_0_fgd_layout_infer/inference.pdiparams
model_path: paddlecv://models/ch_PP-StructureV2_picodet_lcnet_x1_0_fgd_layout_infer/inference.pdmodel
batch_size: 1
image_shape: [3, 800, 608]
PreProcess:
- Resize:
interp: 2
keep_ratio: false
target_size: [800, 608]
- NormalizeImage:
is_scale: true
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
- RGB2BGR:
- Permute:
PostProcess:
- ParserDetResults:
label_list: paddlecv://dict/ocr/layout_publaynet_dict.txt
threshold: 0.5
Inputs:
- input.image
- DetOutput:
name: vis
Inputs:
- input.fn
- input.image
- det.dt_bboxes
- det.dt_scores
- det.dt_cls_names
ENV:
run_mode: paddle
device: GPU
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: true
save_res: true
return_res: False
MODEL:
- PPStructureTableStructureOp:
name: table
param_path: paddlecv://models/ch_PP-StructureV2_SLANet_infer/inference.pdiparams
model_path: paddlecv://models/ch_PP-StructureV2_SLANet_infer/inference.pdmodel
batch_size: 1
PreProcess:
- RGB2BGR:
- ResizeTableImage:
max_len: 488
- NormalizeImage:
scale: 1./255.
mean: [ 0.485, 0.456, 0.406 ]
std: [ 0.229, 0.224, 0.225 ]
order: 'hwc'
- PaddingTableImage:
size: [ 488, 488 ]
- ToCHWImage:
- ExpandDim:
axis: 0
PostProcess:
- TableLabelDecode:
character_dict_path: "paddlecv://dict/ocr/table_structure_dict_ch.txt"
merge_no_span_structure: true
Inputs:
- input.image
- OCRTableOutput:
name: vis
Inputs:
- input.fn
- input.image
- table.dt_bboxes
- table.structures
- table.scores
ENV:
run_mode: paddle
device: GPU
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
MODEL:
- ClassificationOp:
name: cls
param_path: paddlecv://models/PPLCNet_x1_0_infer/inference.pdiparams
model_path: paddlecv://models/PPLCNet_x1_0_infer/inference.pdmodel
batch_size: 8
PreProcess:
- ResizeImage:
resize_short: 256
- CropImage:
size: 224
- NormalizeImage:
scale: 0.00392157
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
channel_num: 3
- ToCHWImage:
- ExpandDim:
axis: 0
PostProcess:
- Topk:
topk: 5
class_id_map_file: paddlecv://dict/classification/imagenet1k_label_list.txt
Inputs:
- input.image
ENV:
run_mode: paddle
device: GPU
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
MODEL:
- PolyCropOp:
name: poly_crop
Inputs:
- input.image
- input.poly
ENV:
run_mode: paddle
device: GPU
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
MODEL:
- PPStructureFilterOp:
name: filter
keep_keys: [table]
Inputs:
- input.dt_cls_names
- input.crop_image
- input.dt_polys
- input.rec_text
ENV:
run_mode: paddle
device: GPU
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
MODEL:
- PPStructureResultConcatOp:
name: concat
Inputs:
- input.table.structures
- input.Matcher.html
- input.layout.dt_bboxes
- input.table.dt_bboxes
- input.filter_table.dt_polys
- input.filter_table.rec_text
- input.filter_txts.dt_polys
- input.filter_txts.rec_text
ENV:
run_mode: paddle
device: GPU
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: True
save_res: True
return_res: True
MODEL:
- SegmentationOp:
name: seg
param_path: paddlecv://models/PP_HumanSegV2_256x144_with_Softmax/model.pdiparams
model_path: paddlecv://models/PP_HumanSegV2_256x144_with_Softmax/model.pdmodel
batch_size: 8
PreProcess:
- Resize:
target_size: [256, 144]
- Normalize:
scale: 0.00392157
mean: [0.5, 0.5, 0.5]
std: [0.5, 0.5, 0.5]
order: ''
- ToCHWImage:
- ExpandDim:
PostProcess:
- SegPostProcess:
Inputs:
- input.image
- HumanSegOutput:
name: out
Inputs:
- input.fn
- input.image
- seg.seg_map
ENV:
run_mode: paddle
device: GPU
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: True
save_res: True
return_res: True
MODEL:
- SegmentationOp:
name: seg
param_path: paddlecv://models/PP_MattingV1/model.pdiparams
model_path: paddlecv://models/PP_MattingV1/model.pdmodel
batch_size: 8
PreProcess:
- ResizeByShort:
resize_short: 512
size_divisor: 32
- Normalize:
scale: 0.00392157
mean: [0.5, 0.5, 0.5]
std: [0.5, 0.5, 0.5]
order: ''
- ToCHWImage:
- ExpandDim:
PostProcess:
- SegPostProcess:
Inputs:
- input.image
- MattingOutput:
name: out
Inputs:
- input.fn
- input.image
- seg.seg_map
ENV:
run_mode: paddle
device: GPU
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: True
save_res: True
return_res: True
MODEL:
- SegmentationOp:
name: seg
param_path: paddlecv://models/PP_LiteSeg_Cityscapes/model.pdiparams
model_path: paddlecv://models/PP_LiteSeg_Cityscapes/model.pdmodel
batch_size: 8
PreProcess:
- Normalize:
scale: 0.00392157
mean: [0.5, 0.5, 0.5]
std: [0.5, 0.5, 0.5]
order: ''
- ToCHWImage:
- ExpandDim:
PostProcess:
- SegPostProcess:
Inputs:
- input.image
- SegOutput:
name: out
Inputs:
- input.fn
- input.image
- seg.seg_map
ENV:
run_mode: paddle
device: GPU
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
MODEL:
- TableMatcherOp:
name: tablematcher
filter_ocr_result: False
Inputs:
- input.dt_bboxes
- input.structures
- input.dt_polys
- input.rec_text
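Each pipeline config above follows the same shape: an `ENV` mapping plus a `MODEL` list whose entries are single-key dicts (`{op_type: op_params}`). A minimal sketch of loading and walking one such config, assuming PyYAML is available (the trimmed config string below is illustrative, not a file shipped by paddlecv):

```python
import yaml  # assumes PyYAML is installed

# A trimmed pipeline config in the same shape as the ones above.
cfg_text = """
ENV:
  run_mode: paddle
  device: GPU
MODEL:
  - TableMatcherOp:
      name: tablematcher
      filter_ocr_result: False
      Inputs:
        - input.dt_bboxes
        - input.structures
"""

cfg = yaml.safe_load(cfg_text)

# Each MODEL entry is a single-key dict: {op_type: op_params}.
for op in cfg["MODEL"]:
    (op_type, params), = op.items()
    print(op_type, params["name"], params["Inputs"])
```

This is also why list items such as `- ToCHWImage:` carry a trailing colon: with it the entry parses as a one-key mapping rather than a bare string.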
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from functools import reduce
import os
import importlib
import numpy as np
import math
import paddle
from ppcv.ops.base import create_operators
from ppcv.ops.models.base import ModelBaseOp
from ppcv.core.workspace import register
from .preprocess import *
from .postprocess import *
@register
class DetectionCustomOp(ModelBaseOp):
def __init__(self, model_cfg, env_cfg):
super(DetectionCustomOp, self).__init__(model_cfg, env_cfg)
self.model_cfg = model_cfg
mod = importlib.import_module(__name__)
self.preprocessor = create_operators(model_cfg["PreProcess"], mod)
self.postprocessor = create_operators(model_cfg["PostProcess"], mod)
@classmethod
def get_output_keys(cls):
return ["dt_bboxes", "dt_scores", "dt_class_ids", "dt_cls_names"]
def preprocess(self, image):
im_info = {
'scale_factor': np.array(
[1., 1.], dtype=np.float32),
'im_shape': np.array(
image.shape[:2], dtype=np.float32),
'input_shape': self.model_cfg["image_shape"],
}
for ops in self.preprocessor:
image, im_info = ops(image, im_info)
return image, im_info
def postprocess(self, inputs, result, bbox_num):
outputs = result
for idx, ops in enumerate(self.postprocessor):
if idx == len(self.postprocessor) - 1:
outputs, bbox_num = ops(outputs, bbox_num, self.output_keys)
else:
outputs, bbox_num = ops(outputs, bbox_num)
return outputs, bbox_num
def create_inputs(self, imgs, im_info):
inputs = {}
im_shape = []
scale_factor = []
if len(imgs) == 1:
image = np.array((imgs[0], )).astype('float32')
im_shape = np.array((im_info[0]['im_shape'], )).astype('float32')
scale_factor = np.array(
(im_info[0]['scale_factor'], )).astype('float32')
inputs = dict(
im_shape=im_shape, image=image, scale_factor=scale_factor)
outputs = [inputs[key] for key in self.input_names]
return outputs
for e in im_info:
im_shape.append(np.array((e['im_shape'], )).astype('float32'))
scale_factor.append(
np.array((e['scale_factor'], )).astype('float32'))
inputs['im_shape'] = np.concatenate(im_shape, axis=0)
inputs['scale_factor'] = np.concatenate(scale_factor, axis=0)
imgs_shape = [[e.shape[1], e.shape[2]] for e in imgs]
max_shape_h = max([e[0] for e in imgs_shape])
max_shape_w = max([e[1] for e in imgs_shape])
padding_imgs = []
for img in imgs:
im_c, im_h, im_w = img.shape[:]
padding_im = np.zeros(
(im_c, max_shape_h, max_shape_w), dtype=np.float32)
padding_im[:, :im_h, :im_w] = img
padding_imgs.append(padding_im)
inputs['image'] = np.stack(padding_imgs, axis=0)
outputs = [inputs[key] for key in self.input_names]
return outputs
def infer(self, image_list):
inputs = []
batch_loop_cnt = math.ceil(float(len(image_list)) / self.batch_size)
results = []
bbox_nums = []
for i in range(batch_loop_cnt):
start_index = i * self.batch_size
end_index = min((i + 1) * self.batch_size, len(image_list))
batch_image_list = image_list[start_index:end_index]
# preprocess
output_list = []
info_list = []
for img in batch_image_list:
output, info = self.preprocess(img)
output_list.append(output)
info_list.append(info)
inputs = self.create_inputs(output_list, info_list)
# model inference
result = self.predictor.run(inputs)
res = result[0]
bbox_num = result[1]
# postprocess
res, bbox_num = self.postprocess(inputs, res, bbox_num)
results.append(res)
bbox_nums.append(bbox_num)
# results = self.merge_batch_result(results)
return results, bbox_nums
def __call__(self, inputs):
"""
step1: parser inputs
step2: run
step3: merge results
input: a list of dict
"""
# for the input_keys as list
# inputs = [pipe_input[key] for pipe_input in pipe_inputs for key in self.input_keys]
key = self.input_keys[0]
if isinstance(inputs[0][key], (list, tuple)):
inputs = [input[key] for input in inputs]
else:
inputs = [[input[key]] for input in inputs]
sub_index_list = [len(input) for input in inputs]
inputs = reduce(lambda x, y: x.extend(y) or x, inputs)
# step2: run
outputs, bbox_nums = self.infer(inputs)
# step3: merge
pipe_outputs = []
for i, bbox_num in enumerate(bbox_nums):
output = outputs[i]
start_id = 0
for num in bbox_num:
end_id = start_id + num
out = {k: v[start_id:end_id] for k, v in output.items()}
pipe_outputs.append(out)
start_id = end_id
return pipe_outputs
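When a batch holds images of different sizes, `create_inputs` zero-pads every CHW image to the batch-wide maximum height and width before stacking. The padding step can be illustrated standalone (the shapes below are hypothetical):

```python
import numpy as np

# Two CHW images of different spatial sizes, as create_inputs would receive them.
imgs = [np.ones((3, 4, 6), np.float32), np.ones((3, 5, 3), np.float32)]

max_h = max(im.shape[1] for im in imgs)
max_w = max(im.shape[2] for im in imgs)

padded = []
for im in imgs:
    c, h, w = im.shape
    canvas = np.zeros((c, max_h, max_w), np.float32)  # zero-pad to the batch max
    canvas[:, :h, :w] = im                            # original pixels in the top-left
    padded.append(canvas)

batch = np.stack(padded, axis=0)
print(batch.shape)  # (2, 3, 5, 6)
```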
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import cv2
import numpy as np
from scipy.special import softmax
from ppcv.utils.download import get_dict_path
class ParserDetResults(object):
def __init__(self, label_list, threshold=0.5, max_det_results=100):
self.threshold = threshold
self.max_det_results = max_det_results
self.clsid2catid, self.catid2name = self.get_categories(label_list)
def get_categories(self, label_list):
label_list = get_dict_path(label_list)
if label_list.endswith('json'):
# lazy import pycocotools here
from pycocotools.coco import COCO
coco = COCO(label_list)
cats = coco.loadCats(coco.getCatIds())
clsid2catid = {i: cat['id'] for i, cat in enumerate(cats)}
catid2name = {cat['id']: cat['name'] for cat in cats}
elif label_list.endswith('txt'):
cats = []
with open(label_list) as f:
for line in f.readlines():
cats.append(line.strip())
if cats[0] == 'background': cats = cats[1:]
clsid2catid = {i: i for i in range(len(cats))}
catid2name = {i: name for i, name in enumerate(cats)}
else:
raise ValueError("label_list {} should be json or txt.".format(
label_list))
return clsid2catid, catid2name
def __call__(self, preds, bbox_num, output_keys):
start_id = 0
dt_bboxes = []
scores = []
class_ids = []
cls_names = []
new_bbox_num = []
for num in bbox_num:
end_id = start_id + num
pred = preds[start_id:end_id]
start_id = end_id
max_det_results = min(self.max_det_results, pred.shape[0])
keep_indexes = pred[:, 1].argsort()[::-1][:max_det_results]
select_num = 0
for idx in keep_indexes:
single_res = pred[idx].tolist()
class_id = int(single_res[0])
score = single_res[1]
bbox = single_res[2:]
if score < self.threshold:
continue
if class_id == -1:
continue
select_num += 1
dt_bboxes.append(bbox)
scores.append(score)
class_ids.append(class_id)
cls_names.append(self.catid2name[self.clsid2catid[class_id]])
new_bbox_num.append(select_num)
result = {
output_keys[0]: dt_bboxes,
output_keys[1]: scores,
output_keys[2]: class_ids,
output_keys[3]: cls_names,
}
new_bbox_num = np.array(new_bbox_num).astype('int32')
return result, new_bbox_num
def hard_nms(box_scores, iou_threshold, top_k=-1, candidate_size=200):
"""
Args:
box_scores (N, 5): boxes in corner-form and probabilities.
iou_threshold: intersection over union threshold.
top_k: keep top_k results. If k <= 0, keep all the results.
candidate_size: only consider the candidates with the highest scores.
Returns:
picked: a list of indexes of the kept boxes
"""
scores = box_scores[:, -1]
boxes = box_scores[:, :-1]
picked = []
indexes = np.argsort(scores)
indexes = indexes[-candidate_size:]
while len(indexes) > 0:
current = indexes[-1]
picked.append(current)
if 0 < top_k == len(picked) or len(indexes) == 1:
break
current_box = boxes[current, :]
indexes = indexes[:-1]
rest_boxes = boxes[indexes, :]
iou = iou_of(
rest_boxes,
np.expand_dims(
current_box, axis=0), )
indexes = indexes[iou <= iou_threshold]
return box_scores[picked, :]
def iou_of(boxes0, boxes1, eps=1e-5):
"""Return intersection-over-union (Jaccard index) of boxes.
Args:
boxes0 (N, 4): ground truth boxes.
boxes1 (N or 1, 4): predicted boxes.
eps: a small number to avoid 0 as denominator.
Returns:
iou (N): IoU values.
"""
overlap_left_top = np.maximum(boxes0[..., :2], boxes1[..., :2])
overlap_right_bottom = np.minimum(boxes0[..., 2:], boxes1[..., 2:])
overlap_area = area_of(overlap_left_top, overlap_right_bottom)
area0 = area_of(boxes0[..., :2], boxes0[..., 2:])
area1 = area_of(boxes1[..., :2], boxes1[..., 2:])
return overlap_area / (area0 + area1 - overlap_area + eps)
def area_of(left_top, right_bottom):
"""Compute the areas of rectangles given two corners.
Args:
left_top (N, 2): left top corner.
right_bottom (N, 2): right bottom corner.
Returns:
area (N): return the area.
"""
hw = np.clip(right_bottom - left_top, 0.0, None)
return hw[..., 0] * hw[..., 1]
class PicoDetPostProcess(object):
"""
Args:
input_shape (int): network input image size
ori_shape (int): ori image shape of before padding
scale_factor (float): scale factor of ori image
enable_mkldnn (bool): whether to open MKLDNN
"""
def __init__(self,
layout_dict_path,
strides=[8, 16, 32, 64],
score_threshold=0.4,
nms_threshold=0.5,
nms_top_k=1000,
keep_top_k=100):
self.labels = self.load_layout_dict(get_dict_path(layout_dict_path))
self.strides = strides
self.score_threshold = score_threshold
self.nms_threshold = nms_threshold
self.nms_top_k = nms_top_k
self.keep_top_k = keep_top_k
def load_layout_dict(self, layout_dict_path):
with open(layout_dict_path, 'r', encoding='utf-8') as fp:
labels = fp.readlines()
return [label.strip('\n') for label in labels]
def warp_boxes(self, boxes, ori_shape):
"""Apply transform to boxes
"""
width, height = ori_shape[1], ori_shape[0]
n = len(boxes)
if n:
# warp points
xy = np.ones((n * 4, 3))
xy[:, :2] = boxes[:, [0, 1, 2, 3, 0, 3, 2, 1]].reshape(
n * 4, 2) # x1y1, x2y2, x1y2, x2y1
# xy = xy @ M.T # transform
xy = (xy[:, :2] / xy[:, 2:3]).reshape(n, 8) # rescale
# create new boxes
x = xy[:, [0, 2, 4, 6]]
y = xy[:, [1, 3, 5, 7]]
xy = np.concatenate(
(x.min(1), y.min(1), x.max(1), y.max(1))).reshape(4, n).T
# clip boxes
xy[:, [0, 2]] = xy[:, [0, 2]].clip(0, width)
xy[:, [1, 3]] = xy[:, [1, 3]].clip(0, height)
return xy.astype(np.float32)
else:
return boxes
def img_info(self, ori_img, img):
origin_shape = ori_img.shape
resize_shape = img.shape
im_scale_y = resize_shape[2] / float(origin_shape[0])
im_scale_x = resize_shape[3] / float(origin_shape[1])
scale_factor = np.array([im_scale_y, im_scale_x], dtype=np.float32)
img_shape = np.array(img.shape[2:], dtype=np.float32)
input_shape = np.array(img).astype('float32').shape[2:]
ori_shape = np.array((img_shape, )).astype('float32')
scale_factor = np.array((scale_factor, )).astype('float32')
return ori_shape, input_shape, scale_factor
def __call__(self, preds, info_list, output_keys=None):
scores, raw_boxes = preds['scores'], preds['boxes']
batch_size = raw_boxes[0].shape[0]
reg_max = int(raw_boxes[0].shape[-1] / 4 - 1)
out_boxes_num = []
out_boxes_list = []
for batch_id in range(batch_size):
ori_shape, input_shape, scale_factor = info_list[batch_id]['ori_shape'], info_list[batch_id]['im_shape'], \
info_list[batch_id][
'scale_factor']
# generate centers
decode_boxes = []
select_scores = []
for stride, box_distribute, score in zip(self.strides, raw_boxes,
scores):
box_distribute = box_distribute[batch_id]
score = score[batch_id]
# centers
fm_h = input_shape[0] / stride
fm_w = input_shape[1] / stride
h_range = np.arange(fm_h)
w_range = np.arange(fm_w)
ww, hh = np.meshgrid(w_range, h_range)
ct_row = (hh.flatten() + 0.5) * stride
ct_col = (ww.flatten() + 0.5) * stride
center = np.stack((ct_col, ct_row, ct_col, ct_row), axis=1)
# box distribution to distance
reg_range = np.arange(reg_max + 1)
box_distance = box_distribute.reshape((-1, reg_max + 1))
box_distance = softmax(box_distance, axis=1)
box_distance = box_distance * np.expand_dims(reg_range, axis=0)
box_distance = np.sum(box_distance, axis=1).reshape((-1, 4))
box_distance = box_distance * stride
# top K candidate
topk_idx = np.argsort(score.max(axis=1))[::-1]
topk_idx = topk_idx[:self.nms_top_k]
center = center[topk_idx]
score = score[topk_idx]
box_distance = box_distance[topk_idx]
# decode box
decode_box = center + [-1, -1, 1, 1] * box_distance
select_scores.append(score)
decode_boxes.append(decode_box)
# nms
bboxes = np.concatenate(decode_boxes, axis=0)
confidences = np.concatenate(select_scores, axis=0)
picked_box_probs = []
picked_labels = []
for class_index in range(0, confidences.shape[1]):
probs = confidences[:, class_index]
mask = probs > self.score_threshold
probs = probs[mask]
if probs.shape[0] == 0:
continue
subset_boxes = bboxes[mask, :]
box_probs = np.concatenate(
[subset_boxes, probs.reshape(-1, 1)], axis=1)
box_probs = hard_nms(
box_probs,
iou_threshold=self.nms_threshold,
top_k=self.keep_top_k, )
picked_box_probs.append(box_probs)
picked_labels.extend([class_index] * box_probs.shape[0])
if len(picked_box_probs) == 0:
out_boxes_list.append(np.empty((0, 4)))
out_boxes_num.append(0)
else:
picked_box_probs = np.concatenate(picked_box_probs)
# resize output boxes
picked_box_probs[:, :4] = self.warp_boxes(
picked_box_probs[:, :4], ori_shape)
im_scale = np.concatenate(
[scale_factor[::-1], scale_factor[::-1]])
picked_box_probs[:, :4] /= im_scale
# output order: class, score, box
out_boxes_list.append(
np.concatenate(
[
np.expand_dims(
np.array(picked_labels),
axis=-1), np.expand_dims(
picked_box_probs[:, 4], axis=-1),
picked_box_probs[:, :4]
],
axis=1))
out_boxes_num.append(len(picked_labels))
out_boxes_list = np.concatenate(out_boxes_list, axis=0)
out_boxes_num = np.asarray(out_boxes_num).astype(np.int32)
bboxes, scores, clsids, labels = [], [], [], []
for dt in out_boxes_list:
clsid, bbox, score = int(dt[0]), dt[2:], dt[1]
label = self.labels[clsid]
bboxes.append(bbox)
scores.append(score.tolist())
clsids.append(clsid)
labels.append(label)
results = {
output_keys[0]: np.array(bboxes),
output_keys[1]: scores,
output_keys[2]: clsids,
output_keys[3]: labels
}
return results, out_boxes_num
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import cv2
import numpy as np
def decode_image(im_file, im_info):
"""read rgb image
Args:
im_file (str|np.ndarray): input can be image path or np.ndarray
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
if isinstance(im_file, str):
with open(im_file, 'rb') as f:
im_read = f.read()
data = np.frombuffer(im_read, dtype='uint8')
im = cv2.imdecode(data, 1) # BGR mode, but need RGB mode
im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
else:
im = im_file
im_info['im_shape'] = np.array(im.shape[:2], dtype=np.float32)
im_info['scale_factor'] = np.array([1., 1.], dtype=np.float32)
return im, im_info
class Resize(object):
"""resize image by target_size and max_size
Args:
target_size (int): the target size of image
keep_ratio (bool): whether keep_ratio or not, default true
interp (int): method of resize
"""
def __init__(
self,
target_size,
keep_ratio=True,
interp=cv2.INTER_LINEAR, ):
if isinstance(target_size, int):
target_size = [target_size, target_size]
self.target_size = target_size
self.keep_ratio = keep_ratio
self.interp = interp
def __call__(self, im, im_info):
"""
Args:
im (np.ndarray): image (np.ndarray)
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
assert len(self.target_size) == 2
assert self.target_size[0] > 0 and self.target_size[1] > 0
im_channel = im.shape[2]
im_scale_y, im_scale_x = self.generate_scale(im)
# set image_shape
im_info['input_shape'][1] = int(im_scale_y * im.shape[0])
im_info['input_shape'][2] = int(im_scale_x * im.shape[1])
im = cv2.resize(
im,
None,
None,
fx=im_scale_x,
fy=im_scale_y,
interpolation=self.interp)
im_info['im_shape'] = np.array(im.shape[:2]).astype('float32')
im_info['scale_factor'] = np.array(
[im_scale_y, im_scale_x]).astype('float32')
return im, im_info
def generate_scale(self, im):
"""
Args:
im (np.ndarray): image (np.ndarray)
Returns:
im_scale_x: the resize ratio of X
im_scale_y: the resize ratio of Y
"""
origin_shape = im.shape[:2]
im_c = im.shape[2]
if self.keep_ratio:
im_size_min = np.min(origin_shape)
im_size_max = np.max(origin_shape)
target_size_min = np.min(self.target_size)
target_size_max = np.max(self.target_size)
im_scale = float(target_size_min) / float(im_size_min)
if np.round(im_scale * im_size_max) > target_size_max:
im_scale = float(target_size_max) / float(im_size_max)
im_scale_x = im_scale
im_scale_y = im_scale
else:
resize_h, resize_w = self.target_size
im_scale_y = resize_h / float(origin_shape[0])
im_scale_x = resize_w / float(origin_shape[1])
return im_scale_y, im_scale_x
class NormalizeImage(object):
"""normalize image
Args:
mean (list): im - mean
std (list): im / std
is_scale (bool): whether need im / 255
is_channel_first (bool): if True: image shape is CHW, else: HWC
"""
def __init__(self, mean, std, is_scale=True, norm_type='mean_std'):
self.mean = mean
self.std = std
self.is_scale = is_scale
self.norm_type = norm_type
def __call__(self, im, im_info):
"""
Args:
im (np.ndarray): image (np.ndarray)
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
im = im.astype(np.float32, copy=False)
if self.is_scale:
scale = 1.0 / 255.0
im *= scale
if self.norm_type == 'mean_std':
mean = np.array(self.mean)[np.newaxis, np.newaxis, :]
std = np.array(self.std)[np.newaxis, np.newaxis, :]
im -= mean
im /= std
return im, im_info
class Permute(object):
"""permute image
Args:
to_bgr (bool): whether convert RGB to BGR
channel_first (bool): whether convert HWC to CHW
"""
def __init__(self, ):
super().__init__()
def __call__(self, im, im_info):
"""
Args:
im (np.ndarray): image (np.ndarray)
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
im = im.transpose((2, 0, 1)).copy()
return im, im_info
class PadStride(object):
""" padding image for model with FPN , instead PadBatch(pad_to_stride, pad_gt) in original config
Args:
stride (bool): model with FPN need image shape % stride == 0
"""
def __init__(self, stride=0):
self.coarsest_stride = stride
def __call__(self, im, im_info):
"""
Args:
im (np.ndarray): image (np.ndarray)
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
coarsest_stride = self.coarsest_stride
if coarsest_stride <= 0:
return im, im_info
im_c, im_h, im_w = im.shape
pad_h = int(np.ceil(float(im_h) / coarsest_stride) * coarsest_stride)
pad_w = int(np.ceil(float(im_w) / coarsest_stride) * coarsest_stride)
padding_im = np.zeros((im_c, pad_h, pad_w), dtype=np.float32)
padding_im[:, :im_h, :im_w] = im
return padding_im, im_info
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import sys
import cv2
import unittest
import yaml
import argparse
import paddlecv
from ppcv.core.workspace import global_config
from ppcv.core.config import ConfigParser
class TestCustomDetection(unittest.TestCase):
def setUp(self):
self.config = 'test_custom_detection.yml'
self.input = '../demo/ILSVRC2012_val_00020010.jpeg'
self.cfg_dict = dict(config=self.config, input=self.input)
cfg = argparse.Namespace(**self.cfg_dict)
config = ConfigParser(cfg)
config.print_cfg()
self.model_cfg, self.env_cfg = config.parse()
def test_detection(self):
img = cv2.imread(self.input)[:, :, ::-1]
inputs = [
{
"input.image": img,
},
{
"input.image": img,
},
]
op_name = list(self.model_cfg[0].keys())[0]
det_op = global_config[op_name](self.model_cfg[0][op_name],
self.env_cfg)
result = det_op(inputs)
def test_pipeline(self):
input = os.path.abspath(self.input)
ppcv = paddlecv.PaddleCV(config_path=self.config)
ppcv(input)
if __name__ == '__main__':
unittest.main()
ENV:
run_mode: paddle
device: GPU
min_subgraph_size: 3
shape_info_filename: ./
trt_calib_mode: False
cpu_threads: 1
trt_use_static: False
save_img: True
save_res: True
return_res: True
MODEL:
- DetectionOp:
name: det
param_path: paddlecv://models/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer/inference.pdiparams
model_path: paddlecv://models/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer/inference.pdmodel
batch_size: 2
image_shape: [3, 640, 640]
PreProcess:
- Resize:
interp: 2
keep_ratio: false
target_size: [640, 640]
- NormalizeImage:
is_scale: true
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
- Permute:
PostProcess:
- ParserDetResults:
label_list: ../tests/coco_label_list.json
threshold: 0.5
Inputs:
- input.image
- DetOutput:
name: vis
Inputs:
- input.fn
- input.image
- det.dt_bboxes
- det.dt_scores
- det.dt_cls_names
# Get Started with PaddleCV in 10 Minutes
PaddleCV is PaddlePaddle's unified inference and deployment toolkit for computer vision, providing deployment pipelines for single models as well as multi-model systems. This chapter explains how to use PaddleCV in detail.
- [Installation](#1)
- [Inference and Deployment](#2)
- [Deployment Example](#2.1)
- [Parameter Description](#2.2)
- [Configuration Files](#2.3)
- [Secondary Development](#3)
<a name="1"></a>
## 1. Installation
For setting up and configuring the runtime environment, please refer to the [installation guide](INSTALL.md).
<a name="2"></a>
## 2. 预测部署
PaddleCV预测部署依赖推理模型(paddle.jit.save保存的模型),PaddleCV的配置文件中预置了不同任务推荐的推理模型下载链接。如果需要依赖飞桨各开发套件进行二次开发,相应导出文档链接如下表所示
| 开发套件名称 | 导出模型文档链接 |
|:-----------:|:------------------:|
| PaddleClas | [文档链接](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.5/docs/zh_CN/deployment/export_model.md) |
| PaddleDetection | [文档链接](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/deploy/EXPORT_MODEL.md) |
| PaddleSeg | [文档链接](https://github.com/PaddlePaddle/PaddleSeg/blob/release/2.6/docs/model_export_cn.md) |
| PaddleOCR | [文档链接](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_ch/table_recognition.md#41-%E6%A8%A1%E5%9E%8B%E5%AF%BC%E5%87%BA) |
注意:
1. PaddleOCR分别提供了不同任务的导出模型方法,上表提供链接为文本检测模型导出文档,其他任务可以参考[文档教程](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/README_ch.md#-%E6%96%87%E6%A1%A3%E6%95%99%E7%A8%8B)
<a name="2.1"></a>
### 1)部署示例
得到导出模型后可以使用如下命令进行预测部署:
```bash
# 图像分类任务
python -u tools/predict.py --config=configs/single_op/PP-HGNet.yml --input=demo/ILSVRC2012_val_00020010.jpeg
# 目标检测任务
python -u tools/predict.py --config=configs/single_op/PP-YOLOE+.yml --input=demo/000000014439.jpg
# OCR任务
python -u tools/predict.py --config=configs/system/PP-OCRv3.yml --input=demo/word_1.jpg
```
使用whl包安装后,也可以在python中使用三行代码快速进行预测部署,示例如下:
```python
from paddlecv import PaddleCV
paddlecv = PaddleCV(task_name="PP-OCRv3")
res = paddlecv("../demo/00056221.jpg")
```
<a name="2.2"></a>
### 2)参数说明
| 参数名 | 是否必选 | 默认值 | 含义 |
|:------:|:---------:|:---------:|:---------:|
| config | 是 | None | 配置文件路径 |
| input | 是 | None | 输入路径,支持图片文件,图片文件夹和视频文件 |
| output_dir | 否 | output | 输出结果保存路径,包含可视化结果和结构化输出 |
| run_mode | 否 | paddle | 预测部署模式,可选项为`'paddle'/'trt_fp32'/'trt_fp16'/'trt_int8'/'mkldnn'/'mkldnn_bf16'` |
| device | 否 | CPU | 运行设备,可选项为`CPU/GPU/XPU` |
<a name="2.3"></a>
### 3)配置文件
配置文件划分为[单模型配置](../configs/single_op)[串联系统配置](../configs/system)。配置内容分类环境类配置和模型类配置。环境配置中包含device设置,输出结果保存路径等字段。模型配置中包含各个模型的预处理,模型推理,输出后处理全流程配置项。需要注意使用正确的`param_path``model_path`路径。具体配置含义可以参考[配置文件说明文档](config_anno.md)
支持通过命令行修改配置文件内容,示例如下:
```
# 通过-o修改检测后处理阈值
# -o 中的`0`表示MODEL的Op位置,防止模型串联过程中出现同类Op的情况
python -u tools/predict.py --config=configs/single_op/PP-YOLOE+.yml --input=demo/000000014439.jpg -o MODEL.0.DetectionOp.PostProcess.0.ParserDetResults.threshold=0.6
```
**注意:**
1. 优先级排序:命令行输入 > 配置文件配置
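As a rough illustration of how a dotted `-o` key maps onto the nested configuration, here is a simplified, self-contained sketch; it is not the actual parser used by tools/predict.py, which also merges into the existing config and parses typed values via YAML.

```python
# Simplified sketch: turn a dotted override such as
# "MODEL.0.DetectionOp.PostProcess.0.ParserDetResults.threshold=0.6"
# into a nested dict. Illustration only.
def parse_override(opt):
    key, value = opt.split("=", 1)
    config = {}
    cur = config
    keys = key.split(".")
    for k in keys[:-1]:
        cur = cur.setdefault(k, {})
    cur[keys[-1]] = float(value)
    return config

override = parse_override(
    "MODEL.0.DetectionOp.PostProcess.0.ParserDetResults.threshold=0.6")
print(override["MODEL"]["0"]["DetectionOp"]["PostProcess"]["0"]
      ["ParserDetResults"]["threshold"])  # 0.6
```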
<a name="3"></a>
## 3. 二次开发
PaddleCV中内置了分类、检测、分割等单模型算子,以及OCR,行人分析工具等串联系统算子。如果用户在使用过程中需要自定义算子进行二次开发,可以参考[新增算子文档](how_to_add_new_op.md)[外部算子开发文档](custom_ops.md)
# Installation Guide
## Requirements
- PaddlePaddle 2.3
- OS: 64-bit operating system
- Python 3 (3.5.1+/3.6/3.7/3.8/3.9), 64-bit
- pip/pip3 (9.0.1+), 64-bit
- CUDA >= 10.2
- cuDNN >= 7.6
## Instructions
### 1. Install PaddlePaddle
```
# CUDA 10.2
python -m pip install paddlepaddle-gpu==2.3.2 -i https://mirror.baidu.com/pypi/simple
# CPU
python -m pip install paddlepaddle==2.3.2 -i https://mirror.baidu.com/pypi/simple
```
- For other CUDA versions or quick environment setup, see the [PaddlePaddle quick installation guide](https://www.paddlepaddle.org.cn/install/quick).
- For other installation methods such as conda or building from source, see the [PaddlePaddle installation documentation](https://www.paddlepaddle.org.cn/documentation/docs/zh/install/index_cn.html).
Make sure PaddlePaddle is installed successfully and its version is no lower than required. Verify with the following commands:
```
# confirm in your Python interpreter that PaddlePaddle is installed
>>> import paddle
>>> paddle.utils.run_check()
# check the PaddlePaddle version
python -c "import paddle; print(paddle.__version__)"
```
### 2. Install PaddleCV
```
# clone the PaddleModelPipeline repository
cd <path/to/clone/PaddleModelPipeline>
git clone https://github.com/jerrywgz/PaddleModelPipeline.git
# install other dependencies
cd PaddleModelPipeline
# build and install paddlecv
python setup.py install
```
Installation via the whl package is also supported; see the [documentation](whl.md) for detailed steps.
After installation, confirm that the test passes:
```
python tests/test_pipeline.py
```
On success the output looks like:
```
.
----------------------------------------------------------------------
Ran 1 test in 2.967s
OK
```
## Quick Start
**Congratulations!** You have successfully installed PaddleCV. Now take a quick look at object detection:
```
# run inference on one image on GPU
export CUDA_VISIBLE_DEVICES=0
python -u tools/predict.py --config=configs/single_op/PP-YOLOE+.yml --input=demo/000000014439.jpg
```
An image of the same name with the predictions drawn on it will be generated under the `output` folder.
The result looks like this:
![](../demo/000000014439_output.jpg)
# Configuration File Reference
This document uses the object detection model [PP-YOLOE+](../configs/single_op/PP-YOLOE+.yml) as an example to explain each field of the configuration file.
## Environment Section
```
ENV:
min_subgraph_size: 3 # minimum TensorRT subgraph size
shape_info_filename: ./ # path of the TensorRT shape-collection file
trt_calib_mode: False # set to True when using TensorRT offline quantization calibration
cpu_threads: 1 # number of threads for CPU deployment
trt_use_static: False # whether TensorRT loads a pre-generated engine file
save_img: True # whether to save visualized images, by default under the output folder
save_res: True # whether to save structured results, by default under the output folder
return_res: True # whether to return the full structured results
```
## Model Section
```
MODEL:
- DetectionOp: # op class name; the output fields are fixed, i.e. ["dt_bboxes", "dt_scores", "dt_class_ids", "dt_cls_names"]
name: det # op name; op names must be unique within a configuration file
param_path: paddlecv://models/ppyoloe_plus_crn_l_80e_coco/model.pdiparams # inference model parameter file; supports local paths as well as remote links with automatic download
model_path: paddlecv://models/ppyoloe_plus_crn_l_80e_coco/model.pdmodel # inference model file; supports local paths as well as remote links with automatic download
batch_size: 1 # batch size
image_shape: [3, *image_shape, *image_shape] # network input shape
PreProcess: # preprocessing module, implemented in ppcv/ops/models/detection/preprocess.py
- Resize:
interp: 2
keep_ratio: false
target_size: [*image_shape, *image_shape]
- NormalizeImage:
is_scale: true
mean: [0., 0., 0.]
std: [1., 1., 1.]
norm_type: null
- Permute:
PostProcess: # postprocessing module, implemented in ppcv/ops/models/detection/postprocess.py
- ParserDetResults:
label_list: paddlecv://dict/detection/coco_label_list.json
threshold: 0.5
Inputs: # input fields required by the DetectionOp op, in the format {last_op_name}.{last_op_output_name}; the op preceding the first op is named input
- input.image
- DetOutput: # output op class name
name: vis # op name; op names must be unique within a configuration file
Inputs: # input fields required by the DetOutput op, in the format {last_op_name}.{last_op_output_name}
- input.fn
- input.image
- det.dt_bboxes
- det.dt_scores
- det.dt_cls_names
```
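To make the `Inputs` convention concrete, here is a tiny dependency-free sketch of how a pipeline might hand each op only the fields its `Inputs` list names; the data values here are made up for illustration.

```python
# The pipeline keeps a flat dict keyed by "{op_name}.{field}"; each op
# receives only the entries listed under its Inputs. Hypothetical data.
all_results = {
    "input.fn": "demo.jpg",
    "input.image": "<image array>",
    "det.dt_bboxes": [[10, 20, 110, 220]],
    "det.dt_scores": [0.9],
    "det.dt_cls_names": ["person"],
}
vis_inputs = ["input.fn", "input.image", "det.dt_bboxes",
              "det.dt_scores", "det.dt_cls_names"]
filtered = {k: all_results[k] for k in vis_inputs}
print(sorted(filtered))
```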
# External Op Development
- [Introduction](#1)
- [External Op Dependencies](#2)
- [How to Implement an External Op](#3)
<a name="1"></a>
## 1. Introduction
This tutorial describes how to add external ops on top of paddlecv for customized op development. Before developing an external op, set up the paddlecv environment first; installing via pip is recommended:
```bash
pip install paddlecv
```
<a name="2"></a>
## 2. External Op Dependencies
External ops mainly rely on the following interfaces:
#### 1) `ppcv.ops.base.create_operators(params, mod)`
- Purpose: create preprocessing/postprocessing operators
- Inputs:
- params: preprocessing/postprocessing configuration dict
- mod: module of the current op
- Output: a list of instantiated preprocessing/postprocessing operators
#### 2) Op base classes
External ops fall into the same categories as paddlecv's built-in ops: model ops, connector ops, and output ops. A new external op must inherit from the base class of its category, as follows:
```txt
model op: ppcv.ops.models.base.ModelBaseOp
connector op: ppcv.ops.connector.base.ConnectorBaseOp
output op: ppcv.ops.output.base.OutputBaseOp
```
#### 3) ppcv.core.workspace.register
Each external op class must be decorated with @register, for example:
```python
from ppcv.ops.models.base import ModelBaseOp
from ppcv.core.workspace import register
@register
class DetectionCustomOp(ModelBaseOp):
```
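To illustrate the idea behind `@register` (only the idea; the real implementation in ppcv.core.workspace may differ), a registry decorator can be as small as:

```python
# Minimal registry sketch: @register records each decorated class in a
# global name -> class mapping, so ops can later be looked up by name.
global_config = {}

def register(cls):
    global_config[cls.__name__] = cls
    return cls

@register
class DetectionCustomOp:
    pass

print("DetectionCustomOp" in global_config)  # True
```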
<a name="3"></a>
## 3. 外部算子实现方式
可直接参考[新增算子文档](how_to_add_new_op.md),实现后使用方式与paddlecv内部提供算子相同。paddlecv中提供检测外部算子[示例](../custom_op)
# Adding a New Op
## 1. Introduction
This tutorial describes how to add an inference op to PaddleCV.
In this project, ops fall into three categories.
- Model inference op: given an input, loads the model, runs preprocessing, inference, and postprocessing, and returns the output.
- Connector op: given an input, computes an output. Typically used to convert one model's output into another model's input, e.g. cropping regions from object/text detection results, rotating images after a direction-classification module, text composition, and similar operations.
- Output op: stores, visualizes, and emits the model's results.
In the rest of this document we refer to operators as ops.
## 2. Input/Output Format of a Single Op
The input to PaddleCV is an image or a video.
For every op, the system organizes the data as `a list of dict`, where each element is one object to be inferred together with its intermediate results. For image classification, for example, the input contains only image information and looks as follows.
```json
[
{"image": img1},
{"image": img2},
]
```
The output format is:
```json
[
{"image": img1, "class_ids": class_id1, "scores": scores1, "label_names": label_names1},
{"image": img2, "class_ids": class_id2, "scores": scores2, "label_names": label_names2},
]
```
Similarly, for a connector op (taking BBoxCropOp as an example), the input is as follows.
```json
[
{"image": img1, "bbox": bboxes1},
{"image": img2, "bbox": bboxes2},
]
```
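The list-of-dict flow can be sketched without any ppcv dependency; the toy op below reads the field it needs from each record and appends its own outputs (field values are made up):

```python
# Toy op in the list-of-dict convention: it consumes "image" from each
# record and adds classification-style output fields. Illustration only.
def toy_classification_op(records):
    for r in records:
        r["class_ids"] = [0]
        r["scores"] = [0.99]
        r["label_names"] = ["cat"]
    return records

batch = [{"image": "img1"}, {"image": "img2"}]
batch = toy_classification_op(batch)
print(batch[0])
```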
## 3. Adding an Op
### 3.1 Model Inference Op
A model inference op inherits from the [ModelBaseOp class](../ppcv/ops/models/base.py). See the image classification op [ClassificationOp](../ppcv/ops/models/classification/inference.py) for an example. Concretely, we need to implement the following.
(1) The class must inherit from `ModelBaseOp` and be decorated with `@register` so that its name is globally unique.
(2) Implement the following methods:
- `__init__`
- Inputs: model_cfg and env_cfg
- Output: none
- `preprocess`
- Input: model inputs filtered by input_keys
- Output: preprocessing results
- `postprocess`
- Input: model inference results
- Output: postprocessing results
- `__call__`
- Input: the inputs this op depends on
- Output: the results of this op
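A dependency-free sketch of these four methods follows; ModelBaseOp is deliberately not imported, and the class, config fields, and threshold logic here are illustrative only.

```python
# Hypothetical model op skeleton showing __init__ / preprocess /
# postprocess / __call__. infer() stands in for real model inference.
class ToyModelOp:
    def __init__(self, model_cfg, env_cfg):
        self.threshold = model_cfg.get("threshold", 0.5)

    def preprocess(self, inputs):
        # scale raw values into [0, 1]
        return [x / 100.0 for x in inputs]

    def infer(self, inputs):
        return inputs  # stand-in for the actual forward pass

    def postprocess(self, outputs):
        return [o for o in outputs if o >= self.threshold]

    def __call__(self, inputs):
        return self.postprocess(self.infer(self.preprocess(inputs)))

op = ToyModelOp({"threshold": 0.6}, env_cfg=None)
print(op([30, 75, 90]))  # [0.75, 0.9]
```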
### 3.2 Connector Op
A connector op inherits from [ConnectorBaseOp](../ppcv/ops/connector/base.py). See the direction-correction op [ClsCorrectionOp](../ppcv/ops/connector/op_connector.py) for an example. Concretely, we need to implement the following.
(1) The class must inherit from `ConnectorBaseOp` and be decorated with `@register` so that its name is globally unique.
(2) Implement the following methods:
- `__init__`
- Inputs: model_cfg and env_cfg (usually None)
- Output: none
- `__call__`
- Input: the inputs this op depends on
- Output: the results of this op
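As with the model op, a connector op can be sketched without ppcv; the toy op below mimics a BBoxCropOp-style op that turns boxes into cropped regions (the data and names are made up):

```python
# Toy connector op: crop each bbox region out of a nested-list "image".
def crop(img, box):
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in img[y1:y2]]

def toy_bbox_crop_op(records):
    for r in records:
        r["crop_image"] = [crop(r["image"], b) for b in r["bbox"]]
    return records

image = [[(x, y) for x in range(4)] for y in range(4)]  # 4x4 grid
batch = toy_bbox_crop_op([{"image": image, "bbox": [[1, 1, 3, 3]]}])
print(len(batch[0]["crop_image"][0]))  # 2 (rows in the 2x2 crop)
```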
### 3.3 Output Op
An output op inherits from [OutputBaseOp](../ppcv/ops/output/base.py). See the classification output op [ClasOutput](../ppcv/ops/output/classification.py) for an example. Concretely, we need to implement the following.
(1) The class must inherit from `OutputBaseOp` and be decorated with `@register` so that its name is globally unique.
(2) Implement the following methods:
- `__init__`
- Inputs: model_cfg and env_cfg (usually None)
- Output: none
- `__call__`
- Input: model outputs
- Output: returned results
## 4. Adding Unit Tests
After adding an op, add a unit test based on it; see [test_classification.py](../tests/test_classification.py).
# System Design
- [Goals](#1)
- [Framework Design](#2)
- [2.1 Configuration Module](#2.1)
- [2.2 Input Module](#2.2)
- [2.3 Op Implementation](#2.3)
- [2.4 System Composition](#2.4)
<a name="1"></a>
## Goals
To address the deployment of single deep-learning models as well as multi-model systems, the PaddlePaddle model team designed a general, unified deployment system. Its core features are:
1. Generality: the system supports single-model deployment as well as complex multi-model topologies
2. Usability: multiple input types are supported, and pipelines can be reproduced efficiently through configuration files alone
3. Flexibility: custom ops can be plugged in easily, enabling flexible customized deployment
<a name="2"></a>
## 框架设计
系统整体架构如图所示
<div align=center>
<img src='images/pipeline.png' height = "250" align="middle"/>
</div>
<a name="2.1"></a>
**一. 配置模块设计**
配置模块解析配置文件,拆分为环境配置和模型配置,同时检查配置项是否合规。环境配置负责管理部署环境相关配置,例如`run_mode``device`等。模型配置负责管理每个模型算子配置,包括模型路径、前后处理等。通过`Inputs`配置段实现模型间复杂的串联关系。配置示例可以参考[PP-PicoDet.yml](../configs/single_op/PP-PicoDet.yml)
同时支持命令行更新配置文件任意配置项功能,利于开发者快速进行更改环境,替换模型,超参调优等工作。
配置文件管理部分,系统针对每个任务(task),推荐用户使用对应的配置文件,并提供`get_config_file`接口实现自动下载, 例如:
```python
import paddlecv
paddlecv.get_config_file('detection')
```
<a name="2.2"></a>
**二. 输入模块设计**
输入模块解析输入文件格式,支持图片,图片文件夹,视频,numpy数据格式。统一使用`input`字段作为输入接口。输入模块代码实现参考[链接](../ppcv/engine/pipeline.py#L45)
<a name="2.3"></a>
**三. 算子实现方案**
系统算子分为模型算子(MODEL)、衔接算子(CONNECTOR)和输出算子(OUTPUT)三部分。三部分算子均有固定的输出格式和输出字段。模型算子将每个模型的预处理、前向推理、后处理端到端全流程进行独立封装;衔接算子连接模型算子的各类输入输出,例如扣图、过滤等;输出字段负责单模型或复杂系统的输出形式,例如可视化、结果保存等功能。详细算子实现流程请参考[文档](how_to_add_new_op.md)
<a name="2.4"></a>
**四. 系统串联方案**
系统通过有向无环图(DAG)串联各个算子并执行,每个算子需要指定`Inputs`字段,字段格式为`{last_op_name}.{last_op_output_name}`,即需要包含前置算子名称和对应输出字段名。从而建立算子之间的拓扑关系,并通过拓扑排序的方式决定算子执行顺序。系统串联执行过程中,会维护全量输出结果,并根据算子指定的`Inputs`字段对结果进行过滤,保证各算子内部计算独立。执行器核心代码实现参考[链接](../ppcv/core/framework.py#L92)
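The ordering step described above can be sketched with the standard library; the op names below are hypothetical, and the real executor lives in ppcv/core/framework.py.

```python
# Derive an execution order from Inputs fields: the op name before the
# "." is the predecessor. graphlib.TopologicalSorter (Python 3.9+) does
# the sorting; "input" is the implicit source node.
from graphlib import TopologicalSorter

inputs_cfg = {
    "det": ["input.image"],
    "crop": ["input.image", "det.dt_bboxes"],
    "rec": ["crop.crop_image"],
    "vis": ["input.fn", "rec.rec_text"],
}
deps = {op: {i.split(".")[0] for i in ins} for op, ins in inputs_cfg.items()}
order = list(TopologicalSorter(deps).static_order())
print(order)  # ['input', 'det', 'crop', 'rec', 'vis']
```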
# Using the Whl Package
## 1. Installation and Introduction
The whl package has not yet been uploaded to PyPI, so for now install it as follows.
```shell
python setup.py bdist_wheel
pip install dist/paddlecv-0.1.0-py3-none-any.whl
```
## 2. Basic Usage
Usage is shown below.
* You can specify either task_name or config_path to obtain the system to run. With `task_name`, a model or pipeline bundled with the PaddleCV project is fetched and used for inference; with `config_path`, the given configuration file is loaded to initialize the model or pipeline.
```py
from paddlecv import PaddleCV
paddlecv = PaddleCV(task_name="PP-OCRv3")
res = paddlecv("../demo/00056221.jpg")
```
* To list the pipelines bundled with the system, use the following.
```py
from paddlecv import PaddleCV
PaddleCV.list_all_supported_tasks()
```
The output looks like the following.
```
[11/17 06:17:20] ppcv INFO: Tasks and recommended configs that paddlecv supports are:
PP-Human: paddlecv://configs/system/PP-Human.yml
PP-OCRv2: paddlecv://configs/system/PP-OCRv2.yml
PP-OCRv3: paddlecv://configs/system/PP-OCRv3.yml
...
```
## 3. Advanced Development
If you want to improve the paddlecv whl package interface, modify `paddlecv.py` and rebuild the whl package.
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import sys
import importlib
import argparse
__dir__ = os.path.dirname(__file__)
sys.path.insert(0, os.path.join(__dir__, ''))
import cv2
import logging
import numpy as np
from pathlib import Path
ppcv = importlib.import_module('.', 'ppcv')
tools = importlib.import_module('.', 'tools')
tests = importlib.import_module('.', 'tests')
VERSION = '0.1.0'
import yaml
from ppcv.model_zoo.model_zoo import TASK_DICT, list_model, get_config_file
from ppcv.engine.pipeline import Pipeline
from ppcv.utils.logger import setup_logger
logger = setup_logger()
class PaddleCV(object):
def __init__(self,
task_name=None,
config_path=None,
output_dir=None,
run_mode='paddle',
device='CPU'):
if task_name is not None:
assert task_name in TASK_DICT, f"task_name must be one of {list(TASK_DICT.keys())} but got {task_name}"
config_path = get_config_file(task_name)
else:
assert config_path is not None, "task_name and config_path cannot both be None!"
self.cfg_dict = dict(
config=config_path,
output_dir=output_dir,
run_mode=run_mode,
device=device)
cfg = argparse.Namespace(**self.cfg_dict)
self.pipeline = Pipeline(cfg)
@classmethod
def list_all_supported_tasks(cls):
logger.info(
"Tasks and recommended configs that paddlecv supports are: ")
buffer = yaml.dump(TASK_DICT)
print(buffer)
return
@classmethod
def list_all_supported_models(cls, filters=[]):
list_model(filters)
return
def __call__(self, input):
res = self.pipeline.run(input)
return res
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from . import (core, engine, ops, utils, model_zoo)
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from . import workspace
from .workspace import *
__all__ = workspace.__all__
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import numpy as np
import math
import paddle
import collections
from collections import defaultdict
from collections.abc import Sequence, Mapping
import yaml
import copy
from argparse import ArgumentParser, RawDescriptionHelpFormatter
from ppcv.utils.logger import setup_logger
import ppcv
from ppcv.ops import *
logger = setup_logger('config')
class ArgsParser(ArgumentParser):
def __init__(self):
super(ArgsParser, self).__init__(
formatter_class=RawDescriptionHelpFormatter)
self.add_argument(
"-o", "--opt", nargs='*', help="set configuration options")
def parse_args(self, argv=None):
args = super(ArgsParser, self).parse_args(argv)
assert args.config is not None, \
"Please specify --config=configure_file_path."
args.opt = self._parse_opt(args.opt)
return args
def _parse_opt(self, opts):
config = {}
if not opts:
return config
for s in opts:
s = s.strip()
k, v = s.split('=', 1)
if '.' not in k:
config[k] = yaml.load(v, Loader=yaml.Loader)
else:
keys = k.split('.')
if keys[0] not in config:
config[keys[0]] = {}
cur = config[keys[0]]
for idx, key in enumerate(keys[1:]):
if idx == len(keys) - 2:
cur[key] = yaml.load(v, Loader=yaml.Loader)
else:
cur[key] = {}
cur = cur[key]
return config
class ConfigParser(object):
def __init__(self, args):
with open(args.config) as f:
cfg = yaml.safe_load(f)
self.model_cfg, self.env_cfg = self.merge_cfg(args, cfg)
self.check_cfg()
def merge_cfg(self, args, cfg):
env_cfg = cfg['ENV']
model_cfg = cfg['MODEL']
def merge(cfg, arg):
merge_cfg = copy.deepcopy(cfg)
for k, v in cfg.items():
if k in arg:
merge_cfg[k] = arg[k]
else:
if isinstance(v, dict):
merge_cfg[k] = merge(v, arg)
return merge_cfg
def merge_opt(cfg, arg):
for k, v in arg.items():
if isinstance(cfg, Sequence):
k = eval(k)
cfg[k] = merge_opt(cfg[k], v)
else:
if (k in cfg and (isinstance(cfg[k], Sequence) or
isinstance(cfg[k], Mapping)) and
isinstance(arg[k], Mapping)):
merge_opt(cfg[k], arg[k])
else:
cfg[k] = arg[k]
return cfg
args_dict = vars(args)
for k, v in args_dict.items():
if k not in env_cfg:
env_cfg[k] = v
env_cfg = merge(env_cfg, args_dict)
if 'opt' in args_dict.keys() and args_dict['opt']:
opt_dict = args_dict['opt']
if opt_dict.get('ENV', None):
env_cfg = merge_opt(env_cfg, opt_dict['ENV'])
if opt_dict.get('MODEL', None):
model_cfg = merge_opt(model_cfg, opt_dict['MODEL'])
return model_cfg, env_cfg
def check_cfg(self):
unique_name = set()
unique_name.add('input')
op_list = ppcv.ops.__all__
for model in self.model_cfg:
model_name = list(model.keys())[0]
model_dict = list(model.values())[0]
# check the name and last_ops is legal
if 'name' not in model_dict:
raise ValueError(
'Missing name field in {} model config'.format(model_name))
inputs = model_dict['Inputs']
for input in inputs:
input_str = input.split('.')
assert len(
input_str
) > 1, 'The Inputs name should be in the format {{last_op_name}}.{{last_op_output_name}}, but received {} in {} model config'.format(
input, model_name)
last_op = input.split('.')[0]
assert last_op in unique_name, 'The last_op {} in {} model config does not exist.'.format(
last_op, model_name)
unique_name.add(model_dict['name'])
device = self.env_cfg.get("device", "CPU")
assert device.upper() in ['CPU', 'GPU', 'XPU'
], "device should be CPU, GPU or XPU"
def parse(self):
return self.model_cfg, self.env_cfg
def print_cfg(self):
print('----------- Environment Arguments -----------')
buffer = yaml.dump(self.env_cfg)
print(buffer)
print('------------- Model Arguments ---------------')
buffer = yaml.dump(self.model_cfg)
print(buffer)
print('---------------------------------------------')
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import numpy as np
import math
import paddle
from collections import defaultdict
import ppcv
from ppcv.ops import *
from ppcv.utils.helper import get_output_keys, gen_input_name
from ppcv.core.workspace import create
class DAG(object):
"""
Directed Acyclic Graph(DAG) engine, builds one DAG topology.
"""
def __init__(self, cfg):
self.graph, self.rev_graph, self.in_degrees = self.build_dag(cfg)
self.num = len(self.in_degrees)
def build_dag(self, cfg):
graph = defaultdict(list) # op -> next_op
unique_name = set()
unique_name.add('input')
rev_graph = defaultdict(list) # op -> last_op
for op in cfg:
op_dict = list(op.values())[0]
unique_name.add(op_dict['name'])
in_degrees = dict((u, 0) for u in unique_name)
for op in cfg:
op_cfg = list(op.values())[0]
inputs = op_cfg['Inputs']
for input in inputs:
last_op = input.split('.')[0]
graph[last_op].append(op_cfg['name'])
rev_graph[op_cfg['name']].append(last_op)
in_degrees[op_cfg['name']] += 1
return graph, rev_graph, in_degrees
def get_graph(self):
return self.graph
def get_reverse_graph(self):
return self.rev_graph
def topo_sort(self):
"""
Topological sort of DAG, creates inverted multi-layers views.
Args:
graph (dict): the DAG structure
in_degrees (dict): the in-degree (number of predecessor ops) of each op
Returns:
sort_result: the topological order of the ops. Example:
DAG: [A -> B -> C -> E]
\-> D ------/
sort_result: [A, B, C, D, E]
"""
# Select vertices with in_degree = 0
Q = [u for u in self.in_degrees if self.in_degrees[u] == 0]
sort_result = []
while Q:
u = Q.pop()
sort_result.append(u)
for v in self.graph[u]:
# remove output degrees
self.in_degrees[v] -= 1
# re-select vertices with in_degree = 0
if self.in_degrees[v] == 0:
Q.append(v)
if len(sort_result) == self.num:
return sort_result
else:
return None
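`topo_sort` above is Kahn's algorithm: repeatedly remove a vertex with in-degree zero and decrement the in-degree of its successors. A self-contained sketch over a plain edge list (the op names are illustrative):

```python
from collections import defaultdict

def topo_sort(edges):
    # Kahn's algorithm, as in DAG.topo_sort: pop zero in-degree nodes.
    graph = defaultdict(list)
    in_deg = defaultdict(int)
    nodes = set()
    for u, v in edges:
        graph[u].append(v)
        in_deg[v] += 1
        nodes.update((u, v))
    queue = [n for n in nodes if in_deg[n] == 0]
    order = []
    while queue:
        u = queue.pop()
        order.append(u)
        for v in graph[u]:
            in_deg[v] -= 1
            if in_deg[v] == 0:
                queue.append(v)
    # Fewer nodes in the order than in the graph means a cycle was left over.
    return order if len(order) == len(nodes) else None

# The docstring example: A -> B -> C -> E, with a side branch A -> D -> E
order = topo_sort([("A", "B"), ("B", "C"), ("C", "E"), ("A", "D"), ("D", "E")])
```

A `None` result signals a cycle, i.e. the op config does not describe a valid DAG.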
class Executor(object):
"""
The executor that runs the model series pipeline
Args:
env_cfg: The environment configuration
model_cfg: The models configuration
"""
def __init__(self, model_cfg, env_cfg):
dag = DAG(model_cfg)
self.order = dag.topo_sort()
self.model_cfg = model_cfg
self.op_name2op = {}
self.has_output_op = False
for op in model_cfg:
op_arch = list(op.keys())[0]
op_cfg = list(op.values())[0]
op_name = op_cfg['name']
op = create(op_arch, op_cfg, env_cfg)
self.op_name2op[op_name] = op
if op.type() == 'OUTPUT':
self.has_output_op = True
self.output_keys = get_output_keys(model_cfg)
self.last_ops_dict = dag.get_reverse_graph()
self.input_dep = self.reset_dep()
def reset_dep(self, ):
return self.build_dep(self.model_cfg, self.output_keys)
def build_dep(self, cfg, output_keys):
# compute the output degree for each input name
dep = dict()
for op in cfg:
inputs = list(op.values())[0]['Inputs']
for name in inputs:
if name in dep:
dep[name] += 1
else:
dep.update({name: 1})
return dep
def update_res(self, results, op_outputs, input_name):
# step1: remove the result when keys not used in later input
for res, out in zip(results, op_outputs):
if self.has_output_op:
del_name = []
for k in out.keys():
if k not in self.input_dep:
del_name.append(k)
# remove the result when keys not used in later input
for name in del_name:
del out[name]
res.update(out)
# step2: if the input name is no longer used, then result will be deleted
if self.has_output_op:
for name in input_name:
self.input_dep[name] -= 1
if self.input_dep[name] == 0:
for res in results:
del res[name]
def run(self, input, frame_id=-1):
self.input_dep = self.reset_dep()
# execute each operator according to toposort order
results = input
for i, op_name in enumerate(self.order[1:]):
op = self.op_name2op[op_name]
op.set_frame(frame_id)
last_ops = self.last_ops_dict[op_name]
input_keys = op.get_input_keys()
output_keys = list(results[0].keys())
input = op.filter_input(results, input_keys)
last_op_output = op(input)
if op.type() != 'OUTPUT':
op.check_output(last_op_output, op_name)
self.update_res(results, last_op_output, input_keys)
else:
results = last_op_output
return results
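`build_dep` and `update_res` above amount to reference counting on intermediate results: each output name starts with a count equal to the number of ops that consume it, and a result is dropped from memory once its count reaches zero. A stripped-down sketch (the `input.image`/`det.bbox` names are illustrative):

```python
def build_dep(op_inputs):
    # op_inputs: for each op, the list of input names it consumes.
    dep = {}
    for inputs in op_inputs:
        for name in inputs:
            dep[name] = dep.get(name, 0) + 1
    return dep

def consume(dep, results, used_names):
    # After an op runs, decrement its inputs; drop results nobody else needs.
    for name in used_names:
        dep[name] -= 1
        if dep[name] == 0:
            results.pop(name, None)

dep = build_dep([["input.image"], ["det.bbox"], ["det.bbox"]])
results = {"input.image": "im", "det.bbox": "boxes"}
consume(dep, results, ["input.image"])   # no consumer left -> freed
consume(dep, results, ["det.bbox"])      # one consumer left -> kept
```

This is why `run` calls `reset_dep` at the start of every frame: the counts are destructively decremented during execution.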
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
__all__ = ['register', 'create']
global_config = dict()
def register(cls):
"""
Register a given module class.
Args:
cls (type): Module class to be registered.
Returns: cls
"""
if cls.__name__ in global_config:
raise ValueError("Module class already registered: {}".format(
cls.__name__))
global_config[cls.__name__] = cls
return cls
def create(cls_name, op_cfg, env_cfg):
"""
Create an instance of given module class.
Args:
cls_name(str): Class of which to create an instance.
Return: instance of type `cls_name`
"""
assert type(cls_name) == str, "should be a name of class"
if cls_name not in global_config:
raise ValueError("The module {} is not registered".format(cls_name))
cls = global_config[cls_name]
return cls(op_cfg, env_cfg)
def get_global_op():
return global_config
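The `register`/`create` pair above is the common decorator-registry pattern: classes register themselves by name at import time and are later instantiated from config strings. A self-contained sketch (the `DetectionOp` class and its `cfg` argument are hypothetical):

```python
_registry = {}

def register(cls):
    # Decorator: map the class name to the class object itself.
    if cls.__name__ in _registry:
        raise ValueError("Module class already registered: {}".format(cls.__name__))
    _registry[cls.__name__] = cls
    return cls

def create(cls_name, *args, **kwargs):
    # Instantiate a previously registered class from its name.
    if cls_name not in _registry:
        raise ValueError("The module {} is not registered".format(cls_name))
    return _registry[cls_name](*args, **kwargs)

@register
class DetectionOp:
    def __init__(self, cfg):
        self.cfg = cfg

op = create("DetectionOp", {"name": "det"})
```

Because registration happens at import time, a `from ppcv.ops import *` at startup is enough to populate the registry before any config is parsed.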
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from . import pipeline
from .pipeline import *
__all__ = pipeline.__all__
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import sys
import numpy as np
import math
import glob
import paddle
import cv2
from collections import defaultdict
try:
from collections.abc import Sequence
except Exception:
from collections import Sequence
from ppcv.core.framework import Executor
from ppcv.utils.logger import setup_logger
from ppcv.core.config import ConfigParser
logger = setup_logger('pipeline')
__all__ = ['Pipeline']
class Pipeline(object):
def __init__(self, cfg):
config = ConfigParser(cfg)
config.print_cfg()
self.model_cfg, self.env_cfg = config.parse()
self.exe = Executor(self.model_cfg, self.env_cfg)
self.output_dir = self.env_cfg.get('output_dir', 'output')
def _parse_input(self, input):
if isinstance(input, np.ndarray):
return [input], 'data'
if isinstance(input, Sequence) and isinstance(input[0], np.ndarray):
return input, 'data'
im_exts = ['jpg', 'jpeg', 'png', 'bmp']
im_exts += [ext.upper() for ext in im_exts]
video_exts = ['mp4', 'avi', 'wmv', 'mov', 'mpg', 'mpeg', 'flv']
video_exts += [ext.upper() for ext in video_exts]
if isinstance(input, (list, tuple)) and isinstance(input[0], str):
input_type = "image"
images = [
image for image in input
if any([image.endswith(ext) for ext in im_exts])
]
return images, input_type
if os.path.isdir(input):
input_type = "image"
logger.info(
'Input path is a directory, searching for images automatically')
images = set()
infer_dir = os.path.abspath(input)
for ext in im_exts:
images.update(glob.glob('{}/*.{}'.format(infer_dir, ext)))
images = list(images)
return images, input_type
logger.info('Input path is {}'.format(input))
input_ext = os.path.splitext(input)[-1][1:]
if input_ext in im_exts:
input_type = "image"
return [input], input_type
if input_ext in video_exts:
input_type = "video"
return input, input_type
raise ValueError("Unsupported input format: {}".format(input_ext))
def run(self, input):
input, input_type = self._parse_input(input)
if input_type == "image" or input_type == 'data':
results = self.predict_images(input)
elif input_type == "video":
results = self.predict_video(input)
else:
raise ValueError("Unexpected input type: {}".format(input_type))
return results
def decode_image(self, input):
if isinstance(input, str):
with open(input, 'rb') as f:
im_read = f.read()
data = np.frombuffer(im_read, dtype='uint8')
im = cv2.imdecode(data, 1) # BGR mode, but need RGB mode
im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
else:
im = input
return im
def predict_images(self, input):
batch_input = [{
'input.image': self.decode_image(f),
'input.fn': 'tmp.jpg' if isinstance(f, np.ndarray) else f
} for f in input]
results = self.exe.run(batch_input)
return results
def predict_video(self, input):
capture = cv2.VideoCapture(input)
file_name = input.split('/')[-1]
# Get Video info : resolution, fps, frame count
width = int(capture.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(capture.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = int(capture.get(cv2.CAP_PROP_FPS))
frame_count = int(capture.get(cv2.CAP_PROP_FRAME_COUNT))
logger.info("video fps: %d, frame_count: %d" % (fps, frame_count))
if not os.path.exists(self.output_dir):
os.makedirs(self.output_dir)
out_path = os.path.join(self.output_dir, file_name)
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
writer = cv2.VideoWriter(out_path, fourcc, fps, (width, height))
frame_id = 0
results = None
while True:
if frame_id % 10 == 0:
logger.info('frame id: {}'.format(frame_id))
ret, frame = capture.read()
if not ret:
break
frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
frame_input = [{'input.image': frame_rgb, 'input.fn': input}]
results = self.exe.run(frame_input, frame_id)
writer.write(results[0]['output'])
frame_id += 1
writer.release()
logger.info('save result to {}'.format(out_path))
return results
single_op/PP-YOLOv2
single_op/PP-PicoDet
single_op/PP-LiteSeg
single_op/PP-YOLOE+
single_op/PP-MattingV1
single_op/PP-YOLO
single_op/PP-LCNetV2
single_op/PP-HGNet
single_op/PP-LCNet
single_op/PP-HumanSegV2
single_op/PP-YOLOE
system/PP-Structure-layout-table
system/PP-Structure-re
system/PP-Structure
system/PP-OCRv2
system/PP-Vehicle
system/PP-ShiTuV2
system/PP-Structure-table
system/PP-Human
system/PP-TinyPose
system/PP-ShiTu
system/PP-OCRv3
system/PP-Structure-ser
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from . import model_zoo
from .model_zoo import *
__all__ = model_zoo.__all__
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os.path as osp
import pkg_resources
try:
from collections.abc import Sequence
except ImportError:
from collections import Sequence
from ppcv.utils.download import get_config_path, get_model_path
from ppcv.utils.logger import setup_logger
logger = setup_logger(__name__)
__all__ = [
'list_model', 'get_config_file', 'get_model_file', 'MODEL_ZOO_FILENAME'
]
MODEL_ZOO_FILENAME = 'MODEL_ZOO'
TASK_DICT = {
# single model
'classification': 'paddlecv://configs/single_op/PP-HGNet.yml',
'detection': 'paddlecv://configs/single_op/PP-YOLOE+.yml',
'segmentation': 'paddlecv://configs/single_op/PP-LiteSeg.yml',
# system
'PP-OCRv2': 'paddlecv://configs/system/PP-OCRv2.yml',
'PP-OCRv3': 'paddlecv://configs/system/PP-OCRv3.yml',
'PP-StructureV2': 'paddlecv://configs/system/PP-Structure.yml',
'PP-StructureV2-layout-table':
'paddlecv://configs/system/PP-Structure-layout-table.yml',
'PP-StructureV2-table': 'paddlecv://configs/system/PP-Structure-table.yml',
'PP-StructureV2-ser': 'paddlecv://configs/system/PP-Structure-ser.yml',
'PP-StructureV2-re': 'paddlecv://configs/system/PP-Structure-re.yml',
'PP-Human': 'paddlecv://configs/system/PP-Human.yml',
'PP-Vehicle': 'paddlecv://configs/system/PP-Vehicle.yml',
'PP-TinyPose': 'paddlecv://configs/system/PP-TinyPose.yml',
}
def list_model(filters=[]):
model_zoo_file = pkg_resources.resource_filename('ppcv.model_zoo',
MODEL_ZOO_FILENAME)
with open(model_zoo_file) as f:
model_names = f.read().splitlines()
# filter model_name
def filt(name):
for f in filters:
if name.find(f) < 0:
return False
return True
if isinstance(filters, str) or not isinstance(filters, Sequence):
filters = [filters]
model_names = [name for name in model_names if filt(name)]
if len(model_names) == 0 and len(filters) > 0:
raise ValueError("no model found, please check filters setting, "
"filters can be set as following kinds:\n"
"\tTask: single_op, system\n"
"\tArchitecture: PPLCNet, PPYOLOE ...\n")
model_str = "Available Models:\n"
for model_name in model_names:
model_str += "\t{}\n".format(model_name)
logger.info(model_str)
# models and configs are stored on bcebos under the dygraph directory
def get_config_file(task):
"""Get config path from task.
"""
if task not in TASK_DICT:
tasks = TASK_DICT.keys()
raise ValueError("Illegal task: {}, please use one of {}".format(task,
tasks))
path = TASK_DICT[task]
return get_config_path(path)
def get_model_file(path):
return get_model_path(path)
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from . import models
from . import output
from . import connector
from .models import *
from .output import *
from .connector import *
__all__ = models.__all__ + output.__all__ + connector.__all__
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import importlib
import math
import numpy as np
try:
from collections.abc import Sequence
except Exception:
from collections import Sequence
import paddle
from paddle.inference import Config
from paddle.inference import create_predictor
from ppcv.ops.predictor import PaddlePredictor
from ppcv.utils.download import get_model_path
__all__ = ["BaseOp", ]
def create_operators(params, mod):
"""
create operators based on the config
Args:
params(list): a dict list, used to create some operators
mod(module) : a module that can import single ops
"""
assert isinstance(params, list), ('operator config should be a list')
if mod is None:
mod = importlib.import_module(__name__)
ops = []
for operator in params:
if isinstance(operator, str):
op_name = operator
param = {}
else:
assert isinstance(operator,
dict) and len(operator) == 1, "yaml format error"
op_name = list(operator)[0]
param = {} if operator[op_name] is None else operator[op_name]
op = getattr(mod, op_name)(**param)
ops.append(op)
return ops
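`create_operators` accepts either bare op names or one-key dicts mapping a name to its kwargs, mirroring the YAML transform lists. A runnable sketch with a stand-in module (the `Resize`/`Normalize` names and the `types.SimpleNamespace` stand-in are assumptions for illustration):

```python
import types

class Resize:
    def __init__(self, size=224):
        self.size = size

class Normalize:
    pass

# Stand-in for the transforms module that ops are looked up in.
mod = types.SimpleNamespace(Resize=Resize, Normalize=Normalize)

def create_operators(params, mod):
    # Each entry is either a bare name ("Normalize") or a one-key dict
    # ({"Resize": {"size": 320}}) whose value holds the constructor kwargs.
    ops = []
    for operator in params:
        if isinstance(operator, str):
            op_name, param = operator, {}
        else:
            assert isinstance(operator, dict) and len(operator) == 1, "yaml format error"
            op_name = list(operator)[0]
            param = operator[op_name] or {}
        ops.append(getattr(mod, op_name)(**param))
    return ops

ops = create_operators([{"Resize": {"size": 320}}, "Normalize"], mod)
```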
class BaseOp(object):
"""
Base operator, implementing the prediction process.
"""
def __init__(self, model_cfg, env_cfg):
self.model_cfg = model_cfg
self.env_cfg = env_cfg
self.input_keys = model_cfg["Inputs"]
@classmethod
def type(cls):
raise NotImplementedError
@classmethod
def get_output_keys(cls):
raise NotImplementedError
def get_input_keys(self):
return self.input_keys
def filter_input(self, last_outputs, input_keys):
f_inputs = [{k: last[k] for k in input_keys} for last in last_outputs]
return f_inputs
def check_output(self, output, name):
if not isinstance(output, Sequence):
raise ValueError('The output of op: {} must be a Sequence'.format(
name))
output = output[0]
if not isinstance(output, dict):
raise ValueError(
'The element of output in op: {} must be a dict'.format(name))
out_keys = list(output.keys())
for out, define in zip(out_keys, self.output_keys):
if out != define:
raise ValueError(
'The output key in op: {} is inconsistent, expect {}, but received {}'.
format(name, define, out))
def set_frame(self, frame_id):
self.frame_id = frame_id
def __call__(self, image_list):
raise NotImplementedError
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from . import op_connector
from .op_connector import *
__all__ = op_connector.__all__
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import cv2
import numpy as np
from ppcv.ops.base import BaseOp
class ConnectorBaseOp(BaseOp):
def __init__(self, model_cfg, env_cfg=None):
super(ConnectorBaseOp, self).__init__(model_cfg, env_cfg)
self.name = model_cfg["name"]
keys = self.get_output_keys()
self.output_keys = [self.name + '.' + key for key in keys]
@classmethod
def type(cls):
return 'CONNECTOR'
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from . import tracker
from . import postprocess
from .tracker import *
from .postprocess import *
__all__ = tracker.__all__ + postprocess.__all__
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from . import ocsort_matching
from .ocsort_matching import *
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import cv2
import numpy as np
from scipy.special import softmax
from ppcv.utils.download import get_dict_path
__all__ = ['ParserTrackerResults']
class ParserTrackerResults(object):
def __init__(self, label_list):
self.clsid2catid, self.catid2name = self.get_categories(label_list)
def get_categories(self, label_list):
if isinstance(label_list, list):
clsid2catid = {i: i for i in range(len(label_list))}
catid2name = {i: label_list[i] for i in range(len(label_list))}
return clsid2catid, catid2name
label_list = get_dict_path(label_list)
if label_list.endswith('json'):
# lazy import pycocotools here
from pycocotools.coco import COCO
coco = COCO(label_list)
cats = coco.loadCats(coco.getCatIds())
clsid2catid = {i: cat['id'] for i, cat in enumerate(cats)}
catid2name = {cat['id']: cat['name'] for cat in cats}
elif label_list.endswith('txt'):
cats = []
with open(label_list) as f:
for line in f.readlines():
cats.append(line.strip())
if cats[0] == 'background': cats = cats[1:]
clsid2catid = {i: i for i in range(len(cats))}
catid2name = {i: name for i, name in enumerate(cats)}
else:
raise ValueError("label_list {} should be json or txt.".format(
label_list))
return clsid2catid, catid2name
def __call__(self, tracking_outputs, output_keys):
tk_cls_ids = tracking_outputs[output_keys[3]]
tk_cls_names = [
self.catid2name[self.clsid2catid[cls_id]] for cls_id in tk_cls_ids
]
tracking_outputs[output_keys[4]] = tk_cls_names
return tracking_outputs
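The txt branch of `get_categories` above reduces to mapping line index to class name, dropping an optional leading `background` row. A sketch without the file I/O (`cats_from_lines` is a hypothetical helper):

```python
def cats_from_lines(lines):
    # Mirror of the txt branch in get_categories: one class name per line.
    cats = [line.strip() for line in lines if line.strip()]
    if cats and cats[0] == 'background':
        cats = cats[1:]
    clsid2catid = {i: i for i in range(len(cats))}
    catid2name = dict(enumerate(cats))
    return clsid2catid, catid2name

clsid2catid, catid2name = cats_from_lines(['background', 'person', 'car'])
```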