Unverified commit eab8662f, authored by qingqing01, committed by GitHub

Merge dygraph into master (#1931)

Merge dygraph branch into master branch.
Co-authored-by: FDInSky <48318485+FDInSky@users.noreply.github.com>
Co-authored-by: sunxl1988 <47514455+sunxl1988@users.noreply.github.com>
Co-authored-by: wangguanzhong <jerrywgz@126.com>
Co-authored-by: wangxinxin08 <69842442+wangxinxin08@users.noreply.github.com>
Co-authored-by: wanghuancoder <wanghuancoder@163.com>
Co-authored-by: cnn <liuhui29@baidu.com>
Co-authored-by: Guanghua Yu <742925032@qq.com>
Co-authored-by: Kaipeng Deng <dengkaipeng@baidu.com>
Co-authored-by: root <root@bjyz-sys-gpu-kongming0.bjyz.baidu.com>
Parent commit 7a2e7950
@@ -3,3 +3,6 @@
# from source files in unittest.sh
tqdm
cython
shapely
llvmlite==0.33
numba==0.50
# Virtualenv
/.venv/
/venv/
# Byte-compiled / optimized / DLL files
__pycache__/
.ipynb_checkpoints/
*.py[cod]
# C extensions
*.so
# json file
*.json
# Distribution / packaging
/bin/
/build/
/develop-eggs/
/dist/
/eggs/
/lib/
/lib64/
/output/
/inference_model/
/parts/
/sdist/
/var/
/*.egg-info/
/.installed.cfg
/*.egg
/.eggs
# AUTHORS and ChangeLog will be generated while packaging
/AUTHORS
/ChangeLog
# BCloud / BuildSubmitter
/build_submitter.*
/logger_client_log
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
.tox/
.coverage
.cache
.pytest_cache
nosetests.xml
coverage.xml
# Translations
*.mo
# Sphinx documentation
/docs/_build/
*.json
*.tar
*.pyc
.idea/
dataset/coco/annotations
dataset/coco/train2017
dataset/coco/val2017
dataset/voc/VOCdevkit
dataset/fruit/fruit-detection/
dataset/voc/test.txt
dataset/voc/trainval.txt
dataset/wider_face/WIDER_test
dataset/wider_face/WIDER_train
dataset/wider_face/WIDER_val
dataset/wider_face/wider_face_split
- repo: https://github.com/PaddlePaddle/mirrors-yapf.git
sha: 0d79c0c469bab64f7229c9aca2b1186ef47f0e37
hooks:
- id: yapf
files: \.py$
- repo: https://github.com/pre-commit/pre-commit-hooks
sha: a11d9314b22d8f8c7556443875b731ef05965464
hooks:
- id: check-merge-conflict
- id: check-symlinks
- id: detect-private-key
files: (?!.*paddle)^.*$
- id: end-of-file-fixer
files: \.(md|yml)$
- id: trailing-whitespace
files: \.(md|yml)$
- repo: https://github.com/Lucas-C/pre-commit-hooks
sha: v1.0.1
hooks:
- id: forbid-crlf
files: \.(md|yml)$
- id: remove-crlf
files: \.(md|yml)$
- id: forbid-tabs
files: \.(md|yml)$
- id: remove-tabs
files: \.(md|yml)$
[style]
based_on_style = pep8
column_limit = 80
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
# PaddleDetection
The dynamic graph (dygraph) version of PaddleDetection. This is a preview release; the design, performance, model coverage, and documentation are still under active improvement.
Supported models:
- Faster-RCNN (FPN)
- Mask-RCNN (FPN)
- Cascade RCNN
- YOLOv3
- SSD
Extended features:
- [x] **Synchronized Batch Norm**
- [x] **Group Norm**
- [x] **Modulated Deformable Convolution**
- [x] **Deformable PSRoI Pooling**
## Documentation
### Tutorials
- [Installation guide](docs/tutorials/INSTALL_cn.md)
- [Training/evaluation/inference workflow](docs/tutorials/GETTING_STARTED_cn.md)
- [FAQ](docs/FAQ.md)
- [Inference deployment](deploy)
  - [Model export tutorial](docs/advanced_tutorials/deploy/EXPORT_MODEL.md)
  - [Python inference deployment](deploy/python)
  - [C++ inference deployment](deploy/cpp)
  - [Inference benchmark](docs/advanced_tutorials/deploy/BENCHMARK_INFER_cn.md)
## Model Zoo
- [Model zoo](docs/MODEL_ZOO_cn.md)
metric: COCO
num_classes: 80
TrainDataset:
!COCODataSet
image_dir: train2017
anno_path: annotations/instances_train2017.json
dataset_dir: dataset/coco
data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
EvalDataset:
!COCODataSet
image_dir: val2017
anno_path: annotations/instances_val2017.json
dataset_dir: dataset/coco
TestDataset:
!ImageFolder
anno_path: annotations/instances_val2017.json
metric: COCO
num_classes: 80
TrainDataset:
!COCODataSet
image_dir: train2017
anno_path: annotations/instances_train2017.json
dataset_dir: dataset/coco
data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_poly', 'is_crowd']
EvalDataset:
!COCODataSet
image_dir: val2017
anno_path: annotations/instances_val2017.json
dataset_dir: dataset/coco
TestDataset:
!ImageFolder
anno_path: annotations/instances_val2017.json
metric: VOC
num_classes: 20
TrainDataset:
!VOCDataSet
dataset_dir: dataset/voc
anno_path: trainval.txt
label_list: label_list.txt
data_fields: ['image', 'gt_bbox', 'gt_class', 'difficult']
EvalDataset:
!VOCDataSet
dataset_dir: dataset/voc
anno_path: test.txt
label_list: label_list.txt
data_fields: ['image', 'gt_bbox', 'gt_class', 'difficult']
TestDataset:
!ImageFolder
anno_path: dataset/voc/label_list.txt
architecture: CascadeRCNN
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
weights: output/cascade_mask_rcnn_r50_fpn_1x_coco/model_final
load_static_weights: True
roi_stages: 3
# Model Architecture
CascadeRCNN:
# model anchor info flow
anchor: Anchor
proposal: Proposal
mask: Mask
# model feat info flow
backbone: ResNet
neck: FPN
rpn_head: RPNHead
bbox_head: BBoxHead
mask_head: MaskHead
# post process
bbox_post_process: BBoxPostProcess
mask_post_process: MaskPostProcess
ResNet:
# index 0 stands for res2
depth: 50
norm_type: bn
freeze_at: 0
return_idx: [0,1,2,3]
num_stages: 4
FPN:
in_channels: [256, 512, 1024, 2048]
out_channel: 256
min_level: 0
max_level: 4
spatial_scale: [0.25, 0.125, 0.0625, 0.03125]
RPNHead:
rpn_feat:
name: RPNFeat
feat_in: 256
feat_out: 256
anchor_per_position: 3
rpn_channel: 256
Anchor:
anchor_generator:
name: AnchorGeneratorRPN
aspect_ratios: [0.5, 1.0, 2.0]
anchor_start_size: 32
stride: [4., 4.]
anchor_target_generator:
name: AnchorTargetGeneratorRPN
batch_size_per_im: 256
fg_fraction: 0.5
negative_overlap: 0.3
positive_overlap: 0.7
straddle_thresh: 0.0
Proposal:
proposal_generator:
name: ProposalGenerator
min_size: 0.0
nms_thresh: 0.7
train_pre_nms_top_n: 2000
train_post_nms_top_n: 2000
infer_pre_nms_top_n: 1000
infer_post_nms_top_n: 1000
proposal_target_generator:
name: ProposalTargetGenerator
batch_size_per_im: 512
bbox_reg_weights: [0.1, 0.1, 0.2, 0.2]
bg_thresh_hi: [0.5, 0.6, 0.7]
bg_thresh_lo: [0.0, 0.0, 0.0]
fg_thresh: [0.5, 0.6, 0.7]
fg_fraction: 0.25
is_cls_agnostic: true
BBoxHead:
bbox_feat:
name: BBoxFeat
roi_extractor:
name: RoIAlign
resolution: 7
sampling_ratio: 2
head_feat:
name: TwoFCHead
in_dim: 256
mlp_dim: 1024
in_feat: 1024
cls_agnostic: true
BBoxPostProcess:
decode:
name: RCNNBox
num_classes: 81
batch_size: 1
var_weight: 3.
nms:
name: MultiClassNMS
keep_top_k: 100
score_threshold: 0.05
nms_threshold: 0.5
Mask:
mask_target_generator:
name: MaskTargetGenerator
mask_resolution: 28
MaskHead:
mask_feat:
name: MaskFeat
num_convs: 4
feat_in: 256
feat_out: 256
mask_roi_extractor:
name: RoIAlign
resolution: 14
sampling_ratio: 2
share_bbox_feat: False
feat_in: 256
MaskPostProcess:
mask_resolution: 28
architecture: CascadeRCNN
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
weights: output/cascade_rcnn_r50_fpn_1x_coco/model_final
load_static_weights: True
roi_stages: 3
# Model Architecture
CascadeRCNN:
# model anchor info flow
anchor: Anchor
proposal: Proposal
# model feat info flow
backbone: ResNet
neck: FPN
rpn_head: RPNHead
bbox_head: BBoxHead
# post process
bbox_post_process: BBoxPostProcess
ResNet:
# index 0 stands for res2
depth: 50
norm_type: bn
freeze_at: 0
return_idx: [0,1,2,3]
num_stages: 4
FPN:
in_channels: [256, 512, 1024, 2048]
out_channel: 256
min_level: 0
max_level: 4
spatial_scale: [0.25, 0.125, 0.0625, 0.03125]
RPNHead:
rpn_feat:
name: RPNFeat
feat_in: 256
feat_out: 256
anchor_per_position: 3
rpn_channel: 256
Anchor:
anchor_generator:
name: AnchorGeneratorRPN
aspect_ratios: [0.5, 1.0, 2.0]
anchor_start_size: 32
stride: [4., 4.]
anchor_target_generator:
name: AnchorTargetGeneratorRPN
batch_size_per_im: 256
fg_fraction: 0.5
negative_overlap: 0.3
positive_overlap: 0.7
straddle_thresh: 0.0
Proposal:
proposal_generator:
name: ProposalGenerator
min_size: 0.0
nms_thresh: 0.7
train_pre_nms_top_n: 2000
train_post_nms_top_n: 2000
infer_pre_nms_top_n: 1000
infer_post_nms_top_n: 1000
proposal_target_generator:
name: ProposalTargetGenerator
batch_size_per_im: 512
bbox_reg_weights: [0.1, 0.1, 0.2, 0.2]
bg_thresh_hi: [0.5, 0.6, 0.7]
bg_thresh_lo: [0.0, 0.0, 0.0]
fg_thresh: [0.5, 0.6, 0.7]
fg_fraction: 0.25
is_cls_agnostic: true
BBoxHead:
bbox_feat:
name: BBoxFeat
roi_extractor:
name: RoIAlign
resolution: 7
sampling_ratio: 2
head_feat:
name: TwoFCHead
in_dim: 256
mlp_dim: 1024
in_feat: 1024
cls_agnostic: true
BBoxPostProcess:
decode:
name: RCNNBox
num_classes: 81
batch_size: 1
var_weight: 3.
nms:
name: MultiClassNMS
keep_top_k: 100
score_threshold: 0.05
nms_threshold: 0.5
architecture: FasterRCNN
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
weights: output/faster_rcnn_r50_1x_coco/model_final.pdparams
load_static_weights: True
# Model Architecture
FasterRCNN:
# model anchor info flow
anchor: Anchor
proposal: Proposal
# model feat info flow
backbone: ResNet
rpn_head: RPNHead
bbox_head: BBoxHead
# post process
bbox_post_process: BBoxPostProcess
ResNet:
# index 0 stands for res2
depth: 50
norm_type: bn
freeze_at: 0
return_idx: [2]
num_stages: 3
RPNHead:
rpn_feat:
name: RPNFeat
feat_in: 1024
feat_out: 1024
anchor_per_position: 15
rpn_channel: 1024
Anchor:
anchor_generator:
name: AnchorGeneratorRPN
anchor_sizes: [32, 64, 128, 256, 512]
aspect_ratios: [0.5, 1.0, 2.0]
stride: [16.0, 16.0]
variance: [1.0, 1.0, 1.0, 1.0]
anchor_target_generator:
name: AnchorTargetGeneratorRPN
batch_size_per_im: 256
fg_fraction: 0.5
negative_overlap: 0.3
positive_overlap: 0.7
straddle_thresh: 0.0
Proposal:
proposal_generator:
name: ProposalGenerator
min_size: 0.0
nms_thresh: 0.7
train_pre_nms_top_n: 12000
train_post_nms_top_n: 2000
infer_pre_nms_top_n: 6000
infer_post_nms_top_n: 1000
proposal_target_generator:
name: ProposalTargetGenerator
batch_size_per_im: 512
bbox_reg_weights: [0.1, 0.1, 0.2, 0.2]
bg_thresh_hi: [0.5,]
bg_thresh_lo: [0.0,]
fg_thresh: [0.5,]
fg_fraction: 0.25
BBoxHead:
bbox_feat:
name: BBoxFeat
roi_extractor:
name: RoIAlign
resolution: 14
sampling_ratio: 0
start_level: 0
end_level: 0
head_feat:
name: Res5Head
feat_in: 1024
feat_out: 512
with_pool: true
in_feat: 2048
BBoxPostProcess:
decode:
name: RCNNBox
num_classes: 81
batch_size: 1
nms:
name: MultiClassNMS
keep_top_k: 100
score_threshold: 0.05
nms_threshold: 0.5
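A note on the anchor settings in the model files above: `anchor_per_position` is simply the number of anchor shapes evaluated at each feature-map position. A minimal plain-Python illustration (the one-scale-per-FPN-level convention, with the scale doubling per level starting at `anchor_start_size`, is stated here as an assumption):

```python
# Anchor counts per position, derived from the configs above.
anchor_sizes = [32, 64, 128, 256, 512]
aspect_ratios = [0.5, 1.0, 2.0]

# C4 Faster R-CNN (this file): one feature map carries every size/ratio pair.
assert len(anchor_sizes) * len(aspect_ratios) == 15  # anchor_per_position: 15

# FPN variants (files above): each pyramid level carries a single scale,
# leaving only the aspect ratios per position.
assert 1 * len(aspect_ratios) == 3                   # anchor_per_position: 3
```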
architecture: FasterRCNN
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
weights: output/faster_rcnn_r50_fpn_1x_coco/model_final.pdparams
load_static_weights: True
# Model Architecture
FasterRCNN:
# model anchor info flow
anchor: Anchor
proposal: Proposal
# model feat info flow
backbone: ResNet
neck: FPN
rpn_head: RPNHead
bbox_head: BBoxHead
# post process
bbox_post_process: BBoxPostProcess
ResNet:
# index 0 stands for res2
depth: 50
norm_type: bn
freeze_at: 0
return_idx: [0,1,2,3]
num_stages: 4
FPN:
in_channels: [256, 512, 1024, 2048]
out_channel: 256
min_level: 0
max_level: 4
spatial_scale: [0.25, 0.125, 0.0625, 0.03125]
RPNHead:
rpn_feat:
name: RPNFeat
feat_in: 256
feat_out: 256
anchor_per_position: 3
rpn_channel: 256
Anchor:
anchor_generator:
name: AnchorGeneratorRPN
aspect_ratios: [0.5, 1.0, 2.0]
anchor_start_size: 32
stride: [4., 4.]
anchor_target_generator:
name: AnchorTargetGeneratorRPN
batch_size_per_im: 256
fg_fraction: 0.5
negative_overlap: 0.3
positive_overlap: 0.7
straddle_thresh: 0.0
Proposal:
proposal_generator:
name: ProposalGenerator
min_size: 0.0
nms_thresh: 0.7
train_pre_nms_top_n: 2000
train_post_nms_top_n: 2000
infer_pre_nms_top_n: 1000
infer_post_nms_top_n: 1000
proposal_target_generator:
name: ProposalTargetGenerator
batch_size_per_im: 512
bbox_reg_weights: [0.1, 0.1, 0.2, 0.2]
bg_thresh_hi: [0.5,]
bg_thresh_lo: [0.0,]
fg_thresh: [0.5,]
fg_fraction: 0.25
BBoxHead:
bbox_feat:
name: BBoxFeat
roi_extractor:
name: RoIAlign
resolution: 7
sampling_ratio: 2
head_feat:
name: TwoFCHead
in_dim: 256
mlp_dim: 1024
in_feat: 1024
BBoxPostProcess:
decode:
name: RCNNBox
num_classes: 81
batch_size: 1
nms:
name: MultiClassNMS
keep_top_k: 100
score_threshold: 0.05
nms_threshold: 0.5
architecture: MaskRCNN
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
weights: output/mask_rcnn_r50_fpn_1x/model_final
load_static_weights: True
# Model Architecture
MaskRCNN:
# model anchor info flow
anchor: Anchor
proposal: Proposal
mask: Mask
# model feat info flow
backbone: ResNet
rpn_head: RPNHead
bbox_head: BBoxHead
mask_head: MaskHead
# post process
bbox_post_process: BBoxPostProcess
mask_post_process: MaskPostProcess
ResNet:
# index 0 stands for res2
depth: 50
norm_type: bn
freeze_at: 0
return_idx: [2]
num_stages: 3
RPNHead:
rpn_feat:
name: RPNFeat
feat_in: 1024
feat_out: 1024
anchor_per_position: 15
Anchor:
anchor_generator:
name: AnchorGeneratorRPN
anchor_sizes: [32, 64, 128, 256, 512]
aspect_ratios: [0.5, 1.0, 2.0]
stride: [16.0, 16.0]
variance: [1.0, 1.0, 1.0, 1.0]
anchor_target_generator:
name: AnchorTargetGeneratorRPN
batch_size_per_im: 256
fg_fraction: 0.5
negative_overlap: 0.3
positive_overlap: 0.7
straddle_thresh: 0.0
Proposal:
proposal_generator:
name: ProposalGenerator
min_size: 0.0
nms_thresh: 0.7
train_pre_nms_top_n: 12000
train_post_nms_top_n: 2000
infer_pre_nms_top_n: 6000
infer_post_nms_top_n: 1000
proposal_target_generator:
name: ProposalTargetGenerator
batch_size_per_im: 512
bbox_reg_weights: [0.1, 0.1, 0.2, 0.2]
bg_thresh_hi: [0.5,]
bg_thresh_lo: [0.0,]
fg_thresh: [0.5,]
fg_fraction: 0.25
BBoxHead:
bbox_feat:
name: BBoxFeat
roi_extractor: RoIAlign
head_feat:
name: Res5Head
feat_in: 1024
feat_out: 512
with_pool: true
in_feat: 2048
BBoxPostProcess:
decode:
name: RCNNBox
num_classes: 81
batch_size: 1
nms:
name: MultiClassNMS
keep_top_k: 100
score_threshold: 0.05
nms_threshold: 0.5
Mask:
mask_target_generator:
name: MaskTargetGenerator
mask_resolution: 14
RoIAlign:
resolution: 14
sampling_ratio: 0
start_level: 0
end_level: 0
MaskHead:
mask_feat:
name: MaskFeat
num_convs: 0
feat_in: 2048
feat_out: 256
mask_roi_extractor: RoIAlign
share_bbox_feat: true
feat_in: 256
MaskPostProcess:
mask_resolution: 14
architecture: MaskRCNN
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
weights: output/mask_rcnn_r50_fpn_1x/model_final
load_static_weights: True
# Model Architecture
MaskRCNN:
# model anchor info flow
anchor: Anchor
proposal: Proposal
mask: Mask
# model feat info flow
backbone: ResNet
neck: FPN
rpn_head: RPNHead
bbox_head: BBoxHead
mask_head: MaskHead
# post process
bbox_post_process: BBoxPostProcess
mask_post_process: MaskPostProcess
ResNet:
# index 0 stands for res2
depth: 50
norm_type: bn
freeze_at: 0
return_idx: [0,1,2,3]
num_stages: 4
FPN:
in_channels: [256, 512, 1024, 2048]
out_channel: 256
min_level: 0
max_level: 4
spatial_scale: [0.25, 0.125, 0.0625, 0.03125]
RPNHead:
rpn_feat:
name: RPNFeat
feat_in: 256
feat_out: 256
anchor_per_position: 3
rpn_channel: 256
Anchor:
anchor_generator:
name: AnchorGeneratorRPN
aspect_ratios: [0.5, 1.0, 2.0]
anchor_start_size: 32
stride: [4., 4.]
anchor_target_generator:
name: AnchorTargetGeneratorRPN
batch_size_per_im: 256
fg_fraction: 0.5
negative_overlap: 0.3
positive_overlap: 0.7
straddle_thresh: 0.0
Proposal:
proposal_generator:
name: ProposalGenerator
min_size: 0.0
nms_thresh: 0.7
train_pre_nms_top_n: 2000
train_post_nms_top_n: 2000
infer_pre_nms_top_n: 1000
infer_post_nms_top_n: 1000
proposal_target_generator:
name: ProposalTargetGenerator
batch_size_per_im: 512
bbox_reg_weights: [0.1, 0.1, 0.2, 0.2]
bg_thresh_hi: [0.5,]
bg_thresh_lo: [0.0,]
fg_thresh: [0.5,]
fg_fraction: 0.25
BBoxHead:
bbox_feat:
name: BBoxFeat
roi_extractor:
name: RoIAlign
resolution: 7
sampling_ratio: 2
head_feat:
name: TwoFCHead
in_dim: 256
mlp_dim: 1024
in_feat: 1024
BBoxPostProcess:
decode:
name: RCNNBox
num_classes: 81
batch_size: 1
nms:
name: MultiClassNMS
keep_top_k: 100
score_threshold: 0.05
nms_threshold: 0.5
Mask:
mask_target_generator:
name: MaskTargetGenerator
mask_resolution: 28
MaskHead:
mask_feat:
name: MaskFeat
num_convs: 4
feat_in: 256
feat_out: 256
mask_roi_extractor:
name: RoIAlign
resolution: 14
sampling_ratio: 2
share_bbox_feat: False
feat_in: 256
MaskPostProcess:
mask_resolution: 28
architecture: SSD
pretrain_weights: https://paddlemodels.bj.bcebos.com/object_detection/dygraph/VGG16_caffe_pretrained.pdparams
weights: output/ssd_vgg16/model_final
# Model Architecture
SSD:
# model feat info flow
backbone: VGG
ssd_head: SSDHead
# post process
post_process: BBoxPostProcess
VGG:
depth: 16
normalizations: [20., -1, -1, -1, -1, -1]
SSDHead:
in_channels: [512, 1024, 512, 256, 256, 256]
anchor_generator: AnchorGeneratorSSD
AnchorGeneratorSSD:
steps: [8, 16, 32, 64, 100, 300]
aspect_ratios: [[2.], [2., 3.], [2., 3.], [2., 3.], [2.], [2.]]
min_ratio: 20
max_ratio: 90
min_sizes: [30.0, 60.0, 111.0, 162.0, 213.0, 264.0]
max_sizes: [60.0, 111.0, 162.0, 213.0, 264.0, 315.0]
offset: 0.5
flip: true
min_max_aspect_ratios_order: true
BBoxPostProcess:
decode:
name: SSDBox
nms:
name: MultiClassNMS
keep_top_k: 200
score_threshold: 0.01
nms_threshold: 0.45
nms_top_k: 400
nms_eta: 1.0
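The `min_sizes`/`max_sizes` listed above are consistent with the standard SSD sizing rule applied to `min_ratio: 20` and `max_ratio: 90` for a 300x300 input. A minimal sketch that reproduces them (`ssd_prior_sizes` is a hypothetical helper for illustration, not part of this repo):

```python
import math

def ssd_prior_sizes(min_ratio=20, max_ratio=90, num_layers=6, img_size=300):
    """Reproduce SSD prior-box sizes from scale ratios (percent of img_size)."""
    step = int(math.floor((max_ratio - min_ratio) / (num_layers - 2)))
    min_sizes, max_sizes = [], []
    for ratio in range(min_ratio, max_ratio + 1, step):
        min_sizes.append(img_size * ratio / 100.0)
        max_sizes.append(img_size * (ratio + step) / 100.0)
    # The first feature map uses half of the minimum ratio.
    return ([img_size * min_ratio / 2 / 100.0] + min_sizes,
            [img_size * min_ratio / 100.0] + max_sizes)

mins, maxs = ssd_prior_sizes()
print(mins)  # [30.0, 60.0, 111.0, 162.0, 213.0, 264.0]
print(maxs)  # [60.0, 111.0, 162.0, 213.0, 264.0, 315.0]
```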
architecture: YOLOv3
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/DarkNet53_pretrained.tar
weights: output/yolov3_darknet/model_final
use_fine_grained_loss: false
load_static_weights: True
norm_type: sync_bn
YOLOv3:
backbone: DarkNet
neck: YOLOv3FPN
yolo_head: YOLOv3Head
post_process: BBoxPostProcess
DarkNet:
depth: 53
return_idx: [2, 3, 4]
YOLOv3FPN:
feat_channels: [1024, 768, 384]
YOLOv3Head:
anchors: [[10, 13], [16, 30], [33, 23],
[30, 61], [62, 45], [59, 119],
[116, 90], [156, 198], [373, 326]]
anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
loss: YOLOv3Loss
YOLOv3Loss:
ignore_thresh: 0.7
downsample: [32, 16, 8]
label_smooth: false
BBoxPostProcess:
decode:
name: YOLOBox
conf_thresh: 0.005
downsample_ratio: 32
clip_bbox: true
nms:
name: MultiClassNMS
keep_top_k: 100
score_threshold: 0.01
nms_threshold: 0.45
nms_top_k: 1000
normalized: false
background_label: -1
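In the head config above, `anchor_masks` assigns a subset of the nine `anchors` to each prediction level, paired with the `downsample` strides 32/16/8, so the coarsest feature map detects with the largest anchors. A plain-Python illustration:

```python
anchors = [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119],
           [116, 90], [156, 198], [373, 326]]
anchor_masks = [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
downsample = [32, 16, 8]  # matches YOLOv3Loss.downsample above

for mask, stride in zip(anchor_masks, downsample):
    print(stride, [anchors[i] for i in mask])
# 32 [[116, 90], [156, 198], [373, 326]]  <- coarsest map, largest anchors
# 16 [[30, 61], [62, 45], [59, 119]]
# 8 [[10, 13], [16, 30], [33, 23]]        <- finest map, smallest anchors
```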
epoch: 12
LearningRate:
base_lr: 0.01
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [8, 11]
- !LinearWarmup
start_factor: 0.3333333333333333
steps: 500
OptimizerBuilder:
optimizer:
momentum: 0.9
type: Momentum
regularizer:
factor: 0.0001
type: L2
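The 1x schedule above combines a 500-step `LinearWarmup` from one third of `base_lr` with `PiecewiseDecay` by `gamma` at epochs 8 and 11. A sketch of the resulting learning-rate curve (`lr_at` is a hypothetical helper for illustration only):

```python
def lr_at(step, iters_per_epoch, base_lr=0.01, gamma=0.1,
          milestones=(8, 11), warmup_steps=500, start_factor=1.0 / 3):
    # LinearWarmup: ramp from start_factor * base_lr up to base_lr.
    if step < warmup_steps:
        alpha = step / warmup_steps
        return base_lr * (start_factor * (1 - alpha) + alpha)
    # PiecewiseDecay: multiply by gamma at each milestone epoch reached.
    epoch = step / iters_per_epoch
    decayed = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** decayed

# With, say, 7000 iterations per epoch: LR is 0.01 until epoch 8,
# 0.001 until epoch 11, then 0.0001 until training ends at epoch 12.
```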
epoch: 240
LearningRate:
base_lr: 0.001
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones:
- 160
- 200
- !LinearWarmup
start_factor: 0.3333333333333333
steps: 500
OptimizerBuilder:
optimizer:
momentum: 0.9
type: Momentum
regularizer:
factor: 0.0005
type: L2
epoch: 270
LearningRate:
base_lr: 0.001
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones:
- 216
- 243
- !LinearWarmup
start_factor: 0.
steps: 4000
OptimizerBuilder:
optimizer:
momentum: 0.9
type: Momentum
regularizer:
factor: 0.0005
type: L2
worker_num: 2
TrainReader:
sample_transforms:
- DecodeOp: { }
- RandomFlipImage: {prob: 0.5}
- NormalizeImage: {is_channel_first: false, is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- ResizeImage: {target_size: 800, max_size: 1333, interp: 1, use_cv2: true}
- Permute: {to_bgr: false, channel_first: true}
batch_transforms:
- PadBatch: {pad_to_stride: 32, use_padded_im_info: false, pad_gt: true}
batch_size: 1
shuffle: true
drop_last: true
EvalReader:
sample_transforms:
- DecodeOp: { }
- NormalizeImageOp: { is_scale: true, mean: [ 0.485,0.456,0.406 ], std: [ 0.229, 0.224,0.225 ] }
- ResizeOp: { interp: 1, target_size: [ 800, 1333 ], keep_ratio: True }
- PermuteOp: { }
batch_transforms:
- PadBatchOp: { pad_to_stride: 32, pad_gt: false }
batch_size: 1
shuffle: false
drop_last: false
drop_empty: false
TestReader:
sample_transforms:
- DecodeOp: { }
- NormalizeImageOp: { is_scale: true, mean: [ 0.485,0.456,0.406 ], std: [ 0.229, 0.224,0.225 ] }
- ResizeOp: { interp: 1, target_size: [ 800, 1333 ], keep_ratio: True }
- PermuteOp: { }
batch_transforms:
- PadBatchOp: { pad_to_stride: 32, pad_gt: false }
batch_size: 1
shuffle: false
drop_last: false
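For reference, the Eval/Test pipeline above, after `ResizeOp` to target size [800, 1333] with the aspect ratio kept, reduces to normalize, transpose, and pad to a multiple of 32. A NumPy sketch, assuming `DecodeOp` yields an RGB uint8 HWC array:

```python
import numpy as np

def preprocess(img):
    """img: HWC uint8 RGB array, already resized as in ResizeOp."""
    img = img.astype(np.float32) / 255.0                 # is_scale: true
    mean = np.array([0.485, 0.456, 0.406], np.float32)
    std = np.array([0.229, 0.224, 0.225], np.float32)
    img = (img - mean) / std                             # NormalizeImageOp
    img = img.transpose(2, 0, 1)                         # PermuteOp: HWC -> CHW
    c, h, w = img.shape
    ph, pw = -(-h // 32) * 32, -(-w // 32) * 32          # PadBatchOp, stride 32
    padded = np.zeros((c, ph, pw), np.float32)
    padded[:, :h, :w] = img
    return padded[None]                                  # add batch dimension
```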
worker_num: 2
TrainReader:
sample_transforms:
- DecodeOp: { }
- RandomFlipImage: {prob: 0.5}
- NormalizeImage: {is_channel_first: false, is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- ResizeImage: {target_size: 800, max_size: 1333, interp: 1, use_cv2: true}
- Permute: {to_bgr: false, channel_first: true}
batch_transforms:
- PadBatch: {pad_to_stride: -1, use_padded_im_info: false, pad_gt: true}
batch_size: 1
shuffle: true
drop_last: true
EvalReader:
sample_transforms:
- DecodeOp: { }
- NormalizeImageOp: { is_scale: true, mean: [ 0.485,0.456,0.406 ], std: [ 0.229, 0.224,0.225 ] }
- ResizeOp: { interp: 1, target_size: [ 800, 1333 ], keep_ratio: True }
- PermuteOp: { }
batch_transforms:
- PadBatchOp: { pad_to_stride: -1, pad_gt: false }
batch_size: 1
shuffle: false
drop_last: false
drop_empty: false
TestReader:
sample_transforms:
- DecodeOp: { }
- NormalizeImageOp: { is_scale: true, mean: [ 0.485,0.456,0.406 ], std: [ 0.229, 0.224,0.225 ] }
- ResizeOp: { interp: 1, target_size: [ 800, 1333 ], keep_ratio: True }
- PermuteOp: { }
batch_transforms:
- PadBatchOp: { pad_to_stride: -1, pad_gt: false }
batch_size: 1
shuffle: false
drop_last: false
worker_num: 2
TrainReader:
sample_transforms:
- DecodeOp: {}
- RandomFlipImage: {prob: 0.5, is_mask_flip: true}
- NormalizeImage: {is_channel_first: false, is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- ResizeImage: {target_size: 800, max_size: 1333, interp: 1, use_cv2: true}
- Permute: {to_bgr: false, channel_first: true}
batch_transforms:
- PadBatch: {pad_to_stride: 32, use_padded_im_info: false, pad_gt: true}
batch_size: 1
shuffle: true
drop_last: true
EvalReader:
sample_transforms:
- DecodeOp: {}
- NormalizeImageOp: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- ResizeOp: {interp: 1, target_size: [800, 1333], keep_ratio: True}
- PermuteOp: {}
batch_transforms:
- PadBatchOp: {pad_to_stride: 32, pad_gt: false}
batch_size: 1
shuffle: false
drop_last: false
drop_empty: false
TestReader:
sample_transforms:
- DecodeOp: {}
- NormalizeImageOp: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- ResizeOp: {interp: 1, target_size: [800, 1333], keep_ratio: True}
- PermuteOp: {}
batch_transforms:
- PadBatchOp: {pad_to_stride: 32, pad_gt: false}
batch_size: 1
shuffle: false
drop_last: false
worker_num: 2
TrainReader:
sample_transforms:
- DecodeOp: {}
- RandomFlipImage: {prob: 0.5, is_mask_flip: true}
- NormalizeImage: {is_channel_first: false, is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- ResizeImage: {target_size: 800, max_size: 1333, interp: 1, use_cv2: true}
- Permute: {to_bgr: false, channel_first: true}
batch_transforms:
- PadBatch: {pad_to_stride: -1, use_padded_im_info: false, pad_gt: true}
batch_size: 1
shuffle: true
drop_last: true
EvalReader:
sample_transforms:
- DecodeOp: {}
- NormalizeImageOp: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- ResizeOp: {interp: 1, target_size: [800, 1333], keep_ratio: True}
- PermuteOp: {}
batch_transforms:
- PadBatchOp: {pad_to_stride: -1, pad_gt: false}
batch_size: 1
shuffle: false
drop_last: false
drop_empty: false
TestReader:
sample_transforms:
- DecodeOp: {}
- NormalizeImageOp: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- ResizeOp: {interp: 1, target_size: [800, 1333], keep_ratio: True}
- PermuteOp: {}
batch_transforms:
- PadBatchOp: {pad_to_stride: -1, pad_gt: false}
batch_size: 1
shuffle: false
drop_last: false
drop_empty: false
worker_num: 2
TrainReader:
inputs_def:
num_max_boxes: 90
sample_transforms:
- DecodeOp: {}
- RandomDistortOp: {brightness: [0.5, 1.125, 0.875], random_apply: False}
- RandomExpandOp: {fill_value: [104., 117., 123.]}
- RandomCropOp: {allow_no_crop: true}
- RandomFlipOp: {}
- NormalizeBoxOp: {}
- ResizeOp: {target_size: [300, 300], keep_ratio: False, interp: 1}
- PadBoxOp: {num_max_boxes: 90}
batch_transforms:
- NormalizeImageOp: {mean: [104., 117., 123.], std: [1., 1., 1.], is_scale: false}
- PermuteOp: {}
batch_size: 8
shuffle: true
drop_last: true
EvalReader:
sample_transforms:
- DecodeOp: {}
- ResizeOp: {target_size: [300, 300], keep_ratio: False, interp: 1}
- NormalizeImageOp: {mean: [104., 117., 123.], std: [1., 1., 1.], is_scale: false}
- PermuteOp: {}
batch_size: 1
drop_empty: false
TestReader:
inputs_def:
image_shape: [3, 300, 300]
sample_transforms:
- DecodeOp: {}
- ResizeOp: {target_size: [300, 300], keep_ratio: False, interp: 1}
- NormalizeImageOp: {mean: [104., 117., 123.], std: [1., 1., 1.], is_scale: false}
- PermuteOp: {}
batch_size: 1
worker_num: 2
TrainReader:
inputs_def:
num_max_boxes: 50
sample_transforms:
- DecodeOp: {}
- MixupOp: {alpha: 1.5, beta: 1.5}
- RandomDistortOp: {}
- RandomExpandOp: {fill_value: [123.675, 116.28, 103.53]}
- RandomCropOp: {}
- RandomFlipOp: {}
batch_transforms:
- BatchRandomResizeOp: {target_size: [320, 352, 384, 416, 448, 480, 512, 544, 576, 608], random_size: True, random_interp: True, keep_ratio: False}
- NormalizeBoxOp: {}
- PadBoxOp: {num_max_boxes: 50}
- BboxXYXY2XYWHOp: {}
- NormalizeImageOp: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
- PermuteOp: {}
- Gt2YoloTargetOp: {anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]], anchors: [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]], downsample_ratios: [32, 16, 8]}
batch_size: 8
shuffle: true
drop_last: true
mixup_epoch: 250
EvalReader:
inputs_def:
num_max_boxes: 50
sample_transforms:
- DecodeOp: {}
- ResizeOp: {target_size: [608, 608], keep_ratio: False, interp: 2}
- NormalizeImageOp: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
- PermuteOp: {}
batch_size: 1
drop_empty: false
TestReader:
inputs_def:
image_shape: [3, 608, 608]
sample_transforms:
- DecodeOp: {}
- ResizeOp: {target_size: [608, 608], keep_ratio: False, interp: 2}
- NormalizeImageOp: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
- PermuteOp: {}
batch_size: 1
use_gpu: true
log_iter: 20
save_dir: output
snapshot_epoch: 1
_BASE_: [
'./_base_/models/cascade_mask_rcnn_r50_fpn.yml',
'./_base_/optimizers/rcnn_1x.yml',
'./_base_/datasets/coco_instance.yml',
'./_base_/readers/mask_fpn_reader.yml',
'./_base_/runtime.yml',
]
_BASE_: [
'./_base_/models/cascade_rcnn_r50_fpn.yml',
'./_base_/optimizers/rcnn_1x.yml',
'./_base_/datasets/coco_detection.yml',
'./_base_/readers/faster_fpn_reader.yml',
'./_base_/runtime.yml',
]
_BASE_: [
'./_base_/models/faster_rcnn_r50.yml',
'./_base_/optimizers/rcnn_1x.yml',
'./_base_/datasets/coco_detection.yml',
'./_base_/readers/faster_reader.yml',
'./_base_/runtime.yml',
]
_BASE_: [
'./_base_/models/faster_rcnn_r50_fpn.yml',
'./_base_/optimizers/rcnn_1x.yml',
'./_base_/datasets/coco_detection.yml',
'./_base_/readers/faster_fpn_reader.yml',
'./_base_/runtime.yml',
]
_BASE_: [
'./_base_/models/mask_rcnn_r50.yml',
'./_base_/optimizers/rcnn_1x.yml',
'./_base_/datasets/coco_instance.yml',
'./_base_/readers/mask_reader.yml',
'./_base_/runtime.yml',
]
_BASE_: [
'./_base_/models/mask_rcnn_r50_fpn.yml',
'./_base_/optimizers/rcnn_1x.yml',
'./_base_/datasets/coco_instance.yml',
'./_base_/readers/mask_fpn_reader.yml',
'./_base_/runtime.yml',
]
_BASE_: [
'./_base_/models/ssd_vgg16_300.yml',
'./_base_/optimizers/ssd_240e.yml',
'./_base_/datasets/voc.yml',
'./_base_/readers/ssd_reader.yml',
'./_base_/runtime.yml',
]
_BASE_: [
'./_base_/models/yolov3_darknet53.yml',
'./_base_/optimizers/yolov3_270e.yml',
'./_base_/datasets/coco_detection.yml',
'./_base_/readers/yolov3_reader.yml',
'./_base_/runtime.yml',
]
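Each top-level config file above is only a `_BASE_` list: the loader merges the listed fragments (model, optimizer, dataset, reader, runtime) into one flat namespace, with later fragments and top-level keys overriding earlier ones. A minimal sketch of inspecting a composed config (assuming the repo's `ppdet.core.workspace.load_config` is importable and merges as described):

```python
from ppdet.core.workspace import load_config

cfg = load_config('configs/faster_rcnn_r50_fpn_1x_coco.yml')
print(cfg['architecture'])  # FasterRCNN, from the model fragment
print(cfg['epoch'])         # 12, from the optimizer fragment
print(cfg['worker_num'])    # 2, from the reader fragment
```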
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import sys
import os.path as osp
import logging
# add python path of PaddleDetection to sys.path
parent_path = osp.abspath(osp.join(__file__, *(['..'] * 3)))
if parent_path not in sys.path:
sys.path.append(parent_path)
from ppdet.utils.download import download_dataset
logging.basicConfig(level=logging.INFO)
download_path = osp.split(osp.realpath(sys.argv[0]))[0]
download_dataset(download_path, 'coco')
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import sys
import os.path as osp
import logging
# add python path of PaddleDetection to sys.path
parent_path = osp.abspath(osp.join(__file__, *(['..'] * 3)))
if parent_path not in sys.path:
sys.path.append(parent_path)
from ppdet.utils.download import create_voc_list
logging.basicConfig(level=logging.INFO)
voc_path = osp.split(osp.realpath(sys.argv[0]))[0]
create_voc_list(voc_path)
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import sys
import os.path as osp
import logging
# add python path of PaddleDetection to sys.path
parent_path = osp.abspath(osp.join(__file__, *(['..'] * 3)))
if parent_path not in sys.path:
sys.path.append(parent_path)
from ppdet.utils.download import download_dataset
logging.basicConfig(level=logging.INFO)
download_path = osp.split(osp.realpath(sys.argv[0]))[0]
download_dataset(download_path, 'voc')
aeroplane
bicycle
bird
boat
bottle
bus
car
cat
chair
cow
diningtable
dog
horse
motorbike
person
pottedplant
sheep
sofa
train
tvmonitor
# Exporting a Dynamic-Graph Model to Static Graph
After training a model that meets your requirements, you need `tools/export_model.py` to convert the dynamic-graph model into a static-graph model and export it before plugging it into the C++ inference library or a Serving service. The export also writes the configuration file used at inference time, named `infer_cfg.yml`, to the same path as the saved model.
**Notes:**
- **Inputs:** the exported model takes a unified set of inputs:
| Input name | Shape | Meaning |
| :---------: | ----------- | ---------- |
| image | [None, 3, H, W] | Image fed to the network; None is the batch dimension. If the input size is variable, H and W are None as well |
| im_shape | [None, 2] | Image size after resizing, as H, W; None is the batch dimension |
| scale_factor | [None, 2] | Ratio of the resized image size to the original image size, as scale_y, scale_x |
See the TestReader section of the configuration file for the exact preprocessing.
- **Outputs:** the exported model produces a unified set of outputs (a usage sketch follows this list):
  - bbox: the NMS output, with shape [N, 6], where N is the number of predicted boxes and the 6 values are [class_id, score, x1, y1, x2, y2].
  - bbox_num: the number of predicted boxes per image. For example, with batch_size 2 the output is [N1, N2]: the first image has N1 boxes, the second has N2, and their sum equals N, the first dimension of the NMS output.
  - mask: if the network contains a mask branch, its output is exported as well.
- Export does not support models whose structure contains numpy operations.
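A minimal sketch of driving the exported model with Paddle's Python inference API (`paddle.inference`, available in Paddle 2.x); the model path, input shapes, and output ordering below are illustrative assumptions, and real preprocessing must follow TestReader:

```python
import numpy as np
from paddle.inference import Config, create_predictor

# Hypothetical export location; see the example below for how it is produced.
model_dir = 'inference_model/faster_rcnn_r50_1x_coco'
config = Config(model_dir + '/model.pdmodel', model_dir + '/model.pdiparams')
predictor = create_predictor(config)

# The three unified inputs described above (dummy data for illustration).
feeds = {
    'image': np.zeros((1, 3, 800, 1333), dtype=np.float32),    # preprocessed image
    'im_shape': np.array([[800., 1333.]], dtype=np.float32),   # H, W after resize
    'scale_factor': np.array([[1.0, 1.0]], dtype=np.float32),  # scale_y, scale_x
}
for name in predictor.get_input_names():
    predictor.get_input_handle(name).copy_from_cpu(feeds[name])
predictor.run()

# Outputs as documented above (ordering assumed): bbox is [N, 6] rows of
# [class_id, score, x1, y1, x2, y2]; bbox_num gives boxes per image.
names = predictor.get_output_names()
bbox = predictor.get_output_handle(names[0]).copy_to_cpu()
bbox_num = predictor.get_output_handle(names[1]).copy_to_cpu()
```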
## Command-Line Flags
| FLAG | Purpose | Default | Notes |
|:--------------:|:--------------:|:------------:|:-----------------------------------------:|
| -c | Path to the configuration file | None | |
| --output_dir | Directory to save the exported model | `./output_inference` | The model is saved under `<output_dir>/<config file name>/` |
## Example
Export a trained model with the following script:
```bash
# Export a Faster R-CNN model
python tools/export_model.py -c configs/faster_rcnn_r50_1x_coco.yml \
--output_dir=./inference_model \
-o weights=output/faster_rcnn_r50_1x_coco/model_final
```
The inference model is exported to the `inference_model/faster_rcnn_r50_1x_coco` directory, containing `infer_cfg.yml`, `model.pdiparams`, `model.pdiparams.info`, and `model.pdmodel`.
## Setting the Input Size of the Exported Model
When running inference with Fluid-TensorRT, TensorRT versions <= 5.1 only support fixed-size inputs, so the image size of the saved model's `data` layer must match the actual input image size; the Fluid C++ inference engine has no such restriction. Set `image_shape` in TestReader to change the input image size of the saved model. For example:
```bash
# Export a Faster R-CNN model with a 3x640x640 input
python tools/export_model.py -c configs/faster_rcnn_r50_1x_coco.yml \
--output_dir=./inference_model \
-o weights=https://paddlemodels.bj.bcebos.com/object_detection/dygraph/faster_rcnn_r50_1x_coco.pdparams \
TestReader.inputs_def.image_shape=[3,640,640]
# Export a YOLOv3 model with a 3x320x320 input
python tools/export_model.py -c configs/yolov3_darknet53_270e_coco.yml \
--output_dir=./inference_model \
-o weights=https://paddlemodels.bj.bcebos.com/object_detection/dygraph/yolov3_darknet53_270e_coco.pdparams \
TestReader.inputs_def.image_shape=[3,320,320]
```
# PaddleDetection Inference Deployment
`PaddleDetection` currently supports:
- Deployment with `Python` and `C++`, running on `Windows` and `Linux`
- [Online serving deployment](./serving/README.md)
- [Mobile deployment](https://github.com/PaddlePaddle/Paddle-Lite-Demo)
## Model Export
After training a model that meets your requirements, you need to export it with `tools/export_model.py` before plugging it into the C++ server-side inference library or the mobile inference library.
- [Export tutorial](https://github.com/PaddlePaddle/PaddleDetection/blob/master/docs/advanced_tutorials/deploy/EXPORT_MODEL.md)
After export, the directory layout is as follows (using `yolov3_darknet` as an example):
```
yolov3_darknet # model directory
├── infer_cfg.yml # model configuration
├── __model__ # model file
└── __params__ # parameter file
```
At inference time, the path of this directory is passed to the program as an input argument.
## Inference Deployment
- [1. Python inference (Linux and Windows)](https://github.com/PaddlePaddle/PaddleDetection/blob/master/deploy/python)
- [2. C++ inference (Linux and Windows)](https://github.com/PaddlePaddle/PaddleDetection/blob/master/deploy/cpp)
- [3. Online serving deployment](./serving/README.md)
- [4. Mobile deployment](https://github.com/PaddlePaddle/Paddle-Lite-Demo)
- [5. Jetson deployment](./cpp/docs/Jetson_build.md)
cmake_minimum_required(VERSION 3.0)
project(PaddleObjectDetector CXX C)
option(WITH_MKL "Compile demo with MKL/OpenBLAS support, default use MKL." ON)
option(WITH_GPU "Compile demo with GPU/CPU, default use CPU." ON)
option(WITH_STATIC_LIB "Compile demo with static/shared library, default use static." ON)
option(WITH_TENSORRT "Compile demo with TensorRT." OFF)
SET(PADDLE_DIR "" CACHE PATH "Location of libraries")
SET(OPENCV_DIR "" CACHE PATH "Location of libraries")
SET(CUDA_LIB "" CACHE PATH "Location of libraries")
SET(CUDNN_LIB "" CACHE PATH "Location of libraries")
SET(TENSORRT_INC_DIR "" CACHE PATH "Compile demo with TensorRT")
SET(TENSORRT_LIB_DIR "" CACHE PATH "Compile demo with TensorRT")
include(cmake/yaml-cpp.cmake)
include_directories("${CMAKE_SOURCE_DIR}/")
include_directories("${CMAKE_CURRENT_BINARY_DIR}/ext/yaml-cpp/src/ext-yaml-cpp/include")
link_directories("${CMAKE_CURRENT_BINARY_DIR}/ext/yaml-cpp/lib")
macro(safe_set_static_flag)
foreach(flag_var
CMAKE_CXX_FLAGS CMAKE_CXX_FLAGS_DEBUG CMAKE_CXX_FLAGS_RELEASE
CMAKE_CXX_FLAGS_MINSIZEREL CMAKE_CXX_FLAGS_RELWITHDEBINFO)
if(${flag_var} MATCHES "/MD")
string(REGEX REPLACE "/MD" "/MT" ${flag_var} "${${flag_var}}")
endif(${flag_var} MATCHES "/MD")
endforeach(flag_var)
endmacro()
if (WITH_MKL)
ADD_DEFINITIONS(-DUSE_MKL)
endif()
if (NOT DEFINED PADDLE_DIR OR ${PADDLE_DIR} STREQUAL "")
message(FATAL_ERROR "please set PADDLE_DIR with -DPADDLE_DIR=/path/paddle_influence_dir")
endif()
if (NOT DEFINED OPENCV_DIR OR ${OPENCV_DIR} STREQUAL "")
message(FATAL_ERROR "please set OPENCV_DIR with -DOPENCV_DIR=/path/opencv")
endif()
include_directories("${CMAKE_SOURCE_DIR}/")
include_directories("${PADDLE_DIR}/")
include_directories("${PADDLE_DIR}/third_party/install/protobuf/include")
include_directories("${PADDLE_DIR}/third_party/install/glog/include")
include_directories("${PADDLE_DIR}/third_party/install/gflags/include")
include_directories("${PADDLE_DIR}/third_party/install/xxhash/include")
if (EXISTS "${PADDLE_DIR}/third_party/install/snappy/include")
include_directories("${PADDLE_DIR}/third_party/install/snappy/include")
endif()
if(EXISTS "${PADDLE_DIR}/third_party/install/snappystream/include")
include_directories("${PADDLE_DIR}/third_party/install/snappystream/include")
endif()
include_directories("${PADDLE_DIR}/third_party/boost")
include_directories("${PADDLE_DIR}/third_party/eigen3")
if (EXISTS "${PADDLE_DIR}/third_party/install/snappy/lib")
link_directories("${PADDLE_DIR}/third_party/install/snappy/lib")
endif()
if(EXISTS "${PADDLE_DIR}/third_party/install/snappystream/lib")
link_directories("${PADDLE_DIR}/third_party/install/snappystream/lib")
endif()
link_directories("${PADDLE_DIR}/third_party/install/protobuf/lib")
link_directories("${PADDLE_DIR}/third_party/install/glog/lib")
link_directories("${PADDLE_DIR}/third_party/install/gflags/lib")
link_directories("${PADDLE_DIR}/third_party/install/xxhash/lib")
link_directories("${PADDLE_DIR}/paddle/lib/")
link_directories("${CMAKE_CURRENT_BINARY_DIR}")
if (WIN32)
include_directories("${PADDLE_DIR}/paddle/fluid/inference")
include_directories("${PADDLE_DIR}/paddle/include")
link_directories("${PADDLE_DIR}/paddle/fluid/inference")
find_package(OpenCV REQUIRED PATHS ${OPENCV_DIR}/build/ NO_DEFAULT_PATH)
else ()
find_package(OpenCV REQUIRED PATHS ${OPENCV_DIR}/share/OpenCV NO_DEFAULT_PATH)
include_directories("${PADDLE_DIR}/paddle/include")
link_directories("${PADDLE_DIR}/paddle/lib")
endif ()
include_directories(${OpenCV_INCLUDE_DIRS})
if (WIN32)
add_definitions("/DGOOGLE_GLOG_DLL_DECL=")
set(CMAKE_C_FLAGS_DEBUG "${CMAKE_C_FLAGS_DEBUG} /bigobj /MTd")
set(CMAKE_C_FLAGS_RELEASE "${CMAKE_C_FLAGS_RELEASE} /bigobj /MT")
set(CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG} /bigobj /MTd")
set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} /bigobj /MT")
if (WITH_STATIC_LIB)
safe_set_static_flag()
add_definitions(-DSTATIC_LIB)
endif()
else()
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -g -O2 -fopenmp -std=c++11")
set(CMAKE_STATIC_LIBRARY_PREFIX "")
endif()
# TODO let users define cuda lib path
if (WITH_GPU)
if (NOT DEFINED CUDA_LIB OR ${CUDA_LIB} STREQUAL "")
message(FATAL_ERROR "please set CUDA_LIB with -DCUDA_LIB=/path/cuda-8.0/lib64")
endif()
if (NOT WIN32)
if (NOT DEFINED CUDNN_LIB)
message(FATAL_ERROR "please set CUDNN_LIB with -DCUDNN_LIB=/path/cudnn_v7.4/cuda/lib64")
endif()
endif(NOT WIN32)
endif()
if (NOT WIN32)
if (WITH_TENSORRT AND WITH_GPU)
include_directories("${TENSORRT_INC_DIR}/")
link_directories("${TENSORRT_LIB_DIR}/")
endif()
endif(NOT WIN32)
if (NOT WIN32)
set(NGRAPH_PATH "${PADDLE_DIR}/third_party/install/ngraph")
if(EXISTS ${NGRAPH_PATH})
include(GNUInstallDirs)
include_directories("${NGRAPH_PATH}/include")
link_directories("${NGRAPH_PATH}/${CMAKE_INSTALL_LIBDIR}")
set(NGRAPH_LIB ${NGRAPH_PATH}/${CMAKE_INSTALL_LIBDIR}/libngraph${CMAKE_SHARED_LIBRARY_SUFFIX})
endif()
endif()
if(WITH_MKL)
include_directories("${PADDLE_DIR}/third_party/install/mklml/include")
if (WIN32)
set(MATH_LIB ${PADDLE_DIR}/third_party/install/mklml/lib/mklml.lib
${PADDLE_DIR}/third_party/install/mklml/lib/libiomp5md.lib)
else ()
set(MATH_LIB ${PADDLE_DIR}/third_party/install/mklml/lib/libmklml_intel${CMAKE_SHARED_LIBRARY_SUFFIX}
${PADDLE_DIR}/third_party/install/mklml/lib/libiomp5${CMAKE_SHARED_LIBRARY_SUFFIX})
execute_process(COMMAND cp -r ${PADDLE_DIR}/third_party/install/mklml/lib/libmklml_intel${CMAKE_SHARED_LIBRARY_SUFFIX} /usr/lib)
endif ()
set(MKLDNN_PATH "${PADDLE_DIR}/third_party/install/mkldnn")
if(EXISTS ${MKLDNN_PATH})
include_directories("${MKLDNN_PATH}/include")
if (WIN32)
set(MKLDNN_LIB ${MKLDNN_PATH}/lib/mkldnn.lib)
else ()
set(MKLDNN_LIB ${MKLDNN_PATH}/lib/libmkldnn.so.0)
endif ()
endif()
else()
set(MATH_LIB ${PADDLE_DIR}/third_party/install/openblas/lib/libopenblas${CMAKE_STATIC_LIBRARY_SUFFIX})
endif()
if (WIN32)
if(EXISTS "${PADDLE_DIR}/paddle/fluid/inference/libpaddle_fluid${CMAKE_STATIC_LIBRARY_SUFFIX}")
set(DEPS
${PADDLE_DIR}/paddle/fluid/inference/libpaddle_fluid${CMAKE_STATIC_LIBRARY_SUFFIX})
else()
set(DEPS
${PADDLE_DIR}/paddle/lib/libpaddle_fluid${CMAKE_STATIC_LIBRARY_SUFFIX})
endif()
endif()
if(WITH_STATIC_LIB)
set(DEPS
${PADDLE_DIR}/paddle/lib/libpaddle_fluid${CMAKE_STATIC_LIBRARY_SUFFIX})
else()
set(DEPS
${PADDLE_DIR}/paddle/lib/libpaddle_fluid${CMAKE_SHARED_LIBRARY_SUFFIX})
endif()
if (NOT WIN32)
set(DEPS ${DEPS}
${MATH_LIB} ${MKLDNN_LIB}
glog gflags protobuf z xxhash yaml-cpp
)
if(EXISTS "${PADDLE_DIR}/third_party/install/snappystream/lib")
set(DEPS ${DEPS} snappystream)
endif()
if (EXISTS "${PADDLE_DIR}/third_party/install/snappy/lib")
set(DEPS ${DEPS} snappy)
endif()
else()
set(DEPS ${DEPS}
${MATH_LIB} ${MKLDNN_LIB}
glog gflags_static libprotobuf xxhash libyaml-cppmt)
set(DEPS ${DEPS} libcmt shlwapi)
if (EXISTS "${PADDLE_DIR}/third_party/install/snappy/lib")
set(DEPS ${DEPS} snappy)
endif()
if(EXISTS "${PADDLE_DIR}/third_party/install/snappystream/lib")
set(DEPS ${DEPS} snappystream)
endif()
endif(NOT WIN32)
if(WITH_GPU)
if(NOT WIN32)
if (WITH_TENSORRT)
set(DEPS ${DEPS} ${TENSORRT_LIB_DIR}/libnvinfer${CMAKE_SHARED_LIBRARY_SUFFIX})
set(DEPS ${DEPS} ${TENSORRT_LIB_DIR}/libnvinfer_plugin${CMAKE_SHARED_LIBRARY_SUFFIX})
endif()
set(DEPS ${DEPS} ${CUDA_LIB}/libcudart${CMAKE_SHARED_LIBRARY_SUFFIX})
set(DEPS ${DEPS} ${CUDNN_LIB}/libcudnn${CMAKE_SHARED_LIBRARY_SUFFIX})
else()
set(DEPS ${DEPS} ${CUDA_LIB}/cudart${CMAKE_STATIC_LIBRARY_SUFFIX} )
set(DEPS ${DEPS} ${CUDA_LIB}/cublas${CMAKE_STATIC_LIBRARY_SUFFIX} )
set(DEPS ${DEPS} ${CUDNN_LIB}/cudnn${CMAKE_STATIC_LIBRARY_SUFFIX})
endif()
endif()
if (NOT WIN32)
set(EXTERNAL_LIB "-ldl -lrt -lgomp -lz -lm -lpthread")
set(DEPS ${DEPS} ${EXTERNAL_LIB})
endif()
set(DEPS ${DEPS} ${OpenCV_LIBS})
add_executable(main src/main.cc src/preprocess_op.cc src/object_detector.cc)
ADD_DEPENDENCIES(main ext-yaml-cpp)
target_link_libraries(main ${DEPS})
if (WIN32 AND WITH_MKL)
add_custom_command(TARGET main POST_BUILD
COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_DIR}/third_party/install/mklml/lib/mklml.dll ./mklml.dll
COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_DIR}/third_party/install/mklml/lib/libiomp5md.dll ./libiomp5md.dll
COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_DIR}/third_party/install/mkldnn/lib/mkldnn.dll ./mkldnn.dll
COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_DIR}/third_party/install/mklml/lib/mklml.dll ./release/mklml.dll
COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_DIR}/third_party/install/mklml/lib/libiomp5md.dll ./release/libiomp5md.dll
COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_DIR}/third_party/install/mkldnn/lib/mkldnn.dll ./release/mkldnn.dll
)
endif()
# C++ Inference Deployment
## Contents of This Tutorial
[1. Overview](#1-overview)
[2. Main Directories and Files](#2-main-directories-and-files)
[3. Build and Deploy](#3-build-and-deploy)
## 1. Overview
This directory provides a cross-platform `C++` deployment solution. After exporting a model trained with `PaddleDetection`, you can run it quickly on top of this project, or integrate the code into your own application.
The main design goals are the following four points:
- Cross-platform: build, extend, and deploy on both `Windows` and `Linux`
- Extensibility: add your own data preprocessing and other logic for new models
- High performance: beyond the performance `PaddlePaddle` itself provides, key steps are optimized for the characteristics of image detection
- Support for different detection architectures, including `Yolov3`/`Faster_RCNN`/`SSD`
## 2. Main Directories and Files
```bash
deploy/cpp
|
├── src
│   ├── main.cc # integration example, program entry point
│   ├── object_detector.cc # implementation of the model loading and inference class
│   └── preprocess_op.cc # implementation of the preprocessing logic
|
├── include
│   ├── config_parser.h # parser for the exported model's YAML config file
│   ├── object_detector.h # model loading and inference class
│   └── preprocess_op.h # preprocessing classes
|
├── docs
│   ├── linux_build.md # Linux build guide
│   └── windows_vs2019_build.md # Windows VS2019 build guide
├── build.sh # build script
├── CMakeList.txt # CMake entry file
|
├── CMakeSettings.json # Visual Studio 2019 CMake project build settings
└── cmake # CMake files of external dependencies (currently only yaml-cpp)
```
## 3. Build and Deploy
### 3.1 Export the Model
Make sure you have exported your model with `PaddleDetection`'s [export_model.py](https://github.com/PaddlePaddle/PaddleDetection/blob/dygraph/tools/export_model.py) and saved it to a suitable location. See the [model export tutorial](https://github.com/PaddlePaddle/PaddleDetection/tree/dygraph/deploy/EXPORT_MODEL.md) for details.
After export, the directory layout is as follows (using `yolov3_darknet` as an example):
```
yolov3_darknet # model directory
├── infer_cfg.yml # model configuration
├── model.pdmodel # model file
├── model.pdiparams.info # model metadata
└── model.pdiparams # parameter file
```
At inference time, the path of this directory is passed to the program as an input argument.
### 3.2 Build
Building and running are supported only on the `Windows` and `Linux` platforms.
- [Linux build guide](https://github.com/PaddlePaddle/PaddleDetection/blob/master/deploy/cpp/docs/linux_build.md)
- [Windows build guide (Visual Studio 2019)](https://github.com/PaddlePaddle/PaddleDetection/blob/master/deploy/cpp/docs/windows_vs2019_build.md)
find_package(Git REQUIRED)
include(ExternalProject)
message("${CMAKE_BUILD_TYPE}")
ExternalProject_Add(
ext-yaml-cpp
URL https://bj.bcebos.com/paddlex/deploy/deps/yaml-cpp.zip
URL_MD5 9542d6de397d1fbd649ed468cb5850e6
CMAKE_ARGS
-DYAML_CPP_BUILD_TESTS=OFF
-DYAML_CPP_BUILD_TOOLS=OFF
-DYAML_CPP_INSTALL=OFF
-DYAML_CPP_BUILD_CONTRIB=OFF
-DMSVC_SHARED_RT=OFF
-DBUILD_SHARED_LIBS=OFF
-DCMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE}
-DCMAKE_CXX_FLAGS=${CMAKE_CXX_FLAGS}
-DCMAKE_CXX_FLAGS_DEBUG=${CMAKE_CXX_FLAGS_DEBUG}
-DCMAKE_CXX_FLAGS_RELEASE=${CMAKE_CXX_FLAGS_RELEASE}
-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=${CMAKE_BINARY_DIR}/ext/yaml-cpp/lib
-DCMAKE_ARCHIVE_OUTPUT_DIRECTORY=${CMAKE_BINARY_DIR}/ext/yaml-cpp/lib
PREFIX "${CMAKE_BINARY_DIR}/ext/yaml-cpp"
# Disable install step
INSTALL_COMMAND ""
LOG_DOWNLOAD ON
LOG_BUILD 1
)
# Linux Build Guide
## Overview
This document has been tested on `Linux` with `GCC 4.8.5` and `GCC 4.9.4`. To build with a newer G++ you must recompile the Paddle inference library from source; see [Building the Paddle inference library from source](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/advanced_guide/inference_deployment/inference/build_and_install_lib_cn.html). The prebuilt OpenCV library used by this document was compiled with GCC 4.8 on Ubuntu 16.04; to build on other systems you need to compile OpenCV yourself.
## Prerequisites
* G++ 4.8.2 ~ 4.9.4
* CUDA 9.0 / CUDA 10.0, cuDNN 7+ (only needed with the GPU version of the inference library)
* CMake 3.0+
Make sure the software above is installed. **All examples below use `/root/projects/` as the working directory.**
### Step 1: Get the Code
`git clone https://github.com/PaddlePaddle/PaddleDetection.git`
**Note:** the `C++` inference code lives in the `/root/projects/PaddleDetection/deploy/cpp` directory, which does not depend on any other directory of `PaddleDetection`.
### Step 2: Download the PaddlePaddle C++ Inference Library (fluid_inference)
Prebuilt PaddlePaddle C++ inference libraries are provided for different `CPU` and `CUDA` versions; download the one matching your environment: [C++ inference library downloads](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/advanced_guide/inference_deployment/inference/build_and_install_lib_cn.html)
After downloading and extracting, the `/root/projects/fluid_inference` directory contains:
```
fluid_inference
├── paddle # Paddle core libraries and headers
|
├── third_party # third-party dependencies and headers
|
└── version.txt # version and build information
```
**Note:** except for `nv-jetson-cuda10-cudnn7.5-trt5`, all prebuilt packages are compiled with `GCC 4.8.5`. Newer `GCC` versions may hit `ABI` compatibility issues; consider downgrading or [building the inference library yourself](https://www.paddlepaddle.org.cn/documentation/docs/zh/advanced_guide/inference_deployment/inference/build_and_install_lib_cn.html).
### Step 3: Build
The `cmake` command is wrapped in `scripts/build.sh`. Adjust the main parameters for your environment; they are described below:
```
# Whether to use GPU (i.e. whether to use CUDA)
WITH_GPU=OFF
# Use MKL or OpenBLAS
WITH_MKL=ON
# Whether to integrate TensorRT (only effective with WITH_GPU=ON)
WITH_TENSORRT=OFF
# TensorRT include path
TENSORRT_INC_DIR=/path/to/TensorRT/include
# TensorRT lib path
TENSORRT_LIB_DIR=/path/to/TensorRT/lib
# Path to the Paddle inference library
PADDLE_DIR=/path/to/fluid_inference
# Whether to link the Paddle inference library statically
# With TensorRT, the Paddle inference library is usually a shared library
WITH_STATIC_LIB=OFF
# CUDA lib path
CUDA_LIB=/path/to/cuda/lib
# cuDNN lib path
CUDNN_LIB=/path/to/cudnn/lib
# Check that all the paths above are correct
# No changes are needed below
cmake .. \
-DWITH_GPU=${WITH_GPU} \
-DWITH_MKL=${WITH_MKL} \
-DWITH_TENSORRT=${WITH_TENSORRT} \
-DTENSORRT_LIB_DIR=${TENSORRT_LIB_DIR} \
-DTENSORRT_INC_DIR=${TENSORRT_INC_DIR} \
-DPADDLE_DIR=${PADDLE_DIR} \
-DWITH_STATIC_LIB=${WITH_STATIC_LIB} \
-DCUDA_LIB=${CUDA_LIB} \
-DCUDNN_LIB=${CUDNN_LIB} \
-DOPENCV_DIR=${OPENCV_DIR}
make
```
After editing the script's main parameters, run the `build` script:
```shell
sh ./scripts/build.sh
```
**Note**: OpenCV depends on OpenBLAS; Ubuntu users should check whether `libopenblas.so` is present and, if not, install it with `apt-get install libopenblas-dev`.
### Step5: Inference and visualization
After a successful build, the inference entry point is `build/main`. Its main command-line arguments are:
| Argument | Description |
| ---- | ---- |
| --model_dir | Path of the exported inference model |
| --image_path | Path of the image file to predict |
| --video_path | Path of the video file to predict |
| --camera_id | ID of the camera to predict from; default -1 (do not use a camera) |
| --use_gpu | Whether to run inference on the GPU; 0 or 1 (default 0) |
| --gpu_id | GPU device id to run inference on (default 0) |
| --run_mode | When using the GPU, defaults to fluid; one of fluid/trt_fp32/trt_fp16 |
| --run_benchmark | Whether to run inference repeatedly for benchmarking |
| --output_dir | Folder for output images; default output |
**Note**: if both `video_path` and `image_path` are set, the program only predicts `video_path`.
`样例一`
```shell
#不使用`GPU`测试图片 `/root/projects/images/test.jpeg`
./build/main --model_dir=/root/projects/models/yolov3_darknet --image_path=/root/projects/images/test.jpeg
```
图片文件`可视化预测结果`会保存在当前目录下`output.jpg`文件中。
`Example 2`:
```shell
# predict the video `/root/projects/videos/test.mp4` with the GPU
./build/main --model_dir=/root/projects/models/yolov3_darknet --video_path=/root/projects/videos/test.mp4 --use_gpu=1
```
Only `.mp4` videos are currently supported; the `visualized prediction result` is saved as `output.mp4` in the current directory.
# Visual Studio 2019 Community CMake build guide
On Windows we tested with `Visual Studio 2019 Community`. Microsoft has supported managing `CMake` cross-platform projects directly since `Visual Studio 2017`, but stable and complete support only arrived in `2019`, so if you want to manage the build with CMake we recommend building under `Visual Studio 2019`.
## Prerequisites
* Visual Studio 2019 (choose according to the VS version the Paddle inference library was built with; see [binary compatibility across Visual Studio versions](https://docs.microsoft.com/zh-cn/cpp/porting/binary-compat-2015-2017?view=vs-2019))
* CUDA 9.0 / CUDA 10.0, cuDNN 7+ (only needed for the GPU version of the inference library)
* CMake 3.0+ [CMake download](https://cmake.org/download/)
Make sure the software above is installed; we use the Community edition of `VS2019`.
**All examples below assume the working directory is `D:\projects`.**
### Step1: Download the code
Download the source:
```shell
git clone https://github.com/PaddlePaddle/PaddleDetection.git
```
**Note**: the `C++` inference code lives in the `PaddleDetection/deploy/cpp` directory, which does not depend on any other directory in `PaddleDetection`.
### Step2: Download the PaddlePaddle C++ inference library fluid_inference
The PaddlePaddle C++ inference library ships prebuilt packages for different `CPU` and `CUDA` versions; pick the one that matches your setup: [C++ inference library download list](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/advanced_guide/inference_deployment/inference/windows_cpp_inference.html)
After extracting, the `D:\projects\fluid_inference` directory contains:
```
fluid_inference
├── paddle # paddle core libraries and headers
|
├── third_party # third-party dependencies and headers
|
└── version.txt # version and build information
```
### Step3: Install and configure OpenCV
1. Download OpenCV 3.4.6 for Windows from the OpenCV site: [download link](https://sourceforge.net/projects/opencvlibrary/files/3.4.6/opencv-3.4.6-vc14_vc15.exe/download)
2. Run the downloaded executable and extract OpenCV to a directory of your choice, e.g. `D:\projects\opencv`
3. Configure the environment variable as follows (if you use absolute paths everywhere, you can skip this):
- My Computer -> Properties -> Advanced system settings -> Environment Variables
- Find Path in the system variables (create it if missing) and double-click to edit
- Add the opencv path and save, e.g. `D:\projects\opencv\build\x64\vc14\bin`
### Step4: Build
#### Building through the CMake GUI workflow
1. Open Visual Studio 2019 Community and click `Continue without code`
![step2](https://paddleseg.bj.bcebos.com/inference/vs2019_step1.png)
2. Click `File` -> `Open` -> `CMake`
![step2.1](https://paddleseg.bj.bcebos.com/inference/vs2019_step2.png)
Select the path of the project code and open `CMakeList.txt`
![step2.2](https://paddleseg.bj.bcebos.com/inference/vs2019_step3.png)
3. Click `Project` -> `CMake settings for cpp_inference_demo`
![step3](https://paddleseg.bj.bcebos.com/inference/vs2019_step4.png)
4. Click `Browse` and set the build options for the `CUDA`, `CUDNN_LIB`, `OpenCV`, and `Paddle inference library` paths
The meanings of the four build parameters are as follows (* marks parameters needed only with the **GPU version** of the inference library; keep CUDA library versions aligned, **use CUDA 9.0 or 10.0, not 9.2, 10.1, etc.**):
| Parameter | Meaning |
| ---- | ---- |
| *CUDA_LIB | CUDA library path |
| *CUDNN_LIB | CUDNN library path |
| OPENCV_DIR | OpenCV installation path |
| PADDLE_DIR | Paddle inference library path |
**Note:** 1. For the `CPU` version of the inference library, untick `WITH_GPU`. 2. For the `openblas` version, untick `WITH_MKL`.
![step4](https://paddleseg.bj.bcebos.com/inference/vs2019_step5.png)
**Once done**, click `Save and generate CMake cache to load variables` shown in the figure above
5. Click `Build` -> `Build All`
![step6](https://paddleseg.bj.bcebos.com/inference/vs2019_step6.png)
#### Building from the command line
1. Enter the `cpp` folder
```
cd D:\projects\PaddleDetection\deploy\cpp
```
2. Generate the project files with CMake
```
cmake . -G "Visual Studio 16 2019" -A x64 -T host=x64 -DWITH_GPU=ON -DWITH_MKL=ON -DCMAKE_BUILD_TYPE=Release -DCUDA_LIB=path_to_cuda_lib -DCUDNN_LIB=path_to_cudnn_lib -DPADDLE_DIR=path_to_paddle_lib -DOPENCV_DIR=path_to_opencv
```
For example:
```
cmake . -G "Visual Studio 16 2019" -A x64 -T host=x64 -DWITH_GPU=ON -DWITH_MKL=ON -DCMAKE_BUILD_TYPE=Release -DCUDA_LIB=D:\projects\packages\cuda10_0\lib\x64 -DCUDNN_LIB=D:\projects\packages\cuda10_0\lib\x64 -DPADDLE_DIR=D:\projects\packages\fluid_inference -DOPENCV_DIR=D:\projects\packages\opencv3_4_6
```
3. Build
Open `PaddleObjectDetector.sln` under the `cpp` folder with `Visual Studio 16 2019` and click `Build` -> `Build All`
### Step5: Inference and visualization
The executable produced by the `Visual Studio 2019` build above is under `out\build\x64-Release`; open `cmd` and switch to that directory:
```
cd D:\projects\PaddleDetection\deploy\cpp\out\build\x64-Release
```
The executable `main` is the sample inference program; its main command-line arguments are:
| Argument | Description |
| ---- | ---- |
| --model_dir | Path of the exported inference model |
| --image_path | Path of the image file to predict |
| --video_path | Path of the video file to predict |
| --camera_id | ID of the camera to predict from; default -1 (do not use a camera) |
| --use_gpu | Whether to run inference on the GPU; 0 or 1 (default 0) |
| --gpu_id | GPU device id to run inference on (default 0) |
| --run_mode | When using the GPU, defaults to fluid; one of fluid/trt_fp32/trt_fp16 |
| --run_benchmark | Whether to run inference repeatedly for benchmarking |
| --output_dir | Folder for output images; default output |
**Notes**:
(1) If both `video_path` and `image_path` are set, the program only predicts `video_path`.
(2) If `opencv_world346.dll` cannot be found, copy it from the `D:\projects\packages\opencv3_4_6\build\x64\vc14\bin` folder next to `main.exe`.
`样例一`
```shell
#不使用`GPU`测试图片 `D:\\images\\test.jpeg`
.\main --model_dir=D:\\models\\yolov3_darknet --image_path=D:\\images\\test.jpeg
```
图片文件`可视化预测结果`会保存在当前目录下`output.jpg`文件中。
`样例二`:
```shell
#使用`GPU`测试视频 `D:\\videos\\test.mp4`
.\main --model_dir=D:\\models\\yolov3_darknet --video_path=D:\\videos\\test.mp4 --use_gpu=1
```
视频文件目前支持`.mp4`格式的预测,`可视化预测结果`会保存在当前目录下`output.mp4`文件中。
## Performance tests
Test environment: Windows 10 Pro, CPU: i9-9820X, GPU: GTX 2080 Ti, Paddle inference library: 1.8.4, CUDA: 10.0, CUDNN: 7.4.
The first 100 warmup rounds are discarded and the average over 100 rounds is reported in ms/image; only model execution time is counted, excluding data preprocessing and copies.
| Model | AnalysisPredictor(ms) | Input |
|---|----|---|
| YOLOv3-MobileNetv1 | 41.51 | 608*608 |
| faster_rcnn_r50_1x | 194.47 | 1333*1333 |
| faster_rcnn_r50_vd_fpn_2x | 43.35 | 1344*1344 |
| mask_rcnn_r50_fpn_1x | 96.96 | 1344*1344 |
| mask_rcnn_r50_vd_fpn_2x | 97.66 | 1344*1344 |
| ppyolo_r18vd | 5.54 | 320*320 |
| ppyolo_2x | 56.93 | 608*608 |
| ttfnet_darknet | 36.17 | 512*512 |
// Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#pragma once
#include <iostream>
#include <vector>
#include <string>
#include <map>
#include "yaml-cpp/yaml.h"
#ifdef _WIN32
#define OS_PATH_SEP "\\"
#else
#define OS_PATH_SEP "/"
#endif
namespace PaddleDetection {
// Inference model configuration parser
class ConfigPaser {
public:
ConfigPaser() {}
~ConfigPaser() {}
bool load_config(const std::string& model_dir,
const std::string& cfg = "infer_cfg.yml") {
// Load as a YAML::Node
YAML::Node config;
config = YAML::LoadFile(model_dir + OS_PATH_SEP + cfg);
// Get runtime mode : fluid, trt_fp16, trt_fp32
if (config["mode"].IsDefined()) {
mode_ = config["mode"].as<std::string>();
} else {
std::cerr << "Please set mode, "
<< "support value : fluid/trt_fp16/trt_fp32."
<< std::endl;
return false;
}
// Get model arch : YOLO, SSD, RetinaNet, RCNN, Face
if (config["arch"].IsDefined()) {
arch_ = config["arch"].as<std::string>();
} else {
std::cerr << "Please set model arch,"
<< "support value : YOLO, SSD, RetinaNet, RCNN, Face."
<< std::endl;
return false;
}
// Get min_subgraph_size for tensorrt
if (config["min_subgraph_size"].IsDefined()) {
min_subgraph_size_ = config["min_subgraph_size"].as<int>();
} else {
std::cerr << "Please set min_subgraph_size." << std::endl;
return false;
}
// Get draw_threshold for visualization
if (config["draw_threshold"].IsDefined()) {
draw_threshold_ = config["draw_threshold"].as<float>();
} else {
std::cerr << "Please set draw_threshold." << std::endl;
return false;
}
// Get with_background
if (config["with_background"].IsDefined()) {
with_background_ = config["with_background"].as<bool>();
} else {
std::cerr << "Please set with_background." << std::endl;
return false;
}
// Get Preprocess for preprocessing
if (config["Preprocess"].IsDefined()) {
preprocess_info_ = config["Preprocess"];
} else {
std::cerr << "Please set Preprocess." << std::endl;
return false;
}
// Get label_list for visualization
if (config["label_list"].IsDefined()) {
label_list_ = config["label_list"].as<std::vector<std::string>>();
} else {
std::cerr << "Please set label_list." << std::endl;
return false;
}
if (config["image_shape"].IsDefined()) {
image_shape_ = config["image_shape"].as<std::vector<int>>();
} else {
std::cerr << "Please set image_shape." << std::endl;
return false;
}
return true;
}
std::string mode_;
float draw_threshold_;
std::string arch_;
int min_subgraph_size_;
bool with_background_;
YAML::Node preprocess_info_;
std::vector<std::string> label_list_;
std::vector<int> image_shape_;
};
} // namespace PaddleDetection
// Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#pragma once
#include <string>
#include <vector>
#include <memory>
#include <utility>
#include <ctime>
#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>
#include "paddle_inference_api.h" // NOLINT
#include "include/preprocess_op.h"
#include "include/config_parser.h"
using namespace paddle_infer;
namespace PaddleDetection {
// Object Detection Result
struct ObjectResult {
// Rectangle coordinates of detected object: left, right, top, down
std::vector<int> rect;
// Class id of detected object
int class_id;
// Confidence of detected object
float confidence;
};
// Generate visualization colormap for each class
std::vector<int> GenerateColorMap(int num_class);
// Visualization of detection results
cv::Mat VisualizeResult(const cv::Mat& img,
const std::vector<ObjectResult>& results,
const std::vector<std::string>& lable_list,
const std::vector<int>& colormap);
class ObjectDetector {
public:
explicit ObjectDetector(const std::string& model_dir,
bool use_gpu=false,
const std::string& run_mode="fluid",
const int gpu_id=0) {
config_.load_config(model_dir);
threshold_ = config_.draw_threshold_;
image_shape_ = config_.image_shape_;
preprocessor_.Init(config_.preprocess_info_, image_shape_);
LoadModel(model_dir, use_gpu, config_.min_subgraph_size_, 1, run_mode, gpu_id);
}
// Load Paddle inference model
void LoadModel(
const std::string& model_dir,
bool use_gpu,
const int min_subgraph_size,
const int batch_size = 1,
const std::string& run_mode = "fluid",
const int gpu_id=0);
// Run predictor
void Predict(const cv::Mat& im,
const double threshold = 0.5,
const int warmup = 0,
const int repeats = 1,
const bool run_benchmark = false,
std::vector<ObjectResult>* result = nullptr);
// Get Model Label list
const std::vector<std::string>& GetLabelList() const {
return config_.label_list_;
}
private:
// Preprocess image and copy data to input buffer
void Preprocess(const cv::Mat& image_mat);
// Postprocess result
void Postprocess(
const cv::Mat& raw_mat,
std::vector<ObjectResult>* result);
std::shared_ptr<Predictor> predictor_;
Preprocessor preprocessor_;
ImageBlob inputs_;
std::vector<float> output_data_;
float threshold_;
ConfigPaser config_;
std::vector<int> image_shape_;
};
} // namespace PaddleDetection
// Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#pragma once
#include <glog/logging.h>
#include <yaml-cpp/yaml.h>
#include <vector>
#include <string>
#include <utility>
#include <memory>
#include <unordered_map>
#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>
namespace PaddleDetection {
// Object for storing all preprocessed data
class ImageBlob {
public:
// image width and height
std::vector<float> im_shape_;
// Buffer for image data after preprocessing
std::vector<float> im_data_;
// input image width, height
std::vector<int> input_shape_;
// Evaluation image width and height
//std::vector<float> eval_im_size_f_;
// Scale factor for image size to origin image size
std::vector<float> scale_factor_;
};
// Abstraction of preprocessing operation class
class PreprocessOp {
public:
virtual void Init(const YAML::Node& item, const std::vector<int> image_shape) = 0;
virtual void Run(cv::Mat* im, ImageBlob* data) = 0;
};
class InitInfo : public PreprocessOp{
public:
virtual void Init(const YAML::Node& item, const std::vector<int> image_shape) {}
virtual void Run(cv::Mat* im, ImageBlob* data);
};
class Normalize : public PreprocessOp {
public:
virtual void Init(const YAML::Node& item, const std::vector<int> image_shape) {
mean_ = item["mean"].as<std::vector<float>>();
scale_ = item["std"].as<std::vector<float>>();
is_scale_ = item["is_scale"].as<bool>();
}
virtual void Run(cv::Mat* im, ImageBlob* data);
private:
// CHW or HWC
std::vector<float> mean_;
std::vector<float> scale_;
bool is_scale_;
};
class Permute : public PreprocessOp {
public:
virtual void Init(const YAML::Node& item, const std::vector<int> image_shape) {}
virtual void Run(cv::Mat* im, ImageBlob* data);
};
class Resize : public PreprocessOp {
public:
virtual void Init(const YAML::Node& item, const std::vector<int> image_shape) {
interp_ = item["interp"].as<int>();
//max_size_ = item["target_size"].as<int>();
keep_ratio_ = item["keep_ratio"].as<bool>();
target_size_ = item["target_size"].as<std::vector<int>>();
if (item["keep_ratio"]) {
input_shape_ = image_shape;
}
}
// Compute best resize scale for x-dimension, y-dimension
std::pair<float, float> GenerateScale(const cv::Mat& im);
virtual void Run(cv::Mat* im, ImageBlob* data);
private:
int interp_;
bool keep_ratio_;
std::vector<int> target_size_;
std::vector<int> input_shape_;
};
// Models with FPN need input shape % stride == 0
class PadStride : public PreprocessOp {
public:
virtual void Init(const YAML::Node& item, const std::vector<int> image_shape) {
stride_ = item["stride"].as<int>();
}
virtual void Run(cv::Mat* im, ImageBlob* data);
private:
int stride_;
};
class Preprocessor {
public:
void Init(const YAML::Node& config_node, const std::vector<int> image_shape) {
// initialize image info at first
ops_["InitInfo"] = std::make_shared<InitInfo>();
for (const auto& item : config_node) {
auto op_name = item["type"].as<std::string>();
ops_[op_name] = CreateOp(op_name);
ops_[op_name]->Init(item, image_shape);
}
}
std::shared_ptr<PreprocessOp> CreateOp(const std::string& name) {
if (name == "ResizeOp") {
return std::make_shared<Resize>();
} else if (name == "PermuteOp") {
return std::make_shared<Permute>();
} else if (name == "NormalizeImageOp") {
return std::make_shared<Normalize>();
} else if (name == "PadStride") {
return std::make_shared<PadStride>();
}
return nullptr;
}
void Run(cv::Mat* im, ImageBlob* data);
public:
static const std::vector<std::string> RUN_ORDER;
private:
std::unordered_map<std::string, std::shared_ptr<PreprocessOp>> ops_;
};
} // namespace PaddleDetection
# whether to use the GPU (i.e. whether to use CUDA)
WITH_GPU=OFF
# whether to use MKL or openblas; set to OFF on TX2
WITH_MKL=ON
# whether to integrate TensorRT (only effective when WITH_GPU=ON)
WITH_TENSORRT=OFF
# TensorRT include path
TENSORRT_INC_DIR=/path/to/tensorrt/include
# TensorRT lib path
TENSORRT_LIB_DIR=/path/to/tensorrt/lib
# path of the Paddle inference library
PADDLE_DIR=/path/to/fluid_inference/
# whether to link the Paddle inference library statically
# when using TensorRT, the Paddle inference library is usually dynamic
WITH_STATIC_LIB=OFF
# CUDA lib path
CUDA_LIB=/path/to/cuda/lib
# CUDNN lib path
CUDNN_LIB=/path/to/cudnn/lib
MACHINE_TYPE=`uname -m`
echo "MACHINE_TYPE: "${MACHINE_TYPE}
if [ "$MACHINE_TYPE" = "x86_64" ]
then
echo "set OPENCV_DIR for x86_64"
# on Linux, download the prebuilt opencv with the following commands
mkdir -p $(pwd)/deps && cd $(pwd)/deps
wget -c https://bj.bcebos.com/paddleseg/deploy/opencv3.4.6gcc4.8ffmpeg.tar.gz2
tar xvfj opencv3.4.6gcc4.8ffmpeg.tar.gz2 && cd ..
# set OPENCV_DIR
OPENCV_DIR=$(pwd)/deps/opencv3.4.6gcc4.8ffmpeg/
elif [ "$MACHINE_TYPE" = "aarch64" ]
then
echo "set OPENCV_DIR for aarch64"
# on the TX2 platform, download the prebuilt opencv with the following commands
mkdir -p $(pwd)/deps && cd $(pwd)/deps
wget -c https://paddlemodels.bj.bcebos.com/TX2_JetPack4.3_opencv_3.4.10_gcc7.5.0.zip
unzip TX2_JetPack4.3_opencv_3.4.10_gcc7.5.0.zip && cd ..
# set OPENCV_DIR
OPENCV_DIR=$(pwd)/deps/TX2_JetPack4.3_opencv_3.4.10_gcc7.5.0/
else
echo "Please set OPENCV_DIR manually"
fi
echo "OPENCV_DIR: "$OPENCV_DIR
# no changes needed below
rm -rf build
mkdir -p build
cd build
cmake .. \
-DWITH_GPU=${WITH_GPU} \
-DWITH_MKL=${WITH_MKL} \
-DWITH_TENSORRT=${WITH_TENSORRT} \
-DTENSORRT_LIB_DIR=${TENSORRT_LIB_DIR} \
-DTENSORRT_INC_DIR=${TENSORRT_INC_DIR} \
-DPADDLE_DIR=${PADDLE_DIR} \
-DWITH_STATIC_LIB=${WITH_STATIC_LIB} \
-DCUDA_LIB=${CUDA_LIB} \
-DCUDNN_LIB=${CUDNN_LIB} \
-DOPENCV_DIR=${OPENCV_DIR}
make
echo "make finished!"
// Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#include <glog/logging.h>
#include <iostream>
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/stat.h>
#ifdef _WIN32
#include <direct.h>
#include <io.h>
#elif defined(__linux__)
#include <stdarg.h>
#include <sys/stat.h>
#endif
#include "include/object_detector.h"
DEFINE_string(model_dir, "", "Path of inference model");
DEFINE_string(image_path, "", "Path of input image");
DEFINE_string(video_path, "", "Path of input video");
DEFINE_bool(use_gpu, false, "Inferring with GPU or CPU");
DEFINE_bool(use_camera, false, "Use camera or not");
DEFINE_string(run_mode, "fluid", "Mode of running(fluid/trt_fp32/trt_fp16)");
DEFINE_int32(gpu_id, 0, "Device id of GPU to execute");
DEFINE_int32(camera_id, -1, "Device id of camera to predict");
DEFINE_bool(run_benchmark, false, "Whether to predict an image_file repeatedly for benchmark");
DEFINE_double(threshold, 0.5, "Threshold of score.");
DEFINE_string(output_dir, "output", "Directory of output visualization files.");
static std::string DirName(const std::string &filepath) {
auto pos = filepath.rfind(OS_PATH_SEP);
if (pos == std::string::npos) {
return "";
}
return filepath.substr(0, pos);
}
static bool PathExists(const std::string& path){
#ifdef _WIN32
struct _stat buffer;
return (_stat(path.c_str(), &buffer) == 0);
#else
struct stat buffer;
return (stat(path.c_str(), &buffer) == 0);
#endif // !_WIN32
}
static void MkDir(const std::string& path) {
if (PathExists(path)) return;
int ret = 0;
#ifdef _WIN32
ret = _mkdir(path.c_str());
#else
ret = mkdir(path.c_str(), 0755);
#endif // !_WIN32
if (ret != 0) {
std::string path_error(path);
path_error += " mkdir failed!";
throw std::runtime_error(path_error);
}
}
static void MkDirs(const std::string& path) {
if (path.empty()) return;
if (PathExists(path)) return;
MkDirs(DirName(path));
MkDir(path);
}
void PredictVideo(const std::string& video_path,
PaddleDetection::ObjectDetector* det) {
// Open video
cv::VideoCapture capture;
if (FLAGS_camera_id != -1){
capture.open(FLAGS_camera_id);
}else{
capture.open(video_path.c_str());
}
if (!capture.isOpened()) {
printf("can not open video : %s\n", video_path.c_str());
return;
}
// Get Video info : resolution, fps
int video_width = static_cast<int>(capture.get(CV_CAP_PROP_FRAME_WIDTH));
int video_height = static_cast<int>(capture.get(CV_CAP_PROP_FRAME_HEIGHT));
int video_fps = static_cast<int>(capture.get(CV_CAP_PROP_FPS));
// Create VideoWriter for output
cv::VideoWriter video_out;
std::string video_out_path = "output.mp4";
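// FOURCC 0x00000021 asks OpenCV's FFmpeg backend for an MP4-compatible codec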
video_out.open(video_out_path.c_str(),
0x00000021,
video_fps,
cv::Size(video_width, video_height),
true);
if (!video_out.isOpened()) {
printf("create video writer failed!\n");
return;
}
std::vector<PaddleDetection::ObjectResult> result;
auto labels = det->GetLabelList();
auto colormap = PaddleDetection::GenerateColorMap(labels.size());
// Capture all frames and do inference
cv::Mat frame;
int frame_id = 0;
while (capture.read(frame)) {
if (frame.empty()) {
break;
}
det->Predict(frame, 0.5, 0, 1, false, &result);
cv::Mat out_im = PaddleDetection::VisualizeResult(
frame, result, labels, colormap);
for (const auto& item : result) {
printf("In frame id %d, we detect: class=%d confidence=%.2f rect=[%d %d %d %d]\n",
frame_id,
item.class_id,
item.confidence,
item.rect[0],
item.rect[1],
item.rect[2],
item.rect[3]);
}
video_out.write(out_im);
frame_id += 1;
}
capture.release();
video_out.release();
}
void PredictImage(const std::string& image_path,
const double threshold,
const bool run_benchmark,
PaddleDetection::ObjectDetector* det,
const std::string& output_dir = "output") {
// Open input image as an opencv cv::Mat object
cv::Mat im = cv::imread(image_path, 1);
// Store all detected result
std::vector<PaddleDetection::ObjectResult> result;
if (run_benchmark)
{
det->Predict(im, threshold, 100, 100, run_benchmark, &result);
}else
{
det->Predict(im, threshold, 0, 1, run_benchmark, &result);
for (const auto& item : result) {
printf("class=%d confidence=%.4f rect=[%d %d %d %d]\n",
item.class_id,
item.confidence,
item.rect[0],
item.rect[1],
item.rect[2],
item.rect[3]);
}
// Visualization result
auto labels = det->GetLabelList();
auto colormap = PaddleDetection::GenerateColorMap(labels.size());
cv::Mat vis_img = PaddleDetection::VisualizeResult(
im, result, labels, colormap);
std::vector<int> compression_params;
compression_params.push_back(CV_IMWRITE_JPEG_QUALITY);
compression_params.push_back(95);
std::string output_path(output_dir);
if (output_dir.rfind(OS_PATH_SEP) != output_dir.size() - 1) {
output_path += OS_PATH_SEP;
}
output_path += "output.jpg";
cv::imwrite(output_path, vis_img, compression_params);
printf("Visualized output saved as %s\n", output_path.c_str());
}
}
int main(int argc, char** argv) {
// Parsing command-line
google::ParseCommandLineFlags(&argc, &argv, true);
if (FLAGS_model_dir.empty()
|| (FLAGS_image_path.empty() && FLAGS_video_path.empty())) {
std::cout << "Usage: ./main --model_dir=/PATH/TO/INFERENCE_MODEL/ "
<< "--image_path=/PATH/TO/INPUT/IMAGE/" << std::endl;
return -1;
}
if (!(FLAGS_run_mode == "fluid" || FLAGS_run_mode == "trt_fp32"
|| FLAGS_run_mode == "trt_fp16")) {
std::cout << "run_mode should be 'fluid', 'trt_fp32' or 'trt_fp16'.";
return -1;
}
// Load model and create a object detector
PaddleDetection::ObjectDetector det(FLAGS_model_dir, FLAGS_use_gpu,
FLAGS_run_mode, FLAGS_gpu_id);
// Do inference on input video or image
if (!FLAGS_video_path.empty() || FLAGS_use_camera) {
PredictVideo(FLAGS_video_path, &det);
} else if (!FLAGS_image_path.empty()) {
if (!PathExists(FLAGS_output_dir)) {
MkDirs(FLAGS_output_dir);
}
PredictImage(FLAGS_image_path, FLAGS_threshold, FLAGS_run_benchmark, &det, FLAGS_output_dir);
}
return 0;
}
// Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#include <sstream>
// for setprecision
#include <iomanip>
#include "include/object_detector.h"
using namespace paddle_infer;
namespace PaddleDetection {
// Load Model and create model predictor
void ObjectDetector::LoadModel(const std::string& model_dir,
bool use_gpu,
const int min_subgraph_size,
const int batch_size,
const std::string& run_mode,
const int gpu_id) {
paddle_infer::Config config;
std::string prog_file = model_dir + OS_PATH_SEP + "model.pdmodel";
std::string params_file = model_dir + OS_PATH_SEP + "model.pdiparams";
config.SetModel(prog_file, params_file);
if (use_gpu) {
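// 200: initial GPU memory pool size in MB; gpu_id selects the device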
config.EnableUseGpu(200, gpu_id);
config.SwitchIrOptim(true);
if (run_mode != "fluid") {
auto precision = paddle_infer::Config::Precision::kFloat32;
if (run_mode == "trt_fp16") {
precision = paddle_infer::Config::Precision::kHalf;
} else if (run_mode == "trt_int8") {
printf("TensorRT int8 mode is not supported now, "
"please use 'trt_fp32' or 'trt_fp16' instead");
} else {
if (run_mode != "trt_fp32") {
printf("run_mode should be 'fluid', 'trt_fp32' or 'trt_fp16'");
}
}
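// args: workspace_size, max_batch_size, min_subgraph_size, precision, use_static, use_calib_mode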
config.EnableTensorRtEngine(
1 << 10,
batch_size,
min_subgraph_size,
precision,
false,
false);
}
} else {
config.DisableGpu();
}
config.SwitchUseFeedFetchOps(false);
config.DisableGlogInfo();
// Memory optimization
config.EnableMemoryOptim();
predictor_ = std::move(CreatePredictor(config));
}
// Visualization of detection results
cv::Mat VisualizeResult(const cv::Mat& img,
const std::vector<ObjectResult>& results,
const std::vector<std::string>& lable_list,
const std::vector<int>& colormap) {
cv::Mat vis_img = img.clone();
for (int i = 0; i < results.size(); ++i) {
int w = results[i].rect[1] - results[i].rect[0];
int h = results[i].rect[3] - results[i].rect[2];
cv::Rect roi = cv::Rect(results[i].rect[0], results[i].rect[2], w, h);
// Configure color and text size
std::ostringstream oss;
oss << std::setiosflags(std::ios::fixed) << std::setprecision(4);
oss << lable_list[results[i].class_id] << " ";
oss << results[i].confidence;
std::string text = oss.str();
int c1 = colormap[3 * results[i].class_id + 0];
int c2 = colormap[3 * results[i].class_id + 1];
int c3 = colormap[3 * results[i].class_id + 2];
cv::Scalar roi_color = cv::Scalar(c1, c2, c3);
int font_face = cv::FONT_HERSHEY_COMPLEX_SMALL;
double font_scale = 0.5f;
float thickness = 0.5;
cv::Size text_size = cv::getTextSize(text,
font_face,
font_scale,
thickness,
nullptr);
cv::Point origin;
origin.x = roi.x;
origin.y = roi.y;
// Configure text background
cv::Rect text_back = cv::Rect(results[i].rect[0],
results[i].rect[2] - text_size.height,
text_size.width,
text_size.height);
// Draw roi object, text, and background
cv::rectangle(vis_img, roi, roi_color, 2);
cv::rectangle(vis_img, text_back, roi_color, -1);
cv::putText(vis_img,
text,
origin,
font_face,
font_scale,
cv::Scalar(255, 255, 255),
thickness);
}
return vis_img;
}
void ObjectDetector::Preprocess(const cv::Mat& ori_im) {
// Clone the image : keep the original mat for postprocess
cv::Mat im = ori_im.clone();
cv::cvtColor(im, im, cv::COLOR_BGR2RGB);
preprocessor_.Run(&im, &inputs_);
}
void ObjectDetector::Postprocess(
const cv::Mat& raw_mat,
std::vector<ObjectResult>* result) {
result->clear();
int rh = 1;
int rw = 1;
if (config_.arch_ == "SSD" || config_.arch_ == "Face") {
rh = raw_mat.rows;
rw = raw_mat.cols;
}
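// Each detection is packed as 6 floats: [class_id, score, xmin, ymin, xmax, ymax]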
int total_size = output_data_.size() / 6;
for (int j = 0; j < total_size; ++j) {
// Class id
int class_id = static_cast<int>(round(output_data_[0 + j * 6]));
// Confidence score
float score = output_data_[1 + j * 6];
int xmin = (output_data_[2 + j * 6] * rw);
int ymin = (output_data_[3 + j * 6] * rh);
int xmax = (output_data_[4 + j * 6] * rw);
int ymax = (output_data_[5 + j * 6] * rh);
int wd = xmax - xmin;
int hd = ymax - ymin;
if (score > threshold_ && class_id > -1) {
ObjectResult result_item;
result_item.rect = {xmin, xmax, ymin, ymax};
result_item.class_id = class_id;
result_item.confidence = score;
result->push_back(result_item);
}
}
}
void ObjectDetector::Predict(const cv::Mat& im,
const double threshold,
const int warmup,
const int repeats,
const bool run_benchmark,
std::vector<ObjectResult>* result) {
// Preprocess image
Preprocess(im);
// Prepare input tensor
auto input_names = predictor_->GetInputNames();
for (const auto& tensor_name : input_names) {
auto in_tensor = predictor_->GetInputHandle(tensor_name);
if (tensor_name == "image") {
int rh = inputs_.input_shape_[0];
int rw = inputs_.input_shape_[1];
in_tensor->Reshape({1, 3, rh, rw});
in_tensor->CopyFromCpu(inputs_.im_data_.data());
} else if (tensor_name == "im_shape") {
in_tensor->Reshape({1, 2});
in_tensor->CopyFromCpu(inputs_.im_shape_.data());
} else if (tensor_name == "scale_factor") {
in_tensor->Reshape({1, 2});
in_tensor->CopyFromCpu(inputs_.scale_factor_.data());
}
}
// Run predictor
for (int i = 0; i < warmup; i++)
{
predictor_->Run();
// Get output tensor
auto output_names = predictor_->GetOutputNames();
auto out_tensor = predictor_->GetOutputHandle(output_names[0]);
std::vector<int> output_shape = out_tensor->shape();
// Calculate output length
int output_size = 1;
for (int j = 0; j < output_shape.size(); ++j) {
output_size *= output_shape[j];
}
if (output_size < 6) {
std::cerr << "[WARNING] No object detected." << std::endl;
}
output_data_.resize(output_size);
out_tensor->CopyToCpu(output_data_.data());
}
std::clock_t start = clock();
for (int i = 0; i < repeats; i++)
{
predictor_->Run();
// Get output tensor
auto output_names = predictor_->GetOutputNames();
auto out_tensor = predictor_->GetOutputHandle(output_names[0]);
std::vector<int> output_shape = out_tensor->shape();
// Calculate output length
int output_size = 1;
for (int j = 0; j < output_shape.size(); ++j) {
output_size *= output_shape[j];
}
if (output_size < 6) {
std::cerr << "[WARNING] No object detected." << std::endl;
}
output_data_.resize(output_size);
out_tensor->CopyToCpu(output_data_.data());
}
std::clock_t end = clock();
float ms = static_cast<float>(end - start) / CLOCKS_PER_SEC / repeats * 1000.;
printf("Inference: %f ms per batch image\n", ms);
// Postprocessing result
if(!run_benchmark) {
Postprocess(im, result);
}
}
std::vector<int> GenerateColorMap(int num_class) {
auto colormap = std::vector<int>(3 * num_class, 0);
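// Spread the bits of each class id across the R/G/B channels (PASCAL VOC-style
// colormap), so consecutive ids map to visually distinct colors.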
for (int i = 0; i < num_class; ++i) {
int j = 0;
int lab = i;
while (lab) {
colormap[i * 3] |= (((lab >> 0) & 1) << (7 - j));
colormap[i * 3 + 1] |= (((lab >> 1) & 1) << (7 - j));
colormap[i * 3 + 2] |= (((lab >> 2) & 1) << (7 - j));
++j;
lab >>= 3;
}
}
return colormap;
}
} // namespace PaddleDetection
// Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#include <vector>
#include <string>
#include "include/preprocess_op.h"
namespace PaddleDetection {
void InitInfo::Run(cv::Mat* im, ImageBlob* data) {
data->im_shape_ = {
static_cast<float>(im->rows),
static_cast<float>(im->cols)
};
data->scale_factor_ = {1., 1.};
data->input_shape_ = {
static_cast<int>(im->rows),
static_cast<int>(im->cols)
};
}
void Normalize::Run(cv::Mat* im, ImageBlob* data) {
double e = 1.0;
if (is_scale_) {
e /= 255.0;
}
(*im).convertTo(*im, CV_32FC3, e);
for (int h = 0; h < im->rows; h++) {
for (int w = 0; w < im->cols; w++) {
im->at<cv::Vec3f>(h, w)[0] =
(im->at<cv::Vec3f>(h, w)[0] - mean_[0] ) / scale_[0];
im->at<cv::Vec3f>(h, w)[1] =
(im->at<cv::Vec3f>(h, w)[1] - mean_[1] ) / scale_[1];
im->at<cv::Vec3f>(h, w)[2] =
(im->at<cv::Vec3f>(h, w)[2] - mean_[2] ) / scale_[2];
}
}
}
void Permute::Run(cv::Mat* im, ImageBlob* data) {
int rh = im->rows;
int rw = im->cols;
int rc = im->channels();
(data->im_data_).resize(rc * rh * rw);
float* base = (data->im_data_).data();
for (int i = 0; i < rc; ++i) {
cv::extractChannel(*im, cv::Mat(rh, rw, CV_32FC1, base + i * rh * rw), i);
}
}
void Resize::Run(cv::Mat* im, ImageBlob* data) {
auto resize_scale = GenerateScale(*im);
cv::resize(
*im, *im, cv::Size(), resize_scale.first, resize_scale.second, interp_);
data->im_shape_ = {
static_cast<float>(im->rows),
static_cast<float>(im->cols),
};
data->scale_factor_ = {
resize_scale.second,
resize_scale.first,
};
if (keep_ratio_) {
int max_size = input_shape_[1];
// Padding the image with 0 border
cv::copyMakeBorder(
*im,
*im,
0,
max_size - im->rows,
0,
max_size - im->cols,
cv::BORDER_CONSTANT,
cv::Scalar(0));
}
data->input_shape_ = {
static_cast<int>(im->rows),
static_cast<int>(im->cols),
};
}
std::pair<float, float> Resize::GenerateScale(const cv::Mat& im) {
std::pair<float, float> resize_scale;
int origin_w = im.cols;
int origin_h = im.rows;
if (keep_ratio_) {
int im_size_max = std::max(origin_w, origin_h);
int im_size_min = std::min(origin_w, origin_h);
int target_size_max = *std::max_element(target_size_.begin(), target_size_.end());
int target_size_min = *std::min_element(target_size_.begin(), target_size_.end());
float scale_min =
static_cast<float>(target_size_min) / static_cast<float>(im_size_min);
float scale_max =
static_cast<float>(target_size_max) / static_cast<float>(im_size_max);
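// Use the smaller of the two ratios: scale the short side toward
// target_size_min without letting the long side exceed target_size_max.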
float scale_ratio = std::min(scale_min, scale_max);
resize_scale = {scale_ratio, scale_ratio};
} else {
resize_scale.first =
static_cast<float>(target_size_[1]) / static_cast<float>(origin_w);
resize_scale.second =
static_cast<float>(target_size_[0]) / static_cast<float>(origin_h);
}
return resize_scale;
}
void PadStride::Run(cv::Mat* im, ImageBlob* data) {
if (stride_ <= 0) {
return;
}
int rc = im->channels();
int rh = im->rows;
int rw = im->cols;
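// Round height and width up to the nearest multiple of stride_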
int nh = (rh / stride_) * stride_ + (rh % stride_ != 0) * stride_;
int nw = (rw / stride_) * stride_ + (rw % stride_ != 0) * stride_;
cv::copyMakeBorder(
*im,
*im,
0,
nh - rh,
0,
nw - rw,
cv::BORDER_CONSTANT,
cv::Scalar(0));
data->input_shape_ = {
static_cast<int>(im->rows),
static_cast<int>(im->cols),
};
}
// Preprocessor op running order
const std::vector<std::string> Preprocessor::RUN_ORDER = {
"InitInfo", "ResizeOp", "NormalizeImageOp", "PadStrideOp", "PermuteOp"
};
void Preprocessor::Run(cv::Mat* im, ImageBlob* data) {
for (const auto& name : RUN_ORDER) {
if (ops_.find(name) != ops_.end()) {
ops_[name]->Run(im, data);
}
}
}
} // namespace PaddleDetection
# Python inference deployment
Python inference can be done with `tools/infer.py`, which depends on the PaddleDetection source; alternatively, use the approach in this tutorial: export the model first, then predict with a standalone script.
This tutorial uses the AnalysisPredictor for high-performance inference on an [exported model](https://github.com/PaddlePaddle/PaddleDetection/tree/dygraph/deploy/EXPORT_MODEL.md).
In PaddlePaddle the inference engine and the training engine are optimized differently under the hood. The inference engine uses the AnalysisPredictor, which is optimized specifically for inference: it is the Python interface over the [C++ inference library](https://www.paddlepaddle.org.cn/documentation/docs/zh/advanced_guide/inference_deployment/inference/native_infer.html), applies several graph optimizations to the model, and avoids unnecessary memory copies. For users with high performance requirements when deploying a trained model, we provide this inference script independent of PaddleDetection for direct integration.
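As a minimal sketch of the predictor setup that `deploy/python/infer.py` (included below) performs, assuming an exported `yolov3_darknet` directory as an example path:
```python
# Minimal AnalysisPredictor setup for an exported model (CPU, no TensorRT).
from paddle.inference import Config, create_predictor

config = Config("yolov3_darknet/model.pdmodel", "yolov3_darknet/model.pdiparams")
config.disable_gpu()                      # or config.enable_use_gpu(200, 0) for GPU
config.enable_memory_optim()              # reuse activation memory where possible
config.switch_use_feed_fetch_ops(False)   # required by the zero-copy tensor API
predictor = create_predictor(config)
```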
It involves two main steps:
- Export the inference model
- Predict with Python
## 1. Export the inference model
During training PaddleDetection keeps both the network's forward parameters and the optimizer state; for deployment only the forward parameters are needed. For details see: [exporting models](https://github.com/PaddlePaddle/PaddleDetection/tree/dygraph/deploy/EXPORT_MODEL.md)
The exported directory contains four files: `infer_cfg.yml`, `model.pdiparams`, `model.pdiparams.info`, and `model.pdmodel`.
## 2. Prediction with Python
### 2.1 Install dependencies
- Installing `PaddlePaddle`:
follow the [official installation guide](https://paddlepaddle.org.cn/install/quick) and pick a suitable method; any version from 2.0rc1 onward works
- Switch to the `PaddleDetection` repository root and run `pip install -r requirements.txt` to install the remaining dependencies
### 2.2 Run the inference script
Enter the following command in a terminal to predict:
```bash
python deploy/python/infer.py --model_dir=/path/to/models --image_file=/path/to/image --use_gpu=(False/True)
```
The arguments are:
| Argument | Required | Meaning |
|-------|-------|----------|
| --model_dir | Yes | Path of the model exported above |
| --image_file | Optional | Image to predict |
| --video_file | Optional | Video to predict |
| --camera_id | Optional | ID of the camera to predict from; default -1 (do not use a camera). Can be set to 0 .. (number of cameras - 1); press `q` in the visualization window during prediction to quit and write the result to output/output.mp4 |
| --use_gpu | No | Whether to use the GPU; default False |
| --run_mode | No | When using the GPU, defaults to fluid; one of fluid/trt_fp32/trt_fp16 |
| --threshold | No | Score threshold for predictions; default 0.5 |
| --output_dir | No | Root directory for visualized results; default output/ |
| --run_benchmark | No | Whether to run a benchmark; requires --image_file |
Notes:
- run_mode: fluid means inference with the AnalysisPredictor at float32 precision; the other values mean the AnalysisPredictor with TensorRT at different precisions.
- PaddlePaddle's default GPU packages (<=1.7) do not support TensorRT-based inference; to accelerate inference with TensorRT you need to compile from source, see the [inference library build tutorial](https://www.paddlepaddle.org.cn/documentation/docs/zh/advanced_usage/deploy/inference/paddle_tensorrt_infer.html).
## 3. Deployment performance comparison
This compares the inference speed of the AnalysisPredictor against the Executor.
### 3.1 Test environment:
- CUDA 9.0
- CUDNN 7.5
- PaddlePaddle 1.7.1
- GPU: Tesla P40
### 3.2 Test method:
- Batch Size=1
- The first 100 warmup rounds are discarded and the average over 100 rounds is reported in ms/image; only model execution time is counted, excluding data preprocessing and copies.
### 3.3 Test results
| Model | AnalysisPredictor | Executor | Input |
|---|----|---|---|
| YOLOv3-MobileNetv1 | 15.20 | 19.54 | 608*608 |
| faster_rcnn_r50_fpn_1x | 50.05 | 69.58 | 800*1088 |
| faster_rcnn_r50_1x | 326.11 | 347.22 | 800*1067 |
| mask_rcnn_r50_fpn_1x | 67.49 | 91.02 | 800*1088 |
| mask_rcnn_r50_1x | 326.11 | 350.94 | 800*1067 |
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import argparse
import time
import yaml
import ast
from functools import reduce
from PIL import Image
import cv2
import numpy as np
import paddle
import paddle.fluid as fluid
from preprocess import preprocess, ResizeOp, NormalizeImageOp, PermuteOp, PadStride
from visualize import visualize_box_mask
from paddle.inference import Config
from paddle.inference import create_predictor
# Global dictionary
SUPPORT_MODELS = {
'YOLO',
'RCNN',
'SSD',
}
class Detector(object):
"""
Args:
pred_config (object): config of the model, created by `PredictConfig(model_dir)`
model_dir (str): root path of model.pdiparams, model.pdmodel and infer_cfg.yml
use_gpu (bool): whether use gpu
run_mode (str): mode of running(fluid/trt_fp32/trt_fp16)
threshold (float): threshold to reserve the result for output.
"""
def __init__(self,
pred_config,
model_dir,
use_gpu=False,
run_mode='fluid',
threshold=0.5):
self.pred_config = pred_config
self.predictor = load_predictor(
model_dir,
run_mode=run_mode,
min_subgraph_size=self.pred_config.min_subgraph_size,
use_gpu=use_gpu)
def preprocess(self, im):
preprocess_ops = []
for op_info in self.pred_config.preprocess_infos:
new_op_info = op_info.copy()
op_type = new_op_info.pop('type')
preprocess_ops.append(eval(op_type)(**new_op_info))
im, im_info = preprocess(im, preprocess_ops,
self.pred_config.input_shape)
inputs = create_inputs(im, im_info)
return inputs
def postprocess(self, np_boxes, np_masks, inputs, threshold=0.5):
# postprocess output of predictor
results = {}
if self.pred_config.arch in ['Face']:
h, w = inputs['im_shape']
scale_y, scale_x = inputs['scale_factor']
w, h = float(h) / scale_y, float(w) / scale_x
np_boxes[:, 2] *= h
np_boxes[:, 3] *= w
np_boxes[:, 4] *= h
np_boxes[:, 5] *= w
expect_boxes = (np_boxes[:, 1] > threshold) & (np_boxes[:, 0] > -1)
np_boxes = np_boxes[expect_boxes, :]
for box in np_boxes:
print('class_id:{:d}, confidence:{:.4f},'
'left_top:[{:.2f},{:.2f}],'
' right_bottom:[{:.2f},{:.2f}]'.format(
int(box[0]), box[1], box[2], box[3], box[4], box[5]))
results['boxes'] = np_boxes
if np_masks is not None:
np_masks = np_masks[expect_boxes, :, :, :]
results['masks'] = np_masks
return results
def predict(self,
image,
threshold=0.5,
warmup=0,
repeats=1,
run_benchmark=False):
'''
Args:
image (str/np.ndarray): path of image/ np.ndarray read by cv2
threshold (float): threshold of predicted box' score
Returns:
results (dict): include 'boxes': np.ndarray: shape:[N,6], N: number of box,
matrix element:[class, score, x_min, y_min, x_max, y_max]
MaskRCNN's results include 'masks': np.ndarray:
shape:[N, class_num, mask_resolution, mask_resolution]
'''
inputs = self.preprocess(image)
np_boxes, np_masks = None, None
input_names = self.predictor.get_input_names()
for i in range(len(input_names)):
input_tensor = self.predictor.get_input_handle(input_names[i])
input_tensor.copy_from_cpu(inputs[input_names[i]])
for i in range(warmup):
self.predictor.run()
output_names = self.predictor.get_output_names()
boxes_tensor = self.predictor.get_output_handle(output_names[0])
np_boxes = boxes_tensor.copy_to_cpu()
if self.pred_config.mask_resolution is not None:
masks_tensor = self.predictor.get_output_handle(output_names[2])
np_masks = masks_tensor.copy_to_cpu()
t1 = time.time()
for i in range(repeats):
self.predictor.run()
output_names = self.predictor.get_output_names()
boxes_tensor = self.predictor.get_output_handle(output_names[0])
np_boxes = boxes_tensor.copy_to_cpu()
if self.pred_config.mask_resolution is not None:
masks_tensor = self.predictor.get_output_handle(output_names[2])
np_masks = masks_tensor.copy_to_cpu()
t2 = time.time()
ms = (t2 - t1) * 1000.0 / repeats
print("Inference: {} ms per batch image".format(ms))
# do not perform postprocess in benchmark mode
results = []
if not run_benchmark:
if reduce(lambda x, y: x * y, np_boxes.shape) < 6:
print('[WARNING] No object detected.')
results = {'boxes': np.array([])}
else:
results = self.postprocess(
np_boxes, np_masks, inputs, threshold=threshold)
return results
def create_inputs(im, im_info):
"""generate input for different model type
Args:
im (np.ndarray): image (np.ndarray)
im_info (dict): info of image
model_arch (str): model type
Returns:
inputs (dict): input of model
"""
inputs = {}
inputs['image'] = np.array((im, )).astype('float32')
inputs['im_shape'] = np.array((im_info['im_shape'], )).astype('float32')
inputs['scale_factor'] = np.array(
(im_info['scale_factor'], )).astype('float32')
return inputs
class PredictConfig():
"""set config of preprocess, postprocess and visualize
Args:
model_dir (str): root path of model.yml
"""
def __init__(self, model_dir):
# parsing Yaml config for Preprocess
deploy_file = os.path.join(model_dir, 'infer_cfg.yml')
with open(deploy_file) as f:
yml_conf = yaml.safe_load(f)
self.check_model(yml_conf)
self.arch = yml_conf['arch']
self.preprocess_infos = yml_conf['Preprocess']
self.min_subgraph_size = yml_conf['min_subgraph_size']
self.labels = yml_conf['label_list']
self.mask_resolution = None
if 'mask_resolution' in yml_conf:
self.mask_resolution = yml_conf['mask_resolution']
self.input_shape = yml_conf['image_shape']
self.print_config()
def check_model(self, yml_conf):
"""
Raises:
ValueError: loaded model not in supported model type
"""
for support_model in SUPPORT_MODELS:
if support_model in yml_conf['arch']:
return True
raise ValueError("Unsupported arch: {}, expect {}".format(yml_conf[
'arch'], SUPPORT_MODELS))
def print_config(self):
print('----------- Model Configuration -----------')
print('%s: %s' % ('Model Arch', self.arch))
print('%s: ' % ('Transform Order'))
for op_info in self.preprocess_infos:
print('--%s: %s' % ('transform op', op_info['type']))
print('--------------------------------------------')
def load_predictor(model_dir,
run_mode='fluid',
batch_size=1,
use_gpu=False,
min_subgraph_size=3):
"""set AnalysisConfig, generate AnalysisPredictor
Args:
model_dir (str): root path of __model__ and __params__
use_gpu (bool): whether use gpu
Returns:
predictor (PaddlePredictor): AnalysisPredictor
Raises:
ValueError: predict by TensorRT need use_gpu == True.
"""
if not use_gpu and not run_mode == 'fluid':
raise ValueError(
"Predict by TensorRT mode: {}, expect use_gpu==True, but use_gpu == {}"
.format(run_mode, use_gpu))
if run_mode == 'trt_int8':
raise ValueError("TensorRT int8 mode is not supported now, "
"please use trt_fp32 or trt_fp16 instead.")
config = Config(
os.path.join(model_dir, 'model.pdmodel'),
os.path.join(model_dir, 'model.pdiparams'))
precision_map = {
'trt_int8': Config.Precision.Int8,
'trt_fp32': Config.Precision.Float32,
'trt_fp16': Config.Precision.Half
}
if use_gpu:
# initial GPU memory(M), device ID
config.enable_use_gpu(200, 0)
# optimize graph and fuse op
config.switch_ir_optim(True)
else:
config.disable_gpu()
if run_mode in precision_map.keys():
config.enable_tensorrt_engine(
workspace_size=1 << 10,
max_batch_size=batch_size,
min_subgraph_size=min_subgraph_size,
precision_mode=precision_map[run_mode],
use_static=False,
use_calib_mode=False)
# disable print log when predict
config.disable_glog_info()
# enable shared memory
config.enable_memory_optim()
# disable feed, fetch OP, needed by zero_copy_run
config.switch_use_feed_fetch_ops(False)
predictor = create_predictor(config)
return predictor
def visualize(image_file,
results,
labels,
mask_resolution=14,
output_dir='output/',
threshold=0.5):
# visualize the predict result
im = visualize_box_mask(
image_file,
results,
labels,
mask_resolution=mask_resolution,
threshold=threshold)
img_name = os.path.split(image_file)[-1]
if not os.path.exists(output_dir):
os.makedirs(output_dir)
out_path = os.path.join(output_dir, img_name)
im.save(out_path, quality=95)
print("save result to: " + out_path)
def print_arguments(args):
print('----------- Running Arguments -----------')
for arg, value in sorted(vars(args).items()):
print('%s: %s' % (arg, value))
print('------------------------------------------')
def predict_image(detector):
if FLAGS.run_benchmark:
detector.predict(
FLAGS.image_file,
FLAGS.threshold,
warmup=100,
repeats=100,
run_benchmark=True)
else:
results = detector.predict(FLAGS.image_file, FLAGS.threshold)
visualize(
FLAGS.image_file,
results,
detector.pred_config.labels,
mask_resolution=detector.pred_config.mask_resolution,
output_dir=FLAGS.output_dir,
threshold=FLAGS.threshold)
def predict_video(detector, camera_id):
if camera_id != -1:
capture = cv2.VideoCapture(camera_id)
video_name = 'output.mp4'
else:
capture = cv2.VideoCapture(FLAGS.video_file)
video_name = os.path.split(FLAGS.video_file)[-1]
fps = 30
width = int(capture.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(capture.get(cv2.CAP_PROP_FRAME_HEIGHT))
# yapf: disable
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
# yapf: enable
if not os.path.exists(FLAGS.output_dir):
os.makedirs(FLAGS.output_dir)
out_path = os.path.join(FLAGS.output_dir, video_name)
writer = cv2.VideoWriter(out_path, fourcc, fps, (width, height))
index = 1
while (1):
ret, frame = capture.read()
if not ret:
break
print('detect frame:%d' % (index))
index += 1
results = detector.predict(frame, FLAGS.threshold)
im = visualize_box_mask(
frame,
results,
detector.pred_config.labels,
mask_resolution=detector.pred_config.mask_resolution,
threshold=FLAGS.threshold)
im = np.array(im)
writer.write(im)
if camera_id != -1:
cv2.imshow('Mask Detection', im)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
writer.release()
def main():
pred_config = PredictConfig(FLAGS.model_dir)
detector = Detector(
pred_config,
FLAGS.model_dir,
use_gpu=FLAGS.use_gpu,
run_mode=FLAGS.run_mode)
# predict from image
if FLAGS.image_file != '':
predict_image(detector)
# predict from video file or camera video stream
if FLAGS.video_file != '' or FLAGS.camera_id != -1:
predict_video(detector, FLAGS.camera_id)
if __name__ == '__main__':
paddle.enable_static()
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument(
"--model_dir",
type=str,
default=None,
help=("Directory include:'model.pdiparams', 'model.pdmodel', "
"'infer_cfg.yml', created by tools/export_model.py."),
required=True)
parser.add_argument(
"--image_file", type=str, default='', help="Path of image file.")
parser.add_argument(
"--video_file", type=str, default='', help="Path of video file.")
parser.add_argument(
"--camera_id",
type=int,
default=-1,
help="device id of camera to predict.")
parser.add_argument(
"--run_mode",
type=str,
default='fluid',
help="mode of running(fluid/trt_fp32/trt_fp16)")
parser.add_argument(
"--use_gpu",
type=ast.literal_eval,
default=False,
help="Whether to predict with GPU.")
parser.add_argument(
"--run_benchmark",
type=ast.literal_eval,
default=False,
help="Whether to predict a image_file repeatedly for benchmark")
parser.add_argument(
"--threshold", type=float, default=0.5, help="Threshold of score.")
parser.add_argument(
"--output_dir",
type=str,
default="output",
help="Directory of output visualization files.")
FLAGS = parser.parse_args()
print_arguments(FLAGS)
if FLAGS.image_file != '' and FLAGS.video_file != '':
assert "Cannot predict image and video at the same time"
main()
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from PIL import Image
import cv2
import numpy as np
def decode_image(im_file, im_info):
"""read rgb image
Args:
im_file (str|np.ndarray): input can be image path or np.ndarray
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
if isinstance(im_file, str):
with open(im_file, 'rb') as f:
im_read = f.read()
data = np.frombuffer(im_read, dtype='uint8')
im = cv2.imdecode(data, 1) # BGR mode, but need RGB mode
im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
else:
im = im_file
im_info['im_shape'] = np.array(im.shape[:2], dtype=np.float32)
return im, im_info
class ResizeOp(object):
"""resize image by target_size and max_size
Args:
target_size (int): the target size of image
keep_ratio (bool): whether keep_ratio or not, default true
interp (int): method of resize
"""
def __init__(
self,
target_size,
keep_ratio=True,
interp=cv2.INTER_LINEAR, ):
if isinstance(target_size, int):
target_size = [target_size, target_size]
self.target_size = target_size
self.keep_ratio = keep_ratio
self.interp = interp
def __call__(self, im, im_info):
"""
Args:
im (np.ndarray): image (np.ndarray)
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
im_channel = im.shape[2]
im_scale_y, im_scale_x = self.generate_scale(im)
im = cv2.resize(
im,
None,
None,
fx=im_scale_x,
fy=im_scale_y,
interpolation=self.interp)
im_info['im_shape'] = np.array(im.shape[:2]).astype('float32')
im_info['scale_factor'] = np.array(
[im_scale_y, im_scale_x]).astype('float32')
# padding im when image_shape fixed by infer_cfg.yml
if self.keep_ratio and im_info['input_shape'][1] is not None:
max_size = im_info['input_shape'][1]
padding_im = np.zeros(
(max_size, max_size, im_channel), dtype=np.float32)
im_h, im_w = im.shape[:2]
padding_im[:im_h, :im_w, :] = im
im = padding_im
return im, im_info
def generate_scale(self, im):
"""
Args:
im (np.ndarray): image (np.ndarray)
Returns:
im_scale_x: the resize ratio of X
im_scale_y: the resize ratio of Y
"""
origin_shape = im.shape[:2]
im_c = im.shape[2]
if self.keep_ratio:
im_size_min = np.min(origin_shape)
im_size_max = np.max(origin_shape)
target_size_min = np.min(self.target_size)
target_size_max = np.max(self.target_size)
im_scale = float(target_size_min) / float(im_size_min)
if np.round(im_scale * im_size_max) > target_size_max:
im_scale = float(target_size_max) / float(im_size_max)
im_scale_x = im_scale
im_scale_y = im_scale
else:
resize_h, resize_w = self.target_size
im_scale_y = resize_h / float(origin_shape[0])
im_scale_x = resize_w / float(origin_shape[1])
return im_scale_y, im_scale_x
class NormalizeImageOp(object):
"""normalize image
Args:
mean (list): im - mean
std (list): im / std
is_scale (bool): whether to scale the image by 1/255
is_channel_first (bool): if True: image shape is CHW, else: HWC
"""
def __init__(self, mean, std, is_scale=True):
self.mean = mean
self.std = std
self.is_scale = is_scale
def __call__(self, im, im_info):
"""
Args:
im (np.ndarray): image (np.ndarray)
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
im = im.astype(np.float32, copy=False)
mean = np.array(self.mean)[np.newaxis, np.newaxis, :]
std = np.array(self.std)[np.newaxis, np.newaxis, :]
if self.is_scale:
im = im / 255.0
im -= mean
im /= std
return im, im_info
class PermuteOp(object):
"""permute image
Args:
to_bgr (bool): whether convert RGB to BGR
channel_first (bool): whether convert HWC to CHW
"""
def __init__(self, ):
super(PermuteOp, self).__init__()
def __call__(self, im, im_info):
"""
Args:
im (np.ndarray): image (np.ndarray)
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
im = im.transpose((2, 0, 1)).copy()
return im, im_info
class PadStride(object):
""" padding image for model with FPN
Args:
stride (int): models with FPN need image shape % stride == 0
"""
def __init__(self, stride=0):
self.coarsest_stride = stride
def __call__(self, im, im_info):
"""
Args:
im (np.ndarray): image (np.ndarray)
im_info (dict): info of image
Returns:
im (np.ndarray): processed image (np.ndarray)
im_info (dict): info of processed image
"""
coarsest_stride = self.coarsest_stride
if coarsest_stride <= 0:
return im, im_info
im_c, im_h, im_w = im.shape
pad_h = int(np.ceil(float(im_h) / coarsest_stride) * coarsest_stride)
pad_w = int(np.ceil(float(im_w) / coarsest_stride) * coarsest_stride)
padding_im = np.zeros((im_c, pad_h, pad_w), dtype=np.float32)
padding_im[:, :im_h, :im_w] = im
return padding_im, im_info
def preprocess(im, preprocess_ops, input_shape):
# process image by preprocess_ops
im_info = {
'scale_factor': np.array(
[1., 1.], dtype=np.float32),
'im_shape': None,
'input_shape': input_shape,
}
im, im_info = decode_image(im, im_info)
for operator in preprocess_ops:
im, im_info = operator(im, im_info)
return im, im_info
# coding: utf-8
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import division
import cv2
import numpy as np
from PIL import Image, ImageDraw
from scipy import ndimage
def visualize_box_mask(im, results, labels, mask_resolution=14, threshold=0.5):
"""
Args:
im (str/np.ndarray): path of image/np.ndarray read by cv2
results (dict): include 'boxes': np.ndarray: shape:[N,6], N: number of box,
matrix element:[class, score, x_min, y_min, x_max, y_max]
MaskRCNN's results include 'masks': np.ndarray:
shape:[N, class_num, mask_resolution, mask_resolution]
labels (list): labels:['class1', ..., 'classn']
mask_resolution (int): shape of a mask is:[mask_resolution, mask_resolution]
threshold (float): Threshold of score.
Returns:
im (PIL.Image.Image): visualized image
"""
if isinstance(im, str):
im = Image.open(im).convert('RGB')
else:
im = Image.fromarray(im)
if 'masks' in results and 'boxes' in results:
im = draw_mask(
im,
results['boxes'],
results['masks'],
labels,
resolution=mask_resolution)
if 'boxes' in results:
im = draw_box(im, results['boxes'], labels)
if 'segm' in results:
im = draw_segm(
im,
results['segm'],
results['label'],
results['score'],
labels,
threshold=threshold)
return im
def get_color_map_list(num_classes):
"""
Args:
        num_classes (int): number of classes
Returns:
color_map (list): RGB color list
"""
color_map = num_classes * [0, 0, 0]
for i in range(0, num_classes):
j = 0
lab = i
while lab:
color_map[i * 3] |= (((lab >> 0) & 1) << (7 - j))
color_map[i * 3 + 1] |= (((lab >> 1) & 1) << (7 - j))
color_map[i * 3 + 2] |= (((lab >> 2) & 1) << (7 - j))
j += 1
lab >>= 3
color_map = [color_map[i:i + 3] for i in range(0, len(color_map), 3)]
return color_map
def expand_boxes(boxes, scale=0.0):
"""
Args:
boxes (np.ndarray): shape:[N,4], N:number of box,
                        matrix element: [x_min, y_min, x_max, y_max]
scale (float): scale of boxes
Returns:
boxes_exp (np.ndarray): expanded boxes
"""
w_half = (boxes[:, 2] - boxes[:, 0]) * .5
h_half = (boxes[:, 3] - boxes[:, 1]) * .5
x_c = (boxes[:, 2] + boxes[:, 0]) * .5
y_c = (boxes[:, 3] + boxes[:, 1]) * .5
w_half *= scale
h_half *= scale
boxes_exp = np.zeros(boxes.shape)
boxes_exp[:, 0] = x_c - w_half
boxes_exp[:, 2] = x_c + w_half
boxes_exp[:, 1] = y_c - h_half
boxes_exp[:, 3] = y_c + h_half
return boxes_exp
def draw_mask(im, np_boxes, np_masks, labels, resolution=14, threshold=0.5):
"""
Args:
im (PIL.Image.Image): PIL image
np_boxes (np.ndarray): shape:[N,6], N: number of box,
                        matrix element: [class, score, x_min, y_min, x_max, y_max]
np_masks (np.ndarray): shape:[N, class_num, resolution, resolution]
labels (list): labels:['class1', ..., 'classn']
resolution (int): shape of a mask is:[resolution, resolution]
threshold (float): threshold of mask
Returns:
im (PIL.Image.Image): visualized image
"""
color_list = get_color_map_list(len(labels))
scale = (resolution + 2.0) / resolution
im_w, im_h = im.size
w_ratio = 0.4
alpha = 0.7
im = np.array(im).astype('float32')
rects = np_boxes[:, 2:]
expand_rects = expand_boxes(rects, scale)
expand_rects = expand_rects.astype(np.int32)
clsid_scores = np_boxes[:, 0:2]
padded_mask = np.zeros((resolution + 2, resolution + 2), dtype=np.float32)
clsid2color = {}
for idx in range(len(np_boxes)):
clsid, score = clsid_scores[idx].tolist()
clsid = int(clsid)
xmin, ymin, xmax, ymax = expand_rects[idx].tolist()
w = xmax - xmin + 1
h = ymax - ymin + 1
w = np.maximum(w, 1)
h = np.maximum(h, 1)
padded_mask[1:-1, 1:-1] = np_masks[idx, int(clsid), :, :]
resized_mask = cv2.resize(padded_mask, (w, h))
resized_mask = np.array(resized_mask > threshold, dtype=np.uint8)
x0 = min(max(xmin, 0), im_w)
x1 = min(max(xmax + 1, 0), im_w)
y0 = min(max(ymin, 0), im_h)
y1 = min(max(ymax + 1, 0), im_h)
im_mask = np.zeros((im_h, im_w), dtype=np.uint8)
im_mask[y0:y1, x0:x1] = resized_mask[(y0 - ymin):(y1 - ymin), (
x0 - xmin):(x1 - xmin)]
if clsid not in clsid2color:
clsid2color[clsid] = color_list[clsid]
color_mask = clsid2color[clsid]
for c in range(3):
color_mask[c] = color_mask[c] * (1 - w_ratio) + w_ratio * 255
idx = np.nonzero(im_mask)
color_mask = np.array(color_mask)
im[idx[0], idx[1], :] *= 1.0 - alpha
im[idx[0], idx[1], :] += alpha * color_mask
return Image.fromarray(im.astype('uint8'))
def draw_box(im, np_boxes, labels):
"""
Args:
im (PIL.Image.Image): PIL image
np_boxes (np.ndarray): shape:[N,6], N: number of box,
                        matrix element: [class, score, x_min, y_min, x_max, y_max]
labels (list): labels:['class1', ..., 'classn']
Returns:
im (PIL.Image.Image): visualized image
"""
draw_thickness = min(im.size) // 320
draw = ImageDraw.Draw(im)
clsid2color = {}
color_list = get_color_map_list(len(labels))
for dt in np_boxes:
clsid, bbox, score = int(dt[0]), dt[2:], dt[1]
xmin, ymin, xmax, ymax = bbox
w = xmax - xmin
h = ymax - ymin
if clsid not in clsid2color:
clsid2color[clsid] = color_list[clsid]
color = tuple(clsid2color[clsid])
# draw bbox
draw.line(
[(xmin, ymin), (xmin, ymax), (xmax, ymax), (xmax, ymin),
(xmin, ymin)],
width=draw_thickness,
fill=color)
# draw label
text = "{} {:.4f}".format(labels[clsid], score)
tw, th = draw.textsize(text)
draw.rectangle(
[(xmin + 1, ymin - th), (xmin + tw + 1, ymin)], fill=color)
draw.text((xmin + 1, ymin - th), text, fill=(255, 255, 255))
return im
def draw_segm(im,
np_segms,
np_label,
np_score,
labels,
threshold=0.5,
alpha=0.7):
"""
Draw segmentation on image
"""
mask_color_id = 0
w_ratio = .4
color_list = get_color_map_list(len(labels))
im = np.array(im).astype('float32')
clsid2color = {}
np_segms = np_segms.astype(np.uint8)
for i in range(np_segms.shape[0]):
mask, score, clsid = np_segms[i], np_score[i], np_label[i] + 1
if score < threshold:
continue
if clsid not in clsid2color:
clsid2color[clsid] = color_list[clsid]
color_mask = clsid2color[clsid]
for c in range(3):
color_mask[c] = color_mask[c] * (1 - w_ratio) + w_ratio * 255
idx = np.nonzero(mask)
color_mask = np.array(color_mask)
im[idx[0], idx[1], :] *= 1.0 - alpha
im[idx[0], idx[1], :] += alpha * color_mask
sum_x = np.sum(mask, axis=0)
x = np.where(sum_x > 0.5)[0]
sum_y = np.sum(mask, axis=1)
y = np.where(sum_y > 0.5)[0]
x0, x1, y0, y1 = x[0], x[-1], y[0], y[-1]
cv2.rectangle(im, (x0, y0), (x1, y1),
tuple(color_mask.astype('int32').tolist()), 1)
bbox_text = '%s %.2f' % (labels[clsid], score)
t_size = cv2.getTextSize(bbox_text, 0, 0.3, thickness=1)[0]
cv2.rectangle(im, (x0, y0), (x0 + t_size[0], y0 - t_size[1] - 3),
tuple(color_mask.astype('int32').tolist()), -1)
cv2.putText(
im,
bbox_text, (x0, y0 - 2),
cv2.FONT_HERSHEY_SIMPLEX,
0.3, (0, 0, 0),
1,
lineType=cv2.LINE_AA)
return Image.fromarray(im.astype('uint8'))
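# Illustrative usage sketch (comment only, not part of the original file);
# box rows follow [class, score, x_min, y_min, x_max, y_max] as documented
# above:
#
#   import numpy as np
#   boxes = np.array([[0., 0.95, 50., 60., 200., 300.]], dtype=np.float32)
#   out = visualize_box_mask('demo.jpg', {'boxes': boxes}, labels=['person'])
#   out.save('demo_vis.jpg')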
# Model Zoo and Baselines
## Test Environment
- Python 3.7
- PaddlePaddle daily build
- CUDA 9.0
- cuDNN >= 7.4
- NCCL 2.1.2
## Common Settings
- All models are trained and tested on the COCO17 dataset.
- Unless otherwise noted, all ResNet backbones use the [ResNet-B](https://arxiv.org/pdf/1812.01187) variant.
- For the RCNN and RetinaNet series, only horizontal flipping is used as data augmentation during training; no augmentation is used at test time.
- **Inference time (fps)**: measured with `tools/eval.py` over the whole validation set on a single Tesla V100 GPU, reported in fps (images/second) with cuDNN 7.5 and batch size 1, including data loading, the network forward pass, and post-processing.
## Training Schedules
- We use the same training schedules as [Detectron](https://github.com/facebookresearch/Detectron/blob/master/MODEL_ZOO.md#training-schedules); see the learning-rate sketch below.
- The 1x schedule means: with a total batch size of 8, the initial learning rate is 0.01, decayed by a factor of 10 after epoch 8 and again after epoch 11, for 12 epochs of training in total.
- The 2x schedule doubles the 1x schedule, with the learning rate decay points also doubled.
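The 1x schedule above is piecewise constant; the following minimal sketch (our illustration, not code from this repository) computes its learning rate at a given epoch from the constants just described:

```python
def lr_at_epoch(epoch, base_lr=0.01, decay_epochs=(8, 11), gamma=0.1):
    """Piecewise-constant learning rate of the 1x schedule."""
    lr = base_lr
    for decay_epoch in decay_epochs:
        if epoch >= decay_epoch:
            lr *= gamma  # decay by 10x at each milestone
    return lr

for epoch in (0, 8, 11):
    print(epoch, round(lr_at_epoch(epoch), 6))  # 0.01, 0.001, 0.0001
```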
## ImageNet Pretrained Models
Paddle provides backbone models pretrained on ImageNet. All pretrained models were trained on the standard ImageNet-1k dataset. [Download links](https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/image_classification#supported-models-and-performances)
- Note: the ResNet50 model was trained with a cosine learning rate schedule. [ResNet50 download link](https://paddle-imagenet-models-name.bj.bcebos.com/ResNet18_pretrained.tar),
[ResNet50_vd download link](https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_vd_pretrained.tar)
## Baselines
### Faster & Mask R-CNN
| Backbone | Model | Images/GPU | Lr schd | Inf time (fps) | Box AP | Mask AP | Download | Config |
| :------------------- | :------------- | :-----: | :-----: | :------------: | :-----: | :-----: | :-----------------------------------------------------: | :-----: |
| ResNet50 | Faster | 1 | 1x | ---- | 35.1 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/dygraph/faster_rcnn_r50_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/dygraph/configs/faster_rcnn_r50_1x_coco.yml) |
| ResNet50-FPN | Faster | 1 | 1x | ---- | 37.0 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/dygraph/faster_rcnn_r50_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/dygraph/configs/faster_rcnn_r50_fpn_1x_coco.yml) |
| ResNet50 | Mask | 1 | 1x | ---- | 36.4 | 31.9 | [model](https://paddlemodels.bj.bcebos.com/object_detection/dygraph/mask_rcnn_r50_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/dygraph/configs/mask_rcnn_r50_1x_coco.yml) |
| ResNet50-FPN | Mask | 1 | 1x | ---- | 38.3 | 34.5 | [model](https://paddlemodels.bj.bcebos.com/object_detection/dygraph/mask_rcnn_r50_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/dygraph/configs/mask_rcnn_r50_fpn_1x_coco.yml) |
| ResNet50-FPN | Cascade Faster | 1 | 1x | ---- | 41.1 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/dygraph/cascade_rcnn_r50_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/dygraph/configs/cascade_faster_rcnn_r50_fpn_1x_coco.yml) |
| ResNet50-FPN | Cascade Mask | 1 | 1x | ---- | 41.6 | 35.3 | [model](https://paddlemodels.bj.bcebos.com/object_detection/dygraph/cascade_mask_rcnn_r50_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/dygraph/configs/cascade_mask_rcnn_r50_fpn_1x_coco.yml) |
| DarkNet53 | YOLOv3 | 1 | 270e | ---- | 39.0 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/dygraph/yolov3_darknet53_270e_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/dygraph/configs/yolov3_darknet53_270e_coco.yml) |
### SSD on Pascal VOC
| Backbone | Model | Images/GPU | Lr schd | Inf time (fps) | Box AP | Download | Config |
| :-------------- | :------------- | :-----: | :-----: | :------------: | :-----: | :-----------------------------------------------------: | :-----: |
| VGG | SSD | 8 | 240e | ---- | 78.2 | [model](https://paddlemodels.bj.bcebos.com/object_detection/dygraph/ssd_vgg16_300_240e_voc.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/dygraph/configs/ssd_vgg16_300_240e_voc.yml) |
**Note:** SSD is trained on 4 GPUs for 240 epochs.
# Getting Started
## Installation
The `dygraph` branch requires a daily build of PaddlePaddle. Commit `c0a991c8740b413559bfc894aa5ae1d5ed3704b5` in PaddlePaddle affects accuracy; installing a build from before that commit is recommended.
## Data Preparation
Please prepare the training data following [How to Prepare Training Data](PrepareDataSet.md).
Once the data is ready, set the data paths in the dataset config file `configs/_base_/datasets/coco.yml`.
## Training / Evaluation / Inference
PaddleDetection provides training, evaluation, and inference scripts, with optional arguments for selecting specific functionality.
#### Training
```bash
# GPU training supports both single-GPU and multi-GPU modes; select GPUs via CUDA_VISIBLE_DEVICES
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
python -m paddle.distributed.launch --selected_gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/faster_rcnn_r50_fpn_1x_coco.yml
```
#### Evaluation
```bash
# evaluate on a single GPU
CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/faster_rcnn_r50_fpn_1x_coco.yml
```
#### Inference
```bash
# inference
CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/faster_rcnn_r50_fpn_1x_coco.yml --infer_img=demo/000000570688.jpg
```
# Installation Guide
---
## Contents
- [Introduction](#introduction)
- [Installing PaddlePaddle](#installing-paddlepaddle)
- [Installing Other Dependencies](#installing-other-dependencies)
- [PaddleDetection](#paddledetection)
## Introduction
This document describes how to install PaddleDetection and its dependencies (including PaddlePaddle).
For more information about PaddleDetection, please refer to the [README.md](https://github.com/PaddlePaddle/PaddleDetection/blob/master/README.md).
## Installing PaddlePaddle
**Requirements:**
- 64-bit operating system
- Python 3 (3.5.1+/3.6/3.7), 64-bit
- pip/pip3 (9.0.1+), 64-bit
- CUDA >= 9.0
- cuDNN >= 7.6
For multi-GPU training, please install NCCL first.
## Installing Other Dependencies
[COCO-API](https://github.com/cocodataset/cocoapi):
COCO-API is required at runtime; install it as follows:
# install pycocotools
pip install pycocotools
**Installing COCO-API on Windows:**
# install Cython first if it is not already installed
pip install Cython
# the upstream cocoapi does not support Windows, so a third-party port is used; it supports Python 3 only
pip install git+https://github.com/philferriere/cocoapi.git#subdirectory=PythonAPI
## PaddleDetection
**Install Python dependencies:**
The Python dependencies are listed in [requirements.txt](https://github.com/PaddlePaddle/PaddleDetection/blob/master/requirements.txt) and can be installed with:
```
pip install -r requirements.txt
```
**Note: `llvmlite` must be pinned to version `0.33` and `numba` to version `0.50`.**
**Clone the PaddleDetection repository:**
You can clone PaddleDetection with:
```
cd <path/to/clone/PaddleDetection>
git clone https://github.com/PaddlePaddle/PaddleDetection.git
```
Alternatively, clone from [https://gitee.com/paddlepaddle/PaddleDetection](https://gitee.com/paddlepaddle/PaddleDetection):
```
cd <path/to/clone/PaddleDetection>
git clone https://gitee.com/paddlepaddle/PaddleDetection
```
# How to Prepare Training Data
## Contents
- [Object Detection Data Overview](#object-detection-data-overview)
- [Preparing Training Data](#preparing-training-data)
- [VOC Data](#voc-data)
- [Downloading the VOC Dataset](#downloading-the-voc-dataset)
- [VOC Annotation Files](#voc-annotation-files)
- [COCO Data](#coco-data)
- [Downloading the COCO Dataset](#downloading-the-coco-dataset)
- [COCO Annotation Files](#coco-annotation-files)
- [Custom Data](#custom-data)
- [Converting Custom Data to VOC Format](#converting-custom-data-to-voc-format)
- [Converting Custom Data to COCO Format](#converting-custom-data-to-coco-format)
- [Custom Data with a Custom Reader](#custom-data-with-a-custom-reader)
- [Custom Data Conversion Example](#custom-data-conversion-example)
### Object Detection Data Overview
Object detection data is more complex than classification data: for each image, every object region must be annotated with both its position and its category.
An object's position is usually given as a rectangular box, commonly expressed in one of these three ways:
| Format | Description |
| :----------------: | :--------------------------------: |
| x1,y1,x2,y2 | (x1,y1) is the top-left corner, (x2,y2) the bottom-right corner |
| x,y,w,h | (x,y) is the top-left corner, w the region width, h the region height |
| xc,yc,w,h | (xc,yc) is the region center, w the region width, h the region height |
Common detection datasets such as Pascal VOC and COCO use the first format, `x1,y1,x2,y2`, for an object's bounding box; the formats can be interconverted with the sketch below.
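As a minimal illustration (our sketch; the helper names are not from this repository), converting between the three representations is simple arithmetic:

```python
def xyxy_to_xywh(x1, y1, x2, y2):
    # corner pair -> top-left corner plus width/height
    return x1, y1, x2 - x1, y2 - y1

def xywh_to_cxcywh(x, y, w, h):
    # top-left corner plus size -> center plus size
    return x + w / 2.0, y + h / 2.0, w, h

print(xyxy_to_xywh(10, 20, 110, 220))    # (10, 20, 100, 200)
print(xywh_to_cxcywh(10, 20, 100, 200))  # (60.0, 120.0, 100, 200)
```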
### Preparing Training Data
PaddleDetection supports the [COCO](http://cocodataset.org), [Pascal VOC](http://host.robots.ox.ac.uk/pascal/VOC/), and [WIDER-FACE](http://shuoyang1213.me/WIDERFACE/) data sources by default.
Custom data sources are also supported, by:
(1) converting custom data to VOC format;
(2) converting custom data to COCO format;
(3) adding a new data source with a custom reader.
First, change into the `PaddleDetection` root directory:
```
cd PaddleDetection/
ppdet_root=$(pwd)
```
#### VOC Data
VOC data is the data used in the [Pascal VOC](http://host.robots.ox.ac.uk/pascal/VOC/) challenge. The Pascal VOC challenge covers not only image classification but also object detection, image segmentation, and other tasks, so its annotation files contain annotations for multiple tasks.
For user-defined data in VOC format, optional fields in the xml files may be annotated or left at their defaults as appropriate.
##### Downloading the VOC Dataset
- Download the VOC dataset automatically via script:
```
# download the VOC dataset automatically
python dataset/voc/download_voc.py
```
After the script finishes, the VOC dataset is organized as follows:
```
>>cd dataset/voc/
>>tree
├── create_list.py
├── download_voc.py
├── generic_det_label_list.txt
├── generic_det_label_list_zh.txt
├── label_list.txt
├── VOCdevkit/VOC2007
│ ├── annotations
│ ├── 001789.xml
│ | ...
│ ├── JPEGImages
│ ├── 001789.jpg
│ | ...
│ ├── ImageSets
│ | ...
├── VOCdevkit/VOC2012
│ ├── Annotations
│ ├── 2011_003876.xml
│ | ...
│ ├── JPEGImages
│ ├── 2011_003876.jpg
│ | ...
│ ├── ImageSets
│ | ...
| ...
```
Description of each file:
```
# label_list.txt is the list of class names; the file name must be label_list.txt.
# When using the VOC dataset with use_default_label set to true in the config file, this file is not needed.
>>cat label_list.txt
aeroplane
bicycle
...
# trainval.txt is the list of training files
>>cat trainval.txt
VOCdevkit/VOC2007/JPEGImages/007276.jpg VOCdevkit/VOC2007/Annotations/007276.xml
VOCdevkit/VOC2012/JPEGImages/2011_002612.jpg VOCdevkit/VOC2012/Annotations/2011_002612.xml
...
# test.txt is the list of test files
>>cat test.txt
VOCdevkit/VOC2007/JPEGImages/000001.jpg VOCdevkit/VOC2007/Annotations/000001.xml
...
```
- Already have the VOC dataset downloaded?
Simply organize the files following the structure above.
##### VOC Annotation Files
In VOC data, each image has a same-named xml file that records the coordinates, categories, and other information of the object boxes. For example, for the image `2007_002055.jpg`:
![](../images/2007_002055.jpg)
The corresponding xml file contains basic information about the image, such as its file name, source, and size, along with the object regions and their categories.
An xml file contains the following fields (a parsing sketch follows the table below):
- filename: the image name.
- size: the image size, consisting of image width, image height, and image depth.
```
<size>
<width>500</width>
<height>375</height>
<depth>3</depth>
</size>
```
- object fields, one per object, consisting of:
| Tag | Description |
| :--------: | :-----------: |
| name | object class name |
| pose | pose description of the object (optional) |
| truncated | mark as `truncated` if more than 15-20% of the object is occluded and lies outside the bounding box (optional) |
| difficult | objects that are hard to recognize are marked `difficult` (optional) |
| bndbox sub-tags | (xmin,ymin): top-left corner; (xmax,ymax): bottom-right corner |
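The following minimal sketch (ours, using only the Python standard library) reads these fields from one VOC annotation file:

```python
import xml.etree.ElementTree as ET

def parse_voc_xml(xml_path):
    """Read filename, size, and object boxes from a VOC xml file."""
    root = ET.parse(xml_path).getroot()
    size = root.find('size')
    record = {
        'filename': root.findtext('filename'),
        'width': int(size.findtext('width')),
        'height': int(size.findtext('height')),
        'objects': [],
    }
    for obj in root.findall('object'):
        box = obj.find('bndbox')
        record['objects'].append({
            'name': obj.findtext('name'),
            'bbox': [int(float(box.findtext(tag)))
                     for tag in ('xmin', 'ymin', 'xmax', 'ymax')],
        })
    return record

# e.g. parse_voc_xml('VOCdevkit/VOC2007/annotations/001789.xml')
```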
#### COCO Data
COCO data is the data used in the [COCO](http://cocodataset.org) challenge. Likewise, the COCO challenge covers multiple tasks, so its annotation files contain annotations for multiple tasks.
For user-defined data in COCO format, some fields in the json files may be annotated or left at their defaults as appropriate.
##### Downloading the COCO Dataset
- Download the COCO dataset automatically via script:
```
# download the COCO dataset automatically
python dataset/coco/download_coco.py
```
After the script finishes, the COCO dataset is organized as follows:
```
>>cd dataset/coco/
>>tree
├── annotations
│ ├── instances_train2017.json
│ ├── instances_val2017.json
│ | ...
├── train2017
│ ├── 000000000009.jpg
│ ├── 000000580008.jpg
│ | ...
├── val2017
│ ├── 000000000139.jpg
│ ├── 000000000285.jpg
│ | ...
| ...
```
- Already have the COCO dataset downloaded?
Simply organize the files following the structure above.
##### COCO Annotation Files
COCO stores the annotations of all training images in a single json file, organized as nested dictionaries.
The json file contains the following keys:
- info: information about the annotation file.
- licenses: licenses of the annotation file.
- images: the list of image records; each element describes one image. An example image record:
```
{
'license': 3, # license
'file_name': '000000391895.jpg', # file_name
# coco_url
'coco_url': 'http://images.cocodataset.org/train2017/000000391895.jpg',
'height': 360, # image height
'width': 640, # image width
'date_captured': '2013-11-14 11:18:45', # date_captured
# flickr_url
'flickr_url': 'http://farm9.staticflickr.com/8186/8119368305_4e622c8349_z.jpg',
'id': 391895 # image id
}
```
- annotations: the list of object annotation records; each element describes one annotated object. An example object record:
```
{
    'segmentation':               # segmentation annotation of the object
    'area': 2765.1486500000005,   # area of the object region
    'iscrowd': 0,                 # iscrowd
    'image_id': 558840,           # image id
    'bbox': [199.84, 200.46, 77.71, 70.88], # bbox
    'category_id': 58,            # category_id
    'id': 156                     # annotation id
}
```
```
# inspect a COCO annotation file
import json
coco_anno = json.load(open('./annotations/instances_train2017.json'))
# coco_anno.keys
print('\nkeys:', coco_anno.keys())
# inspect the category info
print('\nobject categories:', coco_anno['categories'])
# number of images
print('\nnumber of images:', len(coco_anno['images']))
# number of annotated objects
print('\nnumber of annotated objects:', len(coco_anno['annotations']))
# inspect one object annotation record
print('\none object annotation record:', coco_anno['annotations'][0])
```
COCO data is prepared as follows.
The initial layout of `dataset/coco/` is:
```
>>cd dataset/coco/
>>tree
├── download_coco.py
```
#### Custom Data
There are three ways to handle custom data:
(1) convert it to VOC format (only the fields required for detection need to be annotated);
(2) convert it to COCO format (only the fields required for detection need to be annotated);
(3) write a custom reader for it (for more complex data).
##### Converting Custom Data to VOC Format
After conversion to VOC format, a custom dataset is laid out as follows (avoid non-ASCII characters in path and file names, since encoding issues can cause errors):
```
dataset/xxx/
├── annotations
│ ├── xxx1.xml
│ ├── xxx2.xml
│ ├── xxx3.xml
│ | ...
├── images
│ ├── xxx1.jpg
│ ├── xxx2.jpg
│ ├── xxx3.jpg
│ | ...
├── label_list.txt (required; the file name must be label_list.txt)
├── train.txt (training file list, e.g. ./images/xxx1.jpg ./annotations/xxx1.xml)
└── valid.txt (validation file list)
```
Description of each file:
```
# label_list.txt is the list of class names; the file name must be exactly this
>>cat label_list.txt
classname1
classname2
...
# train.txt is the training file list
>>cat train.txt
./images/xxx1.jpg ./annotations/xxx1.xml
./images/xxx2.jpg ./annotations/xxx2.xml
...
# valid.txt is the validation file list
>>cat valid.txt
./images/xxx3.jpg ./annotations/xxx3.xml
...
```
##### Converting Custom Data to COCO Format
`./tools/` provides `x2coco.py` to convert VOC datasets, labelme-annotated datasets, or cityscapes datasets into COCO format, for example:
(1) converting labelme data to COCO:
```bash
python tools/x2coco.py \
--dataset_type labelme \
--json_input_dir ./labelme_annos/ \
--image_input_dir ./labelme_imgs/ \
--output_dir ./cocome/ \
--train_proportion 0.8 \
--val_proportion 0.2 \
--test_proportion 0.0
```
(2) converting VOC data to COCO:
```bash
python tools/x2coco.py \
--dataset_type voc \
--voc_anno_dir path/to/VOCdevkit/VOC2007/Annotations/ \
--voc_anno_list path/to/VOCdevkit/VOC2007/ImageSets/Main/trainval.txt \
--voc_label_list dataset/voc/label_list.txt \
--voc_out_name voc_train.json
```
After conversion to COCO format, a custom dataset is laid out as follows (avoid non-ASCII characters in path and file names, since encoding issues can cause errors):
```
dataset/xxx/
├── annotations
│ ├── train.json # COCO-format annotation file
│ ├── valid.json # COCO-format annotation file
├── images
│ ├── xxx1.jpg
│ ├── xxx2.jpg
│ ├── xxx3.jpg
│ | ...
...
```
##### Custom Data with a Custom Reader
If a new data source needs to be added to PaddleDetection, you can follow the [add a new data source](../advanced_tutorials/READER.md#添加新数据源) section of the data-pipeline documentation to implement support for it; the data-pipeline code itself is explained in the [reader documentation](../advanced_tutorials/READER.md).
#### Custom Data Conversion Example
Let us take the [Kaggle dataset](https://www.kaggle.com/andrewmvd/road-sign-detection) competition data as an example of preparing custom data.
The Kaggle [road-sign-detection](https://www.kaggle.com/andrewmvd/road-sign-detection) competition data contains 877 images in 4 classes: crosswalk, speedlimit, stop, trafficlight.
It can be downloaded from Kaggle, or from this [download link](https://paddlemodels.bj.bcebos.com/object_detection/roadsign.zip).
Sample image from the road-sign dataset:
![](../images/road554.png)
```
# download and extract the data
>>cd ${ppdet_root}/dataset
# after downloading and extracting the kaggle dataset, the file layout is:
├── annotations
│ ├── road0.xml
│ ├── road1.xml
│ ├── road10.xml
│ | ...
├── images
│ ├── road0.png
│ ├── road1.png
│ ├── road2.png
│ | ...
Split the data into training and test sets:
```
# generate the label_list.txt file
>>echo -e "speedlimit\ncrosswalk\ntrafficlight\nstop" > label_list.txt
# generate the train.txt, valid.txt and test.txt list files
>>ls images/*.png | shuf > all_image_list.txt
>>awk -F"/" '{print $2}' all_image_list.txt | awk -F".png" '{print $1}' | awk -F"\t" '{print "images/"$1".png annotations/"$1".xml"}' > all_list.txt
# the train/valid/test split is roughly 80%/10%/10%
>>head -n 88 all_list.txt > test.txt
>>head -n 176 all_list.txt | tail -n 88 > valid.txt
>>tail -n 701 all_list.txt > train.txt
# remove the intermediate files
>>rm -rf all_image_list.txt all_list.txt
# the final dataset layout is:
├── annotations
│ ├── road0.xml
│ ├── road1.xml
│ ├── road10.xml
│ | ...
├── images
│ ├── road0.png
│ ├── road1.png
│ ├── road2.png
│ | ...
├── label_list.txt
├── test.txt
├── train.txt
└── valid.txt
# label_list.txt is the list of class names; the file name must be label_list.txt
>>cat label_list.txt
crosswalk
speedlimit
stop
trafficlight
# train.txt is the training file list; each line is an image path and its annotation path, separated by a space. Note: paths are relative to the dataset folder.
>>cat train.txt
./images/road839.png ./annotations/road839.xml
./images/road363.png ./annotations/road363.xml
...
# valid.txt is the validation file list; each line is an image path and its annotation path, separated by a space. Note: paths are relative to the dataset folder.
>>cat valid.txt
./images/road218.png ./annotations/road218.xml
./images/road681.png ./annotations/road681.xml
```
Alternatively, download the prepared data from this [download link](https://paddlemodels.bj.bcebos.com/object_detection/roadsign_voc.zip) and extract it into the `dataset/roadsign_voc/` folder.
Once the data is ready, it is good practice to get familiar with it: the number of images, image sizes, the number of object regions per class, object region sizes, and so on; clean the data if necessary. A small counting sketch follows the table below.
Road-sign dataset statistics:
| Split | Number of images |
| :--------: | :-----------: |
| train | 701 |
| valid | 176 |
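As a quick way to gather such statistics, here is a minimal sketch (ours, standard library only) that counts annotated objects per class from a `train.txt`-style list of `image xml` pairs, as described above:

```python
import xml.etree.ElementTree as ET
from collections import Counter

def count_objects_per_class(list_file):
    # each line: <image path> <xml path>, separated by a space
    counts = Counter()
    with open(list_file) as f:
        for line in f:
            parts = line.split()
            if len(parts) != 2:
                continue
            _, xml_path = parts
            root = ET.parse(xml_path).getroot()
            counts.update(obj.findtext('name')
                          for obj in root.findall('object'))
    return counts

print(count_objects_per_class('train.txt'))
```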
**Notes:**
(1) For custom data, check it carefully before training; incorrectly formatted annotations or incomplete image files can crash the training process.
(2) If the images are very large and the input size is not limited, they consume a lot of memory and can cause out-of-memory errors; set batch_size sensibly, starting small and increasing gradually.
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from . import vgg
from . import resnet
from . import darknet
from .vgg import *
from .resnet import *
from .darknet import *