[cherry-pick] add ppyolo readme en (#1122)

* add PP-YOLO README en

de6ca3c0 · Kaipeng Deng · GitHub · 80f8c262 · de6ca3c0 · de6ca3c0
5 changed files
--- a/README.md
+++ b/README.md
-简体中文 | [English](README_en.md)
+English | [简体中文](README_cn.md)

-文档：[https://paddledetection.readthedocs.io](https://paddledetection.readthedocs.io)
+Documentation: [https://paddledetection.readthedocs.io](https://paddledetection.readthedocs.io)

 # PaddleDetection

-飞桨推出的PaddleDetection是端到端目标检测开发套件，旨在帮助开发者更快更好地完成检测模型的训练、精度速度优化到部署全流程。PaddleDetection以模块化的设计实现了多种主流目标检测算法，并且提供了丰富的数据增强、网络组件、损失函数等模块，集成了模型压缩和跨平台高性能部署能力。目前基于PaddleDetection已经完成落地的项目涉及工业质检、遥感图像检测、无人巡检等多个领域。
+PaddleDetection is an end-to-end object detection development kit based on PaddlePaddle. It
+aims to help developers through the whole workflow of training models, optimizing accuracy and
+inference speed, and deploying models. PaddleDetection implements a variety of mainstream object
+detection architectures in a modular design and provides a rich set of data augmentation methods,
+network components, and loss functions, together with model compression and multi-platform
+deployment. Projects built on PaddleDetection already cover industrial quality inspection, remote
+sensing image object detection, automatic inspection, and other fields.

-PaddleDetection新发布精度速度领先的[PP-YOLO](https://arxiv.org/abs/2007.12099)模型，COCO数据集精度达到45.2%，单卡Tesla V100预测速度达到72.9 FPS，详细信息见[PP-YOLO模型](configs/ppyolo/README.md)
+[PP-YOLO](https://arxiv.org/abs/2007.12099), which is faster and more accurate than YOLOv4,
+has been released. It reaches 45.2% mAP(0.5:0.95) on the COCO test2019 dataset and 72.9 FPS on a single
+Tesla V100. Please refer to [PP-YOLO](configs/ppyolo/README.md) for details.
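
For readers unfamiliar with the `mAP(0.5:0.95)` notation used above: it is the COCO-style metric that averages AP over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below only illustrates that averaging. The per-threshold AP values are made-up placeholders, not PP-YOLO results, and `iou_thresholds` and `coco_style_map` are hypothetical helper names.

```python
def iou_thresholds(start=0.50, stop=0.95, step=0.05):
    """Return the IoU thresholds 0.50, 0.55, ..., 0.95 used by COCO-style mAP."""
    count = int(round((stop - start) / step)) + 1
    return [round(start + i * step, 2) for i in range(count)]


def coco_style_map(ap_per_threshold):
    """Average the AP values measured at each IoU threshold (hypothetical helper)."""
    return sum(ap_per_threshold) / len(ap_per_threshold)


thresholds = iou_thresholds()

# Placeholder AP values, one per threshold; NOT measured PP-YOLO numbers.
placeholder_ap = [0.65, 0.62, 0.58, 0.54, 0.49, 0.43, 0.36, 0.27, 0.16, 0.05]

print(thresholds)
print(round(coco_style_map(placeholder_ap), 3))
```

AP normally drops as the IoU threshold tightens, which is why a single averaged number is reported instead of the AP at one threshold.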

-**目前检测库下模型均要求使用PaddlePaddle 1.8及以上版本或适当的develop版本。**
+**All models in PaddleDetection currently require PaddlePaddle 1.8 or higher, or a suitable develop version.**

 <div align="center">
  <img src="docs/images/000000570688.jpg" />
 </div>


-## 简介
+## Introduction

-特性：
+Features:

- 模型丰富：
+- Rich models:

-  PaddleDetection提供了丰富的模型，包含目标检测、实例分割、人脸检测等100+个预训练模型，涵盖多种数据集竞赛冠军方案、适合云端/边缘端设备部署的检测方案。
+  PaddleDetection provides a rich set of models, including 100+ pre-trained models
+for object detection, instance segmentation, face detection, etc. It covers
+competition champion models and practical detection models for cloud and edge devices.

- 易部署:
+- Production Ready:

-  PaddleDetection的模型中使用的核心算子均通过C++或CUDA实现，同时基于PaddlePaddle的高性能推理引擎可以方便地部署在多种硬件平台上。
+  Key operators are implemented in C++ and CUDA. Together with PaddlePaddle's
+highly efficient inference engine, this enables easy deployment in server environments.

- 高灵活度：
+- Highly Flexible:

-  PaddleDetection通过模块化设计来解耦各个组件，基于配置文件可以轻松地搭建各种检测模型。
+  Components are designed to be modular. Model architectures, as well as data
+preprocessing pipelines, can be easily customized with simple configuration
+changes.

- 高性能：
+- Performance Optimized:

-  基于PaddlePaddle框架的高性能内核，在模型训练速度、显存占用上有一定的优势。例如，YOLOv3的训练速度快于其他框架，在Tesla V100 16GB环境下，Mask-RCNN(ResNet50)可以单卡Batch Size可以达到4 (甚至到5)。
+  With the help of the underlying PaddlePaddle framework, faster training and
+a reduced GPU memory footprint are achieved. Notably, YOLOv3 training is
+much faster than in other frameworks. As another example, with Mask-RCNN
+(ResNet50) we managed to fit up to 4 images per GPU (Tesla V100 16GB) during
+multi-GPU training.

+Supported Architectures:

-支持的模型结构：
+|                     | ResNet | ResNet-vd <sup>[1](#vd)</sup> | ResNeXt-vd | SENet | MobileNet |  HRNet | Res2Net |
+| ------------------- | :----: | ----------------------------: | :--------: | :---: | :-------: |:------:|:-----:  |
+| Faster R-CNN        |   ✓    |                             ✓ |     ✗      |   ✓   |     ✗     |   ✗    |  ✗      |
+| Faster R-CNN + FPN  |   ✓    |                             ✓ |     ✓      |   ✓   |     ✗     |   ✓    |  ✓      |
+| Mask R-CNN          |   ✓    |                             ✓ |     ✗      |   ✓   |     ✗     |   ✗    |  ✗      |
+| Mask R-CNN + FPN    |   ✓    |                             ✓ |     ✓      |   ✓   |     ✗     |   ✗    |  ✓      |
+| Cascade Faster-RCNN |   ✓    |                             ✓ |     ✓      |   ✗   |     ✗     |   ✗    |  ✗      |
+| Cascade Mask-RCNN   |   ✓    |                             ✗ |     ✗      |   ✓   |     ✗     |   ✗    |  ✗      |
+| Libra R-CNN         |   ✗    |                             ✓ |     ✗      |   ✗   |     ✗     |   ✗    |  ✗      |
+| RetinaNet           |   ✓    |                             ✗ |     ✗      |   ✗   |     ✗     |   ✗    |  ✗      |
+| YOLOv3              |   ✓    |                             ✓ |     ✗      |   ✗   |     ✓     |   ✗    |  ✗      |
+| SSD                 |   ✗    |                             ✗ |     ✗      |   ✗   |     ✓     |   ✗    |  ✗      |
+| BlazeFace           |   ✗    |                             ✗ |     ✗      |   ✗   |     ✗     |   ✗    |  ✗      |
+| Faceboxes           |   ✗    |                             ✗ |     ✗      |   ✗   |     ✗     |   ✗    |  ✗      |

-|                    | ResNet | ResNet-vd <sup>[1](#vd)</sup> | ResNeXt-vd | SENet | MobileNet |  HRNet | Res2Net |
-|--------------------|:------:|------------------------------:|:----------:|:-----:|:---------:|:------:| :--:    |
-| Faster R-CNN       | ✓      |                             ✓ | x          | ✓     | ✗         |  ✗     |  ✗      |
-| Faster R-CNN + FPN | ✓      |                             ✓ | ✓          | ✓     | ✗         |  ✓     |  ✓      |
-| Mask R-CNN         | ✓      |                             ✓ | x          | ✓     | ✗         |  ✗     |  ✗      |
-| Mask R-CNN + FPN   | ✓      |                             ✓ | ✓          | ✓     | ✗         |  ✗     |  ✓      |
-| Cascade Faster-RCNN | ✓     |                             ✓ | ✓          | ✗     | ✗         |  ✗     |  ✗      |
-| Cascade Mask-RCNN  | ✓      |                             ✗ | ✗          | ✓     | ✗         |  ✗     |  ✗      |
-| Libra R-CNN        | ✗      |                             ✓ | ✗          | ✗     | ✗         |  ✗     |  ✗      |
-| RetinaNet          | ✓      |                             ✗ | ✓          | ✗     | ✗         |  ✗     |  ✗      |
-| YOLOv3             | ✓      |                             ✓ | ✗          | ✗     | ✓         |  ✗     |  ✗      |
-| SSD                | ✗      |                             ✗ | ✗          | ✗     | ✓         |  ✗     |  ✗      |
-| BlazeFace          | ✗      |                             ✗ | ✗          | ✗     | ✗         |  ✗     |  ✗      |
-| Faceboxes          | ✗      |                             ✗ | ✗          | ✗     | ✗         |  ✗     |  ✗      |
+<a name="vd">[1]</a> [ResNet-vd](https://arxiv.org/pdf/1812.01187) models offer notably higher accuracy at almost unchanged inference speed.

-<a name="vd">[1]</a> [ResNet-vd](https://arxiv.org/pdf/1812.01187) 模型预测速度基本不变的情况下提高了精度。
+**NOTE:** ✓ means a config file and pretrained model are provided in the [Model Zoo](docs/MODEL_ZOO.md); ✗ means no reference config is provided, but the combination is generally supported.

-**说明：** ✓ 为[模型库](docs/MODEL_ZOO_cn.md)中提供了对应配置文件和预训练模型，✗ 为未提供参考配置，但一般都支持。
-
-更多的模型:
+More models:

 - EfficientDet
 - FCOS
@@ -65,94 +81,96 @@ PaddleDetection新发布精度速度领先的[PP-YOLO](https://arxiv.org/abs/200
 - YOLOv4
 - PP-YOLO

-更多的Backone：
+More Backbones:

 - DarkNet
 - VGG
 - GCNet
 - CBNet
- Hourglass

-扩展特性：
+Advanced Features:

 - [x] **Synchronized Batch Norm**
 - [x] **Group Norm**
 - [x] **Modulated Deformable Convolution**
 - [x] **Deformable PSRoI Pooling**
- [x] **Non-local和GCNet**
+- [x] **Non-local and GCNet**

-**注意:** Synchronized batch normalization 只能在多GPU环境下使用，不能在CPU环境或者单GPU环境下使用。
+**NOTE:** Synchronized batch normalization can only be used with multiple GPU devices; it cannot be used on CPU devices or on a single GPU device.
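
To illustrate the note above: plain batch normalization computes mean and variance from the images held by one device only, whereas synchronized batch normalization pools the statistics across devices. The pure-Python sketch below is an illustration under stated assumptions, not PaddleDetection code: the `local_stats` and `synced_stats` helpers and the activation values are hypothetical, and no PaddlePaddle API is used. With a single shard the two results coincide, so synchronization only changes anything when several devices take part.

```python
def local_stats(values):
    """Mean and (biased) variance of the activations held by one device."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, var


def synced_stats(shards):
    """Mean and variance over all devices, combined from per-device sums.

    Each device only has to share three numbers (count, sum, sum of squares),
    which is the kind of reduction a synchronized batch norm performs.
    """
    count = sum(len(s) for s in shards)
    total = sum(sum(s) for s in shards)
    total_sq = sum(sum(v * v for v in s) for s in shards)
    mean = total / count
    var = total_sq / count - mean ** 2
    return mean, var


# Hypothetical activations of one channel, split across two devices.
shards = [[1.0, 2.0, 3.0], [7.0, 8.0, 9.0]]

print([local_stats(s) for s in shards])  # per-device statistics differ
print(synced_stats(shards))              # statistics of the combined batch
print(synced_stats(shards[:1]))          # one device: matches local_stats
```

The smaller the per-device batch, the noisier the local statistics become, which is why pooling them matters most for detection models that fit only a few images per GPU.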

-以下为选取各模型结构和骨干网络的代表模型COCO数据集精度mAP和单卡Tesla V100上预测速度(FPS)关系图。
+The following figure shows the relationship between COCO mAP and FPS on a Tesla V100 for representative models of each architecture and backbone.

 <div align="center">
  <img src="docs/images/map_fps.png" width=800 />
 </div>

-**说明：**
- `CBResNet`为`Cascade-Faster-RCNN-CBResNet200vd-FPN`模型，COCO数据集mAP高达53.3%
- `Cascade-Faster-RCNN`为`Cascade-Faster-RCNN-ResNet50vd-DCN`，PaddleDetection将其优化到COCO数据mAP为47.8%时推理速度为20FPS
- PaddleDetection增强版`YOLOv3-ResNet50vd-DCN`在COCO数据集mAP高于原作10.6个绝对百分点，推理速度为61.3FPS，快于原作约70%
- 图中模型均可在[模型库](#模型库)中获取
+**NOTE:**
+- `CBResNet` stands for `Cascade-Faster-RCNN-CBResNet200vd-FPN`, which has the highest COCO mAP (53.3%) among PaddleDetection models
+- `Cascade-Faster-RCNN` stands for `Cascade-Faster-RCNN-ResNet50vd-DCN`, which has been optimized to reach 20 FPS inference speed at 47.8% COCO mAP
+- The enhanced `YOLOv3-ResNet50vd-DCN` is 10.6 absolute percentage points higher in COCO mAP than the paper, and its inference speed is nearly 70% faster than the darknet framework
+- All these models are available in the [Model Zoo](#Model-Zoo)
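
The gains quoted in the notes above mix two kinds of comparison: an absolute difference in mAP percentage points and a relative speed-up. The short sketch below separates the two. It uses the 33.0% and 43.6% mAP figures from the Model Zoo list and the 61.3 FPS figure from the Chinese README for the enhanced YOLOv3. The implied baseline speed is derived from the "nearly 70% faster" claim rather than measured, so treat it as a rough estimate.

```python
paper_map = 33.0      # YOLOv3 COCO mAP (%) from the paper, as quoted in the Model Zoo list
enhanced_map = 43.6   # enhanced YOLOv3 COCO mAP (%), as quoted in the Model Zoo list
enhanced_fps = 61.3   # inference speed quoted in the Chinese README
speedup = 0.70        # "nearly 70% faster" taken at face value (assumption)

# Absolute gain: a difference of two percentages, in percentage points.
gain_points = enhanced_map - paper_map

# Relative gain: the same improvement expressed as a fraction of the baseline.
gain_relative = gain_points / paper_map

# Baseline speed implied by the speed-up claim; derived, not measured.
implied_baseline_fps = enhanced_fps / (1.0 + speedup)

# Per-image latency corresponding to the quoted FPS.
latency_ms = 1000.0 / enhanced_fps

print(f"{gain_points:.1f} points, {gain_relative:.1%} relative")
print(f"implied baseline: {implied_baseline_fps:.1f} FPS, latency: {latency_ms:.1f} ms")
```

Keeping "percentage points" and "percent faster" apart avoids reading the 10.6-point mAP gain as a 10.6% relative improvement.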

-以下为PaddleDetection发布的精度和预测速度优于YOLOv4模型的PP-YOLO与前沿目标检测算法的COCO数据集精度与单卡Tesla V100预测速度(FPS)关系图， PP-YOLO模型在[COCO](http://cocodataset.org) test2019数据集上精度达到45.2%，在单卡V100上FP32推理速度为72.9 FPS，详细信息见[PP-YOLO模型](configs/ppyolo/README.md)
+The following figure shows the relationship between COCO mAP and FPS on a Tesla V100 for SOTA object detectors and PP-YOLO. PP-YOLO is faster and more accurate than YOLOv4, reaching 45.2% mAP(0.5:0.95) on the COCO test2019 dataset and 72.9 FPS on a single Tesla V100. Please refer to [PP-YOLO](configs/ppyolo/README.md) for details.

 <div align="center">
  <img src="docs/images/ppyolo_map_fps.png" width=600 />
 </div>

-## 文档教程
+## Tutorials
+
+
+### Get Started

-### 入门教程
+- [Installation guide](docs/tutorials/INSTALL.md)
+- [Quick start on small dataset](docs/tutorials/QUICK_STARTED.md)
+- [Train/Evaluation/Inference](docs/tutorials/GETTING_STARTED.md)
+- [How to train a custom dataset](docs/tutorials/Custom_DataSet.md)
+- [FAQ](docs/FAQ.md)

- [安装说明](docs/tutorials/INSTALL_cn.md)
- [快速开始](docs/tutorials/QUICK_STARTED_cn.md)
- [训练/评估/预测流程](docs/tutorials/GETTING_STARTED_cn.md)
- [如何训练自定义数据集](docs/tutorials/Custom_DataSet.md)
- [常见问题汇总](docs/FAQ.md)
+### Advanced Tutorial

-### 进阶教程
- [数据预处理及数据集定义](docs/advanced_tutorials/READER.md)
- [搭建模型步骤](docs/advanced_tutorials/MODEL_TECHNICAL.md)
- [模型参数配置](docs/advanced_tutorials/config_doc):
-  - [配置模块设计和介绍](docs/advanced_tutorials/config_doc/CONFIG_cn.md)
-  - [RCNN模型参数说明](docs/advanced_tutorials/config_doc/RCNN_PARAMS_DOC.md)
- [迁移学习教程](docs/advanced_tutorials/TRANSFER_LEARNING_cn.md)
+- [Guide to preprocess pipeline and dataset definition](docs/advanced_tutorials/READER.md)
+- [Model construction steps](docs/advanced_tutorials/MODEL_TECHNICAL.md)
+- [Transfer learning document](docs/advanced_tutorials/TRANSFER_LEARNING.md)
+- [Parameter configuration](docs/advanced_tutorials/config_doc):
+  - [Introduction to the configuration workflow](docs/advanced_tutorials/config_doc/CONFIG.md)
+  - [Parameter configuration for RCNN model](docs/advanced_tutorials/config_doc/RCNN_PARAMS_DOC.md)
 - [IPython Notebook demo](demo/mask_rcnn_demo.ipynb)
- [模型压缩](slim)
-    - [压缩benchmark](slim)
-    - [量化](slim/quantization)
-    - [剪枝](slim/prune)
-    - [蒸馏](slim/distillation)
-    - [神经网络搜索](slim/nas)
- [推理部署](deploy)
-    - [模型导出教程](docs/advanced_tutorials/deploy/EXPORT_MODEL.md)
-    - [Python端推理部署](deploy/python)
-    - [C++端推理部署](deploy/cpp)
-    - [推理Benchmark](docs/advanced_tutorials/deploy/BENCHMARK_INFER_cn.md)
-
-## 模型库
-
- [模型库](docs/MODEL_ZOO_cn.md)
- [移动端模型](configs/mobile/README.md)
- [Anchor free模型](configs/anchor_free/README.md)
- [人脸检测模型](docs/featured_model/FACE_DETECTION.md)
- [YOLOv3增强模型](docs/featured_model/YOLOv3_ENHANCEMENT.md): COCO mAP高达43.6%，原论文精度为33.0%
- [PP-YOLO模型](configs/ppyolo/README.md): COCO mAP高达45.3%，单卡Tesla V100预测速度高达72.9 FPS
- [行人检测预训练模型](docs/featured_model/CONTRIB_cn.md)
- [车辆检测预训练模型](docs/featured_model/CONTRIB_cn.md)
- [Objects365 2019 Challenge夺冠模型](docs/featured_model/champion_model/CACascadeRCNN.md)
- [Open Images 2019-Object Detction比赛最佳单模型](docs/featured_model/champion_model/OIDV5_BASELINE_MODEL.md)
- [服务器端实用目标检测模型](configs/rcnn_enhance/README.md): V100上速度20FPS时，COCO mAP高达47.8%。
- [大规模实用目标检测模型](docs/featured_model/LARGE_SCALE_DET_MODEL.md): 提供了包含676个类别的大规模服务器端实用目标检测模型，适用于绝大部分使用场景，可以直接用来预测，也可以用于微调其他任务。
-
-
-## 许可证书
-本项目的发布受[Apache 2.0 license](LICENSE)许可认证。
-
-## 版本更新
-v0.4.0版本已经在`07/2020`发布，增加PP-YOLO, TTFNet, HTC, ACFPN等多个模型，新增BlazeFace人脸关键点检测模型，新增移动端SSDLite系列优化模型，新增GridMask，RandomErasing数据增强方法，新增Matrix NMS和EMA训练，提升易用性，修复已知诸多bug等，详细内容请参考[版本更新文档](docs/CHANGELOG.md)。
-
-## 如何贡献代码
-
-我们非常欢迎你可以为PaddleDetection提供代码，也十分感谢你的反馈。
+- [Model compression](slim)
+    - [Model compression benchmark](slim)
+    - [Quantization](slim/quantization)
+    - [Model pruning](slim/prune)
+    - [Model distillation](slim/distillation)
+    - [Neural Architecture Search](slim/nas)
+- [Deployment](deploy)
+    - [Export model for inference](docs/advanced_tutorials/deploy/EXPORT_MODEL.md)
+    - [Python inference](deploy/python)
+    - [C++ inference](deploy/cpp)
+    - [Inference benchmark](docs/advanced_tutorials/inference/BENCHMARK_INFER_cn.md)
+
+## Model Zoo
+
+- Pretrained models are available in the [PaddleDetection model zoo](docs/MODEL_ZOO.md).
+- [Mobile models](configs/mobile/README.md)
+- [Anchor free models](configs/anchor_free/README.md)
+- [Face detection models](docs/featured_model/FACE_DETECTION_en.md)
+- [Pretrained models for pedestrian detection](docs/featured_model/CONTRIB.md)
+- [Pretrained models for vehicle detection](docs/featured_model/CONTRIB.md)
+- [YOLOv3 enhanced model](docs/featured_model/YOLOv3_ENHANCEMENT.md): Compared with the 33.0% mAP reported in the paper, the enhanced YOLOv3 reaches 43.6% mAP, and inference speed is improved as well
+- [PP-YOLO](configs/ppyolo/README.md): PP-YOLO reaches 45.3% mAP on the COCO dataset and 72.9 FPS on a single Tesla V100
+- [Objects365 2019 Challenge champion model](docs/featured_model/champion_model/CACascadeRCNN.md)
+- [Best single model of Open Images 2019-Object Detection](docs/featured_model/champion_model/OIDV5_BASELINE_MODEL.md)
+- [Practical server-side detection method](configs/rcnn_enhance/README_en.md): Inference speed on a single V100 GPU reaches 20 FPS at 47.8% COCO mAP.
+- [Large-scale practical object detection models](docs/featured_model/LARGE_SCALE_DET_MODEL_en.md): Large-scale practical server-side detection pretrained models with 676 categories are provided for most application scenarios. They can be used for direct inference or for fine-tuning on other datasets.
+
+
+## License
+PaddleDetection is released under the [Apache 2.0 license](LICENSE).
+
+## Updates
+v0.4.0 was released in `07/2020`. It adds PP-YOLO, TTFNet, HTC, ACFPN, and other models, a BlazeFace face landmark detection model, a series of optimized SSDLite models for mobile, the GridMask and RandomErasing data augmentations, and Matrix NMS and EMA training. It also improves ease of use and fixes many known bugs.
+Please refer to the [changelog](docs/CHANGELOG.md) for details.
+
+## Contributing
+
+Contributions are highly welcome, and we really appreciate your feedback!
--- a/README_cn.md
+++ b/README_cn.md
+简体中文 | [English](README.md)
+
+文档：[https://paddledetection.readthedocs.io](https://paddledetection.readthedocs.io)
+
+# PaddleDetection
+
+飞桨推出的PaddleDetection是端到端目标检测开发套件，旨在帮助开发者更快更好地完成检测模型的训练、精度速度优化到部署全流程。PaddleDetection以模块化的设计实现了多种主流目标检测算法，并且提供了丰富的数据增强、网络组件、损失函数等模块，集成了模型压缩和跨平台高性能部署能力。目前基于PaddleDetection已经完成落地的项目涉及工业质检、遥感图像检测、无人巡检等多个领域。
+
+PaddleDetection新发布精度速度领先的[PP-YOLO](https://arxiv.org/abs/2007.12099)模型，COCO数据集精度达到45.2%，单卡Tesla V100预测速度达到72.9 FPS，详细信息见[PP-YOLO模型](configs/ppyolo/README_cn.md)
+
+**目前检测库下模型均要求使用PaddlePaddle 1.8及以上版本或适当的develop版本。**
+
+<div align="center">
+  <img src="docs/images/000000570688.jpg" />
+</div>
+
+
+## 简介
+
+特性：
+
+- 模型丰富：
+
+  PaddleDetection提供了丰富的模型，包含目标检测、实例分割、人脸检测等100+个预训练模型，涵盖多种数据集竞赛冠军方案、适合云端/边缘端设备部署的检测方案。
+
+- 易部署:
+
+  PaddleDetection的模型中使用的核心算子均通过C++或CUDA实现，同时基于PaddlePaddle的高性能推理引擎可以方便地部署在多种硬件平台上。
+
+- 高灵活度：
+
+  PaddleDetection通过模块化设计来解耦各个组件，基于配置文件可以轻松地搭建各种检测模型。
+
+- 高性能：
+
+  基于PaddlePaddle框架的高性能内核，在模型训练速度、显存占用上有一定的优势。例如，YOLOv3的训练速度快于其他框架，在Tesla V100 16GB环境下，Mask-RCNN(ResNet50)可以单卡Batch Size可以达到4 (甚至到5)。
+
+
+支持的模型结构：
+
+|                    | ResNet | ResNet-vd <sup>[1](#vd)</sup> | ResNeXt-vd | SENet | MobileNet |  HRNet | Res2Net |
+|--------------------|:------:|------------------------------:|:----------:|:-----:|:---------:|:------:| :--:    |
+| Faster R-CNN       | ✓      |                             ✓ | x          | ✓     | ✗         |  ✗     |  ✗      |
+| Faster R-CNN + FPN | ✓      |                             ✓ | ✓          | ✓     | ✗         |  ✓     |  ✓      |
+| Mask R-CNN         | ✓      |                             ✓ | x          | ✓     | ✗         |  ✗     |  ✗      |
+| Mask R-CNN + FPN   | ✓      |                             ✓ | ✓          | ✓     | ✗         |  ✗     |  ✓      |
+| Cascade Faster-RCNN | ✓     |                             ✓ | ✓          | ✗     | ✗         |  ✗     |  ✗      |
+| Cascade Mask-RCNN  | ✓      |                             ✗ | ✗          | ✓     | ✗         |  ✗     |  ✗      |
+| Libra R-CNN        | ✗      |                             ✓ | ✗          | ✗     | ✗         |  ✗     |  ✗      |
+| RetinaNet          | ✓      |                             ✗ | ✓          | ✗     | ✗         |  ✗     |  ✗      |
+| YOLOv3             | ✓      |                             ✓ | ✗          | ✗     | ✓         |  ✗     |  ✗      |
+| SSD                | ✗      |                             ✗ | ✗          | ✗     | ✓         |  ✗     |  ✗      |
+| BlazeFace          | ✗      |                             ✗ | ✗          | ✗     | ✗         |  ✗     |  ✗      |
+| Faceboxes          | ✗      |                             ✗ | ✗          | ✗     | ✗         |  ✗     |  ✗      |
+
+<a name="vd">[1]</a> [ResNet-vd](https://arxiv.org/pdf/1812.01187) 模型预测速度基本不变的情况下提高了精度。
+
+**说明：** ✓ 为[模型库](docs/MODEL_ZOO_cn.md)中提供了对应配置文件和预训练模型，✗ 为未提供参考配置，但一般都支持。
+
+更多的模型:
+
+- EfficientDet
+- FCOS
+- CornerNet-Squeeze
+- YOLOv4
+- PP-YOLO
+
+更多的Backbone：
+
+- DarkNet
+- VGG
+- GCNet
+- CBNet
+- Hourglass
+
+扩展特性：
+
+- [x] **Synchronized Batch Norm**
+- [x] **Group Norm**
+- [x] **Modulated Deformable Convolution**
+- [x] **Deformable PSRoI Pooling**
+- [x] **Non-local和GCNet**
+
+**注意:** Synchronized batch normalization 只能在多GPU环境下使用，不能在CPU环境或者单GPU环境下使用。
+
+以下为选取各模型结构和骨干网络的代表模型COCO数据集精度mAP和单卡Tesla V100上预测速度(FPS)关系图。
+
+<div align="center">
+  <img src="docs/images/map_fps.png" width=800 />
+</div>
+
+**说明：**
+- `CBResNet`为`Cascade-Faster-RCNN-CBResNet200vd-FPN`模型，COCO数据集mAP高达53.3%
+- `Cascade-Faster-RCNN`为`Cascade-Faster-RCNN-ResNet50vd-DCN`，PaddleDetection将其优化到COCO数据mAP为47.8%时推理速度为20FPS
+- PaddleDetection增强版`YOLOv3-ResNet50vd-DCN`在COCO数据集mAP高于原作10.6个绝对百分点，推理速度为61.3FPS，快于原作约70%
+- 图中模型均可在[模型库](#模型库)中获取
+
+以下为PaddleDetection发布的精度和预测速度优于YOLOv4模型的PP-YOLO与前沿目标检测算法的COCO数据集精度与单卡Tesla V100预测速度(FPS)关系图， PP-YOLO模型在[COCO](http://cocodataset.org) test2019数据集上精度达到45.2%，在单卡V100上FP32推理速度为72.9 FPS，详细信息见[PP-YOLO模型](configs/ppyolo/README_cn.md)
+
+<div align="center">
+  <img src="docs/images/ppyolo_map_fps.png" width=600 />
+</div>
+
+## 文档教程
+
+### 入门教程
+
+- [安装说明](docs/tutorials/INSTALL_cn.md)
+- [快速开始](docs/tutorials/QUICK_STARTED_cn.md)
+- [训练/评估/预测流程](docs/tutorials/GETTING_STARTED_cn.md)
+- [如何训练自定义数据集](docs/tutorials/Custom_DataSet.md)
+- [常见问题汇总](docs/FAQ.md)
+
+### 进阶教程
+- [数据预处理及数据集定义](docs/advanced_tutorials/READER.md)
+- [搭建模型步骤](docs/advanced_tutorials/MODEL_TECHNICAL.md)
+- [模型参数配置](docs/advanced_tutorials/config_doc):
+  - [配置模块设计和介绍](docs/advanced_tutorials/config_doc/CONFIG_cn.md)
+  - [RCNN模型参数说明](docs/advanced_tutorials/config_doc/RCNN_PARAMS_DOC.md)
+- [迁移学习教程](docs/advanced_tutorials/TRANSFER_LEARNING_cn.md)
+- [IPython Notebook demo](demo/mask_rcnn_demo.ipynb)
+- [模型压缩](slim)
+    - [压缩benchmark](slim)
+    - [量化](slim/quantization)
+    - [剪枝](slim/prune)
+    - [蒸馏](slim/distillation)
+    - [神经网络搜索](slim/nas)
+- [推理部署](deploy)
+    - [模型导出教程](docs/advanced_tutorials/deploy/EXPORT_MODEL.md)
+    - [Python端推理部署](deploy/python)
+    - [C++端推理部署](deploy/cpp)
+    - [推理Benchmark](docs/advanced_tutorials/deploy/BENCHMARK_INFER_cn.md)
+
+## 模型库
+
+- [模型库](docs/MODEL_ZOO_cn.md)
+- [移动端模型](configs/mobile/README.md)
+- [Anchor free模型](configs/anchor_free/README.md)
+- [人脸检测模型](docs/featured_model/FACE_DETECTION.md)
+- [YOLOv3增强模型](docs/featured_model/YOLOv3_ENHANCEMENT.md): COCO mAP高达43.6%，原论文精度为33.0%
+- [PP-YOLO模型](configs/ppyolo/README_cn.md): COCO mAP高达45.3%，单卡Tesla V100预测速度高达72.9 FPS
+- [行人检测预训练模型](docs/featured_model/CONTRIB_cn.md)
+- [车辆检测预训练模型](docs/featured_model/CONTRIB_cn.md)
+- [Objects365 2019 Challenge夺冠模型](docs/featured_model/champion_model/CACascadeRCNN.md)
+- [Open Images 2019-Object Detction比赛最佳单模型](docs/featured_model/champion_model/OIDV5_BASELINE_MODEL.md)
+- [服务器端实用目标检测模型](configs/rcnn_enhance/README.md): V100上速度20FPS时，COCO mAP高达47.8%。
+- [大规模实用目标检测模型](docs/featured_model/LARGE_SCALE_DET_MODEL.md): 提供了包含676个类别的大规模服务器端实用目标检测模型，适用于绝大部分使用场景，可以直接用来预测，也可以用于微调其他任务。
+
+
+## 许可证书
+本项目的发布受[Apache 2.0 license](LICENSE)许可认证。
+
+## 版本更新
+v0.4.0版本已经在`07/2020`发布，增加PP-YOLO, TTFNet, HTC, ACFPN等多个模型，新增BlazeFace人脸关键点检测模型，新增移动端SSDLite系列优化模型，新增GridMask，RandomErasing数据增强方法，新增Matrix NMS和EMA训练，提升易用性，修复已知诸多bug等，详细内容请参考[版本更新文档](docs/CHANGELOG.md)。
+
+## 如何贡献代码
+
+我们非常欢迎你可以为PaddleDetection提供代码，也十分感谢你的反馈。
--- a/README_en.md
+++ b/README_en.md
-English | [简体中文](README.md)
-
-Documentation:[https://paddledetection.readthedocs.io](https://paddledetection.readthedocs.io)
-
-# PaddleDetection
-
-PaddleDetection is an end-to-end object detection development kit based on PaddlePaddle, which
-aims to help developers in the whole development of training models, optimizing performance and
-inference speed, and deploying models. PaddleDetection provides varied object detection architectures
-in modular design, and wealthy data augmentation methods, network components, loss functions, etc.
-PaddleDetection supported practical projects such as industrial quality inspection, remote sensing
-image object detection, and automatic inspection with its practical features such as model compression
-and multi-platform deployment.
-
-[PP-YOLO](https://arxiv.org/abs/2007.12099), which is faster and has higer performance than YOLOv4,
-has been released, it reached mAP(0.5:0.95) as 45.2% on COCO test2019 dataset and 72.9 FPS on single
-Test V100. Please refer to [PP-YOLO](configs/ppyolo/README.md) for details.
-
-**Now all models in PaddleDetection require PaddlePaddle version 1.8 or higher, or suitable develop version.**
-
-<div align="center">
-  <img src="docs/images/000000570688.jpg" />
-</div>
-
-
-## Introduction
-
-Features:
-
- Rich models:
-
-  PaddleDetection provides rich of models, including 100+ pre-trained models
-such as object detection, instance segmentation, face detection etc. It covers
-the champion models, the practical detection models for cloud and edge device.
-
- Production Ready:
-
-  Key operations are implemented in C++ and CUDA, together with PaddlePaddle's
-highly efficient inference engine, enables easy deployment in server environments.
-
- Highly Flexible:
-
-  Components are designed to be modular. Model architectures, as well as data
-preprocess pipelines, can be easily customized with simple configuration
-changes.
-
- Performance Optimized:
-
-  With the help of the underlying PaddlePaddle framework, faster training and
-reduced GPU memory footprint is achieved. Notably, YOLOv3 training is
-much faster compared to other frameworks. Another example is Mask-RCNN
-(ResNet50), we managed to fit up to 4 images per GPU (Tesla V100 16GB) during
-multi-GPU training.
-
-Supported Architectures:
-
-|                     | ResNet | ResNet-vd <sup>[1](#vd)</sup> | ResNeXt-vd | SENet | MobileNet |  HRNet | Res2Net |
-| ------------------- | :----: | ----------------------------: | :--------: | :---: | :-------: |:------:|:-----:  |
-| Faster R-CNN        |   ✓    |                             ✓ |     x      |   ✓   |     ✗     |   ✗    |  ✗      |
-| Faster R-CNN + FPN  |   ✓    |                             ✓ |     ✓      |   ✓   |     ✗     |   ✓    |  ✓      |
-| Mask R-CNN          |   ✓    |                             ✓ |     x      |   ✓   |     ✗     |   ✗    |  ✗      |
-| Mask R-CNN + FPN    |   ✓    |                             ✓ |     ✓      |   ✓   |     ✗     |   ✗    |  ✓      |
-| Cascade Faster-RCNN |   ✓    |                             ✓ |     ✓      |   ✗   |     ✗     |   ✗    |  ✗      |
-| Cascade Mask-RCNN   |   ✓    |                             ✗ |     ✗      |   ✓   |     ✗     |   ✗    |  ✗      |
-| Libra R-CNN         |   ✗    |                             ✓ |     ✗      |   ✗   |     ✗     |   ✗    |  ✗      |
-| RetinaNet           |   ✓    |                             ✗ |     ✗      |   ✗   |     ✗     |   ✗    |  ✗      |
-| YOLOv3              |   ✓    |                             ✓ |     ✗      |   ✗   |     ✓     |   ✗    |  ✗      |
-| SSD                 |   ✗    |                             ✗ |     ✗      |   ✗   |     ✓     |   ✗    |  ✗      |
-| BlazeFace           |   ✗    |                             ✗ |     ✗      |   ✗   |     ✗     |   ✗    |  ✗      |
-| Faceboxes           |   ✗    |                             ✗ |     ✗      |   ✗   |     ✗     |   ✗    |  ✗      |
-
-<a name="vd">[1]</a> [ResNet-vd](https://arxiv.org/pdf/1812.01187) models offer much improved accuracy with negligible performance cost.
-
-**NOTE:** ✓ for config file and pretrain model provided in [Model Zoo](docs/MODEL_ZOO.md), ✗ for not provided but is supported generally.
-
-More models:
-
- EfficientDet
- FCOS
- CornerNet-Squeeze
- YOLOv4
- PP-YOLO
-
-More Backbones:
-
- DarkNet
- VGG
- GCNet
- CBNet
-
-Advanced Features:
-
- [x] **Synchronized Batch Norm**
- [x] **Group Norm**
- [x] **Modulated Deformable Convolution**
- [x] **Deformable PSRoI Pooling**
- [x] **Non-local and GCNet**
-
-**NOTE:** Synchronized batch normalization can only be used on multiple GPU devices, can not be used on CPU devices or single GPU device.
-
-The following is the relationship between COCO mAP and FPS on Tesla V100 of representative models of each architectures and backbones.
-
-<div align="center">
-  <img src="docs/images/map_fps.png" width=800 />
-</div>
-
-**NOTE:**
- `CBResNet` stands for `Cascade-Faster-RCNN-CBResNet200vd-FPN`, which has highest mAP on COCO as 53.3% in PaddleDetection models
- `Cascade-Faster-RCNN` stands for `Cascade-Faster-RCNN-ResNet50vd-DCN`, which has been optimized to 20 FPS inference speed when COCO mAP as 47.8%
- The enhanced `YOLOv3-ResNet50vd-DCN` is 10.6 absolute percentage points higher than paper on COCO mAP, and inference speed is nearly 70% faster than the darknet framework
- All these models can be get in [Model Zoo](#Model-Zoo)
-
-The following is the relationship between COCO mAP and FPS on Tesla V100 of SOTA object detecters and PP-YOLO, which is faster and has better performance than YOLOv4, and reached mAP(0.5:0.95) as 45.2% on COCO test2019 dataset and 72.9 FPS on single Test V100. Please refer to [PP-YOLO](configs/ppyolo/README.md) for details.
-
-<div align="center">
-  <img src="docs/images/ppyolo_map_fps.png" width=600 />
-</div>
-
-## Tutorials
-
-
-### Get Started
-
- [Installation guide](docs/tutorials/INSTALL.md)
- [Quick start on small dataset](docs/tutorials/QUICK_STARTED.md)
- [Train/Evaluation/Inference](docs/tutorials/GETTING_STARTED.md)
- [How to train a custom dataset](docs/tutorials/Custom_DataSet.md)
- [FAQ](docs/FAQ.md)
-
-### Advanced Tutorial
-
- [Guide to preprocess pipeline and dataset definition](docs/advanced_tutorials/READER.md)
- [Models technical](docs/advanced_tutorials/MODEL_TECHNICAL.md)
- [Transfer learning document](docs/advanced_tutorials/TRANSFER_LEARNING.md)
- [Parameter configuration](docs/advanced_tutorials/config_doc):
-  - [Introduction to the configuration workflow](docs/advanced_tutorials/config_doc/CONFIG.md)
-  - [Parameter configuration for RCNN model](docs/advanced_tutorials/config_doc/RCNN_PARAMS_DOC.md)
- [IPython Notebook demo](demo/mask_rcnn_demo.ipynb)
- [Model compression](slim)
-    - [Model compression benchmark](slim)
-    - [Quantization](slim/quantization)
-    - [Model pruning](slim/prune)
-    - [Model distillation](slim/distillation)
-    - [Neural Architecture Search](slim/nas)
- [Deployment](deploy)
-    - [Export model for inference](docs/advanced_tutorials/deploy/EXPORT_MODEL.md)
-    - [Python inference](deploy/python)
-    - [C++ inference](deploy/cpp)
-    - [Inference benchmark](docs/advanced_tutorials/inference/BENCHMARK_INFER_cn.md)
-
-## Model Zoo
-
- Pretrained models are available in the [PaddleDetection model zoo](docs/MODEL_ZOO.md).
- [Mobile models](configs/mobile/README.md)
- [Anchor free models](configs/anchor_free/README.md)
- [Face detection models](docs/featured_model/FACE_DETECTION_en.md)
- [Pretrained models for pedestrian detection](docs/featured_model/CONTRIB.md)
- [Pretrained models for vehicle detection](docs/featured_model/CONTRIB.md)
- [YOLOv3 enhanced model](docs/featured_model/YOLOv3_ENHANCEMENT.md): Compared to MAP of 33.0% in paper, enhanced YOLOv3 reaches the MAP of 43.6%, and inference speed is improved as well
- [PP-YOLO](configs/ppyolo/README.md): PP-YOLO reeached mAP as 45.3% on COCO dataset，and 72.9 FPS on single Tesla V100
- [Objects365 2019 Challenge champion model](docs/featured_model/champion_model/CACascadeRCNN.md)
- [Best single model of Open Images 2019-Object Detction](docs/featured_model/champion_model/OIDV5_BASELINE_MODEL.md)
- [Practical Server-side detection method](configs/rcnn_enhance/README_en.md): Inference speed on single V100 GPU can reach 20FPS when COCO mAP is 47.8%.
- [Large-scale practical object detection models](docs/featured_model/LARGE_SCALE_DET_MODEL_en.md): Large-scale practical server-side detection pretrained models with 676 categories are provided for most application scenarios, which can be used not only for direct inference but also finetuning on other datasets.
-
-
-## License
-PaddleDetection is released under the [Apache 2.0 license](LICENSE).
-
-## Updates
-v0.4.0 was released at `05/2020`, add PP-YOLO, TTFNet, HTC, ACFPN, etc. And add BlaceFace face landmark detection model, add a series of optimized SSDLite models on mobile side, add data augmentations GridMask and RandomErasing, add Matrix NMS and EMA training, and improved ease of use, fix many known bugs, etc.
-Please refer to [版本更新文档](docs/CHANGELOG.md) for details.
-
-## Contributing
-
-Contributions are highly welcomed and we would really appreciate your feedback!!
--- a/configs/ppyolo/README.md
+++ b/configs/ppyolo/README.md
--- a/configs/ppyolo/README_cn.md
+++ b/configs/ppyolo/README_cn.md
+简体中文 | [English](README.md)
+
+# PP-YOLO 模型
+
+## 内容
+- [简介](#简介)
+- [模型库与基线](#模型库与基线)
+- [使用说明](#使用说明)
+- [未来工作](#未来工作)
+- [附录](#附录)
+
+## 简介
+
+[PP-YOLO](https://arxiv.org/abs/2007.12099)是PaddleDetection优化和改进的YOLOv3的模型，其精度(COCO数据集mAP)和推理速度均优于[YOLOv4](https://arxiv.org/abs/2004.10934)模型，要求使用PaddlePaddle 1.8.4(2020年8月中旬发布)或适当的[develop版本](https://www.paddlepaddle.org.cn/documentation/docs/zh/install/Tables.html#whl-dev)。
+
+PP-YOLO在[COCO](http://cocodataset.org) test-dev2017数据集上精度达到45.2%，在单卡V100上FP32推理速度为72.9 FPS, V100上开启TensorRT下FP16推理速度为155.6 FPS。
+
+<div align="center">
+  <img src="../../docs/images/ppyolo_map_fps.png" width=500 />
+</div>
+
+PP-YOLO improves the accuracy and speed of YOLOv3 through the following changes:
+
+- A better backbone: ResNet50vd-DCN
+- A larger training batch size: 8 GPUs with batch_size=24 per GPU, with the learning rate and number of iterations adjusted accordingly
+- [Drop Block](https://arxiv.org/abs/1810.12890)
+- [Exponential Moving Average](https://www.investopedia.com/terms/e/ema.asp)
+- [IoU Loss](https://arxiv.org/pdf/1902.09630.pdf)
+- [Grid Sensitive](https://arxiv.org/abs/2004.10934)
+- [Matrix NMS](https://arxiv.org/pdf/2003.10152.pdf)
+- [CoordConv](https://arxiv.org/abs/1807.03247)
+- [Spatial Pyramid Pooling](https://arxiv.org/abs/1406.4729)
+- A better pretrained model
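
As a rough illustration of the Exponential Moving Average item in the list above, the sketch below keeps a smoothed copy of model weights in plain Python. The function name `ema_update` and the decay values are assumptions made for illustration; they are not taken from PaddleDetection or from the PP-YOLO configuration.

```python
def ema_update(ema_weights, weights, decay=0.9998):
    """Return the updated moving average of each weight.

    `decay` is an illustrative default, not PP-YOLO's actual setting.
    """
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema_weights, weights)]


# Toy example: two scalar "weights" held constant over three training steps.
ema = [0.0, 0.0]
for step_weights in ([1.0, 2.0], [1.0, 2.0], [1.0, 2.0]):
    ema = ema_update(ema, step_weights, decay=0.5)

# The average moves toward the current weights a little more each step.
print(ema)
```

A decay close to 1 makes the averaged weights change slowly, which is why the toy example uses 0.5 to make the effect visible in three steps.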
+
+## Model Zoo
+
+### PP-YOLO
+
+|          Model           | GPU number | images/GPU |  backbone  | input shape | Box AP | V100 FP32(FPS) | V100 TensorRT FP16(FPS) | download | config  |
+|:------------------------:|:----------:|:----------:|:----------:| :----------:| :----: | :------------: | :---------------------: | :------: | :------: |
+| YOLOv4(AlexeyAB)         |     -      |     -      | CSPDarknet |     608     |  43.5  |       62       |          105.5           | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov4_cspdarknet.pdparams) |  [config](https://github.com/PaddlePaddle/PaddleDetection/tree/master/configs/yolov4/yolov4_csdarknet.yml)                   |
+| YOLOv4(AlexeyAB)         |     -      |     -      | CSPDarknet |     512     |  43.0  |       83       |          138.4           | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov4_cspdarknet.pdparams) |  [config](https://github.com/PaddlePaddle/PaddleDetection/tree/master/configs/yolov4/yolov4_csdarknet.yml)                   |
+| YOLOv4(AlexeyAB)         |     -      |     -      | CSPDarknet |     416     |  41.2  |       96       |          164.0           | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov4_cspdarknet.pdparams) |  [config](https://github.com/PaddlePaddle/PaddleDetection/tree/master/configs/yolov4/yolov4_csdarknet.yml)                   |
+| YOLOv4(AlexeyAB)         |     -      |     -      | CSPDarknet |     320     |  38.0  |      123       |          199.0           | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov4_cspdarknet.pdparams) |  [config](https://github.com/PaddlePaddle/PaddleDetection/tree/master/configs/yolov4/yolov4_csdarknet.yml)                   |
+| PP-YOLO                  |     8      |     24     | ResNet50vd |     608     |  45.2  |      72.9      |          155.6          | [model](https://paddlemodels.bj.bcebos.com/object_detection/ppyolo.pdparams) |  [config](https://github.com/PaddlePaddle/PaddleDetection/tree/master/configs/ppyolo/ppyolo.yml)                   |
+| PP-YOLO                  |     8      |     24     | ResNet50vd |     512     |  44.4  |      89.9      |          188.4          | [model](https://paddlemodels.bj.bcebos.com/object_detection/ppyolo.pdparams) |  [config](https://github.com/PaddlePaddle/PaddleDetection/tree/master/configs/ppyolo/ppyolo.yml)                   |
+| PP-YOLO                  |     8      |     24     | ResNet50vd |     416     |  42.5  |     109.1      |          215.4          | [model](https://paddlemodels.bj.bcebos.com/object_detection/ppyolo.pdparams) |  [config](https://github.com/PaddlePaddle/PaddleDetection/tree/master/configs/ppyolo/ppyolo.yml)                   |
+| PP-YOLO                  |     8      |     24     | ResNet50vd |     320     |  39.3  |     132.2      |          242.2          | [model](https://paddlemodels.bj.bcebos.com/object_detection/ppyolo.pdparams) |  [config](https://github.com/PaddlePaddle/PaddleDetection/tree/master/configs/ppyolo/ppyolo.yml)                   |
+
+**Notes:**
+
+- PP-YOLO is trained on the COCO train2017 dataset and evaluated on the test-dev2017 dataset. `Box AP` is the `mAP(IoU=0.5:0.95)` evaluation result.
+- PP-YOLO is trained on 8 GPUs with a batch size of 24 per GPU. If the number of GPUs or the batch size differs from this configuration, adjust the learning rate and number of iterations according to the [FAQ](../../docs/FAQ.md).
+- PP-YOLO inference speed is tested on a single V100 with batch size=1, using CUDA 10.2 and CUDNN 7.5.1. TensorRT inference speed is tested with TensorRT 5.1.2.2.
+- The FP32 inference speed of PP-YOLO is measured by exporting the model with `tools/export_model.py` and then benchmarking it with the Paddle inference library through the `--run_benchmark` option of `deploy/python/infer.py`. All figures exclude data preprocessing and model output post-processing (NMS), consistent with the test method of [YOLOv4(AlexeyAB)](https://github.com/AlexeyAB/darknet).
+- Compared with FP32, the TensorRT FP16 speed test also excludes the time spent in `yolo_box` (bbox decoding). That is, it excludes data preprocessing, bbox decoding, and NMS, consistent with the test method of [YOLOv4(AlexeyAB)](https://github.com/AlexeyAB/darknet).
+- The accuracy and V100 FP32 inference speed of YOLOv4(AlexeyAB) are the single-V100 figures published in the [YOLOv4 GitHub repository](https://github.com/AlexeyAB/darknet). The V100 TensorRT FP16 inference speed was measured with the tkDNN configuration from the [AlexeyAB/darknet](https://github.com/AlexeyAB/darknet) repository on a single V100 with TensorRT 5.1.2.2.
+- The `download` and `config` entries in the YOLOv4(AlexeyAB) rows point to the YOLOv4 model reproduced in PaddleDetection. Its evaluation accuracy is already aligned and fine-tuning is supported, while training accuracy is still being aligned. See the [PaddleDetection YOLOv4 model](../yolov4/README.md).
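
The notes above say the learning rate and iteration count must be adjusted when the GPU count or per-GPU batch size changes, and defer to the FAQ for the details. As a minimal sketch, assuming the common linear scaling heuristic (learning rate proportional to total batch size, iterations inversely proportional), the adjustment could be computed as below. The heuristic itself, the function name `scale_schedule`, and the reference values `base_lr=0.01` and `base_iters=250000` are assumptions for illustration; read the real values from `configs/ppyolo/ppyolo.yml` and follow the FAQ where it differs.

```python
def scale_schedule(base_lr, base_iters, base_gpus, base_bs_per_gpu, gpus, bs_per_gpu):
    """Scale LR linearly and iterations inversely with the total batch size."""
    base_total = base_gpus * base_bs_per_gpu
    total = gpus * bs_per_gpu
    ratio = total / base_total
    return base_lr * ratio, int(round(base_iters / ratio))


# Placeholder reference schedule for 8 GPUs x 24 images (hypothetical values).
lr, iters = scale_schedule(
    base_lr=0.01, base_iters=250000,
    base_gpus=8, base_bs_per_gpu=24,
    gpus=4, bs_per_gpu=24,
)
print(lr, iters)
```

Halving the total batch size halves the learning rate and doubles the number of iterations, so the model still sees the same number of images overall.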
+
+### PP-YOLO tiny
+
+|          Model           | GPU number | images/GPU |  backbone  | input shape | Box AP | V100 FP32(FPS) | V100 TensorRT FP16(FPS) | download | config  |
+|:------------------------:|:----------:|:----------:|:----------:| :----------:| :----: | :------------: | :---------------------: | :------: | :------: |
+| PP-YOLO tiny             |     4      |     32     | ResNet18vd |     416     |  47.0  |     401.6      |          724.6          | [model](https://paddlemodels.bj.bcebos.com/object_detection/ppyolo_tiny.pdparams) |  [config](https://github.com/PaddlePaddle/PaddleDetection/tree/master/configs/ppyolo/ppyolo_tiny.yml)                   |
+| PP-YOLO tiny             |     4      |     32     | ResNet18vd |     320     |  43.7  |     478.5      |          791.3          | [model](https://paddlemodels.bj.bcebos.com/object_detection/ppyolo_tiny.pdparams) |  [config](https://github.com/PaddlePaddle/PaddleDetection/tree/master/configs/ppyolo/ppyolo_tiny.yml)                   |
+
+- PP-YOLO tiny is trained on the COCO train2017 dataset and evaluated on the val2017 dataset. `Box AP` is the `mAP(IoU=0.5)` evaluation result.
+- PP-YOLO tiny is trained on 4 GPUs with a batch size of 32 per GPU. If the number of GPUs or the batch size differs from this configuration, adjust the learning rate and number of iterations according to the [FAQ](../../docs/FAQ.md).
+- The inference speed test environment and method for PP-YOLO tiny are the same as for PP-YOLO.
+
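+## Getting Started
+
+### 1. Training
+
+Start training on 8 GPUs with the following command (all commands below assume you run them from the PaddleDetection root directory). The `--eval` option enables evaluation alternating with training.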
+
+```bash
+CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python tools/train.py -c configs/ppyolo/ppyolo.yml --eval
+```
+
+### 2. Evaluation
+
+Evaluate the model on the COCO val2017 dataset on a single GPU with the following commands:
+
+```bash
+# use the weights released by PaddleDetection
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/ppyolo/ppyolo.yml -o weights=https://paddlemodels.bj.bcebos.com/object_detection/ppyolo.pdparams
+
+# use a checkpoint saved during training
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/ppyolo/ppyolo.yml -o weights=output/ppyolo/best_model
+```
+
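+We provide `configs/ppyolo/ppyolo_test.yml` for evaluating on the COCO test-dev2017 dataset. To use it, first download the test-dev2017 dataset from the [COCO dataset download page](https://cocodataset.org/#download), extract it to the path configured under `EvalReader.dataset` in `configs/ppyolo/ppyolo_test.yml`, and then run the following commands: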
+
+```bash
+# use the weights released by PaddleDetection
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/ppyolo/ppyolo_test.yml -o weights=https://paddlemodels.bj.bcebos.com/object_detection/ppyolo.pdparams
+
+# use a checkpoint saved during training
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/ppyolo/ppyolo_test.yml -o weights=output/ppyolo/best_model
+```
+
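+The evaluation results are saved in `bbox.json`. Compress the file into a zip archive and submit it for evaluation through the [COCO dataset evaluation page](https://competitions.codalab.org/competitions/20794#participate).
+
+**NOTE:** `configs/ppyolo/ppyolo_test.yml` is only for evaluating on the COCO test-dev dataset. It is not intended for training or for evaluating on the COCO val2017 dataset.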
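
The compression step described above can be done with any zip tool. As a minimal sketch using only the Python standard library, the helper below packs a results file into an archive. The helper name `zip_results` and the archive name `bbox.zip` are assumptions for illustration, and the evaluation server may expect a particular file name inside the archive, so check its submission instructions first.

```python
import os
import tempfile
import zipfile


def zip_results(json_path, zip_path):
    """Compress a single results file into a zip archive and return its path."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # Store only the file name so the archive has no directory prefix.
        zf.write(json_path, arcname=os.path.basename(json_path))
    return zip_path


# Demo with a stand-in file; in practice, point json_path at the real bbox.json.
workdir = tempfile.mkdtemp()
demo_json = os.path.join(workdir, "bbox.json")
with open(demo_json, "w") as f:
    f.write("[]")

archive = zip_results(demo_json, os.path.join(workdir, "bbox.zip"))
with zipfile.ZipFile(archive) as zf:
    names = zf.namelist()
print(names)
```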
+
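+### 3. Inference
+
+Run inference on images on a single GPU with the following commands. Use `--infer_img` to specify an image path, or `--infer_dir` to specify a directory and run inference on every image in it.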
+
+```bash
+# run inference on a single image
+CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/ppyolo/ppyolo.yml -o weights=https://paddlemodels.bj.bcebos.com/object_detection/ppyolo.pdparams --infer_img=demo/000000014439_640x640.jpg
+
+# run inference on all images in a directory
+CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/ppyolo/ppyolo.yml -o weights=https://paddlemodels.bj.bcebos.com/object_detection/ppyolo.pdparams --infer_dir=demo
+```
+
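+### 4. Inference deployment and benchmark
+
+To deploy PP-YOLO or run the inference benchmark, first export the model with `tools/export_model.py`, then run it with the Paddle inference library. Both steps can be started with the following commands.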
+
+```bash
+# export the model, saved to the output/ppyolo directory by default
+python tools/export_model.py -c configs/ppyolo/ppyolo.yml -o weights=https://paddlemodels.bj.bcebos.com/object_detection/ppyolo.pdparams
+
+# run inference with the Paddle inference library
+CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output/ppyolo --image_file=demo/000000014439_640x640.jpg --use_gpu=True
+```
+
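+The PP-YOLO benchmark measures only the network itself, excluding data preprocessing and network output post-processing (NMS). When exporting the model, specify `--exclude_nms` to strip the NMS post-processing from the model, then export and benchmark with the following commands.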
+
+```bash
+# export the model with --exclude_nms to strip the NMS part, saved to the output/ppyolo directory by default
+python tools/export_model.py -c configs/ppyolo/ppyolo.yml -o weights=https://paddlemodels.bj.bcebos.com/object_detection/ppyolo.pdparams --exclude_nms
+
+# FP32 benchmark
+CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output/ppyolo --image_file=demo/000000014439_640x640.jpg --use_gpu=True --run_benchmark=True
+
+# TensorRT FP16 benchmark
+CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output/ppyolo --image_file=demo/000000014439_640x640.jpg --use_gpu=True --run_benchmark=True --run_mode=trt_fp16
+```
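
To make the FPS figures easier to interpret, the sketch below shows one generic way a throughput benchmark can be timed (untimed warm-up runs followed by an averaged timed loop) and how an FPS value converts to per-image latency. This is an illustrative assumption about benchmarking in general, not a description of how `deploy/python/infer.py` implements `--run_benchmark`; the names `benchmark` and `fps_to_latency_ms` and the dummy workload are hypothetical.

```python
import time


def benchmark(fn, warmup=10, repeats=100):
    """Return (average latency in ms, FPS) for calling fn() repeatedly."""
    for _ in range(warmup):
        fn()  # warm-up runs are not timed
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    elapsed = time.perf_counter() - start
    latency_ms = elapsed / repeats * 1000.0
    return latency_ms, 1000.0 / latency_ms


def fps_to_latency_ms(fps):
    """Convert frames per second at batch size 1 to milliseconds per image."""
    return 1000.0 / fps


# Stand-in workload; a real benchmark would call the exported model here.
latency_ms, fps = benchmark(lambda: sum(range(1000)))
print(f"dummy workload: {latency_ms:.4f} ms per call, {fps:.1f} calls/s")

# Latency implied by the reported 72.9 FPS (FP32) and 155.6 FPS (TensorRT FP16).
print(fps_to_latency_ms(72.9), fps_to_latency_ms(155.6))
```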
+
+## Future Work
+
+1. Release the PP-YOLO-tiny model
+2. Release PP-YOLO and PP-YOLO-tiny models with more backbones
+
+## Appendix
+
+The table below shows the ablation results for the optimizations PP-YOLO applies on top of YOLOv3.
+
+| NO.  |        Model                 | Box AP | Params(M) | FLOPs(G) | V100 FP32 FPS |
+| :--: | :--------------------------- | :----: | :-------: | :------: | :-----------: |
+|  A   | YOLOv3-DarkNet53             |  38.9  |   59.13   |  65.52   |      58.2     |
+|  B   | YOLOv3-ResNet50vd-DCN        |  39.1  |   43.89   |  44.71   |      79.2     |
+|  C   | B + LB + EMA + DropBlock     |  41.4  |   43.89   |  44.71   |      79.2     |
+|  D   | C + IoU Loss                 |  41.9  |   43.89   |  44.71   |      79.2     |
+|  E   | D + IoU Aware                |  42.5  |   43.90   |  44.71   |      74.9     |
+|  F   | E + Grid Sensitive           |  42.8  |   43.90   |  44.71   |      74.8     |
+|  G   | F + Matrix NMS               |  43.5  |   43.90   |  44.71   |      74.8     |
+|  H   | G + CoordConv                |  44.0  |   43.93   |  44.76   |      74.1     |
+|  I   | H + SPP                      |  44.3  |   44.93   |  45.12   |      72.9     |
+|  J   | I + Better ImageNet Pretrain |  44.6  |   44.93   |  45.12   |      72.9     |
+
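+**Notes:**
+
+- Both accuracy and inference speed are measured with an input image size of 608.
+- Box AP is obtained by training on the COCO train2017 dataset and evaluating on the val2017 dataset.
+- Inference speed is measured on a single V100 with batch size=1, using the benchmark method described above with CUDA 10.2 and CUDNN 7.5.1.
+- The 38.9 accuracy of [YOLOv3-DarkNet53](../yolov3_darknet.yml) comes from the YOLOv3 model optimized in PaddleDetection. See the [Model Zoo](../../docs/MODEL_ZOO_cn.md).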
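
To read the ablation table step by step, the short script below recomputes the Box AP and FPS change contributed by each row relative to the previous one. The numbers are copied from the table above; the variable names are only illustrative.

```python
# (row, model, Box AP, V100 FP32 FPS) copied from the ablation table.
ablation = [
    ("A", "YOLOv3-DarkNet53", 38.9, 58.2),
    ("B", "YOLOv3-ResNet50vd-DCN", 39.1, 79.2),
    ("C", "B + LB + EMA + DropBlock", 41.4, 79.2),
    ("D", "C + IoU Loss", 41.9, 79.2),
    ("E", "D + IoU Aware", 42.5, 74.9),
    ("F", "E + Grid Sensitive", 42.8, 74.8),
    ("G", "F + Matrix NMS", 43.5, 74.8),
    ("H", "G + CoordConv", 44.0, 74.1),
    ("I", "H + SPP", 44.3, 72.9),
    ("J", "I + Better ImageNet Pretrain", 44.6, 72.9),
]

# Per-step change relative to the previous row.
steps = []
for prev, cur in zip(ablation, ablation[1:]):
    ap_gain = cur[2] - prev[2]
    fps_change = cur[3] - prev[3]
    steps.append((cur[0], ap_gain, fps_change))
    print(f"{cur[0]} ({cur[1]}): {ap_gain:+.1f} AP, {fps_change:+.1f} FPS")

total_gain = round(ablation[-1][2] - ablation[0][2], 1)
largest_step = max(steps, key=lambda s: s[1])[0]
print(f"total: {total_gain:+.1f} AP, largest single step: {largest_step}")
```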