Merge branch 'ppdet_split' of /paddle/work/paddle-fork/models

db2d30dd · root · 4f74dbf2 · 1ea40a8d · db2d30dd · db2d30dd
283 changed file
--- a/.gitignore
+++ b/.gitignore
+# Virtualenv
+/.venv/
+/venv/
+
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+
+# C extensions
+*.so
+
+# json file
+*.json
+
+# Distribution / packaging
+/bin/
+/build/
+/develop-eggs/
+/dist/
+/eggs/
+/lib/
+/lib64/
+/output/
+/parts/
+/sdist/
+/var/
+/*.egg-info/
+/.installed.cfg
+/*.egg
+/.eggs
+
+# AUTHORS and ChangeLog will be generated while packaging
+/AUTHORS
+/ChangeLog
+
+# BCloud / BuildSubmitter
+/build_submitter.*
+/logger_client_log
+
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+
+# Unit test / coverage reports
+.tox/
+.coverage
+.cache
+.pytest_cache
+nosetests.xml
+coverage.xml
+
+# Translations
+*.mo
+
+# Sphinx documentation
+/docs/_build/
+
+*.json
+
+
+dataset/coco/annotations
+dataset/coco/train2017
+dataset/coco/val2017
+dataset/voc/VOCdevkit
--- a/.style.yapf
+++ b/.style.yapf
+[style]
+based_on_style = pep8
+column_limit = 80
--- a/README.md
+++ b/README.md
+English | [简体中文](README_cn.md)
+
+# PaddleDetection
+
+The goal of PaddleDetection is to provide easy access to a wide range of object
+detection models in both industry and research settings. We design
+PaddleDetection to be not only performant, production-ready but also highly
+flexible, catering to research needs.
+
+
+<div align="center">
+  <img src="demo/output/000000570688.jpg" />
+</div>
+
+
+## Introduction
+
+Features:
+
+- Production Ready:
+
+  Key operations are implemented in C++ and CUDA, together with PaddlePaddle's
+highly efficient inference engine, enables easy deployment in server environments.
+
+- Highly Flexible:
+
+  Components are designed to be modular. Model architectures, as well as data
+preprocess pipelines, can be easily customized with simple configuration
+changes.
+
+- Performance Optimized:
+
+  With the help of the underlying PaddlePaddle framework, faster training and
+reduced GPU memory footprint is achieved. Notably, YOLOv3 training is
+much faster compared to other frameworks. Another example is Mask-RCNN
+(ResNet50), we managed to fit up to 4 images per GPU (Tesla V100 16GB) during
+multi-GPU training.
+
+Supported Architectures:
+
+|                     | ResNet | ResNet-vd <sup>[1](#vd)</sup> | ResNeXt-vd | SENet | MobileNet | DarkNet | VGG  |
+| ------------------- | :----: | ----------------------------: | :--------: | :---: | :-------: | :-----: | :--: |
+| Faster R-CNN        |   ✓    |                             ✓ |     x      |   ✓   |     ✗     |    ✗    |  ✗   |
+| Faster R-CNN + FPN  |   ✓    |                             ✓ |     ✓      |   ✓   |     ✗     |    ✗    |  ✗   |
+| Mask R-CNN          |   ✓    |                             ✓ |     x      |   ✓   |     ✗     |    ✗    |  ✗   |
+| Mask R-CNN + FPN    |   ✓    |                             ✓ |     ✓      |   ✓   |     ✗     |    ✗    |  ✗   |
+| Cascade Faster-RCNN |   ✓    |                             ✓ |     ✓      |   ✗   |     ✗     |    ✗    |  ✗   |
+| Cascade Mask-RCNN   |   ✓    |                             ✗ |     ✗      |   ✓   |     ✗     |    ✗    |  ✗   |
+| RetinaNet           |   ✓    |                             ✗ |     ✗      |   ✗   |     ✗     |    ✗    |  ✗   |
+| YOLOv3              |   ✓    |                             ✗ |     ✗      |   ✗   |     ✓     |    ✓    |  ✗   |
+| SSD                 |   ✗    |                             ✗ |     ✗      |   ✗   |     ✓     |    ✗    |  ✓   |
+
+<a name="vd">[1]</a> [ResNet-vd](https://arxiv.org/pdf/1812.01187) models offer much improved accuracy with negligible performance cost.
+
+Advanced Features:
+
+- [x] **Synchronized Batch Norm**: currently used by YOLOv3.
+- [x] **Group Norm**
+- [x] **Modulated Deformable Convolution**
+- [x] **Deformable PSRoI Pooling**
+
+**NOTE:** Synchronized batch normalization can only be used on multiple GPU devices, can not be used on CPU devices or single GPU device.
+
+## Get Started
+
+- [Installation guide](docs/INSTALL.md)
+- [Quick start on small dataset](docs/QUICK_STARTED.md)
+- [Guide to traing, evaluate and arguments description](docs/GETTING_STARTED.md)
+- [Guide to preprocess pipeline and custom dataset](docs/DATA.md)
+- [Introduction to the configuration workflow](docs/CONFIG.md)
+- [Examples for detailed configuration explanation](docs/config_example/)
+- [IPython Notebook demo](demo/mask_rcnn_demo.ipynb)
+- [Transfer learning document](docs/TRANSFER_LEARNING.md)
+
+## Model Zoo
+
+- Pretrained models are available in the [PaddleDetection model zoo](docs/MODEL_ZOO.md).
+- [Face detection models](configs/face_detection/README.md)
+- [Pretrained models for pedestrian  and vehicle detection](contrib/README.md)
+
+## Model compression
+
+- [ Quantification aware training example](slim/quantization)
+- [ Pruning compression example](slim/prune)
+
+## Depoly
+
+- [Export model for inference depolyment](docs/EXPORT_MODEL.md)
+- [C++ inference depolyment](inference/README.md)
+
+## Benchmark
+
+- [Inference benchmark](docs/BENCHMARK_INFER_cn.md)
+
+
+## Updates
+
+#### 10/2019
+
+- Face detection models included: BlazeFace, Faceboxes.
+- Enrich COCO models,  box mAP up to 51.9%.
+- Add CACacascade RCNN, one of the best single model of Objects365 2019 challenge Full Track champion.
+- Add pretrained models for pedestrian and vehicle detection.
+- Support mixed-precision training.
+- Add C++ inference depolyment.
+- Add model compression examples.
+
+#### 2/9/2019
+
+- Add retrained models for GroupNorm.
+
+- Add Cascade-Mask-RCNN+FPN.
+
+#### 5/8/2019
+
+- Add a series of models ralated modulated Deformable Convolution.
+
+#### 7/29/2019
+
+- Update Chinese docs for PaddleDetection
+- Fix bug in R-CNN models when train and test at the same time
+- Add ResNext101-vd + Mask R-CNN + FPN models
+- Add YOLOv3 on VOC models
+
+#### 7/3/2019
+
+- Initial release of PaddleDetection and detection model zoo
+- Models included: Faster R-CNN, Mask R-CNN, Faster R-CNN+FPN, Mask
+  R-CNN+FPN, Cascade-Faster-RCNN+FPN, RetinaNet, YOLOv3, and SSD.
+
+
+## Contributing
+
+Contributions are highly welcomed and we would really appreciate your feedback!!
--- a/README_cn.md
+++ b/README_cn.md
+[English](README.md) | 简体中文
+
+# PaddleDetection
+
+PaddleDetection的目的是为工业界和学术界提供丰富、易用的目标检测模型。不仅性能优越、易于部署，而且能够灵活的满足算法研究的需求。
+
+<div align="center">
+  <img src="demo/output/000000570688.jpg" />
+</div>
+
+
+## 简介
+
+特性：
+
+- 易部署:
+
+  PaddleDetection的模型中使用的核心算子均通过C++或CUDA实现，同时基于PaddlePaddle的高性能推理引擎可以方便地部署在多种硬件平台上。
+
+- 高灵活度：
+
+  PaddleDetection通过模块化设计来解耦各个组件，基于配置文件可以轻松地搭建各种检测模型。
+
+- 高性能：
+
+  基于PaddlePaddle框架的高性能内核，在模型训练速度、显存占用上有一定的优势。例如，YOLOv3的训练速度快于其他框架，在Tesla V100 16GB环境下，Mask-RCNN(ResNet50)可以单卡Batch Size可以达到4 (甚至到5)。
+
+支持的模型结构：
+
+|                    | ResNet | ResNet-vd <sup>[1](#vd)</sup> | ResNeXt-vd | SENet | MobileNet | DarkNet | VGG |
+|--------------------|:------:|------------------------------:|:----------:|:-----:|:---------:|:-------:|:---:|
+| Faster R-CNN       | ✓      |                             ✓ | x          | ✓     | ✗         | ✗       | ✗   |
+| Faster R-CNN + FPN | ✓      |                             ✓ | ✓          | ✓     | ✗         | ✗       | ✗   |
+| Mask R-CNN         | ✓      |                             ✓ | x          | ✓     | ✗         | ✗       | ✗   |
+| Mask R-CNN + FPN   | ✓      |                             ✓ | ✓          | ✓     | ✗         | ✗       | ✗   |
+| Cascade Faster-CNN | ✓      |                             ✓ | ✓          | ✗     | ✗         | ✗       | ✗  |
+| Cascade Mask-CNN   | ✓      |                             ✗ | ✗          | ✓     | ✗         | ✗       | ✗   |
+| RetinaNet          | ✓      |                             ✗ | ✓          | ✗     | ✗         | ✗       | ✗   |
+| YOLOv3             | ✓      |                             ✗ | ✗          | ✗     | ✓         | ✓       | ✗   |
+| SSD                | ✗      |                             ✗ | ✗          | ✗     | ✓         | ✗       | ✓   |
+
+<a name="vd">[1]</a> [ResNet-vd](https://arxiv.org/pdf/1812.01187) 模型提供了较大的精度提高和较少的性能损失。
+
+扩展特性：
+
+- [x] **Synchronized Batch Norm**: 目前在YOLOv3中使用。
+- [x] **Group Norm**
+- [x] **Modulated Deformable Convolution**
+- [x] **Deformable PSRoI Pooling**
+
+**注意:** Synchronized batch normalization 只能在多GPU环境下使用，不能在CPU环境或者单GPU环境下使用。
+
+
+## 使用教程
+
+- [安装说明](docs/INSTALL_cn.md)
+- [快速开始](docs/QUICK_STARTED_cn.md)
+- [训练、评估及参数说明](docs/GETTING_STARTED_cn.md)
+- [数据预处理及自定义数据集](docs/DATA_cn.md)
+- [配置模块设计和介绍](docs/CONFIG_cn.md)
+- [详细的配置信息和参数说明示例](docs/config_example/)
+- [IPython Notebook demo](demo/mask_rcnn_demo.ipynb)
+- [迁移学习教程](docs/TRANSFER_LEARNING_cn.md)
+
+## 模型库
+
+- [模型库](docs/MODEL_ZOO_cn.md)
+- [人脸检测模型](configs/face_detection/README.md)
+- [行人检测和车辆检测预训练模型](contrib/README_cn.md)
+
+
+## 模型压缩
+- [量化训练压缩示例](slim/quantization)
+- [剪枝压缩示例](slim/prune)
+
+## 推理部署
+
+- [模型导出教程](docs/EXPORT_MODEL.md)
+- [C++推理部署](inference/README.md)
+
+## Benchmark
+
+- [推理Benchmark](docs/BENCHMARK_INFER_cn.md)
+
+
+
+## 版本更新
+
+### 10/2019
+
+- 增加人脸检测模型BlazeFace、Faceboxes。
+- 丰富基于COCO的模型，精度高达51.9%。
+- 增加Objects365 2019 Challenge上夺冠的最佳单模型之一CACascade-RCNN。
+- 增加行人检测和车辆检测预训练模型。
+- 支持FP16训练。
+- 增加跨平台的C++推理部署方案。
+- 增加模型压缩示例。
+
+
+### 2/9/2019
+- 增加GroupNorm模型。
+- 增加CascadeRCNN+Mask模型。
+
+#### 5/8/2019
+- 增加Modulated Deformable Convolution系列模型。
+
+#### 7/22/2019
+
+- 增加检测库中文文档
+- 修复R-CNN系列模型训练同时进行评估的问题
+- 新增ResNext101-vd + Mask R-CNN + FPN模型
+- 新增基于VOC数据集的YOLOv3模型
+
+#### 7/3/2019
+
+- 首次发布PaddleDetection检测库和检测模型库
+- 模型包括：Faster R-CNN, Mask R-CNN, Faster R-CNN+FPN, Mask
+  R-CNN+FPN, Cascade-Faster-RCNN+FPN, RetinaNet, YOLOv3, 和SSD.
+
+## 如何贡献代码
+
+我们非常欢迎你可以为PaddleDetection提供代码，也十分感谢你的反馈。
--- a/configs/cascade_mask_rcnn_r50_fpn_1x.yml
+++ b/configs/cascade_mask_rcnn_r50_fpn_1x.yml
+architecture: CascadeMaskRCNN
+train_feed: MaskRCNNTrainFeed
+eval_feed: MaskRCNNEvalFeed
+test_feed: MaskRCNNTestFeed
+use_gpu: true
+max_iters: 180000
+snapshot_iter: 10000
+log_smooth_window: 20
+save_dir: output
+pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
+metric: COCO
+weights: output/cascade_mask_rcnn_r50_fpn_1x/model_final/
+num_classes: 81
+
+CascadeMaskRCNN:
+  backbone: ResNet
+  fpn: FPN
+  rpn_head: FPNRPNHead
+  roi_extractor: FPNRoIAlign
+  bbox_head: CascadeBBoxHead
+  bbox_assigner: CascadeBBoxAssigner
+  mask_assigner: MaskAssigner
+  mask_head: MaskHead
+
+ResNet:
+  depth: 50
+  feature_maps: [2, 3, 4, 5]
+  freeze_at: 2
+  norm_type: affine_channel
+
+FPN:
+  max_level: 6
+  min_level: 2
+  num_chan: 256
+  spatial_scale: [0.03125, 0.0625, 0.125, 0.25]
+
+FPNRPNHead:
+  anchor_generator:
+    aspect_ratios: [0.5, 1.0, 2.0]
+    variance: [1.0, 1.0, 1.0, 1.0]
+  anchor_start_size: 32
+  max_level: 6
+  min_level: 2
+  num_chan: 256
+  rpn_target_assign:
+    rpn_batch_size_per_im: 256
+    rpn_fg_fraction: 0.5
+    rpn_negative_overlap: 0.3
+    rpn_positive_overlap: 0.7
+    rpn_straddle_thresh: 0.0
+  train_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 2000
+    post_nms_top_n: 2000
+  test_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 1000
+    post_nms_top_n: 1000
+
+FPNRoIAlign:
+  canconical_level: 4
+  canonical_size: 224
+  max_level: 5
+  min_level: 2
+  sampling_ratio: 2
+  box_resolution: 7
+  mask_resolution: 14
+
+MaskHead:
+  dilation: 1
+  conv_dim: 256
+  num_convs: 4
+  resolution: 28
+
+CascadeBBoxAssigner:
+  batch_size_per_im: 512
+  bbox_reg_weights: [10, 20, 30]
+  bg_thresh_hi: [0.5, 0.6, 0.7]
+  bg_thresh_lo: [0.0, 0.0, 0.0]
+  fg_fraction: 0.25
+  fg_thresh: [0.5, 0.6, 0.7]
+
+MaskAssigner:
+  resolution: 28
+
+CascadeBBoxHead:
+  head: CascadeTwoFCHead 
+  nms:
+    keep_top_k: 100
+    nms_threshold: 0.5
+    score_threshold: 0.05
+
+CascadeTwoFCHead:
+  mlp_dim: 1024
+
+LearningRate:
+  base_lr: 0.01
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [120000, 160000]
+  - !LinearWarmup
+    start_factor: 0.3333333333333333
+    steps: 500
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.9
+    type: Momentum
+  regularizer:
+    factor: 0.0001
+    type: L2
+
+MaskRCNNTrainFeed:
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_train2017.json
+    image_dir: train2017
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
+
+MaskRCNNEvalFeed:
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_val2017.json
+    image_dir: val2017
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
+
+MaskRCNNTestFeed:
+  batch_size: 1
+  dataset:
+    annotation: dataset/coco/annotations/instances_val2017.json
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
--- a/configs/cascade_rcnn_r50_fpn_1x.yml
+++ b/configs/cascade_rcnn_r50_fpn_1x.yml
+architecture: CascadeRCNN
+train_feed: FasterRCNNTrainFeed
+eval_feed: FasterRCNNEvalFeed
+test_feed: FasterRCNNTestFeed
+max_iters: 90000
+snapshot_iter: 10000
+use_gpu: true
+log_smooth_window: 20
+save_dir: output
+pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
+weights: output/cascade_rcnn_r50_fpn_1x/model_final
+metric: COCO
+num_classes: 81
+
+CascadeRCNN:
+  backbone: ResNet
+  fpn: FPN
+  rpn_head: FPNRPNHead
+  roi_extractor: FPNRoIAlign
+  bbox_head: CascadeBBoxHead
+  bbox_assigner: CascadeBBoxAssigner
+
+ResNet:
+  norm_type: affine_channel
+  depth: 50
+  feature_maps: [2, 3, 4, 5]
+  freeze_at: 2
+  variant: b
+
+FPN:
+  min_level: 2
+  max_level: 6
+  num_chan: 256
+  spatial_scale: [0.03125, 0.0625, 0.125, 0.25]
+
+FPNRPNHead:
+  anchor_generator:
+    anchor_sizes: [32, 64, 128, 256, 512]
+    aspect_ratios: [0.5, 1.0, 2.0]
+    stride: [16.0, 16.0]
+    variance: [1.0, 1.0, 1.0, 1.0]
+  anchor_start_size: 32
+  min_level: 2
+  max_level: 6
+  num_chan: 256
+  rpn_target_assign:
+    rpn_batch_size_per_im: 256
+    rpn_fg_fraction: 0.5
+    rpn_positive_overlap: 0.7
+    rpn_negative_overlap: 0.3
+    rpn_straddle_thresh: 0.0
+  train_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 2000
+    post_nms_top_n: 2000
+  test_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 1000
+    post_nms_top_n: 1000
+
+FPNRoIAlign:
+  canconical_level: 4
+  canonical_size: 224
+  min_level: 2
+  max_level: 5
+  box_resolution: 7
+  sampling_ratio: 2
+
+CascadeBBoxAssigner:
+  batch_size_per_im: 512
+  bbox_reg_weights: [10, 20, 30]
+  bg_thresh_lo: [0.0, 0.0, 0.0]
+  bg_thresh_hi: [0.5, 0.6, 0.7]
+  fg_thresh: [0.5, 0.6, 0.7]
+  fg_fraction: 0.25
+
+CascadeBBoxHead:
+  head: CascadeTwoFCHead
+  nms:
+    keep_top_k: 100
+    nms_threshold: 0.5
+    score_threshold: 0.05
+
+CascadeTwoFCHead:
+  mlp_dim: 1024
+
+LearningRate:
+  base_lr: 0.02
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [60000, 80000]
+  - !LinearWarmup
+    start_factor: 0.3333333333333333
+    steps: 500
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.9
+    type: Momentum
+  regularizer:
+    factor: 0.0001
+    type: L2
+
+FasterRCNNTrainFeed:
+  batch_size: 2
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_train2017.json
+    image_dir: train2017
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  drop_last: false
+  num_workers: 2
+
+FasterRCNNEvalFeed:
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_val2017.json
+    image_dir: val2017
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+
+FasterRCNNTestFeed:
+  batch_size: 1
+  dataset:
+    annotation: dataset/coco/annotations/instances_val2017.json
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  drop_last: false
+  num_workers: 2
--- a/configs/cascade_rcnn_r50_fpn_1x_ms_test.yml
+++ b/configs/cascade_rcnn_r50_fpn_1x_ms_test.yml
+architecture: CascadeRCNN
+train_feed: FasterRCNNTrainFeed
+eval_feed: FasterRCNNEvalFeed
+test_feed: FasterRCNNTestFeed
+max_iters: 90000
+snapshot_iter: 10000
+use_gpu: true
+log_smooth_window: 20
+save_dir: output
+pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
+weights: output/cascade_rcnn_r50_fpn_1x/model_final
+metric: COCO
+num_classes: 81
+
+CascadeRCNN:
+  backbone: ResNet
+  fpn: FPN
+  rpn_head: FPNRPNHead
+  roi_extractor: FPNRoIAlign
+  bbox_head: CascadeBBoxHead
+  bbox_assigner: CascadeBBoxAssigner
+
+ResNet:
+  norm_type: affine_channel
+  depth: 50
+  feature_maps: [2, 3, 4, 5]
+  freeze_at: 2
+  variant: b
+
+FPN:
+  min_level: 2
+  max_level: 6
+  num_chan: 256
+  spatial_scale: [0.03125, 0.0625, 0.125, 0.25]
+
+FPNRPNHead:
+  anchor_generator:
+    anchor_sizes: [32, 64, 128, 256, 512]
+    aspect_ratios: [0.5, 1.0, 2.0]
+    stride: [16.0, 16.0]
+    variance: [1.0, 1.0, 1.0, 1.0]
+  anchor_start_size: 32
+  min_level: 2
+  max_level: 6
+  num_chan: 256
+  rpn_target_assign:
+    rpn_batch_size_per_im: 256
+    rpn_fg_fraction: 0.5
+    rpn_positive_overlap: 0.7
+    rpn_negative_overlap: 0.3
+    rpn_straddle_thresh: 0.0
+  train_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 2000
+    post_nms_top_n: 2000
+  test_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 1000
+    post_nms_top_n: 1000
+
+FPNRoIAlign:
+  canconical_level: 4
+  canonical_size: 224
+  min_level: 2
+  max_level: 5
+  box_resolution: 7
+  sampling_ratio: 2
+
+CascadeBBoxAssigner:
+  batch_size_per_im: 512
+  bbox_reg_weights: [10, 20, 30]
+  bg_thresh_lo: [0.0, 0.0, 0.0]
+  bg_thresh_hi: [0.5, 0.6, 0.7]
+  fg_thresh: [0.5, 0.6, 0.7]
+  fg_fraction: 0.25
+
+CascadeBBoxHead:
+  head: CascadeTwoFCHead
+  nms:
+    keep_top_k: 100
+    nms_threshold: 0.5
+    score_threshold: 0.05
+
+CascadeTwoFCHead:
+  mlp_dim: 1024
+
+MultiScaleTEST:
+  score_thresh: 0.05
+  nms_thresh: 0.5
+  detections_per_im: 100
+  enable_voting: true
+  vote_thresh: 0.9
+
+LearningRate:
+  base_lr: 0.02
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [60000, 80000]
+  - !LinearWarmup
+    start_factor: 0.3333333333333333
+    steps: 500
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.9
+    type: Momentum
+  regularizer:
+    factor: 0.0001
+    type: L2
+
+FasterRCNNTrainFeed:
+  batch_size: 2
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_train2017.json
+    image_dir: train2017
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  drop_last: false
+  num_workers: 2
+
+FasterRCNNEvalFeed:
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_val2017.json
+    image_dir: val2017
+  sample_transforms:
+  - !DecodeImage
+    to_rgb: true
+  - !NormalizeImage
+    is_channel_first: false
+    is_scale: true
+    mean:
+    - 0.485
+    - 0.456
+    - 0.406
+    std:
+    - 0.229
+    - 0.224
+    - 0.225
+  - !MultiscaleTestResize
+    origin_target_size: 800
+    origin_max_size: 1333
+    target_size:
+    - 400
+    - 500
+    - 600
+    - 700
+    - 900
+    - 1000
+    - 1100
+    - 1200
+    max_size: 2000
+    use_flip: true
+  - !Permute
+    channel_first: true
+    to_bgr: false
+  batch_transforms:
+  - !PadMSTest
+    pad_to_stride: 32
+  num_scale: 18
+  num_workers: 2
+
+FasterRCNNTestFeed:
+  batch_size: 1
+  dataset:
+    annotation: dataset/coco/annotations/instances_val2017.json
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  drop_last: false
+  num_workers: 2
--- a/configs/dcn/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_s1x.yml
+++ b/configs/dcn/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_s1x.yml
+architecture: CascadeMaskRCNN
+train_feed: MaskRCNNTrainFeed
+eval_feed: MaskRCNNEvalFeed
+test_feed: MaskRCNNTestFeed
+max_iters: 300000
+snapshot_iter: 10
+use_gpu: true
+log_iter: 20
+log_smooth_window: 20
+save_dir: output
+pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/SENet154_vd_caffe_pretrained.tar 
+weights: output/cascade_mask_rcnn_dcn_se154_vd_fpn_gn_s1x/model_final/
+metric: COCO
+num_classes: 81
+
+CascadeMaskRCNN:
+  backbone: SENet
+  fpn: FPN
+  rpn_head: FPNRPNHead
+  roi_extractor: FPNRoIAlign
+  bbox_head: CascadeBBoxHead
+  bbox_assigner: CascadeBBoxAssigner
+  mask_assigner: MaskAssigner
+  mask_head: MaskHead
+
+SENet:
+  depth: 152
+  feature_maps: [2, 3, 4, 5]
+  freeze_at: 2
+  group_width: 4
+  groups: 64
+  norm_type: bn
+  freeze_norm: True
+  variant: d
+  dcn_v2_stages: [3, 4, 5]
+  std_senet: True
+
+FPN:
+  max_level: 6
+  min_level: 2
+  num_chan: 256
+  spatial_scale: [0.03125, 0.0625, 0.125, 0.25]
+  freeze_norm: False
+  norm_type: gn
+
+FPNRPNHead:
+  anchor_generator:
+    aspect_ratios: [0.5, 1.0, 2.0]
+    variance: [1.0, 1.0, 1.0, 1.0]
+  anchor_start_size: 32
+  max_level: 6
+  min_level: 2
+  num_chan: 256
+  rpn_target_assign:
+    rpn_batch_size_per_im: 256
+    rpn_fg_fraction: 0.5
+    rpn_negative_overlap: 0.3
+    rpn_positive_overlap: 0.7
+    rpn_straddle_thresh: 0.0
+  train_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 2000
+    post_nms_top_n: 2000
+  test_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 1000
+    post_nms_top_n: 1000
+
+FPNRoIAlign:
+  canconical_level: 4
+  canonical_size: 224
+  max_level: 5
+  min_level: 2
+  box_resolution: 7
+  sampling_ratio: 2
+  mask_resolution: 14
+
+MaskHead:
+  dilation: 1
+  conv_dim: 256
+  num_convs: 4
+  resolution: 28
+  norm_type: gn
+
+CascadeBBoxAssigner:
+  batch_size_per_im: 512
+  bbox_reg_weights: [10, 20, 30]
+  bg_thresh_hi: [0.5, 0.6, 0.7]
+  bg_thresh_lo: [0.0, 0.0, 0.0]
+  fg_fraction: 0.25
+  fg_thresh: [0.5, 0.6, 0.7]
+
+MaskAssigner:
+  resolution: 28
+
+CascadeBBoxHead:
+  head: CascadeXConvNormHead 
+  nms:
+    keep_top_k: 100
+    nms_threshold: 0.5
+    score_threshold: 0.05
+
+CascadeXConvNormHead:
+  norm_type: gn
+
+LearningRate:
+  base_lr: 0.01
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [240000, 280000]
+  - !LinearWarmup
+    start_factor: 0.01
+    steps: 2000
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.9
+    type: Momentum
+  regularizer:
+    factor: 0.0001
+    type: L2
+
+MaskRCNNTrainFeed:
+  # batch size per device
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    image_dir: train2017
+    annotation: annotations/instances_train2017.json
+  sample_transforms: 
+  - !DecodeImage
+    to_rgb: False
+    with_mixup: False
+  - !RandomFlipImage
+    is_mask_flip: true
+    is_normalized: false
+    prob: 0.5
+  - !NormalizeImage
+    is_channel_first: false
+    is_scale: False
+    mean:
+    - 102.9801 
+    - 115.9465
+    - 122.7717
+    std:
+    - 1.0 
+    - 1.0 
+    - 1.0 
+  - !ResizeImage
+    interp: 1
+    target_size:
+    - 416
+    - 448
+    - 480
+    - 512
+    - 544
+    - 576
+    - 608
+    - 640
+    - 672
+    - 704 
+    - 736
+    - 768
+    - 800
+    - 832
+    - 864
+    - 896
+    - 928
+    - 960
+    - 992
+    - 1024
+    - 1056
+    - 1088
+    - 1120
+    - 1152
+    - 1184
+    - 1216
+    - 1248
+    - 1280
+    - 1312
+    - 1344
+    - 1376
+    - 1408
+    max_size: 1600
+    use_cv2: true
+  - !Permute
+    channel_first: true
+    to_bgr: false
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 8
+
+MaskRCNNEvalFeed:
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_val2017.json
+    image_dir: val2017
+  sample_transforms: 
+  - !DecodeImage
+    to_rgb: False
+    with_mixup: False
+  - !NormalizeImage
+    is_channel_first: false
+    is_scale: False
+    mean:
+    - 102.9801 
+    - 115.9465
+    - 122.7717
+    std:
+    - 1.0 
+    - 1.0 
+    - 1.0 
+  - !ResizeImage
+    interp: 1
+    target_size:
+    - 800
+    max_size: 1333
+    use_cv2: true
+  - !Permute
+    channel_first: true
+    to_bgr: false
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
+
+MaskRCNNTestFeed:
+  batch_size: 1
+  dataset:
+    annotation: dataset/coco/annotations/instances_val2017.json
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
--- a/configs/dcn/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_s1x_ms_test.yml
+++ b/configs/dcn/cascade_mask_rcnn_dcnv2_se154_vd_fpn_gn_s1x_ms_test.yml
+architecture: CascadeMaskRCNN
+train_feed: MaskRCNNTrainFeed
+eval_feed: MaskRCNNEvalFeed
+test_feed: MaskRCNNTestFeed
+max_iters: 300000
+snapshot_iter: 10000
+use_gpu: true
+log_iter: 20
+log_smooth_window: 20
+save_dir: output
+pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/SENet154_vd_caffe_pretrained.tar 
+weights: output/cascade_mask_rcnn_dcn_se154_vd_fpn_gn_s1x/model_final/
+metric: COCO
+num_classes: 81
+
+CascadeMaskRCNN:
+  backbone: SENet
+  fpn: FPN
+  rpn_head: FPNRPNHead
+  roi_extractor: FPNRoIAlign
+  bbox_head: CascadeBBoxHead
+  bbox_assigner: CascadeBBoxAssigner
+  mask_assigner: MaskAssigner
+  mask_head: MaskHead
+
+SENet:
+  depth: 152
+  feature_maps: [2, 3, 4, 5]
+  freeze_at: 2
+  group_width: 4
+  groups: 64
+  norm_type: bn
+  freeze_norm: True
+  variant: d
+  dcn_v2_stages: [3, 4, 5]
+  std_senet: True
+
+FPN:
+  max_level: 6
+  min_level: 2
+  num_chan: 256
+  spatial_scale: [0.03125, 0.0625, 0.125, 0.25]
+  freeze_norm: False
+  norm_type: gn
+
+FPNRPNHead:
+  anchor_generator:
+    aspect_ratios: [0.5, 1.0, 2.0]
+    variance: [1.0, 1.0, 1.0, 1.0]
+  anchor_start_size: 32
+  max_level: 6
+  min_level: 2
+  num_chan: 256
+  rpn_target_assign:
+    rpn_batch_size_per_im: 256
+    rpn_fg_fraction: 0.5
+    rpn_negative_overlap: 0.3
+    rpn_positive_overlap: 0.7
+    rpn_straddle_thresh: 0.0
+  train_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 2000
+    post_nms_top_n: 2000
+  test_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 1000
+    post_nms_top_n: 1000
+
+FPNRoIAlign:
+  canconical_level: 4
+  canonical_size: 224
+  max_level: 5
+  min_level: 2
+  box_resolution: 7
+  sampling_ratio: 2
+  mask_resolution: 14
+
+MaskHead:
+  dilation: 1
+  conv_dim: 256
+  num_convs: 4
+  resolution: 28
+  norm_type: gn
+
+CascadeBBoxAssigner:
+  batch_size_per_im: 512
+  bbox_reg_weights: [10, 20, 30]
+  bg_thresh_hi: [0.5, 0.6, 0.7]
+  bg_thresh_lo: [0.0, 0.0, 0.0]
+  fg_fraction: 0.25
+  fg_thresh: [0.5, 0.6, 0.7]
+
+MaskAssigner:
+  resolution: 28
+
+CascadeBBoxHead:
+  head: CascadeXConvNormHead 
+  nms:
+    keep_top_k: 100
+    nms_threshold: 0.5
+    score_threshold: 0.05
+
+CascadeXConvNormHead:
+  norm_type: gn
+
+MultiScaleTEST:
+  score_thresh: 0.05
+  nms_thresh: 0.5
+  detections_per_im: 100
+  enable_voting: true
+  vote_thresh: 0.9
+
+LearningRate:
+  base_lr: 0.01
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [240000, 280000]
+  - !LinearWarmup
+    start_factor: 0.01
+    steps: 2000
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.9
+    type: Momentum
+  regularizer:
+    factor: 0.0001
+    type: L2
+
+MaskRCNNTrainFeed:
+  # batch size per device
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    image_dir: train2017
+    annotation: annotations/instances_train2017.json
+  sample_transforms: 
+  - !DecodeImage
+    to_rgb: False
+    with_mixup: False
+  - !RandomFlipImage
+    is_mask_flip: true
+    is_normalized: false
+    prob: 0.5
+  - !NormalizeImage
+    is_channel_first: false
+    is_scale: False
+    mean:
+    - 102.9801 
+    - 115.9465
+    - 122.7717
+    std:
+    - 1.0 
+    - 1.0 
+    - 1.0 
+  - !ResizeImage
+    interp: 1
+    target_size:
+    - 416
+    - 448
+    - 480
+    - 512
+    - 544
+    - 576
+    - 608
+    - 640
+    - 672
+    - 704 
+    - 736
+    - 768
+    - 800
+    - 832
+    - 864
+    - 896
+    - 928
+    - 960
+    - 992
+    - 1024
+    - 1056
+    - 1088
+    - 1120
+    - 1152
+    - 1184
+    - 1216
+    - 1248
+    - 1280
+    - 1312
+    - 1344
+    - 1376
+    - 1408
+    max_size: 1600
+    use_cv2: true
+  - !Permute
+    channel_first: true
+    to_bgr: false
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 8
+
+MaskRCNNEvalFeed:
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_val2017.json
+    image_dir: val2017
+  sample_transforms: 
+  - !DecodeImage
+    to_rgb: False
+  - !NormalizeImage
+    is_channel_first: false
+    is_scale: False
+    mean:
+    - 102.9801 
+    - 115.9465
+    - 122.7717
+    std:
+    - 1.0 
+    - 1.0 
+    - 1.0 
+  - !MultiscaleTestResize
+    origin_target_size: 800
+    origin_max_size: 1333
+    target_size:
+    - 400
+    - 500
+    - 600
+    - 700
+    - 900
+    - 1000
+    - 1100
+    - 1200
+    max_size: 2000
+    use_flip: true
+  - !Permute
+    channel_first: true
+    to_bgr: false
+  batch_transforms:
+  - !PadMSTest
+    pad_to_stride: 32
+  # num_scale = (len(target_size) + 1) * (1 + use_flip)
+  num_scale: 18
+  num_workers: 2
+
+MaskRCNNTestFeed:
+  batch_size: 1
+  dataset:
+    annotation: dataset/coco/annotations/instances_val2017.json
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
--- a/configs/dcn/cascade_rcnn_dcn_r101_vd_fpn_1x.yml
+++ b/configs/dcn/cascade_rcnn_dcn_r101_vd_fpn_1x.yml
+architecture: CascadeRCNN
+train_feed: FasterRCNNTrainFeed
+eval_feed: FasterRCNNEvalFeed
+test_feed: FasterRCNNTestFeed
+max_iters: 90000
+snapshot_iter: 10000
+use_gpu: true
+log_smooth_window: 20
+log_iter: 20
+save_dir: output
+pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_vd_pretrained.tar
+weights: output/cascade_rcnn_dcn_r101_vd_fpn_1x/model_final
+metric: COCO
+num_classes: 81
+
+CascadeRCNN:
+  backbone: ResNet
+  fpn: FPN
+  rpn_head: FPNRPNHead
+  roi_extractor: FPNRoIAlign
+  bbox_head: CascadeBBoxHead
+  bbox_assigner: CascadeBBoxAssigner
+
+ResNet:
+  norm_type: bn
+  depth: 101
+  feature_maps: [2, 3, 4, 5]
+  freeze_at: 2
+  variant: d
+  dcn_v2_stages: [3, 4, 5]
+
+FPN:
+  min_level: 2
+  max_level: 6
+  num_chan: 256
+  spatial_scale: [0.03125, 0.0625, 0.125, 0.25]
+
+FPNRPNHead:
+  anchor_generator:
+    anchor_sizes: [32, 64, 128, 256, 512]
+    aspect_ratios: [0.5, 1.0, 2.0]
+    stride: [16.0, 16.0]
+    variance: [1.0, 1.0, 1.0, 1.0]
+  anchor_start_size: 32
+  min_level: 2
+  max_level: 6
+  num_chan: 256
+  rpn_target_assign:
+    rpn_batch_size_per_im: 256
+    rpn_fg_fraction: 0.5
+    rpn_positive_overlap: 0.7
+    rpn_negative_overlap: 0.3
+    rpn_straddle_thresh: 0.0
+  train_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 2000
+    post_nms_top_n: 2000
+  test_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 1000
+    post_nms_top_n: 1000
+
+FPNRoIAlign:
+  canconical_level: 4
+  canonical_size: 224
+  min_level: 2
+  max_level: 5
+  box_resolution: 7
+  sampling_ratio: 2
+
+CascadeBBoxAssigner:
+  batch_size_per_im: 512
+  bbox_reg_weights: [10, 20, 30]
+  bg_thresh_lo: [0.0, 0.0, 0.0]
+  bg_thresh_hi: [0.5, 0.6, 0.7]
+  fg_thresh: [0.5, 0.6, 0.7]
+  fg_fraction: 0.25
+
+CascadeBBoxHead:
+  head: CascadeTwoFCHead
+  nms:
+    keep_top_k: 100
+    nms_threshold: 0.5
+    score_threshold: 0.05
+
+CascadeTwoFCHead:
+  mlp_dim: 1024
+
+LearningRate:
+  base_lr: 0.02
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [60000, 80000]
+  - !LinearWarmup
+    start_factor: 0.1
+    steps: 1000
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.9
+    type: Momentum
+  regularizer:
+    factor: 0.0001
+    type: L2
+
+FasterRCNNTrainFeed:
+  batch_size: 2
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_train2017.json
+    image_dir: train2017
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  drop_last: false
+  num_workers: 2
+
+FasterRCNNEvalFeed:
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_val2017.json
+    image_dir: val2017
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+
+FasterRCNNTestFeed:
+  batch_size: 1
+  dataset:
+    annotation: dataset/coco/annotations/instances_val2017.json
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  drop_last: false
+  num_workers: 2
--- a/configs/dcn/cascade_rcnn_dcn_r50_fpn_1x.yml
+++ b/configs/dcn/cascade_rcnn_dcn_r50_fpn_1x.yml
+architecture: CascadeRCNN
+train_feed: FasterRCNNTrainFeed
+eval_feed: FasterRCNNEvalFeed
+test_feed: FasterRCNNTestFeed
+max_iters: 90000
+snapshot_iter: 10000
+use_gpu: true
+log_smooth_window: 20
+log_iter: 20
+save_dir: output
+pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
+weights: output/cascade_rcnn_dcn_r50_fpn_1x/model_final
+metric: COCO
+num_classes: 81
+
+CascadeRCNN:
+  backbone: ResNet
+  fpn: FPN
+  rpn_head: FPNRPNHead
+  roi_extractor: FPNRoIAlign
+  bbox_head: CascadeBBoxHead
+  bbox_assigner: CascadeBBoxAssigner
+
+ResNet:
+  norm_type: bn
+  depth: 50
+  feature_maps: [2, 3, 4, 5]
+  freeze_at: 2
+  variant: b
+  dcn_v2_stages: [3, 4, 5]
+
+FPN:
+  min_level: 2
+  max_level: 6
+  num_chan: 256
+  spatial_scale: [0.03125, 0.0625, 0.125, 0.25]
+
+FPNRPNHead:
+  anchor_generator:
+    anchor_sizes: [32, 64, 128, 256, 512]
+    aspect_ratios: [0.5, 1.0, 2.0]
+    stride: [16.0, 16.0]
+    variance: [1.0, 1.0, 1.0, 1.0]
+  anchor_start_size: 32
+  min_level: 2
+  max_level: 6
+  num_chan: 256
+  rpn_target_assign:
+    rpn_batch_size_per_im: 256
+    rpn_fg_fraction: 0.5
+    rpn_positive_overlap: 0.7
+    rpn_negative_overlap: 0.3
+    rpn_straddle_thresh: 0.0
+  train_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 2000
+    post_nms_top_n: 2000
+  test_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 1000
+    post_nms_top_n: 1000
+
+FPNRoIAlign:
+  canconical_level: 4
+  canonical_size: 224
+  min_level: 2
+  max_level: 5
+  box_resolution: 7
+  sampling_ratio: 2
+
+CascadeBBoxAssigner:
+  batch_size_per_im: 512
+  bbox_reg_weights: [10, 20, 30]
+  bg_thresh_lo: [0.0, 0.0, 0.0]
+  bg_thresh_hi: [0.5, 0.6, 0.7]
+  fg_thresh: [0.5, 0.6, 0.7]
+  fg_fraction: 0.25
+
+CascadeBBoxHead:
+  head: CascadeTwoFCHead
+  nms:
+    keep_top_k: 100
+    nms_threshold: 0.5
+    score_threshold: 0.05
+
+CascadeTwoFCHead:
+  mlp_dim: 1024
+
+LearningRate:
+  base_lr: 0.02
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [60000, 80000]
+  - !LinearWarmup
+    start_factor: 0.1
+    steps: 1000
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.9
+    type: Momentum
+  regularizer:
+    factor: 0.0001
+    type: L2
+
+FasterRCNNTrainFeed:
+  batch_size: 2
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_train2017.json
+    image_dir: train2017
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  drop_last: false
+  num_workers: 2
+
+FasterRCNNEvalFeed:
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_val2017.json
+    image_dir: val2017
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+
+FasterRCNNTestFeed:
+  batch_size: 1
+  dataset:
+    annotation: dataset/coco/annotations/instances_val2017.json
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  drop_last: false
+  num_workers: 2
--- a/configs/dcn/cascade_rcnn_dcn_x101_vd_64x4d_fpn_1x.yml
+++ b/configs/dcn/cascade_rcnn_dcn_x101_vd_64x4d_fpn_1x.yml
+architecture: CascadeRCNN
+train_feed: FasterRCNNTrainFeed
+eval_feed: FasterRCNNEvalFeed
+test_feed: FasterRCNNTestFeed
+max_iters: 90000
+snapshot_iter: 10000
+use_gpu: true
+log_smooth_window: 20
+log_iter: 20
+save_dir: output
+pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNeXt101_vd_64x4d_pretrained.tar
+weights: output/cascade_rcnn_dcn_x101_vd_64x4d_fpn_1x/model_final
+metric: COCO
+num_classes: 81
+
+CascadeRCNN:
+  backbone: ResNeXt
+  fpn: FPN
+  rpn_head: FPNRPNHead
+  roi_extractor: FPNRoIAlign
+  bbox_head: CascadeBBoxHead
+  bbox_assigner: CascadeBBoxAssigner
+
+ResNeXt:
+  norm_type: bn
+  depth: 101
+  feature_maps: [2, 3, 4, 5]
+  freeze_at: 2
+  group_width: 4                                                                                                                                          
+  groups: 64
+  variant: d
+  dcn_v2_stages: [3, 4, 5]
+
+FPN:
+  min_level: 2
+  max_level: 6
+  num_chan: 256
+  spatial_scale: [0.03125, 0.0625, 0.125, 0.25]
+
+FPNRPNHead:
+  anchor_generator:
+    anchor_sizes: [32, 64, 128, 256, 512]
+    aspect_ratios: [0.5, 1.0, 2.0]
+    stride: [16.0, 16.0]
+    variance: [1.0, 1.0, 1.0, 1.0]
+  anchor_start_size: 32
+  min_level: 2
+  max_level: 6
+  num_chan: 256
+  rpn_target_assign:
+    rpn_batch_size_per_im: 256
+    rpn_fg_fraction: 0.5
+    rpn_positive_overlap: 0.7
+    rpn_negative_overlap: 0.3
+    rpn_straddle_thresh: 0.0
+  train_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 2000
+    post_nms_top_n: 2000
+  test_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 1000
+    post_nms_top_n: 1000
+
+FPNRoIAlign:
+  canconical_level: 4
+  canonical_size: 224
+  min_level: 2
+  max_level: 5
+  box_resolution: 7
+  sampling_ratio: 2
+
+CascadeBBoxAssigner:
+  batch_size_per_im: 512
+  bbox_reg_weights: [10, 20, 30]
+  bg_thresh_lo: [0.0, 0.0, 0.0]
+  bg_thresh_hi: [0.5, 0.6, 0.7]
+  fg_thresh: [0.5, 0.6, 0.7]
+  fg_fraction: 0.25
+
+CascadeBBoxHead:
+  head: CascadeTwoFCHead
+  nms:
+    keep_top_k: 100
+    nms_threshold: 0.5
+    score_threshold: 0.05
+
+CascadeTwoFCHead:
+  mlp_dim: 1024
+
+LearningRate:
+  base_lr: 0.02
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [60000, 80000]
+  - !LinearWarmup
+    start_factor: 0.1
+    steps: 1000
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.9
+    type: Momentum
+  regularizer:
+    factor: 0.0001
+    type: L2
+
+FasterRCNNTrainFeed:
+  batch_size: 2
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_train2017.json
+    image_dir: train2017
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  drop_last: false
+  num_workers: 2
+
+FasterRCNNEvalFeed:
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_val2017.json
+    image_dir: val2017
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+
+FasterRCNNTestFeed:
+  batch_size: 1
+  dataset:
+    annotation: dataset/coco/annotations/instances_val2017.json
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  drop_last: false
+  num_workers: 2
--- a/configs/dcn/faster_rcnn_dcn_r101_vd_fpn_1x.yml
+++ b/configs/dcn/faster_rcnn_dcn_r101_vd_fpn_1x.yml
+architecture: FasterRCNN
+train_feed: FasterRCNNTrainFeed
+eval_feed: FasterRCNNEvalFeed
+test_feed: FasterRCNNTestFeed
+max_iters: 90000
+snapshot_iter: 10000
+use_gpu: true
+log_smooth_window: 20
+log_iter: 20
+save_dir: output
+pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_vd_pretrained.tar
+weights: output/faster_rcnn_dcn_r101_vd_fpn_1x/model_final
+metric: COCO
+num_classes: 81
+
+FasterRCNN:
+  backbone: ResNet
+  fpn: FPN
+  rpn_head: FPNRPNHead
+  roi_extractor: FPNRoIAlign
+  bbox_head: BBoxHead
+  bbox_assigner: BBoxAssigner
+
+ResNet:
+  depth: 101
+  feature_maps: [2, 3, 4, 5]
+  freeze_at: 2
+  norm_type: bn
+  variant: d
+  dcn_v2_stages: [3, 4, 5]
+
+FPN:
+  max_level: 6
+  min_level: 2
+  num_chan: 256
+  spatial_scale: [0.03125, 0.0625, 0.125, 0.25]
+
+FPNRPNHead:
+  anchor_generator:
+    anchor_sizes: [32, 64, 128, 256, 512]
+    aspect_ratios: [0.5, 1.0, 2.0]
+    stride: [16.0, 16.0]
+    variance: [1.0, 1.0, 1.0, 1.0]
+  anchor_start_size: 32
+  max_level: 6
+  min_level: 2
+  num_chan: 256
+  rpn_target_assign:
+    rpn_batch_size_per_im: 256
+    rpn_fg_fraction: 0.5
+    rpn_negative_overlap: 0.3
+    rpn_positive_overlap: 0.7
+    rpn_straddle_thresh: 0.0
+  train_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    post_nms_top_n: 2000
+    pre_nms_top_n: 2000
+  test_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    post_nms_top_n: 1000
+    pre_nms_top_n: 1000
+
+FPNRoIAlign:
+  canconical_level: 4
+  canonical_size: 224
+  max_level: 5
+  min_level: 2
+  box_resolution: 7
+  sampling_ratio: 2
+
+BBoxAssigner:
+  batch_size_per_im: 512
+  bbox_reg_weights: [0.1, 0.1, 0.2, 0.2]
+  bg_thresh_hi: 0.5
+  bg_thresh_lo: 0.0
+  fg_fraction: 0.25
+  fg_thresh: 0.5
+
+BBoxHead:
+  head: TwoFCHead
+  nms:
+    keep_top_k: 100
+    nms_threshold: 0.5
+    score_threshold: 0.05
+
+TwoFCHead:
+  mlp_dim: 1024
+
+LearningRate:
+  base_lr: 0.02
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [60000, 80000]
+  - !LinearWarmup
+    start_factor: 0.1
+    steps: 1000
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.9
+    type: Momentum
+  regularizer:
+    factor: 0.0001
+    type: L2
+
+FasterRCNNTrainFeed:
+  # batch size per device
+  batch_size: 2
+  dataset:
+    dataset_dir: dataset/coco
+    image_dir: train2017
+    annotation: annotations/instances_train2017.json
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
+
+FasterRCNNEvalFeed:
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_val2017.json
+    image_dir: val2017
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
+
+FasterRCNNTestFeed:
+  batch_size: 1
+  dataset:
+    annotation: dataset/coco/annotations/instances_val2017.json
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
--- a/configs/dcn/faster_rcnn_dcn_r50_fpn_1x.yml
+++ b/configs/dcn/faster_rcnn_dcn_r50_fpn_1x.yml
+architecture: FasterRCNN
+train_feed: FasterRCNNTrainFeed
+eval_feed: FasterRCNNEvalFeed
+test_feed: FasterRCNNTestFeed
+max_iters: 90000
+use_gpu: true
+snapshot_iter: 10000
+log_smooth_window: 20
+log_iter: 20
+save_dir: output
+pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
+metric: COCO
+weights: output/faster_rcnn_dcn_r50_fpn_1x/model_final
+num_classes: 81
+
+FasterRCNN:
+  backbone: ResNet
+  fpn: FPN
+  rpn_head: FPNRPNHead
+  roi_extractor: FPNRoIAlign
+  bbox_head: BBoxHead
+  bbox_assigner: BBoxAssigner
+
+ResNet:
+  depth: 50
+  norm_type: bn
+  feature_maps: [2, 3, 4, 5]
+  freeze_at: 2
+  dcn_v2_stages: [3, 4, 5]
+
+FPN:
+  min_level: 2
+  max_level: 6
+  num_chan: 256
+  spatial_scale: [0.03125, 0.0625, 0.125, 0.25]
+
+FPNRPNHead:
+  anchor_generator:
+    anchor_sizes: [32, 64, 128, 256, 512]
+    aspect_ratios: [0.5, 1.0, 2.0]
+    stride: [16.0, 16.0]
+    variance: [1.0, 1.0, 1.0, 1.0]
+  anchor_start_size: 32
+  min_level: 2
+  max_level: 6
+  num_chan: 256
+  rpn_target_assign:
+    rpn_batch_size_per_im: 256
+    rpn_fg_fraction: 0.5
+    rpn_positive_overlap: 0.7
+    rpn_negative_overlap: 0.3
+    rpn_straddle_thresh: 0.0
+  train_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 2000
+    post_nms_top_n: 2000
+  test_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 1000
+    post_nms_top_n: 1000
+
+FPNRoIAlign:
+  canconical_level: 4
+  canonical_size: 224
+  min_level: 2
+  max_level: 5
+  box_resolution: 7
+  sampling_ratio: 2
+
+BBoxAssigner:
+  batch_size_per_im: 512
+  bbox_reg_weights: [0.1, 0.1, 0.2, 0.2]
+  bg_thresh_lo: 0.0
+  bg_thresh_hi: 0.5
+  fg_fraction: 0.25
+  fg_thresh: 0.5
+
+BBoxHead:
+  head: TwoFCHead
+  nms:
+    keep_top_k: 100
+    nms_threshold: 0.5
+    score_threshold: 0.05
+
+TwoFCHead:
+  mlp_dim: 1024
+
+LearningRate:
+  base_lr: 0.02
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [60000, 80000]
+  - !LinearWarmup
+    start_factor: 0.1
+    steps: 1000
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.9
+    type: Momentum
+  regularizer:
+    factor: 0.0001
+    type: L2
+
+FasterRCNNTrainFeed:
+  batch_size: 2
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_train2017.json
+    image_dir: train2017
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  drop_last: false
+  num_workers: 2
+
+FasterRCNNEvalFeed:
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_val2017.json
+    image_dir: val2017
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+
+FasterRCNNTestFeed:
+  batch_size: 1
+  dataset:
+    annotation: dataset/coco/annotations/instances_val2017.json
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  drop_last: false
+  num_workers: 2
--- a/configs/dcn/faster_rcnn_dcn_r50_vd_fpn_2x.yml
+++ b/configs/dcn/faster_rcnn_dcn_r50_vd_fpn_2x.yml
+architecture: FasterRCNN
+train_feed: FasterRCNNTrainFeed
+eval_feed: FasterRCNNEvalFeed
+test_feed: FasterRCNNTestFeed
+max_iters: 180000
+snapshot_iter: 10000
+use_gpu: true
+log_smooth_window: 20
+log_iter: 20
+save_dir: output
+pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_vd_pretrained.tar
+weights: output/faster_rcnn_dcn_r50_vd_fpn_2x/model_final
+metric: COCO
+num_classes: 81
+
+FasterRCNN:
+  backbone: ResNet
+  fpn: FPN
+  rpn_head: FPNRPNHead
+  roi_extractor: FPNRoIAlign
+  bbox_head: BBoxHead
+  bbox_assigner: BBoxAssigner
+
+ResNet:
+  depth: 50
+  feature_maps: [2, 3, 4, 5]
+  freeze_at: 2
+  norm_type: bn
+  variant: d
+  dcn_v2_stages: [3, 4, 5]
+
+FPN:
+  max_level: 6
+  min_level: 2
+  num_chan: 256
+  spatial_scale: [0.03125, 0.0625, 0.125, 0.25]
+
+FPNRPNHead:
+  anchor_generator:
+    anchor_sizes: [32, 64, 128, 256, 512]
+    aspect_ratios: [0.5, 1.0, 2.0]
+    stride: [16.0, 16.0]
+    variance: [1.0, 1.0, 1.0, 1.0]
+  anchor_start_size: 32
+  max_level: 6
+  min_level: 2
+  num_chan: 256
+  rpn_target_assign:
+    rpn_batch_size_per_im: 256
+    rpn_fg_fraction: 0.5
+    rpn_negative_overlap: 0.3
+    rpn_positive_overlap: 0.7
+    rpn_straddle_thresh: 0.0
+  train_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    post_nms_top_n: 2000
+    pre_nms_top_n: 2000
+  test_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    post_nms_top_n: 1000
+    pre_nms_top_n: 1000
+
+FPNRoIAlign:
+  canconical_level: 4
+  canonical_size: 224
+  max_level: 5
+  min_level: 2
+  box_resolution: 7
+  sampling_ratio: 2
+
+BBoxAssigner:
+  batch_size_per_im: 512
+  bbox_reg_weights: [0.1, 0.1, 0.2, 0.2]
+  bg_thresh_hi: 0.5
+  bg_thresh_lo: 0.0
+  fg_fraction: 0.25
+  fg_thresh: 0.5
+
+BBoxHead:
+  head: TwoFCHead
+  nms:
+    keep_top_k: 100
+    nms_threshold: 0.5
+    score_threshold: 0.05
+
+TwoFCHead:
+  mlp_dim: 1024
+
+LearningRate:
+  base_lr: 0.02
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [120000, 160000]
+  - !LinearWarmup
+    start_factor: 0.1
+    steps: 1000
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.9
+    type: Momentum
+  regularizer:
+    factor: 0.0001
+    type: L2
+
+FasterRCNNTrainFeed:
+  # batch size per device
+  batch_size: 2
+  dataset:
+    dataset_dir: dataset/coco
+    image_dir: train2017
+    annotation: annotations/instances_train2017.json
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
+
+FasterRCNNEvalFeed:
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_val2017.json
+    image_dir: val2017
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
+
+FasterRCNNTestFeed:
+  batch_size: 1
+  dataset:
+    annotation: dataset/coco/annotations/instances_val2017.json
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
--- a/configs/dcn/faster_rcnn_dcn_x101_vd_64x4d_fpn_1x.yml
+++ b/configs/dcn/faster_rcnn_dcn_x101_vd_64x4d_fpn_1x.yml
+architecture: FasterRCNN
+train_feed: FasterRCNNTrainFeed
+eval_feed: FasterRCNNEvalFeed
+test_feed: FasterRCNNTestFeed
+max_iters: 180000
+snapshot_iter: 10000
+use_gpu: true
+log_smooth_window: 20
+log_iter: 20
+save_dir: output
+pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNeXt101_vd_64x4d_pretrained.tar
+weights: output/faster_rcnn_dcn_x101_vd_64x4d_fpn_1x/model_final
+metric: COCO
+num_classes: 81
+
+FasterRCNN:
+  backbone: ResNeXt
+  fpn: FPN
+  rpn_head: FPNRPNHead
+  roi_extractor: FPNRoIAlign
+  bbox_head: BBoxHead
+  bbox_assigner: BBoxAssigner
+
+ResNeXt:
+  depth: 101
+  feature_maps: [2, 3, 4, 5]
+  freeze_at: 2
+  group_width: 4
+  groups: 64
+  norm_type: bn
+  variant: d
+  dcn_v2_stages: [3, 4, 5]
+
+FPN:
+  max_level: 6
+  min_level: 2
+  num_chan: 256
+  spatial_scale: [0.03125, 0.0625, 0.125, 0.25]
+
+FPNRPNHead:
+  anchor_generator:
+    anchor_sizes: [32, 64, 128, 256, 512]
+    aspect_ratios: [0.5, 1.0, 2.0]
+    stride: [16.0, 16.0]
+    variance: [1.0, 1.0, 1.0, 1.0]
+  anchor_start_size: 32
+  max_level: 6
+  min_level: 2
+  num_chan: 256
+  rpn_target_assign:
+    rpn_batch_size_per_im: 256
+    rpn_fg_fraction: 0.5
+    rpn_negative_overlap: 0.3
+    rpn_positive_overlap: 0.7
+    rpn_straddle_thresh: 0.0
+  train_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    post_nms_top_n: 2000
+    pre_nms_top_n: 2000
+  test_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    post_nms_top_n: 1000
+    pre_nms_top_n: 1000
+
+FPNRoIAlign:
+  canconical_level: 4
+  canonical_size: 224
+  max_level: 5
+  min_level: 2
+  box_resolution: 7
+  sampling_ratio: 2
+
+BBoxAssigner:
+  batch_size_per_im: 512
+  bbox_reg_weights: [0.1, 0.1, 0.2, 0.2]
+  bg_thresh_hi: 0.5
+  bg_thresh_lo: 0.0
+  fg_fraction: 0.25
+  fg_thresh: 0.5
+
+BBoxHead:
+  head: TwoFCHead
+  nms:
+    keep_top_k: 100
+    nms_threshold: 0.5
+    score_threshold: 0.05
+
+TwoFCHead:
+  mlp_dim: 1024
+
+LearningRate:
+  base_lr: 0.01
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [120000, 160000]
+  - !LinearWarmup
+    start_factor: 0.1
+    steps: 1000
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.9
+    type: Momentum
+  regularizer:
+    factor: 0.0001
+    type: L2
+
+FasterRCNNTrainFeed:
+  # batch size per device
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    image_dir: train2017
+    annotation: annotations/instances_train2017.json
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
+  shuffle: true
+
+FasterRCNNEvalFeed:
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_val2017.json
+    image_dir: val2017
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
+
+FasterRCNNTestFeed:
+  batch_size: 1
+  dataset:
+    annotation: dataset/coco/annotations/instances_val2017.json
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
+  shuffle: false
--- a/configs/dcn/mask_rcnn_dcn_r101_vd_fpn_1x.yml
+++ b/configs/dcn/mask_rcnn_dcn_r101_vd_fpn_1x.yml
+architecture: MaskRCNN
+train_feed: MaskRCNNTrainFeed
+eval_feed: MaskRCNNEvalFeed
+test_feed: MaskRCNNTestFeed
+max_iters: 180000
+snapshot_iter: 10000
+use_gpu: true
+log_smooth_window: 20
+log_iter: 20
+save_dir: output
+pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_vd_pretrained.tar 
+weights: output/mask_rcnn_dcn_r101_vd_fpn_1x/model_final
+metric: COCO
+num_classes: 81
+
+MaskRCNN:
+  backbone: ResNet
+  fpn: FPN
+  rpn_head: FPNRPNHead
+  roi_extractor: FPNRoIAlign
+  bbox_head: BBoxHead
+  bbox_assigner: BBoxAssigner
+
+ResNet:
+  depth: 101
+  feature_maps: [2, 3, 4, 5]
+  freeze_at: 2
+  norm_type: bn
+  variant: d
+  dcn_v2_stages: [3, 4, 5]
+
+FPN:
+  max_level: 6
+  min_level: 2
+  num_chan: 256
+  spatial_scale: [0.03125, 0.0625, 0.125, 0.25]
+
+FPNRPNHead:
+  anchor_generator:
+    aspect_ratios: [0.5, 1.0, 2.0]
+    variance: [1.0, 1.0, 1.0, 1.0]
+  anchor_start_size: 32
+  max_level: 6
+  min_level: 2
+  num_chan: 256
+  rpn_target_assign:
+    rpn_batch_size_per_im: 256
+    rpn_fg_fraction: 0.5
+    rpn_negative_overlap: 0.3
+    rpn_positive_overlap: 0.7
+    rpn_straddle_thresh: 0.0
+  train_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 2000
+    post_nms_top_n: 2000
+  test_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 1000
+    post_nms_top_n: 1000
+
+FPNRoIAlign:
+  canconical_level: 4
+  canonical_size: 224
+  max_level: 5
+  min_level: 2
+  sampling_ratio: 2
+  box_resolution: 7
+  mask_resolution: 14
+
+MaskHead:
+  dilation: 1
+  conv_dim: 256
+  num_convs: 4
+  resolution: 28
+
+BBoxAssigner:
+  batch_size_per_im: 512
+  bbox_reg_weights: [0.1, 0.1, 0.2, 0.2]
+  bg_thresh_hi: 0.5
+  bg_thresh_lo: 0.0
+  fg_fraction: 0.25
+  fg_thresh: 0.5
+
+MaskAssigner:
+  resolution: 28
+
+BBoxHead:
+  head: TwoFCHead
+  nms:
+    keep_top_k: 100
+    nms_threshold: 0.5
+    score_threshold: 0.05
+
+TwoFCHead:
+  mlp_dim: 1024
+
+LearningRate:
+  base_lr: 0.01
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [120000, 160000]
+  - !LinearWarmup
+    start_factor: 0.1
+    steps: 1000
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.9
+    type: Momentum
+  regularizer:
+    factor: 0.0001
+    type: L2
+
+MaskRCNNTrainFeed:
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_train2017.json
+    image_dir: train2017
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
+
+MaskRCNNEvalFeed:
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_val2017.json
+    image_dir: val2017
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
+
+MaskRCNNTestFeed:
+  batch_size: 1
+  dataset:
+    annotation: dataset/coco/annotations/instances_val2017.json
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
--- a/configs/dcn/mask_rcnn_dcn_r50_fpn_1x.yml
+++ b/configs/dcn/mask_rcnn_dcn_r50_fpn_1x.yml
+architecture: MaskRCNN
+train_feed: MaskRCNNTrainFeed
+eval_feed: MaskRCNNEvalFeed
+test_feed: MaskRCNNTestFeed
+use_gpu: true
+max_iters: 180000
+snapshot_iter: 10000
+log_smooth_window: 20
+log_iter: 20
+save_dir: output
+pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
+metric: COCO
+weights: output/mask_rcnn_dcn_r50_fpn_1x/model_final/
+num_classes: 81
+
+MaskRCNN:
+  backbone: ResNet
+  fpn: FPN
+  rpn_head: FPNRPNHead
+  roi_extractor: FPNRoIAlign
+  bbox_head: BBoxHead
+  bbox_assigner: BBoxAssigner
+
+ResNet:
+  depth: 50
+  feature_maps: [2, 3, 4, 5]
+  freeze_at: 2
+  norm_type: bn
+  dcn_v2_stages: [3, 4, 5]
+
+FPN:
+  max_level: 6
+  min_level: 2
+  num_chan: 256
+  spatial_scale: [0.03125, 0.0625, 0.125, 0.25]
+
+FPNRPNHead:
+  anchor_generator:
+    aspect_ratios: [0.5, 1.0, 2.0]
+    variance: [1.0, 1.0, 1.0, 1.0]
+  anchor_start_size: 32
+  max_level: 6
+  min_level: 2
+  num_chan: 256
+  rpn_target_assign:
+    rpn_batch_size_per_im: 256
+    rpn_fg_fraction: 0.5
+    rpn_negative_overlap: 0.3
+    rpn_positive_overlap: 0.7
+    rpn_straddle_thresh: 0.0
+  train_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 2000
+    post_nms_top_n: 2000
+  test_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 1000
+    post_nms_top_n: 1000
+
+FPNRoIAlign:
+  canconical_level: 4
+  canonical_size: 224
+  max_level: 5
+  min_level: 2
+  sampling_ratio: 2
+  box_resolution: 7
+  mask_resolution: 14
+
+MaskHead:
+  dilation: 1
+  conv_dim: 256
+  num_convs: 4
+  resolution: 28
+
+BBoxAssigner:
+  batch_size_per_im: 512
+  bbox_reg_weights: [0.1, 0.1, 0.2, 0.2]
+  bg_thresh_hi: 0.5
+  bg_thresh_lo: 0.0
+  fg_fraction: 0.25
+  fg_thresh: 0.5
+
+MaskAssigner:
+  resolution: 28
+
+BBoxHead:
+  head: TwoFCHead
+  nms:
+    keep_top_k: 100
+    nms_threshold: 0.5
+    score_threshold: 0.05
+
+TwoFCHead:
+  mlp_dim: 1024
+
+LearningRate:
+  base_lr: 0.01
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [120000, 160000]
+  - !LinearWarmup
+    start_factor: 0.1
+    steps: 1000
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.9
+    type: Momentum
+  regularizer:
+    factor: 0.0001
+    type: L2
+
+MaskRCNNTrainFeed:
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_train2017.json
+    image_dir: train2017
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
+
+MaskRCNNEvalFeed:
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_val2017.json
+    image_dir: val2017
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
+
+MaskRCNNTestFeed:
+  batch_size: 1
+  dataset:
+    annotation: dataset/coco/annotations/instances_val2017.json
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
--- a/configs/dcn/mask_rcnn_dcn_r50_vd_fpn_2x.yml
+++ b/configs/dcn/mask_rcnn_dcn_r50_vd_fpn_2x.yml
+architecture: MaskRCNN
+train_feed: MaskRCNNTrainFeed
+eval_feed: MaskRCNNEvalFeed
+test_feed: MaskRCNNTestFeed
+use_gpu: true
+max_iters: 360000
+snapshot_iter: 10000
+log_smooth_window: 20
+log_iter: 20
+save_dir: output
+pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_vd_pretrained.tar
+metric: COCO
+weights: output/mask_rcnn_dcn_r50_vd_fpn_2x/model_final/
+num_classes: 81
+
+MaskRCNN:
+  backbone: ResNet
+  fpn: FPN
+  rpn_head: FPNRPNHead
+  roi_extractor: FPNRoIAlign
+  bbox_head: BBoxHead
+  bbox_assigner: BBoxAssigner
+
+ResNet:
+  depth: 50
+  feature_maps: [2, 3, 4, 5]
+  freeze_at: 2
+  norm_type: bn
+  variant: d
+  dcn_v2_stages: [3, 4, 5]
+
+FPN:
+  max_level: 6
+  min_level: 2
+  num_chan: 256
+  spatial_scale: [0.03125, 0.0625, 0.125, 0.25]
+
+FPNRPNHead:
+  anchor_generator:
+    aspect_ratios: [0.5, 1.0, 2.0]
+    variance: [1.0, 1.0, 1.0, 1.0]
+  anchor_start_size: 32
+  max_level: 6
+  min_level: 2
+  num_chan: 256
+  rpn_target_assign:
+    rpn_batch_size_per_im: 256
+    rpn_fg_fraction: 0.5
+    rpn_negative_overlap: 0.3
+    rpn_positive_overlap: 0.7
+    rpn_straddle_thresh: 0.0
+  train_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 2000
+    post_nms_top_n: 2000
+  test_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 1000
+    post_nms_top_n: 1000
+
+FPNRoIAlign:
+  canconical_level: 4
+  canonical_size: 224
+  max_level: 5
+  min_level: 2
+  box_resolution: 7
+  sampling_ratio: 2
+  mask_resolution: 14
+
+MaskHead:
+  dilation: 1
+  conv_dim: 256
+  num_convs: 4
+  resolution: 28
+
+BBoxAssigner:
+  batch_size_per_im: 512
+  bbox_reg_weights: [0.1, 0.1, 0.2, 0.2]
+  bg_thresh_hi: 0.5
+  bg_thresh_lo: 0.0
+  fg_fraction: 0.25
+  fg_thresh: 0.5
+
+MaskAssigner:
+  resolution: 28
+
+BBoxHead:
+  head: TwoFCHead
+  nms:
+    keep_top_k: 100
+    nms_threshold: 0.5
+    score_threshold: 0.05
+
+TwoFCHead:
+  mlp_dim: 1024
+
+LearningRate:
+  base_lr: 0.01
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [240000, 320000]
+  - !LinearWarmup
+    start_factor: 0.1
+    steps: 1000
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.9
+    type: Momentum
+  regularizer:
+    factor: 0.0001
+    type: L2
+
+MaskRCNNTrainFeed:
+  # batch size per device
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    image_dir: train2017
+    annotation: annotations/instances_train2017.json
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
+
+MaskRCNNEvalFeed:
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_val2017.json
+    image_dir: val2017
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
+
+MaskRCNNTestFeed:
+  batch_size: 1
+  dataset:
+    annotation: dataset/coco/annotations/instances_val2017.json
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
--- a/configs/dcn/mask_rcnn_dcn_x101_vd_64x4d_fpn_1x.yml
+++ b/configs/dcn/mask_rcnn_dcn_x101_vd_64x4d_fpn_1x.yml
+architecture: MaskRCNN
+train_feed: MaskRCNNTrainFeed
+eval_feed: MaskRCNNEvalFeed
+test_feed: MaskRCNNTestFeed
+max_iters: 180000
+snapshot_iter: 10000
+use_gpu: true
+log_smooth_window: 20
+log_iter: 20
+save_dir: output
+pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNeXt101_vd_64x4d_pretrained.tar 
+weights: output/mask_rcnn_dcn_x101_vd_64x4d_fpn_1x/model_final
+metric: COCO
+num_classes: 81
+
+MaskRCNN:
+  backbone: ResNeXt
+  fpn: FPN
+  rpn_head: FPNRPNHead
+  roi_extractor: FPNRoIAlign
+  bbox_head: BBoxHead
+  bbox_assigner: BBoxAssigner
+
+ResNeXt:
+  depth: 101
+  feature_maps: [2, 3, 4, 5]
+  freeze_at: 2
+  group_width: 4
+  groups: 64
+  norm_type: bn
+  variant: d
+  dcn_v2_stages: [3, 4, 5]
+
+FPN:
+  max_level: 6
+  min_level: 2
+  num_chan: 256
+  spatial_scale: [0.03125, 0.0625, 0.125, 0.25]
+
+FPNRPNHead:
+  anchor_generator:
+    aspect_ratios: [0.5, 1.0, 2.0]
+    variance: [1.0, 1.0, 1.0, 1.0]
+  anchor_start_size: 32
+  max_level: 6
+  min_level: 2
+  num_chan: 256
+  rpn_target_assign:
+    rpn_batch_size_per_im: 256
+    rpn_fg_fraction: 0.5
+    rpn_negative_overlap: 0.3
+    rpn_positive_overlap: 0.7
+    rpn_straddle_thresh: 0.0
+  train_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 2000
+    post_nms_top_n: 2000
+  test_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 1000
+    post_nms_top_n: 1000
+
+FPNRoIAlign:
+  canconical_level: 4
+  canonical_size: 224
+  max_level: 5
+  min_level: 2
+  sampling_ratio: 2
+  box_resolution: 7
+  mask_resolution: 14
+
+MaskHead:
+  dilation: 1
+  conv_dim: 256
+  num_convs: 4
+  resolution: 28
+
+BBoxAssigner:
+  batch_size_per_im: 512
+  bbox_reg_weights: [0.1, 0.1, 0.2, 0.2]
+  bg_thresh_hi: 0.5
+  bg_thresh_lo: 0.0
+  fg_fraction: 0.25
+  fg_thresh: 0.5
+
+MaskAssigner:
+  resolution: 28
+
+BBoxHead:
+  head: TwoFCHead
+  nms:
+    keep_top_k: 100
+    nms_threshold: 0.5
+    score_threshold: 0.05
+
+TwoFCHead:
+  mlp_dim: 1024
+
+LearningRate:
+  base_lr: 0.01
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [120000, 160000]
+  - !LinearWarmup
+    start_factor: 0.1
+    steps: 1000
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.9
+    type: Momentum
+  regularizer:
+    factor: 0.0001
+    type: L2
+
+MaskRCNNTrainFeed:
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_train2017.json
+    image_dir: train2017
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
+
+MaskRCNNEvalFeed:
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_val2017.json
+    image_dir: val2017
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
+
+MaskRCNNTestFeed:
+  batch_size: 1
+  dataset:
+    annotation: dataset/coco/annotations/instances_val2017.json
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
--- a/configs/face_detection/README.md
+++ b/configs/face_detection/README.md
+English | [简体中文](README_cn.md)
+
+# FaceDetection
+The goal of FaceDetection is to provide efficient and high-speed face detection solutions,
+including cutting-edge and classic models.
+
+
+<div align="center">
+  <img src="../../demo/output/12_Group_Group_12_Group_Group_12_935.jpg" />
+</div>
+
+## Data Pipline
+We use the [WIDER FACE dataset](http://shuoyang1213.me/WIDERFACE/) to carry out the training
+and testing of the model, the official website gives detailed data introduction.
+- WIDER Face data source:  
+Loads `wider_face` type dataset with directory structures like this:
+
+  ```
+  dataset/wider_face/
+  ├── wider_face_split
+  │   ├── wider_face_train_bbx_gt.txt
+  │   ├── wider_face_val_bbx_gt.txt
+  ├── WIDER_train
+  │   ├── images
+  │   │   ├── 0--Parade
+  │   │   │   ├── 0_Parade_marchingband_1_100.jpg
+  │   │   │   ├── 0_Parade_marchingband_1_381.jpg
+  │   │   │   │   ...
+  │   │   ├── 10--People_Marching
+  │   │   │   ...
+  ├── WIDER_val
+  │   ├── images
+  │   │   ├── 0--Parade
+  │   │   │   ├── 0_Parade_marchingband_1_1004.jpg
+  │   │   │   ├── 0_Parade_marchingband_1_1045.jpg
+  │   │   │   │   ...
+  │   │   ├── 10--People_Marching
+  │   │   │   ...
+  ```
+
+- Download dataset manually:  
+To download the WIDER FACE dataset, run the following commands:
+```
+cd dataset/wider_face && ./download.sh
+```
+
+- Download dataset automatically:
+If a training session is started but the dataset is not setup properly
+(e.g, not found in dataset/wider_face), PaddleDetection can automatically
+download them from [WIDER FACE dataset](http://shuoyang1213.me/WIDERFACE/),
+the decompressed datasets will be cached in ~/.cache/paddle/dataset/ and can be discovered
+automatically subsequently.
+
+### Data Augmentation
+
+- **Data-anchor-sampling:** Randomly transform the scale of the image to a certain range of scales,
+greatly enhancing the scale change of the face. The specific operation is to obtain $v=\sqrt{width * height}$
+according to the randomly selected face height and width, and judge the value of `v` in which interval of
+ `[16,32,64,128]`. Assuming `v=45` && `32<v<64`, and any value of `[16,32,64]` is selected with a probability
+ of uniform distribution. If `64` is selected, the face's interval is selected in `[64 / 2, min(v * 2, 64 * 2)]`.
+
+- **Other methods:** Including `RandomDistort`,`ExpandImage`,`RandomInterpImage`,`RandomFlipImage` etc.
+Please refer to [DATA.md](../../docs/DATA.md#APIs) for details.
+
+
+##  Benchmark and Model Zoo
+Supported architectures is shown in the below table, please refer to
+[Algorithm Description](#Algorithm-Description) for details of the algorithm.
+
+|                          | Original | Lite <sup>[1](#lite)</sup> | NAS <sup>[2](#nas)</sup> |
+|:------------------------:|:--------:|:--------------------------:|:------------------------:|
+| [BlazeFace](#BlazeFace)  | ✓        |                          ✓ | ✓                        |
+| [FaceBoxes](#FaceBoxes)  | ✓        |                          ✓ | x                        |
+
+<a name="lite">[1]</a> `Lite` edition means reduces the number of network layers and channels.  
+<a name="nas">[2]</a> `NAS` edition means use `Neural Architecture Search` algorithm to
+optimized network structure.
+
+**Todo List:**
+- [ ] HamBox
+- [ ] Pyramidbox
+
+### Model Zoo
+
+#### mAP in WIDER FACE
+
+| Architecture | Type     | Size | Img/gpu | Lr schd | Easy Set  | Medium Set | Hard Set  | Download |
+|:------------:|:--------:|:----:|:-------:|:-------:|:---------:|:----------:|:---------:|:--------:|
+| BlazeFace    | Original | 640  |    8    | 32w     | **0.915** | **0.892**  | **0.797** | [model](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_original.tar) |
+| BlazeFace    | Lite     | 640  |    8    | 32w     | 0.909     | 0.885      | 0.781     | [model](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_lite.tar) |
+| BlazeFace    | NAS      | 640  |    8    | 32w     | 0.837     | 0.807      | 0.658     | [model](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_nas.tar) |
+| FaceBoxes    | Original | 640  |    8    | 32w     | 0.875     | 0.848      | 0.568     | [model](https://paddlemodels.bj.bcebos.com/object_detection/faceboxes_original.tar) |
+| FaceBoxes    | Lite     | 640  |    8    | 32w     | 0.898     | 0.872      | 0.752     | [model](https://paddlemodels.bj.bcebos.com/object_detection/faceboxes_lite.tar) |
+
+**NOTES:**  
+- Get mAP in `Easy/Medium/Hard Set` by multi-scale evaluation in `tools/face_eval.py`.
+For details can refer to [Evaluation](#Evaluate-on-the-WIDER-FACE).
+- BlazeFace-Lite Training and Testing ues [blazeface.yml](../../configs/face_detection/blazeface.yml)
+configs file and set `lite_edition: true`.
+
+#### mAP in FDDB
+
+| Architecture | Type     | Size | DistROC | ContROC |
+|:------------:|:--------:|:----:|:-------:|:-------:|
+| BlazeFace    | Original | 640  | **0.992**   | **0.762**   |
+| BlazeFace    | Lite     | 640  | 0.990   | 0.756   |
+| BlazeFace    | NAS      | 640  | 0.981   | 0.741   |
+| FaceBoxes    | Original | 640  | 0.985   | 0.731   |
+| FaceBoxes    | Lite     | 640  | 0.987   | 0.741   |
+
+**NOTES:**  
+- Get mAP by multi-scale evaluation on the FDDB dataset.
+For details can refer to [Evaluation](#Evaluate-on-the-FDDB).
+
+#### Infer Time and Model Size comparison  
+
+| Architecture | Type     | Size | P4 (ms)   | CPU (ms) | ARM (ms)   | File size (MB) | Flops     |
+|:------------:|:--------:|:----:|:---------:|:--------:|:----------:|:--------------:|:---------:|
+| BlazeFace    | Original | 128  | -         | -        | -          | -              | -         |
+| BlazeFace    | Lite     | 128  | -         | -        | -          | -              | -         |
+| BlazeFace    | NAS      | 128  | -         | -        | -          | -              | -         |
+| FaceBoxes    | Original | 128  | -         | -        | -          | -              | -         |
+| FaceBoxes    | Lite     | 128  | -         | -        | -          | -              | -         |
+| BlazeFace    | Original | 320  | -         | -        | -          | -              | -         |
+| BlazeFace    | Lite     | 320  | -         | -        | -          | -              | -         |
+| BlazeFace    | NAS      | 320  | -         | -        | -          | -              | -         |
+| FaceBoxes    | Original | 320  | -         | -        | -          | -              | -         |
+| FaceBoxes    | Lite     | 320  | -         | -        | -          | -              | -         |
+| BlazeFace    | Original | 640  | -         | -        | -          | -              | -         |
+| BlazeFace    | Lite     | 640  | -         | -        | -          | -              | -         |
+| BlazeFace    | NAS      | 640  | -         | -        | -          | -              | -         |
+| FaceBoxes    | Original | 640  | -         | -        | -          | -              | -         |
+| FaceBoxes    | Lite     | 640  | -         | -        | -          | -              | -         |
+
+
+**NOTES:**  
+- CPU: i5-7360U @ 2.30GHz. Single core and single thread.
+
+
+
+## Get Started
+`Training` and `Inference` please refer to [GETTING_STARTED.md](../../docs/GETTING_STARTED.md)
+- **NOTES:**     
+- `BlazeFace` and `FaceBoxes` is trained in 4 GPU with `batch_size=8` per gpu (total batch size as 32) 
+and trained 320000 iters.(If your GPU count is not 4, please refer to the rule of training parameters 
+in the table of [calculation rules](../../docs/GETTING_STARTED.md#faq))
+- Currently we do not support evaluation in training.
+
+### Evaluation
+```
+export CUDA_VISIBLE_DEVICES=0
+export PYTHONPATH=$PYTHONPATH:.
+python tools/face_eval.py -c configs/face_detection/blazeface.yml
+```
+- Optional arguments
+- `-d` or `--dataset_dir`: Dataset path, same as dataset_dir of configs. Such as: `-d dataset/wider_face`.
+- `-f` or `--output_eval`: Evaluation file directory, default is `output/pred`.
+- `-e` or `--eval_mode`: Evaluation mode, include `widerface` and `fddb`, default is `widerface`.
+- `--multi_scale`: If you add this action button in the command, it will select `multi_scale` evaluation. 
+Default is `False`, it will select `single-scale` evaluation.
+
+After the evaluation is completed, the test result in txt format will be generated in `output/pred`,
+and then mAP will be calculated according to different data sets. If you set `--eval_mode=widerface`,
+it will [Evaluate on the WIDER FACE](#Evaluate-on-the-WIDER-FACE).If you set `--eval_mode=fddb`,
+it will [Evaluate on the FDDB](#Evaluate-on-the-FDDB).
+
+#### Evaluate on the WIDER FACE
+- Download the official evaluation script to evaluate the AP metrics:
+```
+wget http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/support/eval_script/eval_tools.zip
+unzip eval_tools.zip && rm -f eval_tools.zip
+```
+- Modify the result path and the name of the curve to be drawn in `eval_tools/wider_eval.m`:
+```
+# Modify the folder name where the result is stored.
+pred_dir = './pred';  
+# Modify the name of the curve to be drawn
+legend_name = 'Fluid-BlazeFace';
+```
+- `wider_eval.m` is the main execution program of the evaluation module. The run command is as follows:
+```
+matlab -nodesktop -nosplash -nojvm -r "run wider_eval.m;quit;"
+```
+
+#### Evaluate on the FDDB
+[FDDB dataset](http://vis-www.cs.umass.edu/fddb/) details can refer to FDDB's official website.   
+- Download the official dataset and evaluation script to evaluate the ROC metrics:
+```
+#external link to the Faces in the Wild data set
+wget http://tamaraberg.com/faceDataset/originalPics.tar.gz
+#The annotations are split into ten folds. See README for details.
+wget http://vis-www.cs.umass.edu/fddb/FDDB-folds.tgz
+#information on directory structure and file formats
+wget http://vis-www.cs.umass.edu/fddb/README.txt
+```
+- Install OpenCV: Requires [OpenCV library](http://sourceforge.net/projects/opencvlibrary/)  
+If the utility 'pkg-config' is not available for your operating system,
+edit the Makefile to manually specify the OpenCV flags as following:
+```
+INCS = -I/usr/local/include/opencv
+LIBS = -L/usr/local/lib -lcxcore -lcv -lhighgui -lcvaux -lml
+```
+
+- Compile FDDB evaluation code: execute `make` in evaluation folder.
+
+- Generate full image path list and groundtruth in FDDB-folds. The run command is as follows:
+```
+cat `ls|grep -v"ellipse"` > filePath.txt` and `cat *ellipse* > fddb_annotFile.txt`
+```
+- Evaluation
+Finally evaluation command is:
+```
+./evaluate -a ./FDDB/FDDB-folds/fddb_annotFile.txt \
+           -d DETECTION_RESULT.txt -f 0 \
+           -i ./FDDB -l ./FDDB/FDDB-folds/filePath.txt \
+           -r ./OUTPUT_DIR -z .jpg
+```
+**NOTES:** The interpretation of the argument can be performed by `./evaluate --help`.
+
+## Algorithm Description
+
+### BlazeFace
+**Introduction:**  
+[BlazeFace](https://arxiv.org/abs/1907.05047) is Google Research published face detection model.
+It's lightweight but good performance, and tailored for mobile GPU inference. It runs at a speed
+of 200-1000+ FPS on flagship devices.
+
+**Particularity:**  
+- Anchor scheme stops at 8×8(input 128x128), 6 anchors per pixel at that resolution.
+- 5 single, and 6 double BlazeBlocks: 5×5 depthwise convs, same accuracy with fewer layers.
+- Replace the non-maximum suppression algorithm with a blending strategy that estimates the
+regression parameters of a bounding box as a weighted mean between the overlapping predictions.
+
+**Edition information:**
+- Original: Reference original paper reproduction.
+- Lite: Replace 5x5 conv with 3x3 conv, fewer network layers and conv channels.
+- NAS: use `Neural Architecture Search` algorithm to optimized network structure,
+less network layer and conv channel number than `Lite`.
+
+### FaceBoxes
+**Introduction:**     
+[FaceBoxes](https://arxiv.org/abs/1708.05234) which named A CPU Real-time Face Detector
+with High Accuracy is face detector proposed by Shifeng Zhang, with high performance on
+both speed and accuracy. This paper is published by IJCB(2017).
+
+**Particularity:**
+- Anchor scheme stops at 20x20, 10x10, 5x5, which network input size is 640x640,
+including 3, 1, 1 anchors per pixel at each resolution. The corresponding densities
+are 1, 2, 4(20x20), 4(10x10) and 4(5x5).
+- 2 convs with CReLU, 2 poolings, 3 inceptions and 2 convs with ReLU.
+- Use density prior box to improve detection accuracy.
+
+**Edition information:**
+- Original: Reference original paper reproduction.
+- Lite: 2 convs with CReLU, 1 pooling, 2 convs with ReLU, 3 inceptions and 2 convs with ReLU.
+Anchor scheme stops at 80x80 and 40x40, including 3, 1 anchors per pixel at each resolution.
+The corresponding densities are 1, 2, 4(80x80) and 4(40x40), using less conv channel number than lite.
+
+
+## Contributing
+Contributions are highly welcomed and we would really appreciate your feedback!!
--- a/configs/face_detection/blazeface.yml
+++ b/configs/face_detection/blazeface.yml
+architecture: BlazeFace
+max_iters: 320000
+train_feed: SSDTrainFeed
+eval_feed: SSDEvalFeed
+test_feed: SSDTestFeed
+pretrain_weights: 
+use_gpu: true
+snapshot_iter: 10000
+log_smooth_window: 20
+log_iter: 20
+metric: WIDERFACE
+save_dir: output
+weights: output/blazeface/model_final/
+# 1(label_class) + 1(background)
+num_classes: 2
+
+BlazeFace:
+  backbone: BlazeNet
+  output_decoder:
+    keep_top_k: 750
+    nms_threshold: 0.3
+    nms_top_k: 5000
+    score_threshold: 0.01
+  min_sizes: [[16.,24.], [32., 48., 64., 80., 96., 128.]]
+  use_density_prior_box: false
+
+BlazeNet:
+  with_extra_blocks: true
+  lite_edition: false
+
+LearningRate:
+  base_lr: 0.001
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [240000, 300000]
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.0
+    type: RMSPropOptimizer
+  regularizer:
+    factor: 0.0005
+    type: L2
+
+SSDTrainFeed:
+  batch_size: 8
+  use_process: True
+  dataset:
+    dataset_dir: dataset/wider_face
+    annotation: wider_face_split/wider_face_train_bbx_gt.txt
+    image_dir: WIDER_train/images
+  image_shape: [3, 640, 640]
+  sample_transforms:
+  - !DecodeImage
+    to_rgb: true
+    with_mixup: false
+  - !NormalizeBox {}
+  - !RandomDistort
+    brightness_lower: 0.875
+    brightness_upper: 1.125
+    is_order: true
+  - !ExpandImage
+    max_ratio: 4
+    prob: 0.5
+  - !CropImageWithDataAchorSampling
+    anchor_sampler:
+    - [1, 10, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.2, 0.0]
+    batch_sampler:
+    - [1, 50, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0]
+    - [1, 50, 0.3, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0]
+    - [1, 50, 0.3, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0]
+    - [1, 50, 0.3, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0]
+    - [1, 50, 0.3, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0]
+    target_size: 640
+  - !RandomInterpImage
+    target_size: 640
+  - !RandomFlipImage
+    is_normalized: true
+  - !Permute {}
+  - !NormalizeImage
+    is_scale: false
+    mean: [104, 117, 123]
+    std: [127.502231, 127.502231, 127.502231]
+
+SSDEvalFeed:
+  batch_size: 1
+  use_process: false
+  fields: ['image', 'im_id', 'gt_box']
+  dataset:
+    dataset_dir: dataset/wider_face
+    annotation: wider_face_split/wider_face_val_bbx_gt.txt   
+    image_dir: WIDER_val/images
+  drop_last: false
+  image_shape: [3, 640, 640]
+  sample_transforms:
+  - !DecodeImage
+    to_rgb: true
+    with_mixup: false
+  - !NormalizeBox {}
+  - !ResizeImage
+    interp: 1
+    target_size: 640
+    use_cv2: false
+  - !Permute {}
+  - !NormalizeImage
+    is_scale: false
+    mean: [104, 117, 123]
+    std: [127.502231, 127.502231, 127.502231]
+
+SSDTestFeed:
+  batch_size: 1
+  use_process: false
+  dataset:
+    use_default_label: true
+  drop_last: false
+  image_shape: [3, 640, 640]
+  sample_transforms:
+  - !DecodeImage
+    to_rgb: true
+    with_mixup: false
+  - !ResizeImage
+    interp: 1
+    target_size: 640
+    use_cv2: false
+  - !Permute {}
+  - !NormalizeImage
+    is_scale: false
+    mean: [104, 117, 123]
+    std: [127.502231, 127.502231, 127.502231]
--- a/configs/face_detection/blazeface_nas.yml
+++ b/configs/face_detection/blazeface_nas.yml
+architecture: BlazeFace
+max_iters: 320000
+train_feed: SSDTrainFeed
+eval_feed: SSDEvalFeed
+test_feed: SSDTestFeed
+pretrain_weights: 
+use_gpu: true
+snapshot_iter: 10000
+log_smooth_window: 20
+log_iter: 20
+metric: WIDERFACE
+save_dir: output
+weights: output/blazeface_nas/model_final/
+# 1(label_class) + 1(background)
+num_classes: 2
+
+BlazeFace:
+  backbone: BlazeNet
+  output_decoder:
+    keep_top_k: 750
+    nms_threshold: 0.3
+    nms_top_k: 5000
+    score_threshold: 0.01
+  min_sizes: [[16.,24.], [32., 48., 64., 80., 96., 128.]]
+  use_density_prior_box: false
+
+BlazeNet:
+  blaze_filters: [[12, 12], [12, 12, 2], [12, 12]]
+  double_blaze_filters: [[12, 16, 24, 2], [24, 12, 24], [24, 16, 72, 2], [72, 12, 72]]
+  with_extra_blocks: true
+  lite_edition: false
+
+LearningRate:
+  base_lr: 0.001
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [240000, 300000]
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.0
+    type: RMSPropOptimizer
+  regularizer:
+    factor: 0.0005
+    type: L2
+
+SSDTrainFeed:
+  batch_size: 8
+  use_process: True
+  dataset:
+    dataset_dir: dataset/wider_face
+    annotation: wider_face_split/wider_face_train_bbx_gt.txt
+    image_dir: WIDER_train/images
+  image_shape: [3, 640, 640]
+  sample_transforms:
+  - !DecodeImage
+    to_rgb: true
+    with_mixup: false
+  - !NormalizeBox {}
+  - !RandomDistort
+    brightness_lower: 0.875
+    brightness_upper: 1.125
+    is_order: true
+  - !ExpandImage
+    max_ratio: 4
+    prob: 0.5
+  - !CropImageWithDataAchorSampling
+    anchor_sampler:
+    - [1, 10, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.2, 0.0]
+    batch_sampler:
+    - [1, 50, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0]
+    - [1, 50, 0.3, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0]
+    - [1, 50, 0.3, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0]
+    - [1, 50, 0.3, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0]
+    - [1, 50, 0.3, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0]
+    target_size: 640
+  - !RandomInterpImage
+    target_size: 640
+  - !RandomFlipImage
+    is_normalized: true
+  - !Permute {}
+  - !NormalizeImage
+    is_scale: false
+    mean: [104, 117, 123]
+    std: [127.502231, 127.502231, 127.502231]
+
+SSDEvalFeed:
+  batch_size: 1
+  use_process: false
+  fields: ['image', 'im_id', 'gt_box']
+  dataset:
+    dataset_dir: dataset/wider_face
+    annotation: wider_face_split/wider_face_val_bbx_gt.txt
+    image_dir: WIDER_val/images
+  drop_last: false
+  image_shape: [3, 640, 640]
+  sample_transforms:
+  - !DecodeImage
+    to_rgb: true
+    with_mixup: false
+  - !NormalizeBox {}
+  - !ResizeImage
+    interp: 1
+    target_size: 640
+    use_cv2: false
+  - !Permute {}
+  - !NormalizeImage
+    is_scale: false
+    mean: [104, 117, 123]
+    std: [127.502231, 127.502231, 127.502231]
+
+SSDTestFeed:
+  batch_size: 1
+  use_process: false
+  dataset:
+    use_default_label: true
+  drop_last: false
+  image_shape: [3, 640, 640]
+  sample_transforms:
+  - !DecodeImage
+    to_rgb: true
+    with_mixup: false
+  - !ResizeImage
+    interp: 1
+    target_size: 640
+    use_cv2: false
+  - !Permute {}
+  - !NormalizeImage
+    is_scale: false
+    mean: [104, 117, 123]
+    std: [127.502231, 127.502231, 127.502231]
--- a/configs/face_detection/faceboxes.yml
+++ b/configs/face_detection/faceboxes.yml
+architecture: FaceBoxes
+train_feed: SSDTrainFeed
+eval_feed: SSDEvalFeed
+test_feed: SSDTestFeed
+pretrain_weights:
+use_gpu: true
+max_iters: 320000
+snapshot_iter: 10000
+log_smooth_window: 20
+log_iter: 20
+metric: WIDERFACE
+save_dir: output
+weights: output/faceboxes/model_final/
+# 1(label_class) + 1(background)
+num_classes: 2
+
+FaceBoxes:
+  backbone: FaceBoxNet
+  densities: [[4, 2, 1], [1], [1]]
+  fixed_sizes: [[32., 64., 128.], [256.], [512.]]
+  output_decoder:
+    keep_top_k: 750
+    nms_threshold: 0.3
+    nms_top_k: 5000
+    score_threshold: 0.01
+
+FaceBoxNet:
+  with_extra_blocks: true
+  lite_edition: false
+
+LearningRate:
+  base_lr: 0.001
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [240000, 300000]
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.0
+    type: RMSPropOptimizer
+  regularizer:
+    factor: 0.0005
+    type: L2
+
+SSDTrainFeed:
+  batch_size: 8
+  use_process: True
+  dataset:
+    dataset_dir: dataset/wider_face
+    annotation: wider_face_split/wider_face_train_bbx_gt.txt
+    image_dir: WIDER_train/images
+  image_shape: [3, 640, 640]
+  sample_transforms:
+  - !DecodeImage
+    to_rgb: true
+    with_mixup: false
+  - !NormalizeBox {}
+  - !RandomDistort
+    brightness_lower: 0.875
+    brightness_upper: 1.125
+    is_order: true
+  - !ExpandImage
+    max_ratio: 4
+    prob: 0.5
+  - !CropImageWithDataAchorSampling
+    anchor_sampler:
+    - [1, 10, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.2, 0.0]
+    batch_sampler:
+    - [1, 50, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0]
+    - [1, 50, 0.3, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0]
+    - [1, 50, 0.3, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0]
+    - [1, 50, 0.3, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0]
+    - [1, 50, 0.3, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0]
+    target_size: 640
+  - !RandomInterpImage
+    target_size: 640
+  - !RandomFlipImage
+    is_normalized: true
+  - !Permute {}
+  - !NormalizeImage
+    is_scale: false
+    mean: [104, 117, 123]
+    std: [127.502231, 127.502231, 127.502231]
+
+SSDEvalFeed:
+  batch_size: 1
+  use_process: false
+  fields: ['image', 'im_id', 'gt_box']
+  dataset:
+    dataset_dir: dataset/wider_face
+    annotation: wider_face_split/wider_face_val_bbx_gt.txt
+    image_dir: WIDER_val/images
+  drop_last: false
+  image_shape: [3, 640, 640]
+  sample_transforms:
+  - !DecodeImage
+    to_rgb: true
+    with_mixup: false
+  - !NormalizeBox {}
+  - !ResizeImage
+    interp: 1
+    target_size: 640
+    use_cv2: false
+  - !Permute {}
+  - !NormalizeImage
+    is_scale: false
+    mean: [104, 117, 123]
+    std: [127.502231, 127.502231, 127.502231]
+
+SSDTestFeed:
+  batch_size: 1
+  use_process: false
+  dataset:
+    use_default_label: true
+  drop_last: false
+  image_shape: [3, 640, 640]
+  sample_transforms:
+  - !DecodeImage
+    to_rgb: true
+    with_mixup: false
+  - !ResizeImage
+    interp: 1
+    target_size: 640
+    use_cv2: false
+  - !Permute {}
+  - !NormalizeImage
+    is_scale: false
+    mean: [104, 117, 123]
+    std: [127.502231, 127.502231, 127.502231]
--- a/configs/face_detection/faceboxes_lite.yml
+++ b/configs/face_detection/faceboxes_lite.yml
+architecture: FaceBoxes
+train_feed: SSDTrainFeed
+eval_feed: SSDEvalFeed
+test_feed: SSDTestFeed
+pretrain_weights:
+use_gpu: true
+max_iters: 320000
+snapshot_iter: 10000
+log_smooth_window: 20
+log_iter: 20
+metric: WIDERFACE
+save_dir: output
+weights: output/faceboxes_lite/model_final/
+# 1(label_class) + 1(background)
+num_classes: 2
+
+FaceBoxes:
+  backbone: FaceBoxNet
+  densities: [[2, 1, 1], [1, 1]]
+  fixed_sizes: [[16., 32., 64.], [96., 128.]]
+  output_decoder:
+    keep_top_k: 750
+    nms_threshold: 0.3
+    nms_top_k: 5000
+    score_threshold: 0.01
+
+FaceBoxNet:
+  with_extra_blocks: true
+  lite_edition: true
+
+LearningRate:
+  base_lr: 0.001
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [240000, 300000]
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.0
+    type: RMSPropOptimizer
+  regularizer:
+    factor: 0.0005
+    type: L2
+
+SSDTrainFeed:
+  batch_size: 8
+  use_process: True
+  dataset:
+    dataset_dir: dataset/wider_face
+    annotation: wider_face_split/wider_face_train_bbx_gt.txt
+    image_dir: WIDER_train/images
+  image_shape: [3, 640, 640]
+  sample_transforms:
+  - !DecodeImage
+    to_rgb: true
+    with_mixup: false
+  - !NormalizeBox {}
+  - !RandomDistort
+    brightness_lower: 0.875
+    brightness_upper: 1.125
+    is_order: true
+  - !ExpandImage
+    max_ratio: 4
+    prob: 0.5
+  - !CropImageWithDataAchorSampling
+    anchor_sampler:
+    - [1, 10, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.2, 0.0]
+    batch_sampler:
+    - [1, 50, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0]
+    - [1, 50, 0.3, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0]
+    - [1, 50, 0.3, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0]
+    - [1, 50, 0.3, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0]
+    - [1, 50, 0.3, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0]
+    target_size: 640
+  - !RandomInterpImage
+    target_size: 640
+  - !RandomFlipImage
+    is_normalized: true
+  - !Permute {}
+  - !NormalizeImage
+    is_scale: false
+    mean: [104, 117, 123]
+    std: [127.502231, 127.502231, 127.502231]
+
+SSDEvalFeed:
+  batch_size: 1
+  use_process: false
+  fields: ['image', 'im_id', 'gt_box']
+  dataset:
+    dataset_dir: dataset/wider_face
+    annotation: wider_face_split/wider_face_val_bbx_gt.txt
+    image_dir: WIDER_val/images
+  drop_last: false
+  image_shape: [3, 640, 640]
+  sample_transforms:
+  - !DecodeImage
+    to_rgb: true
+    with_mixup: false
+  - !NormalizeBox {}
+  - !ResizeImage
+    interp: 1
+    target_size: 640
+    use_cv2: false
+  - !Permute {}
+  - !NormalizeImage
+    is_scale: false
+    mean: [104, 117, 123]
+    std: [127.502231, 127.502231, 127.502231]
+
+SSDTestFeed:
+  batch_size: 1
+  use_process: false
+  dataset:
+    use_default_label: true
+  drop_last: false
+  image_shape: [3, 640, 640]
+  sample_transforms:
+  - !DecodeImage
+    to_rgb: true
+    with_mixup: false
+  - !ResizeImage
+    interp: 1
+    target_size: 640
+    use_cv2: false
+  - !Permute {}
+  - !NormalizeImage
+    is_scale: false
+    mean: [104, 117, 123]
+    std: [127.502231, 127.502231, 127.502231]
--- a/configs/faster_rcnn_r101_1x.yml
+++ b/configs/faster_rcnn_r101_1x.yml
+architecture: FasterRCNN
+train_feed: FasterRCNNTrainFeed
+eval_feed: FasterRCNNEvalFeed
+test_feed: FasterRCNNTestFeed
+use_gpu: true
+max_iters: 180000
+log_smooth_window: 20
+save_dir: output
+snapshot_iter: 10000
+pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_pretrained.tar
+metric: COCO
+weights: output/faster_rcnn_r101_1x/model_final
+num_classes: 81
+
+FasterRCNN:
+  backbone: ResNet
+  rpn_head: RPNHead
+  roi_extractor: RoIAlign
+  bbox_head: BBoxHead
+  bbox_assigner: BBoxAssigner
+
+ResNet:
+  norm_type: affine_channel
+  depth: 101
+  feature_maps: 4
+  freeze_at: 2
+
+ResNetC5:
+  depth: 101
+  norm_type: affine_channel
+
+RPNHead:
+  anchor_generator:
+    anchor_sizes: [32, 64, 128, 256, 512]
+    aspect_ratios: [0.5, 1.0, 2.0]
+    stride: [16.0, 16.0]
+    variance: [1.0, 1.0, 1.0, 1.0]
+  rpn_target_assign:
+    rpn_batch_size_per_im: 256
+    rpn_fg_fraction: 0.5
+    rpn_negative_overlap: 0.3
+    rpn_positive_overlap: 0.7
+    rpn_straddle_thresh: 0.0
+    use_random: true
+  train_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 12000
+    post_nms_top_n: 2000
+  test_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 6000
+    post_nms_top_n: 1000
+
+RoIAlign:
+  resolution: 14
+  sampling_ratio: 0
+  spatial_scale: 0.0625
+
+BBoxAssigner:
+  batch_size_per_im: 512
+  bbox_reg_weights: [0.1, 0.1, 0.2, 0.2]
+  bg_thresh_hi: 0.5
+  bg_thresh_lo: 0.0
+  fg_fraction: 0.25
+  fg_thresh: 0.5
+
+BBoxHead:
+  head: ResNetC5
+  nms:
+    keep_top_k: 100
+    nms_threshold: 0.5
+    score_threshold: 0.05
+
+LearningRate:
+  base_lr: 0.01
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [120000, 160000]
+  - !LinearWarmup
+    start_factor: 0.3333333333333333
+    steps: 500
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.9
+    type: Momentum
+  regularizer:
+    factor: 0.0001
+    type: L2
+
+FasterRCNNTrainFeed:
+  # batch size per device
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_train2017.json
+    image_dir: train2017
+  drop_last: false
+  num_workers: 2
+
+FasterRCNNEvalFeed:
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_val2017.json
+    image_dir: val2017
+  num_workers: 2
+
+FasterRCNNTestFeed:
+  batch_size: 1
+  dataset:
+    annotation: dataset/coco/annotations/instances_val2017.json
--- a/configs/faster_rcnn_r101_fpn_1x.yml
+++ b/configs/faster_rcnn_r101_fpn_1x.yml
+architecture: FasterRCNN
+train_feed: FasterRCNNTrainFeed
+eval_feed: FasterRCNNEvalFeed
+test_feed: FasterRCNNTestFeed
+max_iters: 180000
+snapshot_iter: 10000
+use_gpu: true
+log_smooth_window: 20
+save_dir: output
+pretrain_weights: http://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_pretrained.tar
+weights: output/faster_rcnn_r101_fpn_1x/model_final
+metric: COCO
+num_classes: 81
+
+FasterRCNN:
+  backbone: ResNet
+  fpn: FPN
+  rpn_head: FPNRPNHead
+  roi_extractor: FPNRoIAlign
+  bbox_head: BBoxHead
+  bbox_assigner: BBoxAssigner
+
+ResNet:
+  depth: 101
+  feature_maps: [2, 3, 4, 5]
+  freeze_at: 2
+  norm_type: affine_channel
+
+FPN:
+  max_level: 6
+  min_level: 2
+  num_chan: 256
+  spatial_scale: [0.03125, 0.0625, 0.125, 0.25]
+
+FPNRPNHead:
+  anchor_generator:
+    anchor_sizes: [32, 64, 128, 256, 512]
+    aspect_ratios: [0.5, 1.0, 2.0]
+    stride: [16.0, 16.0]
+    variance: [1.0, 1.0, 1.0, 1.0]
+  anchor_start_size: 32
+  max_level: 6
+  min_level: 2
+  num_chan: 256
+  rpn_target_assign:
+    rpn_batch_size_per_im: 256
+    rpn_fg_fraction: 0.5
+    rpn_negative_overlap: 0.3
+    rpn_positive_overlap: 0.7
+    rpn_straddle_thresh: 0.0
+  train_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    post_nms_top_n: 2000
+    pre_nms_top_n: 2000
+  test_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    post_nms_top_n: 1000
+    pre_nms_top_n: 1000
+
+FPNRoIAlign:
+  canconical_level: 4
+  canonical_size: 224
+  max_level: 5
+  min_level: 2
+  box_resolution: 7
+  sampling_ratio: 2
+
+BBoxAssigner:
+  batch_size_per_im: 512
+  bbox_reg_weights: [0.1, 0.1, 0.2, 0.2]
+  bg_thresh_hi: 0.5
+  bg_thresh_lo: 0.0
+  fg_fraction: 0.25
+  fg_thresh: 0.5
+
+BBoxHead:
+  head: TwoFCHead
+  nms:
+    keep_top_k: 100
+    nms_threshold: 0.5
+    score_threshold: 0.05
+
+TwoFCHead:
+  mlp_dim: 1024
+
+LearningRate:
+  base_lr: 0.01
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [120000, 160000]
+  - !LinearWarmup
+    start_factor: 0.3333333333333333
+    steps: 500
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.9
+    type: Momentum
+  regularizer:
+    factor: 0.0001
+    type: L2
+
+FasterRCNNTrainFeed:
+  # batch size per device
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    image_dir: train2017
+    annotation: annotations/instances_train2017.json
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
+
+FasterRCNNEvalFeed:
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_val2017.json
+    image_dir: val2017
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
+
+FasterRCNNTestFeed:
+  batch_size: 1
+  dataset:
+    annotation: dataset/coco/annotations/instances_val2017.json
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
--- a/configs/faster_rcnn_r101_fpn_2x.yml
+++ b/configs/faster_rcnn_r101_fpn_2x.yml
+architecture: FasterRCNN
+train_feed: FasterRCNNTrainFeed
+eval_feed: FasterRCNNEvalFeed
+test_feed: FasterRCNNTestFeed
+max_iters: 360000
+snapshot_iter: 10000
+use_gpu: true
+log_smooth_window: 20
+save_dir: output
+pretrain_weights: http://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_pretrained.tar
+weights: output/faster_rcnn_r101_fpn_2x/model_final
+metric: COCO
+num_classes: 81
+
+FasterRCNN:
+  backbone: ResNet
+  fpn: FPN
+  rpn_head: FPNRPNHead
+  roi_extractor: FPNRoIAlign
+  bbox_head: BBoxHead
+  bbox_assigner: BBoxAssigner
+
+ResNet:
+  depth: 101
+  feature_maps: [2, 3, 4, 5]
+  freeze_at: 2
+  norm_type: affine_channel
+
+FPN:
+  max_level: 6
+  min_level: 2
+  num_chan: 256
+  spatial_scale: [0.03125, 0.0625, 0.125, 0.25]
+
+FPNRPNHead:
+  anchor_generator:
+    anchor_sizes: [32, 64, 128, 256, 512]
+    aspect_ratios: [0.5, 1.0, 2.0]
+    stride: [16.0, 16.0]
+    variance: [1.0, 1.0, 1.0, 1.0]
+  anchor_start_size: 32
+  max_level: 6
+  min_level: 2
+  num_chan: 256
+  rpn_target_assign:
+    rpn_batch_size_per_im: 256
+    rpn_fg_fraction: 0.5
+    rpn_negative_overlap: 0.3
+    rpn_positive_overlap: 0.7
+    rpn_straddle_thresh: 0.0
+  train_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    post_nms_top_n: 2000
+    pre_nms_top_n: 2000
+  test_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    post_nms_top_n: 1000
+    pre_nms_top_n: 1000
+
+FPNRoIAlign:
+  canconical_level: 4
+  canonical_size: 224
+  max_level: 5
+  min_level: 2
+  box_resolution: 7
+  sampling_ratio: 2
+
+BBoxAssigner:
+  batch_size_per_im: 512
+  bbox_reg_weights: [0.1, 0.1, 0.2, 0.2]
+  bg_thresh_hi: 0.5
+  bg_thresh_lo: 0.0
+  fg_fraction: 0.25
+  fg_thresh: 0.5
+
+BBoxHead:
+  head: TwoFCHead
+  nms:
+    keep_top_k: 100
+    nms_threshold: 0.5
+    score_threshold: 0.05
+
+TwoFCHead:
+  mlp_dim: 1024
+
+LearningRate:
+  base_lr: 0.01
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [240000, 320000]
+  - !LinearWarmup
+    start_factor: 0.3333333333333333
+    steps: 500
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.9
+    type: Momentum
+  regularizer:
+    factor: 0.0001
+    type: L2
+
+FasterRCNNTrainFeed:
+  # batch size per device
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    image_dir: train2017
+    annotation: annotations/instances_train2017.json
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
+
+FasterRCNNEvalFeed:
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_val2017.json
+    image_dir: val2017
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
+
+FasterRCNNTestFeed:
+  batch_size: 1
+  dataset:
+    annotation: dataset/coco/annotations/instances_val2017.json
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
--- a/configs/faster_rcnn_r101_vd_fpn_1x.yml
+++ b/configs/faster_rcnn_r101_vd_fpn_1x.yml
+architecture: FasterRCNN
+train_feed: FasterRCNNTrainFeed
+eval_feed: FasterRCNNEvalFeed
+test_feed: FasterRCNNTestFeed
+max_iters: 180000
+snapshot_iter: 10000
+use_gpu: true
+log_smooth_window: 20
+save_dir: output
+pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_vd_pretrained.tar
+weights: output/faster_rcnn_r101_vd_fpn_1x/model_final
+metric: COCO
+num_classes: 81
+
+FasterRCNN:
+  backbone: ResNet
+  fpn: FPN
+  rpn_head: FPNRPNHead
+  roi_extractor: FPNRoIAlign
+  bbox_head: BBoxHead
+  bbox_assigner: BBoxAssigner
+
+ResNet:
+  depth: 101
+  feature_maps: [2, 3, 4, 5]
+  freeze_at: 2
+  norm_type: affine_channel
+  variant: d
+
+FPN:
+  max_level: 6
+  min_level: 2
+  num_chan: 256
+  spatial_scale: [0.03125, 0.0625, 0.125, 0.25]
+
+FPNRPNHead:
+  anchor_generator:
+    anchor_sizes: [32, 64, 128, 256, 512]
+    aspect_ratios: [0.5, 1.0, 2.0]
+    stride: [16.0, 16.0]
+    variance: [1.0, 1.0, 1.0, 1.0]
+  anchor_start_size: 32
+  max_level: 6
+  min_level: 2
+  num_chan: 256
+  rpn_target_assign:
+    rpn_batch_size_per_im: 256
+    rpn_fg_fraction: 0.5
+    rpn_negative_overlap: 0.3
+    rpn_positive_overlap: 0.7
+    rpn_straddle_thresh: 0.0
+  train_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    post_nms_top_n: 2000
+    pre_nms_top_n: 2000
+  test_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    post_nms_top_n: 1000
+    pre_nms_top_n: 1000
+
+FPNRoIAlign:
+  canconical_level: 4
+  canonical_size: 224
+  max_level: 5
+  min_level: 2
+  box_resolution: 7
+  sampling_ratio: 2
+
+BBoxAssigner:
+  batch_size_per_im: 512
+  bbox_reg_weights: [0.1, 0.1, 0.2, 0.2]
+  bg_thresh_hi: 0.5
+  bg_thresh_lo: 0.0
+  fg_fraction: 0.25
+  fg_thresh: 0.5
+
+BBoxHead:
+  head: TwoFCHead
+  nms:
+    keep_top_k: 100
+    nms_threshold: 0.5
+    score_threshold: 0.05
+
+TwoFCHead:
+  mlp_dim: 1024
+
+LearningRate:
+  base_lr: 0.01
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [120000, 160000]
+  - !LinearWarmup
+    start_factor: 0.3333333333333333
+    steps: 1000
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.9
+    type: Momentum
+  regularizer:
+    factor: 0.0001
+    type: L2
+
+FasterRCNNTrainFeed:
+  # batch size per device
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    image_dir: train2017
+    annotation: annotations/instances_train2017.json
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
+
+FasterRCNNEvalFeed:
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_val2017.json
+    image_dir: val2017
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
+
+FasterRCNNTestFeed:
+  batch_size: 1
+  dataset:
+    annotation: dataset/coco/annotations/instances_val2017.json
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
--- a/configs/faster_rcnn_r101_vd_fpn_2x.yml
+++ b/configs/faster_rcnn_r101_vd_fpn_2x.yml
+architecture: FasterRCNN
+train_feed: FasterRCNNTrainFeed
+eval_feed: FasterRCNNEvalFeed
+test_feed: FasterRCNNTestFeed
+max_iters: 360000
+snapshot_iter: 10000
+use_gpu: true
+log_smooth_window: 20
+save_dir: output
+pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_vd_pretrained.tar
+weights: output/faster_rcnn_r101_vd_fpn_2x/model_final
+metric: COCO
+num_classes: 81
+
+FasterRCNN:
+  backbone: ResNet
+  fpn: FPN
+  rpn_head: FPNRPNHead
+  roi_extractor: FPNRoIAlign
+  bbox_head: BBoxHead
+  bbox_assigner: BBoxAssigner
+
+ResNet:
+  depth: 101
+  feature_maps: [2, 3, 4, 5]
+  freeze_at: 2
+  norm_type: affine_channel
+  variant: d
+
+FPN:
+  max_level: 6
+  min_level: 2
+  num_chan: 256
+  spatial_scale: [0.03125, 0.0625, 0.125, 0.25]
+
+FPNRPNHead:
+  anchor_generator:
+    anchor_sizes: [32, 64, 128, 256, 512]
+    aspect_ratios: [0.5, 1.0, 2.0]
+    stride: [16.0, 16.0]
+    variance: [1.0, 1.0, 1.0, 1.0]
+  anchor_start_size: 32
+  max_level: 6
+  min_level: 2
+  num_chan: 256
+  rpn_target_assign:
+    rpn_batch_size_per_im: 256
+    rpn_fg_fraction: 0.5
+    rpn_negative_overlap: 0.3
+    rpn_positive_overlap: 0.7
+    rpn_straddle_thresh: 0.0
+  train_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    post_nms_top_n: 2000
+    pre_nms_top_n: 2000
+  test_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    post_nms_top_n: 1000
+    pre_nms_top_n: 1000
+
+FPNRoIAlign:
+  canconical_level: 4
+  canonical_size: 224
+  max_level: 5
+  min_level: 2
+  box_resolution: 7
+  sampling_ratio: 2
+
+BBoxAssigner:
+  batch_size_per_im: 512
+  bbox_reg_weights: [0.1, 0.1, 0.2, 0.2]
+  bg_thresh_hi: 0.5
+  bg_thresh_lo: 0.0
+  fg_fraction: 0.25
+  fg_thresh: 0.5
+
+BBoxHead:
+  head: TwoFCHead
+  nms:
+    keep_top_k: 100
+    nms_threshold: 0.5
+    score_threshold: 0.05
+
+TwoFCHead:
+  mlp_dim: 1024
+
+LearningRate:
+  base_lr: 0.01
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [240000, 320000]
+  - !LinearWarmup
+    start_factor: 0.3333333333333333
+    steps: 1000
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.9
+    type: Momentum
+  regularizer:
+    factor: 0.0001
+    type: L2
+
+FasterRCNNTrainFeed:
+  # batch size per device
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    image_dir: train2017
+    annotation: annotations/instances_train2017.json
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
+
+FasterRCNNEvalFeed:
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_val2017.json
+    image_dir: val2017
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
+
+FasterRCNNTestFeed:
+  batch_size: 1
+  dataset:
+    annotation: dataset/coco/annotations/instances_val2017.json
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  num_workers: 2
--- a/configs/faster_rcnn_r50_1x.yml
+++ b/configs/faster_rcnn_r50_1x.yml
+architecture: FasterRCNN
+train_feed: FasterRCNNTrainFeed
+eval_feed: FasterRCNNEvalFeed
+test_feed: FasterRCNNTestFeed
+use_gpu: true
+max_iters: 180000
+log_smooth_window: 20
+save_dir: output
+snapshot_iter: 10000
+pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
+metric: COCO
+weights: output/faster_rcnn_r50_1x/model_final
+num_classes: 81
+
+FasterRCNN:
+  backbone: ResNet
+  rpn_head: RPNHead
+  roi_extractor: RoIAlign
+  bbox_head: BBoxHead
+  bbox_assigner: BBoxAssigner
+
+ResNet:
+  norm_type: affine_channel
+  depth: 50
+  feature_maps: 4
+  freeze_at: 2
+
+ResNetC5:
+  depth: 50
+  norm_type: affine_channel
+
+RPNHead:
+  anchor_generator:
+    anchor_sizes: [32, 64, 128, 256, 512]
+    aspect_ratios: [0.5, 1.0, 2.0]
+    stride: [16.0, 16.0]
+    variance: [1.0, 1.0, 1.0, 1.0]
+  rpn_target_assign:
+    rpn_batch_size_per_im: 256
+    rpn_fg_fraction: 0.5
+    rpn_negative_overlap: 0.3
+    rpn_positive_overlap: 0.7
+    rpn_straddle_thresh: 0.0
+    use_random: true
+  train_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 12000
+    post_nms_top_n: 2000
+  test_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 6000
+    post_nms_top_n: 1000
+
+RoIAlign:
+  resolution: 14
+  sampling_ratio: 0
+  spatial_scale: 0.0625
+
+BBoxAssigner:
+  batch_size_per_im: 512
+  bbox_reg_weights: [0.1, 0.1, 0.2, 0.2]
+  bg_thresh_hi: 0.5
+  bg_thresh_lo: 0.0
+  fg_fraction: 0.25
+  fg_thresh: 0.5
+
+BBoxHead:
+  head: ResNetC5
+  nms:
+    keep_top_k: 100
+    nms_threshold: 0.5
+    score_threshold: 0.05
+
+LearningRate:
+  base_lr: 0.01
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [120000, 160000]
+  - !LinearWarmup
+    start_factor: 0.3333333333333333
+    steps: 500
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.9
+    type: Momentum
+  regularizer:
+    factor: 0.0001
+    type: L2
+
+FasterRCNNTrainFeed:
+  # batch size per device
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_train2017.json
+    image_dir: train2017
+  drop_last: false
+  num_workers: 2
+
+FasterRCNNEvalFeed:
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_val2017.json
+    image_dir: val2017
+  num_workers: 2
+
+FasterRCNNTestFeed:
+  batch_size: 1
+  dataset:
+    annotation: dataset/coco/annotations/instances_val2017.json
--- a/configs/faster_rcnn_r50_2x.yml
+++ b/configs/faster_rcnn_r50_2x.yml
+architecture: FasterRCNN
+train_feed: FasterRCNNTrainFeed
+eval_feed: FasterRCNNEvalFeed
+test_feed: FasterRCNNTestFeed
+use_gpu: true
+max_iters: 360000
+log_smooth_window: 20
+save_dir: output
+snapshot_iter: 10000
+pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
+metric: COCO
+weights: output/faster_rcnn_r50_2x/model_final
+num_classes: 81
+
+FasterRCNN:
+  backbone: ResNet
+  rpn_head: RPNHead
+  roi_extractor: RoIAlign
+  bbox_head: BBoxHead
+  bbox_assigner: BBoxAssigner
+
+ResNet:
+  norm_type: affine_channel
+  depth: 50
+  feature_maps: 4
+  freeze_at: 2
+
+ResNetC5:
+  depth: 50
+  norm_type: affine_channel
+
+RPNHead:
+  anchor_generator:
+    anchor_sizes: [32, 64, 128, 256, 512]
+    aspect_ratios: [0.5, 1.0, 2.0]
+    stride: [16.0, 16.0]
+    variance: [1.0, 1.0, 1.0, 1.0]
+  rpn_target_assign:
+    rpn_batch_size_per_im: 256
+    rpn_fg_fraction: 0.5
+    rpn_negative_overlap: 0.3
+    rpn_positive_overlap: 0.7
+    rpn_straddle_thresh: 0.0
+    use_random: true
+  train_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 12000
+    post_nms_top_n: 2000
+  test_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 6000
+    post_nms_top_n: 1000
+
+RoIAlign:
+  resolution: 14
+  sampling_ratio: 0
+  spatial_scale: 0.0625
+
+BBoxAssigner:
+  batch_size_per_im: 512
+  bbox_reg_weights: [0.1, 0.1, 0.2, 0.2]
+  bg_thresh_hi: 0.5
+  bg_thresh_lo: 0.0
+  fg_fraction: 0.25
+  fg_thresh: 0.5
+
+BBoxHead:
+  head: ResNetC5
+  nms:
+    keep_top_k: 100
+    nms_threshold: 0.5
+    score_threshold: 0.05
+
+LearningRate:
+  base_lr: 0.01
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [240000, 320000]
+  - !LinearWarmup
+    start_factor: 0.3333333333333333
+    steps: 500
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.9
+    type: Momentum
+  regularizer:
+    factor: 0.0001
+    type: L2
+
+FasterRCNNTrainFeed:
+  # batch size per device
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_train2017.json
+    image_dir: train2017
+  drop_last: false
+  num_workers: 2
+
+FasterRCNNEvalFeed:
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_val2017.json
+    image_dir: val2017
+  num_workers: 2
+
+FasterRCNNTestFeed:
+  batch_size: 1
+  dataset:
+    annotation: dataset/coco/annotations/instances_val2017.json
--- a/configs/faster_rcnn_r50_fpn_1x.yml
+++ b/configs/faster_rcnn_r50_fpn_1x.yml
+architecture: FasterRCNN
+train_feed: FasterRCNNTrainFeed
+eval_feed: FasterRCNNEvalFeed
+test_feed: FasterRCNNTestFeed
+max_iters: 90000
+use_gpu: true
+snapshot_iter: 10000
+log_smooth_window: 20
+save_dir: output
+pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
+metric: COCO
+weights: output/faster_rcnn_r50_fpn_1x/model_final
+num_classes: 81
+
+FasterRCNN:
+  backbone: ResNet
+  fpn: FPN
+  rpn_head: FPNRPNHead
+  roi_extractor: FPNRoIAlign
+  bbox_head: BBoxHead
+  bbox_assigner: BBoxAssigner
+
+ResNet:
+  norm_type: bn
+  norm_decay: 0.
+  depth: 50
+  feature_maps: [2, 3, 4, 5]
+  freeze_at: 2
+
+FPN:
+  min_level: 2
+  max_level: 6
+  num_chan: 256
+  spatial_scale: [0.03125, 0.0625, 0.125, 0.25]
+
+FPNRPNHead:
+  anchor_generator:
+    anchor_sizes: [32, 64, 128, 256, 512]
+    aspect_ratios: [0.5, 1.0, 2.0]
+    stride: [16.0, 16.0]
+    variance: [1.0, 1.0, 1.0, 1.0]
+  anchor_start_size: 32
+  min_level: 2
+  max_level: 6
+  num_chan: 256
+  rpn_target_assign:
+    rpn_batch_size_per_im: 256
+    rpn_fg_fraction: 0.5
+    rpn_positive_overlap: 0.7
+    rpn_negative_overlap: 0.3
+    rpn_straddle_thresh: 0.0
+  train_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 2000
+    post_nms_top_n: 2000
+  test_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 1000
+    post_nms_top_n: 1000
+
+FPNRoIAlign:
+  canconical_level: 4
+  canonical_size: 224
+  min_level: 2
+  max_level: 5
+  box_resolution: 7
+  sampling_ratio: 2
+
+BBoxAssigner:
+  batch_size_per_im: 512
+  bbox_reg_weights: [0.1, 0.1, 0.2, 0.2]
+  bg_thresh_lo: 0.0
+  bg_thresh_hi: 0.5
+  fg_fraction: 0.25
+  fg_thresh: 0.5
+
+BBoxHead:
+  head: TwoFCHead
+  nms:
+    keep_top_k: 100
+    nms_threshold: 0.5
+    score_threshold: 0.05
+
+TwoFCHead:
+  mlp_dim: 1024
+
+LearningRate:
+  base_lr: 0.02
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [60000, 80000]
+  - !LinearWarmup
+    start_factor: 0.3333333333333333
+    steps: 500
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.9
+    type: Momentum
+  regularizer:
+    factor: 0.0001
+    type: L2
+
+FasterRCNNTrainFeed:
+  batch_size: 2
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_train2017.json
+    image_dir: train2017
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  drop_last: false
+  num_workers: 2
+
+FasterRCNNEvalFeed:
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_val2017.json
+    image_dir: val2017
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+
+FasterRCNNTestFeed:
+  batch_size: 1
+  dataset:
+    annotation: dataset/coco/annotations/instances_val2017.json
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  drop_last: false
+  num_workers: 2
--- a/configs/faster_rcnn_r50_fpn_2x.yml
+++ b/configs/faster_rcnn_r50_fpn_2x.yml
+architecture: FasterRCNN
+train_feed: FasterRCNNTrainFeed
+eval_feed: FasterRCNNEvalFeed
+test_feed: FasterRCNNTestFeed
+max_iters: 180000
+use_gpu: true
+snapshot_iter: 10000
+log_smooth_window: 20
+save_dir: output
+pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
+metric: COCO
+weights: output/faster_rcnn_r50_fpn_2x/model_final
+num_classes: 81
+
+FasterRCNN:
+  backbone: ResNet
+  fpn: FPN
+  rpn_head: FPNRPNHead
+  roi_extractor: FPNRoIAlign
+  bbox_head: BBoxHead
+  bbox_assigner: BBoxAssigner
+
+ResNet:
+  norm_type: affine_channel
+  norm_decay: 0.
+  depth: 50
+  feature_maps: [2, 3, 4, 5]
+  freeze_at: 2
+
+FPN:
+  min_level: 2
+  max_level: 6
+  num_chan: 256
+  spatial_scale: [0.03125, 0.0625, 0.125, 0.25]
+
+FPNRPNHead:
+  anchor_generator:
+    anchor_sizes: [32, 64, 128, 256, 512]
+    aspect_ratios: [0.5, 1.0, 2.0]
+    stride: [16.0, 16.0]
+    variance: [1.0, 1.0, 1.0, 1.0]
+  anchor_start_size: 32
+  min_level: 2
+  max_level: 6
+  num_chan: 256
+  rpn_target_assign:
+    rpn_batch_size_per_im: 256
+    rpn_fg_fraction: 0.5
+    rpn_positive_overlap: 0.7
+    rpn_negative_overlap: 0.3
+    rpn_straddle_thresh: 0.0
+  train_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 2000
+    post_nms_top_n: 2000
+  test_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 1000
+    post_nms_top_n: 1000
+
+FPNRoIAlign:
+  canconical_level: 4
+  canonical_size: 224
+  min_level: 2
+  max_level: 5
+  box_resolution: 7
+  sampling_ratio: 2
+
+BBoxAssigner:
+  batch_size_per_im: 512
+  bbox_reg_weights: [0.1, 0.1, 0.2, 0.2]
+  bg_thresh_lo: 0.0
+  bg_thresh_hi: 0.5
+  fg_fraction: 0.25
+  fg_thresh: 0.5
+
+BBoxHead:
+  head: TwoFCHead
+  nms:
+    keep_top_k: 100
+    nms_threshold: 0.5
+    score_threshold: 0.05
+
+TwoFCHead:
+  mlp_dim: 1024
+
+LearningRate:
+  base_lr: 0.02
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [120000, 160000]
+  - !LinearWarmup
+    start_factor: 0.3333333333333333
+    steps: 500
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.9
+    type: Momentum
+  regularizer:
+    factor: 0.0001
+    type: L2
+
+FasterRCNNTrainFeed:
+  batch_size: 2
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_train2017.json
+    image_dir: train2017
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  drop_last: false
+  num_workers: 2
+
+FasterRCNNEvalFeed:
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_val2017.json
+    image_dir: val2017
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+
+FasterRCNNTestFeed:
+  batch_size: 1
+  dataset:
+    annotation: dataset/coco/coco/annotations/instances_val2017.json
+  batch_transforms:
+  - !PadBatch
+    pad_to_stride: 32
+  drop_last: false
+  num_workers: 2
--- a/configs/faster_rcnn_r50_vd_1x.yml
+++ b/configs/faster_rcnn_r50_vd_1x.yml
+architecture: FasterRCNN
+train_feed: FasterRCNNTrainFeed
+eval_feed: FasterRCNNEvalFeed
+test_feed: FasterRCNNTestFeed
+use_gpu: true
+max_iters: 180000
+log_smooth_window: 20
+save_dir: output/faster-r50-vd-c4-1x
+snapshot_iter: 10000
+pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_vd_pretrained.tar
+metric: COCO
+weights: output/faster_rcnn_r50_vd_1x/model_final
+num_classes: 81
+
+FasterRCNN:
+  backbone: ResNet
+  rpn_head: RPNHead
+  roi_extractor: RoIAlign
+  bbox_head: BBoxHead
+  bbox_assigner: BBoxAssigner
+
+ResNet:
+  norm_type: affine_channel
+  depth: 50
+  feature_maps: 4
+  freeze_at: 2
+  variant: d
+
+ResNetC5:
+  depth: 50
+  norm_type: affine_channel
+  variant: d
+
+RPNHead:
+  anchor_generator:
+    anchor_sizes: [32, 64, 128, 256, 512]
+    aspect_ratios: [0.5, 1.0, 2.0]
+    stride: [16.0, 16.0]
+    variance: [1.0, 1.0, 1.0, 1.0]
+  rpn_target_assign:
+    rpn_batch_size_per_im: 256
+    rpn_fg_fraction: 0.5
+    rpn_negative_overlap: 0.3
+    rpn_positive_overlap: 0.7
+    rpn_straddle_thresh: 0.0
+    use_random: true
+  train_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 12000
+    post_nms_top_n: 2000
+  test_proposal:
+    min_size: 0.0
+    nms_thresh: 0.7
+    pre_nms_top_n: 6000
+    post_nms_top_n: 1000
+
+RoIAlign:
+  resolution: 14
+  sampling_ratio: 0
+  spatial_scale: 0.0625
+
+BBoxAssigner:
+  batch_size_per_im: 512
+  bbox_reg_weights: [0.1, 0.1, 0.2, 0.2]
+  bg_thresh_hi: 0.5
+  bg_thresh_lo: 0.0
+  fg_fraction: 0.25
+  fg_thresh: 0.5
+
+BBoxHead:
+  head: ResNetC5
+  nms:
+    keep_top_k: 100
+    nms_threshold: 0.5
+    score_threshold: 0.05
+
+LearningRate:
+  base_lr: 0.01
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 0.1
+    milestones: [120000, 160000]
+  - !LinearWarmup
+    start_factor: 0.1
+    steps: 1000
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.9
+    type: Momentum
+  regularizer:
+    factor: 0.0001
+    type: L2
+
+FasterRCNNTrainFeed:
+  # batch size per device
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_train2017.json
+    image_dir: train2017
+  drop_last: false
+  num_workers: 2
+
+FasterRCNNEvalFeed:
+  batch_size: 1
+  dataset:
+    dataset_dir: dataset/coco
+    annotation: annotations/instances_val2017.json
+    image_dir: val2017
+  num_workers: 2
+
+FasterRCNNTestFeed:
+  batch_size: 1
+  dataset:
+    annotation: dataset/coco/annotations/instances_val2017.json
--- a/configs/faster_rcnn_r50_vd_fpn_2x.yml
+++ b/configs/faster_rcnn_r50_vd_fpn_2x.yml
--- a/configs/faster_rcnn_se154_vd_fpn_s1x.yml
+++ b/configs/faster_rcnn_se154_vd_fpn_s1x.yml
--- a/configs/faster_rcnn_x101_vd_64x4d_fpn_1x.yml
+++ b/configs/faster_rcnn_x101_vd_64x4d_fpn_1x.yml
--- a/configs/faster_rcnn_x101_vd_64x4d_fpn_2x.yml
+++ b/configs/faster_rcnn_x101_vd_64x4d_fpn_2x.yml
--- a/configs/gn/cascade_mask_rcnn_r50_fpn_gn_2x.yml
+++ b/configs/gn/cascade_mask_rcnn_r50_fpn_gn_2x.yml
--- a/configs/gn/faster_rcnn_r50_fpn_gn_2x.yml
+++ b/configs/gn/faster_rcnn_r50_fpn_gn_2x.yml
--- a/configs/gn/mask_rcnn_r50_fpn_gn_2x.yml
+++ b/configs/gn/mask_rcnn_r50_fpn_gn_2x.yml
--- a/configs/mask_rcnn_r101_fpn_1x.yml
+++ b/configs/mask_rcnn_r101_fpn_1x.yml
--- a/configs/mask_rcnn_r101_vd_fpn_1x.yml
+++ b/configs/mask_rcnn_r101_vd_fpn_1x.yml
--- a/configs/mask_rcnn_r50_1x.yml
+++ b/configs/mask_rcnn_r50_1x.yml
--- a/configs/mask_rcnn_r50_2x.yml
+++ b/configs/mask_rcnn_r50_2x.yml
--- a/configs/mask_rcnn_r50_fpn_1x.yml
+++ b/configs/mask_rcnn_r50_fpn_1x.yml
--- a/configs/mask_rcnn_r50_fpn_2x.yml
+++ b/configs/mask_rcnn_r50_fpn_2x.yml
--- a/configs/mask_rcnn_r50_vd_fpn_2x.yml
+++ b/configs/mask_rcnn_r50_vd_fpn_2x.yml
--- a/configs/mask_rcnn_se154_vd_fpn_s1x.yml
+++ b/configs/mask_rcnn_se154_vd_fpn_s1x.yml
--- a/configs/mask_rcnn_x101_vd_64x4d_fpn_1x.yml
+++ b/configs/mask_rcnn_x101_vd_64x4d_fpn_1x.yml
--- a/configs/mask_rcnn_x101_vd_64x4d_fpn_2x.yml
+++ b/configs/mask_rcnn_x101_vd_64x4d_fpn_2x.yml
--- a/configs/obj365/cascade_rcnn_dcnv2_se154_vd_fpn_gn_cas.yml
+++ b/configs/obj365/cascade_rcnn_dcnv2_se154_vd_fpn_gn_cas.yml
--- a/configs/retinanet_r101_fpn_1x.yml
+++ b/configs/retinanet_r101_fpn_1x.yml
--- a/configs/retinanet_r50_fpn_1x.yml
+++ b/configs/retinanet_r50_fpn_1x.yml
--- a/configs/retinanet_x101_vd_64x4d_fpn_1x.yml
+++ b/configs/retinanet_x101_vd_64x4d_fpn_1x.yml
--- a/configs/ssd/ssd_mobilenet_v1_voc.yml
+++ b/configs/ssd/ssd_mobilenet_v1_voc.yml
+architecture: SSD
+train_feed: SSDTrainFeed
+eval_feed: SSDEvalFeed
+test_feed: SSDTestFeed
+pretrain_weights: https://paddlemodels.bj.bcebos.com/object_detection/ssd_mobilenet_v1_coco_pretrained.tar
+use_gpu: true
+max_iters: 28000
+snapshot_iter: 2000
+log_smooth_window: 1
+metric: VOC
+map_type: 11point
+save_dir: output
+weights: output/ssd_mobilenet_v1_voc/model_final/
+# 20(label_class) + 1(background)
+num_classes: 21
+
+SSD:
+  backbone: MobileNet
+  multi_box_head: MultiBoxHead
+  output_decoder:
+    background_label: 0
+    keep_top_k: 200
+    nms_eta: 1.0
+    nms_threshold: 0.45
+    nms_top_k: 400
+    score_threshold: 0.01
+
+MobileNet:
+  norm_decay: 0.
+  conv_group_scale: 1
+  conv_learning_rate: 0.1
+  extra_block_filters: [[256, 512], [128, 256], [128, 256], [64, 128]]
+  with_extra_blocks: true
+
+MultiBoxHead:
+  aspect_ratios: [[2.], [2., 3.], [2., 3.], [2., 3.], [2., 3.], [2., 3.]]
+  base_size: 300
+  flip: true
+  max_ratio: 90
+  max_sizes: [[], 150.0, 195.0, 240.0, 285.0, 300.0]
+  min_ratio: 20
+  min_sizes: [60.0, 105.0, 150.0, 195.0, 240.0, 285.0]
+  offset: 0.5
+
+LearningRate:
+  schedulers:
+  - !PiecewiseDecay
+    milestones: [10000, 15000, 20000, 25000]
+    values: [0.001, 0.0005, 0.00025, 0.0001, 0.00001]
+
+OptimizerBuilder:
+  optimizer:
+    momentum: 0.0
+    type: RMSPropOptimizer
+  regularizer:
+    factor: 0.00005
+    type: L2
+
+SSDTrainFeed:
+  batch_size: 32
+  use_process: true
+  dataset:
+    dataset_dir: dataset/voc
+    annotation: VOCdevkit/VOC_all/ImageSets/Main/train.txt
+    image_dir: VOCdevkit/VOC_all/JPEGImages
+    use_default_label: true
+
+SSDEvalFeed:
+  batch_size: 64
+  use_process: true
+  dataset:
+    dataset_dir: dataset/voc
+    annotation: VOCdevkit/VOC_all/ImageSets/Main/val.txt
+    image_dir: VOCdevkit/VOC_all/JPEGImages
+    use_default_label: true
+  drop_last: false
+
+SSDTestFeed:
+  batch_size: 1
+  dataset:
+    use_default_label: true
+  drop_last: false
--- a/configs/ssd/ssd_vgg16_300.yml
+++ b/configs/ssd/ssd_vgg16_300.yml
--- a/configs/ssd/ssd_vgg16_300_voc.yml
+++ b/configs/ssd/ssd_vgg16_300_voc.yml
--- a/configs/ssd/ssd_vgg16_512.yml
+++ b/configs/ssd/ssd_vgg16_512.yml
--- a/configs/ssd/ssd_vgg16_512_voc.yml
+++ b/configs/ssd/ssd_vgg16_512_voc.yml
--- a/configs/yolov3_darknet.yml
+++ b/configs/yolov3_darknet.yml
--- a/configs/yolov3_darknet_voc.yml
+++ b/configs/yolov3_darknet_voc.yml
--- a/configs/yolov3_mobilenet_v1.yml
+++ b/configs/yolov3_mobilenet_v1.yml
--- a/configs/yolov3_mobilenet_v1_fruit.yml
+++ b/configs/yolov3_mobilenet_v1_fruit.yml
--- a/configs/yolov3_mobilenet_v1_voc.yml
+++ b/configs/yolov3_mobilenet_v1_voc.yml
--- a/configs/yolov3_r34.yml
+++ b/configs/yolov3_r34.yml
--- a/configs/yolov3_r34_voc.yml
+++ b/configs/yolov3_r34_voc.yml
--- a/contrib/PedestrianDetection/demo/001.png
+++ b/contrib/PedestrianDetection/demo/001.png
--- a/contrib/PedestrianDetection/demo/002.png
+++ b/contrib/PedestrianDetection/demo/002.png
--- a/contrib/PedestrianDetection/demo/003.png
+++ b/contrib/PedestrianDetection/demo/003.png
--- a/contrib/PedestrianDetection/demo/004.png
+++ b/contrib/PedestrianDetection/demo/004.png
--- a/contrib/PedestrianDetection/demo/output/001.png
+++ b/contrib/PedestrianDetection/demo/output/001.png
--- a/contrib/PedestrianDetection/demo/output/004.png
+++ b/contrib/PedestrianDetection/demo/output/004.png
--- a/contrib/PedestrianDetection/pedestrian.json
+++ b/contrib/PedestrianDetection/pedestrian.json
--- a/contrib/PedestrianDetection/pedestrian_yolov3_darknet.yml
+++ b/contrib/PedestrianDetection/pedestrian_yolov3_darknet.yml
--- a/contrib/README.md
+++ b/contrib/README.md
--- a/contrib/README_cn.md
+++ b/contrib/README_cn.md
--- a/contrib/VehicleDetection/demo/001.jpeg
+++ b/contrib/VehicleDetection/demo/001.jpeg
--- a/contrib/VehicleDetection/demo/003.png
+++ b/contrib/VehicleDetection/demo/003.png
--- a/contrib/VehicleDetection/demo/004.png
+++ b/contrib/VehicleDetection/demo/004.png
--- a/contrib/VehicleDetection/demo/005.png
+++ b/contrib/VehicleDetection/demo/005.png
--- a/contrib/VehicleDetection/demo/output/001.jpeg
+++ b/contrib/VehicleDetection/demo/output/001.jpeg
--- a/contrib/VehicleDetection/demo/output/005.png
+++ b/contrib/VehicleDetection/demo/output/005.png
--- a/contrib/VehicleDetection/vehicle.json
+++ b/contrib/VehicleDetection/vehicle.json
--- a/contrib/VehicleDetection/vehicle_yolov3_darknet.yml
+++ b/contrib/VehicleDetection/vehicle_yolov3_darknet.yml
--- a/dataset/coco/download_coco.py
+++ b/dataset/coco/download_coco.py
--- a/dataset/fruit/download_fruit.py
+++ b/dataset/fruit/download_fruit.py
--- a/dataset/voc/download_voc.py
+++ b/dataset/voc/download_voc.py
--- a/dataset/wider_face/download.sh
+++ b/dataset/wider_face/download.sh
--- a/demo/000000014439.jpg
+++ b/demo/000000014439.jpg
--- a/demo/000000014439_640x640.jpg
+++ b/demo/000000014439_640x640.jpg
--- a/demo/000000087038.jpg
+++ b/demo/000000087038.jpg
--- a/demo/000000570688.jpg
+++ b/demo/000000570688.jpg
--- a/demo/cas.png
+++ b/demo/cas.png
--- a/demo/mask_rcnn_demo.ipynb
+++ b/demo/mask_rcnn_demo.ipynb
--- a/demo/obj365_gt.png
+++ b/demo/obj365_gt.png
--- a/demo/obj365_pred.png
+++ b/demo/obj365_pred.png
--- a/demo/orange_71.jpg
+++ b/demo/orange_71.jpg
--- a/demo/orange_71_detection.jpg
+++ b/demo/orange_71_detection.jpg
--- a/demo/output/000000570688.jpg
+++ b/demo/output/000000570688.jpg
--- a/demo/output/12_Group_Group_12_Group_Group_12_935.jpg
+++ b/demo/output/12_Group_Group_12_Group_Group_12_935.jpg
--- a/demo/tensorboard_fruit.jpg
+++ b/demo/tensorboard_fruit.jpg
--- a/docs/BENCHMARK_INFER_cn.md
+++ b/docs/BENCHMARK_INFER_cn.md
--- a/docs/CACascadeRCNN.md
+++ b/docs/CACascadeRCNN.md
--- a/docs/CONFIG.md
+++ b/docs/CONFIG.md
--- a/docs/CONFIG_cn.md
+++ b/docs/CONFIG_cn.md
--- a/docs/DATA.md
+++ b/docs/DATA.md
--- a/docs/DATA_cn.md
+++ b/docs/DATA_cn.md
--- a/docs/EXPORT_MODEL.md
+++ b/docs/EXPORT_MODEL.md
--- a/docs/GETTING_STARTED.md
+++ b/docs/GETTING_STARTED.md
--- a/docs/GETTING_STARTED_cn.md
+++ b/docs/GETTING_STARTED_cn.md
--- a/docs/INSTALL.md
+++ b/docs/INSTALL.md
--- a/docs/INSTALL_cn.md
+++ b/docs/INSTALL_cn.md
--- a/docs/MODEL_ZOO.md
+++ b/docs/MODEL_ZOO.md
--- a/docs/MODEL_ZOO_cn.md
+++ b/docs/MODEL_ZOO_cn.md
--- a/docs/QUICK_STARTED.md
+++ b/docs/QUICK_STARTED.md
--- a/docs/QUICK_STARTED_cn.md
+++ b/docs/QUICK_STARTED_cn.md
--- a/docs/TRANSFER_LEARNING.md
+++ b/docs/TRANSFER_LEARNING.md
--- a/docs/TRANSFER_LEARNING_cn.md
+++ b/docs/TRANSFER_LEARNING_cn.md
--- a/docs/config_example/mask_rcnn_r50_fpn_1x.yml
+++ b/docs/config_example/mask_rcnn_r50_fpn_1x.yml
--- a/docs/config_example/ssd_vgg16_300.yml
+++ b/docs/config_example/ssd_vgg16_300.yml
--- a/docs/config_example/yolov3_darknet.yml
+++ b/docs/config_example/yolov3_darknet.yml
--- a/docs/images/bench_ssd_yolo_infer.png
+++ b/docs/images/bench_ssd_yolo_infer.png
--- a/inference/CMakeLists.txt
+++ b/inference/CMakeLists.txt
--- a/inference/LICENSE
+++ b/inference/LICENSE
--- a/inference/README.md
+++ b/inference/README.md
--- a/inference/conf/detection_rcnn.yaml
+++ b/inference/conf/detection_rcnn.yaml
--- a/inference/conf/detection_rcnn_fpn.yaml
+++ b/inference/conf/detection_rcnn_fpn.yaml
--- a/inference/demo_images/000000087038.jpg
+++ b/inference/demo_images/000000087038.jpg
--- a/inference/demo_images/000000087038.jpg.png
+++ b/inference/demo_images/000000087038.jpg.png
--- a/inference/detection_demo.cpp
+++ b/inference/detection_demo.cpp
--- a/inference/docs/configuration.md
+++ b/inference/docs/configuration.md
--- a/inference/docs/linux_build.md
+++ b/inference/docs/linux_build.md
--- a/inference/docs/windows_vs2015_build.md
+++ b/inference/docs/windows_vs2015_build.md
--- a/inference/docs/windows_vs2019_build.md
+++ b/inference/docs/windows_vs2019_build.md
--- a/inference/external-cmake/yaml-cpp.cmake
+++ b/inference/external-cmake/yaml-cpp.cmake
--- a/inference/images/detection_rcnn/000000014439.jpg
+++ b/inference/images/detection_rcnn/000000014439.jpg
--- a/inference/images/detection_rcnn/000000087038.jpg
+++ b/inference/images/detection_rcnn/000000087038.jpg
--- a/inference/images/detection_rcnn/000000570688.jpg
+++ b/inference/images/detection_rcnn/000000570688.jpg
--- a/inference/predictor/detection_predictor.cpp
+++ b/inference/predictor/detection_predictor.cpp
--- a/inference/predictor/detection_predictor.h
+++ b/inference/predictor/detection_predictor.h
--- a/inference/preprocessor/preprocessor.cpp
+++ b/inference/preprocessor/preprocessor.cpp
--- a/inference/preprocessor/preprocessor.h
+++ b/inference/preprocessor/preprocessor.h
--- a/inference/preprocessor/preprocessor_detection.cpp
+++ b/inference/preprocessor/preprocessor_detection.cpp
--- a/inference/preprocessor/preprocessor_detection.h
+++ b/inference/preprocessor/preprocessor_detection.h
--- a/inference/tools/coco17.json
+++ b/inference/tools/coco17.json
--- a/inference/tools/detection_result_pb2.py
+++ b/inference/tools/detection_result_pb2.py
--- a/inference/tools/vis.py
+++ b/inference/tools/vis.py
--- a/inference/utils/conf_parser.h
+++ b/inference/utils/conf_parser.h
--- a/inference/utils/detection_result.pb.cc
+++ b/inference/utils/detection_result.pb.cc
--- a/inference/utils/detection_result.pb.h
+++ b/inference/utils/detection_result.pb.h
--- a/inference/utils/detection_result.proto
+++ b/inference/utils/detection_result.proto
--- a/inference/utils/utils.h
+++ b/inference/utils/utils.h
--- a/ppdet/__init__.py
+++ b/ppdet/__init__.py
--- a/ppdet/core/__init__.py
+++ b/ppdet/core/__init__.py
--- a/ppdet/core/config/__init__.py
+++ b/ppdet/core/config/__init__.py
--- a/ppdet/core/config/schema.py
+++ b/ppdet/core/config/schema.py
--- a/ppdet/core/config/yaml_helpers.py
+++ b/ppdet/core/config/yaml_helpers.py
--- a/ppdet/core/workspace.py
+++ b/ppdet/core/workspace.py
--- a/ppdet/data/README.md
+++ b/ppdet/data/README.md
--- a/ppdet/data/README_cn.md
+++ b/ppdet/data/README_cn.md
--- a/ppdet/data/__init__.py
+++ b/ppdet/data/__init__.py
--- a/ppdet/data/data_feed.py
+++ b/ppdet/data/data_feed.py
--- a/ppdet/data/dataset.py
+++ b/ppdet/data/dataset.py
--- a/ppdet/data/reader.py
+++ b/ppdet/data/reader.py
--- a/ppdet/data/source/__init__.py
+++ b/ppdet/data/source/__init__.py
--- a/ppdet/data/source/class_aware_sampling_roidb_source.py
+++ b/ppdet/data/source/class_aware_sampling_roidb_source.py
--- a/ppdet/data/source/coco_loader.py
+++ b/ppdet/data/source/coco_loader.py
--- a/ppdet/data/source/iterator_source.py
+++ b/ppdet/data/source/iterator_source.py
--- a/ppdet/data/source/loader.py
+++ b/ppdet/data/source/loader.py
--- a/ppdet/data/source/roidb_source.py
+++ b/ppdet/data/source/roidb_source.py
--- a/ppdet/data/source/simple_source.py
+++ b/ppdet/data/source/simple_source.py
--- a/ppdet/data/source/voc_loader.py
+++ b/ppdet/data/source/voc_loader.py
--- a/ppdet/data/source/widerface_loader.py
+++ b/ppdet/data/source/widerface_loader.py
--- a/ppdet/data/tests/000012.jpg
+++ b/ppdet/data/tests/000012.jpg
--- a/ppdet/data/tests/coco.yml
+++ b/ppdet/data/tests/coco.yml
--- a/ppdet/data/tests/data/prepare_data.sh
+++ b/ppdet/data/tests/data/prepare_data.sh
--- a/ppdet/data/tests/rcnn_dataset.yml
+++ b/ppdet/data/tests/rcnn_dataset.yml
--- a/ppdet/data/tests/run_all_tests.py
+++ b/ppdet/data/tests/run_all_tests.py
--- a/ppdet/data/tests/set_env.py
+++ b/ppdet/data/tests/set_env.py
--- a/ppdet/data/tests/test_iterator_source.py
+++ b/ppdet/data/tests/test_iterator_source.py
--- a/ppdet/data/tests/test_loader.py
+++ b/ppdet/data/tests/test_loader.py
--- a/ppdet/data/tests/test_operator.py
+++ b/ppdet/data/tests/test_operator.py
--- a/ppdet/data/tests/test_reader.py
+++ b/ppdet/data/tests/test_reader.py
--- a/ppdet/data/tests/test_roidb_source.py
+++ b/ppdet/data/tests/test_roidb_source.py
--- a/ppdet/data/tests/test_transformer.py
+++ b/ppdet/data/tests/test_transformer.py
--- a/ppdet/data/tools/generate_data_for_training.py
+++ b/ppdet/data/tools/generate_data_for_training.py
--- a/ppdet/data/tools/labelme2coco.py
+++ b/ppdet/data/tools/labelme2coco.py
--- a/ppdet/data/transform/__init__.py
+++ b/ppdet/data/transform/__init__.py
--- a/ppdet/data/transform/arrange_sample.py
+++ b/ppdet/data/transform/arrange_sample.py
--- a/ppdet/data/transform/op_helper.py
+++ b/ppdet/data/transform/op_helper.py
--- a/ppdet/data/transform/operators.py
+++ b/ppdet/data/transform/operators.py
--- a/ppdet/data/transform/parallel_map.py
+++ b/ppdet/data/transform/parallel_map.py
--- a/ppdet/data/transform/post_map.py
+++ b/ppdet/data/transform/post_map.py
--- a/ppdet/data/transform/shared_queue/__init__.py
+++ b/ppdet/data/transform/shared_queue/__init__.py
--- a/ppdet/data/transform/shared_queue/queue.py
+++ b/ppdet/data/transform/shared_queue/queue.py
--- a/ppdet/data/transform/shared_queue/sharedmemory.py
+++ b/ppdet/data/transform/shared_queue/sharedmemory.py
--- a/ppdet/data/transform/transformer.py
+++ b/ppdet/data/transform/transformer.py
--- a/ppdet/experimental/__init__.py
+++ b/ppdet/experimental/__init__.py
--- a/ppdet/experimental/mixed_precision.py
+++ b/ppdet/experimental/mixed_precision.py
--- a/ppdet/modeling/__init__.py
+++ b/ppdet/modeling/__init__.py
--- a/ppdet/modeling/anchor_heads/__init__.py
+++ b/ppdet/modeling/anchor_heads/__init__.py
--- a/ppdet/modeling/anchor_heads/retina_head.py
+++ b/ppdet/modeling/anchor_heads/retina_head.py
--- a/ppdet/modeling/anchor_heads/rpn_head.py
+++ b/ppdet/modeling/anchor_heads/rpn_head.py
--- a/ppdet/modeling/anchor_heads/yolo_head.py
+++ b/ppdet/modeling/anchor_heads/yolo_head.py
--- a/ppdet/modeling/architectures/__init__.py
+++ b/ppdet/modeling/architectures/__init__.py
--- a/ppdet/modeling/architectures/blazeface.py
+++ b/ppdet/modeling/architectures/blazeface.py
--- a/ppdet/modeling/architectures/cascade_mask_rcnn.py
+++ b/ppdet/modeling/architectures/cascade_mask_rcnn.py
--- a/ppdet/modeling/architectures/cascade_rcnn.py
+++ b/ppdet/modeling/architectures/cascade_rcnn.py
--- a/ppdet/modeling/architectures/faceboxes.py
+++ b/ppdet/modeling/architectures/faceboxes.py
--- a/ppdet/modeling/architectures/faster_rcnn.py
+++ b/ppdet/modeling/architectures/faster_rcnn.py
--- a/ppdet/modeling/architectures/mask_rcnn.py
+++ b/ppdet/modeling/architectures/mask_rcnn.py
--- a/ppdet/modeling/architectures/retinanet.py
+++ b/ppdet/modeling/architectures/retinanet.py
--- a/ppdet/modeling/architectures/ssd.py
+++ b/ppdet/modeling/architectures/ssd.py
--- a/ppdet/modeling/architectures/yolov3.py
+++ b/ppdet/modeling/architectures/yolov3.py
--- a/ppdet/modeling/backbones/__init__.py
+++ b/ppdet/modeling/backbones/__init__.py
--- a/ppdet/modeling/backbones/blazenet.py
+++ b/ppdet/modeling/backbones/blazenet.py
--- a/ppdet/modeling/backbones/darknet.py
+++ b/ppdet/modeling/backbones/darknet.py
--- a/ppdet/modeling/backbones/faceboxnet.py
+++ b/ppdet/modeling/backbones/faceboxnet.py
--- a/ppdet/modeling/backbones/fpn.py
+++ b/ppdet/modeling/backbones/fpn.py
--- a/ppdet/modeling/backbones/mobilenet.py
+++ b/ppdet/modeling/backbones/mobilenet.py
--- a/ppdet/modeling/backbones/name_adapter.py
+++ b/ppdet/modeling/backbones/name_adapter.py
--- a/ppdet/modeling/backbones/resnet.py
+++ b/ppdet/modeling/backbones/resnet.py
--- a/ppdet/modeling/backbones/resnext.py
+++ b/ppdet/modeling/backbones/resnext.py
--- a/ppdet/modeling/backbones/senet.py
+++ b/ppdet/modeling/backbones/senet.py
--- a/ppdet/modeling/backbones/vgg.py
+++ b/ppdet/modeling/backbones/vgg.py
--- a/ppdet/modeling/model_input.py
+++ b/ppdet/modeling/model_input.py
--- a/ppdet/modeling/ops.py
+++ b/ppdet/modeling/ops.py
--- a/ppdet/modeling/roi_extractors/__init__.py
+++ b/ppdet/modeling/roi_extractors/__init__.py
--- a/ppdet/modeling/roi_extractors/roi_extractor.py
+++ b/ppdet/modeling/roi_extractors/roi_extractor.py
--- a/ppdet/modeling/roi_heads/__init__.py
+++ b/ppdet/modeling/roi_heads/__init__.py
--- a/ppdet/modeling/roi_heads/bbox_head.py
+++ b/ppdet/modeling/roi_heads/bbox_head.py
--- a/ppdet/modeling/roi_heads/cascade_head.py
+++ b/ppdet/modeling/roi_heads/cascade_head.py
--- a/ppdet/modeling/roi_heads/mask_head.py
+++ b/ppdet/modeling/roi_heads/mask_head.py
--- a/ppdet/modeling/target_assigners.py
+++ b/ppdet/modeling/target_assigners.py
--- a/ppdet/modeling/tests/__init__.py
+++ b/ppdet/modeling/tests/__init__.py
--- a/ppdet/modeling/tests/decorator_helper.py
+++ b/ppdet/modeling/tests/decorator_helper.py
--- a/ppdet/modeling/tests/test_architectures.py
+++ b/ppdet/modeling/tests/test_architectures.py
--- a/ppdet/optimizer.py
+++ b/ppdet/optimizer.py
--- a/ppdet/utils/__init__.py
+++ b/ppdet/utils/__init__.py
--- a/ppdet/utils/check.py
+++ b/ppdet/utils/check.py
--- a/ppdet/utils/checkpoint.py
+++ b/ppdet/utils/checkpoint.py
--- a/ppdet/utils/cli.py
+++ b/ppdet/utils/cli.py
--- a/ppdet/utils/coco_eval.py
+++ b/ppdet/utils/coco_eval.py
--- a/ppdet/utils/colormap.py
+++ b/ppdet/utils/colormap.py
--- a/ppdet/utils/dist_utils.py
+++ b/ppdet/utils/dist_utils.py
--- a/ppdet/utils/download.py
+++ b/ppdet/utils/download.py
--- a/ppdet/utils/eval_utils.py
+++ b/ppdet/utils/eval_utils.py
--- a/ppdet/utils/map_utils.py
+++ b/ppdet/utils/map_utils.py
--- a/ppdet/utils/post_process.py
+++ b/ppdet/utils/post_process.py
--- a/ppdet/utils/stats.py
+++ b/ppdet/utils/stats.py
--- a/ppdet/utils/visualizer.py
+++ b/ppdet/utils/visualizer.py
--- a/ppdet/utils/voc_eval.py
+++ b/ppdet/utils/voc_eval.py
--- a/ppdet/utils/voc_utils.py
+++ b/ppdet/utils/voc_utils.py
--- a/ppdet/utils/widerface_eval_utils.py
+++ b/ppdet/utils/widerface_eval_utils.py
--- a/requirements.txt
+++ b/requirements.txt
--- a/slim/distillation/README.md
+++ b/slim/distillation/README.md
--- a/slim/distillation/compress.py
+++ b/slim/distillation/compress.py
--- a/slim/distillation/run.sh
+++ b/slim/distillation/run.sh
--- a/slim/distillation/yolov3_mobilenet_v1_yolov3_resnet34_distillation.yml
+++ b/slim/distillation/yolov3_mobilenet_v1_yolov3_resnet34_distillation.yml
--- a/slim/distillation/yolov3_resnet34.yml
+++ b/slim/distillation/yolov3_resnet34.yml
--- a/slim/eval.py
+++ b/slim/eval.py
--- a/slim/infer.py
+++ b/slim/infer.py
--- a/slim/prune/README.md
+++ b/slim/prune/README.md
--- a/slim/prune/compress.py
+++ b/slim/prune/compress.py
--- a/slim/prune/images/MobileNetV1-YoloV3.pdf
+++ b/slim/prune/images/MobileNetV1-YoloV3.pdf
--- a/slim/prune/yolov3_mobilenet_v1_slim.yaml
+++ b/slim/prune/yolov3_mobilenet_v1_slim.yaml
--- a/slim/quantization/README.md
+++ b/slim/quantization/README.md
--- a/slim/quantization/compress.py
+++ b/slim/quantization/compress.py
--- a/slim/quantization/freeze.py
+++ b/slim/quantization/freeze.py
--- a/slim/quantization/images/ConvertToInt8Pass.png
+++ b/slim/quantization/images/ConvertToInt8Pass.png
--- a/slim/quantization/images/FreezePass.png
+++ b/slim/quantization/images/FreezePass.png
--- a/slim/quantization/images/TransformForMobilePass.png
+++ b/slim/quantization/images/TransformForMobilePass.png
--- a/slim/quantization/images/TransformPass.png
+++ b/slim/quantization/images/TransformPass.png
--- a/slim/quantization/yolov3_mobilenet_v1_slim.yaml
+++ b/slim/quantization/yolov3_mobilenet_v1_slim.yaml
--- a/tools/__init__.py
+++ b/tools/__init__.py
--- a/tools/configure.py
+++ b/tools/configure.py
--- a/tools/eval.py
+++ b/tools/eval.py
--- a/tools/export_model.py
+++ b/tools/export_model.py
--- a/tools/face_eval.py
+++ b/tools/face_eval.py
--- a/tools/infer.py
+++ b/tools/infer.py
--- a/tools/train.py
+++ b/tools/train.py