Unverified commit 3b3a5099 authored by: W wangguanzhong, committed by: GitHub

Update api (#3)

* update code & doc
Parent 20a611cf
......@@ -7,6 +7,7 @@ detection models in both industry and research settings. We design
PaddleDetection to be not only performant, production-ready but also highly
flexible, catering to research needs.
**Now all models in PaddleDetection require PaddlePaddle version 1.6 or higher, or an appropriate develop version.**
<div align="center">
<img src="demo/output/000000570688.jpg" />
......@@ -97,6 +98,7 @@ Advanced Features:
#### 10/2019
- Add enhanced YOLOv3 models, box mAP up to 41.4%.
- Face detection models included: BlazeFace, Faceboxes.
- Enrich COCO models, box mAP up to 51.9%.
- Add CACascade RCNN, one of the best single models of the Objects365 2019 Challenge Full Track champion.
......
......@@ -4,6 +4,8 @@
PaddleDetection aims to provide industry and academia with a rich, easy-to-use set of object detection models that are not only performant and easy to deploy, but also flexible enough to meet the needs of algorithm research.
**All models in this detection library now require PaddlePaddle 1.6 or higher, or an appropriate develop version.**
<div align="center">
<img src="demo/output/000000570688.jpg" />
</div>
......@@ -88,6 +90,7 @@ PaddleDetection aims to provide industry and academia with rich, easy-to-use object
### 10/2019
- Add enhanced YOLOv3 models with box mAP up to 41.4%.
- Add face detection models BlazeFace and Faceboxes.
- Enrich COCO-based models with box mAP up to 51.9%.
- Add CACascade RCNN, one of the best single models of the Objects365 2019 Challenge Full Track champion.
......
......@@ -242,7 +242,7 @@ MaskRCNNEvalFeed:
- !PadMSTest
pad_to_stride: 32
# num_scale = (len(target_size) + 1) * (1 + use_flip)
num_scale: 18
num_workers: 2
MaskRCNNTestFeed:
......
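The comment above gives a small formula for `num_scale`. A minimal sketch, with an assumed eight-entry `target_size` list and flipping enabled (neither value is shown in this hunk):

```python
# Assumed values: this hunk does not show target_size or use_flip.
target_size = [400, 500, 600, 700, 800, 900, 1000, 1100]  # 8 test scales
use_flip = True

# num_scale = (len(target_size) + 1) * (1 + use_flip); the +1 accounts
# for the original image size.
num_scale = (len(target_size) + 1) * (1 + int(use_flip))
print(num_scale)  # -> 18, matching the config value above
```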
architecture: YOLOv3
train_feed: YoloTrainFeed
eval_feed: YoloEvalFeed
test_feed: YoloTestFeed
use_gpu: true
max_iters: 500000
log_smooth_window: 20
save_dir: output
snapshot_iter: 20000
metric: COCO
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_vd_pretrained.tar
weights: output/yolov3_r50vd_dcn/model_final
num_classes: 80
YOLOv3:
backbone: ResNet
yolo_head: YOLOv3Head
ResNet:
norm_type: sync_bn
freeze_at: 0
freeze_norm: false
norm_decay: 0.
depth: 50
feature_maps: [3, 4, 5]
variant: d
dcn_v2_stages: [5]
YOLOv3Head:
anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
anchors: [[10, 13], [16, 30], [33, 23],
[30, 61], [62, 45], [59, 119],
[116, 90], [156, 198], [373, 326]]
norm_decay: 0.
ignore_thresh: 0.7
label_smooth: true
nms:
background_label: -1
keep_top_k: 100
nms_threshold: 0.45
nms_top_k: 1000
normalized: false
score_threshold: 0.01
LearningRate:
base_lr: 0.001
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones:
- 400000
- 450000
- !LinearWarmup
start_factor: 0.
steps: 4000
OptimizerBuilder:
optimizer:
momentum: 0.9
type: Momentum
regularizer:
factor: 0.0005
type: L2
YoloTrainFeed:
batch_size: 8
dataset:
dataset_dir: dataset/coco
    annotation: annotations/instances_train2017.json
    image_dir: train2017
sample_transforms:
- !DecodeImage
to_rgb: True
with_mixup: True
- !MixupImage
alpha: 1.5
beta: 1.5
- !NormalizeBox {}
- !RandomDistort {}
- !ExpandImage
max_ratio: 4
prob: 0.5
mean:
- 123.675
- 116.28
- 103.53
- !CropImage
batch_sampler: [[1, 1, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0],
[1, 50, 0.3, 1.0, 0.5, 2.0, 0.1, 1.0],
[1, 50, 0.3, 1.0, 0.5, 2.0, 0.3, 1.0],
[1, 50, 0.3, 1.0, 0.5, 2.0, 0.5, 1.0],
[1, 50, 0.3, 1.0, 0.5, 2.0, 0.7, 1.0],
[1, 50, 0.3, 1.0, 0.5, 2.0, 0.9, 1.0],
[1, 50, 0.3, 1.0, 0.5, 2.0, 0.0, 1.0]]
- !RandomInterpImage
target_size: 608
- !RandomFlipImage
is_normalized: True
- !NormalizeImage
mean:
- 0.485
- 0.456
- 0.406
std:
- 0.229
- 0.224
- 0.225
is_scale: True
is_channel_first: False
- !Permute
to_bgr: False
num_workers: 8
bufsize: 128
use_process: true
YoloEvalFeed:
batch_size: 8
image_shape: [3, 608, 608]
dataset:
dataset_dir: dataset/coco
annotation: annotations/instances_val2017.json
image_dir: val2017
YoloTestFeed:
batch_size: 1
image_shape: [3, 608, 608]
dataset:
annotation: dataset/coco/annotations/instances_val2017.json
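A side note on the `YOLOv3Head` section above: `anchor_masks` partitions the nine `anchors` across the three YOLO feature maps. A minimal sketch of that mapping (plain Python, not part of the config):

```python
# Values copied from the YOLOv3Head section of the config above.
anchors = [[10, 13], [16, 30], [33, 23],
           [30, 61], [62, 45], [59, 119],
           [116, 90], [156, 198], [373, 326]]
anchor_masks = [[6, 7, 8], [3, 4, 5], [0, 1, 2]]

for level, mask in enumerate(anchor_masks):
    # The coarsest feature map (level 0) is assigned the largest anchors.
    print("feature map %d -> anchors %s" % (level, [anchors[i] for i in mask]))
```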
architecture: YOLOv3
train_feed: YoloTrainFeed
eval_feed: YoloEvalFeed
test_feed: YoloTestFeed
use_gpu: true
max_iters: 55000
log_smooth_window: 20
save_dir: output
snapshot_iter: 10000
metric: COCO
pretrain_weights: https://paddlemodels.bj.bcebos.com/object_detection/ResNet50_vd_obj365_pretrained.tar
weights: output/yolov3_r50vd_dcn_obj365_pretrained_coco/model_final
num_classes: 80
YOLOv3:
backbone: ResNet
yolo_head: YOLOv3Head
ResNet:
norm_type: sync_bn
freeze_at: 0
freeze_norm: false
norm_decay: 0.
depth: 50
feature_maps: [3, 4, 5]
variant: d
dcn_v2_stages: [5]
YOLOv3Head:
anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
anchors: [[10, 13], [16, 30], [33, 23],
[30, 61], [62, 45], [59, 119],
[116, 90], [156, 198], [373, 326]]
norm_decay: 0.
ignore_thresh: 0.7
label_smooth: true
nms:
background_label: -1
keep_top_k: 100
nms_threshold: 0.45
nms_top_k: 1000
normalized: false
score_threshold: 0.01
LearningRate:
base_lr: 0.001
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones:
- 40000
- 50000
- !LinearWarmup
start_factor: 0.
steps: 4000
OptimizerBuilder:
optimizer:
momentum: 0.9
type: Momentum
regularizer:
factor: 0.0005
type: L2
YoloTrainFeed:
batch_size: 8
dataset:
dataset_dir: dataset/coco
annotation: annotations/instances_train2017.json
image_dir: train2017
sample_transforms:
- !DecodeImage
to_rgb: True
with_mixup: False
- !NormalizeBox {}
- !CropImage
batch_sampler: [[1, 1, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0],
[1, 50, 0.3, 1.0, 0.5, 2.0, 0.1, 1.0],
[1, 50, 0.3, 1.0, 0.5, 2.0, 0.3, 1.0],
[1, 50, 0.3, 1.0, 0.5, 2.0, 0.5, 1.0],
[1, 50, 0.3, 1.0, 0.5, 2.0, 0.7, 1.0],
[1, 50, 0.3, 1.0, 0.5, 2.0, 0.9, 1.0],
[1, 50, 0.3, 1.0, 0.5, 2.0, 0.0, 1.0]]
- !RandomInterpImage
target_size: 608
- !RandomFlipImage
is_normalized: True
- !NormalizeImage
mean:
- 0.485
- 0.456
- 0.406
std:
- 0.229
- 0.224
- 0.225
    is_scale: True
is_channel_first: False
- !Permute
to_bgr: False
num_workers: 8
bufsize: 128
use_process: true
YoloEvalFeed:
batch_size: 8
image_shape: [3, 608, 608]
dataset:
dataset_dir: dataset/coco
annotation: annotations/instances_val2017.json
image_dir: val2017
sample_transforms:
- !DecodeImage
to_rgb: True
with_mixup: False
- !ResizeImage
interp: 2
target_size: 608
- !NormalizeImage
mean:
- 0.485
- 0.456
- 0.406
std:
- 0.229
- 0.224
- 0.225
    is_scale: True
is_channel_first: False
- !Permute
to_bgr: False
YoloTestFeed:
batch_size: 1
image_shape: [3, 608, 608]
dataset:
annotation: dataset/coco/annotations/instances_val2017.json
sample_transforms:
- !DecodeImage
to_rgb: True
with_mixup: False
- !ResizeImage
interp: 2
target_size: 608
- !NormalizeImage
mean:
- 0.485
- 0.456
- 0.406
std:
- 0.229
- 0.224
- 0.225
    is_scale: True
is_channel_first: False
- !Permute
to_bgr: False
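Both configs above pair `PiecewiseDecay` with `LinearWarmup`. A minimal sketch of the resulting schedule (the formula is assumed from the scheduler names; the numbers are taken from the second config):

```python
def learning_rate(it, base_lr=0.001, warmup_steps=4000, start_factor=0.0,
                  milestones=(40000, 50000), gamma=0.1):
    # Linear warmup from start_factor * base_lr up to base_lr ...
    if it < warmup_steps:
        alpha = it / float(warmup_steps)
        return base_lr * (start_factor + (1.0 - start_factor) * alpha)
    # ... then piecewise decay by gamma at each milestone.
    lr = base_lr
    for m in milestones:
        if it >= m:
            lr *= gamma
    return lr

for it in (0, 2000, 4000, 40000, 50000):
    print(it, learning_rate(it))
```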
......@@ -140,9 +140,9 @@ For details, refer to [Evaluation](#Evaluate-on-the-FDDB).
## Get Started
For `Training` and `Inference`, please refer to [GETTING_STARTED.md](../../docs/GETTING_STARTED.md)
- **NOTES:**
 - `BlazeFace` and `FaceBoxes` are trained on 4 GPUs with `batch_size=8` per GPU (total batch size 32)
 for 320000 iterations. (If your GPU count is not 4, please refer to the rule for training parameters
 in the table of [calculation rules](../../docs/GETTING_STARTED.md#faq); a short sketch of this rule follows below.)
- Currently we do not support evaluation in training.
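A hedged sketch of the linear scaling rule referenced above (the authoritative table is in GETTING_STARTED.md; the base learning rate below is an illustrative assumption, not taken from the configs):

```python
# Baseline from the note above: 4 GPUs, batch_size=8 per GPU, 320000 iters.
base_gpus, base_iters = 4, 320000
base_lr = 0.001  # assumed for illustration; check the actual config

gpus = 8  # your GPU count
scale = gpus / float(base_gpus)

lr = base_lr * scale                 # scale LR with the total batch size
max_iters = int(base_iters / scale)  # and shrink the schedule accordingly
print(lr, max_iters)
```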
......@@ -156,7 +156,7 @@ python tools/face_eval.py -c configs/face_detection/blazeface.yml
- `-d` or `--dataset_dir`: Dataset path, same as dataset_dir of configs. Such as: `-d dataset/wider_face`.
- `-f` or `--output_eval`: Evaluation file directory, default is `output/pred`.
- `-e` or `--eval_mode`: Evaluation mode, include `widerface` and `fddb`, default is `widerface`.
- `--multi_scale`: If you add this flag to the command, multi-scale evaluation will be selected.
Default is `False`, which selects single-scale evaluation.
After the evaluation is completed, the test result in txt format will be generated in `output/pred`,
......@@ -183,7 +183,7 @@ matlab -nodesktop -nosplash -nojvm -r "run wider_eval.m;quit;"
```
#### Evaluate on the FDDB
For details of the [FDDB dataset](http://vis-www.cs.umass.edu/fddb/), please refer to FDDB's official website.
- Download the official dataset and evaluation script to evaluate the ROC metrics:
```
#external link to the Faces in the Wild data set
......@@ -238,7 +238,7 @@ regression parameters of a bounding box as a weighted mean between the overlappi
fewer network layers and conv channels than `Lite`.
### FaceBoxes
**Introduction:**
[FaceBoxes](https://arxiv.org/abs/1708.05234), named "A CPU Real-time Face Detector
with High Accuracy", is a face detector proposed by Shifeng Zhang, with high performance in
both speed and accuracy. The paper was published at IJCB (2017).
......
......@@ -63,8 +63,8 @@ OptimizerBuilder:
YoloTrainFeed:
batch_size: 1
dataset:
dataset_dir: dataset/fruit
annotation: fruit-detection/train.txt
use_default_label: false
num_workers: 16
bufsize: 128
......@@ -109,8 +109,8 @@ YoloEvalFeed:
batch_size: 1
image_shape: [3, 608, 608]
dataset:
dataset_dir: dataset/fruit
annotation: fruit-detection/val.txt
use_default_label: false
......@@ -118,5 +118,5 @@ YoloTestFeed:
batch_size: 1
image_shape: [3, 608, 608]
dataset:
dataset_dir: dataset/fruit
use_default_label: false
......@@ -44,7 +44,7 @@ Users can employ the model to conduct the inference:
export CUDA_VISIBLE_DEVICES=0
export PYTHONPATH=$PYTHONPATH:.
python -u tools/infer.py -c contrib/VehicleDetection/vehicle_yolov3_darknet.yml \
-o weights=https://paddlemodels.bj.bcebos.com/object_detection/vehicle_yolov3_darknet.tar \
--infer_dir contrib/VehicleDetection/demo \
--draw_threshold 0.2 \
--output_dir contrib/VehicleDetection/demo/output
......@@ -90,9 +90,9 @@ Users can employ the model to conduct the inference:
```
export CUDA_VISIBLE_DEVICES=0
export PYTHONPATH=$PYTHONPATH:.
python -u tools/infer.py -c contrib/PedestrianDetection/pedestrian_yolov3_darknet.yml \
-o weights=https://paddlemodels.bj.bcebos.com/object_detection/pedestrian_yolov3_darknet.tar \
--infer_dir contrib/PedestrianDetection/demo \
--draw_threshold 0.3 \
--output_dir contrib/PedestrianDetection/demo/output
```
......
......@@ -45,7 +45,7 @@ AP is 0.764 at IOU=.5.
export CUDA_VISIBLE_DEVICES=0
export PYTHONPATH=$PYTHONPATH:.
python -u tools/infer.py -c contrib/VehicleDetection/vehicle_yolov3_darknet.yml \
-o weights=https://paddlemodels.bj.bcebos.com/object_detection/vehicle_yolov3_darknet.tar \
--infer_dir contrib/VehicleDetection/demo \
--draw_threshold 0.2 \
--output_dir contrib/VehicleDetection/demo/output
......@@ -92,9 +92,9 @@ AP is 0.518 at IOU=.5-.95.
```
export CUDA_VISIBLE_DEVICES=0
export PYTHONPATH=$PYTHONPATH:.
python -u tools/infer.py -c contrib/PedestrianDetection/pedestrian_yolov3_darknet.yml \
-o weights=https://paddlemodels.bj.bcebos.com/object_detection/pedestrian_yolov3_darknet.tar \
--infer_dir contrib/PedestrianDetection/demo \
--draw_threshold 0.3 \
--output_dir contrib/PedestrianDetection/demo/output
```
......
......@@ -12,7 +12,6 @@
# See the License for the specific language governing permissions and
# limitations under the License.
import sys
import os.path as osp
import logging
......
......@@ -12,7 +12,6 @@
# See the License for the specific language governing permissions and
# limitations under the License.
import sys
import os.path as osp
import logging
......
......@@ -12,7 +12,6 @@
# See the License for the specific language governing permissions and
# limitations under the License.
import sys
import os.path as osp
import logging
......
......@@ -12,7 +12,6 @@
# See the License for the specific language governing permissions and
# limitations under the License.
import sys
import os.path as osp
import logging
......
......@@ -63,7 +63,7 @@ Loads `Pascal VOC` like datasets with directory structure like this:
│ ├── Annotations
│ ├── 001789.xml
│ | ...
│ ├── JPEGImages
│ ├── 001789.jpg
│ | ...
│ ├── ImageSets
......@@ -72,7 +72,7 @@ Loads `Pascal VOC` like datasets with directory structure like this:
│ ├── Annotations
│ ├── 003876.xml
│ | ...
│ ├── JPEGImages
│ ├── 003876.jpg
│ | ...
│ ├── ImageSets
......@@ -82,7 +82,7 @@ Loads `Pascal VOC` like datasets with directory structure like this:
**NOTE:** If you set `use_default_label=False` in the yaml configs, the `label_list.txt`
of the Pascal VOC dataset will be read; otherwise `label_list.txt` is unnecessary and
the default Pascal VOC label list defined in
[voc\_loader.py](../ppdet/data/source/voc_loader.py) will be used. A hedged sketch of this
behaviour follows below.
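A minimal sketch of this behaviour (the helper below is hypothetical; the real logic lives in the data source code):

```python
import os

def get_labels(dataset_dir, use_default_label=True):
    # Hypothetical helper for illustration only.
    if not use_default_label:
        # label_list.txt is read from the dataset directory.
        with open(os.path.join(dataset_dir, 'label_list.txt')) as f:
            return [line.strip() for line in f]
    # Otherwise fall back to the built-in Pascal VOC classes; the full
    # 20-class mapping appears in pascalvoc_label() later in this commit.
    return ['aeroplane', 'bicycle', 'bird', 'boat', 'bottle']  # first 5 of 20
```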
- Roidb data source
......
......@@ -52,7 +52,7 @@
│ ├── Annotations
│ ├── 001789.xml
│ | ...
│ ├── JPEGImages
│ ├── 001789.jpg
│ | ...
│ ├── ImageSets
......@@ -61,7 +61,7 @@
│ ├── Annotations
│ ├── 003876.xml
│ | ...
│ ├── JPEGImages
│ ├── 003876.jpg
│ | ...
│ ├── ImageSets
......
......@@ -22,7 +22,7 @@ For general information about PaddleDetection, please see [README.md](../README.
## PaddlePaddle
Running PaddleDetection requires PaddlePaddle Fluid v1.6 or later. Please follow the instructions in the [installation document](http://www.paddlepaddle.org.cn/).
Please make sure your PaddlePaddle installation was successful and the version
of your PaddlePaddle is not lower than required. Verify with the following commands.
......@@ -170,7 +170,7 @@ python dataset/voc/create_list.py
│ ├── Annotations
│ ├── 001789.xml
│ | ...
│ ├── JPEGImages
│ ├── 001789.jpg
│ | ...
│ ├── ImageSets
......@@ -179,7 +179,7 @@ python dataset/voc/create_list.py
│ ├── Annotations
│ ├── 003876.xml
│ | ...
│ ├── JPEGImages
│ ├── 003876.jpg
│ | ...
│ ├── ImageSets
......@@ -189,7 +189,7 @@ python dataset/voc/create_list.py
**NOTE:** If you set `use_default_label=False` in the yaml configs, the `label_list.txt`
of the Pascal VOC dataset will be read; otherwise `label_list.txt` is unnecessary and
the default Pascal VOC label list defined in
[voc\_loader.py](../ppdet/data/source/voc_loader.py) will be used.
**Download datasets automatically:**
......
......@@ -20,7 +20,7 @@ For information about PaddleDetection, please refer to [README.md](../README.md).
## PaddlePaddle
Running PaddleDetection requires PaddlePaddle Fluid v1.6 or later. Please follow the instructions in the [installation document](http://www.paddlepaddle.org.cn/).
Please make sure your PaddlePaddle installation was successful and that the installed version is not lower than required. Verify with the following commands.
......@@ -167,7 +167,7 @@ python dataset/voc/create_list.py
│ ├── Annotations
│ ├── 001789.xml
│ | ...
│ ├── JPEGImages
│ ├── 001789.jpg
│ | ...
│ ├── ImageSets
......@@ -176,7 +176,7 @@ python dataset/voc/create_list.py
│ ├── Annotations
│ ├── 003876.xml
│ | ...
│ ├── JPEGImages
│ ├── 003876.jpg
│ | ...
│ ├── ImageSets
......
......@@ -97,17 +97,19 @@ The backbone models pretrained on ImageNet are available. All backbone models ar
### Yolo v3
| Backbone | Pretrain dataset | Size | Deformable Conv | Image/gpu | Lr schd | Inf time (fps) | Box AP | Download |
| :----------- | :--------: | :-----: | :-----: |:------------: |:----: | :-------: | :----: | :-------: |
| DarkNet53 | ImageNet | 608 | False | 8 | 270e | 45.571 | 38.9 | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_darknet.tar) |
| DarkNet53 | ImageNet | 416 | False | 8 | 270e | - | 37.5 | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_darknet.tar) |
| DarkNet53 | ImageNet | 320 | False | 8 | 270e | - | 34.8 | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_darknet.tar) |
| MobileNet-V1 | ImageNet | 608 | False | 8 | 270e | 78.302 | 29.3 | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1.tar) |
| MobileNet-V1 | ImageNet | 416 | False | 8 | 270e | - | 29.3 | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1.tar) |
| MobileNet-V1 | ImageNet | 320 | False | 8 | 270e | - | 27.1 | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1.tar) |
| ResNet34 | ImageNet | 608 | False | 8 | 270e | 63.356 | 36.2 | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_r34.tar) |
| ResNet34 | ImageNet | 416 | False | 8 | 270e | - | 34.3 | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_r34.tar) |
| ResNet34 | ImageNet | 320 | False | 8 | 270e | - | 31.4 | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_r34.tar) |
| ResNet50_vd | ImageNet | 608 | True | 8 | 270e | - | 39.1 | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_r50vd_dcn.tar) |
| ResNet50_vd | Object365 | 608 | True | 8 | 270e | - | 41.4 | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_r50vd_dcn_obj365_pretrained_coco.tar) |
### Yolo v3 on Pascal VOC
......@@ -127,7 +129,7 @@ The backbone models pretrained on ImageNet are available. All backbone models ar
**Notes:** YOLOv3 is trained on 8 GPUs with a total batch size of 64 for 270 epochs. YOLOv3 training data augmentations: mixup,
random color distortion, random cropping, random expansion, random interpolation, and random flipping. YOLOv3 uses randomly
reshaped minibatches in training, so inference can be performed on different image sizes with the same model weights; we provide
evaluation results for image sizes 608/416/320 above. Deformable conv is added on stage 5 of the backbone. A short sketch of the random-reshape scheme follows below.
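A minimal sketch of the random-reshape scheme described in the notes, assuming the common YOLOv3 size ladder of 320 to 608 in steps of 32 (the exact list is not shown in this document):

```python
import random

# Assumed size ladder; each training minibatch is resized to one entry.
sizes = list(range(320, 608 + 1, 32))  # 320, 352, ..., 608

def minibatch_size():
    return random.choice(sizes)

# The same trained weights can then be evaluated at 608, 416 or 320,
# which is how the three rows per backbone in the table above arise.
print(minibatch_size())
```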
### RetinaNet
......
......@@ -94,17 +94,19 @@ Paddle provides backbone models pretrained on ImageNet. All pretrained models
### Yolo v3
| Backbone | Pretrain dataset | Size | Deformable Conv | Image/gpu | Lr schd | Inf time (fps) | Box AP | Download |
| :----------- | :--------: | :-----: | :-----: |:------------: |:----: | :-------: | :----: | :-------: |
| DarkNet53 | ImageNet | 608 | False | 8 | 270e | 45.571 | 38.9 | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_darknet.tar) |
| DarkNet53 | ImageNet | 416 | False | 8 | 270e | - | 37.5 | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_darknet.tar) |
| DarkNet53 | ImageNet | 320 | False | 8 | 270e | - | 34.8 | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_darknet.tar) |
| MobileNet-V1 | ImageNet | 608 | False | 8 | 270e | 78.302 | 29.3 | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1.tar) |
| MobileNet-V1 | ImageNet | 416 | False | 8 | 270e | - | 29.3 | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1.tar) |
| MobileNet-V1 | ImageNet | 320 | False | 8 | 270e | - | 27.1 | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1.tar) |
| ResNet34 | ImageNet | 608 | False | 8 | 270e | 63.356 | 36.2 | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_r34.tar) |
| ResNet34 | ImageNet | 416 | False | 8 | 270e | - | 34.3 | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_r34.tar) |
| ResNet34 | ImageNet | 320 | False | 8 | 270e | - | 31.4 | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_r34.tar) |
| ResNet50_vd | ImageNet | 608 | True | 8 | 270e | - | 39.1 | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_r50vd_dcn.tar) |
| ResNet50_vd | Object365 | 608 | True | 8 | 270e | - | 41.4 | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_r50vd_dcn_obj365_pretrained_coco.tar) |
### Yolo v3 on Pascal VOC
......@@ -120,7 +122,7 @@ Paddle provides backbone models pretrained on ImageNet. All pretrained models
| ResNet34 | 416 | 8 | 270e | - | 81.9 | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_r34_voc.tar) |
| ResNet34 | 320 | 8 | 270e | - | 80.1 | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_r34_voc.tar) |
**Notes:** YOLOv3 is trained on 8 GPUs with a total batch size of 64 for 270 epochs. Data augmentation includes mixup, random color distortion, random cropping, random expansion, random interpolation, and random flipping. YOLOv3 uses randomly reshaped minibatches during training, so the same model can be tested on images of different sizes; we provide results for sizes 608/416/320. Deformable conv is applied on stage 5 of the backbone.
### RetinaNet
......
# Enhanced YOLOv3 Model
---
## Introduction
[YOLOv3](https://arxiv.org/abs/1804.02767) is a single-stage detector proposed by [Joseph Redmon](https://arxiv.org/search/cs?searchtype=author&query=Redmon%2C+J) and [Ali Farhadi](https://arxiv.org/search/cs?searchtype=author&query=Farhadi%2C+A); compared with traditional object detection methods of the same accuracy, its inference speed is nearly twice as fast.
The PaddleDetection implementation applies optimizations such as the image augmentation and label smoothing proposed in [Bag of Freebies for Training Object Detection Neural Networks](https://arxiv.org/abs/1902.04103v3), so its accuracy surpasses the darknet implementation: on the COCO-2017 dataset, YOLOv3 (DarkNet) reaches `mAP(0.50:0.95)= 38.9`, 5.9 points higher than the darknet implementation (33.0). For inference speed, the acceleration methods of the Paddle inference library make it 30% faster than darknet.
On this basis, PaddleDetection further improves YOLOv3 to obtain even larger accuracy and speed advantages.
## Method
The YOLOv3 backbone is replaced with ResNet50-vd, and [Deformable Convolution v2](https://arxiv.org/abs/1811.11168) replaces the ordinary convolutions in the last residual block. In addition, a model trained on the [Object365 dataset](https://www.objects365.org/download.html) is used as the pretrained model on COCO, further improving YOLOv3's accuracy.
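The method maps onto a handful of keys in the `yolov3_r50vd_dcn*` configs shown earlier in this commit; a small correspondence sketch (plain Python, values copied from those configs):

```python
# variant 'd' selects ResNet50-vd; dcn_v2_stages=[5] swaps in Deformable
# Conv v2 for the last residual stage; pretrain_weights picks the
# Object365-trained model for COCO fine-tuning.
resnet_cfg = {'depth': 50, 'variant': 'd', 'dcn_v2_stages': [5]}
pretrain_weights = ('https://paddlemodels.bj.bcebos.com/object_detection/'
                    'ResNet50_vd_obj365_pretrained.tar')
print(resnet_cfg, pretrain_weights)
```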
## Usage
### Training
```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
python tools/train.py -c configs/dcn/yolov3_r50vd_dcn.yml
```
For more model parameters, run ``python tools/train.py --help`` or refer to the [training, evaluation and parameter description](docs/GETTING_STARTED_cn.md) documentation.
### Results
| Model | Pretrained model | mAP on val | P4 inference speed | Download |
| :---------------------:|:-----------------: | :-------------: | :----------------------:|:-----------------------------------------------------: |
| YOLOv3 DarkNet | [DarkNet pretrain](https://paddle-imagenet-models-name.bj.bcebos.com/DarkNet53_pretrained.tar) | 38.9 | Native: 88.3ms<br>TensorRT-FP32: 42.5ms | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_darknet.tar) |
| YOLOv3 ResNet50_vd dcn | [ImageNet pretrain](https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_vd_pretrained.tar) | 39.1 | Native: 74.4ms<br>TensorRT-FP32: 35.2ms | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_r50vd_dcn_imagenet.tar) |
| YOLOv3 ResNet50_vd dcn | [Object365 pretrain](https://paddlemodels.bj.bcebos.com/object_detection/ResNet50_vd_obj365_pretrained.tar) | 41.4 | Native: 74.4ms<br>TensorRT-FP32: 35.2ms | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_r50vd_dcn_obj365.tar) |
......@@ -118,7 +118,7 @@ DEPLOY:
    # Prediction mode; NATIVE and ANALYSIS are supported
    PREDICTOR_MODE: "ANALYSIS"
    # batch_size for each prediction
    BATCH_SIZE: 3
    # Maximum length of the long side after resizing; -1 means no limit.
    RESIZE_MAX_SIZE: 1333
    # Number of input tensors.
......@@ -155,7 +155,7 @@ DEPLOY:
When running the visualization script, simply pass the image path, the detection result pb file path, the box threshold, and the class-to-label mapping file path on the command line to obtain the visualized image `X.png` (the tools directory provides coco17.json, the class-label mapping file for COCO17).
```bash
python vis.py --img_path=../build/images/detection_rcnn/000000087038.jpg --img_result_path=../build/images/detection_rcnn/000000087038.jpg.pb --threshold=0.1 --c2l_path=coco17.json
```
......@@ -168,4 +168,3 @@ python vis.py --img_path=../build/images/detection_rcnn/000000087038.jpg --img_r
```Detection result:```
![detection result](./demo_images/000000087038.jpg.png)
......@@ -16,7 +16,7 @@
```yaml
# All configuration fields for inference deployment must be under the DEPLOY field
DEPLOY:
    # Type: required int
    # Meaning: whether to use GPU for prediction. 0: no, 1: yes
    USE_GPU: 1
......@@ -71,5 +71,5 @@ DEPLOY:
FEEDS_SIZE: 2
    # Type: optional int
    # Meaning: resize the image sides to an integer multiple of this value. Default is 1.
    COARSEST_STRIDE: 32
```
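A minimal sketch of what `COARSEST_STRIDE` does, based on the comment above (the rounding behaviour is assumed):

```python
import math

def padded_size(h, w, coarsest_stride=32):
    # Round each side up to the nearest multiple of coarsest_stride,
    # so downsampling by the network's total stride stays exact.
    ph = int(math.ceil(h / float(coarsest_stride)) * coarsest_stride)
    pw = int(math.ceil(w / float(coarsest_stride)) * coarsest_stride)
    return ph, pw

print(padded_size(800, 1333))  # -> (800, 1344)
```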
......@@ -42,9 +42,9 @@ fluid_inference
1. Download OpenCV 3.4.6 for the Windows platform from the official OpenCV site, [download link](https://sourceforge.net/projects/opencvlibrary/files/3.4.6/opencv-3.4.6-vc14_vc15.exe/download)
2. Run the downloaded executable and extract OpenCV to a directory of your choice, e.g. `D:\projects\opencv`
3. Configure the environment variables as follows
 - My Computer -> Properties -> Advanced system settings -> Environment Variables
 - Find Path in the system variables (create it if it does not exist) and double-click to edit
 - Add a new entry with the OpenCV path and save, e.g. `D:\projects\opencv\build\x64\vc14\bin`
### Step 4: Compile the code, taking VS2015 as an example
......@@ -56,7 +56,7 @@ fluid_inference
```
call "C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\vcvarsall.bat" amd64
```
* Build the project with CMake
* PADDLE_DIR: path to the fluid_inference library
* CUDA_LIB: CUDA dynamic library directory; adjust according to your actual installation
......@@ -94,4 +94,3 @@ detection_demo.exe --conf=/path/to/your/conf --input_dir=/path/to/your/input/dat
```
For more details, refer to the ReadMe document: [inference and visualization](../README.md)
......@@ -46,9 +46,9 @@ fluid_inference
1. Download OpenCV 3.4.6 for the Windows platform from the official OpenCV site, [download link](https://sourceforge.net/projects/opencvlibrary/files/3.4.6/opencv-3.4.6-vc14_vc15.exe/download)
2. Run the downloaded executable and extract OpenCV to a directory of your choice, e.g. `D:\projects\opencv`
3. Configure the environment variables as follows
 - My Computer -> Properties -> Advanced system settings -> Environment Variables
 - Find Path in the system variables (create it if it does not exist) and double-click to edit
 - Add a new entry with the OpenCV path and save, e.g. `D:\projects\opencv\build\x64\vc14\bin`
### Step 4: Build with CMake directly in Visual Studio 2019
......@@ -99,4 +99,3 @@ detection_demo.exe --conf=/path/to/your/conf --input_dir=/path/to/your/input/dat
```
For more details, refer to the ReadMe document: [inference and visualization](../README.md)
......@@ -2,7 +2,7 @@
# source: detection_result.proto
import sys
_b = sys.version_info[0] < 3 and (lambda x: x) or (lambda x: x.encode('latin1'))
from google.protobuf import descriptor as _descriptor
from google.protobuf import message as _message
from google.protobuf import reflection as _reflection
......@@ -12,140 +12,203 @@ from google.protobuf import descriptor_pb2
_sym_db = _symbol_database.Default()
DESCRIPTOR = _descriptor.FileDescriptor(
name='detection_result.proto',
package='PaddleSolution',
syntax='proto2',
serialized_pb=_b(
'\n\x16\x64\x65tection_result.proto\x12\x0ePaddleSolution\"\x84\x01\n\x0c\x44\x65tectionBox\x12\r\n\x05\x63lass\x18\x01 \x01(\x05\x12\r\n\x05score\x18\x02 \x01(\x02\x12\x12\n\nleft_top_x\x18\x03 \x01(\x02\x12\x12\n\nleft_top_y\x18\x04 \x01(\x02\x12\x16\n\x0eright_bottom_x\x18\x05 \x01(\x02\x12\x16\n\x0eright_bottom_y\x18\x06 \x01(\x02\"Z\n\x0f\x44\x65tectionResult\x12\x10\n\x08\x66ilename\x18\x01 \x01(\t\x12\x35\n\x0f\x64\x65tection_boxes\x18\x02 \x03(\x0b\x32\x1c.PaddleSolution.DetectionBox'
))
_sym_db.RegisterFileDescriptor(DESCRIPTOR)
_DETECTIONBOX = _descriptor.Descriptor(
name='DetectionBox',
full_name='PaddleSolution.DetectionBox',
filename=None,
file=DESCRIPTOR,
containing_type=None,
fields=[
_descriptor.FieldDescriptor(
name='class',
full_name='PaddleSolution.DetectionBox.class',
index=0,
number=1,
type=5,
cpp_type=1,
label=1,
has_default_value=False,
default_value=0,
message_type=None,
enum_type=None,
containing_type=None,
is_extension=False,
extension_scope=None,
options=None),
_descriptor.FieldDescriptor(
name='score',
full_name='PaddleSolution.DetectionBox.score',
index=1,
number=2,
type=2,
cpp_type=6,
label=1,
has_default_value=False,
default_value=float(0),
message_type=None,
enum_type=None,
containing_type=None,
is_extension=False,
extension_scope=None,
options=None),
_descriptor.FieldDescriptor(
name='left_top_x',
full_name='PaddleSolution.DetectionBox.left_top_x',
index=2,
number=3,
type=2,
cpp_type=6,
label=1,
has_default_value=False,
default_value=float(0),
message_type=None,
enum_type=None,
containing_type=None,
is_extension=False,
extension_scope=None,
options=None),
_descriptor.FieldDescriptor(
name='left_top_y',
full_name='PaddleSolution.DetectionBox.left_top_y',
index=3,
number=4,
type=2,
cpp_type=6,
label=1,
has_default_value=False,
default_value=float(0),
message_type=None,
enum_type=None,
containing_type=None,
is_extension=False,
extension_scope=None,
options=None),
_descriptor.FieldDescriptor(
name='right_bottom_x',
full_name='PaddleSolution.DetectionBox.right_bottom_x',
index=4,
number=5,
type=2,
cpp_type=6,
label=1,
has_default_value=False,
default_value=float(0),
message_type=None,
enum_type=None,
containing_type=None,
is_extension=False,
extension_scope=None,
options=None),
_descriptor.FieldDescriptor(
name='right_bottom_y',
full_name='PaddleSolution.DetectionBox.right_bottom_y',
index=5,
number=6,
type=2,
cpp_type=6,
label=1,
has_default_value=False,
default_value=float(0),
message_type=None,
enum_type=None,
containing_type=None,
is_extension=False,
extension_scope=None,
options=None),
],
extensions=[],
nested_types=[],
enum_types=[],
options=None,
is_extendable=False,
syntax='proto2',
extension_ranges=[],
oneofs=[],
serialized_start=43,
serialized_end=175, )
_DETECTIONRESULT = _descriptor.Descriptor(
name='DetectionResult',
full_name='PaddleSolution.DetectionResult',
filename=None,
file=DESCRIPTOR,
containing_type=None,
fields=[
_descriptor.FieldDescriptor(
name='filename',
full_name='PaddleSolution.DetectionResult.filename',
index=0,
number=1,
type=9,
cpp_type=9,
label=1,
has_default_value=False,
default_value=_b("").decode('utf-8'),
message_type=None,
enum_type=None,
containing_type=None,
is_extension=False,
extension_scope=None,
options=None),
_descriptor.FieldDescriptor(
name='detection_boxes',
full_name='PaddleSolution.DetectionResult.detection_boxes',
index=1,
number=2,
type=11,
cpp_type=10,
label=3,
has_default_value=False,
default_value=[],
message_type=None,
enum_type=None,
containing_type=None,
is_extension=False,
extension_scope=None,
options=None),
],
extensions=[],
nested_types=[],
enum_types=[],
options=None,
is_extendable=False,
syntax='proto2',
extension_ranges=[],
oneofs=[],
serialized_start=177,
serialized_end=267, )
_DETECTIONRESULT.fields_by_name['detection_boxes'].message_type = _DETECTIONBOX
DESCRIPTOR.message_types_by_name['DetectionBox'] = _DETECTIONBOX
DESCRIPTOR.message_types_by_name['DetectionResult'] = _DETECTIONRESULT
DetectionBox = _reflection.GeneratedProtocolMessageType(
'DetectionBox',
(_message.Message, ),
dict(
DESCRIPTOR=_DETECTIONBOX,
__module__='detection_result_pb2'
# @@protoc_insertion_point(class_scope:PaddleSolution.DetectionBox)
))
_sym_db.RegisterMessage(DetectionBox)
DetectionResult = _reflection.GeneratedProtocolMessageType(
'DetectionResult',
(_message.Message, ),
dict(
DESCRIPTOR=_DETECTIONRESULT,
__module__='detection_result_pb2'
# @@protoc_insertion_point(class_scope:PaddleSolution.DetectionResult)
))
_sym_db.RegisterMessage(DetectionResult)
# @@protoc_insertion_point(module_scope)
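A hedged usage sketch of the generated module above (the pb file path is illustrative):

```python
import detection_result_pb2

result = detection_result_pb2.DetectionResult()
with open('000000087038.jpg.pb', 'rb') as f:  # illustrative path
    result.ParseFromString(f.read())
for box in result.detection_boxes:
    # 'class' is a Python keyword, so the field is read via getattr,
    # exactly as vis.py does further down this commit.
    print(getattr(box, 'class'), box.score)
```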
......@@ -24,9 +24,10 @@ from PIL import Image, ImageDraw, ImageFont
Flags = gflags.FLAGS
gflags.DEFINE_string('img_path', 'abc', 'image path')
gflags.DEFINE_string('img_result_path', 'def', 'image result path')
gflags.DEFINE_float('threshold', 0.0, 'threshold of score')
gflags.DEFINE_string('c2l_path', 'ghk', 'class to label path')
def colormap(rgb=False):
"""
Get colormap
......@@ -62,11 +63,14 @@ def colormap(rgb=False):
color_list = color_list[:, ::-1]
return color_list
if __name__ == "__main__":
if len(sys.argv) != 5:
print(
"Usage: python vis.py --img_path=/path/to/image --img_result_path=/path/to/image_result.pb --threshold=0.1 --c2l_path=/path/to/class2label.json"
)
else:
Flags(sys.argv)
color_list = colormap(rgb=True)
text_thickness = 1
text_scale = 0.3
......@@ -81,24 +85,33 @@ if __name__ == "__main__":
for box in detection_result.detection_boxes:
if box.score >= Flags.threshold:
box_class = getattr(box, 'class')
text_class_score_str = "%s %.2f" % (
class2LabelMap.get(str(box_class)), box.score)
text_point = (int(box.left_top_x), int(box.left_top_y))
ptLeftTop = (int(box.left_top_x), int(box.left_top_y))
ptRightBottom = (int(box.right_bottom_x),
int(box.right_bottom_y))
box_thickness = 1
color = tuple([int(c) for c in color_list[box_class]])
cv2.rectangle(img, ptLeftTop, ptRightBottom, color,
box_thickness, 8)
if text_point[1] < 0:
text_point = (int(box.left_top_x),
int(box.right_bottom_y))
WHITE = (255, 255, 255)
font = cv2.FONT_HERSHEY_SIMPLEX
text_size = cv2.getTextSize(text_class_score_str, font,
text_scale, text_thickness)
text_box_left_top = (text_point[0],
text_point[1] - text_size[0][1])
text_box_right_bottom = (
text_point[0] + text_size[0][0], text_point[1])
cv2.rectangle(img, text_box_left_top,
text_box_right_bottom, color, -1, 8)
cv2.putText(img, text_class_score_str, text_point, font,
text_scale, WHITE, text_thickness)
img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)
cv2.imwrite(Flags.img_path + ".png", img)
......@@ -35,10 +35,7 @@ class IteratorSource(Dataset):
samples (int): number of samples to load, -1 means all
"""
def __init__(self, iter_maker, samples=-1, **kwargs):
super(IteratorSource, self).__init__()
self._epoch = -1
......@@ -63,7 +60,8 @@ class IteratorSource(Dataset):
self._sample_num = self._pos
elif self._sample_num != self._pos:
logger.info('num of loaded samples is different '
                        'with previous setting[prev:%d,now:%d]' %
                        (self._sample_num, self._pos))
self._sample_num = self._pos
self._data_iter = None
......@@ -100,4 +98,3 @@ class IteratorSource(Dataset):
def epoch_id(self):
return self._epoch
......@@ -127,12 +127,14 @@ def load(fname,
elif os.path.isfile(fname):
from . import voc_loader
if use_default_label is None or cname2cid is not None:
records, cname2cid = voc_loader.get_roidb(
fname, samples, cname2cid, with_background=with_background)
else:
records, cname2cid = voc_loader.load(
fname,
samples,
use_default_label,
with_background=with_background)
else:
raise ValueError('invalid file type when load data from file[%s]' %
(fname))
......
......@@ -35,11 +35,7 @@ class SimpleSource(Dataset):
load_img (bool): should images be loaded
"""
def __init__(self, images=[], samples=-1, load_img=True, **kwargs):
super(SimpleSource, self).__init__()
self._epoch = -1
for image in images:
......
......@@ -18,10 +18,7 @@ import numpy as np
import xml.etree.ElementTree as ET
def get_roidb(anno_path, sample_num=-1, cname2cid=None, with_background=True):
"""
Load VOC records with annotations in xml directory 'anno_path'
......@@ -129,9 +126,7 @@ def get_roidb(anno_path,
return [records, cname2cid]
def load(anno_path, sample_num=-1, use_default_label=True,
with_background=True):
"""
Load VOC records with annotations in
......@@ -246,26 +241,26 @@ def load(anno_path,
def pascalvoc_label(with_background=True):
labels_map = {
'aeroplane': 1,
'bicycle': 2,
'bird': 3,
'boat': 4,
'bottle': 5,
'bus': 6,
'car': 7,
'cat': 8,
'chair': 9,
'cow': 10,
'diningtable': 11,
'dog': 12,
'horse': 13,
'motorbike': 14,
'person': 15,
'pottedplant': 16,
'sheep': 17,
'sofa': 18,
'train': 19,
'tvmonitor': 20
}
if not with_background:
labels_map = {k: v - 1 for k, v in labels_map.items()}
......
......@@ -18,10 +18,7 @@ import logging
logger = logging.getLogger(__name__)
def load(anno_path, sample_num=-1, cname2cid=None, with_background=True):
"""
Load WiderFace records with 'anno_path'
......@@ -120,9 +117,7 @@ def _load_file_list(input_txt):
def widerface_label(with_background=True):
labels_map = {'face': 1}
if not with_background:
labels_map = {k: v - 1 for k, v in labels_map.items()}
return labels_map
......@@ -28,6 +28,7 @@ def _generate_iter_maker(num=10):
return _reader
class TestIteratorSource(unittest.TestCase):
"""Test cases for dataset.source.roidb_source
"""
......
......@@ -132,6 +132,7 @@ class TestReader(unittest.TestCase):
def test_create(self):
""" Test create a reader using my source
"""
def _my_data_reader():
mydata = build_source(self.rcnn_conf['DATA']['TRAIN'])
for i, sample in enumerate(mydata):
......@@ -139,10 +140,12 @@ class TestReader(unittest.TestCase):
my_source = IteratorSource(_my_data_reader)
mode = 'TRAIN'
train_rd = Reader.create(
mode,
self.rcnn_conf['DATA'][mode],
self.rcnn_conf['TRANSFORM'][mode],
max_iter=10, my_source=my_source)
max_iter=10,
my_source=my_source)
out = None
for sample in train_rd():
......
......@@ -52,13 +52,14 @@ def images_labelme(data, num):
image['file_name'] = data['imagePath'].split('/')[-1]
return image
def images_cityscape(data, num, img_file):
image = {}
image['height'] = data['imgHeight']
image['width'] = data['imgWidth']
image['id'] = num + 1
image['file_name'] = img_file
return image
def categories(label, labels_list):
......@@ -88,7 +89,8 @@ def annotations_rectangle(points, label, image_num, object_num, label_to_num):
return annotation
def annotations_polygon(height, width, points, label, image_num, object_num,
label_to_num):
annotation = {}
annotation['segmentation'] = [list(np.asarray(points).flatten())]
annotation['iscrowd'] = 0
......@@ -131,7 +133,8 @@ def deal_json(ds_type, img_path, json_path):
object_num = -1
for img_file in os.listdir(img_path):
img_label = img_file.split('.')[0]
if img_file.split('.')[
-1] not in ['bmp', 'jpg', 'jpeg', 'png', 'JPEG', 'JPG', 'PNG']:
continue
label_file = osp.join(json_path, img_label + '.json')
print('Generating dataset from:', label_file)
......@@ -141,7 +144,7 @@ def deal_json(ds_type, img_path, json_path):
if ds_type == 'labelme':
images_list.append(images_labelme(data, image_num))
elif ds_type == 'cityscape':
images_list.append(images_cityscape(data, image_num, img_file))
if ds_type == 'labelme':
for shapes in data['shapes']:
object_num = object_num + 1
......@@ -155,13 +158,15 @@ def deal_json(ds_type, img_path, json_path):
if p_type == 'polygon':
annotations_list.append(
annotations_polygon(data['imageHeight'], data[
'imageWidth'], points, label, image_num,
object_num, label_to_num))
if p_type == 'rectangle':
points.append([points[0][0], points[1][1]])
points.append([points[1][0], points[0][1]])
annotations_list.append(
annotations_rectangle(points, label, image_num,
object_num, label_to_num))
elif ds_type == 'cityscape':
for shapes in data['objects']:
object_num = object_num + 1
......@@ -173,7 +178,8 @@ def deal_json(ds_type, img_path, json_path):
points = shapes['polygon']
annotations_list.append(
annotations_polygon(data['imgHeight'], data[
'imgWidth'], points, label, image_num, object_num,
label_to_num))
data_coco['images'] = images_list
data_coco['categories'] = categories_list
data_coco['annotations'] = annotations_list
......@@ -266,9 +272,8 @@ def main():
if not os.path.exists(args.output_dir + '/annotations'):
os.makedirs(args.output_dir + '/annotations')
if args.train_proportion != 0:
train_data_coco = deal_json(
args.dataset_type, args.output_dir + '/train', args.json_input_dir)
train_json_path = osp.join(args.output_dir + '/annotations',
'instance_train.json')
json.dump(
......@@ -290,5 +295,6 @@ def main():
json.dump(
test_data_coco, open(test_json_path, 'w'), indent=4, cls=MyEncoder)
if __name__ == '__main__':
main()
......@@ -49,14 +49,18 @@ class ParallelMappedDataset(ProxiedDataset):
super(ParallelMappedDataset, self).__init__(source)
worker_args = {k.lower(): v for k, v in worker_args.items()}
args = {
'bufsize': 100,
'worker_num': 8,
'use_process': False,
'memsize': '3G'
}
args.update(worker_args)
if args['use_process'] and type(args['memsize']) is str:
assert args['memsize'][-1].lower() == 'g', \
"invalid param for memsize[%s], should be ended with 'G' or 'g'" % (args['memsize'])
gb = args['memsize'][:-1]
args['memsize'] = int(gb) * 1024**3
self._worker_args = args
self._started = False
......
......@@ -318,8 +318,7 @@ class PageAllocator(object):
while True:
# maybe flags already has some '0' pages,
# so just check 'page_num - len(flags)' pages
flags = self.get_page_status(pos, page_num, ret_flag=True)
if flags.count('0') == page_num:
break
......@@ -526,7 +525,7 @@ class SharedMemoryMgr(object):
if not self._released and not self._allocator.empty():
logger.debug('not empty when delete this SharedMemoryMgr[%s]' %
(self))
else:
self._released = True
......
......@@ -23,8 +23,10 @@ from paddle.fluid import unique_name
import paddle.fluid.layer_helper_base as lhb
import paddle.fluid.optimizer as optim
__all__ = [
'mixed_precision_global_state', 'mixed_precision_context',
'StaticLossScale', 'DynamicLossScale'
]
_mixed_precision_global_state = None
......@@ -134,8 +136,8 @@ class DynamicLossScale(LossScale):
with layers.Switch() as switch2:
with switch2.case(scale_valid):
layers.assign(new_scale, self.scale)
layers.assign(
layers.zeros_like(self.good_steps), self.good_steps)
with switch2.default():
layers.increment(self.good_steps)
with switch.default():
......@@ -151,8 +153,7 @@ class DynamicLossScale(LossScale):
with switch.default():
layers.assign(new_scale, self.scale)
layers.assign(layers.zeros_like(self.good_steps), self.good_steps)
class mixed_precision_context(object):
......@@ -197,7 +198,7 @@ class mixed_precision_context(object):
if not enabled:
return
monkey_patch()
if isinstance(loss_scale, six.integer_types + (float, )):
self.loss_scale = StaticLossScale(loss_scale)
elif loss_scale == 'dynamic':
self.loss_scale = DynamicLossScale()
......@@ -243,8 +244,8 @@ def create_parameter(self,
if is_half and mp_state is not None:
dtype = 'float32'
param = self._create_parameter(attr, shape, dtype, is_bias,
default_initializer)
if not is_half or mp_state is None:
return param
......
......@@ -34,9 +34,7 @@ class FaceBoxNet(object):
lite_edition (bool): whether or not is FaceBoxes-lite
"""
def __init__(self, with_extra_blocks=True, lite_edition=False):
super(FaceBoxNet, self).__init__()
self.with_extra_blocks = with_extra_blocks
......@@ -212,17 +210,16 @@ class FaceBoxNet(object):
return layers[-3], layers[-2], layers[-1]
def _conv_norm(self,
input,
filter_size,
num_filters,
stride,
padding,
num_groups=1,
act='relu',
use_cudnn=True,
name=None):
parameter_attr = ParamAttr(
learning_rate=0.1,
initializer=fluid.initializer.MSRA(),
......@@ -240,17 +237,16 @@ class FaceBoxNet(object):
bias_attr=False)
return fluid.layers.batch_norm(input=conv, act=act)
def _conv_norm_crelu(self,
input,
filter_size,
num_filters,
stride,
padding,
num_groups=1,
act='relu',
use_cudnn=True,
name=None):
parameter_attr = ParamAttr(
learning_rate=0.1,
initializer=fluid.initializer.MSRA(),
......@@ -358,7 +354,6 @@ class FaceBoxNet(object):
act='relu',
name='inceptionA_' + idx + '_conv4_3')
concat = fluid.layers.concat([conv1, conv2, conv3, conv4], axis=1)
return concat
......@@ -43,15 +43,13 @@ class VGG(object):
with_extra_blocks=False,
normalizations=[20., -1, -1, -1, -1, -1],
extra_block_filters=[[256, 512, 1, 2, 3], [128, 256, 1, 2, 3],
[128, 256, 0, 1, 3],
[128, 256, 0, 1, 3]]):
assert depth in [16, 19], \
"depth {} not in [16, 19]"
self.depth = depth
self.depth_cfg = {16: [2, 2, 3, 3, 3], 19: [2, 2, 4, 4, 4]}
self.with_extra_blocks = with_extra_blocks
self.normalizations = normalizations
self.extra_block_filters = extra_block_filters
......@@ -77,7 +75,8 @@ class VGG(object):
conv = input
layers = []
for k, v in enumerate(vgg_base):
conv = self._conv_block(
conv, v, nums[k], name="conv{}_".format(k + 1))
layers.append(conv)
if k == 4:
conv = self._pooling_block(conv, 3, 1, pool_padding=1)
......@@ -95,8 +94,14 @@ class VGG(object):
layers = []
for k, v in enumerate(cfg):
assert len(v) == 5, "extra_block_filters size not fix"
conv = self._extra_block(
conv,
v[0],
v[1],
v[2],
v[3],
v[4],
name="conv{}_".format(6 + k))
layers.append(conv)
return layers
......@@ -144,15 +149,15 @@ class VGG(object):
return conv_2
def _conv_layer(self,
input,
num_filters,
filter_size,
stride,
padding,
dilation=1,
act='relu',
use_cudnn=True,
name=None):
conv = fluid.layers.conv2d(
input=input,
num_filters=num_filters,
......@@ -195,6 +200,8 @@ class VGG(object):
dtype=input.dtype,
default_initializer=Constant(init_scale))
out = fluid.layers.elementwise_mul(
x=l2_norm,
y=scale,
axis=-1 if channel_shared else 1,
name="conv4_3_norm_scale")
return out
......@@ -39,7 +39,7 @@ feed_var_def = [
# yapf: enable
def create_feed(feed, iterable=False, sub_prog_feed=False):
image_shape = feed.image_shape
feed_var_map = {var['name']: var for var in feed_var_def}
feed_var_map['image'] = {
......@@ -119,11 +119,9 @@ def create_feed(feed, use_pyreader=True, sub_prog_feed=False):
dtype=feed_var_map[key]['dtype'],
lod_level=feed_var_map[key]['lod_level'])) for key in feed.fields])
loader = fluid.io.DataLoader.from_generator(
feed_list=list(feed_vars.values()),
capacity=64,
use_double_buffer=True,
iterable=iterable) if not sub_prog_feed else None
return loader, feed_vars
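The diff above migrates `create_feed` from `fluid.io.PyReader` to `fluid.io.DataLoader.from_generator`. A hedged usage sketch of the new loader (PaddlePaddle 1.6 API; the feed variable below is illustrative, not taken from `feed_var_def`):

```python
import paddle.fluid as fluid

# Illustrative feed variable; create_feed builds these from feed.fields.
image = fluid.data(name='image', shape=[None, 3, 608, 608], dtype='float32')

loader = fluid.io.DataLoader.from_generator(
    feed_list=[image],
    capacity=64,
    use_double_buffer=True,
    iterable=False)  # non-iterable loaders use loader.start()/loader.reset()
```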
......@@ -15,7 +15,6 @@
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from __future__ import unicode_literals
import sys
......@@ -24,7 +23,7 @@ import paddle.fluid as fluid
import logging
logger = logging.getLogger(__name__)
__all__ = ['check_gpu']
__all__ = ['check_gpu', 'check_version']
def check_gpu(use_gpu):
......@@ -45,3 +44,18 @@ def check_gpu(use_gpu):
except Exception as e:
pass
def check_version():
"""
Log error and exit when the installed version of paddlepaddle is
not satisfied.
"""
    err = "PaddlePaddle version 1.6 or higher is required, " \
          "or a suitable develop version is satisfied as well. \n" \
          "Please make sure the version is good with your code."
try:
fluid.require_version('1.6.0')
except Exception as e:
logger.error(err)
sys.exit(1)
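A hedged usage sketch of the new check (the module path is assumed from this repo's layout; tools entry points would call it before building the program):

```python
from ppdet.utils.check import check_gpu, check_version

check_version()          # logs an error and exits if PaddlePaddle < 1.6
check_gpu(use_gpu=True)  # validates the GPU setting against the Paddle build
```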
......@@ -71,11 +71,9 @@ DATASETS = {
'https://dataset.bj.bcebos.com/wider_face/wider_face_split.zip',
'a4a898d6193db4b9ef3260a68bad0dc7', ),
], ["WIDER_train", "WIDER_val", "wider_face_split"]),
'fruit': ([(
'https://dataset.bj.bcebos.com/PaddleDetection_demo/fruit-detection.tar',
'ee4a1bf2e321b75b0850cc6e063f79d7', ), ], ["fruit-detection"]),
}
DOWNLOAD_RETRY_LIMIT = 3
......@@ -133,8 +131,8 @@ def get_dataset_path(path, annotation, image_dir):
# not match any dataset in DATASETS
    raise ValueError("Dataset {} is not valid and cannot parse dataset type "
                     "'{}' for automatically downloading, which only supports "
                     "'voc', 'coco', 'wider_face' and 'fruit' currently".
                     format(path, osp.split(path)[-1]))
def create_voc_list(data_dir, devkit_subdir='VOCdevkit'):
......
......@@ -96,7 +96,7 @@ def clean_res(result, keep_name_list):
def eval_run(exe,
compile_program,
-             pyreader,
+             loader,
keys,
values,
cls,
......@@ -121,7 +121,7 @@ def eval_run(exe,
has_bbox = 'bbox' in keys
try:
-        pyreader.start()
+        loader.start()
while True:
outs = exe.run(compile_program,
fetch_list=values,
......@@ -158,7 +158,7 @@ def eval_run(exe,
iter_id += 1
images_num += len(res['bbox'][1][0]) if has_bbox else 1
except (StopIteration, fluid.core.EOFException):
-        pyreader.reset()
+        loader.reset()
logger.info('Test finish iter {}'.format(iter_id))
end_time = time.time()
......
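The eval_run change above only renames the handle; the non-iterable protocol (start, run without a feed, catch EOF, reset) is unchanged. A minimal runnable sketch of that loop, with a toy program standing in for the compiled eval program:

import numpy as np
import paddle.fluid as fluid

place = fluid.CPUPlace()
main_prog, startup_prog = fluid.Program(), fluid.Program()
with fluid.program_guard(main_prog, startup_prog):
    x = fluid.layers.data(name='x', shape=[4], dtype='float32')
    y = fluid.layers.reduce_sum(x)
    loader = fluid.io.DataLoader.from_generator(
        feed_list=[x], capacity=8, iterable=False)

# three single-sample batches of ones; shapes and values are arbitrary
loader.set_sample_list_generator(
    lambda: iter([[(np.ones(4, 'float32'),)]] * 3), place)

exe = fluid.Executor(place)
exe.run(startup_prog)

loader.start()                    # non-iterable mode: the loader drives the feed
try:
    while True:
        out, = exe.run(main_prog, fetch_list=[y])  # note: no feed argument
except fluid.core.EOFException:
    loader.reset()                # rewind so the loader can be started again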
......@@ -38,8 +38,7 @@ def visualize_results(image,
if mask_results:
image = draw_mask(image, im_id, mask_results, threshold)
if bbox_results:
-        image = draw_bbox(image, im_id, catid2name, bbox_results,
-                          threshold)
+        image = draw_bbox(image, im_id, catid2name, bbox_results, threshold)
return image
......@@ -102,9 +101,8 @@ def draw_bbox(image, im_id, catid2name, bboxes, threshold):
# draw label
text = "{} {:.2f}".format(catid2name[catid], score)
tw, th = draw.textsize(text)
-        draw.rectangle([(xmin + 1, ymin - th),
-                        (xmin + tw + 1, ymin)],
-                       fill=color)
+        draw.rectangle(
+            [(xmin + 1, ymin - th), (xmin + tw + 1, ymin)], fill=color)
draw.text((xmin + 1, ymin - th), text, fill=(255, 255, 255))
return image
......@@ -28,9 +28,7 @@ from .coco_eval import bbox2out
import logging
logger = logging.getLogger(__name__)
-__all__ = [
-    'bbox_eval', 'bbox2out', 'get_category_info'
-]
+__all__ = ['bbox_eval', 'bbox2out', 'get_category_info']
def bbox_eval(results,
......@@ -57,11 +55,12 @@ def bbox_eval(results,
assert 'bbox' in results[0]
logger.info("Start evaluate...")
-    detection_map = DetectionMAP(class_num=class_num,
-                                 overlap_thresh=overlap_thresh,
-                                 map_type=map_type,
-                                 is_bbox_normalized=is_bbox_normalized,
-                                 evaluate_difficult=evaluate_difficult)
+    detection_map = DetectionMAP(
+        class_num=class_num,
+        overlap_thresh=overlap_thresh,
+        map_type=map_type,
+        is_bbox_normalized=is_bbox_normalized,
+        evaluate_difficult=evaluate_difficult)
for t in results:
bboxes = t['bbox'][0]
......@@ -84,9 +83,9 @@ def bbox_eval(results,
difficult = None if difficults is None \
else difficults[i]
bbox_num = bbox_lengths[i]
-            bbox = bboxes[bbox_idx: bbox_idx + bbox_num]
+            bbox = bboxes[bbox_idx:bbox_idx + bbox_num]
            gt_box, gt_label, difficult = prune_zero_padding(
-                    gt_box, gt_label, difficult)
+                gt_box, gt_label, difficult)
detection_map.update(bbox, gt_box, gt_label, difficult)
bbox_idx += bbox_num
else:
......@@ -97,9 +96,9 @@ def bbox_eval(results,
for i in range(len(bbox_lengths)):
bbox_num = bbox_lengths[i]
gt_box_num = gt_box_lengths[i]
-            bbox = bboxes[bbox_idx: bbox_idx + bbox_num]
-            gt_box = gt_boxes[gt_box_idx: gt_box_idx + gt_box_num]
-            gt_label = gt_labels[gt_box_idx: gt_box_idx + gt_box_num]
+            bbox = bboxes[bbox_idx:bbox_idx + bbox_num]
+            gt_box = gt_boxes[gt_box_idx:gt_box_idx + gt_box_num]
+            gt_label = gt_labels[gt_box_idx:gt_box_idx + gt_box_num]
difficult = None if difficults is None else \
difficults[gt_box_idx: gt_box_idx + gt_box_num]
detection_map.update(bbox, gt_box, gt_label, difficult)
......@@ -109,8 +108,8 @@ def bbox_eval(results,
logger.info("Accumulating evaluatation results...")
detection_map.accumulate()
map_stat = 100. * detection_map.get_map()
logger.info("mAP({:.2f}, {}) = {:.2f}".format(overlap_thresh,
map_type, map_stat))
logger.info("mAP({:.2f}, {}) = {:.2f}".format(overlap_thresh, map_type,
map_stat))
return map_stat
......@@ -121,8 +120,8 @@ def prune_zero_padding(gt_box, gt_label, difficult=None):
gt_box[i, 2] == 0 and gt_box[i, 3] == 0:
break
valid_cnt += 1
-    return (gt_box[:valid_cnt], gt_label[:valid_cnt],
-            difficult[:valid_cnt] if difficult is not None else None)
+    return (gt_box[:valid_cnt], gt_label[:valid_cnt], difficult[:valid_cnt]
+            if difficult is not None else None)
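As an aside on the slicing above: prune_zero_padding exists because VOC-style feeds pad each image's ground truth with all-zero boxes to a fixed length, and evaluation must drop that padding. A tiny hedged illustration with made-up data:

import numpy as np

gt_box = np.array([[10, 20, 50, 60],
                   [30, 40, 80, 90],
                   [0, 0, 0, 0],    # padding row: this and later rows are dropped
                   [0, 0, 0, 0]], dtype='float32')
gt_label = np.array([1, 3, 0, 0], dtype='int32')

valid_cnt = 0
for row in gt_box:
    if not row.any():              # first all-zero box ends the valid prefix
        break
    valid_cnt += 1

print(gt_box[:valid_cnt], gt_label[:valid_cnt])  # only the two real boxes remain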
def get_category_info(anno_file=None,
......
......@@ -77,10 +77,11 @@ def _walk_voc_dir(devkit_dir, year, output_dir):
if name_prefix in added:
continue
added.add(name_prefix)
-            ann_path = osp.join(osp.relpath(annotation_dir, output_dir),
-                                name_prefix + '.xml')
-            img_path = osp.join(osp.relpath(img_dir, output_dir),
-                                name_prefix + '.jpg')
+            ann_path = osp.join(
+                osp.relpath(annotation_dir, output_dir),
+                name_prefix + '.xml')
+            img_path = osp.join(
+                osp.relpath(img_dir, output_dir), name_prefix + '.jpg')
img_ann_list.append((img_path, ann_path))
return trainval_list, test_list
......@@ -143,7 +143,7 @@ def main():
# build program
model = create(main_arch)
-    _, train_feed_vars = create_feed(train_feed, False)
+    _, train_feed_vars = create_feed(train_feed, True)
train_fetches = model.train(train_feed_vars)
loss = train_fetches['loss']
lr = lr_builder()
......@@ -173,7 +173,7 @@ def main():
with fluid.program_guard(eval_prog, startup_prog):
with fluid.unique_name.guard():
model = create(main_arch)
-            _, test_feed_vars = create_feed(eval_feed, False)
+            _, test_feed_vars = create_feed(eval_feed, True)
fetches = model.eval(test_feed_vars)
eval_prog = eval_prog.clone(True)
......
......@@ -143,7 +143,7 @@ def main():
with fluid.program_guard(train_prog, startup_prog):
with fluid.unique_name.guard():
model = create(main_arch)
-            _, feed_vars = create_feed(train_feed, False)
+            _, feed_vars = create_feed(train_feed, True)
train_fetches = model.train(feed_vars)
loss = train_fetches['loss']
lr = lr_builder()
......@@ -165,7 +165,7 @@ def main():
with fluid.program_guard(eval_prog, startup_prog):
with fluid.unique_name.guard():
model = create(main_arch)
-            _, test_feed_vars = create_feed(eval_feed, False)
+            _, test_feed_vars = create_feed(eval_feed, True)
fetches = model.eval(test_feed_vars)
eval_prog = eval_prog.clone(True)
......
......@@ -151,7 +151,7 @@ def main():
with fluid.program_guard(train_prog, startup_prog):
with fluid.unique_name.guard():
model = create(main_arch)
-            _, feed_vars = create_feed(train_feed, False)
+            _, feed_vars = create_feed(train_feed, True)
train_fetches = model.train(feed_vars)
loss = train_fetches['loss']
lr = lr_builder()
......@@ -173,7 +173,7 @@ def main():
with fluid.program_guard(eval_prog, startup_prog):
with fluid.unique_name.guard():
model = create(main_arch)
-            _, test_feed_vars = create_feed(eval_feed, False)
+            _, test_feed_vars = create_feed(eval_feed, True)
fetches = model.eval(test_feed_vars)
eval_prog = eval_prog.clone(True)
......
......@@ -35,7 +35,7 @@ import paddle.fluid as fluid
from ppdet.utils.eval_utils import parse_fetches, eval_run, eval_results, json_eval_results
import ppdet.utils.checkpoint as checkpoint
-from ppdet.utils.check import check_gpu
+from ppdet.utils.check import check_gpu, check_version
from ppdet.modeling.model_input import create_feed
from ppdet.data.data_feed import create_reader
from ppdet.core.workspace import load_config, merge_config, create
......@@ -60,6 +60,8 @@ def main():
merge_config(FLAGS.opt)
# check if set use_gpu=True in paddlepaddle cpu version
check_gpu(cfg.use_gpu)
+    # check if paddlepaddle version is satisfied
+    check_version()
if 'eval_feed' not in cfg:
eval_feed = create(main_arch + 'EvalFeed')
......@@ -78,14 +80,14 @@ def main():
eval_prog = fluid.Program()
with fluid.program_guard(eval_prog, startup_prog):
with fluid.unique_name.guard():
-            pyreader, feed_vars = create_feed(eval_feed)
+            loader, feed_vars = create_feed(eval_feed)
if multi_scale_test is None:
fetches = model.eval(feed_vars)
else:
fetches = model.eval(feed_vars, multi_scale_test)
eval_prog = eval_prog.clone(True)
reader = create_reader(eval_feed, args_path=FLAGS.dataset_dir)
-    pyreader.decorate_sample_list_generator(reader, place)
+    loader.set_sample_list_generator(reader, place)
# eval already exists json file
if FLAGS.json_eval:
......@@ -129,8 +131,7 @@ def main():
sub_eval_prog = fluid.Program()
with fluid.program_guard(sub_eval_prog, startup_prog):
with fluid.unique_name.guard():
-                _, feed_vars = create_feed(
-                    eval_feed, use_pyreader=False, sub_prog_feed=True)
+                _, feed_vars = create_feed(eval_feed, False, sub_prog_feed=True)
sub_fetches = model.eval(
feed_vars, multi_scale_test, mask_branch=True)
extra_keys = []
......@@ -145,7 +146,7 @@ def main():
if 'weights' in cfg:
checkpoint.load_params(exe, sub_eval_prog, cfg.weights)
-    results = eval_run(exe, compile_program, pyreader, keys, values, cls, cfg,
+    results = eval_run(exe, compile_program, loader, keys, values, cls, cfg,
sub_eval_prog, sub_keys, sub_values)
# evaluation
......
......@@ -81,10 +81,10 @@ def face_eval_run(exe,
shrink, max_shrink = get_shrink(image.size[1], image.size[0])
det0 = detect_face(exe, compile_program, fetches, image, shrink)
det1 = flip_test(exe, compile_program, fetches, image, shrink)
-            [det2, det3] = multi_scale_test(exe, compile_program, fetches, image,
-                                            max_shrink)
-            det4 = multi_scale_test_pyramid(exe, compile_program, fetches, image,
-                                            max_shrink)
+            [det2, det3] = multi_scale_test(exe, compile_program, fetches,
+                                            image, max_shrink)
+            det4 = multi_scale_test_pyramid(exe, compile_program, fetches,
+                                            image, max_shrink)
det = np.row_stack((det0, det1, det2, det3, det4))
dets = bbox_vote(det)
else:
......@@ -293,6 +293,7 @@ if __name__ == '__main__':
"--multi_scale",
action='store_true',
default=False,
help="If True it will select `multi_scale` evaluation. Default is `False`, it will select `single-scale` evaluation.")
help="If True it will select `multi_scale` evaluation. Default is `False`, it will select `single-scale` evaluation."
)
FLAGS = parser.parse_args()
main()
......@@ -43,7 +43,7 @@ from ppdet.data.data_feed import create_reader
from ppdet.utils.eval_utils import parse_fetches
from ppdet.utils.cli import ArgsParser
-from ppdet.utils.check import check_gpu
+from ppdet.utils.check import check_gpu, check_version
from ppdet.utils.visualizer import visualize_results
import ppdet.utils.checkpoint as checkpoint
......@@ -107,6 +107,8 @@ def main():
# check if set use_gpu=True in paddlepaddle cpu version
check_gpu(cfg.use_gpu)
+    # check if paddlepaddle version is satisfied
+    check_version()
if 'test_feed' not in cfg:
test_feed = create(main_arch + 'TestFeed')
......@@ -125,12 +127,12 @@ def main():
infer_prog = fluid.Program()
with fluid.program_guard(infer_prog, startup_prog):
with fluid.unique_name.guard():
-            _, feed_vars = create_feed(test_feed, use_pyreader=False)
+            loader, feed_vars = create_feed(test_feed, iterable=True)
test_fetches = model.test(feed_vars)
infer_prog = infer_prog.clone(True)
reader = create_reader(test_feed)
-    feeder = fluid.DataFeeder(place=place, feed_list=feed_vars.values())
+    loader.set_sample_list_generator(reader, place)
exe.run(startup_prog)
if cfg.weights:
......@@ -174,9 +176,9 @@ def main():
tb_image_frame = 0 # each frame can display ten pictures at most.
imid2path = reader.imid2path
-    for iter_id, data in enumerate(reader()):
+    for iter_id, data in enumerate(loader()):
outs = exe.run(infer_prog,
-                       feed=feeder.feed(data),
+                       feed=data,
fetch_list=values,
return_numpy=False)
res = {
......
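With iterable=True, the manual fluid.DataFeeder above becomes unnecessary: iterating the loader yields ready-made feed data, exactly as the infer.py hunk shows. A minimal sketch of that consumption pattern (toy network, random input, hypothetical names):

import numpy as np
import paddle.fluid as fluid

place = fluid.CPUPlace()
main_prog, startup_prog = fluid.Program(), fluid.Program()
with fluid.program_guard(main_prog, startup_prog):
    im = fluid.layers.data(name='image', shape=[3, 8, 8], dtype='float32')
    out = fluid.layers.reduce_mean(im)
    loader = fluid.io.DataLoader.from_generator(
        feed_list=[im], capacity=4, iterable=True)

# two single-sample batches of random images
loader.set_sample_list_generator(
    lambda: iter([[(np.random.rand(3, 8, 8).astype('float32'),)]] * 2), place)

exe = fluid.Executor(place)
exe.run(startup_prog)
for data in loader():             # iterable mode: data is already feed-ready
    res, = exe.run(main_prog, feed=data, fetch_list=[out])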
......@@ -46,7 +46,7 @@ from ppdet.utils import dist_utils
from ppdet.utils.eval_utils import parse_fetches, eval_run, eval_results
from ppdet.utils.stats import TrainingStats
from ppdet.utils.cli import ArgsParser
-from ppdet.utils.check import check_gpu
+from ppdet.utils.check import check_gpu, check_version
import ppdet.utils.checkpoint as checkpoint
from ppdet.modeling.model_input import create_feed
......@@ -79,6 +79,8 @@ def main():
# check if set use_gpu=True in paddlepaddle cpu version
check_gpu(cfg.use_gpu)
+    # check if paddlepaddle version is satisfied
+    check_version()
if not FLAGS.dist or trainer_id == 0:
print_total_cfg(cfg)
......@@ -114,7 +116,7 @@ def main():
with fluid.program_guard(train_prog, startup_prog):
with fluid.unique_name.guard():
model = create(main_arch)
-            train_pyreader, feed_vars = create_feed(train_feed)
+            train_loader, feed_vars = create_feed(train_feed)
if FLAGS.fp16:
assert (getattr(model.backbone, 'norm_type', None)
......@@ -143,12 +145,12 @@ def main():
with fluid.program_guard(eval_prog, startup_prog):
with fluid.unique_name.guard():
model = create(main_arch)
-            eval_pyreader, feed_vars = create_feed(eval_feed)
+            eval_loader, feed_vars = create_feed(eval_feed)
fetches = model.eval(feed_vars)
eval_prog = eval_prog.clone(True)
eval_reader = create_reader(eval_feed, args_path=FLAGS.dataset_dir)
-    eval_pyreader.decorate_sample_list_generator(eval_reader, place)
+    eval_loader.set_sample_list_generator(eval_reader, place)
# parse eval fetches
extra_keys = []
......@@ -206,7 +208,7 @@ def main():
train_reader = create_reader(train_feed, (cfg.max_iters - start_iter) *
devices_num, FLAGS.dataset_dir)
-    train_pyreader.decorate_sample_list_generator(train_reader, place)
+    train_loader.set_sample_list_generator(train_reader, place)
# whether output bbox is normalized in model output layer
is_bbox_normalized = False
......@@ -218,7 +220,7 @@ def main():
map_type = cfg.map_type if 'map_type' in cfg else '11point'
train_stats = TrainingStats(cfg.log_smooth_window, train_keys)
-    train_pyreader.start()
+    train_loader.start()
start_time = time.time()
end_time = time.time()
......@@ -265,7 +267,7 @@ def main():
if FLAGS.eval:
# evaluation
-            results = eval_run(exe, compiled_eval_prog, eval_pyreader,
+            results = eval_run(exe, compiled_eval_prog, eval_loader,
eval_keys, eval_values, eval_cls)
resolution = None
if 'mask' in results[0]:
......@@ -287,7 +289,7 @@ def main():
logger.info("Best test box ap: {}, in iter: {}".format(
best_box_ap_list[0], best_box_ap_list[1]))
-    train_pyreader.reset()
+    train_loader.reset()
if __name__ == '__main__':
......