Unverified commit f605beed, authored by wangguanzhong, committed by GitHub

[Cherry pick] fix bugs in release 1.5 (#2859)

* add mask fpn r50 2x & minor fix for MODEL ZOO doc (#2747)

* cherry-pick bug fix for release/1.5

* fix mask eval for multi-batch (#2772)

* cherry-pick bug fix for release/1.5

* fix faster rcnn use im_shape (#2779)

* cherry-pick bug fix for release/1.5

* refine bbox_normalize in infer.py (#2781)

* refine bbox_normalize in infer.py

* add is_bbox_normalize

* rename _forward to build

* check callable

* cherry-pick bug fix for release/1.5

* Prevent module instance from modifying global config (#2790)

Some of the global config values are reference types (e.g., objects, lists, or
dicts). If created module instances modify those values (see FPN and
spatial_scale), the global config reflects these changes, and instances of the
same class created later inherit the changed values.

* Fix command line parsing for non module options (#2839)

* fix save_inference_model cannot find feed vars (#2842)

* fix save_inference_model cannot find feed vars

* remove comment

* fix format

* add comment for pruned var

* add log for prune

* Fix train+eval in PaddleDetection (#2847)

* resolve conflict merge error

* reset data_feed
Parent 2db34817
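The aliasing problem described in the commit message for #2790 can be illustrated with a minimal sketch. Everything here is hypothetical (the `FPN` stand-in, `GLOBAL_CONFIG`, `create_aliased`, `create_isolated`); it only shows how a list held by a global config can be changed through a module instance, and how deep-copying the arguments is one way to prevent that. It does not claim to reproduce the actual fix in this commit.

```python
import copy

# Hypothetical stand-in for a parsed global config; names are illustrative.
GLOBAL_CONFIG = {'FPN': {'spatial_scale': [0.03125, 0.0625, 0.125, 0.25]}}


class FPN:
    """Toy module that edits one of its constructor arguments in place."""

    def __init__(self, spatial_scale):
        self.spatial_scale = spatial_scale

    def build(self):
        # In-place edit of a list that may still be owned by the config.
        self.spatial_scale.insert(0, self.spatial_scale[0] * 0.5)
        return self.spatial_scale


def create_aliased(config):
    # Passes the config's own list object to the instance (shared reference).
    return FPN(**config['FPN'])


def create_isolated(config):
    # Assumed remedy: hand each instance a private deep copy of its arguments.
    return FPN(**copy.deepcopy(config['FPN']))
```

With `create_aliased`, calling `build()` also lengthens the list stored in the config, so the next instance starts from the modified value; with `create_isolated`, the config is left untouched.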
...@@ -10,6 +10,7 @@ save_dir: output ...@@ -10,6 +10,7 @@ save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
weights: output/cascade_rcnn_r50_fpn_1x/model_final weights: output/cascade_rcnn_r50_fpn_1x/model_final
metric: COCO metric: COCO
num_classes: 81
CascadeRCNN: CascadeRCNN:
backbone: ResNet backbone: ResNet
...@@ -74,7 +75,6 @@ CascadeBBoxAssigner: ...@@ -74,7 +75,6 @@ CascadeBBoxAssigner:
bg_thresh_hi: [0.5, 0.6, 0.7] bg_thresh_hi: [0.5, 0.6, 0.7]
fg_thresh: [0.5, 0.6, 0.7] fg_thresh: [0.5, 0.6, 0.7]
fg_fraction: 0.25 fg_fraction: 0.25
num_classes: 81
CascadeBBoxHead: CascadeBBoxHead:
head: FC6FC7Head head: FC6FC7Head
...@@ -82,7 +82,6 @@ CascadeBBoxHead: ...@@ -82,7 +82,6 @@ CascadeBBoxHead:
keep_top_k: 100 keep_top_k: 100
nms_threshold: 0.5 nms_threshold: 0.5
score_threshold: 0.05 score_threshold: 0.05
num_classes: 81
FC6FC7Head: FC6FC7Head:
num_chan: 1024 num_chan: 1024
......
...@@ -10,6 +10,7 @@ snapshot_iter: 10000 ...@@ -10,6 +10,7 @@ snapshot_iter: 10000
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_pretrained.tar pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_pretrained.tar
metric: COCO metric: COCO
weights: output/faster_rcnn_r101_1x/model_final weights: output/faster_rcnn_r101_1x/model_final
num_classes: 81
FasterRCNN: FasterRCNN:
backbone: ResNet backbone: ResNet
...@@ -64,7 +65,6 @@ BBoxAssigner: ...@@ -64,7 +65,6 @@ BBoxAssigner:
bg_thresh_lo: 0.0 bg_thresh_lo: 0.0
fg_fraction: 0.25 fg_fraction: 0.25
fg_thresh: 0.5 fg_thresh: 0.5
num_classes: 81
BBoxHead: BBoxHead:
head: ResNetC5 head: ResNetC5
...@@ -72,7 +72,6 @@ BBoxHead: ...@@ -72,7 +72,6 @@ BBoxHead:
keep_top_k: 100 keep_top_k: 100
nms_threshold: 0.5 nms_threshold: 0.5
score_threshold: 0.05 score_threshold: 0.05
num_classes: 81
LearningRate: LearningRate:
base_lr: 0.01 base_lr: 0.01
......
...@@ -10,6 +10,7 @@ save_dir: output ...@@ -10,6 +10,7 @@ save_dir: output
pretrain_weights: http://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_pretrained.tar pretrain_weights: http://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_pretrained.tar
weights: output/faster_rcnn_r101_fpn_1x/model_final weights: output/faster_rcnn_r101_fpn_1x/model_final
metric: COCO metric: COCO
num_classes: 81
FasterRCNN: FasterRCNN:
backbone: ResNet backbone: ResNet
...@@ -73,7 +74,6 @@ BBoxAssigner: ...@@ -73,7 +74,6 @@ BBoxAssigner:
bg_thresh_lo: 0.0 bg_thresh_lo: 0.0
fg_fraction: 0.25 fg_fraction: 0.25
fg_thresh: 0.5 fg_thresh: 0.5
num_classes: 81
BBoxHead: BBoxHead:
head: TwoFCHead head: TwoFCHead
...@@ -81,7 +81,6 @@ BBoxHead: ...@@ -81,7 +81,6 @@ BBoxHead:
keep_top_k: 100 keep_top_k: 100
nms_threshold: 0.5 nms_threshold: 0.5
score_threshold: 0.05 score_threshold: 0.05
num_classes: 81
TwoFCHead: TwoFCHead:
num_chan: 1024 num_chan: 1024
......
...@@ -10,6 +10,7 @@ save_dir: output ...@@ -10,6 +10,7 @@ save_dir: output
pretrain_weights: http://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_pretrained.tar pretrain_weights: http://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_pretrained.tar
weights: output/faster_rcnn_r101_fpn_2x/model_final weights: output/faster_rcnn_r101_fpn_2x/model_final
metric: COCO metric: COCO
num_classes: 81
FasterRCNN: FasterRCNN:
backbone: ResNet backbone: ResNet
...@@ -73,7 +74,6 @@ BBoxAssigner: ...@@ -73,7 +74,6 @@ BBoxAssigner:
bg_thresh_lo: 0.0 bg_thresh_lo: 0.0
fg_fraction: 0.25 fg_fraction: 0.25
fg_thresh: 0.5 fg_thresh: 0.5
num_classes: 81
BBoxHead: BBoxHead:
head: TwoFCHead head: TwoFCHead
...@@ -81,7 +81,6 @@ BBoxHead: ...@@ -81,7 +81,6 @@ BBoxHead:
keep_top_k: 100 keep_top_k: 100
nms_threshold: 0.5 nms_threshold: 0.5
score_threshold: 0.05 score_threshold: 0.05
num_classes: 81
TwoFCHead: TwoFCHead:
num_chan: 1024 num_chan: 1024
......
...@@ -10,6 +10,7 @@ save_dir: output ...@@ -10,6 +10,7 @@ save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_vd_pretrained.tar pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_vd_pretrained.tar
weights: output/faster_rcnn_r101_vd_fpn_1x/model_final weights: output/faster_rcnn_r101_vd_fpn_1x/model_final
metric: COCO metric: COCO
num_classes: 81
FasterRCNN: FasterRCNN:
backbone: ResNet backbone: ResNet
...@@ -74,7 +75,6 @@ BBoxAssigner: ...@@ -74,7 +75,6 @@ BBoxAssigner:
bg_thresh_lo: 0.0 bg_thresh_lo: 0.0
fg_fraction: 0.25 fg_fraction: 0.25
fg_thresh: 0.5 fg_thresh: 0.5
num_classes: 81
BBoxHead: BBoxHead:
head: TwoFCHead head: TwoFCHead
...@@ -82,7 +82,6 @@ BBoxHead: ...@@ -82,7 +82,6 @@ BBoxHead:
keep_top_k: 100 keep_top_k: 100
nms_threshold: 0.5 nms_threshold: 0.5
score_threshold: 0.05 score_threshold: 0.05
num_classes: 81
TwoFCHead: TwoFCHead:
num_chan: 1024 num_chan: 1024
......
...@@ -10,6 +10,7 @@ save_dir: output ...@@ -10,6 +10,7 @@ save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_vd_pretrained.tar pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_vd_pretrained.tar
weights: output/faster_rcnn_r101_vd_fpn_2x/model_final weights: output/faster_rcnn_r101_vd_fpn_2x/model_final
metric: COCO metric: COCO
num_classes: 81
FasterRCNN: FasterRCNN:
backbone: ResNet backbone: ResNet
...@@ -74,7 +75,6 @@ BBoxAssigner: ...@@ -74,7 +75,6 @@ BBoxAssigner:
bg_thresh_lo: 0.0 bg_thresh_lo: 0.0
fg_fraction: 0.25 fg_fraction: 0.25
fg_thresh: 0.5 fg_thresh: 0.5
num_classes: 81
BBoxHead: BBoxHead:
head: TwoFCHead head: TwoFCHead
...@@ -82,7 +82,6 @@ BBoxHead: ...@@ -82,7 +82,6 @@ BBoxHead:
keep_top_k: 100 keep_top_k: 100
nms_threshold: 0.5 nms_threshold: 0.5
score_threshold: 0.05 score_threshold: 0.05
num_classes: 81
TwoFCHead: TwoFCHead:
num_chan: 1024 num_chan: 1024
......
...@@ -10,6 +10,7 @@ snapshot_iter: 10000 ...@@ -10,6 +10,7 @@ snapshot_iter: 10000
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
metric: COCO metric: COCO
weights: output/faster_rcnn_r50_1x/model_final weights: output/faster_rcnn_r50_1x/model_final
num_classes: 81
FasterRCNN: FasterRCNN:
backbone: ResNet backbone: ResNet
...@@ -64,7 +65,6 @@ BBoxAssigner: ...@@ -64,7 +65,6 @@ BBoxAssigner:
bg_thresh_lo: 0.0 bg_thresh_lo: 0.0
fg_fraction: 0.25 fg_fraction: 0.25
fg_thresh: 0.5 fg_thresh: 0.5
num_classes: 81
BBoxHead: BBoxHead:
head: ResNetC5 head: ResNetC5
...@@ -72,7 +72,6 @@ BBoxHead: ...@@ -72,7 +72,6 @@ BBoxHead:
keep_top_k: 100 keep_top_k: 100
nms_threshold: 0.5 nms_threshold: 0.5
score_threshold: 0.05 score_threshold: 0.05
num_classes: 81
LearningRate: LearningRate:
base_lr: 0.01 base_lr: 0.01
......
...@@ -10,6 +10,7 @@ snapshot_iter: 10000 ...@@ -10,6 +10,7 @@ snapshot_iter: 10000
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
metric: COCO metric: COCO
weights: output/faster_rcnn_r50_2x/model_final weights: output/faster_rcnn_r50_2x/model_final
num_classes: 81
FasterRCNN: FasterRCNN:
backbone: ResNet backbone: ResNet
...@@ -64,7 +65,6 @@ BBoxAssigner: ...@@ -64,7 +65,6 @@ BBoxAssigner:
bg_thresh_lo: 0.0 bg_thresh_lo: 0.0
fg_fraction: 0.25 fg_fraction: 0.25
fg_thresh: 0.5 fg_thresh: 0.5
num_classes: 81
BBoxHead: BBoxHead:
head: ResNetC5 head: ResNetC5
...@@ -72,7 +72,6 @@ BBoxHead: ...@@ -72,7 +72,6 @@ BBoxHead:
keep_top_k: 100 keep_top_k: 100
nms_threshold: 0.5 nms_threshold: 0.5
score_threshold: 0.05 score_threshold: 0.05
num_classes: 81
LearningRate: LearningRate:
base_lr: 0.01 base_lr: 0.01
......
...@@ -9,7 +9,8 @@ log_smooth_window: 20 ...@@ -9,7 +9,8 @@ log_smooth_window: 20
save_dir: output save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
metric: COCO metric: COCO
weights: output/fpn/faster_rcnn_r50_fpn_1x/model_final weights: output/faster_rcnn_r50_fpn_1x/model_final
num_classes: 81
FasterRCNN: FasterRCNN:
backbone: ResNet backbone: ResNet
...@@ -74,7 +75,6 @@ BBoxAssigner: ...@@ -74,7 +75,6 @@ BBoxAssigner:
bg_thresh_hi: 0.5 bg_thresh_hi: 0.5
fg_fraction: 0.25 fg_fraction: 0.25
fg_thresh: 0.5 fg_thresh: 0.5
num_classes: 81
BBoxHead: BBoxHead:
head: TwoFCHead head: TwoFCHead
...@@ -82,7 +82,6 @@ BBoxHead: ...@@ -82,7 +82,6 @@ BBoxHead:
keep_top_k: 100 keep_top_k: 100
nms_threshold: 0.5 nms_threshold: 0.5
score_threshold: 0.05 score_threshold: 0.05
num_classes: 81
TwoFCHead: TwoFCHead:
num_chan: 1024 num_chan: 1024
......
...@@ -10,6 +10,7 @@ save_dir: output ...@@ -10,6 +10,7 @@ save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
metric: COCO metric: COCO
weights: output/faster_rcnn_r50_fpn_2x/model_final weights: output/faster_rcnn_r50_fpn_2x/model_final
num_classes: 81
FasterRCNN: FasterRCNN:
backbone: ResNet backbone: ResNet
...@@ -74,7 +75,6 @@ BBoxAssigner: ...@@ -74,7 +75,6 @@ BBoxAssigner:
bg_thresh_hi: 0.5 bg_thresh_hi: 0.5
fg_fraction: 0.25 fg_fraction: 0.25
fg_thresh: 0.5 fg_thresh: 0.5
num_classes: 81
BBoxHead: BBoxHead:
head: TwoFCHead head: TwoFCHead
...@@ -82,7 +82,6 @@ BBoxHead: ...@@ -82,7 +82,6 @@ BBoxHead:
keep_top_k: 100 keep_top_k: 100
nms_threshold: 0.5 nms_threshold: 0.5
score_threshold: 0.05 score_threshold: 0.05
num_classes: 81
TwoFCHead: TwoFCHead:
num_chan: 1024 num_chan: 1024
......
...@@ -10,6 +10,7 @@ snapshot_iter: 10000 ...@@ -10,6 +10,7 @@ snapshot_iter: 10000
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_vd_pretrained.tar pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_vd_pretrained.tar
metric: COCO metric: COCO
weights: output/faster_rcnn_r50_vd_1x/model_final weights: output/faster_rcnn_r50_vd_1x/model_final
num_classes: 81
FasterRCNN: FasterRCNN:
backbone: ResNet backbone: ResNet
...@@ -66,7 +67,6 @@ BBoxAssigner: ...@@ -66,7 +67,6 @@ BBoxAssigner:
bg_thresh_lo: 0.0 bg_thresh_lo: 0.0
fg_fraction: 0.25 fg_fraction: 0.25
fg_thresh: 0.5 fg_thresh: 0.5
num_classes: 81
BBoxHead: BBoxHead:
head: ResNetC5 head: ResNetC5
...@@ -74,7 +74,6 @@ BBoxHead: ...@@ -74,7 +74,6 @@ BBoxHead:
keep_top_k: 100 keep_top_k: 100
nms_threshold: 0.5 nms_threshold: 0.5
score_threshold: 0.05 score_threshold: 0.05
num_classes: 81
LearningRate: LearningRate:
base_lr: 0.01 base_lr: 0.01
......
...@@ -10,6 +10,7 @@ save_dir: output ...@@ -10,6 +10,7 @@ save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_vd_pretrained.tar pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_vd_pretrained.tar
weights: output/faster_rcnn_r50_vd_fpn_2x/model_final weights: output/faster_rcnn_r50_vd_fpn_2x/model_final
metric: COCO metric: COCO
num_classes: 81
FasterRCNN: FasterRCNN:
backbone: ResNet backbone: ResNet
...@@ -74,7 +75,6 @@ BBoxAssigner: ...@@ -74,7 +75,6 @@ BBoxAssigner:
bg_thresh_lo: 0.0 bg_thresh_lo: 0.0
fg_fraction: 0.25 fg_fraction: 0.25
fg_thresh: 0.5 fg_thresh: 0.5
num_classes: 81
BBoxHead: BBoxHead:
head: TwoFCHead head: TwoFCHead
...@@ -82,7 +82,6 @@ BBoxHead: ...@@ -82,7 +82,6 @@ BBoxHead:
keep_top_k: 100 keep_top_k: 100
nms_threshold: 0.5 nms_threshold: 0.5
score_threshold: 0.05 score_threshold: 0.05
num_classes: 81
TwoFCHead: TwoFCHead:
num_chan: 1024 num_chan: 1024
......
...@@ -10,6 +10,7 @@ save_dir: output ...@@ -10,6 +10,7 @@ save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/SE154_vd_pretrained.tar pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/SE154_vd_pretrained.tar
weights: output/faster_rcnn_se154_vd_fpn_s1x/model_final weights: output/faster_rcnn_se154_vd_fpn_s1x/model_final
metric: COCO metric: COCO
num_classes: 81
FasterRCNN: FasterRCNN:
backbone: SENet backbone: SENet
...@@ -76,7 +77,6 @@ BBoxAssigner: ...@@ -76,7 +77,6 @@ BBoxAssigner:
bg_thresh_lo: 0.0 bg_thresh_lo: 0.0
fg_fraction: 0.25 fg_fraction: 0.25
fg_thresh: 0.5 fg_thresh: 0.5
num_classes: 81
BBoxHead: BBoxHead:
head: TwoFCHead head: TwoFCHead
...@@ -84,7 +84,6 @@ BBoxHead: ...@@ -84,7 +84,6 @@ BBoxHead:
keep_top_k: 100 keep_top_k: 100
nms_threshold: 0.5 nms_threshold: 0.5
score_threshold: 0.05 score_threshold: 0.05
num_classes: 81
TwoFCHead: TwoFCHead:
num_chan: 1024 num_chan: 1024
......
architecture: FasterRCNN
train_feed: FasterRCNNTrainFeed
eval_feed: FasterRCNNEvalFeed
test_feed: FasterRCNNTestFeed
max_iters: 180000
snapshot_iter: 10000
use_gpu: true
log_smooth_window: 20
save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNeXt101_vd_64x4d_pretrained.tar
weights: output/faster_rcnn_x101_vd_64x4d_fpn_1x/model_final
metric: COCO
num_classes: 81
FasterRCNN:
backbone: ResNeXt
fpn: FPN
rpn_head: FPNRPNHead
roi_extractor: FPNRoIAlign
bbox_head: BBoxHead
bbox_assigner: BBoxAssigner
ResNeXt:
depth: 101
feature_maps: [2, 3, 4, 5]
freeze_at: 2
group_width: 4
groups: 64
norm_type: affine_channel
variant: d
FPN:
max_level: 6
min_level: 2
num_chan: 256
spatial_scale: [0.03125, 0.0625, 0.125, 0.25]
FPNRPNHead:
anchor_generator:
anchor_sizes: [32, 64, 128, 256, 512]
aspect_ratios: [0.5, 1.0, 2.0]
stride: [16.0, 16.0]
variance: [1.0, 1.0, 1.0, 1.0]
anchor_start_size: 32
max_level: 6
min_level: 2
num_chan: 256
rpn_target_assign:
rpn_batch_size_per_im: 256
rpn_fg_fraction: 0.5
rpn_negative_overlap: 0.3
rpn_positive_overlap: 0.7
rpn_straddle_thresh: 0.0
train_proposal:
min_size: 0.0
nms_thresh: 0.7
post_nms_top_n: 2000
pre_nms_top_n: 2000
test_proposal:
min_size: 0.0
nms_thresh: 0.7
post_nms_top_n: 1000
pre_nms_top_n: 1000
FPNRoIAlign:
canconical_level: 4
canonical_size: 224
max_level: 5
min_level: 2
box_resolution: 7
sampling_ratio: 2
BBoxAssigner:
batch_size_per_im: 512
bbox_reg_weights: [0.1, 0.1, 0.2, 0.2]
bg_thresh_hi: 0.5
bg_thresh_lo: 0.0
fg_fraction: 0.25
fg_thresh: 0.5
BBoxHead:
head: TwoFCHead
nms:
keep_top_k: 100
nms_threshold: 0.5
score_threshold: 0.05
TwoFCHead:
num_chan: 1024
LearningRate:
base_lr: 0.01
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [120000, 160000]
values: null
- !LinearWarmup
start_factor: 0.1
steps: 1000
OptimizerBuilder:
optimizer:
momentum: 0.9
type: Momentum
regularizer:
factor: 0.0001
type: L2
FasterRCNNTrainFeed:
# batch size per device
batch_size: 1
dataset:
dataset_dir: dataset/coco
image_dir: train2017
annotation: annotations/instances_train2017.json
batch_transforms:
- !PadBatch
pad_to_stride: 32
num_workers: 2
shuffle: true
FasterRCNNEvalFeed:
batch_size: 1
dataset:
dataset_dir: dataset/coco
annotation: annotations/instances_val2017.json
image_dir: val2017
batch_transforms:
- !PadBatch
pad_to_stride: 32
num_workers: 2
FasterRCNNTestFeed:
batch_size: 1
dataset:
annotation: annotations/instances_val2017.json
batch_transforms:
- !PadBatch
pad_to_stride: 32
num_workers: 2
shuffle: false
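The `LearningRate` block in the config above combines `PiecewiseDecay` with `LinearWarmup`. A rough sketch of the schedule those values would produce follows. It assumes the usual semantics: the rate ramps linearly from `base_lr * start_factor` to `base_lr` over `steps` iterations, then is multiplied by `gamma` at each milestone. The function name `learning_rate` and the exact boundary handling are assumptions, not the framework's implementation.

```python
def learning_rate(step, base_lr=0.01, gamma=0.1,
                  milestones=(120000, 160000),
                  warmup_steps=1000, start_factor=0.1):
    """Return the assumed learning rate at a given iteration."""
    # Piecewise decay: multiply by gamma once per milestone already reached.
    decays = sum(1 for m in milestones if step >= m)
    lr = base_lr * gamma ** decays
    # Linear warmup: interpolate from base_lr * start_factor up to base_lr.
    if step < warmup_steps:
        alpha = step / warmup_steps
        lr = base_lr * (start_factor * (1 - alpha) + alpha)
    return lr
```

Under these assumptions the rate starts at 0.001, reaches 0.01 after 1000 iterations, and drops by a factor of ten at iterations 120000 and 160000.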
architecture: FasterRCNN
train_feed: FasterRCNNTrainFeed
eval_feed: FasterRCNNEvalFeed
test_feed: FasterRCNNTestFeed
max_iters: 360000
snapshot_iter: 10000
use_gpu: true
log_smooth_window: 20
save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNeXt101_vd_64x4d_pretrained.tar
weights: output/faster_rcnn_x101_vd_64x4d_fpn_1x/model_final
metric: COCO
FasterRCNN:
backbone: ResNeXt
fpn: FPN
rpn_head: FPNRPNHead
roi_extractor: FPNRoIAlign
bbox_head: BBoxHead
bbox_assigner: BBoxAssigner
ResNeXt:
depth: 101
feature_maps: [2, 3, 4, 5]
freeze_at: 2
group_width: 4
groups: 64
norm_type: affine_channel
variant: d
FPN:
max_level: 6
min_level: 2
num_chan: 256
spatial_scale: [0.03125, 0.0625, 0.125, 0.25]
FPNRPNHead:
anchor_generator:
anchor_sizes: [32, 64, 128, 256, 512]
aspect_ratios: [0.5, 1.0, 2.0]
stride: [16.0, 16.0]
variance: [1.0, 1.0, 1.0, 1.0]
anchor_start_size: 32
max_level: 6
min_level: 2
num_chan: 256
rpn_target_assign:
rpn_batch_size_per_im: 256
rpn_fg_fraction: 0.5
rpn_negative_overlap: 0.3
rpn_positive_overlap: 0.7
rpn_straddle_thresh: 0.0
train_proposal:
min_size: 0.0
nms_thresh: 0.7
post_nms_top_n: 2000
pre_nms_top_n: 2000
test_proposal:
min_size: 0.0
nms_thresh: 0.7
post_nms_top_n: 1000
pre_nms_top_n: 1000
FPNRoIAlign:
canconical_level: 4
canonical_size: 224
max_level: 5
min_level: 2
box_resolution: 7
sampling_ratio: 2
BBoxAssigner:
batch_size_per_im: 512
bbox_reg_weights: [0.1, 0.1, 0.2, 0.2]
bg_thresh_hi: 0.5
bg_thresh_lo: 0.0
fg_fraction: 0.25
fg_thresh: 0.5
num_classes: 81
BBoxHead:
head: TwoFCHead
nms:
keep_top_k: 100
nms_threshold: 0.5
score_threshold: 0.05
num_classes: 81
TwoFCHead:
num_chan: 1024
LearningRate:
base_lr: 0.01
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [240000, 320000]
- !LinearWarmup
start_factor: 0.1
steps: 1000
OptimizerBuilder:
optimizer:
momentum: 0.9
type: Momentum
regularizer:
factor: 0.0001
type: L2
FasterRCNNTrainFeed:
# batch size per device
batch_size: 1
dataset:
dataset_dir: dataset/coco
image_dir: train2017
annotation: annotations/instances_train2017.json
batch_transforms:
- !PadBatch
pad_to_stride: 32
num_workers: 2
shuffle: true
FasterRCNNEvalFeed:
batch_size: 1
dataset:
dataset_dir: dataset/coco
annotation: annotations/instances_val2017.json
image_dir: val2017
batch_transforms:
- !PadBatch
pad_to_stride: 32
num_workers: 2
FasterRCNNTestFeed:
batch_size: 1
dataset:
annotation: annotations/instances_val2017.json
batch_transforms:
- !PadBatch
pad_to_stride: 32
num_workers: 2
shuffle: false
...@@ -10,6 +10,7 @@ save_dir: output ...@@ -10,6 +10,7 @@ save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_pretrained.tar pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_pretrained.tar
metric: COCO metric: COCO
weights: output/mask_rcnn_r101_fpn_1x/model_final/ weights: output/mask_rcnn_r101_fpn_1x/model_final/
num_classes: 81
MaskRCNN: MaskRCNN:
backbone: ResNet backbone: ResNet
...@@ -68,7 +69,6 @@ FPNRoIAlign: ...@@ -68,7 +69,6 @@ FPNRoIAlign:
MaskHead: MaskHead:
dilation: 1 dilation: 1
num_chan_reduced: 256 num_chan_reduced: 256
num_classes: 81
num_convs: 4 num_convs: 4
resolution: 28 resolution: 28
...@@ -79,7 +79,6 @@ BBoxAssigner: ...@@ -79,7 +79,6 @@ BBoxAssigner:
bg_thresh_lo: 0.0 bg_thresh_lo: 0.0
fg_fraction: 0.25 fg_fraction: 0.25
fg_thresh: 0.5 fg_thresh: 0.5
num_classes: 81
MaskAssigner: MaskAssigner:
resolution: 28 resolution: 28
...@@ -90,7 +89,6 @@ BBoxHead: ...@@ -90,7 +89,6 @@ BBoxHead:
keep_top_k: 100 keep_top_k: 100
nms_threshold: 0.5 nms_threshold: 0.5
score_threshold: 0.05 score_threshold: 0.05
num_classes: 81
TwoFCHead: TwoFCHead:
num_chan: 1024 num_chan: 1024
......
...@@ -10,6 +10,7 @@ save_dir: output ...@@ -10,6 +10,7 @@ save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
metric: COCO metric: COCO
weights: output/mask_rcnn_r50_1x/model_final weights: output/mask_rcnn_r50_1x/model_final
num_classes: 81
MaskRCNN: MaskRCNN:
backbone: ResNet backbone: ResNet
...@@ -66,12 +67,10 @@ BBoxHead: ...@@ -66,12 +67,10 @@ BBoxHead:
nms_threshold: 0.5 nms_threshold: 0.5
normalized: false normalized: false
score_threshold: 0.05 score_threshold: 0.05
num_classes: 81
MaskHead: MaskHead:
dilation: 1 dilation: 1
num_chan_reduced: 256 num_chan_reduced: 256
num_classes: 81
resolution: 14 resolution: 14
BBoxAssigner: BBoxAssigner:
...@@ -81,10 +80,8 @@ BBoxAssigner: ...@@ -81,10 +80,8 @@ BBoxAssigner:
bg_thresh_lo: 0.0 bg_thresh_lo: 0.0
fg_fraction: 0.25 fg_fraction: 0.25
fg_thresh: 0.5 fg_thresh: 0.5
num_classes: 81
MaskAssigner: MaskAssigner:
num_classes: 81
resolution: 14 resolution: 14
LearningRate: LearningRate:
......
...@@ -10,6 +10,7 @@ save_dir: output ...@@ -10,6 +10,7 @@ save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
metric: COCO metric: COCO
weights: output/mask_rcnn_r50_2x/model_final/ weights: output/mask_rcnn_r50_2x/model_final/
num_classes: 81
MaskRCNN: MaskRCNN:
backbone: ResNet backbone: ResNet
...@@ -67,12 +68,10 @@ BBoxHead: ...@@ -67,12 +68,10 @@ BBoxHead:
nms_threshold: 0.5 nms_threshold: 0.5
normalized: false normalized: false
score_threshold: 0.05 score_threshold: 0.05
num_classes: 81
MaskHead: MaskHead:
dilation: 1 dilation: 1
num_chan_reduced: 256 num_chan_reduced: 256
num_classes: 81
resolution: 14 resolution: 14
BBoxAssigner: BBoxAssigner:
...@@ -82,10 +81,8 @@ BBoxAssigner: ...@@ -82,10 +81,8 @@ BBoxAssigner:
bg_thresh_lo: 0.0 bg_thresh_lo: 0.0
fg_fraction: 0.25 fg_fraction: 0.25
fg_thresh: 0.5 fg_thresh: 0.5
num_classes: 81
MaskAssigner: MaskAssigner:
num_classes: 81
resolution: 14 resolution: 14
LearningRate: LearningRate:
......
...@@ -10,6 +10,7 @@ save_dir: output ...@@ -10,6 +10,7 @@ save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
metric: COCO metric: COCO
weights: output/mask_rcnn_r50_fpn_1x/model_final/ weights: output/mask_rcnn_r50_fpn_1x/model_final/
num_classes: 81
MaskRCNN: MaskRCNN:
backbone: ResNet backbone: ResNet
...@@ -68,7 +69,6 @@ FPNRoIAlign: ...@@ -68,7 +69,6 @@ FPNRoIAlign:
MaskHead: MaskHead:
dilation: 1 dilation: 1
num_chan_reduced: 256 num_chan_reduced: 256
num_classes: 81
num_convs: 4 num_convs: 4
resolution: 28 resolution: 28
...@@ -79,7 +79,6 @@ BBoxAssigner: ...@@ -79,7 +79,6 @@ BBoxAssigner:
bg_thresh_lo: 0.0 bg_thresh_lo: 0.0
fg_fraction: 0.25 fg_fraction: 0.25
fg_thresh: 0.5 fg_thresh: 0.5
num_classes: 81
MaskAssigner: MaskAssigner:
resolution: 28 resolution: 28
...@@ -90,7 +89,6 @@ BBoxHead: ...@@ -90,7 +89,6 @@ BBoxHead:
keep_top_k: 100 keep_top_k: 100
nms_threshold: 0.5 nms_threshold: 0.5
score_threshold: 0.05 score_threshold: 0.05
num_classes: 81
TwoFCHead: TwoFCHead:
num_chan: 1024 num_chan: 1024
......
architecture: MaskRCNN
train_feed: MaskRCNNTrainFeed
eval_feed: MaskRCNNEvalFeed
test_feed: MaskRCNNTestFeed
max_iters: 360000
snapshot_iter: 10000
use_gpu: true
log_smooth_window: 20
save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
weights: output/mask_rcnn_r50_fpn_2x/model_final/
metric: COCO
num_classes: 81
MaskRCNN:
backbone: ResNet
fpn: FPN
rpn_head: FPNRPNHead
roi_extractor: FPNRoIAlign
bbox_head: BBoxHead
bbox_assigner: BBoxAssigner
ResNet:
depth: 50
feature_maps: [2, 3, 4, 5]
freeze_at: 2
norm_type: affine_channel
FPN:
max_level: 6
min_level: 2
num_chan: 256
spatial_scale: [0.03125, 0.0625, 0.125, 0.25]
FPNRPNHead:
anchor_generator:
aspect_ratios: [0.5, 1.0, 2.0]
variance: [1.0, 1.0, 1.0, 1.0]
anchor_start_size: 32
max_level: 6
min_level: 2
num_chan: 256
rpn_target_assign:
rpn_batch_size_per_im: 256
rpn_fg_fraction: 0.5
rpn_negative_overlap: 0.3
rpn_positive_overlap: 0.7
rpn_straddle_thresh: 0.0
train_proposal:
min_size: 0.0
nms_thresh: 0.7
pre_nms_top_n: 2000
post_nms_top_n: 2000
test_proposal:
min_size: 0.0
nms_thresh: 0.7
pre_nms_top_n: 1000
post_nms_top_n: 1000
FPNRoIAlign:
canconical_level: 4
canonical_size: 224
max_level: 5
min_level: 2
sampling_ratio: 2
box_resolution: 7
mask_resolution: 14
MaskHead:
dilation: 1
num_chan_reduced: 256
num_convs: 4
resolution: 28
BBoxAssigner:
batch_size_per_im: 512
bbox_reg_weights: [0.1, 0.1, 0.2, 0.2]
bg_thresh_hi: 0.5
bg_thresh_lo: 0.0
fg_fraction: 0.25
fg_thresh: 0.5
MaskAssigner:
resolution: 28
BBoxHead:
head: TwoFCHead
nms:
keep_top_k: 100
nms_threshold: 0.5
score_threshold: 0.05
TwoFCHead:
num_chan: 1024
LearningRate:
base_lr: 0.01
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [240000, 320000]
- !LinearWarmup
start_factor: 0.3333333333333333
steps: 500
OptimizerBuilder:
optimizer:
momentum: 0.9
type: Momentum
regularizer:
factor: 0.0001
type: L2
MaskRCNNTrainFeed:
batch_size: 1
dataset:
dataset_dir: dataset/coco
annotation: annotations/instances_train2017.json
image_dir: train2017
batch_transforms:
- !PadBatch
pad_to_stride: 32
num_workers: 2
MaskRCNNEvalFeed:
batch_size: 1
dataset:
dataset_dir: dataset/coco
annotation: annotations/instances_val2017.json
image_dir: val2017
batch_transforms:
- !PadBatch
pad_to_stride: 32
num_workers: 2
MaskRCNNTestFeed:
batch_size: 1
dataset:
annotation: annotations/instances_val2017.json
batch_transforms:
- !PadBatch
pad_to_stride: 32
num_workers: 2
...@@ -10,6 +10,7 @@ save_dir: output ...@@ -10,6 +10,7 @@ save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_vd_pretrained.tar pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_vd_pretrained.tar
metric: COCO metric: COCO
weights: output/mask_rcnn_r50_vd_fpn_2x/model_final/ weights: output/mask_rcnn_r50_vd_fpn_2x/model_final/
num_classes: 81
MaskRCNN: MaskRCNN:
backbone: ResNet backbone: ResNet
...@@ -69,7 +70,6 @@ FPNRoIAlign: ...@@ -69,7 +70,6 @@ FPNRoIAlign:
MaskHead: MaskHead:
dilation: 1 dilation: 1
num_chan_reduced: 256 num_chan_reduced: 256
num_classes: 81
num_convs: 4 num_convs: 4
resolution: 28 resolution: 28
...@@ -80,7 +80,6 @@ BBoxAssigner: ...@@ -80,7 +80,6 @@ BBoxAssigner:
bg_thresh_lo: 0.0 bg_thresh_lo: 0.0
fg_fraction: 0.25 fg_fraction: 0.25
fg_thresh: 0.5 fg_thresh: 0.5
num_classes: 81
MaskAssigner: MaskAssigner:
resolution: 28 resolution: 28
...@@ -91,7 +90,6 @@ BBoxHead: ...@@ -91,7 +90,6 @@ BBoxHead:
keep_top_k: 100 keep_top_k: 100
nms_threshold: 0.5 nms_threshold: 0.5
score_threshold: 0.05 score_threshold: 0.05
num_classes: 81
TwoFCHead: TwoFCHead:
num_chan: 1024 num_chan: 1024
......
...@@ -10,6 +10,7 @@ save_dir: output ...@@ -10,6 +10,7 @@ save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/SE154_vd_pretrained.tar pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/SE154_vd_pretrained.tar
weights: output/mask_rcnn_se154_vd_fpn_s1x/model_final/ weights: output/mask_rcnn_se154_vd_fpn_s1x/model_final/
metric: COCO metric: COCO
num_classes: 81
MaskRCNN: MaskRCNN:
backbone: SENet backbone: SENet
...@@ -71,7 +72,6 @@ FPNRoIAlign: ...@@ -71,7 +72,6 @@ FPNRoIAlign:
MaskHead: MaskHead:
dilation: 1 dilation: 1
num_chan_reduced: 256 num_chan_reduced: 256
num_classes: 81
num_convs: 4 num_convs: 4
resolution: 28 resolution: 28
...@@ -82,7 +82,6 @@ BBoxAssigner: ...@@ -82,7 +82,6 @@ BBoxAssigner:
bg_thresh_lo: 0.0 bg_thresh_lo: 0.0
fg_fraction: 0.25 fg_fraction: 0.25
fg_thresh: 0.5 fg_thresh: 0.5
num_classes: 81
MaskAssigner: MaskAssigner:
resolution: 28 resolution: 28
...@@ -93,7 +92,6 @@ BBoxHead: ...@@ -93,7 +92,6 @@ BBoxHead:
keep_top_k: 100 keep_top_k: 100
nms_threshold: 0.5 nms_threshold: 0.5
score_threshold: 0.05 score_threshold: 0.05
num_classes: 81
TwoFCHead: TwoFCHead:
num_chan: 1024 num_chan: 1024
...@@ -120,7 +118,7 @@ MaskRCNNTrainFeed: ...@@ -120,7 +118,7 @@ MaskRCNNTrainFeed:
# batch size per device # batch size per device
batch_size: 1 batch_size: 1
dataset: dataset:
dataset_dir: dataset/coco dataset_dir: dataset/coco
image_dir: train2017 image_dir: train2017
annotation: annotations/instances_train2017.json annotation: annotations/instances_train2017.json
batch_transforms: batch_transforms:
......
...@@ -188,3 +188,14 @@ A small utility (`tools/configure.py`) is included to simplify the configuration ...@@ -188,3 +188,14 @@ A small utility (`tools/configure.py`) is included to simplify the configuration
```shell ```shell
python tools/configure.py --minimal generate FasterRCNN BBoxHead python tools/configure.py --minimal generate FasterRCNN BBoxHead
``` ```
# FAQ
**Q:** Some configuration options are used by multiple modules (e.g., `num_classes`). How do I avoid duplicating them in config files?
**A:** We provide a `__shared__` annotation for exactly this purpose. Simply annotate the module like this: `__shared__ = ['num_classes']`. It works as follows:
1. If `num_classes` is configured for a module in the config file, that value takes precedence.
2. If `num_classes` is not configured for a module but is present in the config file as a global key, the global value is used.
3. Otherwise, the default value (`81`) is used.
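The three lookup rules above can be sketched in a few lines of Python. This is an illustration only: `resolve_kwargs`, `create`, and the toy `BBoxHead` class are hypothetical names written for this example, and the framework's real lookup code may differ.

```python
DEFAULT_NUM_CLASSES = 81


class BBoxHead:
    # Keys listed here may fall back to a global key in the config file.
    __shared__ = ['num_classes']

    def __init__(self, num_classes=DEFAULT_NUM_CLASSES):
        self.num_classes = num_classes


def resolve_kwargs(cls, config):
    """Build constructor kwargs for `cls` from a parsed config dict."""
    # Copy the module section so the global config is never mutated.
    module_cfg = dict(config.get(cls.__name__, {}))
    for key in getattr(cls, '__shared__', []):
        if key in module_cfg:
            continue  # rule 1: the module-level value takes precedence
        if key in config:
            module_cfg[key] = config[key]  # rule 2: fall back to the global key
        # rule 3: otherwise omit the key so the constructor default applies
    return module_cfg


def create(cls, config):
    """Instantiate `cls` with the resolved arguments."""
    return cls(**resolve_kwargs(cls, config))
```

For example, `create(BBoxHead, {'num_classes': 21, 'BBoxHead': {}})` picks up the global value, while an empty config falls back to the constructor default.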
...@@ -180,3 +180,14 @@ pip install typeguard http://github.com/willthefrog/docstring_parser/tarball/mas ...@@ -180,3 +180,14 @@ pip install typeguard http://github.com/willthefrog/docstring_parser/tarball/mas
```shell ```shell
python tools/configure.py --minimal generate FasterRCNN BBoxHead python tools/configure.py --minimal generate FasterRCNN BBoxHead
``` ```
# FAQ
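**Q:** Some configuration options are used by multiple modules (e.g., `num_classes`). How can I avoid setting them repeatedly in the config file?
**A:** The framework provides a `__shared__` marker for sharing configuration values. Mark the parameter, e.g. `__shared__ = ['num_classes']`, and values are resolved as follows:
1. If `num_classes` is provided in the module's configuration, that value takes precedence.
2. If the module's configuration does not provide `num_classes` but a global key exists in the config file, the global value is used.
3. If neither is configured, the default value (`81`) is used.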
...@@ -38,16 +38,19 @@ The backbone models pretrained on ImageNet are available. All backbone models ar ...@@ -38,16 +38,19 @@ The backbone models pretrained on ImageNet are available. All backbone models ar
| ResNet50-vd | Faster | 1 | 1x | 36.4 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r50_vd_1x.tar) | | ResNet50-vd | Faster | 1 | 1x | 36.4 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r50_vd_1x.tar) |
| ResNet50-FPN | Faster | 2 | 1x | 37.2 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r50_fpn_1x.tar) | | ResNet50-FPN | Faster | 2 | 1x | 37.2 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r50_fpn_1x.tar) |
| ResNet50-FPN | Faster | 2 | 2x | 37.7 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r50_fpn_2x.tar) | | ResNet50-FPN | Faster | 2 | 2x | 37.7 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r50_fpn_2x.tar) |
| ResNet50-FPN | Mask | 2 | 1x | 37.9 | 34.2 | [model](https://paddlemodels.bj.bcebos.com/object_detection/mask_rcnn_r50_fpn_1x.tar) | | ResNet50-FPN | Mask | 1 | 1x | 37.9 | 34.2 | [model](https://paddlemodels.bj.bcebos.com/object_detection/mask_rcnn_r50_fpn_1x.tar) |
| ResNet50-FPN | Mask | 1 | 2x | 38.7 | 34.7 | [model](https://paddlemodels.bj.bcebos.com/object_detection/mask_rcnn_r50_fpn_2x.tar) |
| ResNet50-FPN | Cascade Faster | 2 | 1x | 40.9 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/cascade_rcnn_r50_fpn_1x.tar) | | ResNet50-FPN | Cascade Faster | 2 | 1x | 40.9 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/cascade_rcnn_r50_fpn_1x.tar) |
| ResNet50-vd-FPN | Faster | 2 | 2x | 38.9 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r50_vd_fpn_2x.tar) | | ResNet50-vd-FPN | Faster | 2 | 2x | 38.9 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r50_vd_fpn_2x.tar) |
| ResNet50-vd-FPN | Mask | 2 | 2x | 39.8 | 35.4 | [model](https://paddlemodels.bj.bcebos.com/object_detection/mask_rcnn_r50_vd_fpn_2x.tar) | | ResNet50-vd-FPN | Mask | 1 | 2x | 39.8 | 35.4 | [model](https://paddlemodels.bj.bcebos.com/object_detection/mask_rcnn_r50_vd_fpn_2x.tar) |
| ResNet101 | Faster | 1 | 1x | 38.3 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r101_1x.tar) | | ResNet101 | Faster | 1 | 1x | 38.3 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r101_1x.tar) |
| ResNet101-FPN | Faster | 1 | 1x | 38.7 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r101_fpn_1x.tar) | | ResNet101-FPN | Faster | 1 | 1x | 38.7 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r101_fpn_1x.tar) |
| ResNet101-FPN | Faster | 1 | 2x | 39.1 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r101_fpn_2x.tar) | | ResNet101-FPN | Faster | 1 | 2x | 39.1 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r101_fpn_2x.tar) |
| ResNet101-FPN | Mask | 1 | 1x | 39.5 | 35.2 | [model](https://paddlemodels.bj.bcebos.com/object_detection/mask_rcnn_r101_fpn_1x.tar) | | ResNet101-FPN | Mask | 1 | 1x | 39.5 | 35.2 | [model](https://paddlemodels.bj.bcebos.com/object_detection/mask_rcnn_r101_fpn_1x.tar) |
| ResNet101-vd-FPN | Faster | 1 | 1x | 40.0 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r101_fpn_1x.tar) | | ResNet101-vd-FPN | Faster | 1 | 1x | 40.5 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r101_vd_fpn_1x.tar) |
| ResNet101-vd-FPN | Faster | 1 | 2x | 40.6 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r101_fpn_2x.tar) | | ResNet101-vd-FPN | Faster | 1 | 2x | 40.8 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r101_vd_fpn_2x.tar) |
| ResNeXt101-vd-FPN | Faster | 1 | 1x | 42.2 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_x101_vd_64x4d_fpn_1x.tar) |
| ResNeXt101-vd-FPN | Faster | 1 | 2x | 41.7 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_x101_vd_64x4d_fpn_2x.tar) |
| SENet154-vd-FPN | Faster | 1 | 1.44x | 42.9 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_se154_vd_fpn_s1x.tar) | | SENet154-vd-FPN | Faster | 1 | 1.44x | 42.9 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_se154_vd_fpn_s1x.tar) |
| SENet154-vd-FPN | Mask | 1 | 1.44x | 44.0 | 38.7 | [model](https://paddlemodels.bj.bcebos.com/object_detection/mask_rcnn_se154_vd_fpn_s1x.tar) | | SENet154-vd-FPN | Mask | 1 | 1.44x | 44.0 | 38.7 | [model](https://paddlemodels.bj.bcebos.com/object_detection/mask_rcnn_se154_vd_fpn_s1x.tar) |
......
...@@ -43,13 +43,14 @@ except Exception: ...@@ -43,13 +43,14 @@ except Exception:
if not check_type.__warning_sent__: if not check_type.__warning_sent__:
from ppdet.utils.cli import ColorTTY from ppdet.utils.cli import ColorTTY
color_tty = ColorTTY() color_tty = ColorTTY()
message = "typeguard is not installed, type checking is not available" message = "typeguard is not installed," \
+ " type checking is not available"
print(color_tty.yellow(message)) print(color_tty.yellow(message))
check_type.__warning_sent__ = True check_type.__warning_sent__ = True
check_type.__warning_sent__ = False check_type.__warning_sent__ = False
__all__ = ['SchemaValue', 'SchemaDict', 'extract_schema'] __all__ = ['SchemaValue', 'SchemaDict', 'SharedConfig', 'extract_schema']
class SchemaValue(object): class SchemaValue(object):
...@@ -160,6 +161,27 @@ class SchemaDict(dict): ...@@ -160,6 +161,27 @@ class SchemaDict(dict):
self.name, ", ".join(mismatch_keys))) self.name, ", ".join(mismatch_keys)))
class SharedConfig(object):
"""
Representation class for `__shared__` annotations, which work as follows:
- if `key` is set for the module in config file, its value will take
precedence
- if `key` is not set for the module but present in the config file, its
value will be used
- otherwise, use the provided `default_value` as fallback
Args:
key: config[key] will be injected
default_value: fallback value
"""
def __init__(self, key, default_value=None):
super(SharedConfig, self).__init__()
self.key = key
self.default_value = default_value
def extract_schema(cls): def extract_schema(cls):
""" """
Extract schema from a given class Extract schema from a given class
...@@ -216,6 +238,7 @@ def extract_schema(cls): ...@@ -216,6 +238,7 @@ def extract_schema(cls):
schema.strict = not has_kwargs schema.strict = not has_kwargs
schema.pymodule = importlib.import_module(cls.__module__) schema.pymodule = importlib.import_module(cls.__module__)
schema.inject = getattr(cls, '__inject__', []) schema.inject = getattr(cls, '__inject__', [])
schema.shared = getattr(cls, '__shared__', [])
for idx, name in enumerate(names): for idx, name in enumerate(names):
comment = name in comments and comments[name] or name comment = name in comments and comments[name] or name
if name in schema.inject: if name in schema.inject:
...@@ -223,8 +246,13 @@ def extract_schema(cls): ...@@ -223,8 +246,13 @@ def extract_schema(cls):
else: else:
type_ = name in annotations and annotations[name] or None type_ = name in annotations and annotations[name] or None
value_schema = SchemaValue(name, comment, type_) value_schema = SchemaValue(name, comment, type_)
if idx >= num_required: if name in schema.shared:
value_schema.set_default(defaults[idx - num_required]) assert idx >= num_required, "shared config must have default value"
default = defaults[idx - num_required]
value_schema.set_default(SharedConfig(name, default))
elif idx >= num_required:
default = defaults[idx - num_required]
value_schema.set_default(default)
schema.set_schema(name, value_schema) schema.set_schema(name, value_schema)
return schema return schema
...@@ -16,6 +16,7 @@ import importlib ...@@ -16,6 +16,7 @@ import importlib
import inspect import inspect
import yaml import yaml
from .schema import SharedConfig
__all__ = ['serializable', 'Callable'] __all__ = ['serializable', 'Callable']
...@@ -59,7 +60,8 @@ def _make_python_representer(cls): ...@@ -59,7 +60,8 @@ def _make_python_representer(cls):
def serializable(cls): def serializable(cls):
""" """
Add loader and dumper for given class, which must be "trivially serializable" Add loader and dumper for given class, which must be
"trivially serializable"
Args: Args:
cls: class to be serialized cls: class to be serialized
...@@ -72,6 +74,10 @@ def serializable(cls): ...@@ -72,6 +74,10 @@ def serializable(cls):
return cls return cls
yaml.add_representer(SharedConfig,
lambda d, o: d.represent_data(o.default_value))
@serializable @serializable
class Callable(object): class Callable(object):
""" """
......
...@@ -21,8 +21,9 @@ import os ...@@ -21,8 +21,9 @@ import os
import sys import sys
import yaml import yaml
import copy
from .config.schema import SchemaDict, extract_schema from .config.schema import SchemaDict, SharedConfig, extract_schema
from .config.yaml_helpers import serializable from .config.yaml_helpers import serializable
__all__ = [ __all__ = [
...@@ -135,7 +136,8 @@ def create(cls_or_name, **kwargs): ...@@ -135,7 +136,8 @@ def create(cls_or_name, **kwargs):
assert type(cls_or_name) in [type, str assert type(cls_or_name) in [type, str
], "should be a class or name of a class" ], "should be a class or name of a class"
name = type(cls_or_name) == str and cls_or_name or cls_or_name.__name__ name = type(cls_or_name) == str and cls_or_name or cls_or_name.__name__
assert name in global_config and isinstance(global_config[name], SchemaDict), \ assert name in global_config and \
isinstance(global_config[name], SchemaDict), \
"the module {} is not registered".format(name) "the module {} is not registered".format(name)
config = global_config[name] config = global_config[name]
config.update(kwargs) config.update(kwargs)
...@@ -144,9 +146,26 @@ def create(cls_or_name, **kwargs): ...@@ -144,9 +146,26 @@ def create(cls_or_name, **kwargs):
kwargs = {} kwargs = {}
kwargs.update(global_config[name]) kwargs.update(global_config[name])
# parse `shared` annotation of registered modules
if getattr(config, 'shared', None):
for k in config.shared:
target_key = config[k]
shared_conf = config.schema[k].default
assert isinstance(shared_conf, SharedConfig)
if target_key is not None and not isinstance(target_key,
SharedConfig):
continue # value is given for the module
elif shared_conf.key in global_config:
# `key` is present in config
kwargs[k] = global_config[shared_conf.key]
else:
kwargs[k] = shared_conf.default_value
# parse `inject` annotation of registered modules
if getattr(config, 'inject', None): if getattr(config, 'inject', None):
for k in config.inject: for k in config.inject:
target_key = global_config[name][k] target_key = config[k]
# optional dependency # optional dependency
if target_key is None: if target_key is None:
continue continue
...@@ -163,4 +182,7 @@ def create(cls_or_name, **kwargs): ...@@ -163,4 +182,7 @@ def create(cls_or_name, **kwargs):
kwargs[k] = target kwargs[k] = target
else: else:
raise ValueError("Unsupported injection type:", target_key) raise ValueError("Unsupported injection type:", target_key)
# prevent modification of global config values of reference types
# (e.g., list, dict) from within the created module instances
kwargs = copy.deepcopy(kwargs)
return cls(**kwargs) return cls(**kwargs)
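
Note on the `copy.deepcopy` call above: `kwargs.update(...)` copies only the top-level dict, so list and dict values are still the same objects as those in the global config. The sketch below reproduces the effect with a toy class; `ToyFPN`, `create_without_copy` and `create_with_copy` are hypothetical names used for illustration only, not part of the code base.

```python
import copy


class ToyFPN(object):
    """Toy module that modifies a list argument in place."""

    def __init__(self, spatial_scale):
        self.spatial_scale = spatial_scale
        self.spatial_scale.append(1.0)  # in-place change to the argument


def create_without_copy(config, name):
    kwargs = {}
    kwargs.update(config[name])  # shallow: list values are shared
    return ToyFPN(**kwargs)


def create_with_copy(config, name):
    kwargs = {}
    kwargs.update(config[name])
    kwargs = copy.deepcopy(kwargs)  # instance receives private copies
    return ToyFPN(**kwargs)


config = {'ToyFPN': {'spatial_scale': [0.25, 0.5]}}
create_with_copy(config, 'ToyFPN')
print(config['ToyFPN']['spatial_scale'])  # [0.25, 0.5], config untouched
create_without_copy(config, 'ToyFPN')
print(config['ToyFPN']['spatial_scale'])  # [0.25, 0.5, 1.0], config changed
```

Without the deep copy, every later instance of the same class would start from the already modified list.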
...@@ -181,7 +181,6 @@ class DataSet(object): ...@@ -181,7 +181,6 @@ class DataSet(object):
Args: Args:
annotation (str): annotation file path annotation (str): annotation file path
image_dir (str): directory where image files are stored image_dir (str): directory where image files are stored
num_classes (int): number of classes
shuffle (bool): shuffle samples shuffle (bool): shuffle samples
""" """
__source__ = 'RoiDbSource' __source__ = 'RoiDbSource'
......
...@@ -25,7 +25,7 @@ from paddle.fluid.regularizer import L2Decay ...@@ -25,7 +25,7 @@ from paddle.fluid.regularizer import L2Decay
from ppdet.modeling.ops import (AnchorGenerator, RetinaTargetAssign, from ppdet.modeling.ops import (AnchorGenerator, RetinaTargetAssign,
RetinaOutputDecoder) RetinaOutputDecoder)
from ppdet.core.workspace import register, serializable from ppdet.core.workspace import register
__all__ = ['RetinaHead'] __all__ = ['RetinaHead']
...@@ -52,6 +52,7 @@ class RetinaHead(object): ...@@ -52,6 +52,7 @@ class RetinaHead(object):
sigma (float): The parameter in smooth l1 loss sigma (float): The parameter in smooth l1 loss
""" """
__inject__ = ['anchor_generator', 'target_assign', 'output_decoder'] __inject__ = ['anchor_generator', 'target_assign', 'output_decoder']
__shared__ = ['num_classes']
def __init__(self, def __init__(self,
anchor_generator=AnchorGenerator().__dict__, anchor_generator=AnchorGenerator().__dict__,
...@@ -333,7 +334,6 @@ class RetinaHead(object): ...@@ -333,7 +334,6 @@ class RetinaHead(object):
cls_pred_reshape_list = output['cls_pred'] cls_pred_reshape_list = output['cls_pred']
bbox_pred_reshape_list = output['bbox_pred'] bbox_pred_reshape_list = output['bbox_pred']
anchor_reshape_list = output['anchor'] anchor_reshape_list = output['anchor']
anchor_var_reshape_list = output['anchor_var']
for i in range(self.max_level - self.min_level + 1): for i in range(self.max_level - self.min_level + 1):
cls_pred_reshape_list[i] = fluid.layers.sigmoid( cls_pred_reshape_list[i] = fluid.layers.sigmoid(
cls_pred_reshape_list[i]) cls_pred_reshape_list[i])
......
...@@ -64,7 +64,7 @@ class FasterRCNN(object): ...@@ -64,7 +64,7 @@ class FasterRCNN(object):
gt_box = feed_vars['gt_box'] gt_box = feed_vars['gt_box']
is_crowd = feed_vars['is_crowd'] is_crowd = feed_vars['is_crowd']
else: else:
im_shape = feed_vars['im_info'] im_shape = feed_vars['im_shape']
body_feats = self.backbone(im) body_feats = self.backbone(im)
body_feat_names = list(body_feats.keys()) body_feat_names = list(body_feats.keys())
......
...@@ -149,7 +149,11 @@ class MaskRCNN(object): ...@@ -149,7 +149,11 @@ class MaskRCNN(object):
cond = fluid.layers.less_than(x=bbox_size, y=size) cond = fluid.layers.less_than(x=bbox_size, y=size)
mask_pred = fluid.layers.create_global_var( mask_pred = fluid.layers.create_global_var(
shape=[1], value=0.0, dtype='float32', persistable=False) shape=[1],
value=0.0,
dtype='float32',
persistable=False,
name='mask_pred')
with fluid.layers.control_flow.Switch() as switch: with fluid.layers.control_flow.Switch() as switch:
with switch.case(cond): with switch.case(cond):
......
...@@ -56,8 +56,8 @@ class SSD(object): ...@@ -56,8 +56,8 @@ class SSD(object):
self.output_decoder = SSDOutputDecoder(**output_decoder) self.output_decoder = SSDOutputDecoder(**output_decoder)
if isinstance(metric, dict): if isinstance(metric, dict):
self.metric = SSDMetric(**metric) self.metric = SSDMetric(**metric)
def _forward(self, feed_vars, mode='train'): def build(self, feed_vars, mode='train'):
im = feed_vars['image'] im = feed_vars['image']
if mode == 'train' or mode == 'eval': if mode == 'train' or mode == 'eval':
gt_box = feed_vars['gt_box'] gt_box = feed_vars['gt_box']
...@@ -88,10 +88,16 @@ class SSD(object): ...@@ -88,10 +88,16 @@ class SSD(object):
return {'bbox': pred} return {'bbox': pred}
def train(self, feed_vars): def train(self, feed_vars):
return self._forward(feed_vars, 'train') return self.build(feed_vars, 'train')
def eval(self, feed_vars): def eval(self, feed_vars):
return self._forward(feed_vars, 'eval') return self.build(feed_vars, 'eval')
def test(self, feed_vars): def test(self, feed_vars):
return self._forward(feed_vars, 'test') return self.build(feed_vars, 'test')
def is_bbox_normalized(self):
# SSD uses output_decoder in its output layers, so bboxes are normalized
# to the range [0, 1]; is_bbox_normalized is used in infer.py
return True
...@@ -119,6 +119,7 @@ class ResNet(object): ...@@ -119,6 +119,7 @@ class ResNet(object):
regularizer=L2Decay(norm_decay)) regularizer=L2Decay(norm_decay))
if self.norm_type in ['bn', 'sync_bn']: if self.norm_type in ['bn', 'sync_bn']:
global_stats = True if self.freeze_norm else False
out = fluid.layers.batch_norm( out = fluid.layers.batch_norm(
input=conv, input=conv,
act=act, act=act,
...@@ -126,7 +127,8 @@ class ResNet(object): ...@@ -126,7 +127,8 @@ class ResNet(object):
param_attr=pattr, param_attr=pattr,
bias_attr=battr, bias_attr=battr,
moving_mean_name=bn_name + '_mean', moving_mean_name=bn_name + '_mean',
moving_variance_name=bn_name + '_variance', ) moving_variance_name=bn_name + '_variance',
use_global_stats=global_stats)
scale = fluid.framework._get_var(pattr.name) scale = fluid.framework._get_var(pattr.name)
bias = fluid.framework._get_var(battr.name) bias = fluid.framework._get_var(battr.name)
elif self.norm_type == 'affine_channel': elif self.norm_type == 'affine_channel':
...@@ -330,7 +332,6 @@ class ResNetC5(ResNet): ...@@ -330,7 +332,6 @@ class ResNetC5(ResNet):
norm_decay=0., norm_decay=0.,
variant='b', variant='b',
feature_maps=[5]): feature_maps=[5]):
super(ResNetC5, self).__init__( super(ResNetC5, self).__init__(depth, freeze_at, norm_type, freeze_norm,
depth, freeze_at, norm_type, freeze_norm, norm_decay, norm_decay, variant, feature_maps)
variant, feature_maps)
self.severed_head = True self.severed_head = True
...@@ -88,6 +88,7 @@ class GenerateProposals(object): ...@@ -88,6 +88,7 @@ class GenerateProposals(object):
class MaskAssigner(object): class MaskAssigner(object):
__op__ = fluid.layers.generate_mask_labels __op__ = fluid.layers.generate_mask_labels
__append_doc__ = True __append_doc__ = True
__shared__ = ['num_classes']
def __init__(self, num_classes=81, resolution=14): def __init__(self, num_classes=81, resolution=14):
super(MaskAssigner, self).__init__() super(MaskAssigner, self).__init__()
...@@ -123,6 +124,7 @@ class MultiClassNMS(object): ...@@ -123,6 +124,7 @@ class MultiClassNMS(object):
class BBoxAssigner(object): class BBoxAssigner(object):
__op__ = fluid.layers.generate_proposal_labels __op__ = fluid.layers.generate_proposal_labels
__append_doc__ = True __append_doc__ = True
__shared__ = ['num_classes']
def __init__(self, def __init__(self,
batch_size_per_im=512, batch_size_per_im=512,
......
...@@ -92,12 +92,13 @@ class BBoxHead(object): ...@@ -92,12 +92,13 @@ class BBoxHead(object):
RCNN bbox head RCNN bbox head
Args: Args:
head (object): the head module instance, e.g., `ResNetC5` or `TwoFCHead` head (object): the head module instance, e.g., `ResNetC5`, `TwoFCHead`
box_coder (object): `BoxCoder` instance box_coder (object): `BoxCoder` instance
nms (object): `MultiClassNMS` instance nms (object): `MultiClassNMS` instance
num_classes: number of output classes num_classes: number of output classes
""" """
__inject__ = ['head', 'box_coder', 'nms'] __inject__ = ['head', 'box_coder', 'nms']
__shared__ = ['num_classes']
def __init__(self, def __init__(self,
head, head,
......
...@@ -37,6 +37,7 @@ class CascadeBBoxHead(object): ...@@ -37,6 +37,7 @@ class CascadeBBoxHead(object):
num_classes: number of output classes num_classes: number of output classes
""" """
__inject__ = ['head', 'nms'] __inject__ = ['head', 'nms']
__shared__ = ['num_classes']
def __init__(self, head, nms=MultiClassNMS().__dict__, num_classes=81): def __init__(self, head, nms=MultiClassNMS().__dict__, num_classes=81):
super(CascadeBBoxHead, self).__init__() super(CascadeBBoxHead, self).__init__()
...@@ -196,7 +197,8 @@ class CascadeBBoxHead(object): ...@@ -196,7 +197,8 @@ class CascadeBBoxHead(object):
# only use fg box delta to decode box # only use fg box delta to decode box
bbox_pred_new = fluid.layers.slice( bbox_pred_new = fluid.layers.slice(
bbox_pred_new, axes=[1], starts=[1], ends=[2]) bbox_pred_new, axes=[1], starts=[1], ends=[2])
bbox_pred_new = fluid.layers.expand(bbox_pred_new, [1, self.num_classes, 1]) bbox_pred_new = fluid.layers.expand(bbox_pred_new,
[1, self.num_classes, 1])
decoded_box = fluid.layers.box_coder( decoded_box = fluid.layers.box_coder(
prior_box=proposals_boxes, prior_box=proposals_boxes,
prior_box_var=bbox_reg_w, prior_box_var=bbox_reg_w,
......
...@@ -38,6 +38,8 @@ class MaskHead(object): ...@@ -38,6 +38,8 @@ class MaskHead(object):
num_classes (int): number of output classes num_classes (int): number of output classes
""" """
__shared__ = ['num_classes']
def __init__(self, def __init__(self,
num_convs=0, num_convs=0,
num_chan_reduced=256, num_chan_reduced=256,
......
...@@ -26,6 +26,8 @@ __all__ = ['BBoxAssigner', 'MaskAssigner', 'CascadeBBoxAssigner'] ...@@ -26,6 +26,8 @@ __all__ = ['BBoxAssigner', 'MaskAssigner', 'CascadeBBoxAssigner']
@register @register
class CascadeBBoxAssigner(object): class CascadeBBoxAssigner(object):
__shared__ = ['num_classes']
def __init__(self, def __init__(self,
batch_size_per_im=512, batch_size_per_im=512,
fg_fraction=.25, fg_fraction=.25,
......
...@@ -65,7 +65,7 @@ class ArgsParser(ArgumentParser): ...@@ -65,7 +65,7 @@ class ArgsParser(ArgumentParser):
s = s.strip() s = s.strip()
k, v = s.split('=') k, v = s.split('=')
if '.' not in k: if '.' not in k:
config[k] = v config[k] = yaml.load(v, Loader=yaml.Loader)
else: else:
keys = k.split('.') keys = k.split('.')
config[keys[0]] = {} config[keys[0]] = {}
......
...@@ -144,13 +144,13 @@ def mask2out(results, clsid2catid, resolution, thresh_binarize=0.5): ...@@ -144,13 +144,13 @@ def mask2out(results, clsid2catid, resolution, thresh_binarize=0.5):
continue continue
masks = t['mask'][0] masks = t['mask'][0]
im_shape = t['im_shape'][0][0]
s = 0 s = 0
# for each sample # for each sample
for i in range(len(lengths)): for i in range(len(lengths)):
num = lengths[i] num = lengths[i]
im_id = int(im_ids[i][0]) im_id = int(im_ids[i][0])
im_shape = t['im_shape'][0][i]
bbox = bboxes[s:s + num][:, 2:] bbox = bboxes[s:s + num][:, 2:]
clsid_scores = bboxes[s:s + num][:, 0:2] clsid_scores = bboxes[s:s + num][:, 0:2]
......
...@@ -82,6 +82,47 @@ def get_test_images(infer_dir, infer_img): ...@@ -82,6 +82,47 @@ def get_test_images(infer_dir, infer_img):
return images return images
def prune_feed_vars(feeded_var_names, target_vars, prog):
"""
Filter out feed variables which are not in program,
pruned feed variables are only used in post processing
on model output, which are not used in program, such
as im_id to identify image order, im_shape to clip bbox
in image.
"""
exist_var_names = []
prog = prog.clone()
prog = prog._prune(targets=target_vars)
global_block = prog.global_block()
for name in feeded_var_names:
try:
v = global_block.var(name)
exist_var_names.append(v.name)
except Exception:
logger.info('save_inference_model pruned unused feed '
'variables {}'.format(name))
pass
return exist_var_names
def save_infer_model(FLAGS, exe, feed_vars, test_fetches, infer_prog):
cfg_name = os.path.basename(FLAGS.config).split('.')[0]
save_dir = os.path.join(FLAGS.output_dir, cfg_name)
feeded_var_names = [var.name for var in feed_vars.values()]
target_vars = test_fetches.values()
feeded_var_names = prune_feed_vars(feeded_var_names, target_vars, infer_prog)
logger.info("Save inference model to {}, input: {}, output: "
"{}...".format(save_dir, feeded_var_names,
[var.name for var in target_vars]))
fluid.io.save_inference_model(
save_dir,
feeded_var_names=feeded_var_names,
target_vars=target_vars,
executor=exe,
main_program=infer_prog,
params_filename="__params__")
def main(): def main():
cfg = load_config(FLAGS.config) cfg = load_config(FLAGS.config)
...@@ -143,6 +184,12 @@ def main(): ...@@ -143,6 +184,12 @@ def main():
clsid2catid, catid2name = get_category_info(anno_file, with_background, clsid2catid, catid2name = get_category_info(anno_file, with_background,
use_default_label) use_default_label)
# whether output bbox is normalized in model output layer
is_bbox_normalized = False
if hasattr(model, 'is_bbox_normalized') and \
callable(model.is_bbox_normalized):
is_bbox_normalized = model.is_bbox_normalized()
imid2path = reader.imid2path imid2path = reader.imid2path
for iter_id, data in enumerate(reader()): for iter_id, data in enumerate(reader()):
outs = exe.run(infer_prog, outs = exe.run(infer_prog,
...@@ -157,7 +204,6 @@ def main(): ...@@ -157,7 +204,6 @@ def main():
bbox_results = None bbox_results = None
mask_results = None mask_results = None
is_bbox_normalized = True if cfg.metric == 'VOC' else False
if 'bbox' in res: if 'bbox' in res:
bbox_results = bbox2out([res], clsid2catid, is_bbox_normalized) bbox_results = bbox2out([res], clsid2catid, is_bbox_normalized)
if 'mask' in res: if 'mask' in res:
......
...@@ -86,7 +86,6 @@ def main(): ...@@ -86,7 +86,6 @@ def main():
place = fluid.CUDAPlace(0) if cfg.use_gpu else fluid.CPUPlace() place = fluid.CUDAPlace(0) if cfg.use_gpu else fluid.CPUPlace()
exe = fluid.Executor(place) exe = fluid.Executor(place)
model = create(main_arch)
lr_builder = create('LearningRate') lr_builder = create('LearningRate')
optim_builder = create('OptimizerBuilder') optim_builder = create('OptimizerBuilder')
...@@ -95,6 +94,7 @@ def main(): ...@@ -95,6 +94,7 @@ def main():
train_prog = fluid.Program() train_prog = fluid.Program()
with fluid.program_guard(train_prog, startup_prog): with fluid.program_guard(train_prog, startup_prog):
with fluid.unique_name.guard(): with fluid.unique_name.guard():
model = create(main_arch)
train_pyreader, feed_vars = create_feed(train_feed) train_pyreader, feed_vars = create_feed(train_feed)
train_fetches = model.train(feed_vars) train_fetches = model.train(feed_vars)
loss = train_fetches['loss'] loss = train_fetches['loss']
...@@ -113,6 +113,7 @@ def main(): ...@@ -113,6 +113,7 @@ def main():
eval_prog = fluid.Program() eval_prog = fluid.Program()
with fluid.program_guard(eval_prog, startup_prog): with fluid.program_guard(eval_prog, startup_prog):
with fluid.unique_name.guard(): with fluid.unique_name.guard():
model = create(main_arch)
eval_pyreader, feed_vars = create_feed(eval_feed) eval_pyreader, feed_vars = create_feed(eval_feed)
fetches = model.eval(feed_vars) fetches = model.eval(feed_vars)
eval_prog = eval_prog.clone(True) eval_prog = eval_prog.clone(True)
...@@ -120,8 +121,9 @@ def main(): ...@@ -120,8 +121,9 @@ def main():
eval_reader = create_reader(eval_feed) eval_reader = create_reader(eval_feed)
eval_pyreader.decorate_sample_list_generator(eval_reader, place) eval_pyreader.decorate_sample_list_generator(eval_reader, place)
# parse train fetches # parse eval fetches
extra_keys = ['im_info', 'im_id'] if cfg.metric == 'COCO' else [] extra_keys = ['im_info', 'im_id',
'im_shape'] if cfg.metric == 'COCO' else []
eval_keys, eval_values, eval_cls = parse_fetches(fetches, eval_prog, eval_keys, eval_values, eval_cls = parse_fetches(fetches, eval_prog,
extra_keys) extra_keys)
...@@ -132,7 +134,7 @@ def main(): ...@@ -132,7 +134,7 @@ def main():
sync_bn = getattr(model.backbone, 'norm_type', None) == 'sync_bn' sync_bn = getattr(model.backbone, 'norm_type', None) == 'sync_bn'
# only enable sync_bn in multi GPU devices # only enable sync_bn in multi GPU devices
build_strategy.sync_batch_norm = sync_bn and devices_num > 1 \ build_strategy.sync_batch_norm = sync_bn and devices_num > 1 \
and cfg.use_gpu and cfg.use_gpu
train_compile_program = fluid.compiler.CompiledProgram( train_compile_program = fluid.compiler.CompiledProgram(
train_prog).with_data_parallel( train_prog).with_data_parallel(
loss_name=loss.name, build_strategy=build_strategy) loss_name=loss.name, build_strategy=build_strategy)
...@@ -141,12 +143,12 @@ def main(): ...@@ -141,12 +143,12 @@ def main():
exe.run(startup_prog) exe.run(startup_prog)
freeze_bn = getattr(model.backbone, 'freeze_norm', False) fuse_bn = getattr(model.backbone, 'norm_type', None) == 'affine_channel'
start_iter = 0 start_iter = 0
if FLAGS.resume_checkpoint: if FLAGS.resume_checkpoint:
checkpoint.load_checkpoint(exe, train_prog, FLAGS.resume_checkpoint) checkpoint.load_checkpoint(exe, train_prog, FLAGS.resume_checkpoint)
start_iter = checkpoint.global_step() start_iter = checkpoint.global_step()
elif cfg.pretrain_weights and freeze_bn: elif cfg.pretrain_weights and fuse_bn:
checkpoint.load_and_fusebn(exe, train_prog, cfg.pretrain_weights) checkpoint.load_and_fusebn(exe, train_prog, cfg.pretrain_weights)
elif cfg.pretrain_weights: elif cfg.pretrain_weights:
checkpoint.load_pretrain(exe, train_prog, cfg.pretrain_weights) checkpoint.load_pretrain(exe, train_prog, cfg.pretrain_weights)
......
...@@ -91,6 +91,7 @@ tar -xf vgg_ilsvrc_16_fc_reduced.tar.gz && rm -f vgg_ilsvrc_16_fc_reduced.tar.gz ...@@ -91,6 +91,7 @@ tar -xf vgg_ilsvrc_16_fc_reduced.tar.gz && rm -f vgg_ilsvrc_16_fc_reduced.tar.gz
python -u train.py --batch_size=16 --pretrained_model=vgg_ilsvrc_16_fc_reduced python -u train.py --batch_size=16 --pretrained_model=vgg_ilsvrc_16_fc_reduced
``` ```
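- Set `export CUDA_VISIBLE_DEVICES=0,1,2,3` to specify the number of GPUs to use; `batch_size` defaults to 12 or 16. - Set `export CUDA_VISIBLE_DEVICES=0,1,2,3` to specify the number of GPUs to use; `batch_size` defaults to 12 or 16.
- **Note**: when training on **Windows**, set `--use_multiprocess=False`, because using Python multiprocessing to accelerate training raises errors on Windows.
- For more optional arguments, see: - For more optional arguments, see: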
```bash ```bash
python train.py --help python train.py --help
......
...@@ -280,14 +280,25 @@ def train_generator(settings, file_list, batch_size, shuffle=True): ...@@ -280,14 +280,25 @@ def train_generator(settings, file_list, batch_size, shuffle=True):
return reader return reader
def train(settings, file_list, batch_size, shuffle=True, num_workers=8): def train(settings,
file_list,
batch_size,
shuffle=True,
use_multiprocess=True,
num_workers=8):
file_lists = load_file_list(file_list) file_lists = load_file_list(file_list)
n = int(math.ceil(len(file_lists) // num_workers)) if use_multiprocess:
split_lists = [file_lists[i:i + n] for i in range(0, len(file_lists), n)] n = int(math.ceil(len(file_lists) // num_workers))
readers = [] split_lists = [
for iterm in split_lists: file_lists[i:i + n] for i in range(0, len(file_lists), n)
readers.append(train_generator(settings, iterm, batch_size, shuffle)) ]
return paddle.reader.multiprocess_reader(readers, False) readers = []
for iterm in split_lists:
readers.append(
train_generator(settings, iterm, batch_size, shuffle))
return paddle.reader.multiprocess_reader(readers, False)
else:
return train_generator(settings, file_lists, batch_size, shuffle)
def test(settings, file_list): def test(settings, file_list):
......
...@@ -9,6 +9,20 @@ import time ...@@ -9,6 +9,20 @@ import time
import argparse import argparse
import functools import functools
def set_paddle_flags(**kwargs):
for key, value in kwargs.items():
if os.environ.get(key, None) is None:
os.environ[key] = str(value)
# NOTE(paddle-dev): All of these flags should be
# set before `import paddle`. Otherwise, it would
# not take any effect.
set_paddle_flags(
FLAGS_eager_delete_tensor_gb=0, # enable GC to save memory
)
import paddle import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
from pyramidbox import PyramidBox from pyramidbox import PyramidBox
...@@ -32,6 +46,7 @@ add_arg('mean_BGR', str, '104., 117., 123.', "Mean value for B,G,R cha ...@@ -32,6 +46,7 @@ add_arg('mean_BGR', str, '104., 117., 123.', "Mean value for B,G,R cha
add_arg('with_mem_opt', bool, True, "Whether to use memory optimization or not.") add_arg('with_mem_opt', bool, True, "Whether to use memory optimization or not.")
add_arg('pretrained_model', str, './vgg_ilsvrc_16_fc_reduced/', "The init model path.") add_arg('pretrained_model', str, './vgg_ilsvrc_16_fc_reduced/', "The init model path.")
add_arg('data_dir', str, 'data', "The base dir of dataset") add_arg('data_dir', str, 'data', "The base dir of dataset")
add_arg('use_multiprocess', bool, True, "Whether use multi-process for data preprocessing.")
parser.add_argument('--enable_ce', action='store_true', help='If set, run the task with continuous evaluation logs.') parser.add_argument('--enable_ce', action='store_true', help='If set, run the task with continuous evaluation logs.')
parser.add_argument('--batch_num', type=int, help="batch num for ce") parser.add_argument('--batch_num', type=int, help="batch num for ce")
parser.add_argument('--num_devices', type=int, default=1, help='Number of GPU devices') parser.add_argument('--num_devices', type=int, default=1, help='Number of GPU devices')
...@@ -163,7 +178,8 @@ def train(args, config, train_params, train_file_list): ...@@ -163,7 +178,8 @@ def train(args, config, train_params, train_file_list):
train_file_list, train_file_list,
batch_size_per_device, batch_size_per_device,
shuffle = is_shuffle, shuffle = is_shuffle,
num_workers = num_workers) use_multiprocess=args.use_multiprocess,
num_workers=num_workers)
train_py_reader.decorate_paddle_reader(train_reader) train_py_reader.decorate_paddle_reader(train_reader)
if args.parallel: if args.parallel:
......
...@@ -23,7 +23,7 @@ SSD is readily pluggable into a wide variant standard convolutional network, suc ...@@ -23,7 +23,7 @@ SSD is readily pluggable into a wide variant standard convolutional network, suc
Please download the [PASCAL VOC dataset](http://host.robots.ox.ac.uk/pascal/VOC/) first; skip this step if you already have it. Please download the [PASCAL VOC dataset](http://host.robots.ox.ac.uk/pascal/VOC/) first; skip this step if you already have it.
```bash ```
cd data/pascalvoc cd data/pascalvoc
./download.sh ./download.sh
``` ```
...@@ -36,7 +36,7 @@ The command `download.sh` also will create training and testing file lists. ...@@ -36,7 +36,7 @@ The command `download.sh` also will create training and testing file lists.
We provide two pre-trained models. The first is MobileNet-v1 SSD trained on the COCO dataset, with the COCO-specific convolutional predictors removed. This model can be used to initialize training on other datasets, such as PASCAL VOC. The second is MobileNet-v1 trained on the ImageNet 2012 dataset, with the weights and bias of the last fully-connected layer removed. Download MobileNet-v1 SSD: We provide two pre-trained models. The first is MobileNet-v1 SSD trained on the COCO dataset, with the COCO-specific convolutional predictors removed. This model can be used to initialize training on other datasets, such as PASCAL VOC. The second is MobileNet-v1 trained on the ImageNet 2012 dataset, with the weights and bias of the last fully-connected layer removed. Download MobileNet-v1 SSD:
```bash ```
./pretrained/download_coco.sh ./pretrained/download_coco.sh
``` ```
...@@ -46,13 +46,14 @@ Declaration: the MobileNet-v1 SSD model is converted by [TensorFlow model](https ...@@ -46,13 +46,14 @@ Declaration: the MobileNet-v1 SSD model is converted by [TensorFlow model](https
#### Train on PASCAL VOC #### Train on PASCAL VOC
`train.py` is the main caller of the training module. Examples of usage are shown below. `train.py` is the main caller of the training module. Examples of usage are shown below.
```bash ```
python -u train.py --batch_size=64 --dataset='pascalvoc' --pretrained_model='pretrained/ssd_mobilenet_v1_coco/' python -u train.py --batch_size=64 --dataset=pascalvoc --pretrained_model=pretrained/ssd_mobilenet_v1_coco/
``` ```
- Set ```export CUDA_VISIBLE_DEVICES=0,1``` to specify which GPUs to use. - Set ```export CUDA_VISIBLE_DEVICES=0,1``` to specify which GPUs to use.
- **Note**: set `--use_multiprocess=False` when training on **Windows**, because using Python multiprocessing to accelerate data processing still has unresolved problems there.
- For more help on arguments: - For more help on arguments:
```bash ```
python train.py --help python train.py --help
``` ```
...@@ -69,14 +70,14 @@ We used RMSProp optimizer with mini-batch size 64 to train the MobileNet-SSD. Th ...@@ -69,14 +70,14 @@ We used RMSProp optimizer with mini-batch size 64 to train the MobileNet-SSD. Th
You can evaluate your trained model with different metrics, such as 11point and integral, on both the PASCAL VOC and COCO datasets. The default test list is the dataset's test/val list; to use your own test list, set the ```--test_list``` argument. You can evaluate your trained model with different metrics, such as 11point and integral, on both the PASCAL VOC and COCO datasets. The default test list is the dataset's test/val list; to use your own test list, set the ```--test_list``` argument.
`eval.py` is the main caller of the evaluating module. Examples of usage are shown below. `eval.py` is the main caller of the evaluating module. Examples of usage are shown below.
```bash ```
python eval.py --dataset='pascalvoc' --model_dir='train_pascal_model/best_model' --data_dir='data/pascalvoc' --test_list='test.txt' --ap_version='11point' --nms_threshold=0.45 python eval.py --dataset=pascalvoc --model_dir=model/best_model --data_dir=data/pascalvoc --test_list=test.txt
``` ```
### Infer and Visualize ### Infer and Visualize
`infer.py` is the main caller of the inferring module. Examples of usage are shown below. `infer.py` is the main caller of the inferring module. Examples of usage are shown below.
```bash ```
python infer.py --dataset='pascalvoc' --nms_threshold=0.45 --model_dir='train_pascal_model/best_model' --image_path='./data/pascalvoc/VOCdevkit/VOC2007/JPEGImages/009963.jpg' python infer.py --dataset=pascalvoc --nms_threshold=0.45 --model_dir=model/best_model --image_path=./data/pascalvoc/VOCdevkit/VOC2007/JPEGImages/009963.jpg
``` ```
Below are the examples of running the inference and visualizing the model result. Below are the examples of running the inference and visualizing the model result.
<p align="center"> <p align="center">
......
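...@@ -24,7 +24,7 @@ SSD can be conveniently plugged into any standard convolutional network, such as VGG, Res ...@@ -24,7 +24,7 @@ SSD can be conveniently plugged into any standard convolutional network, such as VGG, Res
First download the [PASCAL VOC dataset](http://host.robots.ox.ac.uk/pascal/VOC/) with the commands below. First download the [PASCAL VOC dataset](http://host.robots.ox.ac.uk/pascal/VOC/) with the commands below.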
```bash ```
cd data/pascalvoc cd data/pascalvoc
./download.sh ./download.sh
``` ```
...@@ -38,7 +38,7 @@ cd data/pascalvoc ...@@ -38,7 +38,7 @@ cd data/pascalvoc
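We provide two pre-trained models. The first is MobileNet-v1 SSD pre-trained on the COCO dataset, with its prediction heads removed so that it can be trained on datasets other than COCO. The second is MobileNet-v1 pre-trained on the ImageNet 2012 dataset, with the final fully-connected layer also removed so that it can be used for object detection training. Download MobileNet-v1 SSD: We provide two pre-trained models. The first is MobileNet-v1 SSD pre-trained on the COCO dataset, with its prediction heads removed so that it can be trained on datasets other than COCO. The second is MobileNet-v1 pre-trained on the ImageNet 2012 dataset, with the final fully-connected layer also removed so that it can be used for object detection training. Download MobileNet-v1 SSD: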
```bash ```
./pretrained/download_coco.sh ./pretrained/download_coco.sh
``` ```
...@@ -48,13 +48,14 @@ cd data/pascalvoc ...@@ -48,13 +48,14 @@ cd data/pascalvoc
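#### Training #### Training
`train.py` is the main program of the training module. An example invocation: `train.py` is the main program of the training module. An example invocation: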
```bash ```
python -u train.py --batch_size=64 --dataset='pascalvoc' --pretrained_model='pretrained/ssd_mobilenet_v1_coco/' python -u train.py --batch_size=64 --dataset=pascalvoc --pretrained_model=pretrained/ssd_mobilenet_v1_coco/
``` ```
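- Set ```export CUDA_VISIBLE_DEVICES=0,1``` to specify which GPUs to use. - Set ```export CUDA_VISIBLE_DEVICES=0,1``` to specify which GPUs to use.
- **Note**: when training on **Windows**, set `--use_multiprocess=False`, because using Python multiprocessing to accelerate training raises errors on Windows.
- For more optional arguments, see: - For more optional arguments, see: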
```bash ```
python train.py --help python train.py --help
``` ```
...@@ -71,15 +72,16 @@ cd data/pascalvoc ...@@ -71,15 +72,16 @@ cd data/pascalvoc
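You can evaluate the trained model on the PASCAL VOC dataset with metrics such as 11point and integral. Without loss of generality, the sample code uses the dataset's own test list as the default; you can also specify your own test list by setting ```--test_list```. You can evaluate the trained model on the PASCAL VOC dataset with metrics such as 11point and integral. Without loss of generality, the sample code uses the dataset's own test list as the default; you can also specify your own test list by setting ```--test_list```.
`eval.py` is the main program of the evaluation module. An example invocation: `eval.py` is the main program of the evaluation module. An example invocation: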
```bash
python eval.py --dataset='pascalvoc' --model_dir='train_pascal_model/best_model' --data_dir='data/pascalvoc' --test_list='test.txt' --ap_version='11point' --nms_threshold=0.45 ```
python eval.py --dataset=pascalvoc --model_dir=model/best_model --data_dir=data/pascalvoc --test_list=test.txt
``` ```
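### Inference and visualization ### Inference and visualization
`infer.py` is the main program of the inference and visualization module. An example invocation: `infer.py` is the main program of the inference and visualization module. An example invocation: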
```bash ```
python infer.py --dataset='pascalvoc' --nms_threshold=0.45 --model_dir='train_pascal_model/best_model' --image_path='./data/pascalvoc/VOCdevkit/VOC2007/JPEGImages/009963.jpg' python infer.py --dataset=pascalvoc --nms_threshold=0.45 --model_dir=model/best_model --image_path=./data/pascalvoc/VOCdevkit/VOC2007/JPEGImages/009963.jpg
``` ```
The figures below visualize the model's predictions: The figures below visualize the model's predictions:
<p align="center"> <p align="center">
......
...@@ -283,6 +283,7 @@ def train(settings, ...@@ -283,6 +283,7 @@ def train(settings,
file_list, file_list,
batch_size, batch_size,
shuffle=True, shuffle=True,
use_multiprocess=True,
num_workers=8, num_workers=8,
enable_ce=False): enable_ce=False):
file_path = os.path.join(settings.data_dir, file_list) file_path = os.path.join(settings.data_dir, file_list)
...@@ -294,14 +295,15 @@ def train(settings, ...@@ -294,14 +295,15 @@ def train(settings,
image_ids = coco_api.getImgIds() image_ids = coco_api.getImgIds()
images = coco_api.loadImgs(image_ids) images = coco_api.loadImgs(image_ids)
np.random.shuffle(images) np.random.shuffle(images)
n = int(math.ceil(len(images) // num_workers))
image_lists = [images[i:i + n] for i in range(0, len(images), n)]
if '2014' in file_list: if '2014' in file_list:
sub_dir = "train2014" sub_dir = "train2014"
elif '2017' in file_list: elif '2017' in file_list:
sub_dir = "train2017" sub_dir = "train2017"
data_dir = os.path.join(settings.data_dir, sub_dir) data_dir = os.path.join(settings.data_dir, sub_dir)
n = int(math.ceil(len(images) // num_workers)) if use_multiprocess \
else len(images)
image_lists = [images[i:i + n] for i in range(0, len(images), n)]
for l in image_lists: for l in image_lists:
readers.append( readers.append(
coco(settings, coco_api, l, 'train', batch_size, shuffle, coco(settings, coco_api, l, 'train', batch_size, shuffle,
...@@ -309,11 +311,16 @@ def train(settings, ...@@ -309,11 +311,16 @@ def train(settings,
else: else:
images = [line.strip() for line in open(file_path)] images = [line.strip() for line in open(file_path)]
np.random.shuffle(images) np.random.shuffle(images)
n = int(math.ceil(len(images) // num_workers)) n = int(math.ceil(len(images) // num_workers)) if use_multiprocess \
else len(images)
image_lists = [images[i:i + n] for i in range(0, len(images), n)] image_lists = [images[i:i + n] for i in range(0, len(images), n)]
for l in image_lists: for l in image_lists:
readers.append(pascalvoc(settings, l, 'train', batch_size, shuffle)) readers.append(pascalvoc(settings, l, 'train', batch_size, shuffle))
return paddle.reader.multiprocess_reader(readers, False) print("use_multiprocess ", use_multiprocess)
if use_multiprocess:
return paddle.reader.multiprocess_reader(readers, False)
else:
return readers[0]
def test(settings, file_list, batch_size): def test(settings, file_list, batch_size):
......
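The reader change above amounts to this: with multiprocessing enabled, the shuffled image list is split into per-worker chunks; otherwise a single chunk holds every image and its reader is returned directly. The following is a minimal sketch of the splitting step only. The helper name `split_for_workers` is hypothetical and does not exist in the repository. The diff computes the chunk size as `math.ceil(len(images) // num_workers)`, where `//` has already rounded down before `ceil` is applied; the sketch uses true division instead so that at most `num_workers` chunks are produced. That is an illustrative variation, not the committed code.

```python
import math


def split_for_workers(images, num_workers, use_multiprocess=True):
    """Split `images` into chunks, one per worker (hypothetical helper)."""
    if not images:
        return []
    if use_multiprocess:
        # True division before ceil, so the chunk size is rounded up.
        n = int(math.ceil(len(images) / float(num_workers)))
    else:
        # Single-process mode: everything goes into one chunk.
        n = len(images)
    return [images[i:i + n] for i in range(0, len(images), n)]


chunks = split_for_workers(list(range(10)), num_workers=4)
single = split_for_workers(list(range(10)), num_workers=4,
                           use_multiprocess=False)
```

With ten images and four workers the chunk size is three, giving four chunks; in single-process mode the result is one chunk containing all ten images, which matches the `readers[0]` fallback in the diff.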
...@@ -7,6 +7,20 @@ import shutil ...@@ -7,6 +7,20 @@ import shutil
import math import math
import multiprocessing import multiprocessing
def set_paddle_flags(**kwargs):
for key, value in kwargs.items():
if os.environ.get(key, None) is None:
os.environ[key] = str(value)
# NOTE(paddle-dev): All of these flags should be
# set before `import paddle`. Otherwise, it would
# not take any effect.
set_paddle_flags(
FLAGS_eager_delete_tensor_gb=0, # enable GC to save memory
)
import paddle import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
import reader import reader
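One property of `set_paddle_flags` as written in the hunk above is that it only fills in defaults: a flag the user has already exported is left untouched. The sketch below reproduces the function body from the diff and exercises it with a simulated environment. The flag name `FLAGS_demo_only` is made up for illustration and is not a real Paddle flag.

```python
import os


def set_paddle_flags(**kwargs):
    # Same logic as in the diff: set a flag only if the environment
    # does not already define it.
    for key, value in kwargs.items():
        if os.environ.get(key, None) is None:
            os.environ[key] = str(value)


# Simulate a user who exported one flag before launching the script.
os.environ["FLAGS_eager_delete_tensor_gb"] = "1.0"
os.environ.pop("FLAGS_demo_only", None)  # hypothetical flag, illustration only

set_paddle_flags(FLAGS_eager_delete_tensor_gb=0, FLAGS_demo_only=5)

user_value = os.environ["FLAGS_eager_delete_tensor_gb"]  # user setting is kept
default_value = os.environ["FLAGS_demo_only"]            # default is applied
```

Values are stored as strings because environment variables are always strings.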
...@@ -28,6 +42,7 @@ add_arg('ap_version', str, '11point', "mAP version can be inte ...@@ -28,6 +42,7 @@ add_arg('ap_version', str, '11point', "mAP version can be inte
add_arg('image_shape', str, '3,300,300', "Input image shape.") add_arg('image_shape', str, '3,300,300', "Input image shape.")
add_arg('mean_BGR', str, '127.5,127.5,127.5', "Mean value for B,G,R channel which will be subtracted.") add_arg('mean_BGR', str, '127.5,127.5,127.5', "Mean value for B,G,R channel which will be subtracted.")
add_arg('data_dir', str, 'data/pascalvoc', "Data directory.") add_arg('data_dir', str, 'data/pascalvoc', "Data directory.")
add_arg('use_multiprocess', bool, True, "Whether use multi-process for data preprocessing.")
add_arg('enable_ce', bool, False, "Whether use CE to evaluate the model.") add_arg('enable_ce', bool, False, "Whether use CE to evaluate the model.")
#yapf: enable #yapf: enable
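The README tells Windows users to pass `--use_multiprocess=False`, which only works if the string `"False"` is converted to a boolean explicitly. Plain `argparse` with `type=bool` would treat any non-empty string as `True`. The `add_arg` helper is not shown in this diff, so the sketch below does not claim to reproduce it; it illustrates the general technique with a hypothetical `str2bool` converter.

```python
import argparse


def str2bool(text):
    """Parse common true/false spellings (hypothetical helper)."""
    value = text.strip().lower()
    if value in ("true", "t", "yes", "y", "1"):
        return True
    if value in ("false", "f", "no", "n", "0"):
        return False
    raise argparse.ArgumentTypeError("expected a boolean, got %r" % text)


parser = argparse.ArgumentParser()
parser.add_argument("--use_multiprocess", type=str2bool, default=True)

args_off = parser.parse_args(["--use_multiprocess=False"])
args_default = parser.parse_args([])

# The pitfall being avoided: bool() on a non-empty string is always True.
naive = bool("False")
```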
...@@ -185,14 +200,8 @@ def train(args, ...@@ -185,14 +200,8 @@ def train(args,
build_strategy.memory_optimize = True build_strategy.memory_optimize = True
train_exe = fluid.ParallelExecutor(main_program=train_prog, train_exe = fluid.ParallelExecutor(main_program=train_prog,
use_cuda=use_gpu, loss_name=loss.name, build_strategy=build_strategy) use_cuda=use_gpu, loss_name=loss.name, build_strategy=build_strategy)
train_reader = reader.train(data_args,
train_file_list,
batch_size_per_device,
shuffle=is_shuffle,
num_workers=num_workers,
enable_ce=enable_ce)
test_reader = reader.test(data_args, val_file_list, batch_size) test_reader = reader.test(data_args, val_file_list, batch_size)
train_py_reader.decorate_paddle_reader(train_reader)
test_py_reader.decorate_paddle_reader(test_reader) test_py_reader.decorate_paddle_reader(test_reader)
def save_model(postfix, main_prog): def save_model(postfix, main_prog):
...@@ -232,6 +241,7 @@ def train(args, ...@@ -232,6 +241,7 @@ def train(args,
train_file_list, train_file_list,
batch_size_per_device, batch_size_per_device,
shuffle=is_shuffle, shuffle=is_shuffle,
use_multiprocess=args.use_multiprocess,
num_workers=num_workers, num_workers=num_workers,
enable_ce=enable_ce) enable_ce=enable_ce)
train_py_reader.decorate_paddle_reader(train_reader) train_py_reader.decorate_paddle_reader(train_reader)
......