未验证 提交 f605beed 编写于 作者: W wangguanzhong 提交者: GitHub

[Cherry pick] fix bugs in release 1.5 (#2859)

* add mask fpn r50 2x & minor fix for MODEL ZOO doc (#2747)

* cherry-pick bug fix for release/1.5

* fix mask eval for multi-batch (#2772)

* cherry-pick bug fix for release/1.5

* fix faster rcnn use im_shape (#2779)

* cherry-pick bug fix for release/1.5

* refine bbox_normalize in infer.py (#2781)

* refine bbox_normalize in infer.py

* add is_bbox_normalize

* rename _forward to build

* check callable

* cherry-pick bug fix for release/1.5

* Prevent module instance from modifying global config (#2790)

some of the global config values are reference type, e.g., objects, lists or
dicts, if the created module instances modify those values (see FPN and
spatial_scale), global config will reflect these changes, and instances of the
same class created later will inherit the changed values

* Fix command line parsing for non module options (#2839)

* fix save_inference_model cannot find feed vars (#2842)

* fix save_inference_model cannot find feed vars

* remove comment

* fix format

* add comment for pruned var

* add log for prune

* Fix train+eval in PaddleDetection(#2847)

* resolve conflict merge error

* reset data_feed
上级 2db34817
......@@ -10,6 +10,7 @@ save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
weights: output/cascade_rcnn_r50_fpn_1x/model_final
metric: COCO
num_classes: 81
CascadeRCNN:
backbone: ResNet
......@@ -74,7 +75,6 @@ CascadeBBoxAssigner:
bg_thresh_hi: [0.5, 0.6, 0.7]
fg_thresh: [0.5, 0.6, 0.7]
fg_fraction: 0.25
num_classes: 81
CascadeBBoxHead:
head: FC6FC7Head
......@@ -82,7 +82,6 @@ CascadeBBoxHead:
keep_top_k: 100
nms_threshold: 0.5
score_threshold: 0.05
num_classes: 81
FC6FC7Head:
num_chan: 1024
......
......@@ -10,6 +10,7 @@ snapshot_iter: 10000
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_pretrained.tar
metric: COCO
weights: output/faster_rcnn_r101_1x/model_final
num_classes: 81
FasterRCNN:
backbone: ResNet
......@@ -64,7 +65,6 @@ BBoxAssigner:
bg_thresh_lo: 0.0
fg_fraction: 0.25
fg_thresh: 0.5
num_classes: 81
BBoxHead:
head: ResNetC5
......@@ -72,7 +72,6 @@ BBoxHead:
keep_top_k: 100
nms_threshold: 0.5
score_threshold: 0.05
num_classes: 81
LearningRate:
base_lr: 0.01
......
......@@ -10,6 +10,7 @@ save_dir: output
pretrain_weights: http://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_pretrained.tar
weights: output/faster_rcnn_r101_fpn_1x/model_final
metric: COCO
num_classes: 81
FasterRCNN:
backbone: ResNet
......@@ -73,7 +74,6 @@ BBoxAssigner:
bg_thresh_lo: 0.0
fg_fraction: 0.25
fg_thresh: 0.5
num_classes: 81
BBoxHead:
head: TwoFCHead
......@@ -81,7 +81,6 @@ BBoxHead:
keep_top_k: 100
nms_threshold: 0.5
score_threshold: 0.05
num_classes: 81
TwoFCHead:
num_chan: 1024
......
......@@ -10,6 +10,7 @@ save_dir: output
pretrain_weights: http://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_pretrained.tar
weights: output/faster_rcnn_r101_fpn_2x/model_final
metric: COCO
num_classes: 81
FasterRCNN:
backbone: ResNet
......@@ -73,7 +74,6 @@ BBoxAssigner:
bg_thresh_lo: 0.0
fg_fraction: 0.25
fg_thresh: 0.5
num_classes: 81
BBoxHead:
head: TwoFCHead
......@@ -81,7 +81,6 @@ BBoxHead:
keep_top_k: 100
nms_threshold: 0.5
score_threshold: 0.05
num_classes: 81
TwoFCHead:
num_chan: 1024
......
......@@ -10,6 +10,7 @@ save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_vd_pretrained.tar
weights: output/faster_rcnn_r101_vd_fpn_1x/model_final
metric: COCO
num_classes: 81
FasterRCNN:
backbone: ResNet
......@@ -74,7 +75,6 @@ BBoxAssigner:
bg_thresh_lo: 0.0
fg_fraction: 0.25
fg_thresh: 0.5
num_classes: 81
BBoxHead:
head: TwoFCHead
......@@ -82,7 +82,6 @@ BBoxHead:
keep_top_k: 100
nms_threshold: 0.5
score_threshold: 0.05
num_classes: 81
TwoFCHead:
num_chan: 1024
......
......@@ -10,6 +10,7 @@ save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_vd_pretrained.tar
weights: output/faster_rcnn_r101_vd_fpn_2x/model_final
metric: COCO
num_classes: 81
FasterRCNN:
backbone: ResNet
......@@ -74,7 +75,6 @@ BBoxAssigner:
bg_thresh_lo: 0.0
fg_fraction: 0.25
fg_thresh: 0.5
num_classes: 81
BBoxHead:
head: TwoFCHead
......@@ -82,7 +82,6 @@ BBoxHead:
keep_top_k: 100
nms_threshold: 0.5
score_threshold: 0.05
num_classes: 81
TwoFCHead:
num_chan: 1024
......
......@@ -10,6 +10,7 @@ snapshot_iter: 10000
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
metric: COCO
weights: output/faster_rcnn_r50_1x/model_final
num_classes: 81
FasterRCNN:
backbone: ResNet
......@@ -64,7 +65,6 @@ BBoxAssigner:
bg_thresh_lo: 0.0
fg_fraction: 0.25
fg_thresh: 0.5
num_classes: 81
BBoxHead:
head: ResNetC5
......@@ -72,7 +72,6 @@ BBoxHead:
keep_top_k: 100
nms_threshold: 0.5
score_threshold: 0.05
num_classes: 81
LearningRate:
base_lr: 0.01
......
......@@ -10,6 +10,7 @@ snapshot_iter: 10000
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
metric: COCO
weights: output/faster_rcnn_r50_2x/model_final
num_classes: 81
FasterRCNN:
backbone: ResNet
......@@ -64,7 +65,6 @@ BBoxAssigner:
bg_thresh_lo: 0.0
fg_fraction: 0.25
fg_thresh: 0.5
num_classes: 81
BBoxHead:
head: ResNetC5
......@@ -72,7 +72,6 @@ BBoxHead:
keep_top_k: 100
nms_threshold: 0.5
score_threshold: 0.05
num_classes: 81
LearningRate:
base_lr: 0.01
......
......@@ -9,7 +9,8 @@ log_smooth_window: 20
save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
metric: COCO
weights: output/fpn/faster_rcnn_r50_fpn_1x/model_final
weights: output/faster_rcnn_r50_fpn_1x/model_final
num_classes: 81
FasterRCNN:
backbone: ResNet
......@@ -74,7 +75,6 @@ BBoxAssigner:
bg_thresh_hi: 0.5
fg_fraction: 0.25
fg_thresh: 0.5
num_classes: 81
BBoxHead:
head: TwoFCHead
......@@ -82,7 +82,6 @@ BBoxHead:
keep_top_k: 100
nms_threshold: 0.5
score_threshold: 0.05
num_classes: 81
TwoFCHead:
num_chan: 1024
......
......@@ -10,6 +10,7 @@ save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
metric: COCO
weights: output/faster_rcnn_r50_fpn_2x/model_final
num_classes: 81
FasterRCNN:
backbone: ResNet
......@@ -74,7 +75,6 @@ BBoxAssigner:
bg_thresh_hi: 0.5
fg_fraction: 0.25
fg_thresh: 0.5
num_classes: 81
BBoxHead:
head: TwoFCHead
......@@ -82,7 +82,6 @@ BBoxHead:
keep_top_k: 100
nms_threshold: 0.5
score_threshold: 0.05
num_classes: 81
TwoFCHead:
num_chan: 1024
......
......@@ -10,6 +10,7 @@ snapshot_iter: 10000
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_vd_pretrained.tar
metric: COCO
weights: output/faster_rcnn_r50_vd_1x/model_final
num_classes: 81
FasterRCNN:
backbone: ResNet
......@@ -66,7 +67,6 @@ BBoxAssigner:
bg_thresh_lo: 0.0
fg_fraction: 0.25
fg_thresh: 0.5
num_classes: 81
BBoxHead:
head: ResNetC5
......@@ -74,7 +74,6 @@ BBoxHead:
keep_top_k: 100
nms_threshold: 0.5
score_threshold: 0.05
num_classes: 81
LearningRate:
base_lr: 0.01
......
......@@ -10,6 +10,7 @@ save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_vd_pretrained.tar
weights: output/faster_rcnn_r50_vd_fpn_2x/model_final
metric: COCO
num_classes: 81
FasterRCNN:
backbone: ResNet
......@@ -74,7 +75,6 @@ BBoxAssigner:
bg_thresh_lo: 0.0
fg_fraction: 0.25
fg_thresh: 0.5
num_classes: 81
BBoxHead:
head: TwoFCHead
......@@ -82,7 +82,6 @@ BBoxHead:
keep_top_k: 100
nms_threshold: 0.5
score_threshold: 0.05
num_classes: 81
TwoFCHead:
num_chan: 1024
......
......@@ -10,6 +10,7 @@ save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/SE154_vd_pretrained.tar
weights: output/faster_rcnn_se154_vd_fpn_s1x/model_final
metric: COCO
num_classes: 81
FasterRCNN:
backbone: SENet
......@@ -76,7 +77,6 @@ BBoxAssigner:
bg_thresh_lo: 0.0
fg_fraction: 0.25
fg_thresh: 0.5
num_classes: 81
BBoxHead:
head: TwoFCHead
......@@ -84,7 +84,6 @@ BBoxHead:
keep_top_k: 100
nms_threshold: 0.5
score_threshold: 0.05
num_classes: 81
TwoFCHead:
num_chan: 1024
......
architecture: FasterRCNN
train_feed: FasterRCNNTrainFeed
eval_feed: FasterRCNNEvalFeed
test_feed: FasterRCNNTestFeed
max_iters: 180000
snapshot_iter: 10000
use_gpu: true
log_smooth_window: 20
save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNeXt101_vd_64x4d_pretrained.tar
weights: output/faster_rcnn_x101_vd_64x4d_fpn_1x/model_final
metric: COCO
num_classes: 81
FasterRCNN:
backbone: ResNeXt
fpn: FPN
rpn_head: FPNRPNHead
roi_extractor: FPNRoIAlign
bbox_head: BBoxHead
bbox_assigner: BBoxAssigner
ResNeXt:
depth: 101
feature_maps: [2, 3, 4, 5]
freeze_at: 2
group_width: 4
groups: 64
norm_type: affine_channel
variant: d
FPN:
max_level: 6
min_level: 2
num_chan: 256
spatial_scale: [0.03125, 0.0625, 0.125, 0.25]
FPNRPNHead:
anchor_generator:
anchor_sizes: [32, 64, 128, 256, 512]
aspect_ratios: [0.5, 1.0, 2.0]
stride: [16.0, 16.0]
variance: [1.0, 1.0, 1.0, 1.0]
anchor_start_size: 32
max_level: 6
min_level: 2
num_chan: 256
rpn_target_assign:
rpn_batch_size_per_im: 256
rpn_fg_fraction: 0.5
rpn_negative_overlap: 0.3
rpn_positive_overlap: 0.7
rpn_straddle_thresh: 0.0
train_proposal:
min_size: 0.0
nms_thresh: 0.7
post_nms_top_n: 2000
pre_nms_top_n: 2000
test_proposal:
min_size: 0.0
nms_thresh: 0.7
post_nms_top_n: 1000
pre_nms_top_n: 1000
FPNRoIAlign:
canconical_level: 4
canonical_size: 224
max_level: 5
min_level: 2
box_resolution: 7
sampling_ratio: 2
BBoxAssigner:
batch_size_per_im: 512
bbox_reg_weights: [0.1, 0.1, 0.2, 0.2]
bg_thresh_hi: 0.5
bg_thresh_lo: 0.0
fg_fraction: 0.25
fg_thresh: 0.5
BBoxHead:
head: TwoFCHead
nms:
keep_top_k: 100
nms_threshold: 0.5
score_threshold: 0.05
TwoFCHead:
num_chan: 1024
LearningRate:
base_lr: 0.01
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [120000, 160000]
values: null
- !LinearWarmup
start_factor: 0.1
steps: 1000
OptimizerBuilder:
optimizer:
momentum: 0.9
type: Momentum
regularizer:
factor: 0.0001
type: L2
FasterRCNNTrainFeed:
# batch size per device
batch_size: 1
dataset:
dataset_dir: dataset/coco
image_dir: train2017
annotation: annotations/instances_train2017.json
batch_transforms:
- !PadBatch
pad_to_stride: 32
num_workers: 2
shuffle: true
FasterRCNNEvalFeed:
batch_size: 1
dataset:
dataset_dir: dataset/coco
annotation: annotations/instances_val2017.json
image_dir: val2017
batch_transforms:
- !PadBatch
pad_to_stride: 32
num_workers: 2
FasterRCNNTestFeed:
batch_size: 1
dataset:
annotation: annotations/instances_val2017.json
batch_transforms:
- !PadBatch
pad_to_stride: 32
num_workers: 2
shuffle: false
architecture: FasterRCNN
train_feed: FasterRCNNTrainFeed
eval_feed: FasterRCNNEvalFeed
test_feed: FasterRCNNTestFeed
max_iters: 360000
snapshot_iter: 10000
use_gpu: true
log_smooth_window: 20
save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNeXt101_vd_64x4d_pretrained.tar
weights: output/faster_rcnn_x101_vd_64x4d_fpn_1x/model_final
metric: COCO
FasterRCNN:
backbone: ResNeXt
fpn: FPN
rpn_head: FPNRPNHead
roi_extractor: FPNRoIAlign
bbox_head: BBoxHead
bbox_assigner: BBoxAssigner
ResNeXt:
depth: 101
feature_maps: [2, 3, 4, 5]
freeze_at: 2
group_width: 4
groups: 64
norm_type: affine_channel
variant: d
FPN:
max_level: 6
min_level: 2
num_chan: 256
spatial_scale: [0.03125, 0.0625, 0.125, 0.25]
FPNRPNHead:
anchor_generator:
anchor_sizes: [32, 64, 128, 256, 512]
aspect_ratios: [0.5, 1.0, 2.0]
stride: [16.0, 16.0]
variance: [1.0, 1.0, 1.0, 1.0]
anchor_start_size: 32
max_level: 6
min_level: 2
num_chan: 256
rpn_target_assign:
rpn_batch_size_per_im: 256
rpn_fg_fraction: 0.5
rpn_negative_overlap: 0.3
rpn_positive_overlap: 0.7
rpn_straddle_thresh: 0.0
train_proposal:
min_size: 0.0
nms_thresh: 0.7
post_nms_top_n: 2000
pre_nms_top_n: 2000
test_proposal:
min_size: 0.0
nms_thresh: 0.7
post_nms_top_n: 1000
pre_nms_top_n: 1000
FPNRoIAlign:
canconical_level: 4
canonical_size: 224
max_level: 5
min_level: 2
box_resolution: 7
sampling_ratio: 2
BBoxAssigner:
batch_size_per_im: 512
bbox_reg_weights: [0.1, 0.1, 0.2, 0.2]
bg_thresh_hi: 0.5
bg_thresh_lo: 0.0
fg_fraction: 0.25
fg_thresh: 0.5
num_classes: 81
BBoxHead:
head: TwoFCHead
nms:
keep_top_k: 100
nms_threshold: 0.5
score_threshold: 0.05
num_classes: 81
TwoFCHead:
num_chan: 1024
LearningRate:
base_lr: 0.01
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [240000, 320000]
- !LinearWarmup
start_factor: 0.1
steps: 1000
OptimizerBuilder:
optimizer:
momentum: 0.9
type: Momentum
regularizer:
factor: 0.0001
type: L2
FasterRCNNTrainFeed:
# batch size per device
batch_size: 1
dataset:
dataset_dir: dataset/coco
image_dir: train2017
annotation: annotations/instances_train2017.json
batch_transforms:
- !PadBatch
pad_to_stride: 32
num_workers: 2
shuffle: true
FasterRCNNEvalFeed:
batch_size: 1
dataset:
dataset_dir: dataset/coco
annotation: annotations/instances_val2017.json
image_dir: val2017
batch_transforms:
- !PadBatch
pad_to_stride: 32
num_workers: 2
FasterRCNNTestFeed:
batch_size: 1
dataset:
annotation: annotations/instances_val2017.json
batch_transforms:
- !PadBatch
pad_to_stride: 32
num_workers: 2
shuffle: false
......@@ -10,6 +10,7 @@ save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_pretrained.tar
metric: COCO
weights: output/mask_rcnn_r101_fpn_1x/model_final/
num_classes: 81
MaskRCNN:
backbone: ResNet
......@@ -68,7 +69,6 @@ FPNRoIAlign:
MaskHead:
dilation: 1
num_chan_reduced: 256
num_classes: 81
num_convs: 4
resolution: 28
......@@ -79,7 +79,6 @@ BBoxAssigner:
bg_thresh_lo: 0.0
fg_fraction: 0.25
fg_thresh: 0.5
num_classes: 81
MaskAssigner:
resolution: 28
......@@ -90,7 +89,6 @@ BBoxHead:
keep_top_k: 100
nms_threshold: 0.5
score_threshold: 0.05
num_classes: 81
TwoFCHead:
num_chan: 1024
......
......@@ -10,6 +10,7 @@ save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
metric: COCO
weights: output/mask_rcnn_r50_1x/model_final
num_classes: 81
MaskRCNN:
backbone: ResNet
......@@ -66,12 +67,10 @@ BBoxHead:
nms_threshold: 0.5
normalized: false
score_threshold: 0.05
num_classes: 81
MaskHead:
dilation: 1
num_chan_reduced: 256
num_classes: 81
resolution: 14
BBoxAssigner:
......@@ -81,10 +80,8 @@ BBoxAssigner:
bg_thresh_lo: 0.0
fg_fraction: 0.25
fg_thresh: 0.5
num_classes: 81
MaskAssigner:
num_classes: 81
resolution: 14
LearningRate:
......
......@@ -10,6 +10,7 @@ save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
metric: COCO
weights: output/mask_rcnn_r50_2x/model_final/
num_classes: 81
MaskRCNN:
backbone: ResNet
......@@ -67,12 +68,10 @@ BBoxHead:
nms_threshold: 0.5
normalized: false
score_threshold: 0.05
num_classes: 81
MaskHead:
dilation: 1
num_chan_reduced: 256
num_classes: 81
resolution: 14
BBoxAssigner:
......@@ -82,10 +81,8 @@ BBoxAssigner:
bg_thresh_lo: 0.0
fg_fraction: 0.25
fg_thresh: 0.5
num_classes: 81
MaskAssigner:
num_classes: 81
resolution: 14
LearningRate:
......
......@@ -10,6 +10,7 @@ save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
metric: COCO
weights: output/mask_rcnn_r50_fpn_1x/model_final/
num_classes: 81
MaskRCNN:
backbone: ResNet
......@@ -68,7 +69,6 @@ FPNRoIAlign:
MaskHead:
dilation: 1
num_chan_reduced: 256
num_classes: 81
num_convs: 4
resolution: 28
......@@ -79,7 +79,6 @@ BBoxAssigner:
bg_thresh_lo: 0.0
fg_fraction: 0.25
fg_thresh: 0.5
num_classes: 81
MaskAssigner:
resolution: 28
......@@ -90,7 +89,6 @@ BBoxHead:
keep_top_k: 100
nms_threshold: 0.5
score_threshold: 0.05
num_classes: 81
TwoFCHead:
num_chan: 1024
......
architecture: MaskRCNN
train_feed: MaskRCNNTrainFeed
eval_feed: MaskRCNNEvalFeed
test_feed: MaskRCNNTestFeed
max_iters: 360000
snapshot_iter: 10000
use_gpu: true
log_smooth_window: 20
save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_cos_pretrained.tar
weights: output/mask_rcnn_r50_fpn_2x/model_final/
metric: COCO
num_classes: 81
MaskRCNN:
backbone: ResNet
fpn: FPN
rpn_head: FPNRPNHead
roi_extractor: FPNRoIAlign
bbox_head: BBoxHead
bbox_assigner: BBoxAssigner
ResNet:
depth: 50
feature_maps: [2, 3, 4, 5]
freeze_at: 2
norm_type: affine_channel
FPN:
max_level: 6
min_level: 2
num_chan: 256
spatial_scale: [0.03125, 0.0625, 0.125, 0.25]
FPNRPNHead:
anchor_generator:
aspect_ratios: [0.5, 1.0, 2.0]
variance: [1.0, 1.0, 1.0, 1.0]
anchor_start_size: 32
max_level: 6
min_level: 2
num_chan: 256
rpn_target_assign:
rpn_batch_size_per_im: 256
rpn_fg_fraction: 0.5
rpn_negative_overlap: 0.3
rpn_positive_overlap: 0.7
rpn_straddle_thresh: 0.0
train_proposal:
min_size: 0.0
nms_thresh: 0.7
pre_nms_top_n: 2000
post_nms_top_n: 2000
test_proposal:
min_size: 0.0
nms_thresh: 0.7
pre_nms_top_n: 1000
post_nms_top_n: 1000
FPNRoIAlign:
canconical_level: 4
canonical_size: 224
max_level: 5
min_level: 2
sampling_ratio: 2
box_resolution: 7
mask_resolution: 14
MaskHead:
dilation: 1
num_chan_reduced: 256
num_convs: 4
resolution: 28
BBoxAssigner:
batch_size_per_im: 512
bbox_reg_weights: [0.1, 0.1, 0.2, 0.2]
bg_thresh_hi: 0.5
bg_thresh_lo: 0.0
fg_fraction: 0.25
fg_thresh: 0.5
MaskAssigner:
resolution: 28
BBoxHead:
head: TwoFCHead
nms:
keep_top_k: 100
nms_threshold: 0.5
score_threshold: 0.05
TwoFCHead:
num_chan: 1024
LearningRate:
base_lr: 0.01
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [240000, 320000]
- !LinearWarmup
start_factor: 0.3333333333333333
steps: 500
OptimizerBuilder:
optimizer:
momentum: 0.9
type: Momentum
regularizer:
factor: 0.0001
type: L2
MaskRCNNTrainFeed:
batch_size: 1
dataset:
dataset_dir: dataset/coco
annotation: annotations/instances_train2017.json
image_dir: train2017
batch_transforms:
- !PadBatch
pad_to_stride: 32
num_workers: 2
MaskRCNNEvalFeed:
batch_size: 1
dataset:
dataset_dir: dataset/coco
annotation: annotations/instances_val2017.json
image_dir: val2017
batch_transforms:
- !PadBatch
pad_to_stride: 32
num_workers: 2
MaskRCNNTestFeed:
batch_size: 1
dataset:
annotation: annotations/instances_val2017.json
batch_transforms:
- !PadBatch
pad_to_stride: 32
num_workers: 2
......@@ -10,6 +10,7 @@ save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_vd_pretrained.tar
metric: COCO
weights: output/mask_rcnn_r50_vd_fpn_2x/model_final/
num_classes: 81
MaskRCNN:
backbone: ResNet
......@@ -69,7 +70,6 @@ FPNRoIAlign:
MaskHead:
dilation: 1
num_chan_reduced: 256
num_classes: 81
num_convs: 4
resolution: 28
......@@ -80,7 +80,6 @@ BBoxAssigner:
bg_thresh_lo: 0.0
fg_fraction: 0.25
fg_thresh: 0.5
num_classes: 81
MaskAssigner:
resolution: 28
......@@ -91,7 +90,6 @@ BBoxHead:
keep_top_k: 100
nms_threshold: 0.5
score_threshold: 0.05
num_classes: 81
TwoFCHead:
num_chan: 1024
......
......@@ -10,6 +10,7 @@ save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/SE154_vd_pretrained.tar
weights: output/mask_rcnn_se154_vd_fpn_s1x/model_final/
metric: COCO
num_classes: 81
MaskRCNN:
backbone: SENet
......@@ -71,7 +72,6 @@ FPNRoIAlign:
MaskHead:
dilation: 1
num_chan_reduced: 256
num_classes: 81
num_convs: 4
resolution: 28
......@@ -82,7 +82,6 @@ BBoxAssigner:
bg_thresh_lo: 0.0
fg_fraction: 0.25
fg_thresh: 0.5
num_classes: 81
MaskAssigner:
resolution: 28
......@@ -93,7 +92,6 @@ BBoxHead:
keep_top_k: 100
nms_threshold: 0.5
score_threshold: 0.05
num_classes: 81
TwoFCHead:
num_chan: 1024
......@@ -120,7 +118,7 @@ MaskRCNNTrainFeed:
# batch size per device
batch_size: 1
dataset:
dataset_dir: dataset/coco
dataset_dir: dataset/coco
image_dir: train2017
annotation: annotations/instances_train2017.json
batch_transforms:
......
......@@ -188,3 +188,14 @@ A small utility (`tools/configure.py`) is included to simplify the configuration
```shell
python tools/configure.py --minimal generate FasterRCNN BBoxHead
```
# FAQ
**Q:** There are some configuration options that are used by multiple modules (e.g., `num_classes`), how do I avoid duplication in config files?
**A:** We provided a `__shared__` annotation for exactly this purpose, simply annotate like this `__shared__ = ['num_classes']`. It works as follows:
1. if `num_classes` is configured for a module in config file, it takes precedence.
2. if `num_classes` is not configured for a module but is present in the config file as a global key, its value will be used.
3. otherwise, the default value (`81`) will be used.
......@@ -180,3 +180,14 @@ pip install typeguard http://github.com/willthefrog/docstring_parser/tarball/mas
```shell
python tools/configure.py --minimal generate FasterRCNN BBoxHead
```
# FAQ
**Q:** 某些配置项会在多个模块中用到(如 `num_classes`),如何避免在配置文件中多次重复设置?
**A:** 框架提供了 `__shared__` 标记来实现配置的共享,用户可以标记参数,如 `__shared__ = ['num_classes']` ,配置数值作用规则如下:
1. 如果模块配置中提供了 `num_classes` ,会优先使用其数值。
2. 如果模块配置中未提供 `num_classes` ,但配置文件中存在全局键值,那么会使用全局键值。
3. 两者均为配置的情况下,将使用默认值(`81`)。
......@@ -38,16 +38,19 @@ The backbone models pretrained on ImageNet are available. All backbone models ar
| ResNet50-vd | Faster | 1 | 1x | 36.4 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r50_vd_1x.tar) |
| ResNet50-FPN | Faster | 2 | 1x | 37.2 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r50_fpn_1x.tar) |
| ResNet50-FPN | Faster | 2 | 2x | 37.7 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r50_fpn_2x.tar) |
| ResNet50-FPN | Mask | 2 | 1x | 37.9 | 34.2 | [model](https://paddlemodels.bj.bcebos.com/object_detection/mask_rcnn_r50_fpn_1x.tar) |
| ResNet50-FPN | Mask | 1 | 1x | 37.9 | 34.2 | [model](https://paddlemodels.bj.bcebos.com/object_detection/mask_rcnn_r50_fpn_1x.tar) |
| ResNet50-FPN | Mask | 1 | 2x | 38.7 | 34.7 | [model](https://paddlemodels.bj.bcebos.com/object_detection/mask_rcnn_r50_fpn_2x.tar) |
| ResNet50-FPN | Cascade Faster | 2 | 1x | 40.9 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/cascade_rcnn_r50_fpn_1x.tar) |
| ResNet50-vd-FPN | Faster | 2 | 2x | 38.9 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r50_vd_fpn_2x.tar) |
| ResNet50-vd-FPN | Mask | 2 | 2x | 39.8 | 35.4 | [model](https://paddlemodels.bj.bcebos.com/object_detection/mask_rcnn_r50_vd_fpn_2x.tar) |
| ResNet50-vd-FPN | Mask | 1 | 2x | 39.8 | 35.4 | [model](https://paddlemodels.bj.bcebos.com/object_detection/mask_rcnn_r50_vd_fpn_2x.tar) |
| ResNet101 | Faster | 1 | 1x | 38.3 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r101_1x.tar) |
| ResNet101-FPN | Faster | 1 | 1x | 38.7 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r101_fpn_1x.tar) |
| ResNet101-FPN | Faster | 1 | 2x | 39.1 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r101_fpn_2x.tar) |
| ResNet101-FPN | Mask | 1 | 1x | 39.5 | 35.2 | [model](https://paddlemodels.bj.bcebos.com/object_detection/mask_rcnn_r101_fpn_1x.tar) |
| ResNet101-vd-FPN | Faster | 1 | 1x | 40.0 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r101_fpn_1x.tar) |
| ResNet101-vd-FPN | Faster | 1 | 2x | 40.6 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r101_fpn_2x.tar) |
| ResNet101-vd-FPN | Faster | 1 | 1x | 40.5 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r101_vd_fpn_1x.tar) |
| ResNet101-vd-FPN | Faster | 1 | 2x | 40.8 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r101_vd_fpn_2x.tar) |
| ResNeXt101-vd-FPN | Faster | 1 | 1x | 42.2 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_x101_vd_64x4d_fpn_1x.tar) |
| ResNeXt101-vd-FPN | Faster | 1 | 2x | 41.7 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_x101_vd_64x4d_fpn_2x.tar) |
| SENet154-vd-FPN | Faster | 1 | 1.44x | 42.9 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_se154_vd_fpn_s1x.tar) |
| SENet154-vd-FPN | Mask | 1 | 1.44x | 44.0 | 38.7 | [model](https://paddlemodels.bj.bcebos.com/object_detection/mask_rcnn_se154_vd_fpn_s1x.tar) |
......
......@@ -43,13 +43,14 @@ except Exception:
if not check_type.__warning_sent__:
from ppdet.utils.cli import ColorTTY
color_tty = ColorTTY()
message = "typeguard is not installed, type checking is not available"
message = "typeguard is not installed," \
+ "type checking is not available"
print(color_tty.yellow(message))
check_type.__warning_sent__ = True
check_type.__warning_sent__ = False
__all__ = ['SchemaValue', 'SchemaDict', 'extract_schema']
__all__ = ['SchemaValue', 'SchemaDict', 'SharedConfig', 'extract_schema']
class SchemaValue(object):
......@@ -160,6 +161,27 @@ class SchemaDict(dict):
self.name, ", ".join(mismatch_keys)))
class SharedConfig(object):
"""
Representation class for `__shared__` annotations, which work as follows:
- if `key` is set for the module in config file, its value will take
precedence
- if `key` is not set for the module but present in the config file, its
value will be used
- otherwise, use the provided `default_value` as fallback
Args:
key: config[key] will be injected
default_value: fallback value
"""
def __init__(self, key, default_value=None):
super(SharedConfig, self).__init__()
self.key = key
self.default_value = default_value
def extract_schema(cls):
"""
Extract schema from a given class
......@@ -216,6 +238,7 @@ def extract_schema(cls):
schema.strict = not has_kwargs
schema.pymodule = importlib.import_module(cls.__module__)
schema.inject = getattr(cls, '__inject__', [])
schema.shared = getattr(cls, '__shared__', [])
for idx, name in enumerate(names):
comment = name in comments and comments[name] or name
if name in schema.inject:
......@@ -223,8 +246,13 @@ def extract_schema(cls):
else:
type_ = name in annotations and annotations[name] or None
value_schema = SchemaValue(name, comment, type_)
if idx >= num_required:
value_schema.set_default(defaults[idx - num_required])
if name in schema.shared:
assert idx >= num_required, "shared config must have default value"
default = defaults[idx - num_required]
value_schema.set_default(SharedConfig(name, default))
elif idx >= num_required:
default = defaults[idx - num_required]
value_schema.set_default(default)
schema.set_schema(name, value_schema)
return schema
......@@ -16,6 +16,7 @@ import importlib
import inspect
import yaml
from .schema import SharedConfig
__all__ = ['serializable', 'Callable']
......@@ -59,7 +60,8 @@ def _make_python_representer(cls):
def serializable(cls):
"""
Add loader and dumper for given class, which must be "trivially serializable"
Add loader and dumper for given class, which must be
"trivially serializable"
Args:
cls: class to be serialized
......@@ -72,6 +74,10 @@ def serializable(cls):
return cls
yaml.add_representer(SharedConfig,
lambda d, o: d.represent_data(o.default_value))
@serializable
class Callable(object):
"""
......
......@@ -21,8 +21,9 @@ import os
import sys
import yaml
import copy
from .config.schema import SchemaDict, extract_schema
from .config.schema import SchemaDict, SharedConfig, extract_schema
from .config.yaml_helpers import serializable
__all__ = [
......@@ -135,7 +136,8 @@ def create(cls_or_name, **kwargs):
assert type(cls_or_name) in [type, str
], "should be a class or name of a class"
name = type(cls_or_name) == str and cls_or_name or cls_or_name.__name__
assert name in global_config and isinstance(global_config[name], SchemaDict), \
assert name in global_config and \
isinstance(global_config[name], SchemaDict), \
"the module {} is not registered".format(name)
config = global_config[name]
config.update(kwargs)
......@@ -144,9 +146,26 @@ def create(cls_or_name, **kwargs):
kwargs = {}
kwargs.update(global_config[name])
# parse `shared` annoation of registered modules
if getattr(config, 'shared', None):
for k in config.shared:
target_key = config[k]
shared_conf = config.schema[k].default
assert isinstance(shared_conf, SharedConfig)
if target_key is not None and not isinstance(target_key,
SharedConfig):
continue # value is given for the module
elif shared_conf.key in global_config:
# `key` is present in config
kwargs[k] = global_config[shared_conf.key]
else:
kwargs[k] = shared_conf.default_value
# parse `inject` annoation of registered modules
if getattr(config, 'inject', None):
for k in config.inject:
target_key = global_config[name][k]
target_key = config[k]
# optional dependency
if target_key is None:
continue
......@@ -163,4 +182,7 @@ def create(cls_or_name, **kwargs):
kwargs[k] = target
else:
raise ValueError("Unsupported injection type:", target_key)
# prevent modification of global config values of reference types
# (e.g., list, dict) from within the created module instances
kwargs = copy.deepcopy(kwargs)
return cls(**kwargs)
......@@ -181,7 +181,6 @@ class DataSet(object):
Args:
annotation (str): annotation file path
image_dir (str): directory where image files are stored
num_classes (int): number of classes
shuffle (bool): shuffle samples
"""
__source__ = 'RoiDbSource'
......
......@@ -25,7 +25,7 @@ from paddle.fluid.regularizer import L2Decay
from ppdet.modeling.ops import (AnchorGenerator, RetinaTargetAssign,
RetinaOutputDecoder)
from ppdet.core.workspace import register, serializable
from ppdet.core.workspace import register
__all__ = ['RetinaHead']
......@@ -52,6 +52,7 @@ class RetinaHead(object):
sigma (float): The parameter in smooth l1 loss
"""
__inject__ = ['anchor_generator', 'target_assign', 'output_decoder']
__shared__ = ['num_classes']
def __init__(self,
anchor_generator=AnchorGenerator().__dict__,
......@@ -333,7 +334,6 @@ class RetinaHead(object):
cls_pred_reshape_list = output['cls_pred']
bbox_pred_reshape_list = output['bbox_pred']
anchor_reshape_list = output['anchor']
anchor_var_reshape_list = output['anchor_var']
for i in range(self.max_level - self.min_level + 1):
cls_pred_reshape_list[i] = fluid.layers.sigmoid(
cls_pred_reshape_list[i])
......
......@@ -64,7 +64,7 @@ class FasterRCNN(object):
gt_box = feed_vars['gt_box']
is_crowd = feed_vars['is_crowd']
else:
im_shape = feed_vars['im_info']
im_shape = feed_vars['im_shape']
body_feats = self.backbone(im)
body_feat_names = list(body_feats.keys())
......
......@@ -149,7 +149,11 @@ class MaskRCNN(object):
cond = fluid.layers.less_than(x=bbox_size, y=size)
mask_pred = fluid.layers.create_global_var(
shape=[1], value=0.0, dtype='float32', persistable=False)
shape=[1],
value=0.0,
dtype='float32',
persistable=False,
name='mask_pred')
with fluid.layers.control_flow.Switch() as switch:
with switch.case(cond):
......
......@@ -56,8 +56,8 @@ class SSD(object):
self.output_decoder = SSDOutputDecoder(**output_decoder)
if isinstance(metric, dict):
self.metric = SSDMetric(**metric)
def _forward(self, feed_vars, mode='train'):
def build(self, feed_vars, mode='train'):
im = feed_vars['image']
if mode == 'train' or mode == 'eval':
gt_box = feed_vars['gt_box']
......@@ -88,10 +88,16 @@ class SSD(object):
return {'bbox': pred}
def train(self, feed_vars):
return self._forward(feed_vars, 'train')
return self.build(feed_vars, 'train')
def eval(self, feed_vars):
return self._forward(feed_vars, 'eval')
return self.build(feed_vars, 'eval')
def test(self, feed_vars):
return self._forward(feed_vars, 'test')
return self.build(feed_vars, 'test')
def is_bbox_normalized(self):
# SSD use output_decoder in output layers, bbox is normalized
# to range [0, 1], is_bbox_normalized is used in infer.py
return True
......@@ -119,6 +119,7 @@ class ResNet(object):
regularizer=L2Decay(norm_decay))
if self.norm_type in ['bn', 'sync_bn']:
global_stats = True if self.freeze_norm else False
out = fluid.layers.batch_norm(
input=conv,
act=act,
......@@ -126,7 +127,8 @@ class ResNet(object):
param_attr=pattr,
bias_attr=battr,
moving_mean_name=bn_name + '_mean',
moving_variance_name=bn_name + '_variance', )
moving_variance_name=bn_name + '_variance',
use_global_stats=global_stats)
scale = fluid.framework._get_var(pattr.name)
bias = fluid.framework._get_var(battr.name)
elif self.norm_type == 'affine_channel':
......@@ -330,7 +332,6 @@ class ResNetC5(ResNet):
norm_decay=0.,
variant='b',
feature_maps=[5]):
super(ResNetC5, self).__init__(
depth, freeze_at, norm_type, freeze_norm, norm_decay,
variant, feature_maps)
super(ResNetC5, self).__init__(depth, freeze_at, norm_type, freeze_norm,
norm_decay, variant, feature_maps)
self.severed_head = True
......@@ -88,6 +88,7 @@ class GenerateProposals(object):
class MaskAssigner(object):
__op__ = fluid.layers.generate_mask_labels
__append_doc__ = True
__shared__ = ['num_classes']
def __init__(self, num_classes=81, resolution=14):
super(MaskAssigner, self).__init__()
......@@ -123,6 +124,7 @@ class MultiClassNMS(object):
class BBoxAssigner(object):
__op__ = fluid.layers.generate_proposal_labels
__append_doc__ = True
__shared__ = ['num_classes']
def __init__(self,
batch_size_per_im=512,
......
......@@ -92,12 +92,13 @@ class BBoxHead(object):
RCNN bbox head
Args:
head (object): the head module instance, e.g., `ResNetC5` or `TwoFCHead`
head (object): the head module instance, e.g., `ResNetC5`, `TwoFCHead`
box_coder (object): `BoxCoder` instance
nms (object): `MultiClassNMS` instance
num_classes: number of output classes
"""
__inject__ = ['head', 'box_coder', 'nms']
__shared__ = ['num_classes']
def __init__(self,
head,
......
......@@ -37,6 +37,7 @@ class CascadeBBoxHead(object):
num_classes: number of output classes
"""
__inject__ = ['head', 'nms']
__shared__ = ['num_classes']
def __init__(self, head, nms=MultiClassNMS().__dict__, num_classes=81):
super(CascadeBBoxHead, self).__init__()
......@@ -196,7 +197,8 @@ class CascadeBBoxHead(object):
# only use fg box delta to decode box
bbox_pred_new = fluid.layers.slice(
bbox_pred_new, axes=[1], starts=[1], ends=[2])
bbox_pred_new = fluid.layers.expand(bbox_pred_new, [1, self.num_classes, 1])
bbox_pred_new = fluid.layers.expand(bbox_pred_new,
[1, self.num_classes, 1])
decoded_box = fluid.layers.box_coder(
prior_box=proposals_boxes,
prior_box_var=bbox_reg_w,
......
......@@ -38,6 +38,8 @@ class MaskHead(object):
num_classes (int): number of output classes
"""
__shared__ = ['num_classes']
def __init__(self,
num_convs=0,
num_chan_reduced=256,
......
......@@ -26,6 +26,8 @@ __all__ = ['BBoxAssigner', 'MaskAssigner', 'CascadeBBoxAssigner']
@register
class CascadeBBoxAssigner(object):
__shared__ = ['num_classes']
def __init__(self,
batch_size_per_im=512,
fg_fraction=.25,
......
......@@ -65,7 +65,7 @@ class ArgsParser(ArgumentParser):
s = s.strip()
k, v = s.split('=')
if '.' not in k:
config[k] = v
config[k] = yaml.load(v, Loader=yaml.Loader)
else:
keys = k.split('.')
config[keys[0]] = {}
......
......@@ -144,13 +144,13 @@ def mask2out(results, clsid2catid, resolution, thresh_binarize=0.5):
continue
masks = t['mask'][0]
im_shape = t['im_shape'][0][0]
s = 0
# for each sample
for i in range(len(lengths)):
num = lengths[i]
im_id = int(im_ids[i][0])
im_shape = t['im_shape'][0][i]
bbox = bboxes[s:s + num][:, 2:]
clsid_scores = bboxes[s:s + num][:, 0:2]
......
......@@ -82,6 +82,47 @@ def get_test_images(infer_dir, infer_img):
return images
def prune_feed_vars(feeded_var_names, target_vars, prog):
"""
Filter out feed variables which are not in program,
pruned feed variables are only used in post processing
on model output, which are not used in program, such
as im_id to identify image order, im_shape to clip bbox
in image.
"""
exist_var_names = []
prog = prog.clone()
prog = prog._prune(targets=target_vars)
global_block = prog.global_block()
for name in feeded_var_names:
try:
v = global_block.var(name)
exist_var_names.append(v.name)
except Exception:
logger.info('save_inference_model pruned unused feed '
'variables {}'.format(name))
pass
return exist_var_names
def save_infer_model(FLAGS, exe, feed_vars, test_fetches, infer_prog):
cfg_name = os.path.basename(FLAGS.config).split('.')[0]
save_dir = os.path.join(FLAGS.output_dir, cfg_name)
feeded_var_names = [var.name for var in feed_vars.values()]
target_vars = test_fetches.values()
feeded_var_names = prune_feed_vars(feeded_var_names, target_vars, infer_prog)
logger.info("Save inference model to {}, input: {}, output: "
"{}...".format(save_dir, feeded_var_names,
[var.name for var in target_vars]))
fluid.io.save_inference_model(
save_dir,
feeded_var_names=feeded_var_names,
target_vars=target_vars,
executor=exe,
main_program=infer_prog,
params_filename="__params__")
def main():
cfg = load_config(FLAGS.config)
......@@ -143,6 +184,12 @@ def main():
clsid2catid, catid2name = get_category_info(anno_file, with_background,
use_default_label)
# whether output bbox is normalized in model output layer
is_bbox_normalized = False
if hasattr(model, 'is_bbox_normalized') and \
callable(model.is_bbox_normalized):
is_bbox_normalized = model.is_bbox_normalized()
imid2path = reader.imid2path
for iter_id, data in enumerate(reader()):
outs = exe.run(infer_prog,
......@@ -157,7 +204,6 @@ def main():
bbox_results = None
mask_results = None
is_bbox_normalized = True if cfg.metric == 'VOC' else False
if 'bbox' in res:
bbox_results = bbox2out([res], clsid2catid, is_bbox_normalized)
if 'mask' in res:
......
......@@ -86,7 +86,6 @@ def main():
place = fluid.CUDAPlace(0) if cfg.use_gpu else fluid.CPUPlace()
exe = fluid.Executor(place)
model = create(main_arch)
lr_builder = create('LearningRate')
optim_builder = create('OptimizerBuilder')
......@@ -95,6 +94,7 @@ def main():
train_prog = fluid.Program()
with fluid.program_guard(train_prog, startup_prog):
with fluid.unique_name.guard():
model = create(main_arch)
train_pyreader, feed_vars = create_feed(train_feed)
train_fetches = model.train(feed_vars)
loss = train_fetches['loss']
......@@ -113,6 +113,7 @@ def main():
eval_prog = fluid.Program()
with fluid.program_guard(eval_prog, startup_prog):
with fluid.unique_name.guard():
model = create(main_arch)
eval_pyreader, feed_vars = create_feed(eval_feed)
fetches = model.eval(feed_vars)
eval_prog = eval_prog.clone(True)
......@@ -120,8 +121,9 @@ def main():
eval_reader = create_reader(eval_feed)
eval_pyreader.decorate_sample_list_generator(eval_reader, place)
# parse train fetches
extra_keys = ['im_info', 'im_id'] if cfg.metric == 'COCO' else []
# parse eval fetches
extra_keys = ['im_info', 'im_id',
'im_shape'] if cfg.metric == 'COCO' else []
eval_keys, eval_values, eval_cls = parse_fetches(fetches, eval_prog,
extra_keys)
......@@ -132,7 +134,7 @@ def main():
sync_bn = getattr(model.backbone, 'norm_type', None) == 'sync_bn'
# only enable sync_bn in multi GPU devices
build_strategy.sync_batch_norm = sync_bn and devices_num > 1 \
and cfg.use_gpu
and cfg.use_gpu
train_compile_program = fluid.compiler.CompiledProgram(
train_prog).with_data_parallel(
loss_name=loss.name, build_strategy=build_strategy)
......@@ -141,12 +143,12 @@ def main():
exe.run(startup_prog)
freeze_bn = getattr(model.backbone, 'freeze_norm', False)
fuse_bn = getattr(model.backbone, 'norm_type', None) == 'affine_channel'
start_iter = 0
if FLAGS.resume_checkpoint:
checkpoint.load_checkpoint(exe, train_prog, FLAGS.resume_checkpoint)
start_iter = checkpoint.global_step()
elif cfg.pretrain_weights and freeze_bn:
elif cfg.pretrain_weights and fuse_bn:
checkpoint.load_and_fusebn(exe, train_prog, cfg.pretrain_weights)
elif cfg.pretrain_weights:
checkpoint.load_pretrain(exe, train_prog, cfg.pretrain_weights)
......
......@@ -91,6 +91,7 @@ tar -xf vgg_ilsvrc_16_fc_reduced.tar.gz && rm -f vgg_ilsvrc_16_fc_reduced.tar.gz
python -u train.py --batch_size=16 --pretrained_model=vgg_ilsvrc_16_fc_reduced
```
- 可以通过设置 `export CUDA_VISIBLE_DEVICES=0,1,2,3` 指定想要使用的GPU数量,`batch_size`默认设置为12或16。
- **注意**: 在**Windows**机器上训练,需要设置 `--use_multiprocess=False`,因为在Windows上使用Python多进程加速训练时有错误。
- 更多的可选参数见:
```bash
python train.py --help
......
......@@ -280,14 +280,25 @@ def train_generator(settings, file_list, batch_size, shuffle=True):
return reader
def train(settings, file_list, batch_size, shuffle=True, num_workers=8):
def train(settings,
file_list,
batch_size,
shuffle=True,
use_multiprocess=True,
num_workers=8):
file_lists = load_file_list(file_list)
n = int(math.ceil(len(file_lists) // num_workers))
split_lists = [file_lists[i:i + n] for i in range(0, len(file_lists), n)]
readers = []
for iterm in split_lists:
readers.append(train_generator(settings, iterm, batch_size, shuffle))
return paddle.reader.multiprocess_reader(readers, False)
if use_multiprocess:
n = int(math.ceil(len(file_lists) // num_workers))
split_lists = [
file_lists[i:i + n] for i in range(0, len(file_lists), n)
]
readers = []
for iterm in split_lists:
readers.append(
train_generator(settings, iterm, batch_size, shuffle))
return paddle.reader.multiprocess_reader(readers, False)
else:
return train_generator(settings, file_lists, batch_size, shuffle)
def test(settings, file_list):
......
......@@ -9,6 +9,20 @@ import time
import argparse
import functools
def set_paddle_flags(**kwargs):
for key, value in kwargs.items():
if os.environ.get(key, None) is None:
os.environ[key] = str(value)
# NOTE(paddle-dev): All of these flags should be
# set before `import paddle`. Otherwise, it would
# not take any effect.
set_paddle_flags(
FLAGS_eager_delete_tensor_gb=0, # enable GC to save memory
)
import paddle
import paddle.fluid as fluid
from pyramidbox import PyramidBox
......@@ -32,6 +46,7 @@ add_arg('mean_BGR', str, '104., 117., 123.', "Mean value for B,G,R cha
add_arg('with_mem_opt', bool, True, "Whether to use memory optimization or not.")
add_arg('pretrained_model', str, './vgg_ilsvrc_16_fc_reduced/', "The init model path.")
add_arg('data_dir', str, 'data', "The base dir of dataset")
add_arg('use_multiprocess', bool, True, "Whether use multi-process for data preprocessing.")
parser.add_argument('--enable_ce', action='store_true', help='If set, run the task with continuous evaluation logs.')
parser.add_argument('--batch_num', type=int, help="batch num for ce")
parser.add_argument('--num_devices', type=int, default=1, help='Number of GPU devices')
......@@ -163,7 +178,8 @@ def train(args, config, train_params, train_file_list):
train_file_list,
batch_size_per_device,
shuffle = is_shuffle,
num_workers = num_workers)
use_multiprocess=args.use_multiprocess,
num_workers=num_workers)
train_py_reader.decorate_paddle_reader(train_reader)
if args.parallel:
......
......@@ -23,7 +23,7 @@ SSD is readily pluggable into a wide variant standard convolutional network, suc
Please download [PASCAL VOC dataset](http://host.robots.ox.ac.uk/pascal/VOC/) at first, skip this step if you already have one.
```bash
```
cd data/pascalvoc
./download.sh
```
......@@ -36,7 +36,7 @@ The command `download.sh` also will create training and testing file lists.
We provide two pre-trained models. The one is MobileNet-v1 SSD trained on COCO dataset, but removed the convolutional predictors for COCO dataset. This model can be used to initialize the models when training other datasets, like PASCAL VOC. The other pre-trained model is MobileNet-v1 trained on ImageNet 2012 dataset but removed the last weights and bias in the Fully-Connected layer. Download MobileNet-v1 SSD:
```bash
```
./pretrained/download_coco.sh
```
......@@ -46,13 +46,14 @@ Declaration: the MobileNet-v1 SSD model is converted by [TensorFlow model](https
#### Train on PASCAL VOC
`train.py` is the main caller of the training module. Examples of usage are shown below.
```bash
python -u train.py --batch_size=64 --dataset='pascalvoc' --pretrained_model='pretrained/ssd_mobilenet_v1_coco/'
```
python -u train.py --batch_size=64 --dataset=pascalvoc --pretrained_model=pretrained/ssd_mobilenet_v1_coco/
```
- Set ```export CUDA_VISIBLE_DEVICES=0,1``` to specifiy the number of GPU you want to use.
- **Note**: set `--use_multiprocess=False` when training on **Windows**, since some problems need to be solved when using Python multiprocess to accelerate data processing.
- For more help on arguments:
```bash
```
python train.py --help
```
......@@ -69,14 +70,14 @@ We used RMSProp optimizer with mini-batch size 64 to train the MobileNet-SSD. Th
You can evaluate your trained model in different metrics like 11point, integral on both PASCAL VOC and COCO dataset. Note we set the default test list to the dataset's test/val list, you can use your own test list by setting ```--test_list``` args.
`eval.py` is the main caller of the evaluating module. Examples of usage are shown below.
```bash
python eval.py --dataset='pascalvoc' --model_dir='train_pascal_model/best_model' --data_dir='data/pascalvoc' --test_list='test.txt' --ap_version='11point' --nms_threshold=0.45
```
python eval.py --dataset=pascalvoc --model_dir=model/best_model --data_dir=data/pascalvoc --test_list=test.txt
```
### Infer and Visualize
`infer.py` is the main caller of the inferring module. Examples of usage are shown below.
```bash
python infer.py --dataset='pascalvoc' --nms_threshold=0.45 --model_dir='train_pascal_model/best_model' --image_path='./data/pascalvoc/VOCdevkit/VOC2007/JPEGImages/009963.jpg'
```
python infer.py --dataset=pascalvoc --nms_threshold=0.45 --model_dir=model/best_model --image_path=./data/pascalvoc/VOCdevkit/VOC2007/JPEGImages/009963.jpg
```
Below are the examples of running the inference and visualizing the model result.
<p align="center">
......
......@@ -24,7 +24,7 @@ SSD 可以方便地插入到任何一种标准卷积网络中,比如 VGG、Res
请先使用下面的命令下载 [PASCAL VOC 数据集](http://host.robots.ox.ac.uk/pascal/VOC/)
```bash
```
cd data/pascalvoc
./download.sh
```
......@@ -38,7 +38,7 @@ cd data/pascalvoc
我们提供了两个预训练模型。第一个模型是在 COCO 数据集上预训练的 MobileNet-v1 SSD,我们将它的预测头移除了以便在 COCO 以外的数据集上进行训练。第二个模型是在 ImageNet 2012 数据集上预训练的 MobileNet-v1,我们也将最后的全连接层移除以便进行目标检测训练。下载 MobileNet-v1 SSD:
```bash
```
./pretrained/download_coco.sh
```
......@@ -48,13 +48,14 @@ cd data/pascalvoc
#### 训练
`train.py` 是训练模块的主要执行程序,调用示例如下:
```bash
python -u train.py --batch_size=64 --dataset='pascalvoc' --pretrained_model='pretrained/ssd_mobilenet_v1_coco/'
```
python -u train.py --batch_size=64 --dataset=pascalvoc --pretrained_model=pretrained/ssd_mobilenet_v1_coco/
```
- 可以通过设置 ```export CUDA_VISIBLE_DEVICES=0,1``` 指定想要使用的GPU数量。
- **注意**: 在**Windows**机器上训练,需要设置 `--use_multiprocess=False`,因为在Windows上使用Python多进程加速训练时有错误。
- 更多的可选参数见:
```bash
```
python train.py --help
```
......@@ -71,15 +72,16 @@ cd data/pascalvoc
你可以使用11point、integral等指标在PASCAL VOC 数据集上评估训练好的模型。不失一般性,我们采用相应数据集的测试列表作为样例代码的默认列表,你也可以通过设置```--test_list```来指定自己的测试样本列表。
`eval.py`是评估模块的主要执行程序,调用示例如下:
```bash
python eval.py --dataset='pascalvoc' --model_dir='train_pascal_model/best_model' --data_dir='data/pascalvoc' --test_list='test.txt' --ap_version='11point' --nms_threshold=0.45
```
python eval.py --dataset=pascalvoc --model_dir=model/best_model --data_dir=data/pascalvoc --test_list=test.txt
```
### 模型预测以及可视化
`infer.py`是预测及可视化模块的主要执行程序,调用示例如下:
```bash
python infer.py --dataset='pascalvoc' --nms_threshold=0.45 --model_dir='train_pascal_model/best_model' --image_path='./data/pascalvoc/VOCdevkit/VOC2007/JPEGImages/009963.jpg'
```
python infer.py --dataset=pascalvoc --nms_threshold=0.45 --model_dir=model/best_model --image_path=./data/pascalvoc/VOCdevkit/VOC2007/JPEGImages/009963.jpg
```
下图可视化了模型的预测结果:
<p align="center">
......
......@@ -283,6 +283,7 @@ def train(settings,
file_list,
batch_size,
shuffle=True,
use_multiprocess=True,
num_workers=8,
enable_ce=False):
file_path = os.path.join(settings.data_dir, file_list)
......@@ -294,14 +295,15 @@ def train(settings,
image_ids = coco_api.getImgIds()
images = coco_api.loadImgs(image_ids)
np.random.shuffle(images)
n = int(math.ceil(len(images) // num_workers))
image_lists = [images[i:i + n] for i in range(0, len(images), n)]
if '2014' in file_list:
sub_dir = "train2014"
elif '2017' in file_list:
sub_dir = "train2017"
data_dir = os.path.join(settings.data_dir, sub_dir)
n = int(math.ceil(len(images) // num_workers)) if use_multiprocess \
else len(images)
image_lists = [images[i:i + n] for i in range(0, len(images), n)]
for l in image_lists:
readers.append(
coco(settings, coco_api, l, 'train', batch_size, shuffle,
......@@ -309,11 +311,16 @@ def train(settings,
else:
images = [line.strip() for line in open(file_path)]
np.random.shuffle(images)
n = int(math.ceil(len(images) // num_workers))
n = int(math.ceil(len(images) // num_workers)) if use_multiprocess \
else len(images)
image_lists = [images[i:i + n] for i in range(0, len(images), n)]
for l in image_lists:
readers.append(pascalvoc(settings, l, 'train', batch_size, shuffle))
return paddle.reader.multiprocess_reader(readers, False)
print("use_multiprocess ", use_multiprocess)
if use_multiprocess:
return paddle.reader.multiprocess_reader(readers, False)
else:
return readers[0]
def test(settings, file_list, batch_size):
......
......@@ -7,6 +7,20 @@ import shutil
import math
import multiprocessing
def set_paddle_flags(**kwargs):
for key, value in kwargs.items():
if os.environ.get(key, None) is None:
os.environ[key] = str(value)
# NOTE(paddle-dev): All of these flags should be
# set before `import paddle`. Otherwise, it would
# not take any effect.
set_paddle_flags(
FLAGS_eager_delete_tensor_gb=0, # enable GC to save memory
)
import paddle
import paddle.fluid as fluid
import reader
......@@ -28,6 +42,7 @@ add_arg('ap_version', str, '11point', "mAP version can be inte
add_arg('image_shape', str, '3,300,300', "Input image shape.")
add_arg('mean_BGR', str, '127.5,127.5,127.5', "Mean value for B,G,R channel which will be subtracted.")
add_arg('data_dir', str, 'data/pascalvoc', "Data directory.")
add_arg('use_multiprocess', bool, True, "Whether use multi-process for data preprocessing.")
add_arg('enable_ce', bool, False, "Whether use CE to evaluate the model.")
#yapf: enable
......@@ -185,14 +200,8 @@ def train(args,
build_strategy.memory_optimize = True
train_exe = fluid.ParallelExecutor(main_program=train_prog,
use_cuda=use_gpu, loss_name=loss.name, build_strategy=build_strategy)
train_reader = reader.train(data_args,
train_file_list,
batch_size_per_device,
shuffle=is_shuffle,
num_workers=num_workers,
enable_ce=enable_ce)
test_reader = reader.test(data_args, val_file_list, batch_size)
train_py_reader.decorate_paddle_reader(train_reader)
test_py_reader.decorate_paddle_reader(test_reader)
def save_model(postfix, main_prog):
......@@ -232,6 +241,7 @@ def train(args,
train_file_list,
batch_size_per_device,
shuffle=is_shuffle,
use_multiprocess=args.use_multiprocess,
num_workers=num_workers,
enable_ce=enable_ce)
train_py_reader.decorate_paddle_reader(train_reader)
......
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册