diff --git a/README.md b/README.md index 3d428e7778cc0d3265bb6bc88653670067278053..be7b4102fd011b9b9e97f91ad8afb0eee30abb32 100644 --- a/README.md +++ b/README.md @@ -134,6 +134,7 @@ - [Objects365 2019 Challenge夺冠模型](docs/featured_model/champion_model/CACascadeRCNN.md) - [Open Images 2019-Object Detction比赛最佳单模型](docs/featured_model/champion_model/OIDV5_BASELINE_MODEL.md) - [服务器端实用目标检测模型](configs/rcnn_enhance/README.md): V100上速度20FPS时,COCO mAP高达47.8%。 +- [大规模实用目标检测模型](docs/featured_model/LARGE_SCALE_DET_MODEL.md): 提供了包含676个类别的大规模服务器端实用目标检测模型,适用于绝大部分使用场景,可以直接用来预测,也可以用于微调其他任务。 ## 许可证书 diff --git a/README_en.md b/README_en.md index ed2a97348d7df4bd2dc1709e27792d11e11f9bce..f5a1cd71cf997b5fb8d441f750e938d23e57d1e6 100644 --- a/README_en.md +++ b/README_en.md @@ -149,6 +149,7 @@ The following is the relationship between COCO mAP and FPS on Tesla V100 of repr - [Objects365 2019 Challenge champion model](docs/featured_model/champion_model/CACascadeRCNN.md) - [Best single model of Open Images 2019-Object Detction](docs/featured_model/champion_model/OIDV5_BASELINE_MODEL.md) - [Practical Server-side detection method](configs/rcnn_enhance/README_en.md): Inference speed on single V100 GPU can reach 20FPS when COCO mAP is 47.8%. +- [Large-scale practical object detection models](docs/featured_model/LARGE_SCALE_DET_MODEL_en.md): Large-scale practical server-side detection pretrained models with 676 categories are provided for most application scenarios, which can be used not only for direct inference but also finetuning on other datasets. 
## License diff --git a/configs/rcnn_enhance/README.md b/configs/rcnn_enhance/README.md index 08428fbcef4ea14eabb701ff888e1bff2c9bf2b7..73d03ec9150e8db926d2f93900431f75bef3a247 100644 --- a/configs/rcnn_enhance/README.md +++ b/configs/rcnn_enhance/README.md @@ -34,3 +34,4 @@ | :---------------------- | :-------------: | :-------: | :-----: | :------------: | :----: | :-----: | :-------------: | :-----: | | ResNet50-vd-FPN-Dcnv2 | Faster | 2 | 3x | 61.425 | 41.6 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_dcn_r50_vd_fpn_3x_server_side.tar) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/master/configs/rcnn_server_side_det/faster_rcnn_dcn_r50_vd_fpn_3x_server_side.yml) | | ResNet50-vd-FPN-Dcnv2 | Cascade Faster | 2 | 3x | 20.001 | 47.8 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/cascade_rcnn_dcn_r50_vd_fpn_3x_server_side.tar) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/master/configs/rcnn_server_side_det/cascade_rcnn_dcn_r50_vd_fpn_3x_server_side.yml) | +| ResNet101-vd-FPN-Dcnv2 | Cascade Faster | 2 | 3x | 19.523 | 49.4 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/cascade_rcnn_dcn_r101_vd_fpn_3x_server_side.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/master/configs/rcnn_server_side_det/cascade_rcnn_dcn_r101_vd_fpn_3x_server_side.yml) | diff --git a/configs/rcnn_enhance/README_en.md b/configs/rcnn_enhance/README_en.md index 4add4dcef58283666b60874f34fb9d060863f794..bc33893784b327bd1ac826f6fe1bbb179a3ee5a4 100644 --- a/configs/rcnn_enhance/README_en.md +++ b/configs/rcnn_enhance/README_en.md @@ -30,11 +30,12 @@ And the following figure shows `mAP-Speed` curves for some common detectors. > For fair comparison, inference time for PSS-DET models on V100 GPU is transformed to Titan V GPU by multiplying by 1.2 times. 
- - ## Model Zoo +#### COCO dataset + | Backbone | Type | Image/gpu | Lr schd | Inf time (fps) | Box AP | Mask AP | Download | Configs | | :---------------------- | :-------------: | :-------: | :-----: | :------------: | :----: | :-----: | :----------------------------------------------------------: | :-----: | | ResNet50-vd-FPN-Dcnv2 | Faster | 2 | 3x | 61.425 | 41.6 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_dcn_r50_vd_fpn_3x_server_side.tar) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/master/configs/rcnn_server_side_det/faster_rcnn_dcn_r50_vd_fpn_3x_server_side.yml) | | ResNet50-vd-FPN-Dcnv2 | Cascade Faster | 2 | 3x | 20.001 | 47.8 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/cascade_rcnn_dcn_r50_vd_fpn_3x_server_side.tar) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/master/configs/rcnn_server_side_det/cascade_rcnn_dcn_r50_vd_fpn_3x_server_side.yml) | +| ResNet101-vd-FPN-Dcnv2 | Cascade Faster | 2 | 3x | 19.523 | 49.4 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/cascade_rcnn_dcn_r101_vd_fpn_3x_server_side.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/master/configs/rcnn_server_side_det/cascade_rcnn_dcn_r101_vd_fpn_3x_server_side.yml) | diff --git a/configs/rcnn_enhance/cascade_rcnn_dcn_r101_vd_fpn_3x_server_side.yml b/configs/rcnn_enhance/cascade_rcnn_dcn_r101_vd_fpn_3x_server_side.yml new file mode 100644 index 0000000000000000000000000000000000000000..15e1a1a8991a2cef14dda76e9364cd558f27539b --- /dev/null +++ b/configs/rcnn_enhance/cascade_rcnn_dcn_r101_vd_fpn_3x_server_side.yml @@ -0,0 +1,216 @@ +architecture: CascadeRCNN +max_iters: 270000 +snapshot_iter: 30000 +use_gpu: true +log_smooth_window: 20 +log_iter: 20 +save_dir: output +pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_vd_ssld_pretrained.tar +weights: output/cascade_rcnn_dcn_r101_vd_fpn_3x_server_side/model_final +metric: COCO 
+num_classes: 81 + +CascadeRCNN: + backbone: ResNet + fpn: FPN + rpn_head: FPNRPNHead + roi_extractor: FPNRoIAlign + bbox_head: CascadeBBoxHead + bbox_assigner: CascadeBBoxAssigner + +ResNet: + norm_type: bn + depth: 101 + feature_maps: [2, 3, 4, 5] + freeze_at: 2 + variant: d + dcn_v2_stages: [3, 4, 5] + lr_mult_list: [0.05, 0.05, 0.1, 0.15] + +FPN: + max_level: 6 + min_level: 2 + num_chan: 64 + spatial_scale: [0.03125, 0.0625, 0.125, 0.25] + +FPNRPNHead: + anchor_generator: + anchor_sizes: [32, 64, 128, 256, 512] + aspect_ratios: [0.5, 1.0, 2.0] + stride: [16.0, 16.0] + variance: [1.0, 1.0, 1.0, 1.0] + anchor_start_size: 32 + min_level: 2 + max_level: 6 + num_chan: 64 + rpn_target_assign: + rpn_batch_size_per_im: 256 + rpn_fg_fraction: 0.5 + rpn_positive_overlap: 0.7 + rpn_negative_overlap: 0.3 + rpn_straddle_thresh: 0.0 + train_proposal: + min_size: 0.0 + nms_thresh: 0.7 + pre_nms_top_n: 2000 + post_nms_top_n: 2000 + test_proposal: + min_size: 0.0 + nms_thresh: 0.7 + pre_nms_top_n: 500 + post_nms_top_n: 300 + +FPNRoIAlign: + canconical_level: 4 + canonical_size: 224 + min_level: 2 + max_level: 5 + box_resolution: 7 + sampling_ratio: 2 + +CascadeBBoxAssigner: + batch_size_per_im: 512 + bbox_reg_weights: [10, 20, 30] + bg_thresh_lo: [0.0, 0.0, 0.0] + bg_thresh_hi: [0.5, 0.6, 0.7] + fg_thresh: [0.5, 0.6, 0.7] + fg_fraction: 0.25 + +CascadeBBoxHead: + head: CascadeTwoFCHead + bbox_loss: BalancedL1Loss + nms: + keep_top_k: 100 + nms_threshold: 0.5 + score_threshold: 0.05 + +BalancedL1Loss: + alpha: 0.5 + gamma: 1.5 + beta: 1.0 + loss_weight: 1.0 + +CascadeTwoFCHead: + mlp_dim: 1024 + +LearningRate: + base_lr: 0.02 + schedulers: + - !PiecewiseDecay + gamma: 0.1 + milestones: [180000, 240000] + - !LinearWarmup + start_factor: 0.1 + steps: 1000 + +OptimizerBuilder: + optimizer: + momentum: 0.9 + type: Momentum + regularizer: + factor: 0.0001 + type: L2 + +TrainReader: + inputs_def: + fields: ['image', 'im_info', 'im_id', 'gt_bbox', 'gt_class', 'is_crowd'] + dataset: + 
!COCODataSet + image_dir: train2017 + anno_path: annotations/instances_train2017.json + dataset_dir: dataset/coco + sample_transforms: + - !DecodeImage + to_rgb: true + - !RandomFlipImage + prob: 0.5 + - !AutoAugmentImage + autoaug_type: v1 + - !NormalizeImage + is_channel_first: false + is_scale: true + mean: [0.485,0.456,0.406] + std: [0.229, 0.224,0.225] + - !ResizeImage + target_size: [640, 672, 704, 736, 768, 800, 832, 864, 896, 928, 960, 992, 1024] + max_size: 1500 + interp: 1 + use_cv2: true + - !Permute + to_bgr: false + channel_first: true + batch_transforms: + - !PadBatch + pad_to_stride: 32 + use_padded_im_info: false + batch_size: 2 + shuffle: true + worker_num: 2 + use_process: false + +EvalReader: + inputs_def: + fields: ['image', 'im_info', 'im_id', 'im_shape'] + # for voc + #fields: ['image', 'im_info', 'im_id', 'gt_bbox', 'gt_class', 'is_difficult'] + dataset: + !COCODataSet + image_dir: val2017 + anno_path: annotations/instances_val2017.json + dataset_dir: dataset/coco + sample_transforms: + - !DecodeImage + to_rgb: true + with_mixup: false + - !NormalizeImage + is_channel_first: false + is_scale: true + mean: [0.485,0.456,0.406] + std: [0.229, 0.224,0.225] + - !ResizeImage + interp: 1 + max_size: 1333 + target_size: 800 + use_cv2: true + - !Permute + channel_first: true + to_bgr: false + batch_transforms: + - !PadBatch + pad_to_stride: 32 + use_padded_im_info: true + batch_size: 1 + shuffle: false + drop_empty: false + worker_num: 2 + +TestReader: + inputs_def: + # set image_shape if needed + fields: ['image', 'im_info', 'im_id', 'im_shape'] + dataset: + !ImageFolder + anno_path: annotations/instances_val2017.json + sample_transforms: + - !DecodeImage + to_rgb: true + with_mixup: false + - !NormalizeImage + is_channel_first: false + is_scale: true + mean: [0.485,0.456,0.406] + std: [0.229, 0.224,0.225] + - !ResizeImage + interp: 1 + max_size: 1333 + target_size: 800 + use_cv2: true + - !Permute + channel_first: true + to_bgr: false +
batch_transforms: + - !PadBatch + pad_to_stride: 32 + use_padded_im_info: true + batch_size: 1 + shuffle: false diff --git a/configs/rcnn_enhance/generic/cascade_rcnn_cbr101_vd_fpn_generic_server_side.yml b/configs/rcnn_enhance/generic/cascade_rcnn_cbr101_vd_fpn_generic_server_side.yml new file mode 100644 index 0000000000000000000000000000000000000000..adc2cfa2ac9fce0e938b039f852b981308e67f05 --- /dev/null +++ b/configs/rcnn_enhance/generic/cascade_rcnn_cbr101_vd_fpn_generic_server_side.yml @@ -0,0 +1,220 @@ +architecture: CascadeRCNN +max_iters: 1500000 +snapshot_iter: 100000 +use_gpu: true +log_smooth_window: 20 +log_iter: 20 +save_dir: output +pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/CBResNet101_vd_ssld_pretrained.tar +weights: output/cascade_rcnn_cbr101_vd_fpn_generic_server_side/model_final +metric: VOC +num_classes: 677 + +CascadeRCNN: + backbone: CBResNet + fpn: FPN + rpn_head: FPNRPNHead + roi_extractor: FPNRoIAlign + bbox_head: CascadeBBoxHead + bbox_assigner: CascadeBBoxAssigner + +CBResNet: + norm_type: bn + norm_decay: 0. 
+ depth: 101 + feature_maps: [2, 3, 4, 5] + freeze_at: 2 + variant: d + repeat_num: 2 + lr_mult_list: [0.05, 0.05, 0.1, 0.15] + +FPN: + max_level: 6 + min_level: 2 + num_chan: 256 + spatial_scale: [0.03125, 0.0625, 0.125, 0.25] + +FPNRPNHead: + anchor_generator: + anchor_sizes: [32, 64, 128, 256, 512] + aspect_ratios: [0.5, 1.0, 2.0] + stride: [16.0, 16.0] + variance: [1.0, 1.0, 1.0, 1.0] + anchor_start_size: 32 + min_level: 2 + max_level: 6 + num_chan: 256 + rpn_target_assign: + rpn_batch_size_per_im: 256 + rpn_fg_fraction: 0.5 + rpn_positive_overlap: 0.7 + rpn_negative_overlap: 0.3 + rpn_straddle_thresh: 0.0 + train_proposal: + min_size: 0.0 + nms_thresh: 0.7 + pre_nms_top_n: 2000 + post_nms_top_n: 2000 + test_proposal: + min_size: 0.0 + nms_thresh: 0.7 + pre_nms_top_n: 500 + post_nms_top_n: 300 + +FPNRoIAlign: + canconical_level: 4 + canonical_size: 224 + min_level: 2 + max_level: 5 + box_resolution: 14 + sampling_ratio: 2 + +CascadeBBoxAssigner: + batch_size_per_im: 512 + bbox_reg_weights: [10, 20, 30] + bg_thresh_lo: [0.0, 0.0, 0.0] + bg_thresh_hi: [0.5, 0.6, 0.7] + fg_thresh: [0.5, 0.6, 0.7] + fg_fraction: 0.25 + +CascadeBBoxHead: + head: CascadeTwoFCHead + bbox_loss: BalancedL1Loss + nms: + keep_top_k: 100 + nms_threshold: 0.5 + score_threshold: 0.05 + +BalancedL1Loss: + alpha: 0.5 + gamma: 1.5 + beta: 1.0 + loss_weight: 1.0 + +CascadeTwoFCHead: + mlp_dim: 1024 + +LearningRate: + base_lr: 0.005 + schedulers: + - !PiecewiseDecay + gamma: 0.1 + milestones: [1000000, 1400000] + - !LinearWarmup + start_factor: 0.1 + steps: 1000 + +OptimizerBuilder: + optimizer: + momentum: 0.9 + type: Momentum + regularizer: + factor: 0.0001 + type: L2 + +TrainReader: + inputs_def: + fields: ['image', 'im_info', 'im_id', 'gt_bbox', 'gt_class', 'is_crowd'] + dataset: + !COCODataSet + image_dir: train2017 + anno_path: annotations/instances_train2017.json + dataset_dir: dataset/coco + sample_transforms: + - !DecodeImage + to_rgb: true + - !RandomFlipImage + prob: 0.5 + - 
!AutoAugmentImage + autoaug_type: v1 + - !NormalizeImage + is_channel_first: false + is_scale: true + mean: [0.485,0.456,0.406] + std: [0.229, 0.224,0.225] + - !ResizeImage + target_size: [640, 672, 704, 736, 768, 800, 832, 864, 896, 928, 960, 992, 1024] + max_size: 1500 + interp: 1 + use_cv2: true + - !Permute + to_bgr: false + channel_first: true + batch_transforms: + - !PadBatch + pad_to_stride: 32 + use_padded_im_info: false + batch_size: 1 + shuffle: true + worker_num: 2 + use_process: false + +EvalReader: + inputs_def: + fields: ['image', 'im_info', 'im_id', 'im_shape'] + # for voc + #fields: ['image', 'im_info', 'im_id', 'gt_bbox', 'gt_class', 'is_difficult'] + dataset: + !COCODataSet + image_dir: val2017 + anno_path: annotations/instances_val2017.json + dataset_dir: dataset/coco + sample_transforms: + - !DecodeImage + to_rgb: true + with_mixup: false + - !NormalizeImage + is_channel_first: false + is_scale: true + mean: [0.485,0.456,0.406] + std: [0.229, 0.224,0.225] + - !ResizeImage + interp: 1 + max_size: 1300 + target_size: 800 + use_cv2: true + - !Permute + channel_first: true + to_bgr: false + batch_transforms: + - !PadBatch + pad_to_stride: 32 + use_padded_im_info: true + batch_size: 1 + shuffle: false + drop_empty: false + worker_num: 2 + + +TestReader: + inputs_def: + # set image_shape if needed + fields: ['image', 'im_info', 'im_id', 'im_shape'] + dataset: + !ImageFolder + use_default_label: false + with_background: true + anno_path: ./dataset/voc/generic_det_label_list.txt + sample_transforms: + - !DecodeImage + to_rgb: true + with_mixup: false + - !NormalizeImage + is_channel_first: false + is_scale: true + mean: [0.485,0.456,0.406] + std: [0.229, 0.224,0.225] + - !ResizeImage + interp: 1 + max_size: 1333 + target_size: 800 + use_cv2: true + - !Permute + channel_first: true + to_bgr: false + batch_transforms: + - !PadBatch + pad_to_stride: 32 + use_padded_im_info: true + batch_size: 1 + shuffle: false diff --git 
a/configs/rcnn_enhance/generic/cascade_rcnn_dcn_r101_vd_fpn_generic_server_side.yml b/configs/rcnn_enhance/generic/cascade_rcnn_dcn_r101_vd_fpn_generic_server_side.yml new file mode 100644 index 0000000000000000000000000000000000000000..c47b9b7f401bef7e77fd6209c1f716306293164f --- /dev/null +++ b/configs/rcnn_enhance/generic/cascade_rcnn_dcn_r101_vd_fpn_generic_server_side.yml @@ -0,0 +1,218 @@ +architecture: CascadeRCNN +max_iters: 1500000 +snapshot_iter: 100000 +use_gpu: true +log_smooth_window: 20 +log_iter: 20 +save_dir: output +pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_vd_ssld_pretrained.tar +weights: output/cascade_rcnn_dcn_r101_vd_fpn_generic_server_side/model_final +metric: VOC +num_classes: 677 + +CascadeRCNN: + backbone: ResNet + fpn: FPN + rpn_head: FPNRPNHead + roi_extractor: FPNRoIAlign + bbox_head: CascadeBBoxHead + bbox_assigner: CascadeBBoxAssigner + +ResNet: + norm_type: bn + depth: 101 + feature_maps: [2, 3, 4, 5] + freeze_at: 2 + variant: d + dcn_v2_stages: [3, 4, 5] + lr_mult_list: [0.05, 0.05, 0.1, 0.15] + +FPN: + max_level: 6 + min_level: 2 + num_chan: 64 + spatial_scale: [0.03125, 0.0625, 0.125, 0.25] + +FPNRPNHead: + anchor_generator: + anchor_sizes: [32, 64, 128, 256, 512] + aspect_ratios: [0.5, 1.0, 2.0] + stride: [16.0, 16.0] + variance: [1.0, 1.0, 1.0, 1.0] + anchor_start_size: 32 + min_level: 2 + max_level: 6 + num_chan: 64 + rpn_target_assign: + rpn_batch_size_per_im: 256 + rpn_fg_fraction: 0.5 + rpn_positive_overlap: 0.7 + rpn_negative_overlap: 0.3 + rpn_straddle_thresh: 0.0 + train_proposal: + min_size: 0.0 + nms_thresh: 0.7 + pre_nms_top_n: 2000 + post_nms_top_n: 2000 + test_proposal: + min_size: 0.0 + nms_thresh: 0.7 + pre_nms_top_n: 500 + post_nms_top_n: 300 + +FPNRoIAlign: + canconical_level: 4 + canonical_size: 224 + min_level: 2 + max_level: 5 + box_resolution: 7 + sampling_ratio: 2 + +CascadeBBoxAssigner: + batch_size_per_im: 512 + bbox_reg_weights: [10, 20, 30] + bg_thresh_lo: [0.0, 0.0, 
0.0] + bg_thresh_hi: [0.5, 0.6, 0.7] + fg_thresh: [0.5, 0.6, 0.7] + fg_fraction: 0.25 + +CascadeBBoxHead: + head: CascadeTwoFCHead + bbox_loss: BalancedL1Loss + nms: + keep_top_k: 100 + nms_threshold: 0.5 + score_threshold: 0.05 + +BalancedL1Loss: + alpha: 0.5 + gamma: 1.5 + beta: 1.0 + loss_weight: 1.0 + +CascadeTwoFCHead: + mlp_dim: 1024 + +LearningRate: + base_lr: 0.01 + schedulers: + - !PiecewiseDecay + gamma: 0.1 + milestones: [1000000, 1400000] + - !LinearWarmup + start_factor: 0.1 + steps: 1000 + +OptimizerBuilder: + optimizer: + momentum: 0.9 + type: Momentum + regularizer: + factor: 0.0001 + type: L2 + +TrainReader: + inputs_def: + fields: ['image', 'im_info', 'im_id', 'gt_bbox', 'gt_class', 'is_crowd'] + dataset: + !COCODataSet + image_dir: train2017 + anno_path: annotations/instances_train2017.json + dataset_dir: dataset/coco + sample_transforms: + - !DecodeImage + to_rgb: true + - !RandomFlipImage + prob: 0.5 + - !AutoAugmentImage + autoaug_type: v1 + - !NormalizeImage + is_channel_first: false + is_scale: true + mean: [0.485,0.456,0.406] + std: [0.229, 0.224,0.225] + - !ResizeImage + target_size: [640, 672, 704, 736, 768, 800, 832, 864, 896, 928, 960, 992, 1024] + max_size: 1500 + interp: 1 + use_cv2: true + - !Permute + to_bgr: false + channel_first: true + batch_transforms: + - !PadBatch + pad_to_stride: 32 + use_padded_im_info: false + batch_size: 1 + shuffle: true + worker_num: 2 + use_process: false + +EvalReader: + inputs_def: + fields: ['image', 'im_info', 'im_id', 'im_shape'] + # for voc + #fields: ['image', 'im_info', 'im_id', 'gt_bbox', 'gt_class', 'is_difficult'] + dataset: + !COCODataSet + image_dir: val2017 + anno_path: annotations/instances_val2017.json + dataset_dir: dataset/coco + sample_transforms: + - !DecodeImage + to_rgb: true + with_mixup: false + - !NormalizeImage + is_channel_first: false + is_scale: true + mean: [0.485,0.456,0.406] + std: [0.229, 0.224,0.225] + - !ResizeImage + interp: 1 + max_size: 1300 + target_size: 800 + 
use_cv2: true + - !Permute + channel_first: true + to_bgr: false + batch_transforms: + - !PadBatch + pad_to_stride: 32 + use_padded_im_info: true + batch_size: 1 + shuffle: false + drop_empty: false + worker_num: 2 + +TestReader: + inputs_def: + # set image_shape if needed + fields: ['image', 'im_info', 'im_id', 'im_shape'] + dataset: + !ImageFolder + use_default_label: false + with_background: true + anno_path: ./dataset/voc/generic_det_label_list.txt + sample_transforms: + - !DecodeImage + to_rgb: true + with_mixup: false + - !NormalizeImage + is_channel_first: false + is_scale: true + mean: [0.485,0.456,0.406] + std: [0.229, 0.224,0.225] + - !ResizeImage + interp: 1 + max_size: 1333 + target_size: 800 + use_cv2: true + - !Permute + channel_first: true + to_bgr: false + batch_transforms: + - !PadBatch + pad_to_stride: 32 + use_padded_im_info: true + batch_size: 1 + shuffle: false diff --git a/configs/rcnn_enhance/generic/cascade_rcnn_dcn_r50_vd_fpn_generic_server_side.yml b/configs/rcnn_enhance/generic/cascade_rcnn_dcn_r50_vd_fpn_generic_server_side.yml new file mode 100644 index 0000000000000000000000000000000000000000..f7ef433189314c68f32151db41bd021cbf5c9c84 --- /dev/null +++ b/configs/rcnn_enhance/generic/cascade_rcnn_dcn_r50_vd_fpn_generic_server_side.yml @@ -0,0 +1,218 @@ +architecture: CascadeRCNN +max_iters: 750000 +snapshot_iter: 50000 +use_gpu: true +log_smooth_window: 20 +log_iter: 20 +save_dir: output +pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_vd_ssld_v2_pretrained.tar +weights: output/cascade_rcnn_dcn_r50_vd_fpn_generic_server_side/model_final +metric: VOC +num_classes: 677 + +CascadeRCNN: + backbone: ResNet + fpn: FPN + rpn_head: FPNRPNHead + roi_extractor: FPNRoIAlign + bbox_head: CascadeBBoxHead + bbox_assigner: CascadeBBoxAssigner + +ResNet: + norm_type: bn + depth: 50 + feature_maps: [2, 3, 4, 5] + freeze_at: 2 + variant: d + dcn_v2_stages: [3, 4, 5] + lr_mult_list: [0.05, 0.05, 0.1, 0.15] + +FPN: + max_level: 
6 + min_level: 2 + num_chan: 64 + spatial_scale: [0.03125, 0.0625, 0.125, 0.25] + +FPNRPNHead: + anchor_generator: + anchor_sizes: [32, 64, 128, 256, 512] + aspect_ratios: [0.5, 1.0, 2.0] + stride: [16.0, 16.0] + variance: [1.0, 1.0, 1.0, 1.0] + anchor_start_size: 32 + min_level: 2 + max_level: 6 + num_chan: 64 + rpn_target_assign: + rpn_batch_size_per_im: 256 + rpn_fg_fraction: 0.5 + rpn_positive_overlap: 0.7 + rpn_negative_overlap: 0.3 + rpn_straddle_thresh: 0.0 + train_proposal: + min_size: 0.0 + nms_thresh: 0.7 + pre_nms_top_n: 2000 + post_nms_top_n: 2000 + test_proposal: + min_size: 0.0 + nms_thresh: 0.7 + pre_nms_top_n: 500 + post_nms_top_n: 300 + +FPNRoIAlign: + canconical_level: 4 + canonical_size: 224 + min_level: 2 + max_level: 5 + box_resolution: 7 + sampling_ratio: 2 + +CascadeBBoxAssigner: + batch_size_per_im: 512 + bbox_reg_weights: [10, 20, 30] + bg_thresh_lo: [0.0, 0.0, 0.0] + bg_thresh_hi: [0.5, 0.6, 0.7] + fg_thresh: [0.5, 0.6, 0.7] + fg_fraction: 0.25 + +CascadeBBoxHead: + head: CascadeTwoFCHead + bbox_loss: BalancedL1Loss + nms: + keep_top_k: 100 + nms_threshold: 0.5 + score_threshold: 0.05 + +BalancedL1Loss: + alpha: 0.5 + gamma: 1.5 + beta: 1.0 + loss_weight: 1.0 + +CascadeTwoFCHead: + mlp_dim: 1024 + +LearningRate: + base_lr: 0.02 + schedulers: + - !PiecewiseDecay + gamma: 0.1 + milestones: [500000, 700000] + - !LinearWarmup + start_factor: 0.1 + steps: 1000 + +OptimizerBuilder: + optimizer: + momentum: 0.9 + type: Momentum + regularizer: + factor: 0.0001 + type: L2 + +TrainReader: + inputs_def: + fields: ['image', 'im_info', 'im_id', 'gt_bbox', 'gt_class', 'is_crowd'] + dataset: + !COCODataSet + image_dir: train2017 + anno_path: annotations/instances_train2017.json + dataset_dir: dataset/coco + sample_transforms: + - !DecodeImage + to_rgb: true + - !RandomFlipImage + prob: 0.5 + - !AutoAugmentImage + autoaug_type: v1 + - !NormalizeImage + is_channel_first: false + is_scale: true + mean: [0.485,0.456,0.406] + std: [0.229, 0.224,0.225] + -
!ResizeImage + target_size: [640, 672, 704, 736, 768, 800, 832, 864, 896, 928, 960, 992, 1024] + max_size: 1500 + interp: 1 + use_cv2: true + - !Permute + to_bgr: false + channel_first: true + batch_transforms: + - !PadBatch + pad_to_stride: 32 + use_padded_im_info: false + batch_size: 2 + shuffle: true + worker_num: 2 + use_process: false + +EvalReader: + inputs_def: + fields: ['image', 'im_info', 'im_id', 'im_shape'] + # for voc + #fields: ['image', 'im_info', 'im_id', 'gt_bbox', 'gt_class', 'is_difficult'] + dataset: + !COCODataSet + image_dir: val2017 + anno_path: annotations/instances_val2017.json + dataset_dir: dataset/coco + sample_transforms: + - !DecodeImage + to_rgb: true + with_mixup: false + - !NormalizeImage + is_channel_first: false + is_scale: true + mean: [0.485,0.456,0.406] + std: [0.229, 0.224,0.225] + - !ResizeImage + interp: 1 + max_size: 1500 + target_size: 1000 + use_cv2: true + - !Permute + channel_first: true + to_bgr: false + batch_transforms: + - !PadBatch + pad_to_stride: 32 + use_padded_im_info: true + batch_size: 1 + shuffle: false + drop_empty: false + worker_num: 2 + +TestReader: + inputs_def: + # set image_shape if needed + fields: ['image', 'im_info', 'im_id', 'im_shape'] + dataset: + !ImageFolder + use_default_label: false + with_background: true + anno_path: ./dataset/voc/generic_det_label_list.txt + sample_transforms: + - !DecodeImage + to_rgb: true + with_mixup: false + - !NormalizeImage + is_channel_first: false + is_scale: true + mean: [0.485,0.456,0.406] + std: [0.229, 0.224,0.225] + - !ResizeImage + interp: 1 + max_size: 1500 + target_size: 1000 + use_cv2: true + - !Permute + channel_first: true + to_bgr: false + batch_transforms: + - !PadBatch + pad_to_stride: 32 + use_padded_im_info: true + batch_size: 1 + shuffle: false diff --git a/dataset/voc/generic_det_label_list.txt b/dataset/voc/generic_det_label_list.txt new file mode 100644 index 0000000000000000000000000000000000000000..410f9ae593ba501be091bc267491f6158c339a44 
--- /dev/null +++ b/dataset/voc/generic_det_label_list.txt @@ -0,0 +1,676 @@ +Infant bed +Rose +Flag +Flashlight +Sea turtle +Camera +Animal +Glove +Crocodile +Cattle +House +Guacamole +Penguin +Vehicle registration plate +Bench +Ladybug +Human nose +Watermelon +Flute +Butterfly +Washing machine +Raccoon +Segway +Taco +Jellyfish +Cake +Pen +Cannon +Bread +Tree +Shellfish +Bed +Hamster +Hat +Toaster +Sombrero +Tiara +Bowl +Dragonfly +Moths and butterflies +Antelope +Vegetable +Torch +Building +Power plugs and sockets +Blender +Billiard table +Cutting board +Bronze sculpture +Turtle +Broccoli +Tiger +Mirror +Bear +Zucchini +Dress +Volleyball +Guitar +Reptile +Golf cart +Tart +Fedora +Carnivore +Car +Lighthouse +Coffeemaker +Food processor +Truck +Bookcase +Surfboard +Footwear +Bench +Necklace +Flower +Radish +Marine mammal +Frying pan +Tap +Peach +Knife +Handbag +Laptop +Tent +Ambulance +Christmas tree +Eagle +Limousine +Kitchen & dining room table +Polar bear +Tower +Football +Willow +Human head +Stop sign +Banana +Mixer +Binoculars +Dessert +Bee +Chair +Wood-burning stove +Flowerpot +Beaker +Oyster +Woodpecker +Harp +Bathtub +Wall clock +Sports uniform +Rhinoceros +Beehive +Cupboard +Chicken +Man +Blue jay +Cucumber +Balloon +Kite +Fireplace +Lantern +Missile +Book +Spoon +Grapefruit +Squirrel +Orange +Coat +Punching bag +Zebra +Billboard +Bicycle +Door handle +Mechanical fan +Ring binder +Table +Parrot +Sock +Vase +Weapon +Shotgun +Glasses +Seahorse +Belt +Watercraft +Window +Giraffe +Lion +Tire +Vehicle +Canoe +Tie +Shelf +Picture frame +Printer +Human leg +Boat +Slow cooker +Croissant +Candle +Pancake +Pillow +Coin +Stretcher +Sandal +Woman +Stairs +Harpsichord +Stool +Bus +Suitcase +Human mouth +Juice +Skull +Door +Violin +Chopsticks +Digital clock +Sunflower +Leopard +Bell pepper +Harbor seal +Snake +Sewing machine +Goose +Helicopter +Seat belt +Coffee cup +Microwave oven +Hot dog +Countertop +Serving tray +Dog bed +Beer +Sunglasses +Golf ball +Waffle +Palm 
tree +Trumpet +Ruler +Helmet +Ladder +Office building +Tablet computer +Toilet paper +Pomegranate +Skirt +Gas stove +Cookie +Cart +Raven +Egg +Burrito +Goat +Kitchen knife +Skateboard +Salt and pepper shakers +Lynx +Boot +Platter +Ski +Swimwear +Swimming pool +Drinking straw +Wrench +Drum +Ant +Human ear +Headphones +Fountain +Bird +Jeans +Television +Crab +Microphone +Home appliance +Snowplow +Beetle +Artichoke +Jet ski +Stationary bicycle +Human hair +Brown bear +Starfish +Fork +Lobster +Corded phone +Drink +Saucer +Carrot +Insect +Clock +Castle +Tennis racket +Ceiling fan +Asparagus +Jaguar +Musical instrument +Train +Cat +Rifle +Dumbbell +Mobile phone +Taxi +Shower +Pitcher +Lemon +Invertebrate +Turkey +High heels +Bust +Elephant +Scarf +Barrel +Trombone +Pumpkin +Box +Tomato +Frog +Bidet +Human face +Houseplant +Van +Shark +Ice cream +Swim cap +Falcon +Ostrich +Handgun +Whiteboard +Lizard +Pasta +Snowmobile +Light bulb +Window blind +Muffin +Pretzel +Computer monitor +Horn +Furniture +Sandwich +Fox +Convenience store +Fish +Fruit +Earrings +Curtain +Grape +Sofa bed +Horse +Luggage and bags +Desk +Crutch +Bicycle helmet +Tick +Airplane +Canary +Spatula +Watch +Lily +Kitchen appliance +Filing cabinet +Aircraft +Cake stand +Candy +Sink +Mouse +Wine +Wheelchair +Goldfish +Refrigerator +French fries +Drawer +Treadmill +Picnic basket +Dice +Cabbage +Football helmet +Pig +Person +Shorts +Gondola +Honeycomb +Doughnut +Chest of drawers +Land vehicle +Bat +Monkey +Dagger +Tableware +Human foot +Mug +Alarm clock +Pressure cooker +Human hand +Tortoise +Baseball glove +Sword +Pear +Miniskirt +Traffic sign +Girl +Roller skates +Dinosaur +Porch +Human beard +Submarine sandwich +Screwdriver +Strawberry +Wine glass +Seafood +Racket +Wheel +Sea lion +Toy +Tea +Tennis ball +Waste container +Mule +Cricket ball +Pineapple +Coconut +Doll +Coffee table +Snowman +Lavender +Shrimp +Maple +Cowboy hat +Goggles +Rugby ball +Caterpillar +Poster +Rocket +Organ +Saxophone +Traffic light 
+Cocktail +Plastic bag +Squash +Mushroom +Hamburger +Light switch +Parachute +Teddy bear +Winter melon +Deer +Musical keyboard +Plumbing fixture +Scoreboard +Baseball bat +Envelope +Adhesive tape +Briefcase +Paddle +Bow and arrow +Telephone +Sheep +Jacket +Boy +Pizza +Otter +Office supplies +Couch +Cello +Bull +Camel +Ball +Duck +Whale +Shirt +Tank +Motorcycle +Accordion +Owl +Porcupine +Sun hat +Nail +Scissors +Swan +Lamp +Crown +Piano +Sculpture +Cheetah +Oboe +Tin can +Mango +Tripod +Oven +Mouse +Barge +Coffee +Snowboard +Common fig +Salad +Marine invertebrates +Umbrella +Kangaroo +Human arm +Measuring cup +Snail +Loveseat +Suit +Teapot +Bottle +Alpaca +Kettle +Trousers +Popcorn +Centipede +Spider +Sparrow +Plate +Bagel +Personal care +Apple +Brassiere +Bathroom cabinet +studio couch +Computer keyboard +Table tennis racket +Sushi +Cabinetry +Street light +Towel +Nightstand +Rabbit +Dolphin +Dog +Jug +Wok +Fire hydrant +Human eye +Skyscraper +Backpack +Potato +Paper towel +Lifejacket +Bicycle wheel +Toilet +tuba +carpet +trolley +tv +fan +llama +stapler +tricycle +head_phone +air_conditioner +cookies +towel/napkin +boots +sausage +suv +bar_soap +baseball +luggage +poker_card +shovel +marker +earphone +projector +pencil_case +french_horn +tangerine +router/modem +folder +donut +durian +sailboat +nuts +coffee_machine +meat_balls +basket +extension_cord +green_beans +avocado +soccer +egg_tart +clutch +slide +fishing_rod +hanger +bread/bun +surveillance_camera +globe +blackboard/whiteboard +life_saver +pigeon +red_cabbage +cymbal +faucet +steak +swing +mangosteen +cheese +urinal +lettuce +hurdle +ring +basketball +potted_plant +rickshaw +target +race_car +bow_tie +iron +toiletries +donkey +saw +hammer +billiards +cutting/chopping_board +power_outlet +hair_drier +baozi +medal +liquid_soap +wild_bird +leather_shoes +dining_table +game_board +barbell +radio +street_lights +tape +hockey +spring_rolls +rice +golf_club +lighter +chips +microscope +cell_phone +fire_truck 
+noodles +cabinet/shelf +electronic_stove_and_gas_stove +key +comb +trash_bin/can +toothbrush +dates +electric_drill +cow +eggplant +broom +vent +tong +green_onion +scallop +facial_cleanser +toothpaste +hamimelon +eraser +shampoo/shower_gel +CD +skating_and_skiing_shoes +american_football +slippers +pitaya +pot/pan +calculator +tissue +table_tennis_paddle +board_eraser +speaker +papaya +cigar +notepaper +garlic +rice_cooker +canned +parking_meter +flashlight +paint_brush +cup +cue +crosswalk_sign +kiwi_fruit +radiator +mop +chainsaw +sandals +storage_box +onion +bracelet +fire_extinguisher +scale +okra +microwave +sneakers +pepper +corn +pomelo +computer_box +pliers +trophy +plum +brush +machinery_vehicle +yak +crane +converter +facial_mask +carriage +pickup_truck +traffic_cone +pie +pen/pencil +sports_car +frisbee +cleaning_products +remote +stroller diff --git a/dataset/voc/generic_det_label_list_zh.txt b/dataset/voc/generic_det_label_list_zh.txt new file mode 100644 index 0000000000000000000000000000000000000000..0012d759df820f99d6fd814215a78453274b26fa --- /dev/null +++ b/dataset/voc/generic_det_label_list_zh.txt @@ -0,0 +1,676 @@ +婴儿床 +玫瑰 +旗 +手电筒 +海龟 +照相机 +动物 +手套 +鳄鱼 +牛 +房子 +鳄梨酱 +企鹅 +车辆牌照 +凳子 +瓢虫 +人鼻 +西瓜 +长笛 +蝴蝶 +洗衣机 +浣熊 +赛格威 +墨西哥玉米薄饼卷 +海蜇 +蛋糕 +笔 +加农炮 +面包 +树 +贝类 +床 +仓鼠 +帽子 +烤面包机 +帽帽 +冠状头饰 +碗 +蜻蜓 +飞蛾和蝴蝶 +羚羊 +蔬菜 +火炬 +建筑物 +电源插头和插座 +搅拌机 +台球桌 +切割板 +青铜雕塑 +乌龟 +西兰花 +老虎 +镜子 +熊 +西葫芦 +礼服 +排球 +吉他 +爬行动物 +高尔夫球车 +蛋挞 +费多拉 +食肉动物 +小型车 +灯塔 +咖啡壶 +食品加工厂 +卡车 +书柜 +冲浪板 +鞋类 +凳子 +项链 +花 +萝卜 +海洋哺乳动物 +煎锅 +水龙头 +桃 +刀 +手提包 +笔记本电脑 +帐篷 +救护车 +圣诞树 +鹰 +豪华轿车 +厨房和餐桌 +北极熊 +塔楼 +足球 +柳树 +人头 +停车标志 +香蕉 +搅拌机 +双筒望远镜 +甜点 +蜜蜂 +椅子 +烧柴炉 +花盆 +烧杯 +牡蛎 +啄木鸟 +竖琴 +浴缸 +挂钟 +运动服 +犀牛 +蜂箱 +橱柜 +鸡 +人 +冠蓝鸦 +黄瓜 +气球 +风筝 +壁炉 +灯笼 +导弹 +书 +勺子 +葡萄柚 +松鼠 +橙色 +外套 +打孔袋 +斑马 +广告牌 +自行车 +门把手 +机械风扇 +环形粘结剂 +桌子 +鹦鹉 +袜子 +花瓶 +武器 +猎枪 +玻璃杯 +海马 +腰带 +船舶 +窗口 +长颈鹿 +狮子 +轮胎 +车辆 +独木舟 +领带 +架子 +相框 +打印机 +人腿 +小船 +慢炖锅 +牛角包 +蜡烛 +煎饼 +枕头 +硬币 +担架 +凉鞋 +女人 +楼梯 +拨弦键琴 +凳子 +公共汽车 +手提箱 +人口学 +果汁 +颅骨 +门 +小提琴 +筷子 +数字时钟 +向日葵 +豹 +甜椒 +海港海豹 +蛇 +缝纫机 +鹅 +直升机 +座椅安全带 +咖啡杯 +微波炉 
+热狗 +台面 +服务托盘 +狗床 +啤酒 +太阳镜 +高尔夫球 +华夫饼干 +棕榈树 +小号 +尺子 +头盔 +梯子 +办公楼 +平板电脑 +厕纸 +石榴 +裙子 +煤气炉 +曲奇饼干 +大车 +掠夺 +鸡蛋 +墨西哥煎饼 +山羊 +菜刀 +滑板 +盐和胡椒瓶 +猞猁 +靴子 +大浅盘 +滑雪板 +泳装 +游泳池 +吸管 +扳手 +鼓 +蚂蚁 +人耳 +耳机 +喷泉 +鸟 +牛仔裤 +电视机 +蟹 +话筒 +家用电器 +除雪机 +甲虫 +朝鲜蓟 +喷气式滑雪板 +固定自行车 +人发 +棕熊 +海星 +叉子 +龙虾 +有线电话 +饮料 +碟 +胡萝卜 +昆虫 +时钟 +城堡 +网球拍 +吊扇 +芦笋 +美洲虎 +乐器 +火车 +猫 +来复枪 +哑铃 +手机 +出租车 +淋浴 +投掷者 +柠檬 +无脊椎动物 +火鸡 +高跟鞋 +打破 +大象 +围巾 +枪管 +长号 +南瓜 +盒子 +番茄 +蛙 +坐浴盆 +人脸 +室内植物 +厢式货车 +鲨鱼 +冰淇淋 +游泳帽 +隼 +鸵鸟 +手枪 +白板 +蜥蜴 +面食 +雪车 +灯泡 +窗盲 +松饼 +椒盐脆饼 +计算机显示器 +喇叭 +家具 +三明治 +福克斯 +便利店 +鱼 +水果 +耳环 +帷幕 +葡萄 +沙发床 +马 +行李和行李 +书桌 +拐杖 +自行车头盔 +滴答声 +飞机 +金丝雀 +铲 +手表 +莉莉 +厨房用具 +文件柜 +飞机 +蛋糕架 +糖果 +水槽 +鼠标 +葡萄酒 +轮椅 +金鱼 +冰箱 +炸薯条 +抽屉 +单调的工作 +野餐篮子 +骰子 +甘蓝 +足球头盔 +猪 +人 +短裤 +贡多拉 +蜂巢 +炸圈饼 +抽屉柜 +陆地车辆 +蝙蝠 +猴子 +匕首 +餐具 +人足 +马克杯 +闹钟 +高压锅 +人手 +乌龟 +棒球手套 +剑 +梨 +迷你裙 +交通标志 +女孩 +旱冰鞋 +恐龙 +门廊 +胡须 +潜艇三明治 +螺丝起子 +草莓 +酒杯 +海鲜 +球拍 +车轮 +海狮 +玩具 +茶叶 +网球 +废物容器 +骡子 +板球 +菠萝 +椰子 +娃娃 +咖啡桌 +雪人 +薰衣草 +小虾 +枫树 +牛仔帽 +护目镜 +橄榄球 +毛虫 +海报 +火箭 +器官 +萨克斯 +交通灯 +鸡尾酒 +塑料袋 +壁球 +蘑菇 +汉堡包 +电灯开关 +降落伞 +泰迪熊 +冬瓜 +鹿 +音乐键盘 +卫生器具 +记分牌 +棒球棒 +包络线 +胶带 +公文包 +桨 +弓箭 +电话 +羊 +夹克 +男孩 +披萨 +水獭 +办公用品 +沙发 +大提琴 +公牛 +骆驼 +球 +鸭子 +鲸鱼 +衬衫 +坦克 +摩托车 +手风琴 +猫头鹰 +豪猪 +太阳帽 +钉子 +剪刀 +天鹅 +灯 +皇冠 +钢琴 +雕塑 +猎豹 +双簧管 +罐头罐 +芒果 +三脚架 +烤箱 +鼠标 +驳船 +咖啡 +滑雪板 +普通无花果 +沙拉 +无脊椎动物 +雨伞 +袋鼠 +人手臂 +量杯 +蜗牛 +相思 +西服 +茶壶 +瓶 +羊驼 +水壶 +裤子 +爆米花 +蜈蚣 +蜘蛛 +麻雀 +盘子 +百吉饼 +个人护理 +苹果 +胸罩 +浴室柜 +演播室沙发 +电脑键盘 +乒乓球拍 +寿司 +橱柜 +路灯 +毛巾 +床头柜 +兔 +海豚 +狗 +大罐 +炒锅 +消火栓 +人眼 +摩天大楼 +背包 +马铃薯 +纸巾 +小精灵 +自行车车轮 +卫生间 +大号 +地毯 +手推车 +电视 +风扇 +美洲驼 +订书机 +三轮车 +耳机 +空调器 +饼干 +毛巾/餐巾 +靴子 +香肠 +运动型多用途汽车 +肥皂 +棒球 +行李 +扑克牌 +铲子 +标记笔 +耳机 +投影机 +铅笔盒 +法国圆号 +橘子 +路由器 +文件夹 +甜甜圈 +榴莲 +帆船 +坚果 +咖啡机 +肉丸 +篮子 +插线板 +青豆 +鳄梨 +英式足球 +蛋挞 +离合器 +滑梯 +鱼竿 +衣架 +面包 +监控摄像头 +地球仪 +黑板/白板 +救生员 +鸽子 +红卷心菜 +铜钹 +水龙头 +牛排 +秋千 +山竹 +奶酪 +小便池 +生菜 +跨栏 +戒指 +篮球 +盆栽植物 +人力车 +目标 +赛车 +蝴蝶结 +熨斗 +化妆品 +驴 +锯 +铁锤 +台球 +切割/砧板 +电源插座 +吹风机 +包子 +奖章/奖牌 +液体肥皂 +野鸟 +皮鞋 +餐桌 +游戏板 +杠铃 +收音机 +路灯 +磁带 +曲棍球 +春卷 +大米 +高尔夫俱乐部 +打火机 +炸薯条 +显微镜 +手机 +消防车 +面条 +橱柜/架子 +电磁炉和煤气炉 +钥匙 +梳子 +垃圾箱/罐 +牙刷 +枣子 +电钻 +奶牛 +茄子 +扫帚 +抽油烟机 +钳子 +大葱 +扇贝 +洁面乳 +牙膏 +哈密瓜 +橡皮擦 +洗发水/沐浴露 +光盘 +溜冰鞋和滑雪鞋 +美式足球 +拖鞋 +火龙果 +锅/平底锅 +计算器 +纸巾 +乒乓球拍 
+板擦 +扬声器 +木瓜 +雪茄 +信纸 +大蒜 +电饭锅 +罐装的 +停车计时器 +手电筒 +画笔 +杯子 +球杆 +人行横道标志 +奇异果/猕猴桃 +散热器 +拖把 +电锯 +凉鞋拖鞋 +储物箱 +洋葱 +手镯 +灭火器 +秤 +秋葵 +微波炉 +运动鞋 +胡椒 +玉米 +柚子 +主机 +钳子 +奖杯 +李子/梅子 +刷子/画笔 +机械车辆 +牦牛 +起重机 +转换器 +面膜 +马车 +皮卡车 +交通锥 +馅饼 +钢笔/铅笔 +跑车 +飞盘 +清洁用品/洗涤剂/洗衣液 +遥控器 +婴儿车/手推车 diff --git a/docs/featured_model/LARGE_SCALE_DET_MODEL.md b/docs/featured_model/LARGE_SCALE_DET_MODEL.md new file mode 100644 index 0000000000000000000000000000000000000000..8d432d54180a8d88eaa47055447091f0a829cd01 --- /dev/null +++ b/docs/featured_model/LARGE_SCALE_DET_MODEL.md @@ -0,0 +1,25 @@ +## 大规模实用目标检测模型 + +### 简介 + +* 与图像分类任务不同,目标检测任务中,不仅需要标注图像中物体所属类别,还要标注其边框位置,因此标注成本相对更高。目前已开源的目标检测数据集中,应用比较广泛的有Open Images V5、Objects365和COCO数据集,这三个数据集的基本信息如下。 + + +| Dataset | Classes | Images | Bounding boxes | +|--------------------|---------|-----------|----------------| +| COCO | 80 | 123,287 | 886,284 | +| Objects365 | 365 | 600,000 | 10,000,000 | +| Open Images V5 | 500 | 1,743,042 | 14,610,229 | + + +上述数据集中包含的类别均不多(相比于ImageNet1k分类数据集的1000个类别)。为了提供更加实用的服务器端目标检测模型,方便用户在不需要任何微调的情况下就可以直接使用,PaddleDetection结合[服务器端实用目标检测方案](./SERVER_SIDE.md),融合Open Images V5和Objects365训练集数据(二者包含许多重复类别),生成了包含676个类别的新数据集,类别映射关系可以在这里查看: [676个类别的标签文件](../../dataset/voc/generic_det_label_list_zh.txt)。并训练了服务器端实用目标检测模型,适用于绝大部分应用场景,方便用户直接部署使用,用户也可以根据提供的预训练模型,在自己的数据集上进行模型微调,加快收敛并获得更高的精度指标。 + + +### 模型库 + + +| 骨架网络 | 网络类型 | 下载 | 配置文件 | +| :---------------| :---------------| :---------------| :--------------- +| ResNet50-vd-FPN-Dcnv2 | Cascade Faster | [model](https://paddlemodels.bj.bcebos.com/object_detection/cascade_rcnn_dcn_r50_vd_fpn_generic_server_side.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/master/configs/rcnn_server_side_det/generic/cascade_rcnn_dcn_r50_vd_fpn_generic_server_side.yml) | +| ResNet101-vd-FPN-Dcnv2 | Cascade Faster | [model](https://paddlemodels.bj.bcebos.com/object_detection/cascade_rcnn_dcn_r101_vd_fpn_generic_server_side.pdparams) | 
[config](https://github.com/PaddlePaddle/PaddleDetection/tree/master/configs/rcnn_server_side_det/generic/cascade_rcnn_dcn_r101_vd_fpn_generic_server_side.yml) |
+| CBResNet101-vd-FPN | Cascade Faster | [model](https://paddlemodels.bj.bcebos.com/object_detection/cascade_rcnn_cbr101_vd_fpn_generic_server_side.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/master/configs/rcnn_server_side_det/generic/cascade_rcnn_cbr101_vd_fpn_generic_server_side.yml) |
diff --git a/docs/featured_model/LARGE_SCALE_DET_MODEL_en.md b/docs/featured_model/LARGE_SCALE_DET_MODEL_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..efaf8bfee36c747e4044ddb2530e1032c0c88aa0
--- /dev/null
+++ b/docs/featured_model/LARGE_SCALE_DET_MODEL_en.md
@@ -0,0 +1,23 @@
+## Large-scale practical object detection models (676 categories)
+
+### Introduction
+
+* Unlike image classification, object detection requires labeling not only the category of each object in an image but also its bounding box, which makes annotation more costly. Open Images V5, Objects365 and COCO are commonly used datasets for object detection. The basic information of these three datasets is as follows.
+
+| Dataset | Classes | Images | Bounding boxes |
+|--------------------|---------|-----------|----------------|
+| COCO | 80 | 123,287 | 886,284 |
+| Objects365 | 365 | 600,000 | 10,000,000 |
+| Open Images V5 | 500 | 1,743,042 | 14,610,229 |
+
+
+There are relatively few categories in the above datasets (compared to the 1000 categories in the ImageNet1k classification dataset). To provide more practical server-side object detection models that users can apply directly without any finetuning, PaddleDetection builds on the [Practical Server-side detection method based on RCNN](./SERVER_SIDE_en.md) and merges the Open Images V5 and Objects365 datasets to generate a new training set containing 676 categories.
The label list can be found here: [label list containing 676 categories](../../dataset/voc/generic_det_label_list.txt). Several practical server-side models are trained on this dataset and are suitable for most application scenarios, so users can run inference with them or deploy them directly. Users can also finetune the provided pretrained models on their own datasets to accelerate convergence and achieve higher performance.
+
+
+### Model zoo
+
+| Backbone | Type | Download | Configs |
+| :---------------| :---------------| :---------------| :--------------- |
+| ResNet50-vd-FPN-Dcnv2 | Cascade Faster | [model](https://paddlemodels.bj.bcebos.com/object_detection/cascade_rcnn_dcn_r50_vd_fpn_generic_server_side.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/master/configs/rcnn_server_side_det/generic/cascade_rcnn_dcn_r50_vd_fpn_generic_server_side.yml) |
+| ResNet101-vd-FPN-Dcnv2 | Cascade Faster | [model](https://paddlemodels.bj.bcebos.com/object_detection/cascade_rcnn_dcn_r101_vd_fpn_generic_server_side.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/master/configs/rcnn_server_side_det/generic/cascade_rcnn_dcn_r101_vd_fpn_generic_server_side.yml) |
+| CBResNet101-vd-FPN | Cascade Faster | [model](https://paddlemodels.bj.bcebos.com/object_detection/cascade_rcnn_cbr101_vd_fpn_generic_server_side.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/master/configs/rcnn_server_side_det/generic/cascade_rcnn_cbr101_vd_fpn_generic_server_side.yml) |
diff --git a/docs/featured_model/SERVER_SIDE_en.md b/docs/featured_model/SERVER_SIDE_en.md
new file mode 120000
index 0000000000000000000000000000000000000000..5074fda136a9d5c342d6aa6625204c15dfaddfe7
--- /dev/null
+++ b/docs/featured_model/SERVER_SIDE_en.md
@@ -0,0 +1 @@
+../../configs/rcnn_enhance/README_en.md
\ No newline at end of file