Unverified · Commit 0e1f2a68 authored by Guanghua Yu, committed by GitHub

add picodet-npu model (#6622)

* add picodet-npu model

* fix modeling
Parent b4727677
......@@ -6,6 +6,8 @@
## Latest News
- Released the PicoDet-NPU model with support for fully quantized model deployment. **(2022.08.10)**
- Released a brand-new series of PP-PicoDet models: **(2022.03.20)**
  - (1) Introduced TAL and the ETA Head and optimized structures such as PAN, improving accuracy by more than 2 points;
  - (2) Optimized CPU-side inference speed while doubling training speed;
......@@ -45,6 +47,12 @@ PP-PicoDet models have the following features:
| PicoDet-L | 416*416 | 39.4 | 55.7 | 5.80 | 7.10 | 20.7ms | 42.23ms | [model](https://paddledet.bj.bcebos.com/models/picodet_l_416_coco_lcnet.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_l_416_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_l_416_coco_lcnet.yml) | [w/ post-processing](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_416_coco_lcnet.tar) | [w/o post-processing](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_416_coco_lcnet_non_postprocess.tar) |
| PicoDet-L | 640*640 | 42.6 | 59.2 | 5.80 | 16.81 | 62.5ms | 108.1ms | [model](https://paddledet.bj.bcebos.com/models/picodet_l_640_coco_lcnet.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_l_640_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_l_640_coco_lcnet.yml) | [w/ post-processing](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_640_coco_lcnet.tar) | [w/o post-processing](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_640_coco_lcnet_non_postprocess.tar) |
- Featured models
| Model | Input size | mAP<sup>val<br>0.5:0.95</sup> | mAP<sup>val<br>0.5</sup> | Params<br><sup>(M)</sup> | FLOPs<br><sup>(G)</sup> | Latency<sup><small>[CPU](#latency)</small></sup><br><sup>(ms)</sup> | Latency<sup><small>[Lite](#latency)</small></sup><br><sup>(ms)</sup> | Download | Config |
| :-------- | :--------: | :---------------------: | :----------------: | :----------------: | :---------------: | :-----------------------------: | :-----------------------------: | :----------------------------------------: | :--------------------------------------- |
| PicoDet-S-NPU | 416*416 | 30.1 | 44.2 | - | - | - | - | [model](https://paddledet.bj.bcebos.com/models/picodet_s_416_coco_npu.pdparams) &#124; [log](https://paddledet.bj.bcebos.com/logs/train_picodet_s_416_coco_npu.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/picodet/picodet_s_416_coco_npu.yml) |
<details open>
<summary><b>Notes:</b></summary>
......
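The model links in the table above point at plain `.pdparams` checkpoints, so they can be inspected with Paddle alone. Below is a minimal sketch, assuming the PicoDet-S-NPU checkpoint has already been downloaded into the working directory (the local filename is illustrative):

```python
# Quick look at the released PicoDet-S-NPU weights; the filename is the one
# implied by the download link above and is assumed to exist locally.
import paddle

state_dict = paddle.load("picodet_s_416_coco_npu.pdparams")  # name -> tensor mapping
print(f"{len(state_dict)} parameters")
for name in list(state_dict)[:5]:
    print(name, tuple(state_dict[name].shape))
```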
_BASE_: [
  '../datasets/coco_detection.yml',
  '../runtime.yml',
  '_base_/picodet_v2.yml',
  '_base_/optimizer_300e.yml',
]

pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_75_pretrained.pdparams
weights: output/picodet_s_416_coco/best_model
find_unused_parameters: True
keep_best_weight: True
use_ema: True
epoch: 300
snapshot_epoch: 10

PicoDet:
  backbone: LCNet
  neck: CSPPAN
  head: PicoHeadV2

LCNet:
  scale: 0.75
  feature_maps: [3, 4, 5]
  act: relu6

CSPPAN:
  out_channels: 96
  use_depthwise: True
  num_csp_blocks: 1
  num_features: 4
  act: relu6

PicoHeadV2:
  conv_feat:
    name: PicoFeat
    feat_in: 96
    feat_out: 96
    num_convs: 4
    num_fpn_stride: 4
    norm_type: bn
    share_cls_reg: True
    use_se: True
    act: relu6
  feat_in_chan: 96
  act: relu6

LearningRate:
  base_lr: 0.2
  schedulers:
  - !CosineDecay
    max_epochs: 300
    min_lr_ratio: 0.08
    last_plateau_epochs: 30
  - !ExpWarmup
    epochs: 2

worker_num: 6
eval_height: &eval_height 416
eval_width: &eval_width 416
eval_size: &eval_size [*eval_height, *eval_width]

TrainReader:
  sample_transforms:
  - Decode: {}
  - Mosaic:
      prob: 0.6
      input_dim: [640, 640]
      degrees: [-10, 10]
      scale: [0.1, 2.0]
      shear: [-2, 2]
      translate: [-0.1, 0.1]
      enable_mixup: True
  - AugmentHSV: {is_bgr: False, hgain: 5, sgain: 30, vgain: 30}
  - RandomFlip: {prob: 0.5}
  batch_transforms:
  - BatchRandomResize: {target_size: [320, 352, 384, 416, 448, 480, 512], random_size: True, random_interp: True, keep_ratio: False}
  - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1], is_scale: True}
  - Permute: {}
  - PadGT: {}
  batch_size: 40
  shuffle: true
  drop_last: true
  mosaic_epoch: 180

EvalReader:
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
  - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1], is_scale: True}
  - Permute: {}
  batch_transforms:
  - PadBatch: {pad_to_stride: 32}
  batch_size: 8
  shuffle: false

TestReader:
  inputs_def:
    image_shape: [1, 3, *eval_height, *eval_width]
  sample_transforms:
  - Decode: {}
  - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
  - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1], is_scale: True}
  - Permute: {}
  batch_size: 1
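The new config above (referenced from the table's config link as `configs/picodet/picodet_s_416_coco_npu.yml`) is standard PaddleDetection YAML, so its merged contents can be checked before training. A minimal sketch, assuming `ppdet` is importable and that `load_config` resolves the `_BASE_` includes into a dict-like object (the exact layout may vary between releases):

```python
# Sanity-check the NPU-oriented settings from the merged configuration.
from ppdet.core.workspace import load_config

cfg = load_config("configs/picodet/picodet_s_416_coco_npu.yml")
print(cfg["LCNet"]["act"], cfg["CSPPAN"]["act"])       # expected: relu6 relu6
print(cfg["epoch"], cfg["TrainReader"]["batch_size"])  # expected: 300 40
```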
......@@ -68,7 +68,8 @@ class ConvBNLayer(nn.Layer):
                  filter_size,
                  num_filters,
                  stride,
-                 num_groups=1):
+                 num_groups=1,
+                 act='hard_swish'):
         super().__init__()
 
         self.conv = Conv2D(
......@@ -85,12 +86,15 @@ class ConvBNLayer(nn.Layer):
             num_filters,
             weight_attr=ParamAttr(regularizer=L2Decay(0.0)),
             bias_attr=ParamAttr(regularizer=L2Decay(0.0)))
-        self.hardswish = nn.Hardswish()
+        if act == 'hard_swish':
+            self.act = nn.Hardswish()
+        elif act == 'relu6':
+            self.act = nn.ReLU6()
 
     def forward(self, x):
         x = self.conv(x)
         x = self.bn(x)
-        x = self.hardswish(x)
+        x = self.act(x)
         return x
......@@ -100,7 +104,8 @@ class DepthwiseSeparable(nn.Layer):
                  num_filters,
                  stride,
                  dw_size=3,
-                 use_se=False):
+                 use_se=False,
+                 act='hard_swish'):
         super().__init__()
         self.use_se = use_se
         self.dw_conv = ConvBNLayer(
......@@ -108,14 +113,16 @@ class DepthwiseSeparable(nn.Layer):
             num_filters=num_channels,
             filter_size=dw_size,
             stride=stride,
-            num_groups=num_channels)
+            num_groups=num_channels,
+            act=act)
         if use_se:
             self.se = SEModule(num_channels)
         self.pw_conv = ConvBNLayer(
             num_channels=num_channels,
             filter_size=1,
             num_filters=num_filters,
-            stride=1)
+            stride=1,
+            act=act)
 
     def forward(self, x):
         x = self.dw_conv(x)
......@@ -158,7 +165,7 @@ class SEModule(nn.Layer):
 @register
 @serializable
 class LCNet(nn.Layer):
-    def __init__(self, scale=1.0, feature_maps=[3, 4, 5]):
+    def __init__(self, scale=1.0, feature_maps=[3, 4, 5], act='hard_swish'):
         super().__init__()
         self.scale = scale
         self.feature_maps = feature_maps
......@@ -169,7 +176,8 @@ class LCNet(nn.Layer):
             num_channels=3,
             filter_size=3,
             num_filters=make_divisible(16 * scale),
-            stride=2)
+            stride=2,
+            act=act)
 
         self.blocks2 = nn.Sequential(* [
             DepthwiseSeparable(
......@@ -177,7 +185,8 @@ class LCNet(nn.Layer):
                 num_filters=make_divisible(out_c * scale),
                 dw_size=k,
                 stride=s,
-                use_se=se)
+                use_se=se,
+                act=act)
             for i, (k, in_c, out_c, s, se) in enumerate(NET_CONFIG["blocks2"])
         ])
......@@ -187,7 +196,8 @@ class LCNet(nn.Layer):
                 num_filters=make_divisible(out_c * scale),
                 dw_size=k,
                 stride=s,
-                use_se=se)
+                use_se=se,
+                act=act)
             for i, (k, in_c, out_c, s, se) in enumerate(NET_CONFIG["blocks3"])
         ])
......@@ -200,7 +210,8 @@ class LCNet(nn.Layer):
                 num_filters=make_divisible(out_c * scale),
                 dw_size=k,
                 stride=s,
-                use_se=se)
+                use_se=se,
+                act=act)
             for i, (k, in_c, out_c, s, se) in enumerate(NET_CONFIG["blocks4"])
         ])
......@@ -213,7 +224,8 @@ class LCNet(nn.Layer):
                 num_filters=make_divisible(out_c * scale),
                 dw_size=k,
                 stride=s,
-                use_se=se)
+                use_se=se,
+                act=act)
             for i, (k, in_c, out_c, s, se) in enumerate(NET_CONFIG["blocks5"])
         ])
......@@ -226,7 +238,8 @@ class LCNet(nn.Layer):
                 num_filters=make_divisible(out_c * scale),
                 dw_size=k,
                 stride=s,
-                use_se=se)
+                use_se=se,
+                act=act)
             for i, (k, in_c, out_c, s, se) in enumerate(NET_CONFIG["blocks6"])
         ])
......
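The backbone changes above do one thing: they thread an `act` argument from the `LCNet` config entry down through `ConvBNLayer` and `DepthwiseSeparable`, with `hard_swish` kept as the default so existing configs are unaffected. The NPU config selects `relu6`, presumably because its bounded output maps better onto NPU kernels and onto full quantization. A usage sketch follows, assuming a PaddleDetection install where the backbone is importable as shown and its `forward` takes an `{'image': tensor}` dict (the convention in current releases; older ones may differ):

```python
# Build the backbone the way the NPU config does (scale 0.75, ReLU6 activations)
# and run a dummy 416x416 image through it.
import paddle
from ppdet.modeling.backbones.lcnet import LCNet

backbone = LCNet(scale=0.75, feature_maps=[3, 4, 5], act='relu6')
feats = backbone({'image': paddle.rand([1, 3, 416, 416])})
print([tuple(f.shape) for f in feats])  # one feature map per requested stage
```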
......@@ -155,6 +155,8 @@ class PicoFeat(nn.Layer):
             x = F.leaky_relu(x)
         elif self.act == "hard_swish":
             x = F.hardswish(x)
+        elif self.act == "relu6":
+            x = F.relu6(x)
         return x
 
     def forward(self, fpn_feat, stage_idx):
......
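The head change mirrors the backbone: PicoFeat's activation dispatch gains a `relu6` branch alongside `leaky_relu` and `hard_swish`. A standalone sketch of that dispatch pattern is below; `apply_act` is an illustrative helper, not a PaddleDetection API:

```python
# Name-to-activation dispatch in the style of the diff above.
import paddle
import paddle.nn.functional as F

def apply_act(x, act="hard_swish"):
    if act == "leaky_relu":
        return F.leaky_relu(x)
    elif act == "hard_swish":
        return F.hardswish(x)
    elif act == "relu6":
        return F.relu6(x)  # bounded range, friendlier to full INT8/NPU deployment
    return x

x = paddle.rand([1, 96, 52, 52])
print(apply_act(x, "relu6").shape)
```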