Unverified commit 219a9928, authored by dyning, committed by GitHub

Merge pull request #20 from littletomatodonkey/fix_tb

Fix tables in doc
......@@ -12,6 +12,7 @@ ResNet101_vd
ResNet152_vd
ResNet200_vd
ResNet50_vd_ssld
ResNet101_vd_ssld
MobileNetV3_large_x0_35
MobileNetV3_large_x0_5
MobileNetV3_large_x0_75
......
docs/images/models/main_fps_top1.png (binary image changed: 301.6 KB → 299.7 KB)
File mode changed from 100644 to 100755
File mode changed from 100644 to 100755
......@@ -2,9 +2,16 @@
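## Overview
This section is still being updated.
![](../../images/models/DPN.png)
For all models, inference uses an image crop_size of 224 and a resize_short_size of 256.
The FLOPS, parameter counts, and FP32 inference latency of this model series are shown in the figures below.
![](../../images/models/DPN.png.flops.png)
![](../../images/models/DPN.png.params.png)
![](../../images/models/DPN.png.fp32.png)
For all models, inference uses an image crop_size of 224 and a resize_short_size of 256.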
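As a rough illustration of what these two settings mean, the sketch below assumes the common evaluation convention of first resizing the image so that its shorter side equals resize_short_size and then taking a centered square crop of side crop_size. The function names and the rounding rule are assumptions made for illustration; they are not taken from PaddleClas's preprocessing code, which may round differently.

```python
def resize_short(width: int, height: int, short_size: int) -> tuple:
    """Scale (width, height) so that the shorter side becomes short_size."""
    scale = short_size / min(width, height)
    # Rounding to the nearest pixel is an assumption of this sketch.
    return round(width * scale), round(height * scale)


def center_crop_box(width: int, height: int, crop_size: int) -> tuple:
    """Return (left, top, right, bottom) of a centered square crop."""
    left = (width - crop_size) // 2
    top = (height - crop_size) // 2
    return left, top, left + crop_size, top + crop_size


# Example: a 500x375 image with resize_short_size=256 and crop_size=224.
resized = resize_short(500, 375, 256)
box = center_crop_box(resized[0], resized[1], 224)
```

With these settings the crop discards a border of the resized image, which is why models that use a larger crop_size also use a larger resize_short_size.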
## Accuracy, FLOPS, and Parameters
......@@ -22,33 +29,19 @@
| DPN131 | 0.807 | 0.951 | 0.801 | 0.949 | 30.510 | 75.360 |
## FP16 Inference Speed
| Models | batch_size=1<br>(ms) | batch_size=4<br>(ms) | batch_size=8<br>(ms) | batch_size=32<br>(ms) |
|:--:|:--:|:--:|:--:|:--:|
| DenseNet121 | 3.653 | 4.560 | 5.574 | 11.517 |
| DenseNet161 | 7.826 | 8.936 | 10.970 | 22.554 |
| DenseNet169 | 5.625 | 6.698 | 7.876 | 14.983 |
| DenseNet201 | 7.243 | 8.537 | 10.111 | 18.928 |
| DenseNet264 | 10.882 | 12.539 | 14.645 | 26.455 |
| DPN68 | 10.310 | 11.060 | 14.299 | 29.618 |
| DPN92 | 16.335 | 17.373 | 23.197 | 45.210 |
| DPN98 | 18.975 | 23.073 | 28.902 | 66.280 |
| DPN107 | 24.932 | 28.607 | 37.513 | 89.112 |
| DPN131 | 25.425 | 29.874 | 37.355 | 88.583 |
## FP32 Inference Speed
| Models | batch_size=1<br>(ms) | batch_size=4<br>(ms) | batch_size=8<br>(ms) | batch_size=32<br>(ms) |
|:--:|:--:|:--:|:--:|:--:|
| DenseNet121 | 3.732 | 6.614 | 8.517 | 21.755 |
| DenseNet161 | 8.282 | 14.438 | 19.336 | 51.953 |
| DenseNet169 | 5.705 | 10.074 | 12.432 | 28.756 |
| DenseNet201 | 7.315 | 13.830 | 16.941 | 38.654 |
| DenseNet264 | 10.986 | 21.460 | 25.724 | 56.501 |
| DPN68 | 10.357 | 11.025 | 14.903 | 34.380 |
| DPN92 | 16.067 | 21.315 | 26.176 | 62.126 |
| DPN98 | 18.455 | 26.710 | 36.009 | 104.084 |
| DPN107 | 24.164 | 37.691 | 51.307 | 148.041 |
| DPN131 | 24.761 | 35.806 | 48.401 | 133.233 |
| Models | Crop Size | Resize Short Size | Batch Size=1<br>(ms) |
|-------------|-----------|-------------------|--------------------------|
| DenseNet121 | 224 | 256 | 4.371 |
| DenseNet161 | 224 | 256 | 8.863 |
| DenseNet169 | 224 | 256 | 6.391 |
| DenseNet201 | 224 | 256 | 8.173 |
| DenseNet264 | 224 | 256 | 11.942 |
| DPN68 | 224 | 256 | 11.805 |
| DPN92 | 224 | 256 | 17.840 |
| DPN98 | 224 | 256 | 21.057 |
| DPN107 | 224 | 256 | 28.685 |
| DPN131 | 224 | 256 | 28.083 |
......@@ -2,26 +2,14 @@
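## Overview
This section is still being updated.
![](../../images/models/EfficientNet.png)
At inference time, the image crop_size and resize_short_size are as listed in the table below.
The FLOPS, parameter counts, and FP32 inference latency of this model series are shown in the figures below.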
| Models | crop_size | resize_short_size |
|:--:|:--:|:--:|
| ResNeXt101_32x8d_wsl | 224 | 224 |
| ResNeXt101_32x16d_wsl | 224 | 224 |
| ResNeXt101_32x32d_wsl | 224 | 224 |
| ResNeXt101_32x48d_wsl | 224 | 224 |
| Fix_ResNeXt101_32x48d_wsl | 320 | 320 |
| EfficientNetB0 | 224 | 256 |
| EfficientNetB1 | 240 | 272 |
| EfficientNetB2 | 260 | 292 |
| EfficientNetB3 | 300 | 332 |
| EfficientNetB4 | 380 | 412 |
| EfficientNetB5 | 456 | 488 |
| EfficientNetB6 | 528 | 560 |
| EfficientNetB7 | 600 | 632 |
| EfficientNetB0_small | 224 | 256 |
![](../../images/models/EfficientNet.png.flops.png)
![](../../images/models/EfficientNet.png.params.png)
![](../../images/models/EfficientNet.png.fp32.png)
## Accuracy, FLOPS, and Parameters
......@@ -44,41 +32,21 @@
| EfficientNetB0_<br>small | 0.758 | 0.926 | | | 0.720 | 4.650 |
## FP16 Inference Speed
| Models | batch_size=1<br>(ms) | batch_size=4<br>(ms) | batch_size=8<br>(ms) | batch_size=32<br>(ms) |
|:--:|:--:|:--:|:--:|:--:|
| ResNeXt101_<br>32x8d_wsl | 16.063 | 16.342 | 24.914 | 45.035 |
| ResNeXt101_<br>32x16d_wsl | 16.471 | 25.235 | 30.762 | 67.869 |
| ResNeXt101_<br>32x32d_wsl | 29.425 | 37.149 | 50.834 | |
| ResNeXt101_<br>32x48d_wsl | 40.311 | 58.414 | | |
| Fix_ResNeXt101_<br>32x48d_wsl | 43.960 | 86.514 | | |
| EfficientNetB0 | 1.759 | 2.748 | 3.761 | 10.178 |
| EfficientNetB1 | 2.592 | 4.122 | 5.829 | 16.262 |
| EfficientNetB2 | 2.866 | 4.715 | 7.064 | 20.954 |
| EfficientNetB3 | 3.869 | 6.815 | 10.672 | 34.097 |
| EfficientNetB4 | 5.626 | 11.937 | 19.753 | 67.436 |
| EfficientNetB5 | 8.907 | 21.685 | 37.248 | 134.185 |
| EfficientNetB6 | 13.591 | 34.093 | 60.976 | |
| EfficientNetB7 | 20.963 | 56.397 | 103.971 | |
| EfficientNetB0_<br>small | 1.039 | 1.665 | 2.493 | 7.748 |
## FP32 Inference Speed
| Models | batch_size=1<br>(ms) | batch_size=4<br>(ms) | batch_size=8<br>(ms) | batch_size=32<br>(ms) |
|:--:|:--:|:--:|:--:|:--:|
| ResNeXt101_<br>32x8d_wsl | 16.325 | 25.633 | 37.196 | 108.535 |
| ResNeXt101_<br>32x16d_wsl | 25.224 | 40.929 | 62.898 | |
| ResNeXt101_<br>32x32d_wsl | 41.047 | 79.575 | | |
| ResNeXt101_<br>32x48d_wsl | 60.610 | | | |
| Fix_ResNeXt101_<br>32x48d_wsl | 80.280 | | | |
| EfficientNetB0 | 1.902 | 3.296 | 4.361 | 11.319 |
| EfficientNetB1 | 2.908 | 5.093 | 6.900 | 18.015 |
| EfficientNetB2 | 3.324 | 5.832 | 8.357 | 23.371 |
| EfficientNetB3 | 4.557 | 8.526 | 12.485 | 38.124 |
| EfficientNetB4 | 6.767 | 14.742 | 23.218 | 77.590 |
| EfficientNetB5 | 11.097 | 26.642 | 43.590 | |
| EfficientNetB6 | 17.582 | 42.408 | 74.336 | |
| EfficientNetB7 | 26.529 | 70.337 | 126.839 | |
| EfficientNetB0_<br>small | 1.171 | 2.026 | 2.906 | 8.506 |
| Models | Crop Size | Resize Short Size | Batch Size=1<br>(ms) |
|-------------------------------|-----------|-------------------|--------------------------|
| ResNeXt101_<br>32x8d_wsl | 224 | 256 | 19.127 |
| ResNeXt101_<br>32x16d_wsl | 224 | 256 | 23.629 |
| ResNeXt101_<br>32x32d_wsl | 224 | 256 | 40.214 |
| ResNeXt101_<br>32x48d_wsl | 224 | 256 | 59.714 |
| Fix_ResNeXt101_<br>32x48d_wsl | 320 | 320 | 82.431 |
| EfficientNetB0 | 224 | 256 | 2.449 |
| EfficientNetB1 | 240 | 272 | 3.547 |
| EfficientNetB2 | 260 | 292 | 3.908 |
| EfficientNetB3 | 300 | 332 | 5.145 |
| EfficientNetB4 | 380 | 412 | 7.609 |
| EfficientNetB5 | 456 | 488 | 12.078 |
| EfficientNetB6 | 528 | 560 | 18.381 |
| EfficientNetB7 | 600 | 632 | 27.817 |
| EfficientNetB0_<br>small | 224 | 256 | 1.692 |
......@@ -2,8 +2,14 @@
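## Overview
This section is still being updated.
![](../../images/models/HRNet.png)
For all models, inference uses an image crop_size of 224 and a resize_short_size of 256.
The FLOPS, parameter counts, and FP32 inference latency of this model series are shown in the figures below.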
![](../../images/models/HRNet.png.flops.png)
![](../../images/models/HRNet.png.params.png)
![](../../images/models/HRNet.png.fp32.png)
## Accuracy, FLOPS, and Parameters
......@@ -19,27 +25,14 @@
| HRNet_W64_C | 0.793 | 0.946 | 0.795 | 0.946 | 57.830 | 128.060 |
## FP16 Inference Speed
| Models | batch_size=1<br>(ms) | batch_size=4<br>(ms) | batch_size=8<br>(ms) | batch_size=32<br>(ms) |
|:--:|:--:|:--:|:--:|:--:|
| HRNet_W18_C | 6.188 | 7.207 | 9.149 | 18.221 |
| HRNet_W30_C | 7.941 | 8.851 | 10.540 | 21.129 |
| HRNet_W32_C | 7.904 | 8.890 | 10.752 | 21.159 |
| HRNet_W40_C | 9.233 | 11.600 | 13.927 | 29.868 |
| HRNet_W44_C | 9.917 | 12.119 | 15.555 | 31.948 |
| HRNet_W48_C | 10.198 | 12.399 | 15.572 | 32.210 |
| HRNet_W64_C | 12.264 | 14.552 | 18.251 | 41.106 |
## FP32 Inference Speed
| Models | batch_size=1<br>(ms) | batch_size=4<br>(ms) | batch_size=8<br>(ms) | batch_size=32<br>(ms) |
|:--:|:--:|:--:|:--:|:--:|
| HRNet_W18_C | 6.828 | 8.552 | 11.154 | 30.665 |
| HRNet_W30_C | 8.901 | 11.067 | 14.421 | 43.459 |
| HRNet_W32_C | 8.983 | 11.334 | 14.688 | 44.564 |
| HRNet_W40_C | 10.300 | 14.720 | 20.257 | 64.346 |
| HRNet_W44_C | 11.183 | 15.830 | 25.292 | 73.136 |
| HRNet_W48_C | 11.619 | 16.791 | 26.569 | 77.536 |
| HRNet_W64_C | 14.434 | 20.988 | 35.114 | 117.434 |
| Models | Crop Size | Resize Short Size | Batch Size=1<br>(ms) |
|-------------|-----------|-------------------|--------------------------|
| HRNet_W18_C | 224 | 256 | 7.368 |
| HRNet_W30_C | 224 | 256 | 9.402 |
| HRNet_W32_C | 224 | 256 | 9.467 |
| HRNet_W40_C | 224 | 256 | 10.739 |
| HRNet_W44_C | 224 | 256 | 11.497 |
| HRNet_W48_C | 224 | 256 | 12.165 |
| HRNet_W64_C | 224 | 256 | 15.003 |
......@@ -2,8 +2,14 @@
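## Overview
This section is still being updated.
![](../../images/models/Inception.png)
For GoogLeNet, inference uses an image crop_size of 224 and a resize_short_size of 256; for the other models, inference uses a crop_size of 299 and a resize_short_size of 320.
The FLOPS, parameter counts, and FP32 inference latency of this model series are shown in the figures below.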
![](../../images/models/Inception.png.flops.png)
![](../../images/models/Inception.png.params.png)
![](../../images/models/Inception.png.fp32.png)
## Accuracy, FLOPS, and Parameters
......@@ -19,27 +25,15 @@ For GoogLeNet inference, the image crop_size is set to 224 and resize_short_size is set
| InceptionV4 | 0.808 | 0.953 | 0.800 | 0.950 | 24.570 | 42.680 |
## FP16 Inference Speed
| Models | batch_size=1<br>(ms) | batch_size=4<br>(ms) | batch_size=8<br>(ms) | batch_size=32<br>(ms) |
|:--:|:--:|:--:|:--:|:--:|
| GoogLeNet | 1.428 | 1.833 | 2.138 | 4.143 |
| Xception41 | 1.545 | 2.772 | 4.961 | 18.447 |
| Xception41<br>_deeplab | 1.630 | 2.647 | 4.462 | 16.354 |
| Xception65 | 5.398 | 4.215 | 8.611 | 28.702 |
| Xception65<br>_deeplab | 5.317 | 3.688 | 6.168 | 23.108 |
| Xception71 | 2.732 | 5.033 | 8.948 | 33.857 |
| InceptionV4 | 6.172 | 7.558 | 9.527 | 24.021 |
## FP32 Inference Speed
| Models | batch_size=1<br>(ms) | batch_size=4<br>(ms) | batch_size=8<br>(ms) | batch_size=32<br>(ms) |
|:--:|:--:|:--:|:--:|:--:|
| GoogLeNet | 1.436 | 2.904 | 3.800 | 9.049 |
| Xception41 | 3.402 | 7.889 | 14.953 | 56.142 |
| Xception41<br>_deeplab | 3.778 | 8.396 | 15.449 | 58.735 |
| Xception65 | 6.802 | 13.935 | 34.301 | 87.256 |
| Xception65<br>_deeplab | 8.583 | 12.132 | 22.917 | 87.983 |
| Xception71 | 6.156 | 14.359 | 27.360 | 107.282 |
| InceptionV4 | 10.384 | 17.438 | 23.312 | 68.777 |
| Models | Crop Size | Resize Short Size | Batch Size=1<br>(ms) |
|------------------------|-----------|-------------------|--------------------------|
| GoogLeNet | 224 | 256 | 1.807 |
| Xception41 | 299 | 320 | 3.972 |
| Xception41<br>_deeplab | 299 | 320 | 4.408 |
| Xception65 | 299 | 320 | 6.174 |
| Xception65<br>_deeplab | 299 | 320 | 6.464 |
| Xception71 | 299 | 320 | 6.782 |
| InceptionV4 | 299 | 320 | 11.141 |
......@@ -10,11 +10,10 @@ The ShuffleNet series is a lightweight network architecture proposed by Megvii; so far
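MobileNetV3 is a new NAS-based lightweight network proposed by Google in 2019. To further improve accuracy, the relu and sigmoid activation functions are replaced with hard_swish and hard_sigmoid respectively, and several strategies designed specifically to reduce the network's computation are introduced.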
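For readers unfamiliar with these two activations, here is a minimal scalar sketch. It assumes the piecewise-linear definitions from the MobileNetV3 paper, hard_sigmoid(x) = ReLU6(x + 3) / 6 and hard_swish(x) = x * hard_sigmoid(x); a given framework's operators with the same names may use different default slopes or offsets, so treat this as an illustration rather than the PaddleClas implementation.

```python
def relu6(x: float) -> float:
    """Clamp x to the range [0, 6]."""
    return min(max(x, 0.0), 6.0)


def hard_sigmoid(x: float) -> float:
    """Piecewise-linear approximation of sigmoid (paper definition, assumed)."""
    return relu6(x + 3.0) / 6.0


def hard_swish(x: float) -> float:
    """Piecewise-linear approximation of swish: x * hard_sigmoid(x)."""
    return x * hard_sigmoid(x)
```

Both functions avoid the exponential in sigmoid, which is the motivation usually given for using them on mobile hardware.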
![](../../images/models/mobile_arm_top1.png)
![](../../images/models/mobile_arm_storage.png)
![](../../images/models/mobile_trt.png)
![](../../images/models/mobile_trt.png.flops.png)
![](../../images/models/mobile_trt.png.params.png)
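PaddleClas currently open-sources 32 pretrained models in the mobile series; their metrics are shown in the figures. The figures show that newer lightweight models tend to perform better, and MobileNetV3 represents the latest lightweight network architecture. To obtain higher accuracy, the MobileNetV3 authors apply a 1x1 convolution after global-avg-pooling. This greatly increases the parameter count but has little effect on computation, so MobileNetV3 has no great advantage when models are judged by storage size, but its lower computation gives it faster inference. In addition, the ssld distilled models in our model zoo perform very well and set a new accuracy level for current lightweight models from every angle considered. Because MobileNetV3 has a complex structure with many branches, it is not GPU-friendly, and its GPU inference speed is lower than that of MobileNetV1.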
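The claim that a 1x1 convolution after global average pooling adds many parameters but little computation can be checked with simple arithmetic. The sketch below is an illustration under stated assumptions: the 960 → 1280 channel sizes are taken from the MobileNetV3-Large paper rather than from this document, and `conv1x1_cost` is a hypothetical helper.

```python
def conv1x1_cost(in_ch: int, out_ch: int, height: int, width: int,
                 bias: bool = True) -> tuple:
    """Return (parameter count, multiply-accumulate count) of a 1x1 conv."""
    params = in_ch * out_ch + (out_ch if bias else 0)
    # One multiply-accumulate per weight per output position.
    macs = in_ch * out_ch * height * width
    return params, macs


# After global average pooling the feature map is 1x1 (assumed channel sizes).
params_after_pool, macs_after_pool = conv1x1_cost(960, 1280, 1, 1)
# The same layer applied before pooling, on a 7x7 map, for comparison.
params_before_pool, macs_before_pool = conv1x1_cost(960, 1280, 7, 7)
```

The parameter count is identical in both cases (about 1.23 million), but the multiply-accumulate count is 49 times smaller when the layer runs on the pooled 1x1 map.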
**Note**: For all models, inference uses an image crop_size of 224 and a resize_short_size of 256.
## Accuracy, FLOPS, and Parameters
......@@ -54,78 +53,42 @@ MobileNetV3 is a new NAS-based lightweight network proposed by Google in 2019
| ShuffleNetV2_swish | 0.700 | 0.892 | | | 0.290 | 2.260 |
## FP16 Inference Speed
| Models | batch_size=1<br>(ms) | batch_size=4<br>(ms) | batch_size=8<br>(ms) | batch_size=32<br>(ms) |
|:--:|:--:|:--:|:--:|:--:|
| MobileNetV1_x0_25 | 0.236 | 0.258 | 0.281 | 0.556 |
| MobileNetV1_x0_5 | 0.246 | 0.318 | 0.364 | 0.845 |
| MobileNetV1_x0_75 | 0.303 | 0.380 | 0.512 | 1.164 |
| MobileNetV1 | 0.340 | 0.426 | 0.601 | 1.578 |
| MobileNetV1_ssld | 0.340 | 0.426 | 0.601 | 1.578 |
| MobileNetV2_x0_25 | 0.432 | 0.488 | 0.532 | 0.967 |
| MobileNetV2_x0_5 | 0.475 | 0.564 | 0.654 | 1.296 |
| MobileNetV2_x0_75 | 0.553 | 0.653 | 0.821 | 1.761 |
| MobileNetV2 | 0.610 | 0.738 | 0.931 | 2.115 |
| MobileNetV2_x1_5 | 0.731 | 0.966 | 1.252 | 3.152 |
| MobileNetV2_x2_0 | 0.870 | 1.123 | 1.494 | 3.910 |
| MobileNetV2_ssld | 0.610 | 0.738 | 0.931 | 2.115 |
| MobileNetV3_large_<br>x1_25 | 2.004 | 2.223 | 2.433 | 5.954 |
| MobileNetV3_large_<br>x1_0 | 1.943 | 2.203 | 2.113 | 4.823 |
| MobileNetV3_large_<br>x0_75 | 2.107 | 2.266 | 2.120 | 3.968 |
| MobileNetV3_large_<br>x0_5 | 1.942 | 2.178 | 2.179 | 2.936 |
| MobileNetV3_large_<br>x0_35 | 1.994 | 2.407 | 2.285 | 2.420 |
| MobileNetV3_small_<br>x1_25 | 1.876 | 2.141 | 2.118 | 3.423 |
| MobileNetV3_small_<br>x1_0 | 1.751 | 2.160 | 2.203 | 2.830 |
| MobileNetV3_small_<br>x0_75 | 1.856 | 2.235 | 2.166 | 2.464 |
| MobileNetV3_small_<br>x0_5 | 1.773 | 2.304 | 2.242 | 2.133 |
| MobileNetV3_small_<br>x0_35 | 1.870 | 2.392 | 2.323 | 2.101 |
| MobileNetV3_large_<br>x1_0_ssld | 1.943 | 2.203 | 2.113 | 4.823 | |
| MobileNetV3_small_<br>x1_0_ssld | 1.751 | 2.160 | 2.203 | 2.830 |
| ShuffleNetV2 | 1.134 | 1.068 | 1.199 | 2.558 |
| ShuffleNetV2_x0_25 | 0.911 | 0.953 | 0.948 | 1.327 |
| ShuffleNetV2_x0_33 | 0.853 | 1.072 | 0.958 | 1.398 |
| ShuffleNetV2_x0_5 | 0.858 | 1.059 | 1.084 | 1.620 |
| ShuffleNetV2_x1_5 | 1.040 | 1.153 | 1.394 | 3.452 |
| ShuffleNetV2_x2_0 | 1.061 | 1.316 | 1.694 | 4.485 |
| ShuffleNetV2_swish | 1.688 | 1.958 | 1.707 | 3.711 |
## FP32 Inference Speed
| Models | batch_size=1<br>(ms) | batch_size=4<br>(ms) | batch_size=8<br>(ms) | batch_size=32<br>(ms) |
|:--:|:--:|:--:|:--:|:--:|
| MobileNetV1_x0_25 | 0.233 | 0.372 | 0.424 | 0.930 |
| MobileNetV1_x0_5 | 0.281 | 0.532 | 0.677 | 1.808 |
| MobileNetV1_x0_75 | 0.344 | 0.733 | 0.960 | 2.920 |
| MobileNetV1 | 0.420 | 0.963 | 1.462 | 4.769 |
| MobileNetV1_ssld | 0.420 | 0.963 | 1.462 | 4.769 |
| MobileNetV2_x0_25 | 0.718 | 0.738 | 0.775 | 1.482 |
| MobileNetV2_x0_5 | 0.818 | 0.975 | 1.107 | 2.481 |
| MobileNetV2_x0_75 | 0.830 | 1.104 | 1.514 | 3.629 |
| MobileNetV2 | 0.889 | 1.346 | 1.875 | 4.711 |
| MobileNetV2_x1_5 | 1.221 | 1.982 | 2.951 | 7.645 |
| MobileNetV2_x2_0 | 1.546 | 2.625 | 3.734 | 10.429 |
| MobileNetV2_ssld | 0.889 | 1.346 | 1.875 | 4.711 |
| MobileNetV3_large_<br>x1_25 | 2.113 | 2.377 | 3.114 | 7.332 |
| MobileNetV3_large_<br>x1_0 | 1.991 | 2.380 | 2.517 | 5.826 |
| MobileNetV3_large_<br>x0_75 | 2.105 | 2.454 | 2.336 | 4.611 |
| MobileNetV3_large_<br>x0_5 | 1.978 | 2.603 | 2.291 | 3.306 |
| MobileNetV3_large_<br>x0_35 | 2.017 | 2.469 | 2.316 | 2.558 |
| MobileNetV3_small_<br>x1_25 | 1.915 | 2.411 | 2.295 | 3.742 |
| MobileNetV3_small_<br>x1_0 | 1.915 | 2.889 | 2.862 | 3.022 |
| MobileNetV3_small_<br>x0_75 | 1.941 | 2.358 | 2.232 | 2.602 |
| MobileNetV3_small_<br>x0_5 | 1.872 | 2.364 | 2.238 | 2.061 |
| MobileNetV3_small_<br>x0_35 | 1.889 | 2.407 | 2.328 | 2.127 |
| MobileNetV3_large_<br>x1_0_ssld | 1.991 | 2.380 | 2.517 | 5.826 |
| MobileNetV3_small_<br>x1_0_ssld | 1.915 | 2.889 | 2.862 | 3.022 |
| ShuffleNetV2 | 1.328 | 1.211 | 1.440 | 3.210 |
| ShuffleNetV2_x0_25 | 0.905 | 0.908 | 0.924 | 1.284 |
| ShuffleNetV2_x0_33 | 0.871 | 1.073 | 0.891 | 1.416 |
| ShuffleNetV2_x0_5 | 0.852 | 1.150 | 1.093 | 1.702 |
| ShuffleNetV2_x1_5 | 0.874 | 1.470 | 1.889 | 4.490 |
| ShuffleNetV2_x2_0 | 1.443 | 1.908 | 2.556 | 6.864 |
| ShuffleNetV2_swish | 1.694 | 1.856 | 2.101 | 3.942 |
| Models | Crop Size | Resize Short Size | Batch Size=1<br>(ms) |
|--------------------------------------|-----------|-------------------|--------------------------|
| MobileNetV1_x0_25 | 224 | 256 | 0.492 |
| MobileNetV1_x0_5 | 224 | 256 | 0.599 |
| MobileNetV1_x0_75 | 224 | 256 | 0.695 |
| MobileNetV1 | 224 | 256 | 0.739 |
| MobileNetV1_ssld | 224 | 256 | 0.739 |
| MobileNetV2_x0_25 | 224 | 256 | 1.014 |
| MobileNetV2_x0_5 | 224 | 256 | 1.216 |
| MobileNetV2_x0_75 | 224 | 256 | 1.392 |
| MobileNetV2 | 224 | 256 | 1.153 |
| MobileNetV2_x1_5 | 224 | 256 | 1.516 |
| MobileNetV2_x2_0 | 224 | 256 | 1.819 |
| MobileNetV2_ssld | 224 | 256 | 1.153 |
| MobileNetV3_large_<br>x1_25 | 224 | 256 | 3.070 |
| MobileNetV3_large_<br>x1_0 | 224 | 256 | 3.173 |
| MobileNetV3_large_<br>x0_75 | 224 | 256 | 2.928 |
| MobileNetV3_large_<br>x0_5 | 224 | 256 | 2.979 |
| MobileNetV3_large_<br>x0_35 | 224 | 256 | 2.987 |
| MobileNetV3_small_<br>x1_25 | 224 | 256 | 3.003 |
| MobileNetV3_small_<br>x1_0 | 224 | 256 | 3.168 |
| MobileNetV3_small_<br>x0_75 | 224 | 256 | 2.974 |
| MobileNetV3_small_<br>x0_5 | 224 | 256 | 2.199 |
| MobileNetV3_small_<br>x0_35 | 224 | 256 | 2.240 |
| MobileNetV3_large_<br>x1_0_ssld | 224 | 256 | 3.173 |
| MobileNetV3_small_<br>x1_0_ssld | 224 | 256 | 3.168 |
| ShuffleNetV2 | 224 | 256 | 1.861 |
| ShuffleNetV2_x0_25 | 224 | 256 | 1.410 |
| ShuffleNetV2_x0_33 | 224 | 256 | 1.271 |
| ShuffleNetV2_x0_5 | 224 | 256 | 1.389 |
| ShuffleNetV2_x1_5 | 224 | 256 | 1.239 |
| ShuffleNetV2_x2_0 | 224 | 256 | 2.152 |
| ShuffleNetV2_swish | 224 | 256 | 2.150 |
## CPU Inference Speed and Storage Size
......
......@@ -3,8 +3,6 @@
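## Overview
This section is still being updated.
For DarkNet53, inference uses an image crop_size of 256 and a resize_short_size of 256; for the other models, inference uses a crop_size of 224 and a resize_short_size of 256.
## Accuracy, FLOPS, and Parameters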
......@@ -22,31 +20,18 @@ For DarkNet53 inference, the image crop_size is set to 256 and resize_short_size is set
| ResNet50_ACNet<br>_deploy | 0.767 | 0.932 | | | 8.190 | 25.550 |
## FP16 Inference Speed
| Models | batch_size=1<br>(ms) | batch_size=4<br>(ms) | batch_size=8<br>(ms) | batch_size=32<br>(ms) |
|:--:|:--:|:--:|:--:|:--:|
| AlexNet | 0.684 | 0.740 | 0.810 | 1.481 |
| SqueezeNet1_0 | 0.545 | 0.841 | 1.146 | 3.501 |
| SqueezeNet1_1 | 0.473 | 0.575 | 0.805 | 1.862 |
| VGG11 | 1.096 | 1.655 | 2.396 | 6.728 |
| VGG13 | 1.216 | 2.059 | 3.056 | 9.468 |
| VGG16 | 1.518 | 2.594 | 4.019 | 12.145 |
| VGG19 | 1.817 | 3.124 | 4.886 | 14.958 |
| DarkNet53 | 2.150 | 2.627 | 3.422 | 10.092 | |
| ResNet50_ACNet<br>_deploy | 2.748 | 3.178 | 3.823 | 8.369 |
## FP32 Inference Speed
| Models | batch_size=1<br>(ms) | batch_size=4<br>(ms) | batch_size=8<br>(ms) | batch_size=32<br>(ms) |
|:--:|:--:|:--:|:--:|:--:|
| AlexNet | 0.682 | 0.875 | 1.196 | 3.196 |
| SqueezeNet1_0 | 0.530 | 1.072 | 1.652 | 5.338 |
| SqueezeNet1_1 | 0.439 | 0.787 | 1.164 | 2.973 |
| VGG11 | 1.575 | 3.638 | 6.427 | 23.227 |
| VGG13 | 1.859 | 4.832 | 8.832 | 32.946 |
| VGG16 | 2.316 | 6.420 | 11.936 | 44.719 |
| VGG19 | 2.775 | 8.013 | 14.925 | 57.272 |
| DarkNet53 | 2.648 | 5.727 | 9.616 | 33.664 | |
| ResNet50_ACNet<br>_deploy | 4.544 | 6.873 | 9.627 | 28.283 |
| Models | Crop Size | Resize Short Size | Batch Size=1<br>(ms) |
|---------------------------|-----------|-------------------|----------------------|
| AlexNet | 224 | 256 | 1.176 |
| SqueezeNet1_0 | 224 | 256 | 0.860 |
| SqueezeNet1_1 | 224 | 256 | 0.763 |
| VGG11 | 224 | 256 | 1.867 |
| VGG13 | 224 | 256 | 2.148 |
| VGG16 | 224 | 256 | 2.616 |
| VGG19 | 224 | 256 | 3.076 |
| DarkNet53 | 256 | 256 | 3.139 |
| ResNet50_ACNet<br>_deploy | 224 | 256 | 5.626 |
......@@ -9,7 +9,16 @@ The ResNet series was proposed in 2015 and went straight on to win the ILSVRC2015 competition
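This release of the ResNet series contains 14 pretrained models, including ResNet50, ResNet50_vd, ResNet50_vd_ssld, and ResNet200_vd. On the training side, the ResNet models use the standard ImageNet training pipeline, while the improved variants use additional training strategies: the learning rate follows a cosine decay schedule, label smoothing is used as label regularization, mixup is added to data preprocessing, and the total number of epochs is increased from 120 to 200.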
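The three strategies named above can be sketched in a few lines of plain Python. This is a minimal illustration of the general techniques, not the PaddleClas training code: the function names, the smoothing factor of 0.1, and the uniform form of label smoothing are assumptions, since the text does not state the hyperparameters used.

```python
import math


def cosine_decay_lr(base_lr: float, step: int, total_steps: int) -> float:
    """Learning rate that falls from base_lr to 0 along half a cosine wave."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * step / total_steps))


def smooth_labels(label: int, num_classes: int, epsilon: float = 0.1) -> list:
    """One-hot target with epsilon of its mass spread uniformly (assumed form)."""
    off_value = epsilon / num_classes
    return [off_value + (1.0 - epsilon if i == label else 0.0)
            for i in range(num_classes)]


def mixup(x_a: list, y_a: list, x_b: list, y_b: list, lam: float) -> tuple:
    """Blend two samples and their soft labels with weight lam in [0, 1]."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y
```

In practice lam is usually drawn from a Beta distribution for every batch; it is passed in explicitly here to keep the sketch deterministic.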
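Among them, ResNet50_vd_v2 and ResNet50_vd_ssld use knowledge distillation, which further improves accuracy while keeping the model structure unchanged. Specifically, the teacher for ResNet50_vd_v2 is ResNet152_vd (top-1 accuracy 80.59%), and the data is the ImageNet-1k training set; the teacher for ResNet50_vd_ssld is ResNeXt101_32x16d_wsl (top-1 accuracy 84.2%), and the data combines the ImageNet-1k training set with 4 million images mined from ImageNet-22k. Details of the knowledge distillation method are still being updated.
![](../../images/models/ResNet.png)
The FLOPS, parameter counts, and FP32 inference latency of this model series are shown in the figures below.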
![](../../images/models/ResNet.png.flops.png)
![](../../images/models/ResNet.png.params.png)
![](../../images/models/ResNet.png.fp32.png)
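The curves above show that accuracy rises with the number of layers, but so do the parameter count, computation, and latency. By using a stronger teacher and more data, ResNet50_vd_ssld further raises its top-1 accuracy on the ImageNet-1k validation set to 82.39%, a new best for the ResNet50 series.
**Note**: For all models, inference uses an image crop_size of 224 and a resize_short_size of 256.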
......@@ -32,43 +41,27 @@ The ResNet series was proposed in 2015 and went straight on to win the ILSVRC2015 competition
| ResNet152_vd | 0.806 | 0.953 | | | 23.530 | 60.210 |
| ResNet200_vd | 0.809 | 0.953 | | | 30.530 | 74.740 |
| ResNet50_vd_ssld | 0.824 | 0.961 | | | 8.670 | 25.580 |
| ResNet101_vd_ssld | 0.835 | 0.968 | | | 16.100 | 44.570 |
## FP16 Inference Speed
| Models | batch_size=1<br>(ms) | batch_size=4<br>(ms) | batch_size=8<br>(ms) | batch_size=32<br>(ms) |
|:--:|:--:|:--:|:--:|:--:|
| ResNet18 | 0.966 | 1.076 | 1.263 | 2.656 |
| ResNet18_vd | 1.002 | 1.163 | 1.392 | 3.045 |
| ResNet34 | 1.798 | 1.959 | 2.269 | 4.716 |
| ResNet34_vd | 1.839 | 2.011 | 2.482 | 4.767 |
| ResNet50 | 1.892 | 2.146 | 2.692 | 6.411 |
| ResNet50_vc | 1.903 | 2.094 | 2.677 | 6.096 |
| ResNet50_vd | 1.918 | 2.273 | 2.833 | 6.978 |
| ResNet50_vd_v2 | 1.918 | 2.273 | 2.833 | 6.978 |
| ResNet101 | 3.790 | 4.128 | 4.789 | 10.913 |
| ResNet101_vd | 3.853 | 4.229 | 5.001 | 11.437 |
| ResNet152 | 5.523 | 5.871 | 6.710 | 15.258 |
| ResNet152_vd | 5.503 | 6.003 | 7.001 | 15.716 |
| ResNet200_vd | 7.270 | 7.595 | 8.802 | 19.516 |
| ResNet50_vd_ssld | 1.918 | 2.273 | 2.833 | 6.978 |
## FP32 Inference Speed
| Models | batch_size=1<br>(ms) | batch_size=4<br>(ms) | batch_size=8<br>(ms) | batch_size=32<br>(ms) |
|:--:|:--:|:--:|:--:|:--:|
| ResNet18 | 1.127 | 1.428 | 2.352 | 7.780 |
| ResNet18_vd | 1.142 | 1.532 | 2.584 | 8.441 |
| ResNet34 | 1.936 | 2.409 | 4.197 | 14.442 |
| ResNet34_vd | 1.948 | 2.526 | 4.403 | 15.133 |
| ResNet50 | 2.630 | 4.393 | 6.491 | 20.449 |
| ResNet50_vc | 2.728 | 4.413 | 6.618 | 21.183 |
| ResNet50_vd | 2.649 | 4.522 | 6.771 | 21.552 |
| ResNet50_vd_v2 | 2.649 | 4.522 | 6.771 | 21.552 |
| ResNet101 | 4.747 | 8.015 | 11.555 | 36.739 |
| ResNet101_vd | 4.735 | 8.111 | 11.820 | 37.155 |
| ResNet152 | 6.618 | 11.471 | 16.580 | 51.792 |
| ResNet152_vd | 6.626 | 11.613 | 16.843 | 53.645 |
| ResNet200_vd | 8.540 | 14.770 | 21.554 | 69.053 |
| ResNet50_vd_ssld | 2.649 | 4.522 | 6.771 | 21.552 |
| Models | Crop Size | Resize Short Size | Batch Size=1<br>(ms) |
|------------------|-----------|-------------------|--------------------------|
| ResNet18 | 224 | 256 | 1.499 |
| ResNet18_vd | 224 | 256 | 1.603 |
| ResNet34 | 224 | 256 | 2.272 |
| ResNet34_vd | 224 | 256 | 2.343 |
| ResNet50 | 224 | 256 | 2.939 |
| ResNet50_vc | 224 | 256 | 3.041 |
| ResNet50_vd | 224 | 256 | 3.165 |
| ResNet50_vd_v2 | 224 | 256 | 3.165 |
| ResNet101 | 224 | 256 | 5.314 |
| ResNet101_vd | 224 | 256 | 5.252 |
| ResNet152 | 224 | 256 | 7.205 |
| ResNet152_vd | 224 | 256 | 7.200 |
| ResNet200_vd | 224 | 256 | 8.885 |
| ResNet50_vd_ssld | 224 | 256 | 3.165 |
| ResNet101_vd_ssld | 224 | 256 | 5.252 |
......@@ -6,9 +6,18 @@ ResNeXt is one of the typical variants of ResNet; it was published at the 2017 CVPR conference
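SENet was the winning entry of the 2017 ImageNet classification competition. It proposes a new SE structure that can be transferred to any other network; by controlling the scale values, it strengthens the important features across channels and weakens the unimportant ones, making the extracted features more discriminative.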
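A minimal sketch of the squeeze-and-excitation idea described above, written with plain Python lists so it needs no dependencies. It assumes the usual form of the block (global average pooling, a reducing fully connected layer with ReLU, an expanding fully connected layer with sigmoid, then per-channel scaling) and omits biases; `se_scale` and its weight arguments are hypothetical names, not PaddleClas code.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def se_scale(feature_maps: list, w_reduce: list, w_expand: list) -> list:
    """Apply an SE block to C channels, each given as a flat list of values.

    w_reduce has shape [C_reduced][C]; w_expand has shape [C][C_reduced].
    """
    # Squeeze: one average per channel.
    squeezed = [sum(channel) / len(channel) for channel in feature_maps]
    # Excitation: FC + ReLU, then FC + sigmoid, giving one gate per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)))
              for row in w_reduce]
    gates = [sigmoid(sum(w * h for w, h in zip(row, hidden)))
             for row in w_expand]
    # Scale: multiply every value in a channel by that channel's gate.
    return [[gate * value for value in channel]
            for gate, channel in zip(gates, feature_maps)]
```

Each gate lies between 0 and 1, so channels the block judges unimportant are attenuated while the others pass through nearly unchanged.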
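Res2Net is a new improvement on ResNet proposed in 2019. It can be easily integrated with other existing strong modules and, without increasing the computational load, outperforms ResNet on datasets such as ImageNet and CIFAR-100. Res2Net is structurally simple and performs well, and it further explores the multi-scale representation ability of CNNs at a finer granularity. It reveals a new dimension for improving model accuracy, namely scale, which is an essential and more effective factor in addition to the existing dimensions of depth, width, and cardinality. The network also performs quite well on other vision tasks such as object detection and image segmentation.
![](../../images/models/SeResNeXt.png)
The FLOPS, parameter counts, and FP32 inference latency of this model series are shown in the figures below.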
![](../../images/models/SeResNeXt.png.flops.png)
![](../../images/models/SeResNeXt.png.params.png)
![](../../images/models/SeResNeXt.png.fp32.png)
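PaddleClas currently open-sources 24 pretrained models across these three families; their metrics are shown in the figures. The figures show that, at the same FLOPs and Params, the improved models tend to reach higher accuracy but are usually slower at inference than the ResNet series. Res2Net also performs well: compared with the group operation in ResNeXt and the SE structure in SEResNet, Res2Net tends to reach better accuracy at the same FLOPs, Params, and inference speed.
**Note**: For all models, inference uses an image crop_size of 224 and a resize_short_size of 256.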
......@@ -42,61 +51,32 @@ Res2Net is a new improvement on ResNet proposed in 2019; this scheme can
| SENet154_vd | 0.814 | 0.955 | | | 45.830 | 114.290 |
## FP16 Inference Speed
| Models | batch_size=1<br>(ms) | batch_size=4<br>(ms) | batch_size=8<br>(ms) | batch_size=32<br>(ms) |
|:--:|:--:|:--:|:--:|:--:|
| Res2Net50_26w_4s | 2.625 | 3.338 | 4.670 | 11.939 |
| Res2Net50_vd_26w_4s | 2.642 | 3.480 | 4.862 | 13.089 |
| Res2Net50_14w_8s | 3.393 | 4.237 | 5.473 | 13.979 |
| Res2Net101_vd_26w_4s | 5.128 | 6.190 | 7.995 | 20.534 |
| Res2Net200_vd_26w_4s | 9.594 | 11.131 | 14.278 | 36.258 |
| ResNeXt50_32x4d | 6.795 | 7.102 | 8.444 | 18.938 |
| ResNeXt50_vd_32x4d | 7.455 | 7.231 | 8.891 | 19.849 |
| ResNeXt50_64x4d | 20.279 | 12.343 | 13.633 | 32.772 |
| ResNeXt50_vd_64x4d | 16.325 | 21.773 | 25.007 | 55.329 |
| ResNeXt101_32x4d | 14.847 | 15.092 | 15.847 | 42.681 |
| ResNeXt101_vd_32x4d | 15.227 | 15.139 | 16.603 | 39.371 |
| ResNeXt101_64x4d | 28.221 | 29.455 | 29.873 | 59.415 |
| ResNeXt101_vd_64x4d | 31.051 | 28.160 | 28.915 | 60.525 |
| ResNeXt152_32x4d | 22.961 | 23.167 | 24.173 | 51.621 |
| ResNeXt152_vd_32x4d | 23.259 | 23.469 | 23.886 | 52.085 |
| ResNeXt152_64x4d | 41.930 | 42.441 | 45.985 | 79.405 |
| ResNeXt152_vd_64x4d | 42.778 | 43.281 | 45.017 | 79.728 |
| SE_ResNet18_vd | 1.256 | 1.463 | 1.917 | 4.316 |
| SE_ResNet34_vd | 2.314 | 2.691 | 3.432 | 7.411 |
| SE_ResNet50_vd | 2.884 | 4.051 | 5.421 | 15.013 |
| SE_ResNeXt50_32x4d | 7.973 | 10.613 | 12.788 | 29.091 |
| SE_ResNeXt50_vd_32x4d | 8.340 | 12.245 | 15.253 | 30.399 |
| SE_ResNeXt101_32x4d | 17.324 | 21.004 | 28.541 | 52.888 |
| SENet154_vd | 47.234 | 48.018 | 52.967 | 109.787 |
## FP32 Inference Speed
| Models | batch_size=1<br>(ms) | batch_size=4<br>(ms) | batch_size=8<br>(ms) | batch_size=32<br>(ms) |
|:--:|:--:|:--:|:--:|:--:|
| Res2Net50_26w_4s | 3.711 | 5.855 | 8.450 | 26.084 |
| Res2Net50_vd_26w_4s | 3.651 | 5.986 | 8.747 | 26.772 |
| Res2Net50_14w_8s | 4.549 | 6.863 | 9.492 | 27.049 |
| Res2Net101_vd_26w_4s | 6.658 | 10.870 | 15.364 | 47.054 |
| Res2Net200_vd_26w_4s | 12.017 | 19.871 | 28.330 | 88.645 |
| ResNeXt50_32x4d | 6.747 | 8.862 | 11.961 | 32.782 |
| ResNeXt50_vd_32x4d | 6.746 | 9.037 | 12.279 | 33.496 |
| ResNeXt50_64x4d | 11.577 | 14.570 | 20.425 | 57.979 |
| ResNeXt50_vd_64x4d | 19.219 | 21.454 | 30.943 | 90.950 |
| ResNeXt101_32x4d | 14.652 | 18.082 | 24.148 | 70.200 |
| ResNeXt101_vd_32x4d | 14.927 | 18.454 | 23.894 | 67.334 |
| ResNeXt101_64x4d | 28.726 | 30.999 | 43.169 | 116.282 |
| ResNeXt101_vd_64x4d | 28.350 | 31.186 | 41.315 | 113.655 |
| ResNeXt152_32x4d | 23.578 | 27.323 | 35.588 | 99.121 |
| ResNeXt152_vd_32x4d | 23.548 | 26.879 | 35.091 | 104.832 |
| ResNeXt152_64x4d | 43.214 | 43.339 | 60.990 | 159.381 |
| ResNeXt152_vd_64x4d | 43.998 | 44.510 | 61.094 | 160.601 |
| SE_ResNet18_vd | 1.353 | 1.867 | 3.021 | 9.331 |
| SE_ResNet34_vd | 2.421 | 3.201 | 5.294 | 16.849 |
| SE_ResNet50_vd | 3.403 | 6.023 | 8.721 | 26.978 |
| SE_ResNeXt50_32x4d | 8.339 | 12.689 | 15.471 | 41.562 |
| SE_ResNeXt50_vd_32x4d | 7.849 | 13.530 | 16.810 | 44.020 |
| SE_ResNeXt101_32x4d | 16.853 | 24.409 | 32.666 | 81.806 |
| SENet154_vd | 46.002 | 53.666 | 70.589 | 180.334 |
| Models | Crop Size | Resize Short Size | Batch Size=1<br>(ms) |
|-----------------------|-----------|-------------------|--------------------------|
| Res2Net50_26w_4s | 224 | 256 | 4.148 |
| Res2Net50_vd_26w_4s | 224 | 256 | 4.172 |
| Res2Net50_14w_8s | 224 | 256 | 5.113 |
| Res2Net101_vd_26w_4s | 224 | 256 | 7.327 |
| Res2Net200_vd_26w_4s | 224 | 256 | 12.806 |
| ResNeXt50_32x4d | 224 | 256 | 10.964 |
| ResNeXt50_vd_32x4d | 224 | 256 | 7.566 |
| ResNeXt50_64x4d | 224 | 256 | 13.905 |
| ResNeXt50_vd_64x4d | 224 | 256 | 14.321 |
| ResNeXt101_32x4d | 224 | 256 | 14.915 |
| ResNeXt101_vd_32x4d | 224 | 256 | 14.885 |
| ResNeXt101_64x4d | 224 | 256 | 28.716 |
| ResNeXt101_vd_64x4d | 224 | 256 | 28.398 |
| ResNeXt152_32x4d | 224 | 256 | 22.996 |
| ResNeXt152_vd_32x4d | 224 | 256 | 22.729 |
| ResNeXt152_64x4d | 224 | 256 | 46.705 |
| ResNeXt152_vd_64x4d | 224 | 256 | 46.395 |
| SE_ResNet18_vd | 224 | 256 | 1.694 |
| SE_ResNet34_vd | 224 | 256 | 2.786 |
| SE_ResNet50_vd | 224 | 256 | 3.749 |
| SE_ResNeXt50_32x4d | 224 | 256 | 8.924 |
| SE_ResNeXt50_vd_32x4d | 224 | 256 | 9.011 |
| SE_ResNeXt101_32x4d | 224 | 256 | 19.204 |
| SENet154_vd | 224 | 256 | 50.406 |
......@@ -2,7 +2,26 @@
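## Overview
Based on the ImageNet1k classification dataset, the 23 series of classification network architectures supported by PaddleClas and the corresponding 117 pretrained image classification models are listed below. Training tricks, a brief introduction to each series, and performance evaluations are presented in the corresponding chapters. The GPU evaluation environment is based on V100 and TensorRT, and the CPU evaluation environment is based on Snapdragon 855 (SD855).
Based on the ImageNet1k classification dataset, the 23 series of classification network architectures supported by PaddleClas and the corresponding 117 pretrained image classification models are listed below. Training tricks, a brief introduction to each series, and performance evaluations are presented in the corresponding chapters.
## Evaluation Environment
* The CPU evaluation environment is based on Snapdragon 855 (SD855).
* The GPU evaluation environment is based on V100 and TensorRT; the evaluation script is as follows.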
```shell
#!/usr/bin/env bash
export PYTHONPATH=$PWD:$PYTHONPATH
python tools/infer/predict.py \
--model_file='pretrained/infer/model' \
--params_file='pretrained/infer/params' \
--enable_benchmark=True \
--model_name=ResNet50_vd \
--use_tensorrt=True \
--use_fp16=False \
--batch_size=1
```
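The tables in this change report latency in milliseconds, while figures such as main_fps_top1.png are named after FPS. A minimal sketch of the conversion follows, assuming the reported latency is the time to process one whole batch; `images_per_second` is a hypothetical helper and not part of `tools/infer/predict.py`.

```python
def images_per_second(latency_ms: float, batch_size: int = 1) -> float:
    """Convert a per-batch latency in milliseconds to throughput in images/s."""
    return batch_size * 1000.0 / latency_ms


# ResNet50_vd at batch size 1, using the 3.165 ms figure from the table above.
resnet50_vd_fps = images_per_second(3.165)
```

Under that assumption, a batch-size-1 latency of 3.165 ms corresponds to roughly 316 images per second.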
![](../../images/models/main_fps_top1.png)
![](../../images/models/mobile_arm_top1.png)
......@@ -25,6 +44,7 @@
- [ResNet152_vd](https://paddle-imagenet-models-name.bj.bcebos.com/ResNet152_vd_pretrained.tar)
- [ResNet200_vd](https://paddle-imagenet-models-name.bj.bcebos.com/ResNet200_vd_pretrained.tar)
- [ResNet50_vd_ssld](https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_vd_ssld_pretrained.tar)
- [ResNet101_vd_ssld](https://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_vd_ssld_pretrained.tar)
- Mobile series
......
......@@ -109,11 +109,6 @@ def main():
operators = create_operators()
predictor = create_predictor(args)
inputs = preprocess(args.image_file, operators)
inputs = np.expand_dims(
inputs, axis=0).repeat(
args.batch_size, axis=0).copy()
input_names = predictor.get_input_names()
input_tensor = predictor.get_input_tensor(input_names[0])
......
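The `np.expand_dims(...).repeat(...)` expression in this hunk turns one preprocessed image into a batch by stacking copies of it. A small self-contained sketch of that idiom is shown below; `make_batch` is a hypothetical wrapper, and the CHW input shape is an assumption about what `preprocess` returns.

```python
import numpy as np


def make_batch(image: np.ndarray, batch_size: int) -> np.ndarray:
    """Stack batch_size copies of one image along a new leading axis."""
    # expand_dims adds the batch axis; repeat duplicates along it;
    # copy() yields an independent, contiguous array.
    return np.expand_dims(image, axis=0).repeat(batch_size, axis=0).copy()


# Assumed input: a single float32 image in channel-first (C, H, W) layout.
batch = make_batch(np.zeros((3, 224, 224), dtype=np.float32), 4)
```

Repeating one image is enough for latency benchmarking, because inference time depends on tensor shape rather than pixel content.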