diff --git a/_sources/api_cn/nas_api.rst.txt b/_sources/api_cn/nas_api.rst.txt
index f1f0214d5c6e8d7e1a1390a88adfea4612355772..d5ddaa583cba8b38282c3de66cafc5ffb2607f9e 100644
--- a/_sources/api_cn/nas_api.rst.txt
+++ b/_sources/api_cn/nas_api.rst.txt
@@ -128,7 +128,7 @@ SANAS(Simulated Annealing Neural Architecture Search)是基于模拟退火
- **tokens(list):** - 一组tokens。tokens的长度和范围取决于搜索空间。
**返回:**
- 根据传入的token得到一个模型结构实例。
+ 根据传入的token得到一个模型结构实例列表。
**示例代码:**
@@ -153,8 +153,10 @@ SANAS(Simulated Annealing Neural Architecture Search)是基于模拟退火
**示例代码:**
.. code-block:: python
+
import paddle.fluid as fluid
from paddleslim.nas import SANAS
config = [('MobileNetV2Space')]
sanas = SANAS(configs=config)
print(sanas.current_info())
+
diff --git a/_sources/api_cn/pantheon_api.md.txt b/_sources/api_cn/pantheon_api.md.txt
index 1405faeaf691fe909eb19141188f702c1201776b..1de04a6451396149f09c198c206d47181d74546e 100644
--- a/_sources/api_cn/pantheon_api.md.txt
+++ b/_sources/api_cn/pantheon_api.md.txt
@@ -1,4 +1,4 @@
-# 多进程蒸馏
+# 大规模可扩展知识蒸馏框架 Pantheon
## Teacher
@@ -100,7 +100,8 @@ pantheon.Teacher.start\_knowledge\_service(feed\_list, schema, program, reader\_
- **times (int):** The maximum repeated serving times, default 1. Whenever
the public method **get\_knowledge\_generator()** in **Student**
object called once, the serving times will be added one,
- until reaching the maximum and ending the service.
+ until reaching the maximum and ending the service. Only
+ valid in online mode, and will be ignored in offline mode.
**Return:** None
diff --git a/_sources/api_cn/prune_api.rst.txt b/_sources/api_cn/prune_api.rst.txt
index d38480d6055e52c86e3d20fa9d384040a7389856..38a76acdaed38fe8c9a302e614016e985ed9ff11 100644
--- a/_sources/api_cn/prune_api.rst.txt
+++ b/_sources/api_cn/prune_api.rst.txt
@@ -378,7 +378,7 @@ load_sensitivities
}
}
sensitivities_file = "sensitive_api_demo.data"
- with open(sensitivities_file, 'w') as f:
+ with open(sensitivities_file, 'wb') as f:
pickle.dump(sen, f)
sensitivities = load_sensitivities(sensitivities_file)
print(sensitivities)
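The change from `'w'` to `'wb'` in this hunk matters because `pickle.dump` writes bytes, so under Python 3 a file opened in text mode raises a `TypeError`. Below is a minimal standalone sketch of the round trip. It uses plain `pickle.load` in place of `load_sensitivities` so that it runs without PaddleSlim, and the parameter name and values in `sen` are made up for illustration.

```python
import os
import pickle
import tempfile

# Illustrative sensitivities dict; the parameter name and numbers are hypothetical.
sen = {"conv1_weights": {0.1: 0.01, 0.2: 0.05}}

# Write into a temporary directory so the sketch leaves no files behind in the cwd.
path = os.path.join(tempfile.mkdtemp(), "sensitive_api_demo.data")

# pickle produces bytes, so the file has to be opened in binary mode.
with open(path, "wb") as f:
    pickle.dump(sen, f)

# Reading back also needs binary mode.
with open(path, "rb") as f:
    loaded = pickle.load(f)

# Opening in text mode fails under Python 3 because write() expects str, not bytes.
try:
    with open(path + ".txt", "w") as f:
        pickle.dump(sen, f)
    text_mode_failed = False
except TypeError:
    text_mode_failed = True
```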
diff --git a/_sources/api_en/index_en.rst.txt b/_sources/api_en/index_en.rst.txt
index 9fd0325a9142160f4a53c792ed430f48ca5f145a..2f21132c4c183ed70b687118eaf198d37cf49504 100644
--- a/_sources/api_en/index_en.rst.txt
+++ b/_sources/api_en/index_en.rst.txt
@@ -17,3 +17,4 @@ API Documents
paddleslim.nas.one_shot.rst
paddleslim.pantheon.rst
search_space_en.rst
+ table_latency_en.md
diff --git a/_sources/api_en/table_latency_en.md.txt b/_sources/api_en/table_latency_en.md.txt
new file mode 100644
index 0000000000000000000000000000000000000000..6a6c6ac665dd8fa021780b753e2287e52c16d404
--- /dev/null
+++ b/_sources/api_en/table_latency_en.md.txt
@@ -0,0 +1,144 @@
+# Hardware latency table
+
+The hardware latency table is used to estimate inference time for a specific hardware environment and inference engine. This document describes the table format that PaddleSlim supports.
+
+## Introduction
+
+The hardware latency table stores all possible operations. Each operation in the table consists of a type and its parameters. For example, the type can be `conv2d`, and the corresponding parameters can be the size of the feature map, the number of kernels, and the kernel size.
+The latency of each operation depends on the hardware and the inference engine.
+
+## Format overview
+The hardware latency table is stored either as a file or as a multi-line string.
+The first line of the table holds the version information. Each following line represents one operation and its latency.
+
+## Version
+
+The version information consists of three fields separated by ASCII (half-width) commas: hardware, inference engine, and timestamp.
+
+- **hardware:** Identifies the hardware environment, including the architecture type, version, and so on.
+
+- **inference engine:** Identifies the inference engine, including its name, version, optimization options, and so on.
+
+- **timestamp:** The time at which the table was created.
+
+## Operation
+
+The fields that describe an operation are separated by ASCII (half-width) commas. The operation description and its latency are separated by a tab.
+
+### conv2d
+
+**format**
+
+```text
+op_type,flag_bias,flag_relu,n_in,c_in,h_in,w_in,c_out,groups,kernel,padding,stride,dilation\tlatency
+```
+
+**description**
+
+- **op_type (str)** - The type of this op.
+- **flag_bias (int)** - Whether the op has a bias (0: no bias, 1: has bias).
+- **flag_relu (int)** - Whether the op has relu (0: no relu, 1: has relu).
+- **n_in (int)** - The batch size of the input.
+- **c_in (int)** - The number of input channels.
+- **h_in (int)** - The height of the input feature map.
+- **w_in (int)** - The width of the input feature map.
+- **c_out (int)** - The number of output channels.
+- **groups (int)** - The number of groups of conv2d.
+- **kernel (int)** - The kernel size.
+- **padding (int)** - The padding size.
+- **stride (int)** - The stride.
+- **dilation (int)** - The dilation.
+- **latency (float)** - The latency of this op.
+
+### activation
+
+**format**
+
+```text
+op_type,n_in,c_in,h_in,w_in\tlatency
+```
+
+**description**
+
+- **op_type (str)** - The type of this op.
+- **n_in (int)** - The batch size of the input.
+- **c_in (int)** - The number of input channels.
+- **h_in (int)** - The height of the input feature map.
+- **w_in (int)** - The width of the input feature map.
+- **latency (float)** - The latency of this op.
+
+### batch_norm
+
+**format**
+
+```text
+op_type,active_type,n_in,c_in,h_in,w_in\tlatency
+```
+
+**description**
+
+- **op_type (str)** - The type of this op.
+- **active_type (string|None)** - The type of the activation function, including relu, prelu, sigmoid, relu6, and tanh.
+- **n_in (int)** - The batch size of the input.
+- **c_in (int)** - The number of input channels.
+- **h_in (int)** - The height of the input feature map.
+- **w_in (int)** - The width of the input feature map.
+- **latency (float)** - The latency of this op.
+
+### eltwise
+
+**format**
+
+```text
+op_type,n_in,c_in,h_in,w_in\tlatency
+```
+
+**description**
+
+- **op_type (str)** - The type of this op.
+- **n_in (int)** - The batch size of the input.
+- **c_in (int)** - The number of input channels.
+- **h_in (int)** - The height of the input feature map.
+- **w_in (int)** - The width of the input feature map.
+- **latency (float)** - The latency of this op.
+
+### pooling
+
+**format**
+
+```text
+op_type,flag_global_pooling,n_in,c_in,h_in,w_in,kernel,padding,stride,ceil_mode,pool_type\tlatency
+```
+
+**description**
+
+- **op_type (str)** - The type of this op.
+- **flag_global_pooling (int)** - Whether the op is global pooling (0: not global pooling, 1: global pooling).
+- **n_in (int)** - The batch size of the input.
+- **c_in (int)** - The number of input channels.
+- **h_in (int)** - The height of the input feature map.
+- **w_in (int)** - The width of the input feature map.
+- **kernel (int)** - The kernel size.
+- **padding (int)** - The padding size.
+- **stride (int)** - The stride.
+- **ceil_mode (int)** - Whether to use the ceil function to compute the output height and width (0: use the floor function, 1: use the ceil function).
+- **pool_type (int)** - The type of pooling (1: max pooling, 2: average pooling including padding, 3: average pooling excluding padding).
+- **latency (float)** - The latency of this op.
+
+### softmax
+
+**format**
+
+```text
+op_type,axis,n_in,c_in,h_in,w_in\tlatency
+```
+
+**description**
+
+- **op_type (str)** - The type of this op.
+- **axis (int)** - The axis along which softmax is computed, in the range [-1, rank-1], where `rank` is the rank of the input.
+- **n_in (int)** - The batch size of the input.
+- **c_in (int)** - The number of input channels.
+- **h_in (int)** - The height of the input feature map.
+- **w_in (int)** - The width of the input feature map.
+- **latency (float)** - The latency of this op.
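The format described in the new `table_latency_en.md` can be read with a few lines of Python. The sketch below is not PaddleSlim's own loader. It assumes that the version line has exactly three comma-separated fields and that each operation line contains a single tab before the latency. The names `parse_latency_table` and `conv2d_fields`, the hardware and engine strings, and the latency values are all hypothetical.

```python
# Field order of a conv2d entry, copied from the format line in the document.
CONV2D_FIELDS = [
    "op_type", "flag_bias", "flag_relu", "n_in", "c_in", "h_in", "w_in",
    "c_out", "groups", "kernel", "padding", "stride", "dilation",
]


def parse_latency_table(text):
    """Split a latency table into version info and an {op_key: latency} dict."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # First line: hardware, inference engine, timestamp (assumed to be 3 fields).
    hardware, engine, timestamp = (p.strip() for p in lines[0].split(","))
    version = {
        "hardware": hardware,
        "inference_engine": engine,
        "timestamp": timestamp,
    }
    latencies = {}
    for ln in lines[1:]:
        # The op description and the latency are separated by a tab.
        op_key, latency = ln.rsplit("\t", 1)
        latencies[op_key] = float(latency)
    return version, latencies


def conv2d_fields(op_key):
    """Name the comma-separated fields of a conv2d entry."""
    named = dict(zip(CONV2D_FIELDS, op_key.split(",")))
    return {k: (v if k == "op_type" else int(v)) for k, v in named.items()}


# Made-up sample table; the values do not come from any real measurement.
SAMPLE = (
    "hypothetical_arm_cpu,hypothetical_engine_v1,20200101\n"
    "conv2d,1,1,1,3,224,224,32,1,3,1,2,1\t0.25\n"
    "softmax,-1,1,1000,1,1\t0.01\n"
)

version, latencies = parse_latency_table(SAMPLE)
```

Keeping the whole comma-separated description as the dictionary key makes a lookup a plain string match. `conv2d_fields` is only needed when individual parameters have to be inspected.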
diff --git a/_sources/index_en.rst.txt b/_sources/index_en.rst.txt
index ec250402d8706ab93505f97b7784ea2147a713ef..3268252919e743b3a2ff2aca9c54d7230ad15744 100644
--- a/_sources/index_en.rst.txt
+++ b/_sources/index_en.rst.txt
@@ -3,8 +3,8 @@
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
-Welcome to use PaddleSlim.
-========
+Index
+==============
.. toctree::
:maxdepth: 1
@@ -16,3 +16,5 @@ Welcome to use PaddleSlim.
tutorials/index_en
api_en/index_en
model_zoo_en.md
+
+.. mdinclude:: intro_en.md
diff --git a/_sources/install_en.md.txt b/_sources/install_en.md.txt
index 69d684a46e345e591cdb2b8e493684b8f549a32a..9d0eedba8ea361e4529eccdc538f0906a78c4000 100644
--- a/_sources/install_en.md.txt
+++ b/_sources/install_en.md.txt
@@ -1,23 +1,22 @@
# Install
-安装PaddleSlim前,请确认已正确安装Paddle1.6版本或更新版本。Paddle安装请参考:[Paddle安装教程](https://www.paddlepaddle.org.cn/install/quick)。
+Please make sure that PaddlePaddle 1.7 or later is installed before installing PaddleSlim. See [How to install PaddlePaddle](https://www.paddlepaddle.org.cn/install/quick).
-- 安装develop版本
-
+- Install with pip
```bash
-git clone https://github.com/PaddlePaddle/PaddleSlim.git
-cd PaddleSlim
-python setup.py install
+pip install paddleslim -i https://pypi.org/simple
```
-- 安装官方发布的最新版本
+- Install from source
```bash
-pip install paddleslim -i https://pypi.org/simple
+git clone https://github.com/PaddlePaddle/PaddleSlim.git
+cd PaddleSlim
+python setup.py install
```
-- 安装历史版本
+- Previous releases
-请点击[pypi.org](https://pypi.org/project/paddleslim/#history)查看可安装历史版本。
+Previous releases are available on [pypi.org](https://pypi.org/project/paddleslim/#history).
diff --git a/_sources/intro.md.txt b/_sources/intro.md.txt
index 572c169069a9b5c88ee68f2c3c8c1daa5499c810..2b2ec1b3febeb83e0126b0d4293d699d5be69c3c 100644
--- a/_sources/intro.md.txt
+++ b/_sources/intro.md.txt
@@ -1,27 +1,75 @@
+# 介绍
+PaddleSlim是一个模型压缩工具库,包含模型剪裁、定点量化、知识蒸馏、超参搜索和模型结构搜索等一系列模型压缩策略。
-# PaddleSlim简介
+对于业务用户,PaddleSlim提供完整的模型压缩解决方案,可用于图像分类、检测、分割等各种类型的视觉场景。
+同时也在持续探索NLP领域模型的压缩方案。另外,PaddleSlim提供且在不断完善各种压缩策略在经典开源任务的benchmark,
+以便业务用户参考。
+
+对于模型压缩算法研究者或开发者,PaddleSlim提供各种压缩策略的底层辅助接口,方便用户复现、调研和使用最新论文方法。
+PaddleSlim会从底层能力、技术咨询合作和业务场景等角度支持开发者进行模型压缩策略相关的创新工作。
-PaddleSlim是PaddlePaddle框架的一个子模块,主要用于压缩图像领域模型。在PaddleSlim中,不仅实现了目前主流的网络剪枝、量化、蒸馏三种压缩策略,还实现了超参数搜索和小模型网络结构搜索功能。在后续版本中,会添加更多的压缩策略,以及完善对NLP领域模型的支持。
## 功能
- 模型剪裁
- - 支持通道均匀模型剪裁(uniform pruning)
- - 基于敏感度的模型剪裁
- - 基于进化算法的自动模型剪裁三种方式
+ - 卷积通道均匀剪裁
+ - 基于敏感度的卷积通道剪裁
+ - 基于进化算法的自动剪裁
-- 量化训练
+- 定点量化
- 在线量化训练(training aware)
- 离线量化(post training)
- - 支持对权重全局量化和Channel-Wise量化
- 知识蒸馏
- 支持单进程知识蒸馏
- 支持多进程分布式知识蒸馏
- 神经网络结构自动搜索(NAS)
- - 支持One-Shot网络结构自动搜索(Ont-Shot-NAS)
- - 支持基于进化算法的轻量神经网络结构自动搜索(Light-NAS)
+ - 支持基于进化算法的轻量神经网络结构自动搜索
+ - 支持One-Shot网络结构自动搜索
- 支持 FLOPS / 硬件延时约束
- 支持多平台模型延时评估
+ - 支持用户自定义搜索算法和搜索空间
+
+
+## 部分压缩策略效果
+
+### 分类模型
+
+数据: ImageNet2012; 模型: MobileNetV1;
+
+|压缩策略 |精度收益(baseline: 70.91%) |模型大小(baseline: 17.0M)|
+|:---:|:---:|:---:|
+| 知识蒸馏(ResNet50)| **+1.06%** | |
+| 知识蒸馏(ResNet50) + int8量化训练 |**+1.10%**| **-71.76%**|
+| 剪裁(FLOPs-50%) + int8量化训练|**-1.71%**|**-86.47%**|
+
+
+### 图像检测模型
+
+#### 数据:Pascal VOC;模型:MobileNet-V1-YOLOv3
+
+| 压缩方法 | mAP(baseline: 76.2%) | 模型大小(baseline: 94MB) |
+| :---------------------: | :------------: | :------------:|
+| 知识蒸馏(ResNet34-YOLOv3) | **+2.8%** | |
+| 剪裁 FLOPs -52.88% | **+1.4%** | **-67.76%** |
+|知识蒸馏(ResNet34-YOLOv3)+剪裁(FLOPs-69.57%)| **+2.6%**|**-67.00%**|
+
+
+#### 数据:COCO;模型:MobileNet-V1-YOLOv3
+
+| 压缩方法 | mAP(baseline: 29.3%) | 模型大小|
+| :---------------------: | :------------: | :------:|
+| 知识蒸馏(ResNet34-YOLOv3) | **+2.1%** | |
+| 知识蒸馏(ResNet34-YOLOv3)+剪裁(FLOPs-67.56%) | **-0.3%** | **-66.90%**|
+
+### 搜索
+
+数据:ImageNet2012; 模型:MobileNetV2
+
+|硬件环境 | 推理耗时 | Top1准确率(baseline:71.90%) |
+|:---------------:|:---------:|:--------------------:|
+| RK3288 | **-23%** | +0.07% |
+| Android cellphone | **-20%** | +0.16% |
+| iPhone 6s | **-17%** | +0.32% |
diff --git a/_sources/intro_en.md.txt b/_sources/intro_en.md.txt
index b0b64cba44d67b33436c665520f7876d4be74e63..2373710675cb1788144f6e8e9ab27bbfe26ec3da 100644
--- a/_sources/intro_en.md.txt
+++ b/_sources/intro_en.md.txt
@@ -1,28 +1,84 @@
+# Introduction
+PaddleSlim is a toolkit for model compression. It contains a collection of compression strategies, such as pruning, fixed point quantization, knowledge distillation, hyperparameter searching and neural architecture search.
-# Introduction
+PaddleSlim provides compression solutions for computer vision models, covering tasks such as image classification, object detection, and semantic segmentation. Meanwhile, PaddleSlim keeps exploring advanced compression strategies for language models. Furthermore, benchmarks of compression strategies on several open tasks are available for your reference.
+
+PaddleSlim also provides auxiliary and primitive APIs for developers and researchers to survey, implement, and apply the methods from the latest papers. PaddleSlim supports developers with framework capabilities and technical consulting.
+
+## Features
+
+### Pruning
+
+ - Uniform pruning of convolution
+ - Sensitivity-based pruning
+ - Automated pruning based on an evolutionary search strategy
+ - Support pruning of various deep architectures such as VGG, ResNet, and MobileNet.
+ - Support self-defined range of pruning, i.e., layers to be pruned.
+
+### Fixed Point Quantization
+
+ - **Training aware**
+ - Dynamic strategy: During inference, we quantize models with hyperparameters dynamically estimated from small batches of samples.
+ - Static strategy: During inference, we quantize models with the same hyperparameters estimated from training data.
+ - Support layer-wise and channel-wise quantization.
+ - **Post training**
+
+### Knowledge Distillation
+
+ - **Naive knowledge distillation:** transfers dark knowledge by merging the teacher and student model into the same Program
+ - **Paddle large-scale scalable knowledge distillation framework Pantheon:** a universal solution for knowledge distillation, more flexible than naive knowledge distillation and easier to scale to large-scale applications.
+
+ - Decouple the teacher and student models --- they run in different processes in the same or different nodes, and transfer knowledge via TCP/IP ports or local files;
+ - Make it easy to assemble multiple teacher models, each of which can work in either online or offline mode independently;
+ - Merge knowledge from different teachers and make batch data for the student model automatically;
+ - Support the large-scale knowledge prediction of teacher models on multiple devices.
+
+### Neural Architecture Search
+
+ - Neural architecture search based on evolution strategy.
+ - Support distributed search.
+ - One-Shot neural architecture search.
+ - Support FLOPs and latency constrained search.
+ - Support the latency estimation on different hardware and platforms.
+
+## Performance
+
+### Image Classification
+
+Dataset: ImageNet2012; Model: MobileNetV1;
+
+|Method |Accuracy(baseline: 70.91%) |Model Size(baseline: 17.0M)|
+|:---:|:---:|:---:|
+| Knowledge Distillation(ResNet50)| **+1.06%** | |
+| Knowledge Distillation(ResNet50) + int8 quantization |**+1.10%**| **-71.76%**|
+| Pruning(FLOPs-50%) + int8 quantization|**-1.71%**|**-86.47%**|
+
+
+### Object Detection
+
+#### Dataset: Pascal VOC; Model: MobileNet-V1-YOLOv3
-As a submodule of PaddlePaddle framework, PaddleSlim is an open-source library for deep model compression and architecture search. PaddleSlim supports current popular deep compression techniques such as pruning, quantization, and knowledge distillation. Further, it also automates the search of hyperparameters and the design of lightweight deep architectures. In the future, we will develop more practically useful compression techniques for industrial-level applications and transfer these techniques to models in NLP.
+| Method | mAP(baseline: 76.2%) | Model Size(baseline: 94MB) |
+| :---------------------: | :------------: | :------------:|
+| Knowledge Distillation(ResNet34-YOLOv3) | **+2.8%** | |
+| Pruning(FLOPs -52.88%) | **+1.4%** | **-67.76%** |
+|Knowledge Distillation(ResNet34-YOLOv3)+Pruning(FLOPs-69.57%)| **+2.6%**|**-67.00%**|
-## Methods
+#### Dataset: COCO; Model: MobileNet-V1-YOLOv3
-- Pruning
- - Uniform pruning
- - Sensitivity-based pruning
- - Automated model pruning
+| Method | mAP(baseline: 29.3%) | Model Size|
+| :---------------------: | :------------: | :------:|
+| Knowledge Distillation(ResNet34-YOLOv3) | **+2.1%** |-|
+| Knowledge Distillation(ResNet34-YOLOv3)+Pruning(FLOPs-67.56%) | **-0.3%** | **-66.90%**|
-- Quantization
- - Training-aware quantization: Quantize models with hyperparameters dynamically estimated from small batches of samples.
- - Training-aware quantization: Quantize models with the same hyperparameters estimated from training data.
- - Support global quantization of weights and Channel-Wise quantization
+### NAS
-- Knowledge Distillation
- - Single-process knowledge distillation
- - Multi-process distributed knowledge distillation
+Dataset: ImageNet2012; Model: MobileNetV2
-- Network Architecture Search(NAS)
- - Simulated Annealing (SA)-based lightweight network architecture search method.(Light-NAS)
- - One-Shot network structure automatic search. (One-Shot-NAS)
- - PaddleSlim supports FLOPs and latency constrained search.
- - PaddleSlim supports the latency estimation on different hardware and platforms.
+|Device | Infer time cost | Top1 accuracy(baseline:71.90%) |
+|:---------------:|:---------:|:--------------------:|
+| RK3288 | **-23%** | +0.07% |
+| Android cellphone | **-20%** | +0.16% |
+| iPhone 6s | **-17%** | +0.32% |
diff --git a/_sources/model_zoo.md.txt b/_sources/model_zoo.md.txt
index c8126cb70999856ed6713be60911441116fdab3b..27e859ef6fbe91009ebf37bcd318a3e405ffc644 100644
--- a/_sources/model_zoo.md.txt
+++ b/_sources/model_zoo.md.txt
@@ -1,6 +1,6 @@
# 模型库
-## 1. 图象分类
+## 1. 图像分类
数据集:ImageNet1000类
@@ -16,7 +16,7 @@
| MobileNetV2 | quant_aware |72.05%/90.63% (-0.1%/-0.02%)| 4.0 | - | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/MobileNetV2_quant_aware.tar) |
|ResNet50|-|76.50%/93.00%| 99 | 2.71 | [下载链接](http://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_pretrained.tar) |
|ResNet50|quant_post|76.33%/93.02% (-0.17%/+0.02%)| 25.1| 1.19 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/ResNet50_quant_post.tar) |
-|ResNet50|quant_aware| 76.48%/93.11% (-0.02%/+0.11%)| 25.1 | 1.17 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/ResNet50_quant_awre.tar) |
+|ResNet50|quant_aware| 76.48%/93.11% (-0.02%/+0.11%)| 25.1 | 1.17 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/ResNet50_quant_awre.tar) |
分类模型Lite时延(ms)
@@ -57,20 +57,27 @@
### 1.2 剪裁
-| 模型 | 压缩方法 | Top-1/Top-5 Acc | 模型体积(MB) | GFLOPs | 下载 |
-|:--:|:---:|:--:|:--:|:--:|:--:|
-| MobileNetV1 | Baseline | 70.99%/89.68% | 17 | 1.11 | [下载链接](http://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV1_pretrained.tar) |
-| MobileNetV1 | uniform -50% | 69.4%/88.66% (-1.59%/-1.02%) | 9 | 0.56 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/MobileNetV1_uniform-50.tar) |
-| MobileNetV1 | sensitive -30% | 70.4%/89.3% (-0.59%/-0.38%) | 12 | 0.74 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/MobileNetV1_sensitive-30.tar) |
-| MobileNetV1 | sensitive -50% | 69.8% / 88.9% (-1.19%/-0.78%) | 9 | 0.56 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/MobileNetV1_sensitive-50.tar) |
-| MobileNetV2 | - | 72.15%/90.65% | 15 | 0.59 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV2_pretrained.tar) |
-| MobileNetV2 | uniform -50% | 65.79%/86.11% (-6.35%/-4.47%) | 11 | 0.296 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/MobileNetV2_uniform-50.tar) |
-| ResNet34 | - | 72.15%/90.65% | 84 | 7.36 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/ResNet34_pretrained.tar) |
-| ResNet34 | uniform -50% | 70.99%/89.95% (-1.36%/-0.87%) | 41 | 3.67 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/ResNet34_uniform-50.tar) |
-| ResNet34 | auto -55.05% | 70.24%/89.63% (-2.04%/-1.06%) | 33 | 3.31 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/ResNet34_auto-55.tar) |
+PaddleLite推理耗时说明:
+
+环境:Qualcomm SnapDragon 845 + armv8
+
+速度指标:Thread1/Thread2/Thread4耗时
+PaddleLite版本: v2.3
+| 模型 | 压缩方法 | Top-1/Top-5 Acc | 模型体积(MB) | GFLOPs |PaddleLite推理耗时|TensorRT推理速度(FPS)| 下载 |
+|:--:|:---:|:--:|:--:|:--:|:--:|:--:|:--:|
+| MobileNetV1 | Baseline | 70.99%/89.68% | 17 | 1.11 |66.052\35.8014\19.5762|-| [下载链接](http://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV1_pretrained.tar) |
+| MobileNetV1 | uniform -50% | 69.4%/88.66% (-1.59%/-1.02%) | 9 | 0.56 | 33.5636\18.6834\10.5076|-|[下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/MobileNetV1_uniform-50.tar) |
+| MobileNetV1 | sensitive -30% | 70.4%/89.3% (-0.59%/-0.38%) | 12 | 0.74 | 46.5958\25.3098\13.6982|-|[下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/MobileNetV1_sensitive-30.tar) |
+| MobileNetV1 | sensitive -50% | 69.8% / 88.9% (-1.19%/-0.78%) | 9 | 0.56 |37.9892\20.7882\11.3144|-| [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/MobileNetV1_sensitive-50.tar) |
+| MobileNetV2 | - | 72.15%/90.65% | 15 | 0.59 |41.7874\23.375\13.3998|-| [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV2_pretrained.tar) |
+| MobileNetV2 | uniform -50% | 65.79%/86.11% (-6.35%/-4.47%) | 11 | 0.296 |23.8842\13.8698\8.5572|-| [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/MobileNetV2_uniform-50.tar) |
+| ResNet34 | - | 72.15%/90.65% | 84 | 7.36 |217.808\139.943\96.7504|342.32| [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/ResNet34_pretrained.tar) |
+| ResNet34 | uniform -50% | 70.99%/89.95% (-1.36%/-0.87%) | 41 | 3.67 |114.787\75.0332\51.8438|452.41| [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/ResNet34_uniform-50.tar) |
+| ResNet34 | auto -55.05% | 70.24%/89.63% (-2.04%/-1.06%) | 33 | 3.31 |105.924\69.3222\48.0246|457.25| [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/ResNet34_auto-55.tar) |
+
### 1.3 蒸馏
@@ -85,10 +92,25 @@
|ResNet101|teacher|77.56%/93.64%| 173 | [下载链接](http://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_pretrained.tar) |
| ResNet50 | ResNet101 distill | 77.29%/93.65% (+0.79%/+0.65%) | 99 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/ResNet50_distilled.tar) |
-!!! note "Note"
+注意:带"_vd"后缀代表该预训练模型使用了Mixup,Mixup相关介绍参考[mixup: Beyond Empirical Risk Minimization](https://arxiv.org/abs/1710.09412)
- [1] :带_vd后缀代表该预训练模型使用了Mixup,Mixup相关介绍参考[mixup: Beyond Empirical Risk Minimization](https://arxiv.org/abs/1710.09412)
+### 1.4 搜索
+数据集: ImageNet1000
+
+| 模型 | 压缩方法 | Top-1/Top-5 Acc | 模型体积(MB) | GFLOPs | 下载 |
+|:--:|:---:|:--:|:--:|:--:|:--:|
+| MobileNetV2 | - | 72.15%/90.65% | 15 | 0.59 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV2_pretrained.tar) |
+| MobileNetV2 | SANAS | 71.518%/90.208% (-0.632%/-0.442%) | 14 | 0.295 | [下载链接](https://paddlemodels.cdn.bcebos.com/PaddleSlim/MobileNetV2_sanas.tar) |
+
+数据集: Cifar10
+
+| 模型 |压缩方法 | Acc | 模型参数(MB) | 下载 |
+|:---:|:--:|:--:|:--:|:--:|
+| Darts | - | 97.135% | 3.767 | - |
+| Darts_SA(基于Darts搜索空间) | SANAS | 97.276%(+0.141%) | 3.344(-11.2%) | - |
+
+Note: MobileNetV2_NAS 的token是:[4, 4, 5, 1, 1, 2, 1, 1, 0, 2, 6, 2, 0, 3, 4, 5, 0, 4, 5, 5, 1, 4, 8, 0, 0]. Darts_SA的token是:[5, 5, 0, 5, 5, 10, 7, 7, 5, 7, 7, 11, 10, 12, 10, 0, 5, 3, 10, 8].
## 2. 目标检测
@@ -99,8 +121,8 @@
| 模型 | 压缩方法 | 数据集 | Image/GPU | 输入608 Box AP | 输入416 Box AP | 输入320 Box AP | 模型体积(MB) | TensorRT时延(V100, ms) | 下载 |
| :----------------------------: | :---------: | :----: | :-------: | :------------: | :------------: | :------------: | :------------: | :----------: |:----------: |
| MobileNet-V1-YOLOv3 | - | COCO | 8 | 29.3 | 29.3 | 27.1 | 95 | - | [下载链接](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1.tar) |
-| MobileNet-V1-YOLOv3 | quant_post | COCO | 8 | 27.9 (-1.4)| 28.0 (-1.3) | 26.0 (-1.0) | 25 | - | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_mobilenetv1_coco_quant_post.tar) |
-| MobileNet-V1-YOLOv3 | quant_aware | COCO | 8 | 28.1 (-1.2)| 28.2 (-1.1) | 25.8 (-1.2) | 26.3 | - | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_mobilenet_coco_quant_aware.tar) |
+| MobileNet-V1-YOLOv3 | quant_post | COCO | 8 | 27.9 (-1.4)| 28.0 (-1.3) | 26.0 (-1.0) | 25 | - | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_mobilenetv1_coco_quant_post.tar) |
+| MobileNet-V1-YOLOv3 | quant_aware | COCO | 8 | 28.1 (-1.2)| 28.2 (-1.1) | 25.8 (-1.2) | 26.3 | - | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_mobilenet_coco_quant_aware.tar) |
| R34-YOLOv3 | - | COCO | 8 | 36.2 | 34.3 | 31.4 | 162 | - | [下载链接](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_r34.tar) |
| R34-YOLOv3 | quant_post | COCO | 8 | 35.7 (-0.5) | - | - | 42.7 | - | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_r34_coco_quant_post.tar) |
| R34-YOLOv3 | quant_aware | COCO | 8 | 35.2 (-1.0) | 33.3 (-1.0) | 30.3 (-1.1)| 44 | - | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_r34_coco_quant_aware.tar) |
@@ -127,20 +149,29 @@
### 2.2 剪裁
+
数据集:Pasacl VOC & COCO 2017
-| 模型 | 压缩方法 | 数据集 | Image/GPU | 输入608 Box AP | 输入416 Box AP | 输入320 Box AP | 模型体积(MB) | GFLOPs (608*608) | 下载 |
-| :----------------------------: | :---------------: | :--------: | :-------: | :------------: | :------------: | :------------: | :----------: | :--------------: | :----------------------------------------------------------: |
-| MobileNet-V1-YOLOv3 | Baseline | Pascal VOC | 8 | 76.2 | 76.7 | 75.3 | 94 | 40.49 | [下载链接](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1_voc.tar) |
-| MobileNet-V1-YOLOv3 | sensitive -52.88% | Pascal VOC | 8 | 77.6 (+1.4) | 77.7 (1.0) | 75.5 (+0.2) | 31 | 19.08 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_mobilenet_v1_voc_prune.tar) |
-| MobileNet-V1-YOLOv3 | - | COCO | 8 | 29.3 | 29.3 | 27.0 | 95 | 41.35 | [下载链接](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1.tar) |
-| MobileNet-V1-YOLOv3 | sensitive -51.77% | COCO | 8 | 26.0 (-3.3) | 25.1 (-4.2) | 22.6 (-4.4) | 32 | 19.94 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_mobilenet_v1_prune.tar) |
-| R50-dcn-YOLOv3 | - | COCO | 8 | 39.1 | - | - | 177 | 89.60 | [下载链接](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_r50vd_dcn.tar) |
-| R50-dcn-YOLOv3 | sensitive -9.37% | COCO | 8 | 39.3 (+0.2) | - | - | 150 | 81.20 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_r50vd_dcn_prune.tar) |
-| R50-dcn-YOLOv3 | sensitive -24.68% | COCO | 8 | 37.3 (-1.8) | - | - | 113 | 67.48 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_r50vd_dcn_prune578.tar) |
-| R50-dcn-YOLOv3 obj365_pretrain | - | COCO | 8 | 41.4 | - | - | 177 | 89.60 | [下载链接](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_r50vd_dcn_obj365_pretrained_coco.tar) |
-| R50-dcn-YOLOv3 obj365_pretrain | sensitive -9.37% | COCO | 8 | 40.5 (-0.9) | - | - | 150 | 81.20 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_r50vd_dcn_obj365_pretrained_coco_prune.tar) |
-| R50-dcn-YOLOv3 obj365_pretrain | sensitive -24.68% | COCO | 8 | 37.8 (-3.3) | - | - | 113 | 67.48 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_r50vd_dcn_obj365_pretrained_coco_prune578.tar) |
+PaddleLite推理耗时说明:
+
+环境:Qualcomm SnapDragon 845 + armv8
+
+速度指标:Thread1/Thread2/Thread4耗时
+
+PaddleLite版本: v2.3
+
+| 模型 | 压缩方法 | 数据集 | Image/GPU | 输入608 Box AP | 输入416 Box AP | 输入320 Box AP | 模型体积(MB) | GFLOPs (608*608) | PaddleLite推理耗时(ms)(608*608) | TensorRT推理速度(FPS)(608*608) | 下载 |
+| :----------------------------: | :---------------: | :--------: | :-------: | :------------: | :------------: | :------------: | :----------: | :--------------: | :--------------: | :--------------: | :-----------------------------------: |
+| MobileNet-V1-YOLOv3 | Baseline | Pascal VOC | 8 | 76.2 | 76.7 | 75.3 | 94 | 40.49 | 1238\796.943\520.101|60.04| [下载链接](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1_voc.tar) |
+| MobileNet-V1-YOLOv3 | sensitive -52.88% | Pascal VOC | 8 | 77.6 (+1.4) | 77.7 (1.0) | 75.5 (+0.2) | 31 | 19.08 | 602.497\353.759\222.427 |99.36| [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_mobilenet_v1_voc_prune.tar) |
+| MobileNet-V1-YOLOv3 | - | COCO | 8 | 29.3 | 29.3 | 27.0 | 95 | 41.35 |-|-| [下载链接](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1.tar) |
+| MobileNet-V1-YOLOv3 | sensitive -51.77% | COCO | 8 | 26.0 (-3.3) | 25.1 (-4.2) | 22.6 (-4.4) | 32 | 19.94 |-|73.93| [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_mobilenet_v1_prune.tar) |
+| R50-dcn-YOLOv3 | - | COCO | 8 | 39.1 | - | - | 177 | 89.60 |-|27.68| [下载链接](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_r50vd_dcn.tar) |
+| R50-dcn-YOLOv3 | sensitive -9.37% | COCO | 8 | 39.3 (+0.2) | - | - | 150 | 81.20 |-|30.08| [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_r50vd_dcn_prune.tar) |
+| R50-dcn-YOLOv3 | sensitive -24.68% | COCO | 8 | 37.3 (-1.8) | - | - | 113 | 67.48 |-|34.32| [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_r50vd_dcn_prune578.tar) |
+| R50-dcn-YOLOv3 obj365_pretrain | - | COCO | 8 | 41.4 | - | - | 177 | 89.60 |-|-| [下载链接](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_r50vd_dcn_obj365_pretrained_coco.tar) |
+| R50-dcn-YOLOv3 obj365_pretrain | sensitive -9.37% | COCO | 8 | 40.5 (-0.9) | - | - | 150 | 81.20 |-|-| [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_r50vd_dcn_obj365_pretrained_coco_prune.tar) |
+| R50-dcn-YOLOv3 obj365_pretrain | sensitive -24.68% | COCO | 8 | 37.8 (-3.3) | - | - | 113 | 67.48 |-|-| [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_r50vd_dcn_obj365_pretrained_coco_prune578.tar) |
### 2.3 蒸馏
@@ -157,6 +188,18 @@
| MobileNet-V1-YOLOv3 | ResNet34-YOLOv3 distill | COCO | 8 | 31.4 (+2.1) | 30.0 (+0.7) | 27.1 (+0.1) | 95 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_mobilenetv1_coco_distilled.tar) |
+### 2.4 搜索
+
+数据集:WIDER-FACE
+
+| 模型 | 压缩方法 | Image/GPU | 输入尺寸 | Easy/Medium/Hard | 模型体积(KB) | 硬件延时(ms)| 下载 |
+| :------------: | :---------: | :-------: | :------: | :-----------------------------: | :------------: | :------------: | :----------------------------------------------------------: |
+| BlazeFace | - | 8 | 640 | 91.5/89.2/79.7 | 815 | 71.862 | [下载链接](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_original.tar) |
+| BlazeFace-NAS | - | 8 | 640 | 83.7/80.7/65.8 | 244 | 21.117 |[下载链接](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_nas.tar) |
+| BlazeFace-NASV2 | SANAS | 8 | 640 | 87.0/83.7/68.5 | 389 | 22.558 | [下载链接](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_nas2.tar) |
+
+Note: 硬件延时时间是利用提供的硬件延时表得到的,硬件延时表是在855芯片上基于PaddleLite测试的结果。BlazeFace-NASV2的详细配置在[这里](https://github.com/PaddlePaddle/PaddleDetection/blob/master/configs/face_detection/blazeface_nas_v2.yml).
+
## 3. 图像分割
数据集:Cityscapes
@@ -201,8 +244,16 @@
### 3.2 剪裁
-| 模型 | 压缩方法 | mIoU | 模型体积(MB) | GFLOPs | 下载 |
-| :-------: | :---------------: | :-----------: | :------------: | :----: | :----------------------------------------------------------: |
-| fast-scnn | baseline | 69.64 | 11 | 14.41 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/fast_scnn_cityscape.tar) |
-| fast-scnn | uniform -17.07% | 69.58 (-0.06) | 8.5 | 11.95 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/fast_scnn_cityscape_uniform-17.tar) |
-| fast-scnn | sensitive -47.60% | 66.68 (-2.96) | 5.7 | 7.55 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/fast_scnn_cityscape_sensitive-47.tar) |
+Notes on PaddleLite inference time:
+
+Environment: Qualcomm SnapDragon 845 + armv8
+
+Speed metric: time cost with Thread1/Thread2/Thread4
+
+PaddleLite version: v2.3
+
+| Model | Method | mIoU | Model Size(MB) | GFLOPs | PaddleLite inference time | TensorRT inference speed(FPS) | Download |
+| :-------: | :---------------: | :-----------: | :------------: | :----: | :------------: | :----: | :--------------------------------------: |
+| fast-scnn | baseline | 69.64 | 11 | 14.41 | 1226.36\682.96\415.664 |39.53| [download link](https://paddlemodels.bj.bcebos.com/PaddleSlim/fast_scnn_cityscape.tar) |
+| fast-scnn | uniform -17.07% | 69.58 (-0.06) | 8.5 | 11.95 | 1140.37\656.612\415.888 |42.01| [download link](https://paddlemodels.bj.bcebos.com/PaddleSlim/fast_scnn_cityscape_uniform-17.tar) |
+| fast-scnn | sensitive -47.60% | 66.68 (-2.96) | 5.7 | 7.55 | 866.693\494.467\291.748 |51.48| [download link](https://paddlemodels.bj.bcebos.com/PaddleSlim/fast_scnn_cityscape_sensitive-47.tar) |
diff --git a/_sources/model_zoo_en.md.txt b/_sources/model_zoo_en.md.txt
index 52d7806b9e291248fe2f16e92f61cecfc5e12018..ba0bc909e1c1c0aa99365bf29e95c6e087cbb2cf 100644
--- a/_sources/model_zoo_en.md.txt
+++ b/_sources/model_zoo_en.md.txt
@@ -1,145 +1,250 @@
# Model Zoo
-## 1. 图象分类
+## 1. Image Classification
-数据集:ImageNet1000类
+Dataset: ImageNet1000
-### 1.1 量化
+### 1.1 Quantization
-| 模型 | 压缩方法 | Top-1/Top-5 Acc | 模型体积(MB) | 下载 |
+| Model | Method | Top-1/Top-5 Acc | Model Size(MB) | TensorRT latency(V100, ms) | Download |
+|:--:|:---:|:--:|:--:|:--:|:--:|
+|MobileNetV1|-|70.99%/89.68%| 17 | -| [model](http://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV1_pretrained.tar) |
+|MobileNetV1|quant_post|70.18%/89.25% (-0.81%/-0.43%)| 4.4 | - | [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/MobileNetV1_quant_post.tar) |
+|MobileNetV1|quant_aware|70.60%/89.57% (-0.39%/-0.11%)| 4.4 | -| [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/MobileNetV1_quant_aware.tar) |
+| MobileNetV2 | - |72.15%/90.65%| 15 | - | [model](https://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV2_pretrained.tar) |
+| MobileNetV2 | quant_post | 71.15%/90.11% (-1%/-0.54%)| 4.0 | - | [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/MobileNetV2_quant_post.tar) |
+| MobileNetV2 | quant_aware |72.05%/90.63% (-0.1%/-0.02%)| 4.0 | - | [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/MobileNetV2_quant_aware.tar) |
+|ResNet50|-|76.50%/93.00%| 99 | 2.71 | [model](http://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_pretrained.tar) |
+|ResNet50|quant_post|76.33%/93.02% (-0.17%/+0.02%)| 25.1| 1.19 | [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/ResNet50_quant_post.tar) |
+|ResNet50|quant_aware| 76.48%/93.11% (-0.02%/+0.11%)| 25.1 | 1.17 | [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/ResNet50_quant_awre.tar) |
+
+PaddleLite latency(ms)
+
+| Device | Model | Method | armv7 Thread 1 | armv7 Thread 2 | armv7 Thread 4 | armv8 Thread 1 | armv8 Thread 2 | armv8 Thread 4 |
+| ------- | ----------- | ------------- | -------------- | -------------- | -------------- | -------------- | -------------- | -------------- |
+| Qualcomm 835 | MobileNetV1 | FP32 baseline | 96.1942 | 53.2058 | 32.4468 | 88.4955 | 47.95 | 27.5189 |
+| Qualcomm 835 | MobileNetV1 | quant_aware | 60.8186 | 32.1931 | 16.4275 | 56.4311 | 29.5446 | 15.1053 |
+| Qualcomm 835 | MobileNetV1 | quant_post | 60.5615 | 32.4016 | 16.6596 | 56.5266 | 29.7178 | 15.1459 |
+| Qualcomm 835 | MobileNetV2 | FP32 baseline | 65.715 | 38.1346 | 25.155 | 61.3593 | 36.2038 | 22.849 |
+| Qualcomm 835 | MobileNetV2 | quant_aware | 48.3655 | 30.2021 | 21.9303 | 46.1487 | 27.3146 | 18.3053 |
+| Qualcomm 835 | MobileNetV2 | quant_post | 48.3495 | 30.3069 | 22.1506 | 45.8715 | 27.4105 | 18.2223 |
+| Qualcomm 835 | ResNet50 | FP32 baseline | 526.811 | 319.6486 | 205.8345 | 506.1138 | 335.1584 | 214.8936 |
+| Qualcomm 835 | ResNet50 | quant_aware | 475.4538 | 256.8672 | 139.699 | 461.7344 | 247.9506 | 145.9847 |
+| Qualcomm 835 | ResNet50 | quant_post | 476.0507 | 256.5963 | 139.7266 | 461.9176 | 248.3795 | 149.353 |
+| Qualcomm 855 | MobileNetV1 | FP32 baseline | 33.5086 | 19.5773 | 11.7534 | 31.3474 | 18.5382 | 10.0811 |
+| Qualcomm 855 | MobileNetV1 | quant_aware | 36.7067 | 21.628 | 11.0372 | 14.0238 | 8.199 | 4.2588 |
+| Qualcomm 855 | MobileNetV1 | quant_post | 37.0498 | 21.7081 | 11.0779 | 14.0947 | 8.1926 | 4.2934 |
+| Qualcomm 855 | MobileNetV2 | FP32 baseline | 25.0396 | 15.2862 | 9.6609 | 22.909 | 14.1797 | 8.8325 |
+| Qualcomm 855 | MobileNetV2 | quant_aware | 28.1583 | 18.3317 | 11.8103 | 16.9158 | 11.1606 | 7.4148 |
+| Qualcomm 855 | MobileNetV2 | quant_post | 28.1631 | 18.3917 | 11.8333 | 16.9399 | 11.1772 | 7.4176 |
+| Qualcomm 855 | ResNet50 | FP32 baseline | 185.3705 | 113.0825 | 87.0741 | 177.7367 | 110.0433 | 74.4114 |
+| Qualcomm 855 | ResNet50 | quant_aware | 327.6883 | 202.4536 | 106.243 | 243.5621 | 150.0542 | 78.4205 |
+| Qualcomm 855 | ResNet50 | quant_post | 328.2683 | 201.9937 | 106.744 | 242.6397 | 150.0338 | 79.8659 |
+| Kirin 970 | MobileNetV1 | FP32 baseline | 101.2455 | 56.4053 | 35.6484 | 94.8985 | 51.7251 | 31.9511 |
+| Kirin 970 | MobileNetV1 | quant_aware | 62.5012 | 32.1863 | 16.6018 | 57.7477 | 29.2116 | 15.0703 |
+| Kirin 970 | MobileNetV1 | quant_post | 62.4412 | 32.2585 | 16.6215 | 57.825 | 29.2573 | 15.1206 |
+| Kirin 970 | MobileNetV2 | FP32 baseline | 70.4176 | 42.0795 | 25.1939 | 68.9597 | 39.2145 | 22.6617 |
+| Kirin 970 | MobileNetV2 | quant_aware | 52.9961 | 31.5323 | 22.1447 | 49.4858 | 28.0856 | 18.7287 |
+| Kirin 970 | MobileNetV2 | quant_post | 53.0961 | 31.7987 | 21.8334 | 49.383 | 28.2358 | 18.3642 |
+| Kirin 970 | ResNet50 | FP32 baseline | 586.8943 | 344.0858 | 228.2293 | 573.3344 | 351.4332 | 225.8006 |
+| Kirin 970 | ResNet50 | quant_aware | 488.361 | 260.1697 | 142.416 | 479.5668 | 249.8485 | 138.1742 |
+| Kirin 970 | ResNet50 | quant_post | 489.6188 | 258.3279 | 142.6063 | 480.0064 | 249.5339 | 138.5284 |
+
+### 1.2 Pruning
+
+PaddleLite:
+
+env: Qualcomm SnapDragon 845 + armv8
+
+criterion: time cost in Thread1/Thread2/Thread4
+
+PaddleLite version: v2.3
+
+
+|Model | Method | Top-1/Top-5 Acc | ModelSize(MB) | GFLOPs |PaddleLite cost(ms)|TensorRT speed(FPS)| download |
+|:--:|:---:|:--:|:--:|:--:|:--:|:--:|:--:|
+| MobileNetV1 | Baseline | 70.99%/89.68% | 17 | 1.11 |66.052\35.8014\19.5762|-| [download](http://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV1_pretrained.tar) |
+| MobileNetV1 | uniform -50% | 69.4%/88.66% (-1.59%/-1.02%) | 9 | 0.56 | 33.5636\18.6834\10.5076|-|[download](https://paddlemodels.bj.bcebos.com/PaddleSlim/MobileNetV1_uniform-50.tar) |
+| MobileNetV1 | sensitive -30% | 70.4%/89.3% (-0.59%/-0.38%) | 12 | 0.74 | 46.5958\25.3098\13.6982|-|[download](https://paddlemodels.bj.bcebos.com/PaddleSlim/MobileNetV1_sensitive-30.tar) |
+| MobileNetV1 | sensitive -50% | 69.8% / 88.9% (-1.19%/-0.78%) | 9 | 0.56 |37.9892\20.7882\11.3144|-| [download](https://paddlemodels.bj.bcebos.com/PaddleSlim/MobileNetV1_sensitive-50.tar) |
+| MobileNetV2 | - | 72.15%/90.65% | 15 | 0.59 |41.7874\23.375\13.3998|-| [download](https://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV2_pretrained.tar) |
+| MobileNetV2 | uniform -50% | 65.79%/86.11% (-6.35%/-4.47%) | 11 | 0.296 |23.8842\13.8698\8.5572|-| [download](https://paddlemodels.bj.bcebos.com/PaddleSlim/MobileNetV2_uniform-50.tar) |
+| ResNet34 | - | 72.15%/90.65% | 84 | 7.36 |217.808\139.943\96.7504|342.32| [download](https://paddle-imagenet-models-name.bj.bcebos.com/ResNet34_pretrained.tar) |
+| ResNet34 | uniform -50% | 70.99%/89.95% (-1.36%/-0.87%) | 41 | 3.67 |114.787\75.0332\51.8438|452.41| [download](https://paddlemodels.bj.bcebos.com/PaddleSlim/ResNet34_uniform-50.tar) |
+| ResNet34 | auto -55.05% | 70.24%/89.63% (-2.04%/-1.06%) | 33 | 3.31 |105.924\69.3222\48.0246|457.25| [download](https://paddlemodels.bj.bcebos.com/PaddleSlim/ResNet34_auto-55.tar) |
+
+### 1.3 Distillation
+
+| Model | Method | Top-1/Top-5 Acc | Model Size(MB) | Download |
|:--:|:---:|:--:|:--:|:--:|
-|MobileNetV1|-|70.99%/89.68%| xx | [下载链接]() |
-|MobileNetV1|quant_post|xx%/xx%| xx | [下载链接]() |
-|MobileNetV1|quant_aware|xx%/xx%| xx | [下载链接]() |
-| MobileNetV2 | - |72.15%/90.65%| xx | [下载链接]() |
-| MobileNetV2 | quant_post |xx%/xx%| xx | [下载链接]() |
-| MobileNetV2 | quant_aware |xx%/xx%| xx | [下载链接]() |
-|ResNet50|-|76.50%/93.00%| xx | [下载链接]() |
-|ResNet50|quant_post|xx%/xx%| xx | [下载链接]() |
-|ResNet50|quant_aware|xx%/xx%| xx | [下载链接]() |
-
+| MobileNetV1 | student | 70.99%/89.68% | 17 | [model](http://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV1_pretrained.tar) |
+|ResNet50_vd|teacher|79.12%/94.44%| 99 | [model](https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_vd_pretrained.tar) |
+|MobileNetV1|ResNet50_vd[1](#trans1) distill|72.77%/90.68% (+1.78%/+1.00%)| 17 | [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/MobileNetV1_distilled.tar) |
+| MobileNetV2 | student | 72.15%/90.65% | 15 | [model](https://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV2_pretrained.tar) |
+| MobileNetV2 | ResNet50_vd distill | 74.28%/91.53% (+2.13%/+0.88%) | 15 | [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/MobileNetV2_distilled.tar) |
+| ResNet50 | student | 76.50%/93.00% | 99 | [model](http://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_pretrained.tar) |
+|ResNet101|teacher|77.56%/93.64%| 173 | [model](http://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_pretrained.tar) |
+| ResNet50 | ResNet101 distill | 77.29%/93.65% (+0.79%/+0.65%) | 99 | [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/ResNet50_distilled.tar) |
+Note: The `_vd` suffix indicates that the pre-trained model uses Mixup. Please refer to the detailed introduction: [mixup: Beyond Empirical Risk Minimization](https://arxiv.org/abs/1710.09412)
-### 1.2 剪裁
+### 1.4 NAS
-| 模型 | 压缩方法 | Top-1/Top-5 Acc | 模型体积(MB) | GFLOPs | 下载 |
+| Model | Method | Top-1/Top-5 Acc | Volume(MB) | GFLOPs | Download |
|:--:|:---:|:--:|:--:|:--:|:--:|
-| MobileNetV1 | Baseline | 70.99%/89.68% | 17 | 1.11 | [下载链接](http://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV1_pretrained.tar) |
-| MobileNetV1 | uniform -50% | 69.4%/88.66% (-1.59%/-1.02%) | 9 | 0.56 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/MobileNetV1_uniform-50.tar) |
-| MobileNetV1 | sensitive -30% | 70.4%/89.3% (-0.59%/-0.38%) | 12 | 0.74 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/MobileNetV1_sensitive-30.tar) |
-| MobileNetV1 | sensitive -50% | 69.8% / 88.9% (-1.19%/-0.78%) | 9 | 0.56 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/MobileNetV1_sensitive-50.tar) |
-| MobileNetV2 | - | 72.15%/90.65% | 15 | 0.59 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV2_pretrained.tar) |
-| MobileNetV2 | uniform -50% | 65.79%/86.11% (-6.35%/-4.47%) | 11 | 0.296 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/MobileNetV2_uniform-50.tar) |
-| ResNet34 | - | 72.15%/90.65% | 84 | 7.36 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/ResNet34_pretrained.tar) |
-| ResNet34 | uniform -50% | 70.99%/89.95% (-1.36%/-0.87%) | 41 | 3.67 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/ResNet34_uniform-50.tar) |
-| ResNet34 | auto -55.05% | 70.24%/89.63% (-2.04%/-1.06%) | 33 | 3.31 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/ResNet34_auto-55.tar) |
+| MobileNetV2 | - | 72.15%/90.65% | 15 | 0.59 | [model](https://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV2_pretrained.tar) |
+| MobileNetV2_NAS | SANAS | 71.518%/90.208% (-0.632%/-0.442%) | 14 | 0.295 | [model](https://paddlemodels.cdn.bcebos.com/PaddleSlim/MobileNetV2_sanas.tar) |
+Dataset: CIFAR-10
+| Model | Method | Acc | Params(MB) | Download |
+|:---:|:--:|:--:|:--:|:--:|
+| Darts | - | 97.135% | 3.767 | - |
+| Darts_SA(Based on Darts) | SANAS | 97.276%(+0.141%) | 3.344(-11.2%) | - |
+Note: The token of MobileNetV2_NAS is [4, 4, 5, 1, 1, 2, 1, 1, 0, 2, 6, 2, 0, 3, 4, 5, 0, 4, 5, 5, 1, 4, 8, 0, 0]. The token of Darts_SA is [5, 5, 0, 5, 5, 10, 7, 7, 5, 7, 7, 11, 10, 12, 10, 0, 5, 3, 10, 8].
-### 1.3 蒸馏
+## 2. Object Detection
-| 模型 | 压缩方法 | Top-1/Top-5 Acc | 模型体积(MB) | 下载 |
-|:--:|:---:|:--:|:--:|:--:|
-| MobileNetV1 | student | 70.99%/89.68% | 17 | [下载链接](http://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV1_pretrained.tar) |
-|ResNet50_vd|teacher|79.12%/94.44%| 99 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_vd_pretrained.tar) |
-|MobileNetV1|ResNet50_vd[1](#trans1) distill|72.77%/90.68% (+1.78%/+1.00%)| 17 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/MobileNetV1_distilled.tar) |
-| MobileNetV2 | student | 72.15%/90.65% | 15 | [下载链接](https://paddle-imagenet-models-name.bj.bcebos.com/MobileNetV2_pretrained.tar) |
-| MobileNetV2 | ResNet50_vd distill | 74.28%/91.53% (+2.13%/+0.88%) | 15 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/MobileNetV2_distilled.tar) |
-| ResNet50 | student | 76.50%/93.00% | 99 | [下载链接](http://paddle-imagenet-models-name.bj.bcebos.com/ResNet50_pretrained.tar) |
-|ResNet101|teacher|77.56%/93.64%| 173 | [下载链接](http://paddle-imagenet-models-name.bj.bcebos.com/ResNet101_pretrained.tar) |
-| ResNet50 | ResNet101 distill | 77.29%/93.65% (+0.79%/+0.65%) | 99 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/ResNet50_distilled.tar) |
+### 2.1 Quantization
+
+Dataset: COCO 2017
+
+| Model | Method | Dataset | Image/GPU | Input 608 Box AP | Input 416 Box AP | Input 320 Box AP | Model Size(MB) | TensorRT latency(V100, ms) | Download |
+| :----------------------------: | :---------: | :----: | :-------: | :------------: | :------------: | :------------: | :------------: | :----------: |:----------: |
+| MobileNet-V1-YOLOv3 | - | COCO | 8 | 29.3 | 29.3 | 27.1 | 95 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1.tar) |
+| MobileNet-V1-YOLOv3 | quant_post | COCO | 8 | 27.9 (-1.4)| 28.0 (-1.3) | 26.0 (-1.0) | 25 | - | [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_mobilenetv1_coco_quant_post.tar) |
+| MobileNet-V1-YOLOv3 | quant_aware | COCO | 8 | 28.1 (-1.2)| 28.2 (-1.1) | 25.8 (-1.2) | 26.3 | - | [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_mobilenet_coco_quant_aware.tar) |
+| R34-YOLOv3 | - | COCO | 8 | 36.2 | 34.3 | 31.4 | 162 | - | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_r34.tar) |
+| R34-YOLOv3 | quant_post | COCO | 8 | 35.7 (-0.5) | - | - | 42.7 | - | [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_r34_coco_quant_post.tar) |
+| R34-YOLOv3 | quant_aware | COCO | 8 | 35.2 (-1.0) | 33.3 (-1.0) | 30.3 (-1.1)| 44 | - | [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_r34_coco_quant_aware.tar) |
+| R50-dcn-YOLOv3 obj365_pretrain | - | COCO | 8 | 41.4 | - | - | 177 | 18.56 |[model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_r50vd_dcn_obj365_pretrained_coco.tar) |
+| R50-dcn-YOLOv3 obj365_pretrain | quant_aware | COCO | 8 | 40.6 (-0.8) | 37.5 | 34.1 | 66 | 14.64 | [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_r50vd_dcn_obj365_pretrained_coco_quant_aware.tar) |
+
+
+
+Dataset: WIDER-FACE
+
+
+
+| Model | Method | Image/GPU | Input Size | Easy/Medium/Hard | Model Size(MB) | Download |
+| :------------: | :---------: | :-------: | :--------: | :-----------------------------: | :--------------: | :----------------------------------------------------------: |
+| BlazeFace | - | 8 | 640 | 91.5/89.2/79.7 | 815 | [model](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_original.tar) |
+| BlazeFace | quant_post | 8 | 640 | 87.8/85.1/74.9 (-3.7/-4.1/-4.8) | 228 | [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/blazeface_origin_quant_post.tar) |
+| BlazeFace | quant_aware | 8 | 640 | 90.5/87.9/77.6 (-1.0/-1.3/-2.1) | 228 | [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/blazeface_origin_quant_aware.tar) |
+| BlazeFace-Lite | - | 8 | 640 | 90.9/88.5/78.1 | 711 | [model](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_lite.tar) |
+| BlazeFace-Lite | quant_post | 8 | 640 | 89.4/86.7/75.7 (-1.5/-1.8/-2.4) | 211 | [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/blazeface_lite_quant_post.tar) |
+| BlazeFace-Lite | quant_aware | 8 | 640 | 89.7/87.3/77.0 (-1.2/-1.2/-1.1) | 211 | [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/blazeface_lite_quant_aware.tar) |
+| BlazeFace-NAS | - | 8 | 640 | 83.7/80.7/65.8 | 244 | [model](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_nas.tar) |
+| BlazeFace-NAS | quant_post | 8 | 640 | 81.6/78.3/63.6 (-2.1/-2.4/-2.2) | 71 | [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/blazeface_nas_quant_post.tar) |
+| BlazeFace-NAS | quant_aware | 8 | 640 | 83.1/79.7/64.2 (-0.6/-1.0/-1.6) | 71 | [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/blazeface_nas_quant_aware.tar) |
+
+### 2.2 Pruning
+
+Dataset: Pascal VOC & COCO 2017
+
+PaddleLite:
+
+env: Qualcomm SnapDragon 845 + armv8
-!!! note "Note"
+criterion: time cost in Thread1/Thread2/Thread4
- [1] :带_vd后缀代表该预训练模型使用了Mixup,Mixup相关介绍参考[mixup: Beyond Empirical Risk Minimization](https://arxiv.org/abs/1710.09412)
+PaddleLite version: v2.3
+| Model | Method | Dataset | Image/GPU | Input 608 Box AP | Input 416 Box AP | Input 320 Box AP | Model Size(MB) | GFLOPs (608*608) | PaddleLite cost(ms)(608*608) | TensorRT speed(FPS)(608*608) | Download |
+| :----------------------------: | :---------------: | :--------: | :-------: | :--------------: | :--------------: | :--------------: | :------------: | :--------------: | :--------------: | :--------------: | :----------------------------: |
+| MobileNet-V1-YOLOv3 | Baseline | Pascal VOC | 8 | 76.2 | 76.7 | 75.3 | 94 | 40.49 | 1238\796.943\520.101 |60.40| [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1_voc.tar) |
+| MobileNet-V1-YOLOv3 | sensitive -52.88% | Pascal VOC | 8 | 77.6 (+1.4) | 77.7 (+1.0) | 75.5 (+0.2) | 31 | 19.08 | 602.497\353.759\222.427 |99.36| [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_mobilenet_v1_voc_prune.tar) |
+| MobileNet-V1-YOLOv3 | - | COCO | 8 | 29.3 | 29.3 | 27.0 | 95 | 41.35 |-|-| [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1.tar) |
+| MobileNet-V1-YOLOv3 | sensitive -51.77% | COCO | 8 | 26.0 (-3.3) | 25.1 (-4.2) | 22.6 (-4.4) | 32 | 19.94 |-|73.93| [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_mobilenet_v1_prune.tar) |
+| R50-dcn-YOLOv3 | - | COCO | 8 | 39.1 | - | - | 177 | 89.60 |-|27.68| [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_r50vd_dcn.tar) |
+| R50-dcn-YOLOv3 | sensitive -9.37% | COCO | 8 | 39.3 (+0.2) | - | - | 150 | 81.20 |-|30.08| [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_r50vd_dcn_prune.tar) |
+| R50-dcn-YOLOv3 | sensitive -24.68% | COCO | 8 | 37.3 (-1.8) | - | - | 113 | 67.48 |-|34.32| [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_r50vd_dcn_prune578.tar) |
+| R50-dcn-YOLOv3 obj365_pretrain | - | COCO | 8 | 41.4 | - | - | 177 | 89.60 |-|-| [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_r50vd_dcn_obj365_pretrained_coco.tar) |
+| R50-dcn-YOLOv3 obj365_pretrain | sensitive -9.37% | COCO | 8 | 40.5 (-0.9) | - | - | 150 | 81.20 |-|-| [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_r50vd_dcn_obj365_pretrained_coco_prune.tar) |
+| R50-dcn-YOLOv3 obj365_pretrain | sensitive -24.68% | COCO | 8 | 37.8 (-3.3) | - | - | 113 | 67.48 |-|-| [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_r50vd_dcn_obj365_pretrained_coco_prune578.tar) |
-## 2. 目标检测
+### 2.3 Distillation
-### 2.1 量化
+Dataset: Pascal VOC & COCO 2017
-数据集: COCO 2017
-| 模型 | 压缩方法 | 数据集 | Image/GPU | 输入608 Box AP | 输入416 Box AP | 输入320 Box AP | 模型体积(MB) | 下载 |
-| :----------------------------: | :---------: | :----: | :-------: | :------------: | :------------: | :------------: | :------------: | :----------: |
-| MobileNet-V1-YOLOv3 | - | COCO | 8 | 29.3 | 29.3 | 27.1 | xx | [下载链接]() |
-| MobileNet-V1-YOLOv3 | quant_post | COCO | 8 | xx | xx | xx | xx | [下载链接]() |
-| MobileNet-V1-YOLOv3 | quant_aware | COCO | 8 | xx | xx | xx | xx | [下载链接]() |
-| R50-dcn-YOLOv3 obj365_pretrain | - | COCO | 8 | 41.4 | xx | xx | xx | [下载链接]() |
-| R50-dcn-YOLOv3 obj365_pretrain | quant_post | COCO | 8 | xx | xx | xx | xx | [下载链接]() |
-| R50-dcn-YOLOv3 obj365_pretrain | quant_aware | COCO | 8 | xx | xx | xx | xx | [下载链接]() |
+| Model | Method | Dataset | Image/GPU | Input 608 Box AP | Input 416 Box AP | Input 320 Box AP | Model Size(MB) | Download |
+| :-----------------: | :---------------------: | :--------: | :-------: | :--------------: | :--------------: | :--------------: | :--------------: | :----------------------------------------------------------: |
+| MobileNet-V1-YOLOv3 | - | Pascal VOC | 8 | 76.2 | 76.7 | 75.3 | 94 | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1_voc.tar) |
+| ResNet34-YOLOv3 | - | Pascal VOC | 8 | 82.6 | 81.9 | 80.1 | 162 | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_r34_voc.tar) |
+| MobileNet-V1-YOLOv3 | ResNet34-YOLOv3 distill | Pascal VOC | 8 | 79.0 (+2.8) | 78.2 (+1.5) | 75.5 (+0.2) | 94 | [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_mobilenetv1_voc_distilled.tar) |
+| MobileNet-V1-YOLOv3 | - | COCO | 8 | 29.3 | 29.3 | 27.0 | 95 | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1.tar) |
+| ResNet34-YOLOv3 | - | COCO | 8 | 36.2 | 34.3 | 31.4 | 163 | [model](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_r34.tar) |
+| MobileNet-V1-YOLOv3 | ResNet34-YOLOv3 distill | COCO | 8 | 31.4 (+2.1) | 30.0 (+0.7) | 27.1 (+0.1) | 95 | [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_mobilenetv1_coco_distilled.tar) |
+### 2.4 NAS
-数据集:WIDER-FACE
+Dataset: WIDER-FACE
+| Model | Method | Image/GPU | Input size | Easy/Medium/Hard | volume(KB) | latency(ms)| Download |
+| :------------: | :---------: | :-------: | :------: | :-----------------------------: | :------------: | :------------: | :----------------------------------------------------------: |
+| BlazeFace | - | 8 | 640 | 91.5/89.2/79.7 | 815 | 71.862 | [model](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_original.tar) |
+| BlazeFace-NAS | - | 8 | 640 | 83.7/80.7/65.8 | 244 | 21.117 |[model](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_nas.tar) |
+| BlazeFace-NASV2 | SANAS | 8 | 640 | 87.0/83.7/68.5 | 389 | 22.558 | [model](https://paddlemodels.bj.bcebos.com/object_detection/blazeface_nas2.tar) |
+Note: Latency is based on latency_855.txt, which was measured on the 855 chip with PaddleLite. The config of BlazeFace-NASV2 is available [here](https://github.com/PaddlePaddle/PaddleDetection/blob/master/configs/face_detection/blazeface_nas_v2.yml).
-| 模型 | 压缩方法 | Image/GPU | 输入尺寸 | Easy/Medium/Hard | 模型体积(MB) | 下载 |
-| :------------: | :---------: | :-------: | :------: | :---------------: | :------------: | :----------: |
-| BlazeFace | - | 8 | 640 | 0.915/0.892/0.797 | xx | [下载链接]() |
-| BlazeFace | quant_post | 8 | 640 | xx/xx/xx | xx | [下载链接]() |
-| BlazeFace | quant_aware | 8 | 640 | xx/xx/xx | xx | [下载链接]() |
-| BlazeFace-Lite | - | 8 | 640 | 0.909/0.885/0.781 | xx | [下载链接]() |
-| BlazeFace-Lite | quant_post | 8 | 640 | xx/xx/xx | xx | [下载链接]() |
-| BlazeFace-Lite | quant_aware | 8 | 640 | xx/xx/xx | xx | [下载链接]() |
-| BlazeFace-NAS | - | 8 | 640 | 0.837/0.807/0.658 | xx | [下载链接]() |
-| BlazeFace-NAS | quant_post | 8 | 640 | xx/xx/xx | xx | [下载链接]() |
-| BlazeFace-NAS | quant_aware | 8 | 640 | xx/xx/xx | xx | [下载链接]() |
+## 3. Image Segmentation
+Dataset: Cityscapes
-### 2.2 剪裁
+### 3.1 Quantization
-数据集:Pasacl VOC & COCO 2017
+| Model | Method | mIoU | Model Size(MB) | Download |
+| :--------------------: | :---------: | :-----------: | :--------------: | :----------------------------------------------------------: |
+| DeepLabv3+/MobileNetv1 | - | 63.26 | 6.6 | [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/deeplabv3_mobilenetv1.tar) |
+| DeepLabv3+/MobileNetv1 | quant_post | 58.63 (-4.63) | 1.8 | [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/deeplabv3_mobilenetv1_2049x1025_quant_post.tar) |
+| DeepLabv3+/MobileNetv1 | quant_aware | 62.03 (-1.23) | 1.8 | [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/deeplabv3_mobilenetv1_2049x1025_quant_aware.tar) |
+| DeepLabv3+/MobileNetv2 | - | 69.81 | 7.4 | [model](https://paddleseg.bj.bcebos.com/models/mobilenet_cityscapes.tgz) |
+| DeepLabv3+/MobileNetv2 | quant_post | 67.59 (-2.22) | 2.1 | [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/deeplabv3_mobilenetv2_2049x1025_quant_post.tar) |
+| DeepLabv3+/MobileNetv2 | quant_aware | 68.33 (-1.48) | 2.1 | [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/deeplabv3_mobilenetv2_2049x1025_quant_aware.tar) |
-| 模型 | 压缩方法 | 数据集 | Image/GPU | 输入608 Box AP | 输入416 Box AP | 输入320 Box AP | 模型体积(MB) | GFLOPs (608*608) | 下载 |
-| :----------------------------: | :---------------: | :--------: | :-------: | :------------: | :------------: | :------------: | :----------: | :--------------: | :----------------------------------------------------------: |
-| MobileNet-V1-YOLOv3 | Baseline | Pascal VOC | 8 | 76.2 | 76.7 | 75.3 | 94 | 40.49 | [下载链接](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1_voc.tar) |
-| MobileNet-V1-YOLOv3 | sensitive -52.88% | Pascal VOC | 8 | 77.6 (+1.4) | 77.7 (1.0) | 75.5 (+0.2) | 31 | 19.08 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_mobilenet_v1_voc_prune.tar) |
-| MobileNet-V1-YOLOv3 | - | COCO | 8 | 29.3 | 29.3 | 27.0 | 95 | 41.35 | [下载链接](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1.tar) |
-| MobileNet-V1-YOLOv3 | sensitive -51.77% | COCO | 8 | 26.0 (-3.3) | 25.1 (-4.2) | 22.6 (-4.4) | 32 | 19.94 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_mobilenet_v1_prune.tar) |
-| R50-dcn-YOLOv3 | - | COCO | 8 | 39.1 | - | - | 177 | 89.60 | [下载链接](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_r50vd_dcn.tar) |
-| R50-dcn-YOLOv3 | sensitive -9.37% | COCO | 8 | 39.3 (+0.2) | - | - | 150 | 81.20 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_r50vd_dcn_prune.tar) |
-| R50-dcn-YOLOv3 | sensitive -24.68% | COCO | 8 | 37.3 (-1.8) | - | - | 113 | 67.48 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_r50vd_dcn_prune578.tar) |
-| R50-dcn-YOLOv3 obj365_pretrain | - | COCO | 8 | 41.4 | - | - | 177 | 89.60 | [下载链接](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_r50vd_dcn_obj365_pretrained_coco.tar) |
-| R50-dcn-YOLOv3 obj365_pretrain | sensitive -9.37% | COCO | 8 | 40.5 (-0.9) | - | - | 150 | 81.20 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_r50vd_dcn_obj365_pretrained_coco_prune.tar) |
-| R50-dcn-YOLOv3 obj365_pretrain | sensitive -24.68% | COCO | 8 | 37.8 (-3.3) | - | - | 113 | 67.48 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_r50vd_dcn_obj365_pretrained_coco_prune578.tar) |
+Image segmentation model PaddleLite latency (ms), input size 769x769
-### 2.3 蒸馏
+| Device | Model | Method | armv7 Thread 1 | armv7 Thread 2 | armv7 Thread 4 | armv8 Thread 1 | armv8 Thread 2 | armv8 Thread 4 |
+| ------------ | ---------------------- | ------------- | -------------- | -------------- | -------------- | -------------- | -------------- | -------------- |
+| Qualcomm 835 | Deeplabv3-MobileNetV1 | FP32 baseline | 1227.9894 | 734.1922 | 527.9592 | 1109.96 | 699.3818 | 479.0818 |
+| Qualcomm 835 | Deeplabv3-MobileNetV1 | quant_aware | 848.6544 | 512.785 | 382.9915 | 752.3573 | 455.0901 | 307.8808 |
+| Qualcomm 835 | Deeplabv3-MobileNetV1 | quant_post | 840.2323 | 510.103 | 371.9315 | 748.9401 | 452.1745 | 309.2084 |
+| Qualcomm 835 | Deeplabv3-MobileNetV2 | FP32 baseline | 1282.8126 | 793.2064 | 653.6538 | 1193.9908 | 737.1827 | 593.4522 |
+| Qualcomm 835 | Deeplabv3-MobileNetV2 | quant_aware | 976.0495 | 659.0541 | 513.4279 | 892.1468 | 582.9847 | 484.7512 |
+| Qualcomm 835 | Deeplabv3-MobileNetV2 | quant_post | 981.44 | 658.4969 | 538.6166 | 885.3273 | 586.1284 | 484.0018 |
+| Qualcomm 855 | Deeplabv3-MobileNetV1 | FP32 baseline | 568.8748 | 339.8578 | 278.6316 | 420.6031 | 281.3197 | 217.5222 |
+| Qualcomm 855 | Deeplabv3-MobileNetV1 | quant_aware | 608.7578 | 347.2087 | 260.653 | 241.2394 | 177.3456 | 143.9178 |
+| Qualcomm 855 | Deeplabv3-MobileNetV1 | quant_post | 609.0142 | 347.3784 | 259.9825 | 239.4103 | 180.1894 | 139.9178 |
+| Qualcomm 855 | Deeplabv3-MobileNetV2 | FP32 baseline | 639.4425 | 390.1851 | 322.7014 | 477.7667 | 339.7411 | 262.2847 |
+| Qualcomm 855 | Deeplabv3-MobileNetV2 | quant_aware | 703.7275 | 497.689 | 417.1296 | 394.3586 | 300.2503 | 239.9204 |
+| Qualcomm 855 | Deeplabv3-MobileNetV2 | quant_post | 705.7589 | 474.4076 | 427.2951 | 394.8352 | 297.4035 | 264.6724 |
+| Kirin 970 | Deeplabv3-MobileNetV1 | FP32 baseline | 1682.1792 | 1437.9774 | 1181.0246 | 1261.6739 | 1068.6537 | 690.8225 |
+| Kirin 970 | Deeplabv3-MobileNetV1 | quant_aware | 1062.3394 | 1248.1014 | 878.3157 | 774.6356 | 710.6277 | 528.5376 |
+| Kirin 970 | Deeplabv3-MobileNetV1 | quant_post | 1109.1917 | 1339.6218 | 866.3587 | 771.5164 | 716.5255 | 500.6497 |
+| Kirin 970 | Deeplabv3-MobileNetV2 | FP32 baseline | 1771.1301 | 1746.0569 | 1222.4805 | 1448.9739 | 1192.4491 | 760.606 |
+| Kirin 970 | Deeplabv3-MobileNetV2 | quant_aware | 1320.2905 | 921.4522 | 676.0732 | 1145.8801 | 821.5685 | 590.1713 |
+| Kirin 970 | Deeplabv3-MobileNetV2 | quant_post | 1320.386 | 918.5328 | 672.2481 | 1020.753 | 820.094 | 591.4114 |
-数据集:Pasacl VOC & COCO 2017
-| 模型 | 压缩方法 | 数据集 | Image/GPU | 输入608 Box AP | 输入416 Box AP | 输入320 Box AP | 模型体积(MB) | 下载 |
-| :-----------------: | :---------------------: | :--------: | :-------: | :------------: | :------------: | :------------: | :------------: | :----------------------------------------------------------: |
-| MobileNet-V1-YOLOv3 | - | Pascal VOC | 8 | 76.2 | 76.7 | 75.3 | 94 | [下载链接](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1_voc.tar) |
-| ResNet34-YOLOv3 | - | Pascal VOC | 8 | 82.6 | 81.9 | 80.1 | 162 | [下载链接](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_r34_voc.tar) |
-| MobileNet-V1-YOLOv3 | ResNet34-YOLOv3 distill | Pascal VOC | 8 | 79.0 (+2.8) | 78.2 (+1.5) | 75.5 (+0.2) | 94 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_mobilenetv1_voc_distilled.tar) |
-| MobileNet-V1-YOLOv3 | - | COCO | 8 | 29.3 | 29.3 | 27.0 | 95 | [下载链接](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1.tar) |
-| ResNet34-YOLOv3 | - | COCO | 8 | 36.2 | 34.3 | 31.4 | 163 | [下载链接](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_r34.tar) |
-| MobileNet-V1-YOLOv3 | ResNet34-YOLOv3 distill | COCO | 8 | 31.4 (+2.1) | 30.0 (+0.7) | 27.1 (+0.1) | 95 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_mobilenetv1_coco_distilled.tar) |
-## 3. 图像分割
+### 3.2 Pruning
-数据集:Cityscapes
+PaddleLite:
-### 3.1 量化
+env: Qualcomm SnapDragon 845 + armv8
-| 模型 | 压缩方法 | mIoU | 模型体积(MB) | 下载 |
-| :--------------------: | :---------: | :---: | :------------: | :----------: |
-| DeepLabv3+/MobileNetv1 | - | 63.26 | xx | [下载链接]() |
-| DeepLabv3+/MobileNetv1 | quant_post | xx | xx | [下载链接]() |
-| DeepLabv3+/MobileNetv1 | quant_aware | xx | xx | [下载链接]() |
-| DeepLabv3+/MobileNetv2 | - | 69.81 | xx | [下载链接]() |
-| DeepLabv3+/MobileNetv2 | quant_post | xx | xx | [下载链接]() |
-| DeepLabv3+/MobileNetv2 | quant_aware | xx | xx | [下载链接]() |
+criterion: time cost in Thread1/Thread2/Thread4
-### 3.2 剪裁
+PaddleLite version: v2.3
-| 模型 | 压缩方法 | mIoU | 模型体积(MB) | GFLOPs | 下载 |
-| :-------: | :---------------: | :-----------: | :------------: | :----: | :----------------------------------------------------------: |
-| fast-scnn | baseline | 69.64 | 11 | 14.41 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/fast_scnn_cityscape.tar) |
-| fast-scnn | uniform -17.07% | 69.58 (-0.06) | 8.5 | 11.95 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/fast_scnn_cityscape_uniform-17.tar) |
-| fast-scnn | sensitive -47.60% | 66.68 (-2.96) | 5.7 | 7.55 | [下载链接](https://paddlemodels.bj.bcebos.com/PaddleSlim/fast_scnn_cityscape_sensitive-47.tar) |
+| Model | Method | mIoU | Model Size(MB) | GFLOPs | PaddleLite cost (ms, 1/2/4 threads) | TensorRT speed (FPS) | Download |
+| :-------: | :---------------: | :-----------: | :--------------: | :----: | :--------------: | :----: | :-------------------: |
+| fast-scnn | baseline | 69.64 | 11 | 14.41 | 1226.36 / 682.96 / 415.664 | 39.53 | [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/fast_scnn_cityscape.tar) |
+| fast-scnn | uniform -17.07% | 69.58 (-0.06) | 8.5 | 11.95 | 1140.37 / 656.612 / 415.888 | 42.01 | [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/fast_scnn_cityscape_uniform-17.tar) |
+| fast-scnn | sensitive -47.60% | 66.68 (-2.96) | 5.7 | 7.55 | 866.693 / 494.467 / 291.748 | 51.48 | [model](https://paddlemodels.bj.bcebos.com/PaddleSlim/fast_scnn_cityscape_sensitive-47.tar) |
diff --git a/_sources/tutorials/darts_nas_turorial.md.txt b/_sources/tutorials/darts_nas_turorial.md.txt
new file mode 100644
index 0000000000000000000000000000000000000000..2f99a6c1ca3fd4d8e63d25d3c83d1f6b19807f57
--- /dev/null
+++ b/_sources/tutorials/darts_nas_turorial.md.txt
@@ -0,0 +1,275 @@
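+# Advanced SANAS Tutorial: Compressing the Model Produced by DARTS
+
+## Results
+Starting from the final architecture found by DARTS (DARTS_model below), we build a corresponding search space and run a search with the SANAS method provided by PaddleSlim. Compared with DARTS_model, the resulting architecture (DARTS_SA below) improves accuracy by 0.141% and reduces model size by 11.2%.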
+
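+## Search tutorial
+This tutorial shows how to run a SANAS search on top of DARTS_model and obtain the DARTS_SA result.
+
+The tutorial covers the following steps:
+1. Build the search space
+2. Import dependencies and define global variables
+3. Initialize the SANAS instance
+4. Define a function that counts model parameters
+5. Define a function that creates the network inputs
+6. Define a function that builds the program
+7. Define the training function
+8. Define the evaluation function
+9. Start the search
+    9.1 Get the next model architecture
+    9.2 Build the training and evaluation programs
+    9.3 Add the search constraint
+    9.4 Define the environment
+    9.5 Define the input data
+    9.6 Run training and evaluation
+    9.7 Send the reward of the current model back
+10. Start the search with the script under demo
+11. Start the final experiment with the script under demo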
+
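+### 1. Build the search space
+Before searching, build a search space that matches the characteristics of DARTS_model. This experiment searches only over the channel numbers of DARTS_model, with the goal of finding a model that is more accurate and has fewer parameters.
+The search space is defined as follows:
+- Channel number `filter_num`: the range of channel numbers available to each convolution. Candidate values: `[4, 8, 12, 16, 20, 36, 54, 72, 90, 108, 144, 180, 216, 252]`
+
+Grouping cells by channel number, DARTS_model has 3 blocks. The first block contains only 6 normal cells, and each of the following two blocks contains one reduction cell and 6 normal cells, for 20 cells in total (6 + 7 + 7). In this search space all convolutions within a cell share the same channel number, so a token has 20 entries.
+
+For the complete search space, see [the search space based on DARTS_model](../../../paddleslim/nas/search_space/darts_space.py)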
+
+### 2. Import dependencies and define global variables
+```python
+import numpy as np
+import paddle
+import paddle.fluid as fluid
+from paddleslim.nas import SANAS
+
+BATCH_SIZE=96
+SERVER_ADDRESS = ""
+PORT = 8377
+SEARCH_STEPS = 300
+RETAIN_EPOCH=30
+MAX_PARAMS=3.77
+IMAGE_SHAPE=[3, 32, 32]
+AUXILIARY = True
+AUXILIARY_WEIGHT= 0.4
+TRAINSET_NUM = 50000
+LR = 0.025
+MOMENTUM = 0.9
+WEIGHT_DECAY = 0.0003
+DROP_PATH_PROBILITY = 0.2
+```
+
+### 3. Initialize the SANAS instance
+First, initialize a SANAS instance.
+```python
+config = [('DartsSpace')]
+sa_nas = SANAS(config, server_addr=(SERVER_ADDRESS, PORT), search_steps=SEARCH_STEPS, is_server=True)
+```
+
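+### 4. Define a function that counts model parameters
+Count the parameters of the current model from the parameters of the input program. The function returns the count in millions. This tutorial uses the parameter count as the search constraint.
+```python
+def count_parameters_in_MB(all_params, prefix='model'):
+    parameters_number = 0
+    for param in all_params:
+        if param.name.startswith(
+                prefix) and param.trainable and 'aux' not in param.name:
+            parameters_number += np.prod(param.shape)
+    return parameters_number / 1e6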
+```
+
+### 5. Define a function that creates the network inputs
+Define the network inputs from the image shape: the image input, the label input and, for training, the drop-path probability and mask used to randomly drop units.
+```python
+def create_data_loader(IMAGE_SHAPE, is_train):
+    image = fluid.data(
+        name="image", shape=[None] + IMAGE_SHAPE, dtype="float32")
+    label = fluid.data(name="label", shape=[None, 1], dtype="int64")
+    data_loader = fluid.io.DataLoader.from_generator(
+        feed_list=[image, label],
+        capacity=64,
+        use_double_buffer=True,
+        iterable=True)
+    drop_path_prob = ''
+    drop_path_mask = ''
+    if is_train:
+        drop_path_prob = fluid.data(
+            name="drop_path_prob", shape=[BATCH_SIZE, 1], dtype="float32")
+        drop_path_mask = fluid.data(
+            name="drop_path_mask",
+            shape=[BATCH_SIZE, 20, 4, 2],
+            dtype="float32")
+
+    return data_loader, image, label, drop_path_prob, drop_path_mask
+```
+
+### 6. Define a function that builds the program
+Build a program from the given architecture, the input image shape, and whether the program is in training mode.
+```python
+def build_program(main_program, startup_program, IMAGE_SHAPE, archs, is_train):
+    with fluid.program_guard(main_program, startup_program):
+        data_loader, data, label, drop_path_prob, drop_path_mask = create_data_loader(
+            IMAGE_SHAPE, is_train)
+        logits, logits_aux = archs(data, drop_path_prob, drop_path_mask,
+                                   is_train, 10)
+        top1 = fluid.layers.accuracy(input=logits, label=label, k=1)
+        top5 = fluid.layers.accuracy(input=logits, label=label, k=5)
+        loss = fluid.layers.reduce_mean(
+            fluid.layers.softmax_with_cross_entropy(logits, label))
+
+        if is_train:
+            if AUXILIARY:
+                loss_aux = fluid.layers.reduce_mean(
+                    fluid.layers.softmax_with_cross_entropy(logits_aux, label))
+                loss = loss + AUXILIARY_WEIGHT * loss_aux
+            step_per_epoch = int(TRAINSET_NUM / BATCH_SIZE)
+            learning_rate = fluid.layers.cosine_decay(LR, step_per_epoch, RETAIN_EPOCH)
+            fluid.clip.set_gradient_clip(
+                clip=fluid.clip.GradientClipByGlobalNorm(clip_norm=5.0))
+            optimizer = fluid.optimizer.MomentumOptimizer(
+                learning_rate,
+                MOMENTUM,
+                regularization=fluid.regularizer.L2DecayRegularizer(
+                    WEIGHT_DECAY))
+            optimizer.minimize(loss)
+            outs = [loss, top1, top5, learning_rate]
+        else:
+            outs = [loss, top1, top5]
+    return outs, data_loader
+
+```
+
+### 7. Define the training function
+```python
+def train(main_prog, exe, epoch_id, train_loader, fetch_list):
+    loss = []
+    top1 = []
+    top5 = []
+    for step_id, data in enumerate(train_loader()):
+        devices_num = len(data)
+        if DROP_PATH_PROBILITY > 0:
+            feed = []
+            for device_id in range(devices_num):
+                image = data[device_id]['image']
+                label = data[device_id]['label']
+                drop_path_prob = np.array(
+                    [[DROP_PATH_PROBILITY * epoch_id / RETAIN_EPOCH]
+                     for i in range(BATCH_SIZE)]).astype(np.float32)
+                drop_path_mask = 1 - np.random.binomial(
+                    1, drop_path_prob[0],
+                    size=[BATCH_SIZE, 20, 4, 2]).astype(np.float32)
+                feed.append({
+                    "image": image,
+                    "label": label,
+                    "drop_path_prob": drop_path_prob,
+                    "drop_path_mask": drop_path_mask
+                })
+        else:
+            feed = data
+        loss_v, top1_v, top5_v, lr = exe.run(
+            main_prog, feed=feed, fetch_list=[v.name for v in fetch_list])
+        loss.append(loss_v)
+        top1.append(top1_v)
+        top5.append(top5_v)
+        if step_id % 10 == 0:
+            print(
+                "Train Epoch {}, Step {}, Lr {:.8f}, loss {:.6f}, acc_1 {:.6f}, acc_5 {:.6f}".
+                format(epoch_id, step_id, lr[0], np.mean(loss), np.mean(top1), np.mean(top5)))
+    return np.mean(top1)
+```
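
The drop-path inputs that the training function feeds can be looked at in isolation. The following is a minimal NumPy-only sketch, assuming the constants from step 2. The helper name `make_drop_path_feed` is hypothetical and not part of PaddleSlim, and reading the mask shape as batch x cells x nodes x inputs is an assumption. It shows the linear ramp of the drop probability over epochs and the shape of the keep mask.

```python
import numpy as np

BATCH_SIZE = 96
RETAIN_EPOCH = 30
DROP_PATH_PROBILITY = 0.2  # spelling kept from the tutorial's constant


def make_drop_path_feed(epoch_id, rng=None):
    """Hypothetical helper: build drop_path_prob and drop_path_mask for one batch."""
    if rng is None:
        rng = np.random.default_rng(0)
    # The drop probability ramps linearly from 0 up to DROP_PATH_PROBILITY.
    prob = DROP_PATH_PROBILITY * epoch_id / RETAIN_EPOCH
    drop_path_prob = np.full((BATCH_SIZE, 1), prob, dtype=np.float32)
    # A mask entry of 1 keeps a path and 0 drops it.
    # Assumed layout: [batch, cells, nodes per cell, inputs per node].
    dropped = rng.binomial(1, prob, size=(BATCH_SIZE, 20, 4, 2))
    drop_path_mask = (1 - dropped).astype(np.float32)
    return drop_path_prob, drop_path_mask


prob, mask = make_drop_path_feed(epoch_id=15)
print(prob[0, 0], mask.shape, mask.mean())
```

At epoch 0 the probability is 0, so every path is kept. The mean of the mask then drifts down towards `1 - DROP_PATH_PROBILITY` as training approaches `RETAIN_EPOCH`.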
+
+### 8. Define the evaluation function
+```python
+def valid(main_prog, exe, epoch_id, valid_loader, fetch_list):
+    loss = []
+    top1 = []
+    top5 = []
+    for step_id, data in enumerate(valid_loader()):
+        loss_v, top1_v, top5_v = exe.run(
+            main_prog, feed=data, fetch_list=[v.name for v in fetch_list])
+        loss.append(loss_v)
+        top1.append(top1_v)
+        top5.append(top5_v)
+        if step_id % 10 == 0:
+            print(
+                "Valid Epoch {}, Step {}, loss {:.6f}, acc_1 {:.6f}, acc_5 {:.6f}".
+                format(epoch_id, step_id, np.mean(loss), np.mean(top1), np.mean(top5)))
+    return np.mean(top1)
+```
+
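+### 9. Start the search experiment
+The steps below break down how to obtain the current architecture and what to do once it has been obtained.
+
+#### 9.1 Get the next model architecture
+Get the next architecture from the SANAS instance created above.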
+```python
+archs = sa_nas.next_archs()[0]
+```
+
+#### 9.2 Build the training and evaluation programs
+Build a training program and an evaluation program from the architecture obtained in the previous step.
+```python
+train_program = fluid.Program()
+test_program = fluid.Program()
+startup_program = fluid.Program()
+train_fetch_list, train_loader = build_program(train_program, startup_program, IMAGE_SHAPE, archs, is_train=True)
+test_fetch_list, test_loader = build_program(test_program, startup_program, IMAGE_SHAPE, archs, is_train=False)
+test_program = test_program.clone(for_test=True)
+```
+
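+#### 9.3 Add the search constraint
+This tutorial constrains the number of model parameters. First compute the parameter count of the current program. If it exceeds the limit, stop training this architecture and fetch the next one.
+```python
+current_params = count_parameters_in_MB(
+    train_program.global_block().all_parameters(), 'cifar10')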
+```
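
The budget check described in 9.3 can be tried without Paddle. The following is a minimal sketch, assuming parameters only need `name`, `shape` and `trainable` attributes. `FakeParam` and `exceeds_budget` are hypothetical names introduced for illustration, and the parameter shapes are made up.

```python
from collections import namedtuple

import numpy as np

MAX_PARAMS = 3.77  # budget in millions of parameters, as in step 2

# Stand-in for a Paddle parameter: only the attributes the counter reads.
FakeParam = namedtuple("FakeParam", ["name", "shape", "trainable"])


def count_parameters_in_MB(all_params, prefix="model"):
    """Same counting rule as step 4: trainable, matching prefix, no auxiliary head."""
    total = 0
    for param in all_params:
        if (param.name.startswith(prefix) and param.trainable
                and "aux" not in param.name):
            total += np.prod(param.shape)
    return total / 1e6


def exceeds_budget(all_params, prefix="cifar10"):
    """Hypothetical helper: True when the candidate architecture should be skipped."""
    return count_parameters_in_MB(all_params, prefix) > MAX_PARAMS


params = [
    FakeParam("cifar10_conv1_weights", (64, 3, 3, 3), True),  # 1,728 parameters
    FakeParam("cifar10_aux_fc_weights", (768, 10), True),     # ignored: auxiliary head
    FakeParam("cifar10_fc_weights", (2000, 2000), True),      # 4,000,000 parameters
]
print(count_parameters_in_MB(params, "cifar10"), exceeds_budget(params))
```

In a search loop, a `True` result would typically mean skipping the training steps and requesting the next architecture from `sa_nas.next_archs()`.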
+
+#### 9.4 Define the environment
+Define the device on which the data and model run, and initialize the parameters.
+```python
+place = fluid.CPUPlace()
+exe = fluid.Executor(place)
+exe.run(startup_program)
+```
+
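+#### 9.5 Define the input data
+This example applies extra preprocessing to the cifar10 images, unlike the reader in the [quick start](../quick_start/nas_tutorial.md) example, so it needs a custom cifar10 reader and cannot directly use the `paddle.dataset.cifar` reader packaged in paddle. The custom cifar10 reader is located in [demo/nas](../../../demo/nas/darts_cifar10_reader.py).
+
+**Note:** To keep the code short, this example calls `paddle.dataset.cifar` directly to define the training and evaluation data. Real training should use the custom cifar10 reader.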
+```python
+train_reader = paddle.batch(paddle.reader.shuffle(paddle.dataset.cifar.train10(cycle=False), buf_size=1024), batch_size=BATCH_SIZE, drop_last=True)
+test_reader = paddle.batch(paddle.dataset.cifar.test10(cycle=False), batch_size=BATCH_SIZE, drop_last=False)
+train_loader.set_sample_list_generator(train_reader, places=place)
+test_loader.set_sample_list_generator(test_reader, places=place)
+```
+
+#### 9.6 Run training and evaluation
+```python
+valid_top1_list = []  # validation top-1 accuracy per epoch, used for the reward in 9.7
+for epoch_id in range(RETAIN_EPOCH):
+    train_top1 = train(train_program, exe, epoch_id, train_loader, train_fetch_list)
+    print("TRAIN: Epoch {}, train_acc {:.6f}".format(epoch_id, train_top1))
+    valid_top1 = valid(test_program, exe, epoch_id, test_loader, test_fetch_list)
+    print("TEST: Epoch {}, valid_acc {:.6f}".format(epoch_id, valid_top1))
+    valid_top1_list.append(valid_top1)
+```
+
+#### 9.7 Send the reward of the current model back
+This tutorial uses the mean accuracy of the last two epochs as the final score sent back to SANAS.
+```python
+sa_nas.reward(float(valid_top1_list[-1] + valid_top1_list[-2]) / 2)
+```
+
+
+### 10. Start the search with the script under demo
+
+The search script is [darts_sanas_demo](https://github.com/PaddlePaddle/PaddleSlim/tree/develop/demo/nas/darts_nas.py). During the search the model is limited to at most 3.77M parameters.
+```shell
+cd demo/nas/
+python darts_nas.py
+```
+
+### 11. Start the final experiment with the script under demo
+The final experiment script is [darts_sanas_demo](https://github.com/PaddlePaddle/PaddleSlim/tree/develop/demo/nas/darts_nas.py). The final experiment trains for 600 epochs. The example below uses the token `[5, 5, 0, 5, 5, 10, 7, 7, 5, 7, 7, 11, 10, 12, 10, 0, 5, 3, 10, 8]`.
+```shell
+cd demo/nas/
+python darts_nas.py --token 5 5 0 5 5 10 7 7 5 7 7 11 10 12 10 0 5 3 10 8 --retain_epoch 600
+```
diff --git a/_sources/tutorials/image_classification_sensitivity_analysis_tutorial_en.md.txt b/_sources/tutorials/image_classification_sensitivity_analysis_tutorial_en.md.txt
new file mode 100644
index 0000000000000000000000000000000000000000..2b084074d9c9f79cbe013252457f01f448968f5e
--- /dev/null
+++ b/_sources/tutorials/image_classification_sensitivity_analysis_tutorial_en.md.txt
@@ -0,0 +1,263 @@
+# Pruning of image classification model - sensitivity
+
+In this tutorial, you will learn how to use the [sensitivity API of PaddleSlim](https://paddlepaddle.github.io/PaddleSlim/api/prune_api/#sensitivity) through a demo of the MobileNetV1 model on the MNIST dataset.
+This tutorial follows this workflow:
+
+1. Import dependencies
+2. Build model
+3. Define data reader
+4. Define test function
+5. Train model
+6. Get names of parameters
+7. Compute sensitivities
+8. Prune model
+
+
+## 1. Import dependency
+
+PaddleSlim depends on Paddle 1.7. Please ensure that you have installed Paddle correctly. Import Paddle and PaddleSlim as below:
+
+```python
+import paddle
+import paddle.fluid as fluid
+import paddleslim as slim
+```
+
+## 2. Build model
+
+This section builds a classification model based on `MobileNetV1` for the MNIST task. The shape of the input is `[1, 28, 28]` and the number of output classes is 10.
+
+To keep the code simple, a function in the package `paddleslim.models` is used to build the classification model.
+Execute the following code to build the model:
+
+
+```python
+exe, train_program, val_program, inputs, outputs = slim.models.image_classification("MobileNet", [1, 28, 28], 10, use_gpu=True)
+place = fluid.CUDAPlace(0)
+```
+
+>Note: The functions in `paddleslim.models` are intended only for tutorials and demos.
+
+## 3. Define data reader
+
+The MNIST dataset is used so that the demo runs quickly. The package `paddle.dataset.mnist` provides functions for downloading and reading the MNIST dataset.
+They are used as below:
+
+```python
+import paddle.dataset.mnist as reader
+train_reader = paddle.batch(
+    reader.train(), batch_size=128, drop_last=True)
+test_reader = paddle.batch(
+    reader.test(), batch_size=128, drop_last=True)
+data_feeder = fluid.DataFeeder(inputs, place)
+```
+
+## 4. Define test function
+
+To get the performance of model on test dataset after pruning a convolution layer, we define a test function as below:
+
+```python
+import numpy as np
+def test(program):
+    acc_top1_ns = []
+    acc_top5_ns = []
+    for data in test_reader():
+        acc_top1_n, acc_top5_n, _ = exe.run(
+            program,
+            feed=data_feeder.feed(data),
+            fetch_list=outputs)
+        acc_top1_ns.append(np.mean(acc_top1_n))
+        acc_top5_ns.append(np.mean(acc_top5_n))
+    print("Final eva - acc_top1: {}; acc_top5: {}".format(
+        np.mean(np.array(acc_top1_ns)), np.mean(np.array(acc_top5_ns))))
+    return np.mean(np.array(acc_top1_ns))
+```
+
+## 5. Train model
+
+Sensitivity analysis depends on a pretrained model, so the model defined in section 2 should first be trained for some epochs. One epoch is enough for this simple demo, while other models may need more. Alternatively, you can load a pretrained model from the filesystem.
+
+Train the model as below:
+
+
+```python
+for data in train_reader():
+    acc1, acc5, loss = exe.run(train_program, feed=data_feeder.feed(data), fetch_list=outputs)
+print(np.mean(acc1), np.mean(acc5), np.mean(loss))
+```
+
+Get the performance using the test function defined in section 4:
+
+```python
+test(val_program)
+```
+
+## 6. Get names of parameters
+
+```python
+params = []
+for param in train_program.global_block().all_parameters():
+    if "_sep_weights" in param.name:
+        params.append(param.name)
+print(params)
+params = params[:5]
+```
+
+## 7. Compute sensitivities
+
+### 7.1 Compute in single process
+
+Apply sensitivity analysis to the pretrained model by calling the [sensitivity API](https://paddlepaddle.github.io/PaddleSlim/api/prune_api/#sensitivity).
+
+During computation, the sensitivities are appended to the file given by the option `sensitivities_file`.
+Results already stored in this file won't be computed again.
+
+Remove the file `sensitivities_0.data` from the current directory:
+
+```python
+!rm -rf sensitivities_0.data
+```
+
+Apart from the parameters to be analyzed, the API also supports setting the ratios by which each convolution will be pruned.
+
+If a model loses 90% of its accuracy on the test dataset when a single convolution layer is pruned by 40%, larger ratios are not worth analyzing, so `pruned_ratios` can be set to `[0.1, 0.2, 0.3, 0.4]`.
+
+A finer granularity of `pruned_ratios` gives more reliable sensitivities, but it also makes the computation slower.
+
+```python
+sens_0 = slim.prune.sensitivity(
+    val_program,
+    place,
+    params,
+    test,
+    sensitivities_file="sensitivities_0.data",
+    pruned_ratios=[0.1, 0.2])
+print(sens_0)
+```
+
+### 7.2 Expand sensitivities
+
+We can expand `pruned_ratios` to `[0.1, 0.2, 0.3]` based on the sensitivities generated in section 7.1. Only the new ratio needs to be passed, because the existing results are read from the file.
+
+```python
+sens_0 = slim.prune.sensitivity(
+    val_program,
+    place,
+    params,
+    test,
+    sensitivities_file="sensitivities_0.data",
+    pruned_ratios=[0.3])
+print(sens_0)
+```
+
+### 7.3 Computing sensitivity in multi-process
+
+The time cost of computing sensitivities depends on the number of parameters and the speed of model evaluation on the test dataset. The computation can be sped up with multiple processes.
+
+Split `pruned_ratios` across several processes, then merge the sensitivities they produce.
+
+#### 7.3.1 Computing in each process
+
+We have computed the sensitivities for `pruned_ratios=[0.1, 0.2, 0.3]` and saved them into the file named `sensitivities_0.data`.
+
+Then we start a task with `pruned_ratios=[0.4]` in another process and save the result into the file named `sensitivities_1.data`, as shown below:
+
+
+```python
+sens_1 = slim.prune.sensitivity(
+    val_program,
+    place,
+    params,
+    test,
+    sensitivities_file="sensitivities_1.data",
+    pruned_ratios=[0.4])
+print(sens_1)
+```
+
+#### 7.3.2 Load sensitivity file generated in multi-process
+
+```python
+s_0 = slim.prune.load_sensitivities("sensitivities_0.data")
+s_1 = slim.prune.load_sensitivities("sensitivities_1.data")
+print(s_0)
+print(s_1)
+```
+
+#### 7.3.3 Merge sensitivities
+
+
+```python
+s = slim.prune.merge_sensitive([s_0, s_1])
+print(s)
+```
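
To make the merge step concrete, the following is a minimal sketch of combining results from two processes. It assumes sensitivities are stored as `{param_name: {ratio: loss}}` dictionaries. The helper `merge_sensitivity_dicts` is hypothetical and is not the implementation of `slim.prune.merge_sensitive`. The loss values are made up.

```python
def merge_sensitivity_dicts(parts):
    """Hypothetical helper: combine {param: {ratio: loss}} dicts from several runs."""
    merged = {}
    for part in parts:
        for param_name, losses in part.items():
            # Later runs add new ratios; a repeated ratio is overwritten.
            merged.setdefault(param_name, {}).update(losses)
    return merged


# Made-up results: process 0 analyzed ratios 0.1-0.3, process 1 analyzed 0.4.
run_0 = {"conv2_1_sep_weights": {0.1: 0.01, 0.2: 0.03, 0.3: 0.08}}
run_1 = {"conv2_1_sep_weights": {0.4: 0.21}}
merged = merge_sensitivity_dicts([run_0, run_1])
print(sorted(merged["conv2_1_sep_weights"].items()))
```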
+
+## 8. Prune model
+
+Prune the model according to the sensitivities generated in section 7.3.3.
+
+### 8.1 Get pruning ratios
+
+Get a group of ratios by calling the [get_ratios_by_loss](https://paddlepaddle.github.io/PaddleSlim/api/prune_api/#get_ratios_by_loss) function:
+
+
+```python
+loss = 0.01
+ratios = slim.prune.get_ratios_by_loss(s, loss)
+print(ratios)
+```
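
One simple way to read this step: for each parameter, keep the largest analyzed ratio whose accuracy loss stays within the threshold. The following is a minimal sketch of that idea, assuming the `{param_name: {ratio: loss}}` format. The helper `pick_ratios` is hypothetical, the loss values are made up, and the real `get_ratios_by_loss` may choose ratios differently, for example by interpolating between analyzed points.

```python
def pick_ratios(sensitivities, max_loss):
    """Hypothetical helper: largest analyzed ratio per parameter with loss <= max_loss."""
    ratios = {}
    for param_name, losses in sensitivities.items():
        allowed = [ratio for ratio, loss in losses.items() if loss <= max_loss]
        if allowed:  # parameters with no acceptable ratio are left unpruned
            ratios[param_name] = max(allowed)
    return ratios


# Made-up sensitivities for two parameters.
sens = {
    "conv2_1_sep_weights": {0.1: 0.002, 0.2: 0.008, 0.3: 0.05},
    "conv2_2_sep_weights": {0.1: 0.03, 0.2: 0.09, 0.3: 0.20},
}
print(pick_ratios(sens, max_loss=0.01))
```

In this example the second parameter is too sensitive at every analyzed ratio, so it does not appear in the result and would not be pruned.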
+
+### 8.2 Pruning training network
+
+
+```python
+pruner = slim.prune.Pruner()
+print("FLOPs before pruning: {}".format(slim.analysis.flops(train_program)))
+pruned_program, _, _ = pruner.prune(
+    train_program,
+    fluid.global_scope(),
+    params=ratios.keys(),
+    ratios=ratios.values(),
+    place=place)
+print("FLOPs after pruning: {}".format(slim.analysis.flops(pruned_program)))
+```
+
+### 8.3 Pruning test network
+
+Note: `only_graph` should be set to True when pruning the test network. See the [Pruner API](https://paddlepaddle.github.io/PaddleSlim/api/prune_api/#pruner).
+
+
+```python
+pruner = slim.prune.Pruner()
+print("FLOPs before pruning: {}".format(slim.analysis.flops(val_program)))
+pruned_val_program, _, _ = pruner.prune(
+    val_program,
+    fluid.global_scope(),
+    params=ratios.keys(),
+    ratios=ratios.values(),
+    place=place,
+    only_graph=True)
+print("FLOPs after pruning: {}".format(slim.analysis.flops(pruned_val_program)))
+```
+
+Get accuracy of pruned model on test dataset:
+
+```python
+test(pruned_val_program)
+```
+
+### 8.4 Train pruned model
+
+Train the pruned model:
+
+
+```python
+for data in train_reader():
+    acc1, acc5, loss = exe.run(pruned_program, feed=data_feeder.feed(data), fetch_list=outputs)
+print(np.mean(acc1), np.mean(acc5), np.mean(loss))
+```
+
+Get accuracy of model after training:
+
+```python
+test(pruned_val_program)
+```
diff --git a/_sources/tutorials/index.rst.txt b/_sources/tutorials/index.rst.txt
index bc4a29ea981819c1ae935b5e891b5f600fa0adb1..e6109b73b6c3bd5c370522cce67f62005f255a65 100644
--- a/_sources/tutorials/index.rst.txt
+++ b/_sources/tutorials/index.rst.txt
@@ -3,9 +3,13 @@
========
.. toctree::
- :maxdepth: 2
- :caption: Contents:
+ :maxdepth: 1
image_classification_sensitivity_analysis_tutorial.md
- image_classification_nas_quick_start.ipynb
-
+ darts_nas_turorial.md
+ paddledetection_slim_distillation_tutorial.md
+ paddledetection_slim_nas_tutorial.md
+ paddledetection_slim_pruing_tutorial.md
+ paddledetection_slim_prune_dist_tutorial.md
+ paddledetection_slim_quantization_tutorial.md
+ paddledetection_slim_sensitivy_tutorial.md
diff --git a/_sources/tutorials/index_en.rst.txt b/_sources/tutorials/index_en.rst.txt
index 44241358d263d556ebf9e5329ecbaba73498fe30..10bdbd38761f9f1c72a0cb427636cd3431986cd7 100644
--- a/_sources/tutorials/index_en.rst.txt
+++ b/_sources/tutorials/index_en.rst.txt
@@ -4,5 +4,5 @@ Aadvanced Tutorials
.. toctree::
:maxdepth: 1
- sensitivity_tutorial_en.md
+ image_classification_sensitivity_analysis_tutorial_en.md
diff --git a/_sources/tutorials/paddledetection_slim_distillation_tutorial.md.txt b/_sources/tutorials/paddledetection_slim_distillation_tutorial.md.txt
new file mode 100644
index 0000000000000000000000000000000000000000..dbdd30fff6b4b014e907a3af234735597a20db63
--- /dev/null
+++ b/_sources/tutorials/paddledetection_slim_distillation_tutorial.md.txt
@@ -0,0 +1,32 @@
+# Object Detection Model Distillation Tutorial
+
+For the tutorial itself, see: https://github.com/PaddlePaddle/PaddleDetection/blob/release/0.2/slim/distillation/README.md
+
+
+## Example results
+
+### MobileNetV1-YOLO-V3-VOC
+
+| Model | Input size | Images per GPU | Inference speed (fps) | Box AP | Download |
+|:-:|:-:|:-:|:-:|:-:|:-:|
+|baseline|608 |16|104.291|76.2|[download](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1_voc.tar)|
+|distilled|608 |16|106.914|79.0|[download](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_mobilenetv1_voc_distilled.tar)|
+|baseline|416 |16|-|76.7|[download](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1_voc.tar)|
+|distilled|416 |16|-|78.2|[download](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_mobilenetv1_voc_distilled.tar)|
+|baseline|320 |16|-|75.3|[download](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1_voc.tar)|
+|distilled|320 |16|-|75.5|[download](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_mobilenetv1_voc_distilled.tar)|
+
+> The distilled results were obtained with ResNet34-YOLO-V3 as the teacher, training for 90000 iterations on 4 GPUs with a total batch size of 64.
+
+### MobileNetV1-YOLO-V3-COCO
+
+| Model | Input size | Images per GPU | Inference speed (fps) | Box AP | Download |
+|:-:|:-:|:-:|:-:|:-:|:-:|
+|baseline|608 |16|78.302|29.3|[download](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1_voc.tar)|
+|distilled|608 |16|78.523|31.4|[download](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_mobilenetv1_coco_distilled.tar)|
+|baseline|416 |16|-|29.3|[download](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1_voc.tar)|
+|distilled|416 |16|-|30.0|[download](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_mobilenetv1_coco_distilled.tar)|
+|baseline|320 |16|-|27.0|[download](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1_voc.tar)|
+|distilled|320 |16|-|27.1|[download](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_mobilenetv1_coco_distilled.tar)|
+
+> The distilled results were obtained with ResNet34-YOLO-V3 as the teacher, training for 600000 iterations on 4 GPUs with a total batch size of 64.
diff --git a/_sources/tutorials/paddledetection_slim_nas_tutorial.md.txt b/_sources/tutorials/paddledetection_slim_nas_tutorial.md.txt
new file mode 100644
index 0000000000000000000000000000000000000000..b4aa6ad133f7cadc744c11912850c66f5868dc19
--- /dev/null
+++ b/_sources/tutorials/paddledetection_slim_nas_tutorial.md.txt
@@ -0,0 +1,8 @@
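+# Neural Architecture Search Tutorial for Small Face Detection Models
+
+For the tutorial itself, see: https://github.com/PaddlePaddle/PaddleDetection/blob/release/0.2/slim/nas/README.md
+
+## Overview
+
+We use the BlazeFace face detection model as the neural architecture search example. The example uses PaddleSlim to run the search experiment.
+When searching with PaddleSlim, the search constraint can be either a limit on floating-point operations (FLOPs) or a limit on hardware latency. A hardware latency constraint requires a latency table. This example provides a latency table for the blazeface search space, named latency_855.txt (latencies measured with PaddleLite on Snapdragon 855), which can be used directly for the blazeface hardware latency search experiment.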
diff --git a/_sources/tutorials/paddledetection_slim_pruing_tutorial.md.txt b/_sources/tutorials/paddledetection_slim_pruing_tutorial.md.txt
new file mode 100644
index 0000000000000000000000000000000000000000..defd371ccef152a00670cab2c93aa8745688bae0
--- /dev/null
+++ b/_sources/tutorials/paddledetection_slim_pruing_tutorial.md.txt
@@ -0,0 +1,3 @@
+# Object Detection Model Convolution Channel Pruning Tutorial
+
+See: https://github.com/PaddlePaddle/PaddleDetection/blob/release/0.2/slim/prune/README.md
diff --git a/_sources/tutorials/paddledetection_slim_prune_dist_tutorial.md.txt b/_sources/tutorials/paddledetection_slim_prune_dist_tutorial.md.txt
new file mode 100644
index 0000000000000000000000000000000000000000..5697e968da17681d1f8b4f0cdb38eb2a8d19a43d
--- /dev/null
+++ b/_sources/tutorials/paddledetection_slim_prune_dist_tutorial.md.txt
@@ -0,0 +1,7 @@
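+# Object Detection Model Distillation and Pruning Tutorial
+
+For the tutorial itself, see: https://github.com/PaddlePaddle/PaddleDetection/blob/release/0.2/slim/extensions/distill_pruned_model/README.md
+
+## Overview
+
+This document describes how to use PaddleSlim's distillation interface and convolution channel pruning interface to prune the channels of the convolution layers of a model from the detection library, and then distill the pruned model with a more accurate model.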
diff --git a/_sources/tutorials/paddledetection_slim_quantization_tutorial.md.txt b/_sources/tutorials/paddledetection_slim_quantization_tutorial.md.txt
new file mode 100644
index 0000000000000000000000000000000000000000..c3ea66ced7cb539ef7ed365cf852a442be091f18
--- /dev/null
+++ b/_sources/tutorials/paddledetection_slim_quantization_tutorial.md.txt
@@ -0,0 +1,28 @@
+# Object Detection Model Fixed-Point Quantization Tutorial
+
+For the tutorial itself, see: https://github.com/PaddlePaddle/PaddleDetection/blob/release/0.2/slim/quantization/README.md
+
+
+## Example results
+
+### Training strategy
+
+- In the quantization strategy column, `post` denotes a model obtained with offline (post-training) quantization, and `aware` denotes a model obtained with online quantization-aware training.
+
+### YOLOv3 on COCO
+
+| Backbone | Pretrained weights | Quantization strategy | Input size | Box AP | Download |
+| :----------------| :--------: | :------: | :------: |:------: | :-----------------------------------------------------: |
+| MobileNetV1 | ImageNet | post | 608 | 27.9 | [download](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_mobilenetv1_coco_quant_post.tar) |
+| MobileNetV1 | ImageNet | post | 416 | 28.0 | [download](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_mobilenetv1_coco_quant_post.tar) |
+| MobileNetV1 | ImageNet | post | 320 | 26.0 | [download](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_mobilenetv1_coco_quant_post.tar) |
+| MobileNetV1 | ImageNet | aware | 608 | 28.1 | [download](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_mobilenetv1_coco_quant_aware.tar) |
+| MobileNetV1 | ImageNet | aware | 416 | 28.2 | [download](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_mobilenetv1_coco_quant_aware.tar) |
+| MobileNetV1 | ImageNet | aware | 320 | 25.8 | [download](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_mobilenetv1_coco_quant_aware.tar) |
+| ResNet34 | ImageNet | post | 608 | 35.7 | [download](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_r34_coco_quant_post.tar) |
+| ResNet34 | ImageNet | aware | 608 | 35.2 | [download](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_r34_coco_quant_aware.tar) |
+| ResNet34 | ImageNet | aware | 416 | 33.3 | [download](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_r34_coco_quant_aware.tar) |
+| ResNet34 | ImageNet | aware | 320 | 30.3 | [download](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_r34_coco_quant_aware.tar) |
+| R50vd-dcn | object365 | aware | 608 | 40.6 | [download](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_r50vd_dcn_obj365_pretrained_coco_quant_aware.tar) |
+| R50vd-dcn | object365 | aware | 416 | 37.5 | [download](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_r50vd_dcn_obj365_pretrained_coco_quant_aware.tar) |
+| R50vd-dcn | object365 | aware | 320 | 34.1 | [download](https://paddlemodels.bj.bcebos.com/PaddleSlim/yolov3_r50vd_dcn_obj365_pretrained_coco_quant_aware.tar) |
diff --git a/_sources/tutorials/paddledetection_slim_sensitivy_tutorial.md.txt b/_sources/tutorials/paddledetection_slim_sensitivy_tutorial.md.txt
new file mode 100644
index 0000000000000000000000000000000000000000..5e2c7bd54c8f5344f2348bc397a5da569548512c
--- /dev/null
+++ b/_sources/tutorials/paddledetection_slim_sensitivy_tutorial.md.txt
@@ -0,0 +1,3 @@
+# Object Detection Model Sensitivity Analysis Tutorial
+
+For the tutorial itself, see: https://github.com/PaddlePaddle/PaddleDetection/blob/release/0.2/slim/sensitive/README.md
diff --git a/algo/algo.html b/algo/algo.html
index 6c5543e708fcd4a073442ce593951c580d7a9d58..359f04b2926c74132aefad3b64cdb29921fae273 100644
--- a/algo/algo.html
+++ b/algo/algo.html
@@ -81,7 +81,7 @@
English Documents
-PaddleSlim简介
+介绍
安装
快速开始
进阶教程
@@ -90,39 +90,39 @@
算法原理
目录
1. Quantization Aware Training量化介绍
-1.1 背景
-1.2 量化原理
-2. 卷积核剪裁原理
-2.1 剪裁卷积核
+2. 卷积核剪裁原理
+2.1 剪裁卷积核
2.2 Uniform剪裁卷积网络
-2.3 基于敏感度剪裁卷积网络
-3. 蒸馏
-4. 轻量级模型结构搜索
@@ -174,16 +174,16 @@
1. Quantization Aware Training量化介绍
-
-
1.1 背景
+
+
1.1 背景
近年来,定点量化使用更少的比特数(如8-bit、3-bit、2-bit等)表示神经网络的权重和激活已被验证是有效的。定点量化的优点包括低内存带宽、低功耗、低计算资源占用以及低模型存储需求等。
@@ -194,64 +194,62 @@
表2:模型量化前后精度对比
目前,学术界主要将量化分为两大类:Post Training Quantization
和Quantization Aware Training
。Post Training Quantization
是指使用KL散度、滑动平均等方法确定量化参数且不需要重新训练的定点量化方法。Quantization Aware Training
是在训练过程中对量化进行建模以确定量化参数,它与Post Training Quantization
模式相比可以提供更高的预测精度。
-
-
1.2 量化原理
-
-
1.2.1 量化方式
+
+
1.2 量化原理
+
+
1.2.1 量化方式
目前,存在着许多方法可以将浮点数量化成定点数。例如:
-$$ r = min(max(x, a), b)$$ $$ s = \frac{b - a}{n - 1} $$ $$ q = \left \lfloor \frac{r - a}{s} \right \rceil $$
-式中,$x$是待量化的浮点值,$[a, b]$是量化范围,$a$是待量化浮点数中的最小值, $b$ 是待量化浮点数中的最大值。$\left \lfloor \right \rceil$ 表示将结果四舍五入到最近的整数。如果量化级别为$k$,则$n$为$2^k$。例如,若$k$为8,则$n$为256。$q$是量化得到的整数。
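+$$ r = min(max(x, a), b)$$ $$ s = \frac{b - a}{n - 1} $$ $$ q = \left \lfloor \frac{r - a}{s} \right \rceil $$
+Here $x$ is the floating-point value to be quantized, $[a, b]$ is the quantization range, $a$ is the minimum and $b$ is the maximum of the floating-point values to be quantized. $\left \lfloor \right \rceil$ denotes rounding to the nearest integer. If the quantization level is $k$, then $n$ is $2^k$. For example, if $k$ is 8, then $n$ is 256. $q$ is the resulting quantized integer.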
PaddleSlim框架中选择的量化方法为最大绝对值量化(max-abs
),具体描述如下:
-$$ M = max(abs(x)) $$ $$ q = \left \lfloor \frac{x}{M} * (n - 1) \right \rceil $$
-式中,$x$是待被量化的浮点值,$M$是待量化浮点数中的绝对值最大值。$\left \lfloor \right \rceil$表示将结果四舍五入到最近的整数。对于8bit量化,PaddleSlim采用int8_t
,即$n=2^7=128$。$q$是量化得到的整数。
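+$$ M = max(abs(x)) $$ $$ q = \left \lfloor \frac{x}{M} * (n - 1) \right \rceil $$
+Here $x$ is the floating-point value to be quantized and $M$ is the maximum absolute value among the values to be quantized. $\left \lfloor \right \rceil$ denotes rounding to the nearest integer. For 8-bit quantization, PaddleSlim uses int8_t
, that is, $n=2^7=128$. $q$ is the resulting quantized integer.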
无论是min-max量化
还是max-abs量化
,他们都可以表示为如下形式:
$q = scale * r + b$
其中min-max
和max-abs
被称为量化参数或者量化比例或者量化范围。
-
-
1.2.2 量化训练
-
-
1.2.2.1 前向传播
+
+
1.2.2 量化训练
+
+
1.2.2.1 前向传播
前向传播过程采用模拟量化的方式,具体描述如下:
图1:基于模拟量化训练的前向过程
-
由图1可知,基于模拟量化训练的前向过程可被描述为以下四个部分:
-
-输入和权重均被量化成8-bit整数。
-在8-bit整数上执行矩阵乘法或者卷积操作。
-反量化矩阵乘法或者卷积操作的输出结果为32-bit浮点型数据。
-在32-bit浮点型数据上执行偏置加法操作。此处,偏置并未被量化。
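+As Figure 1 shows, the forward pass of simulated-quantization training consists of four parts:
+1) Both the input and the weights are quantized to 8-bit integers.
+2) The matrix multiplication or convolution is performed on the 8-bit integers.
+3) The output of the matrix multiplication or convolution is dequantized to 32-bit floating-point data.
+4) The bias is added on the 32-bit floating-point data. The bias itself is not quantized.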
对于通用矩阵乘法(GEMM
),输入$X$和权重$W$的量化操作可被表述为如下过程:
-$$ X_q = \left \lfloor \frac{X}{X_m} * (n - 1) \right \rceil $$ $$ W_q = \left \lfloor \frac{W}{W_m} * (n - 1) \right \rceil $$
+$$ X_q = \left \lfloor \frac{X}{X_m} * (n - 1) \right \rceil $$ $$ W_q = \left \lfloor \frac{W}{W_m} * (n - 1) \right \rceil $$
执行通用矩阵乘法:
-$$ Y_q = X_q * W_q $$
+$$ Y_q = X_q * W_q $$
对量化乘积结果$Yq$进行反量化:
$$
-\begin{align}
-Y_{dq} = \frac{Y_q}{(n - 1) * (n - 1)} * X_m * W_m =\frac{X_q * W_q}{(n - 1) * (n - 1)} * X_m * W_m =(\frac{X_q}{n - 1} * X_m) * (\frac{W_q}{n - 1} * W_m) \end{align}
+\begin{align}
+Y_{dq} = \frac{Y_q}{(n - 1) * (n - 1)} * X_m * W_m =\frac{X_q * W_q}{(n - 1) * (n - 1)} * X_m * W_m =(\frac{X_q}{n - 1} * X_m) * (\frac{W_q}{n - 1} * W_m) \end{align}
$$
-上述公式表明反量化操作可以被移动到GEMM
之前,即先对$Xq$和$Wq$执行反量化操作再做GEMM
操作。因此,前向传播的工作流亦可表示为如下方式:
-
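+The formula above shows that the dequantization can be moved before
GEMM
: first dequantize $X_q$ and $W_q$, then perform
GEMM
. The forward workflow can therefore also be expressed as follows: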
图2:基于模拟量化训练前向过程的等价工作流
训练过程中,PaddleSlim使用图2中所示的等价工作流。在设计中,量化Pass在IrGraph中插入量化op和反量化op。因为在连续的量化、反量化操作之后输入仍然为32-bit浮点型数据。因此,PaddleSlim量化训练框架所采用的量化方式被称为模拟量化。
-
-
1.2.2.2 反向传播
+
+
1.2.2.2 反向传播
由图3可知,权重更新所需的梯度值可以由量化后的权重和量化后的激活求得。反向传播过程中的所有输入和输出均为32-bit浮点型数据。注意,梯度更新操作需要在原始权重上进行,即计算出的梯度将被加到原始权重上而非量化后或反量化后的权重上。
图3:基于模拟量化训练的反向传播和权重更新过程
因此,量化Pass也会改变相应反向算子的某些输入。
-
-
1.2.2.3 确定量化比例系数
+
+
1.2.2.3 确定量化比例系数
存在着两种策略可以计算求取量化比例系数,即动态策略和静态策略。动态策略会在每次迭代过程中计算量化比例系数的值。静态策略则对不同的输入采用相同的量化比例系数。
对于权重而言,在训练过程中采用动态策略。换句话说,在每次迭代过程中量化比例系数均会被重新计算得到直至训练过程结束。
对于激活而言,可以选择动态策略也可以选择静态策略。若选择使用静态策略,则量化比例系数会在训练过程中被评估求得,且在推断过程中被使用(不同的输入均保持不变)。静态策略中的量化比例系数可于训练过程中通过如下三种方式进行评估:
-
+
在一个窗口中计算激活最大绝对值的平均值。
在一个窗口中计算激活最大绝对值的最大值。
在一个窗口中计算激活最大绝对值的滑动平均值,计算公式如下:
@@ -260,8 +258,8 @@ $$
式中,$V$ 是当前batch的最大绝对值, $Vt$是滑动平均值。$k$是一个因子,例如其值可取为0.9。
-
-
1.2.4 训练后量化
+
+
1.2.4 训练后量化
训练后量化是基于采样数据,采用KL散度等方法计算量化比例因子的方法。相比量化训练,训练后量化不需要重新训练,可以快速得到量化模型。
训练后量化的目标是求取量化比例因子,主要有两种方法:非饱和量化方法 ( No Saturation) 和饱和量化方法 (Saturation)。非饱和量化方法计算FP32类型Tensor中绝对值的最大值abs_max
,将其映射为127,则量化比例因子等于abs_max/127
。饱和量化方法使用KL散度计算一个合适的阈值T
(0<T<mab_max
),将其映射为127,则量化比例因子等于T/127
。一般而言,对于待量化op的权重Tensor,采用非饱和量化方法,对于待量化op的激活Tensor(包括输入和输出),采用饱和量化方法 。
训练后量化的实现步骤如下:
@@ -275,19 +273,19 @@ $$
-
-
2. 卷积核剪裁原理
+
+
2. 卷积核剪裁原理
该策略参考paper: Pruning Filters for Efficient ConvNets
该策略通过减少卷积层中卷积核的数量,来减小模型大小和降低模型计算复杂度。
-
-
2.1 剪裁卷积核
+
+
2.1 剪裁卷积核
剪裁注意事项1
-剪裁一个conv layer的filter,需要修改后续conv layer的filter. 如图4 所示,剪掉Xi的一个filter,会导致$X_{i+1}$少一个channel, $X_{i+1}$对应的filter在input_channel纬度上也要减1.
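+Pruning a filter of one conv layer requires modifying the filters of the following conv layer. As
Figure 4 shows, pruning one filter of $X_i$ leaves $X_{i+1}$ with one fewer channel, so the filters of $X_{i+1}$ must also shrink by 1 along the input_channel dimension.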
图4
剪裁注意事项2
-
如图5 所示,剪裁完$X_i$之后,根据注意事项1我们从$X_{i+1}$的filter中删除了一行(图中蓝色行),在计算$X_{i+1}$的filters的l1_norm(图中绿色一列)的时候,有两种选择:
+
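 As Figure 5 shows, after pruning $X_i$ we remove one row (the blue row in the figure) from the filters of $X_{i+1}$, following note 1. When computing the l1_norm of the filters of $X_{i+1}$ (the green column in the figure), there are two options: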
算上被删除的一行:independent pruning
减去被删除的一行:greedy pruning
@@ -295,7 +293,7 @@ $$
图5
剪裁注意事项3
在对ResNet等复杂网络剪裁的时候,还要考虑到后当前卷积层的修改对上一层卷积层的影响。
-如图6 所示,在对residual block剪裁时,$X_{i+1}$层如何剪裁取决于project shortcut的剪裁结果,因为我们要保证project shortcut的output和$X_{i+1}$的output能被正确的concat.
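+As
Figure 6 shows, when pruning a residual block, how layer $X_{i+1}$ is pruned depends on how the projection shortcut is pruned, because the output of the projection shortcut and the output of $X_{i+1}$ must still be concatenated correctly.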
图6
@@ -305,11 +303,11 @@ $$
每层剪裁一样比例的卷积核。
在剪裁一个卷积核之前,按l1_norm对filter从高到低排序,越靠后的filter越不重要,优先剪掉靠后的filter.
-
-
2.3 基于敏感度剪裁卷积网络
+
+
2.3 基于敏感度剪裁卷积网络
根据每个卷积层敏感度的不同,剪掉不同比例的卷积核。
-
-
两个假设
+
+
两个假设
在一个conv layer的parameter内部,按l1_norm对filter从高到低排序,越靠后的filter越不重要。
两个layer剪裁相同的比例的filters,我们称对模型精度影响更大的layer的敏感度相对高。
@@ -322,24 +320,24 @@ $$
优先剪裁layer内l1_norm相对低的filter
-
-
敏感度的理解
+
+
敏感度的理解
图7
如图7 所示,横坐标是将filter剪裁掉的比例,竖坐标是精度的损失,每条彩色虚线表示的是网络中的一个卷积层。
以不同的剪裁比例单独 剪裁一个卷积层,并观察其在验证数据集上的精度损失,并绘出图7 中的虚线。虚线上升较慢的,对应的卷积层相对不敏感,我们优先剪不敏感的卷积层的filter.
-
-
选择最优的剪裁率组合
+
+
选择最优的剪裁率组合
我们将图7 中的折线拟合为图8 中的曲线,每在竖坐标轴上选取一个精度损失值,就在横坐标轴上对应着一组剪裁率,如图8 中黑色实线所示。
用户给定一个模型整体的剪裁率,我们通过移动图5 中的黑色实线来找到一组满足条件的且合法的剪裁率。
图8
-
-
迭代剪裁
+
+
迭代剪裁
考虑到多个卷积层间的相关性,一个卷积层的修改可能会影响其它卷积层的敏感度,我们采取了多次剪裁的策略,步骤如下:
step1: 统计各卷积层的敏感度信息
@@ -350,63 +348,70 @@ $$
-
-
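3. Distillation
-
In general, the more parameters a model has and the more complex its structure, the better it performs, but the more redundant its parameters become and the more computation and resources it consumes. Model distillation extracts the useful information from a complex network and transfers it to a smaller one. Our toolkit supports two distillation methods.
-The first is the traditional distillation method (reference paper: Distilling the Knowledge in a Neural Network)
-A complex network serves as the teacher model and supervises the training of a student model with fewer parameters and less computation. The teacher can be one or more pre-trained high-performance models. The student is trained with two objectives: the original objective, which is the cross-entropy between the class probabilities output by the student and the labels, called the hard target; and the cross-entropy between the class probabilities output by the student and those output by the teacher, called the soft target. The two losses are weighted and summed into the final training loss, which supervises the training of the student model.
-The second is the FSP-based distillation method (reference paper: A Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning)
-Whereas traditional distillation has the small model fit the outputs of the large model directly, this method has the small model fit the transformations between the features of different layers of the large model. It uses an FSP matrix (the inner product of features) to represent the relationship between the features of different layers. Several FSP matrices are computed between layers of the large model and of the small model respectively, and an L2 loss then drives the FSP matrices of the small model to match those of the corresponding layers of the large model, as shown in the figure below. Intuitively, if distillation is likened to a teacher (the large model) teaching a student (the small model) to solve a problem, traditional distillation tells the small model the answer directly, while learning FSP matrices lets the small model learn the intermediate process and method of solving the problem, so it learns more information.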
+
+
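3. Distillation
+
+
+In general, the more parameters a model has and the more complex its structure, the better it performs, but the more redundant its parameters become and the more computation and resources it consumes. Model distillation extracts the useful information from a complex network and transfers it to a smaller one. Our toolkit supports two distillation methods.
+The first is the traditional distillation method (reference paper: Distilling the Knowledge in a Neural Network)
+
+
A complex network serves as the teacher model and supervises the training of a student model with fewer parameters and less computation. The teacher can be one or more pre-trained high-performance models. The student is trained with two objectives: the original objective, which is the cross-entropy between the class probabilities output by the student and the labels, called the hard target; and the cross-entropy between the class probabilities output by the student and those output by the teacher, called the soft target. The two losses are weighted and summed into the final training loss, which supervises the training of the student model.
+The second is the FSP-based distillation method (reference paper: `A Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning <http://openaccess.thecvf.com/content_cvpr_2017/papers/Yim_A_Gift_From_CVPR_2017_paper.pdf>`_)
+Whereas traditional distillation has the small model fit the outputs of the large model directly, this method has the small model fit the transformations between the features of different layers of the large model. It uses an FSP matrix (the inner product of features) to represent the relationship between the features of different layers. Several FSP matrices are computed between layers of the large model and of the small model respectively, and an L2 loss then drives the FSP matrices of the small model to match those of the corresponding layers of the large model, as shown in the figure below. Intuitively, if distillation is likened to a teacher (the large model) teaching a student (the small model) to solve a problem, traditional distillation tells the small model the answer directly, while learning FSP matrices lets the small model learn the intermediate process and method of solving the problem, so it learns more information.
+
Figure 9
-
Because the small and large models are supervised through an L2 loss, the two FSP matrices must have the same dimensions. An FSP matrix has dimensions M*N, where M and N are the channel counts of the input and output features, so the FSP matrices of the large and small models must correspond one to one.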
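The combination of hard target and soft target described above can be written down as a small pure-Python sketch. The weight `alpha` and the function names are assumptions made for illustration; the text only states that the two losses are weighted, not how, and this sketch does not use a softmax temperature.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, pred_probs):
    """Cross-entropy between a target distribution and a predicted one."""
    return -sum(t * math.log(p)
                for t, p in zip(target_probs, pred_probs) if t > 0)


def distillation_loss(student_logits, teacher_logits, label, alpha=0.5):
    """alpha * hard-target loss + (1 - alpha) * soft-target loss (alpha is hypothetical)."""
    student = softmax(student_logits)
    teacher = softmax(teacher_logits)
    one_hot = [1.0 if i == label else 0.0 for i in range(len(student))]
    hard = cross_entropy(one_hot, student)   # student vs. label
    soft = cross_entropy(teacher, student)   # student vs. teacher
    return alpha * hard + (1 - alpha) * soft


print(distillation_loss([2.0, 0.5, 0.1], [3.0, 0.2, 0.1], label=0))
```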
-
-
-
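4. Lightweight Model Architecture Search
+
+
+
+Because the small and large models are supervised through an L2 loss, the two FSP matrices must have the same dimensions. An FSP matrix has dimensions M*N, where M and N are the channel counts of the input and output features, so the FSP matrices of the large and small models must correspond one to one.
+
+
4. Lightweight Model Architecture Search
Deep learning models have achieved good results on many tasks, and the quality of the network architecture strongly affects the final model. Designing networks by hand requires rich experience and many trials, and the numerous hyperparameters and architecture parameters produce a combinatorial explosion that makes plain random search almost infeasible, so Neural Architecture Search (NAS) has become a research hotspot in recent years. Unlike traditional NAS, we focus on searching for architectures that are both accurate and fast, and we refer to this capability as Light-NAS.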
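To make the M*N shape constraint concrete, the following pure-Python sketch computes an FSP matrix from two feature maps and compares a student matrix with a teacher matrix. It assumes features are nested lists of shape `[channels][height][width]` with identical spatial size, normalizes by `height * width`, and uses the mean squared difference as the L2 loss. The function names are hypothetical and do not come from PaddleSlim.

```python
def fsp_matrix(feat_in, feat_out):
    """FSP matrix of shape M x N from features with M and N channels."""
    h, w = len(feat_in[0]), len(feat_in[0][0])
    return [[sum(a[y][x] * b[y][x] for y in range(h) for x in range(w)) / (h * w)
             for b in feat_out]
            for a in feat_in]


def fsp_l2_loss(student_fsp, teacher_fsp):
    """Mean squared difference; both matrices must have the same M x N shape."""
    if (len(student_fsp) != len(teacher_fsp)
            or len(student_fsp[0]) != len(teacher_fsp[0])):
        raise ValueError("FSP matrices must have identical dimensions")
    diffs = [(s - t) ** 2
             for row_s, row_t in zip(student_fsp, teacher_fsp)
             for s, t in zip(row_s, row_t)]
    return sum(diffs) / len(diffs)


# 2 input channels and 3 output channels on a 1x2 feature map.
feat_in = [[[1.0, 2.0]], [[0.0, 1.0]]]
feat_out = [[[1.0, 1.0]], [[2.0, 0.0]], [[0.0, 3.0]]]
teacher = fsp_matrix(feat_in, feat_out)
print(teacher)                       # a 2 x 3 matrix
print(fsp_l2_loss(teacher, teacher)) # identical matrices give zero loss
```

If the student and teacher layers have different channel counts, `fsp_l2_loss` raises an error, which is why the chosen layer pairs have to correspond one to one.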
-
-
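4.1 Search Strategy
+
+
4.1 Search Strategy
The search strategy defines which algorithm is used to find the optimal network architecture configuration quickly and accurately. Common search methods include reinforcement learning, Bayesian optimization, evolutionary algorithms, and gradient-based algorithms. Our current implementation is mainly based on simulated annealing.
-
-
4.1.1 Simulated Annealing
+
+
4.1.1 Simulated Annealing
Simulated annealing derives from the annealing of solids: a solid is heated to a sufficiently high temperature and then cooled slowly. While heating, the particles inside the solid become disordered as the temperature rises and the internal energy increases. While cooling slowly, the particles gradually become ordered, reaching equilibrium at each temperature and finally the ground state at room temperature, where the internal energy is minimal.
Given the similarity between the physical annealing of solids and general combinatorial optimization problems, we apply it to network architecture search.
The process of searching for a model with simulated annealing is as follows: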
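$$
-T_k = T_0*\theta^k
+T_k = T_0*\theta^k
$$
-
\begin{equation}
+
\begin{equation}
P(r_k) =
-\begin{cases}
-e^{\frac{(r_k-r)}{T_k}} & r_k < r \\
-1 & r_k \geq r
-\end{cases}
-\end{equation}
-
At iteration k the searched network is $N_k$. After training $N_k$ for several epochs, its reward on the test set is $r_k$, and $r_k$ is accepted with probability $P(r_k)$, that is, $r=r_k$ is executed. $r$ is initialized to 0 at the start of the search. $T_0$ is the initial temperature, $\theta$ is the temperature decay coefficient, and $T_k$ is the temperature at iteration k.
+\begin{cases}
+e^{\frac{(r_k-r)}{T_k}} & r_k < r \\
+1 & r_k \geq r
+\end{cases}
+\end{equation}
+
At iteration k the searched network is $N_k$. After training $N_k$ for several epochs, its reward on the test set is $r_k$, and $r_k$ is accepted with probability $P(r_k)$, that is, $r=r_k$ is executed. $r$ is initialized to 0 at the start of the search. $T_0$ is the initial temperature, $\theta$ is the temperature decay coefficient, and $T_k$ is the temperature at iteration k.
In our NAS task, unlike RL, which regenerates a complete network each time, we map the network architecture to a code. The code is randomly initialized the first time, and each subsequent step randomly modifies part of it (corresponding to part of the architecture) to produce a new code, which is then mapped back to an architecture. The reward, obtained by combining the accuracy after training for a certain number of epochs on the training set with the network latency, guides the convergence of the annealing algorithm.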
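The temperature schedule and acceptance rule above can be turned into a short search loop. The sketch below is an illustration under assumptions, not the SANAS implementation: the architecture code is a list of integer tokens, `token_ranges[i]` is the number of values token `i` may take, and `reward_fn` is a hypothetical stand-in for "train for some epochs and combine accuracy with latency".

```python
import math
import random


def temperature(t0, theta, k):
    """T_k = T_0 * theta^k."""
    return t0 * theta ** k


def accept_prob(r_k, r, t_k):
    """P(r_k) = exp((r_k - r) / T_k) if r_k < r, else 1."""
    if r_k >= r:
        return 1.0
    return math.exp((r_k - r) / t_k)


def anneal(reward_fn, init_tokens, token_ranges, t0=1.0, theta=0.9,
           steps=100, seed=0):
    """Mutate one token per step and accept it by the annealing rule."""
    rng = random.Random(seed)
    current, r = list(init_tokens), 0.0   # r starts at 0, as in the text
    best_tokens, best_reward = list(current), float("-inf")
    for k in range(steps):
        candidate = list(current)
        i = rng.randrange(len(candidate))            # modify part of the code
        candidate[i] = rng.randrange(token_ranges[i])
        r_k = reward_fn(candidate)
        if r_k > best_reward:
            best_tokens, best_reward = list(candidate), r_k
        if rng.random() < accept_prob(r_k, r, temperature(t0, theta, k)):
            current, r = candidate, r_k              # execute r = r_k
    return best_tokens, best_reward


# Toy reward in [0, 1]: larger tokens are "better" (purely illustrative).
ranges = [4, 4, 4]
toy_reward = lambda tokens: sum(tokens) / 9.0
best_tokens, best_reward = anneal(toy_reward, [0, 0, 0], ranges)
print(best_tokens, best_reward)
```

Early in the search the temperature is high, so worse candidates are accepted fairly often; as $T_k$ decays, the loop becomes close to greedy.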
-
-
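4.2 Search Space
+
+
4.2 Search Space
The search space defines the variables of the optimization problem, and the number of variables determines the difficulty of the search and its running time, so a well-chosen search space is essential for fast search. In Light-NAS, to speed up the search, we divide a network into multiple blocks, first stack the blocks manually in a chain-like hierarchical structure, and then use the search algorithm to search the internal structure of each block automatically.
Because the goal is to find models that run fast on mobile devices, we follow the Linear Bottlenecks and Inverted residuals structures of MobileNetV2 and search the specific parameters of each Inverted residual, including kernel size, channel expansion factor, number of repeats, and number of channels, as shown in Figure 10:
Figure 10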
-
-
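4.3 Model Latency Evaluation
+
+
4.3 Model Latency Evaluation
The search supports both FLOPS constraints and model latency constraints. On hardware platforms such as Android/iOS mobile devices and development boards, repeatedly measuring model latency during the iterative search is both time-consuming and inconvenient, so we developed a latency evaluator to estimate the latency of the searched models. The latency it estimates deviates from the measured latency of the model by less than 10%.
The latency evaluator works in two stages: configuring the hardware latency evaluator and evaluating model latency. The configuration needs to run only once, whereas model latency is evaluated repeatedly during the search, once for each searched model.
-Configure the hardware latency evaluator
+Configure the hardware latency evaluator
Collect all distinct ops and their parameters in the search space
Measure the latency of each op and parameter combination
-Evaluate model latency
+Evaluate model latency
Collect all ops and their parameters of the given model
Use the latency evaluator to estimate the model latency from these ops and parameters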
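The two stages above resemble a lookup table that is built once and queried many times. The following sketch shows that idea under explicit assumptions: an op is a dict with a `type` and a `params` dict, `measure_fn` is a hypothetical callback that benchmarks one op on the target hardware, and the model latency is estimated as the sum of per-op latencies. The text does not state how the real evaluator combines per-op latencies, so the summation is an assumption.

```python
def op_key(op):
    """Hashable key for an op type plus its parameters."""
    return (op["type"], tuple(sorted(op["params"].items())))


def build_latency_table(search_space_ops, measure_fn):
    """Stage 1 (run once): measure every distinct op/parameter combination."""
    table = {}
    for op in search_space_ops:
        key = op_key(op)
        if key not in table:
            table[key] = measure_fn(op)
    return table


def estimate_latency(model_ops, table):
    """Stage 2 (run per candidate): sum the recorded latency of each op."""
    return sum(table[op_key(op)] for op in model_ops)


calls = []

def fake_measure(op):
    # Stand-in for an on-device benchmark; latency grows with kernel size.
    calls.append(op)
    return 0.5 * op["params"]["kernel_size"]

conv3 = {"type": "conv2d", "params": {"kernel_size": 3, "channels": 32}}
conv5 = {"type": "conv2d", "params": {"kernel_size": 5, "channels": 32}}
table = build_latency_table([conv3, conv5, conv3], fake_measure)
print(estimate_latency([conv3, conv3, conv5], table))
```

Because duplicates are skipped while building the table, each distinct op is benchmarked only once, regardless of how many candidate models reuse it.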
@@ -414,9 +419,9 @@ e^{\frac{(r_k-r)}{T_k}} & r_k < r\
-
-
5. References
-
+
diff --git a/api_cn/nas_api.html b/api_cn/nas_api.html
index d930d96d455e2b3965024f2ef71eefcbad1cd9d3..a195a4d51f620232df80ad226134caf6c0458f66 100644
--- a/api_cn/nas_api.html
+++ b/api_cn/nas_api.html
@@ -83,7 +83,7 @@
OneShotNAS
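-Multi-process distillation
+Pantheon: a large-scale scalable knowledge distillation framework
Convolution channel pruning
Quantization
Simple distillation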
@@ -263,7 +263,7 @@
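tokens(list): - A list of tokens. The length and value range of the tokens depend on the search space.
Returns:
-A model architecture instance built from the given tokens.
+A list of model architecture instances built from the given tokens.
Example code: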
import paddle.fluid as fluid
from paddleslim.nas import SANAS
@@ -284,6 +284,13 @@
Returns:
The best token and reward found so far in the search, and the token currently being trained, as a dict.
Example code:
+import paddle.fluid as fluid
+from paddleslim.nas import SANAS
+config = [('MobileNetV2Space')]
+sanas = SANAS(configs=config)
+print(sanas.current_info())
+
+
diff --git a/api_cn/one_shot_api.html b/api_cn/one_shot_api.html
index 71ce4040e1c461d4a236a0c4f182e825b877fe2f..7feb6a73d0c1e7873b89cc5ca11668e79ec54df3 100644
--- a/api_cn/one_shot_api.html
+++ b/api_cn/one_shot_api.html
@@ -35,7 +35,7 @@
-
+
@@ -83,7 +83,7 @@
-
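Multi-process distillation
+
Pantheon: a large-scale scalable knowledge distillation framework
Convolution channel pruning
Quantization
Simple distillation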
@@ -331,7 +331,7 @@ return x, acc