Unverified commit c46067f4 · Author: Chang Xu · Committer: GitHub

patch nas&dist docs (#761)

* update nas and distill docs
Parent a7ec2ee8
...@@ -47,8 +47,8 @@ For example, how to add a search space with resnet block. New search space can N ...@@ -47,8 +47,8 @@ For example, how to add a search space with resnet block. New search space can N
```python ```python
### import necessary head file ### import necessary head file
from .search_space_base import SearchSpaceBase from paddleslim.nas import SearchSpaceBase
from .search_space_registry import SEARCHSPACE from paddleslim.nas import SEARCHSPACE
import numpy as np import numpy as np
### use decorator SEARCHSPACE.register to register yourself search space to search space NameSpace ### use decorator SEARCHSPACE.register to register yourself search space to search space NameSpace
......
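The `SEARCHSPACE.register` decorator referenced in the hunk above follows the general registry-decorator pattern. The following is a minimal standard-library sketch of that pattern only; `Registry`, `SEARCH_SPACES`, and `ResNetBlockSketch` are hypothetical names chosen for illustration and are not PaddleSlim's implementation.

```python
# Hypothetical registry sketch; NOT PaddleSlim's actual SEARCHSPACE code.
class Registry:
    """Maps class names to classes so they can be looked up by string."""

    def __init__(self, name):
        self.name = name
        self._classes = {}

    def register(self, cls):
        # Used as a decorator: store the class under its own name and
        # return it unchanged so the class definition still binds normally.
        if cls.__name__ in self._classes:
            raise KeyError(f"{cls.__name__} is already registered in {self.name}")
        self._classes[cls.__name__] = cls
        return cls

    def get(self, key):
        return self._classes[key]


SEARCH_SPACES = Registry("searchspace")


@SEARCH_SPACES.register
class ResNetBlockSketch:
    """Placeholder search space; a real one would define tokens and layers."""

    def __init__(self, block_num=3):
        self.block_num = block_num


# Resolve the class from a string key, as a name-based config would.
space_cls = SEARCH_SPACES.get("ResNetBlockSketch")
space = space_cls(block_num=2)
```

In a design like this, a string key such as the `'MobileNetV2Space'` entries used in the SANAS configs elsewhere in this diff is the kind of name the lookup would resolve.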
...@@ -25,6 +25,7 @@ This tutorial trains and verifies distillation model on the MNIST dataset. The i ...@@ -25,6 +25,7 @@ This tutorial trains and verifies distillation model on the MNIST dataset. The i
Select `ResNet50` as the teacher to perform distillation training on the students of the` MobileNet` architecture. Select `ResNet50` as the teacher to perform distillation training on the students of the` MobileNet` architecture.
```python ```python
import paddleslim as slim
model = slim.models.MobileNet() model = slim.models.MobileNet()
student_program = fluid.Program() student_program = fluid.Program()
student_startup = fluid.Program() student_startup = fluid.Program()
......
...@@ -111,6 +111,7 @@ archs = sanas.next_archs()[0] ...@@ -111,6 +111,7 @@ archs = sanas.next_archs()[0]
### 7.2 build program ### 7.2 build program
Get program according to the function in Step3 and model architecture from Step 7.1. Get program according to the function in Step3 and model architecture from Step 7.1.
```python ```python
paddle.enable_static()
exe, train_program, eval_program, inputs, avg_cost, acc_top1, acc_top5 = build_program(archs) exe, train_program, eval_program, inputs, avg_cost, acc_top1, acc_top5 = build_program(archs)
``` ```
......
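...@@ -61,7 +61,7 @@ PaddleSlim provides three ways to build a supernet; the following introduces these three ...@@ -61,7 +61,7 @@ PaddleSlim provides three ways to build a supernet; the following introduces these three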
model = mobilenet_v1() model = mobilenet_v1()
sp_net_config = supernet(kernel_size=(3, 5, 7), expand_ratio=[1, 2, 4]) sp_net_config = supernet(kernel_size=(3, 5, 7), expand_ratio=[1, 2, 4])
sp_model = Convert(sp_net_config).convert(self.model) sp_model = Convert(sp_net_config).convert(model)
Method 2 Method 2
------------------ ------------------
......
...@@ -54,6 +54,8 @@ DistillConfig ...@@ -54,6 +54,8 @@ DistillConfig
.. code-block:: python .. code-block:: python
from paddleslim.nas.ofa import DistillConfig from paddleslim.nas.ofa import DistillConfig
from paddle.vision.models import mobilenet_v1
teacher_model = mobilenet_v1()
default_distill_config = { default_distill_config = {
'lambda_distill': 0.01, 'lambda_distill': 0.01,
'teacher_model': teacher_model, 'teacher_model': teacher_model,
......
...@@ -26,9 +26,10 @@ merge ...@@ -26,9 +26,10 @@ merge
**Usage example:** **Usage example:**
.. code-block:: python .. code-block:: python
import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
import paddleslim.dist as dist import paddleslim.dist as dist
paddle.enable_static()
student_program = fluid.Program() student_program = fluid.Program()
with fluid.program_guard(student_program): with fluid.program_guard(student_program):
x = fluid.layers.data(name='x', shape=[1, 28, 28]) x = fluid.layers.data(name='x', shape=[1, 28, 28])
...@@ -73,9 +74,10 @@ fsp_loss comes from the paper `A Gift from Knowledge Distillation: Fast Optimization, Net ...@@ -73,9 +74,10 @@ fsp_loss comes from the paper `A Gift from Knowledge Distillation: Fast Optimization, Net
**Usage example:** **Usage example:**
.. code-block:: python .. code-block:: python
import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
import paddleslim.dist as dist import paddleslim.dist as dist
paddle.enable_static()
student_program = fluid.Program() student_program = fluid.Program()
with fluid.program_guard(student_program): with fluid.program_guard(student_program):
x = fluid.layers.data(name='x', shape=[1, 28, 28]) x = fluid.layers.data(name='x', shape=[1, 28, 28])
...@@ -119,9 +121,10 @@ l2_loss ...@@ -119,9 +121,10 @@ l2_loss
**Usage example:** **Usage example:**
.. code-block:: python .. code-block:: python
import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
import paddleslim.dist as dist import paddleslim.dist as dist
paddle.enable_static()
student_program = fluid.Program() student_program = fluid.Program()
with fluid.program_guard(student_program): with fluid.program_guard(student_program):
x = fluid.layers.data(name='x', shape=[1, 28, 28]) x = fluid.layers.data(name='x', shape=[1, 28, 28])
...@@ -169,9 +172,10 @@ soft_label_loss comes from the paper `Distilling the Knowledge in a Neural Network <https ...@@ -169,9 +172,10 @@ soft_label_loss comes from the paper `Distilling the Knowledge in a Neural Network <https
**Usage example:** **Usage example:**
.. code-block:: python .. code-block:: python
import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
import paddleslim.dist as dist import paddleslim.dist as dist
paddle.enable_static()
student_program = fluid.Program() student_program = fluid.Program()
with fluid.program_guard(student_program): with fluid.program_guard(student_program):
x = fluid.layers.data(name='x', shape=[1, 28, 28]) x = fluid.layers.data(name='x', shape=[1, 28, 28])
...@@ -215,9 +219,10 @@ loss ...@@ -215,9 +219,10 @@ loss
**Usage example:** **Usage example:**
.. code-block:: python .. code-block:: python
import paddle
import paddle.fluid as fluid import paddle.fluid as fluid
import paddleslim.dist as dist import paddleslim.dist as dist
paddle.enable_static()
student_program = fluid.Program() student_program = fluid.Program()
with fluid.program_guard(student_program): with fluid.program_guard(student_program):
x = fluid.layers.data(name='x', shape=[1, 28, 28]) x = fluid.layers.data(name='x', shape=[1, 28, 28])
......
...@@ -49,7 +49,7 @@ SANAS (Simulated Annealing Neural Architecture Search) is based on simulated annealing ...@@ -49,7 +49,7 @@ SANAS (Simulated Annealing Neural Architecture Search) is based on simulated annealing
from paddleslim.nas import SANAS from paddleslim.nas import SANAS
config = [('MobileNetV2Space')] config = [('MobileNetV2Space')]
paddle.enable_static() paddle.enable_static()
sanas = SANAS(configs=config) sanas = SANAS(configs=config, server_addr=("", 8881))
.. note:: .. note::
...@@ -88,7 +88,7 @@ SANAS (Simulated Annealing Neural Architecture Search) is based on simulated annealing ...@@ -88,7 +88,7 @@ SANAS (Simulated Annealing Neural Architecture Search) is based on simulated annealing
from paddleslim.nas import SANAS from paddleslim.nas import SANAS
config = [('MobileNetV2Space')] config = [('MobileNetV2Space')]
paddle.enable_static() paddle.enable_static()
sanas = SANAS(configs=config) sanas = SANAS(configs=config, server_addr=("", 8882))
input = paddle.static.data(name='input', shape=[None, 3, 32, 32], dtype='float32') input = paddle.static.data(name='input', shape=[None, 3, 32, 32], dtype='float32')
archs = sanas.next_archs() archs = sanas.next_archs()
for arch in archs: for arch in archs:
...@@ -115,7 +115,7 @@ SANAS (Simulated Annealing Neural Architecture Search) is based on simulated annealing ...@@ -115,7 +115,7 @@ SANAS (Simulated Annealing Neural Architecture Search) is based on simulated annealing
from paddleslim.nas import SANAS from paddleslim.nas import SANAS
config = [('MobileNetV2Space')] config = [('MobileNetV2Space')]
paddle.enable_static() paddle.enable_static()
sanas = SANAS(configs=config) sanas = SANAS(configs=config, server_addr=("", 8883))
archs = sanas.next_archs() archs = sanas.next_archs()
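### Assume the score computed by the network is 1; real code should return the actual score. ### Assume the score computed by the network is 1; real code should return the actual score.
...@@ -142,7 +142,7 @@ SANAS (Simulated Annealing Neural Architecture Search) is based on simulated annealing ...@@ -142,7 +142,7 @@ SANAS (Simulated Annealing Neural Architecture Search) is based on simulated annealing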
from paddleslim.nas import SANAS from paddleslim.nas import SANAS
config = [('MobileNetV2Space')] config = [('MobileNetV2Space')]
paddle.enable_static() paddle.enable_static()
sanas = SANAS(configs=config) sanas = SANAS(configs=config, server_addr=("", 8884))
input = paddle.static.data(name='input', shape=[None, 3, 32, 32], dtype='float32') input = paddle.static.data(name='input', shape=[None, 3, 32, 32], dtype='float32')
tokens = ([0] * 25) tokens = ([0] * 25)
archs = sanas.tokens2arch(tokens)[0] archs = sanas.tokens2arch(tokens)[0]
...@@ -163,7 +163,7 @@ SANAS (Simulated Annealing Neural Architecture Search) is based on simulated annealing ...@@ -163,7 +163,7 @@ SANAS (Simulated Annealing Neural Architecture Search) is based on simulated annealing
from paddleslim.nas import SANAS from paddleslim.nas import SANAS
config = [('MobileNetV2Space')] config = [('MobileNetV2Space')]
paddle.enable_static() paddle.enable_static()
sanas = SANAS(configs=config) sanas = SANAS(configs=config, server_addr=("", 8885))
print(sanas.current_info()) print(sanas.current_info())
...@@ -233,7 +233,7 @@ RLNAS (Reinforcement Learning Neural Architecture Search) is based on reinforcement learning ...@@ -233,7 +233,7 @@ RLNAS (Reinforcement Learning Neural Architecture Search) is based on reinforcement learning
config = [('MobileNetV2Space')] config = [('MobileNetV2Space')]
paddle.enable_static() paddle.enable_static()
rlnas = RLNAS(key='lstm', configs=config) rlnas = RLNAS(key='lstm', configs=config, server_addr=("", 8886))
.. py:method:: next_archs(obs=None) .. py:method:: next_archs(obs=None)
...@@ -255,7 +255,7 @@ RLNAS (Reinforcement Learning Neural Architecture Search) is based on reinforcement learning ...@@ -255,7 +255,7 @@ RLNAS (Reinforcement Learning Neural Architecture Search) is based on reinforcement learning
from paddleslim.nas import RLNAS from paddleslim.nas import RLNAS
config = [('MobileNetV2Space')] config = [('MobileNetV2Space')]
paddle.enable_static() paddle.enable_static()
rlnas = RLNAS(key='lstm', configs=config) rlnas = RLNAS(key='lstm', configs=config, server_addr=("", 8887))
input = paddle.static.data(name='input', shape=[None, 3, 32, 32], dtype='float32') input = paddle.static.data(name='input', shape=[None, 3, 32, 32], dtype='float32')
archs = rlnas.next_archs(1)[0] archs = rlnas.next_archs(1)[0]
for arch in archs: for arch in archs:
...@@ -280,7 +280,7 @@ RLNAS (Reinforcement Learning Neural Architecture Search) is based on reinforcement learning ...@@ -280,7 +280,7 @@ RLNAS (Reinforcement Learning Neural Architecture Search) is based on reinforcement learning
from paddleslim.nas import RLNAS from paddleslim.nas import RLNAS
config = [('MobileNetV2Space')] config = [('MobileNetV2Space')]
paddle.enable_static() paddle.enable_static()
rlnas = RLNAS(key='lstm', configs=config) rlnas = RLNAS(key='lstm', configs=config, server_addr=("", 8888))
rlnas.next_archs(1) rlnas.next_archs(1)
rlnas.reward(1.0) rlnas.reward(1.0)
...@@ -307,7 +307,7 @@ RLNAS (Reinforcement Learning Neural Architecture Search) is based on reinforcement learning ...@@ -307,7 +307,7 @@ RLNAS (Reinforcement Learning Neural Architecture Search) is based on reinforcement learning
from paddleslim.nas import RLNAS from paddleslim.nas import RLNAS
config = [('MobileNetV2Space')] config = [('MobileNetV2Space')]
paddle.enable_static() paddle.enable_static()
rlnas = RLNAS(key='lstm', configs=config) rlnas = RLNAS(key='lstm', configs=config, server_addr=("", 8889))
archs = rlnas.final_archs(1) archs = rlnas.final_archs(1)
print(archs) print(archs)
...@@ -330,7 +330,7 @@ RLNAS (Reinforcement Learning Neural Architecture Search) is based on reinforcement learning ...@@ -330,7 +330,7 @@ RLNAS (Reinforcement Learning Neural Architecture Search) is based on reinforcement learning
from paddleslim.nas import RLNAS from paddleslim.nas import RLNAS
config = [('MobileNetV2Space')] config = [('MobileNetV2Space')]
paddle.enable_static() paddle.enable_static()
rlnas = RLNAS(key='lstm', configs=config) rlnas = RLNAS(key='lstm', configs=config, server_addr=("", 8891))
input = paddle.static.data(name='input', shape=[None, 3, 32, 32], dtype='float32') input = paddle.static.data(name='input', shape=[None, 3, 32, 32], dtype='float32')
tokens = ([0] * 25) tokens = ([0] * 25)
archs = rlnas.tokens2arch(tokens)[0] archs = rlnas.tokens2arch(tokens)[0]
......
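Every SANAS and RLNAS example in the file above now passes its own port in `server_addr` (8881 to 8889, then 8891). The commit message does not say why; one plausible reason, stated here as an assumption, is that examples run back to back would otherwise contend for a single port. The sketch below uses only Python's standard `socket` module (no PaddleSlim calls) to show that a second bind to a port already in use fails. `can_bind` is a hypothetical helper written for this illustration.

```python
import socket


def can_bind(host, port):
    """Return True if a new TCP socket can bind to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
            return True
        except OSError:
            # Typically "address already in use" when another socket holds it.
            return False


# Stand-in for a first example that starts a server: take any free port
# (port 0 lets the OS choose) and keep listening on it.
holder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
holder.bind(("127.0.0.1", 0))
holder.listen(1)
port_in_use = holder.getsockname()[1]

# Stand-in for a second example reusing the same port: the bind is refused.
second_bind_ok = can_bind("127.0.0.1", port_in_use)

holder.close()
```

Giving each example a different port, as this commit does, sidesteps that conflict without requiring the earlier server to shut down first.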
...@@ -28,6 +28,7 @@ paddle.enable_static() ...@@ -28,6 +28,7 @@ paddle.enable_static()
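Select `ResNet50` as the teacher to run distillation training on a student with the `MobileNet` architecture. Select `ResNet50` as the teacher to run distillation training on a student with the `MobileNet` architecture.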
```python ```python
import paddleslim as slim
model = slim.models.MobileNet() model = slim.models.MobileNet()
student_program = paddle.static.Program() student_program = paddle.static.Program()
student_startup = paddle.static.Program() student_startup = paddle.static.Program()
......
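...@@ -29,38 +29,38 @@ The basic OFA workflow consists of the following steps: ...@@ -29,38 +29,38 @@ The basic OFA workflow consists of the following steps:
By default, the training configuration follows the PS training mode from the paper; for the configurable parameters and their meanings, see: [RunConfig](https://paddleslim.readthedocs.io/zh_CN/latest/api_cn/dygraph/ofa/ofa_api.html) By default, the training configuration follows the PS training mode from the paper; for the configurable parameters and their meanings, see: [RunConfig](https://paddleslim.readthedocs.io/zh_CN/latest/api_cn/dygraph/ofa/ofa_api.html)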
```python ```python
from paddleslim.nas.ofa import RunConfig from paddleslim.nas.ofa import RunConfig
default_run_config = { default_run_config = {
'train_batch_size': 256, 'train_batch_size': 256,
'n_epochs': [[1], [2, 3], [4, 5]], 'n_epochs': [[1], [2, 3], [4, 5]],
'init_learning_rate': [[0.001], [0.003, 0.001], [0.003, 0.001]], 'init_learning_rate': [[0.001], [0.003, 0.001], [0.003, 0.001]],
'dynamic_batch_size': [1, 1, 1], 'dynamic_batch_size': [1, 1, 1],
'total_images': 1281167, 'total_images': 1281167,
'elastic_depth': (2, 5, 8) 'elastic_depth': (2, 5, 8)
} }
run_config = RunConfig(**default_run_config) run_config = RunConfig(**default_run_config)
``` ```
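### 3. Distillation configuration ### 3. Distillation configuration
Add a distillation configuration to the OFA training process; for the configurable parameters and their meanings, see: [DistillConfig](https://paddleslim.readthedocs.io/zh_CN/latest/api_cn/dygraph/ofa/ofa_api.html#distillconfig) Add a distillation configuration to the OFA training process; for the configurable parameters and their meanings, see: [DistillConfig](https://paddleslim.readthedocs.io/zh_CN/latest/api_cn/dygraph/ofa/ofa_api.html#distillconfig)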
```python ```python
from paddle.vision.models import mobilenet_v1 from paddle.vision.models import mobilenet_v1
from paddleslim.nas.ofa import DistillConfig from paddleslim.nas.ofa import DistillConfig
teacher_model = mobilenet_v1() teacher_model = mobilenet_v1()
default_distill_config = { default_distill_config = {
'teacher_model': teacher_model 'teacher_model': teacher_model
} }
distill_config = DistillConfig(**default_distill_config) distill_config = DistillConfig(**default_distill_config)
``` ```
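### 4. Pass in the model and the corresponding configurations ### 4. Pass in the model and the corresponding configurations
Wrap the model, the training configuration, and the distillation configuration with OFA. Once configured, the model is trained the same way as a normal model. If distillation is added, the OFA-wrapped model returns one extra set of outputs, from the teacher network, compared with the original model. Wrap the model, the training configuration, and the distillation configuration with OFA. Once configured, the model is trained the same way as a normal model. If distillation is added, the OFA-wrapped model returns one extra set of outputs, from the teacher network, compared with the original model.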
```python ```python
from paddleslim.nas.ofa import OFA from paddleslim.nas.ofa import OFA
ofa_model = OFA(model, run_config=run_config, distill_config=distill_config) ofa_model = OFA(model, run_config=run_config, distill_config=distill_config)
``` ```
## Experimental results ## Experimental results
......