PaddlePaddle / PaddleSlim
Commit 08b88c0c (unverified)
Authored 4 years ago by Bai Yifan, committed by GitHub 4 years ago
distillation tutorial update (#578) (#583)
Parent: 37931e59
Branches and tags: wanghaoshuang-patch-1-1, release/2.0-alpha, release/2.0.0, wanghaoshuang-patch-1, v2.0.0
No related merge requests
Showing 1 changed file with 48 additions and 36 deletions

docs/zh_cn/quick_start/distillation_tutorial.md (+48, -36)
@@ -13,67 +13,69 @@
 ## 1. Import dependencies
-PaddleSlim depends on Paddle 1.7. Make sure Paddle is installed correctly, then import Paddle and PaddleSlim as follows:
+PaddleSlim depends on Paddle 2.0. Make sure Paddle is installed correctly, then import Paddle and PaddleSlim as follows:
 ```
 import paddle
-import paddle.fluid as fluid
+import numpy as np
 import paddleslim as slim
+paddle.enable_static()
 ```
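Since the tutorial now requires Paddle 2.0, a quick version check before running it may save some confusion. The following is a minimal sketch, not part of the tutorial: the helper names are made up for illustration, and it assumes Paddle was installed from the `paddlepaddle` or `paddlepaddle-gpu` wheel.

```python
from importlib import metadata


def parse_version(text):
    """Turn '2.0.1' or '2.0.0rc1' into a tuple of leading integers, e.g. (2, 0, 1)."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, required="2.0"):
    """Return True when the installed version is at least the required one."""
    a, b = parse_version(installed), parse_version(required)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))  # pad so that '2' compares equal to '2.0'
    b += (0,) * (width - len(b))
    return a >= b


def installed_paddle_version():
    """Look up the installed Paddle wheel, or return None if it is absent."""
    for dist_name in ("paddlepaddle", "paddlepaddle-gpu"):  # assumed wheel names
        try:
            return metadata.version(dist_name)
        except metadata.PackageNotFoundError:
            continue
    return None


version = installed_paddle_version()
if version is None:
    print("Paddle is not installed")
else:
    print(version, "ok" if meets_minimum(version) else "too old for this tutorial")
```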
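 ## 2. Define student_program and teacher_program
-This tutorial trains and evaluates knowledge distillation on the MNIST dataset. The input image shape is `[1, 28, 28]` and there are 10 output classes.
+This tutorial trains and evaluates knowledge distillation on the CIFAR dataset. The input image shape is `[3, 32, 32]` and there are 10 output classes.
 `ResNet50` is chosen as the teacher to distill a student with the `MobileNet` architecture.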
 ```python
 model = slim.models.MobileNet()
-student_program = fluid.Program()
-student_startup = fluid.Program()
-with fluid.program_guard(student_program, student_startup):
-    image = fluid.data(name='image', shape=[None] + [1, 28, 28], dtype='float32')
-    label = fluid.data(name='label', shape=[None, 1], dtype='int64')
+student_program = paddle.static.Program()
+student_startup = paddle.static.Program()
+with paddle.static.program_guard(student_program, student_startup):
+    image = paddle.static.data(name='image', shape=[None, 3, 32, 32], dtype='float32')
+    label = paddle.static.data(name='label', shape=[None, 1], dtype='int64')
+    gt = paddle.reshape(label, [-1, 1])
     out = model.net(input=image, class_dim=10)
-    cost = fluid.layers.cross_entropy(input=out, label=label)
-    avg_cost = fluid.layers.mean(x=cost)
-    acc_top1 = fluid.layers.accuracy(input=out, label=label, k=1)
-    acc_top5 = fluid.layers.accuracy(input=out, label=label, k=5)
+    cost = paddle.nn.functional.loss.cross_entropy(input=out, label=gt)
+    avg_cost = paddle.mean(x=cost)
+    acc_top1 = paddle.metric.accuracy(input=out, label=gt, k=1)
+    acc_top5 = paddle.metric.accuracy(input=out, label=gt, k=5)
 ```
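To make the `acc_top1` and `acc_top5` outputs concrete, here is a small NumPy sketch of what top-k accuracy measures. It is an illustration only, not Paddle's implementation; `topk_accuracy` and the toy arrays are made up for this example.

```python
import numpy as np


def topk_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest scores.

    scores: float array of shape [batch, num_classes]
    labels: int array of shape [batch, 1], the same layout as `gt` above
    """
    # Indices of the k largest scores per row (order within the k is irrelevant).
    topk = np.argsort(-scores, axis=1)[:, :k]
    hits = (topk == labels).any(axis=1)
    return float(hits.mean())


scores = np.array([[0.1, 0.7, 0.2],
                   [0.5, 0.3, 0.2],
                   [0.2, 0.3, 0.5]])
labels = np.array([[1], [1], [0]])

print(topk_accuracy(scores, labels, k=1))  # only the first sample is a top-1 hit
print(topk_accuracy(scores, labels, k=2))  # the second sample becomes a hit too
```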
 ```python
-model = slim.models.ResNet50()
-teacher_program = fluid.Program()
-teacher_startup = fluid.Program()
-with fluid.program_guard(teacher_program, teacher_startup):
-    with fluid.unique_name.guard():
-        image = fluid.data(name='image', shape=[None] + [1, 28, 28], dtype='float32')
+teacher_model = slim.models.ResNet50()
+teacher_program = paddle.static.Program()
+teacher_startup = paddle.static.Program()
+with paddle.static.program_guard(teacher_program, teacher_startup):
+    with paddle.utils.unique_name.guard():
+        image = paddle.static.data(name='image', shape=[None, 3, 32, 32], dtype='float32')
         predict = teacher_model.net(image, class_dim=10)
-exe = fluid.Executor(fluid.CPUPlace())
+exe = paddle.static.Executor(paddle.CPUPlace())
 exe.run(teacher_startup)
 ```
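 ## 3. Select feature maps
-We can use the list_vars method of student_program to inspect all of its Variables, and pick one or more of them (Variable) to fit the corresponding teacher variables.
+We can use the list_vars method of student_program to inspect all of its Tensors, and pick one or more of them (Tensor) to fit the corresponding teacher variables.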
 ```python
-# get all student variables
+# get all student tensor
 student_vars = []
 for v in student_program.list_vars():
     student_vars.append((v.name, v.shape))
-#uncomment the following lines to observe student's variables for distillation
+#uncomment the following lines to observe student's tensor for distillation
 #print("="*50+"student_model_vars"+"="*50)
 #print(student_vars)
-# get all teacher variables
+# get all teacher tensor
 teacher_vars = []
 for v in teacher_program.list_vars():
     teacher_vars.append((v.name, v.shape))
-#uncomment the following lines to observe teacher's variables for distillation
+#uncomment the following lines to observe teacher's tensor for distillation
 #print("="*50+"teacher_model_vars"+"="*50)
 #print(teacher_vars)
 ```
...
...
@@ -81,33 +83,43 @@ for v in teacher_program.list_vars():
 After filtering, we can see that 'bn5c_branch2b.output.1.tmp_3' in teacher_program and 'depthwise_conv2d_11.tmp_0' in student_program have the same shape, so they can be paired in a distillation loss.
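The filtering step itself is not shown in this hunk. One way to do it, sketched under the assumption that the lists hold `(name, shape)` tuples as collected by `list_vars()` above, is to group student variables by shape and look up each teacher shape. The function name, the `min_rank` cutoff, and the sample names and shapes are all hypothetical.

```python
def matching_pairs(student_vars, teacher_vars, min_rank=4):
    """Return (teacher_name, student_name, shape) for variables with equal shapes.

    Both arguments are lists of (name, shape) tuples. Only shapes with at least
    `min_rank` dimensions are kept, so feature maps ([N, C, H, W]) are compared
    and low-rank tensors such as logits are skipped.
    """
    by_shape = {}
    for name, shape in student_vars:
        by_shape.setdefault(tuple(shape), []).append(name)

    pairs = []
    for t_name, t_shape in teacher_vars:
        key = tuple(t_shape)
        if len(key) < min_rank:
            continue
        for s_name in by_shape.get(key, []):
            pairs.append((t_name, s_name, key))
    return pairs


# Hypothetical names and shapes, for illustration only.
student_vars = [("s_conv_a.tmp_0", (-1, 64, 16, 16)),
                ("s_conv_b.tmp_0", (-1, 512, 1, 1)),
                ("s_fc.tmp_0", (-1, 10))]
teacher_vars = [("t_bn_a.tmp_0", (-1, 256, 16, 16)),
                ("t_bn_b.tmp_0", (-1, 512, 1, 1)),
                ("t_fc.tmp_0", (-1, 10))]

for pair in matching_pairs(student_vars, teacher_vars):
    print(pair)
```

Matching shapes is only a necessary condition; which pair is actually useful for distillation remains a modelling choice.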
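 ## 4. Merge the programs and add the distillation loss
-The merge operation adds all Variables and Ops of student_program and teacher_program to a single Program. To avoid naming conflicts between same-named variables in the two programs, merge also adds a uniform name prefix, name_prefix, to the Variables of teacher_program; its default value is 'teacher_'.
+The merge operation adds all Tensors and Ops of student_program and teacher_program to a single Program. To avoid naming conflicts between same-named variables in the two programs, merge also adds a uniform name prefix, name_prefix, to the Tensors of teacher_program; its default value is 'teacher_'.
 To ensure that the teacher and student networks receive the same input data, merge also merges the input data layers of the two programs. You therefore need to specify a mapping of data layer names, data_name_map, whose keys are the teacher's input data names and whose values are the student's.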
 ```python
 data_name_map = {'image': 'image'}
-main = slim.dist.merge(teacher_program, student_program, data_name_map, fluid.CPUPlace())
-with fluid.program_guard(student_program, student_startup):
+main = slim.dist.merge(teacher_program, student_program, data_name_map, paddle.CPUPlace())
+with paddle.static.program_guard(student_program, student_startup):
     l2_loss = slim.dist.l2_loss('teacher_bn5c_branch2b.output.1.tmp_3', 'depthwise_conv2d_11.tmp_0', student_program)
     loss = l2_loss + avg_cost
-    opt = fluid.optimizer.Momentum(0.01, 0.9)
+    opt = paddle.optimizer.Momentum(0.01, 0.9)
     opt.minimize(loss)
 exe.run(student_startup)
 ```
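The naming rule and the loss used above can be illustrated without Paddle. The sketch below is not PaddleSlim's code: `merged_names` only mimics the prefixing behaviour described in the text, and `l2_distill_loss` assumes the L2 distillation loss is the mean squared difference between the two feature maps. The tensors and the task-loss value are placeholders.

```python
import numpy as np


def merged_names(teacher_names, student_names, data_name_map, name_prefix="teacher_"):
    """Variable names present after a merge, under the rule described above.

    Teacher inputs listed in data_name_map reuse the student's data layer;
    every other teacher variable gets name_prefix prepended.
    """
    merged = list(student_names)
    for name in teacher_names:
        if name in data_name_map:
            continue  # shared input: no separate teacher copy is kept
        merged.append(name_prefix + name)
    return merged


def l2_distill_loss(teacher_feat, student_feat):
    """Mean squared difference between two equally shaped feature maps."""
    return float(np.mean(np.square(student_feat - teacher_feat)))


names = merged_names(
    teacher_names=["image", "bn5c_branch2b.output.1.tmp_3"],
    student_names=["image", "label", "depthwise_conv2d_11.tmp_0"],
    data_name_map={"image": "image"},
)
print(names)

teacher_feat = np.ones((2, 4, 1, 1), dtype="float32")   # placeholder activations
student_feat = np.zeros((2, 4, 1, 1), dtype="float32")  # placeholder activations
task_loss = 0.5  # stands in for avg_cost
print(l2_distill_loss(teacher_feat, student_feat) + task_loss)
```

This is why the teacher feature map is referenced as 'teacher_bn5c_branch2b.output.1.tmp_3' in the `l2_loss` call while the student name is used unchanged.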
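 ## 5. Train the model
-To run this example quickly, we use the simple MNIST dataset. Paddle's `paddle.dataset.mnist` package defines how MNIST data is downloaded and read. The code is as follows:
+To run this example quickly, we use the simple CIFAR dataset. Paddle's `paddle.vision.datasets.Cifar10` package defines how CIFAR10 data is downloaded and read. The code is as follows: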
 ```python
-train_reader = paddle.fluid.io.batch(paddle.dataset.mnist.train(), batch_size=128, drop_last=True)
-train_feeder = fluid.DataFeeder(['image', 'label'], fluid.CPUPlace(), student_program)
+import paddle.vision.transforms as T
+transform = T.Compose([T.Transpose(), T.Normalize([127.5], [127.5])])
+train_dataset = paddle.vision.datasets.Cifar10(mode="train", backend="cv2", transform=transform)
+train_loader = paddle.io.DataLoader(
+    train_dataset,
+    places=paddle.CPUPlace(),
+    feed_list=[image, label],
+    drop_last=True,
+    batch_size=64,
+    return_list=False,
+    shuffle=True)
 ```
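For readers unfamiliar with the transform pipeline, the following NumPy sketch shows the arithmetic it is expected to perform. It assumes `T.Transpose()` uses its default HWC to CHW order and that `T.Normalize` computes `(x - mean) / std`; the function name and the random image are made up for illustration.

```python
import numpy as np


def to_chw_and_normalize(image_hwc, mean=127.5, std=127.5):
    """Reorder an [H, W, C] image to [C, H, W], then compute (x - mean) / std."""
    chw = np.transpose(image_hwc, (2, 0, 1)).astype("float32")
    return (chw - mean) / std


# A fake 32x32 RGB image with pixel values in [0, 255].
pixels = np.random.randint(0, 256, size=(32, 32, 3), dtype=np.uint8)
out = to_chw_and_normalize(pixels)

print(out.shape)                            # (3, 32, 32), matching the `image` input
print(out.min() >= -1.0, out.max() <= 1.0)  # values are scaled into [-1, 1]
```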
 ```python
-for data in train_reader():
-    acc1, acc5, loss_np = exe.run(student_program, feed=train_feeder.feed(data), fetch_list=[acc_top1.name, acc_top5.name, loss.name])
+for idx, data in enumerate(train_loader):
+    acc1, acc5, loss_np = exe.run(student_program, feed=data, fetch_list=[acc_top1.name, acc_top5.name, loss.name])
     print("Acc1: {:.6f}, Acc5: {:.6f}, Loss: {:.6f}".format(acc1.mean(), acc5.mean(), loss_np.mean()))
 ```
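The loop above makes a single pass over `train_loader`. As a rough guide to how many iterations that is, the sketch below counts batches for a given `batch_size` and `drop_last` setting. `num_batches` is a hypothetical helper, and the figure of 50,000 images is the size of the CIFAR-10 training split.

```python
def num_batches(num_samples, batch_size, drop_last):
    """Number of batches yielded in one pass over a dataset."""
    full, remainder = divmod(num_samples, batch_size)
    if remainder and not drop_last:
        return full + 1  # keep the final, smaller batch
    return full


# CIFAR-10's training split has 50,000 images.
print(num_batches(50000, 64, drop_last=True))   # 781
print(num_batches(50000, 64, drop_last=False))  # 782
```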