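1. Introduction to Quantization Aware Training#
+Contents#
+-
+
- Quantization principles +
- Pruning principles +
- Distillation principles +
- Lightweight model architecture search principles +
1. Introduction to Quantization Aware Training#
1.1 Background#
In recent years, fixed-point quantization, which represents neural network weights and activations with fewer bits (e.g. 8-bit, 3-bit, 2-bit), has been shown to be effective. Its advantages include lower memory bandwidth, lower power consumption, lower compute requirements, and smaller model storage.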
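As a rough illustration of the fixed-point idea described here, the sketch below quantizes two small vectors with abs_max scaling, multiplies them as integers, and dequantizes the result. It is plain Python rather than PaddleSlim code; the helper names are hypothetical, and the convention that 8 bits give 127 positive levels is an assumption.

```python
def abs_max(values):
    """Largest absolute value in a flat list (must be non-zero to be used as a scale)."""
    return max(abs(v) for v in values)


def quantize(values, bits=8):
    """Map floats to signed integers using abs_max scaling.

    Assumption: `levels` plays the role of (n - 1), i.e. 127 for 8 bits.
    """
    levels = 2 ** (bits - 1) - 1
    scale = abs_max(values)
    return [round(v / scale * levels) for v in values], scale, levels


def quantized_dot(x, w, bits=8):
    """Integer dot product followed by dequantization back to float."""
    x_q, x_m, levels = quantize(x, bits)
    w_q, w_m, _ = quantize(w, bits)
    y_q = sum(a * b for a, b in zip(x_q, w_q))    # integer accumulation
    return y_q / (levels * levels) * x_m * w_m    # dequantize


x = [0.5, -1.0, 0.25]
w = [0.2, 0.4, -0.8]
exact = sum(a * b for a, b in zip(x, w))
approx = quantized_dot(x, w)
print("float result:", exact, "quantized result:", approx)
```

The two printed values should agree closely; the gap between them is the rounding error introduced by the 8-bit representation.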
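@@ -338,7 +348,7 @@ Y_{dq} = \frac{Y_q}{(n - 1) * (n - 1)} * X_m * W_m \ Before pruning a convolution layer's kernels, sort its filters by l1_norm from high to low; the later a filter ranks, the less important it is, so the filters at the end are pruned first.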
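2.3 Sensitivity-based pruning of convolutional networks#
Prune a different proportion of filters from each convolution layer, according to that layer's sensitivity.
-Two assumptions#
+Two assumptions#
- Within the parameter of a conv layer, sort the filters by l1_norm from high to low; the later a filter ranks, the less important it is.
- When the same proportion of filters is pruned from two layers, the layer whose pruning hurts model accuracy more is said to have the higher sensitivity. @@ -348,7 +358,7 @@ Y_{dq} = \frac{Y_q}{(n - 1) * (n - 1)} * X_m * W_m \
- A layer's pruning ratio is inversely related to its sensitivity
- Within a layer, prune the filters with relatively low l1_norm first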
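The l1_norm ranking in the assumptions above can be sketched in a few lines of plain Python. This is only an illustration: `filters_to_prune` is a hypothetical helper, the weight is a nested list shaped [output_channel, input_channel, k, k], and rounding the number of pruned filters to the nearest integer is an assumption.

```python
def l1_norm(weights):
    """Sum of absolute values over a nested list of any depth."""
    if isinstance(weights, (int, float)):
        return abs(weights)
    return sum(l1_norm(w) for w in weights)


def filters_to_prune(conv_weight, ratio):
    """Return the indices of the lowest-l1_norm filters, `ratio` of the total."""
    norms = [l1_norm(f) for f in conv_weight]
    # Rank filter indices by l1_norm from high to low.
    order = sorted(range(len(norms)), key=lambda i: norms[i], reverse=True)
    num_pruned = int(round(len(norms) * ratio))
    if num_pruned == 0:
        return []
    # The filters at the end of the ranking are the least important.
    return sorted(order[len(order) - num_pruned:])


# Four filters, each with one input channel and a 1x2 kernel.
conv_weight = [
    [[[0.9, -0.8]]],
    [[[0.1, 0.0]]],
    [[[0.5, 0.5]]],
    [[[-0.05, 0.02]]],
]
print(filters_to_prune(conv_weight, 0.5))
```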
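Understanding sensitivity#
+Understanding sensitivity#
Figure 7
@@ -356,7 +366,7 @@ Y_{dq} = \frac{Y_q}{(n - 1) * (n - 1)} * X_m * W_m \
As **Figure 7** shows, the horizontal axis is the proportion of filters pruned, the vertical axis is the accuracy loss, and each colored dashed line represents one convolution layer in the network. Each convolution layer is pruned **on its own** at different ratios, the accuracy loss on the validation set is recorded, and the dashed lines in **Figure 7** are plotted from the results. A layer whose line rises slowly is relatively insensitive, and the filters of insensitive layers are pruned first.
-Choosing the optimal combination of pruning ratios#
+Choosing the optimal combination of pruning ratios#
We fit the polylines in **Figure 7** to the curves in **Figure 8**. Each accuracy-loss value chosen on the vertical axis corresponds to a set of pruning ratios on the horizontal axis, as shown by the solid black line in **Figure 8**. Given an overall pruning ratio for the model, we move the solid black line in **Figure 8** to find a set of valid pruning ratios that satisfies it.
@@ -364,7 +374,7 @@ Y_{dq} = \frac{Y_q}{(n - 1) * (n - 1)} * X_m * W_m \ Figure 8
-Iterative pruning#
+Iterative pruning#
Because convolution layers are correlated, modifying one layer may change the sensitivity of the others, so we prune in multiple rounds, as follows:
- step1: Collect sensitivity information for each convolution layer
diff --git a/api/analysis_api/index.html b/api/analysis_api/index.html
index 2f7972dc166d93836d3e711d3b080787023029ef..8b76e19207cd6a53200ce713bf3209aab4e4bf1c 100644
--- a/api/analysis_api/index.html
+++ b/api/analysis_api/index.html
@@ -166,7 +166,7 @@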
- Model analysis
@@ -178,7 +178,7 @@
FLOPs#
Example:
-import paddle.fluid as fluid -from paddle.fluid.param_attr import ParamAttr -from paddleslim.analysis import flops +import paddle.fluid as fluid +from paddle.fluid.param_attr import ParamAttr +from paddleslim.analysis import flops -def conv_bn_layer(input, - num_filters, - filter_size, - name, - stride=1, - groups=1, - act=None): - conv = fluid.layers.conv2d( - input=input, - num_filters=num_filters, - filter_size=filter_size, - stride=stride, - padding=(filter_size - 1) // 2, - groups=groups, - act=None, - param_attr=ParamAttr(name=name + "_weights"), - bias_attr=False, - name=name + "_out") - bn_name = name + "_bn" - return fluid.layers.batch_norm( - input=conv, - act=act, - name=bn_name + '_output', - param_attr=ParamAttr(name=bn_name + '_scale'), - bias_attr=ParamAttr(bn_name + '_offset'), - moving_mean_name=bn_name + '_mean', - moving_variance_name=bn_name + '_variance', ) +def conv_bn_layer(input, + num_filters, + filter_size, + name, + stride=1, + groups=1, + act=None): + conv = fluid.layers.conv2d( + input=input, + num_filters=num_filters, + filter_size=filter_size, + stride=stride, + padding=(filter_size - 1) // 2, + groups=groups, + act=None, + param_attr=ParamAttr(name=name + "_weights"), + bias_attr=False, + name=name + "_out") + bn_name = name + "_bn" + return fluid.layers.batch_norm( + input=conv, + act=act, + name=bn_name + '_output', + param_attr=ParamAttr(name=bn_name + '_scale'), + bias_attr=ParamAttr(bn_name + '_offset'), + moving_mean_name=bn_name + '_mean', + moving_variance_name=bn_name + '_variance', ) -main_program = fluid.Program() -startup_program = fluid.Program() -# X X O X O -# conv1-->conv2-->sum1-->conv3-->conv4-->sum2-->conv5-->conv6 -# | ^ | ^ -# |____________| |____________________| -# -# X: prune output channels -# O: prune input channels -with fluid.program_guard(main_program, startup_program): - input = fluid.data(name="image", shape=[None, 3, 16, 16]) - conv1 = conv_bn_layer(input, 8, 3, "conv1") - conv2 = conv_bn_layer(conv1, 
8, 3, "conv2") - sum1 = conv1 + conv2 - conv3 = conv_bn_layer(sum1, 8, 3, "conv3") - conv4 = conv_bn_layer(conv3, 8, 3, "conv4") - sum2 = conv4 + sum1 - conv5 = conv_bn_layer(sum2, 8, 3, "conv5") - conv6 = conv_bn_layer(conv5, 8, 3, "conv6") +main_program = fluid.Program() +startup_program = fluid.Program() +# X X O X O +# conv1-->conv2-->sum1-->conv3-->conv4-->sum2-->conv5-->conv6 +# | ^ | ^ +# |____________| |____________________| +# +# X: prune output channels +# O: prune input channels +with fluid.program_guard(main_program, startup_program): + input = fluid.data(name="image", shape=[None, 3, 16, 16]) + conv1 = conv_bn_layer(input, 8, 3, "conv1") + conv2 = conv_bn_layer(conv1, 8, 3, "conv2") + sum1 = conv1 + conv2 + conv3 = conv_bn_layer(sum1, 8, 3, "conv3") + conv4 = conv_bn_layer(conv3, 8, 3, "conv4") + sum2 = conv4 + sum1 + conv5 = conv_bn_layer(sum2, 8, 3, "conv5") + conv6 = conv_bn_layer(conv5, 8, 3, "conv6") -print("FLOPs: {}".format(flops(main_program))) +print("FLOPs: {}".format(flops(main_program)))model_size#
-
-
-
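- paddleslim.analysis.model_size(program) [source] -
-
+
paddleslim.analysis.model_size(program) source
Returns the number of parameters in the given network.
-
-
Parameters:
- program(paddle.fluid.Program) - The target network to analyze. For more about Program, see: Program concept introduction. @@ -276,56 +272,56 @@
- model_size(int) - The number of parameters in the whole network.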
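As a hand check on what "number of parameters" means for a small convolutional network, the sketch below counts the weights of six bias-free 3x3 convolutions with 8 filters each on a 3-channel input. It is plain Python, not the `model_size` implementation; `conv_param_count` is a hypothetical helper, and whether the library counts parameters in exactly this way is an assumption.

```python
def conv_param_count(in_channels, out_channels, kernel_size, groups=1, bias=False):
    """Number of trainable values in one 2D convolution layer."""
    weights = out_channels * (in_channels // groups) * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)


# (in_channels, out_channels, kernel_size) for conv1..conv6
layers = [(3, 8, 3)] + [(8, 8, 3)] * 5
total = sum(conv_param_count(i, o, k) for i, o, k in layers)
print("parameters:", total)
```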
Example:
-import paddle.fluid as fluid -from paddle.fluid.param_attr import ParamAttr -from paddleslim.analysis import model_size +import paddle.fluid as fluid +from paddle.fluid.param_attr import ParamAttr +from paddleslim.analysis import model_size -def conv_layer(input, - num_filters, - filter_size, - name, - stride=1, - groups=1, - act=None): - conv = fluid.layers.conv2d( - input=input, - num_filters=num_filters, - filter_size=filter_size, - stride=stride, - padding=(filter_size - 1) // 2, - groups=groups, - act=None, - param_attr=ParamAttr(name=name + "_weights"), - bias_attr=False, - name=name + "_out") - return conv +def conv_layer(input, + num_filters, + filter_size, + name, + stride=1, + groups=1, + act=None): + conv = fluid.layers.conv2d( + input=input, + num_filters=num_filters, + filter_size=filter_size, + stride=stride, + padding=(filter_size - 1) // 2, + groups=groups, + act=None, + param_attr=ParamAttr(name=name + "_weights"), + bias_attr=False, + name=name + "_out") + return conv -main_program = fluid.Program() -startup_program = fluid.Program() -# X X O X O -# conv1-->conv2-->sum1-->conv3-->conv4-->sum2-->conv5-->conv6 -# | ^ | ^ -# |____________| |____________________| -# -# X: prune output channels -# O: prune input channels -with fluid.program_guard(main_program, startup_program): - input = fluid.data(name="image", shape=[None, 3, 16, 16]) - conv1 = conv_layer(input, 8, 3, "conv1") - conv2 = conv_layer(conv1, 8, 3, "conv2") - sum1 = conv1 + conv2 - conv3 = conv_layer(sum1, 8, 3, "conv3") - conv4 = conv_layer(conv3, 8, 3, "conv4") - sum2 = conv4 + sum1 - conv5 = conv_layer(sum2, 8, 3, "conv5") - conv6 = conv_layer(conv5, 8, 3, "conv6") +main_program = fluid.Program() +startup_program = fluid.Program() +# X X O X O +# conv1-->conv2-->sum1-->conv3-->conv4-->sum2-->conv5-->conv6 +# | ^ | ^ +# |____________| |____________________| +# +# X: prune output channels +# O: prune input channels +with fluid.program_guard(main_program, startup_program): + input = 
fluid.data(name="image", shape=[None, 3, 16, 16]) + conv1 = conv_layer(input, 8, 3, "conv1") + conv2 = conv_layer(conv1, 8, 3, "conv2") + sum1 = conv1 + conv2 + conv3 = conv_layer(sum1, 8, 3, "conv3") + conv4 = conv_layer(conv3, 8, 3, "conv4") + sum2 = conv4 + sum1 + conv5 = conv_layer(sum2, 8, 3, "conv5") + conv6 = conv_layer(conv5, 8, 3, "conv6") -print("FLOPs: {}".format(model_size(main_program))) +print("FLOPs: {}".format(model_size(main_program)))TableLatencyEvaluator#
-
-
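- paddleslim.analysis.TableLatencyEvaluator(table_file, delimiter=",") [source] +
- paddleslim.analysis.TableLatencyEvaluator(table_file, delimiter=",") source
-
A model latency evaluator based on a hardware latency table.
@@ -333,7 +329,7 @@
-
-
table_file(str) - Absolute path of the latency table to use. For the latency table format, see: PaddleSlim hardware latency table format
+table_file(str) - Absolute path of the latency table to use. For the latency table format, see: PaddleSlim hardware latency table format
-
delimiter(str) - The delimiter used between operation fields in the hardware latency table. Defaults to an ASCII comma.
@@ -344,7 +340,7 @@ - Evaluator - An instance of the hardware latency evaluator.
- paddleslim.analysis.TableLatencyEvaluator.latency(graph) [source] +
- paddleslim.analysis.TableLatencyEvaluator.latency(graph) source
-
Returns the estimated latency of the given network.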
diff --git a/api/api_guide/index.html b/api/api_guide/index.html
index e8a568334f81e7f8841ef2e75ca63a0cf8ab8631..7b2777a37726161b3b5893aec11fadfc94af0b36 100644
--- a/api/api_guide/index.html
+++ b/api/api_guide/index.html
@@ -150,7 +150,7 @@
Parameters:
-
-
block_num is the number of blocks in the search space. block_mask is a list of 0s and 1s, where 0 marks the current block as a normal block and 1 marks it as a reduction block. If block_mask is set, it takes precedence and the input_size, output_size, and block_num settings are ignored.
Note
--
-
- A reduction block halves the feature map size; a normal block leaves the feature map size unchanged.
input_size and output_size are used to compute the number of reduction blocks in the whole model structure.
-
Note:
+1. A reduction block halves the feature map size; a normal block leaves the feature map size unchanged.
+2. input_size and output_size are used to compute the number of reduction blocks in the whole model structure.
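The effect of block_mask on the feature map, and the way input_size and output_size could determine the number of reduction blocks, can be sketched as follows. Both helpers are hypothetical, and deriving the count as log2(input_size / output_size) is an assumption that follows from each reduction block halving the feature map.

```python
import math


def feature_map_sizes(input_size, block_mask):
    """Feature map size after each block; 1 = reduction block, 0 = normal block."""
    sizes = []
    size = input_size
    for is_reduction in block_mask:
        if is_reduction:
            size //= 2          # a reduction block halves the feature map
        sizes.append(size)
    return sizes


def num_reduction_blocks(input_size, output_size):
    """Number of halvings needed to go from input_size to output_size."""
    return int(math.log2(input_size // output_size))


print(feature_map_sizes(32, [0, 1, 0, 1]))
print(num_reduction_blocks(32, 8))
```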
SANAS#
-
-
- paddleslim.nas.SANAS(configs, server_addr=("", 8881), init_temperature=100, reduce_rate=0.85, search_steps=300, save_checkpoint='./nas_checkpoint', load_checkpoint=None, is_server=True)[source] +
- paddleslim.nas.SANAS(configs, server_addr=("", 8881), init_temperature=100, reduce_rate=0.85, search_steps=300, save_checkpoint='./nas_checkpoint', load_checkpoint=None, is_server=True)source
- SANAS (Simulated Annealing Neural Architecture Search) searches for model architectures using the simulated annealing algorithm; it is generally used for discrete search tasks.
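To give a feel for how init_temperature, reduce_rate, and search_steps interact, here is a generic simulated annealing loop over a list of integer tokens. It is a textbook sketch, not the SANAS implementation: the function name, the one-token mutation rule, and the acceptance formula are all assumptions made for illustration.

```python
import math
import random


def simulated_annealing(score_fn, init_tokens, token_ranges,
                        init_temperature=100.0, reduce_rate=0.85,
                        search_steps=300, seed=0):
    """Maximize score_fn over token lists; token i takes values in range(token_ranges[i])."""
    rng = random.Random(seed)
    current, current_score = list(init_tokens), score_fn(init_tokens)
    best, best_score = list(current), current_score
    temperature = init_temperature
    for _ in range(search_steps):
        # Propose a neighbour by resampling one token.
        candidate = list(current)
        pos = rng.randrange(len(candidate))
        candidate[pos] = rng.randrange(token_ranges[pos])
        score = score_fn(candidate)
        # Always accept improvements; accept worse candidates with
        # probability exp(delta / T), which shrinks as T cools.
        delta = score - current_score
        if delta >= 0 or rng.random() < math.exp(delta / max(temperature, 1e-12)):
            current, current_score = candidate, score
        if current_score > best_score:
            best, best_score = list(current), current_score
        temperature *= reduce_rate
    return best, best_score


def toy_score(tokens):
    """Toy reward: highest when every token equals 3."""
    return -sum((t - 3) ** 2 for t in tokens)


best_tokens, best_reward = simulated_annealing(toy_score, [0, 0, 0, 0], [8, 8, 8, 8])
print(best_tokens, best_reward)
```

A higher init_temperature or a reduce_rate closer to 1 keeps the search exploratory for longer; a lower reduce_rate makes it greedy sooner.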
Parameters:
@@ -208,18 +204,16 @@Returns: an instance of the SANAS class
Example code: -
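from paddleslim.nas import SANAS -config = [('MobileNetV2Space')] -sanas = SANAS(config=config) +from paddleslim.nas import SANAS +config = [('MobileNetV2Space')] +sanas = SANAS(configs=config)
- paddleslim.nas.SANAS.tokens2arch(tokens)
- Builds the actual model structure from a set of tokens; typically used to convert the best tokens found by the search into a model structure for final training.
Note
-tokens is a list. Each token maps into the search space and is converted to the corresponding network structure; one set of tokens corresponds to exactly one network structure.
-Note:
+tokens is a list. Each token maps into the search space and is converted to the corresponding network structure; one set of tokens corresponds to exactly one network structure.
Parameters:
- tokens(list): - A set of tokens. @@ -227,12 +221,12 @@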
- paddleslim.nas.SANAS.next_archs() @@ -241,12 +235,12 @@
- paddleslim.nas.SANAS.reward(score) diff --git a/api/prune_api/index.html b/api/prune_api/index.html index 7ec610e9187ae2a1b3fd7f3815294dfe9c147141..18de0a287291891582b6278d0e2f0462fa3843a3 100644 --- a/api/prune_api/index.html +++ b/api/prune_api/index.html @@ -172,7 +172,7 @@
- Pruning and sensitivity
@@ -184,7 +184,7 @@
- paddleslim.prune.Pruner(criterion="l1_norm")[source] +
- paddleslim.prune.Pruner(criterion="l1_norm")source
-
Prunes the channels of a convolutional network once. Pruning a convolution layer's channels means pruning that layer's output channels. A convolution layer's weight has shape [output_channel, input_channel, kernel_size, kernel_size]; the number of output channels is reduced by pruning the first dimension of that weight.
@@ -195,12 +195,12 @@
- paddleslim.prune.Pruner.prune(program, scope, params, ratios, place=None, lazy=False, only_graph=False, param_backup=False, param_shape_backup=False)[source] +
- paddleslim.prune.Pruner.prune(program, scope, params, ratios, place=None, lazy=False, only_graph=False, param_backup=False, param_shape_backup=False)source
-
Prunes the weights of a set of convolution layers in the target network.
@@ -211,20 +211,20 @@
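-
-
scope(paddle.fluid.Scope) - The scope that holds the weights to prune. Paddle uses a scope instance to store the values of model parameters and runtime variables. Parameter values in the scope are pruned in place. For more, see scope_guard
+scope(paddle.fluid.Scope) - The scope that holds the weights to prune. Paddle uses a scope instance to store the values of model parameters and runtime variables. Parameter values in the scope are pruned in place. For more, see Scope concept introduction -
params(list) - Names of the parameters of the convolution layers to prune. The names of all parameters in the model can be listed as follows: -for block in program.blocks: - for param in block.all_parameters(): - print("param: {}; shape: {}".format(param.name, param.shape)) +
for block in program.blocks: + for param in block.all_parameters(): + print("param: {}; shape: {}".format(param.name, param.shape))
ratios(list) - Pruning ratios for params, given as a list. Its length must equal the length of params.- -
place(paddle.fluid.Place) - The device where the parameters to prune reside; either CUDAPlace or CPUPlace.
+place(paddle.fluid.Place) - The device where the parameters to prune reside; either CUDAPlace or CPUPlace. Place concept introduction
@@ -253,82 +253,82 @@
lazy(bool) - When lazy is True, pruning is done by setting the parameters of the selected channels to zero, and the parameter shape stays unchanged; when lazy is False, the parameters of the selected channels are deleted and the parameter shape changes.
Example:
Click AIStudio to run the example code below. -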
import paddle.fluid as fluid -from paddle.fluid.param_attr import ParamAttr -from paddleslim.prune import Pruner - -def conv_bn_layer(input, - num_filters, - filter_size, - name, - stride=1, - groups=1, - act=None): - conv = fluid.layers.conv2d( - input=input, - num_filters=num_filters, - filter_size=filter_size, - stride=stride, - padding=(filter_size - 1) // 2, - groups=groups, - act=None, - param_attr=ParamAttr(name=name + "_weights"), - bias_attr=False, - name=name + "_out") - bn_name = name + "_bn" - return fluid.layers.batch_norm( - input=conv, - act=act, - name=bn_name + '_output', - param_attr=ParamAttr(name=bn_name + '_scale'), - bias_attr=ParamAttr(bn_name + '_offset'), - moving_mean_name=bn_name + '_mean', - moving_variance_name=bn_name + '_variance', ) - -main_program = fluid.Program() -startup_program = fluid.Program() -# X X O X O -# conv1-->conv2-->sum1-->conv3-->conv4-->sum2-->conv5-->conv6 -# | ^ | ^ -# |____________| |____________________| -# -# X: prune output channels -# O: prune input channels -with fluid.program_guard(main_program, startup_program): - input = fluid.data(name="image", shape=[None, 3, 16, 16]) - conv1 = conv_bn_layer(input, 8, 3, "conv1") - conv2 = conv_bn_layer(conv1, 8, 3, "conv2") - sum1 = conv1 + conv2 - conv3 = conv_bn_layer(sum1, 8, 3, "conv3") - conv4 = conv_bn_layer(conv3, 8, 3, "conv4") - sum2 = conv4 + sum1 - conv5 = conv_bn_layer(sum2, 8, 3, "conv5") - conv6 = conv_bn_layer(conv5, 8, 3, "conv6") - -place = fluid.CPUPlace() -exe = fluid.Executor(place) -scope = fluid.Scope() -exe.run(startup_program, scope=scope) -pruner = Pruner() -main_program, _, _ = pruner.prune( - main_program, - scope, - params=["conv4_weights"], - ratios=[0.5], - place=place, - lazy=False, - only_graph=False, - param_backup=False, - param_shape_backup=False) - -for param in main_program.global_block().all_parameters(): - if "weights" in param.name: - print("param name: {}; param shape: {}".format(param.name, param.shape)) +
import paddle.fluid as fluid +from paddle.fluid.param_attr import ParamAttr +from paddleslim.prune import Pruner + +def conv_bn_layer(input, + num_filters, + filter_size, + name, + stride=1, + groups=1, + act=None): + conv = fluid.layers.conv2d( + input=input, + num_filters=num_filters, + filter_size=filter_size, + stride=stride, + padding=(filter_size - 1) // 2, + groups=groups, + act=None, + param_attr=ParamAttr(name=name + "_weights"), + bias_attr=False, + name=name + "_out") + bn_name = name + "_bn" + return fluid.layers.batch_norm( + input=conv, + act=act, + name=bn_name + '_output', + param_attr=ParamAttr(name=bn_name + '_scale'), + bias_attr=ParamAttr(bn_name + '_offset'), + moving_mean_name=bn_name + '_mean', + moving_variance_name=bn_name + '_variance', ) + +main_program = fluid.Program() +startup_program = fluid.Program() +# X X O X O +# conv1-->conv2-->sum1-->conv3-->conv4-->sum2-->conv5-->conv6 +# | ^ | ^ +# |____________| |____________________| +# +# X: prune output channels +# O: prune input channels +with fluid.program_guard(main_program, startup_program): + input = fluid.data(name="image", shape=[None, 3, 16, 16]) + conv1 = conv_bn_layer(input, 8, 3, "conv1") + conv2 = conv_bn_layer(conv1, 8, 3, "conv2") + sum1 = conv1 + conv2 + conv3 = conv_bn_layer(sum1, 8, 3, "conv3") + conv4 = conv_bn_layer(conv3, 8, 3, "conv4") + sum2 = conv4 + sum1 + conv5 = conv_bn_layer(sum2, 8, 3, "conv5") + conv6 = conv_bn_layer(conv5, 8, 3, "conv6") + +place = fluid.CPUPlace() +exe = fluid.Executor(place) +scope = fluid.Scope() +exe.run(startup_program, scope=scope) +pruner = Pruner() +main_program, _, _ = pruner.prune( + main_program, + scope, + params=["conv4_weights"], + ratios=[0.5], + place=place, + lazy=False, + only_graph=False, + param_backup=False, + param_shape_backup=False) + +for param in main_program.global_block().all_parameters(): + if "weights" in param.name: + print("param name: {}; param shape: {}".format(param.name, param.shape))
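The difference between lazy=True and lazy=False described for `prune` can be pictured on a toy weight stored as a nested list shaped [output_channel, input_channel, k, k]. The helper below is hypothetical and only mimics the documented behaviour (zeroing versus deleting output channels); it does not call PaddleSlim.

```python
def zeros_like(nested):
    """Return a nested list of the same shape filled with 0.0."""
    if isinstance(nested, list):
        return [zeros_like(item) for item in nested]
    return 0.0


def prune_output_channels(weight, pruned_idx, lazy=False):
    """Prune the first (output channel) dimension of a conv weight."""
    pruned = set(pruned_idx)
    if lazy:
        # Shape is preserved; pruned channels are zeroed.
        return [zeros_like(f) if i in pruned else f for i, f in enumerate(weight)]
    # Shape changes; pruned channels are removed.
    return [f for i, f in enumerate(weight) if i not in pruned]


weight = [[[[1.0]]], [[[2.0]]], [[[3.0]]], [[[4.0]]]]    # shape [4, 1, 1, 1]
lazy_result = prune_output_channels(weight, [1, 3], lazy=True)
hard_result = prune_output_channels(weight, [1, 3], lazy=False)
print(len(lazy_result), len(hard_result))
```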
sensitivity#
-
-
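- paddleslim.prune.sensitivity(program, place, param_names, eval_func, sensitivities_file=None, pruned_ratios=None) [source] +
- paddleslim.prune.sensitivity(program, place, param_names, eval_func, sensitivities_file=None, pruned_ratios=None) source
-
Computes the sensitivity of each convolution layer in the network. For each layer, sensitivity is measured by pruning different proportions of its output channels in turn and computing the resulting accuracy loss on the test set. Once the sensitivities are known, each layer's pruning ratio can be chosen by inspection or by other means.
@@ -339,15 +339,15 @@
program(paddle.fluid.Program) - The target network to evaluate. For more about Program, see: Program concept introduction.
- -
place(paddle.fluid.Place) - The device where the parameters to analyze reside; either CUDAPlace or CPUPlace.
+place(paddle.fluid.Place) - The device where the parameters to analyze reside; either CUDAPlace or CPUPlace. Place concept introduction
param_names(list) - Names of the parameters of the convolution layers to analyze. The names of all parameters in the model can be listed as follows:for block in program.blocks: - for param in block.all_parameters(): - print("param: {}; shape: {}".format(param.name, param.shape)) +
for block in program.blocks: + for param in block.all_parameters(): + print("param: {}; shape: {}".format(param.name, param.shape))
-
@@ -365,116 +365,116 @@
- sensitivities(dict) - A dict holding the sensitivity information, in the following format:
- paddleslim.prune.merge_sensitive(sensitivities)[source] +
- paddleslim.prune.merge_sensitive(sensitivities)source
-
Merges multiple sets of sensitivity information.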
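One plausible reading of "merging" is combining several {parameter name: {pruning ratio: accuracy loss}} dicts, for example from runs that each covered a different subset of ratios. The helper below is a hypothetical sketch of that idea in plain Python; how `merge_sensitive` actually resolves overlapping entries is not stated here, so the last-one-wins rule is an assumption.

```python
def merge_sensitivity_dicts(sensitivities):
    """Merge a list of {param: {ratio: loss}} dicts into one dict."""
    merged = {}
    for sens in sensitivities:
        for param, losses in sens.items():
            # Later dicts overwrite earlier ones for the same (param, ratio).
            merged.setdefault(param, {}).update(losses)
    return merged


run_a = {"weight_0": {0.1: 0.22}, "weight_1": {0.1: 0.21}}
run_b = {"weight_0": {0.2: 0.33}, "weight_1": {0.2: 0.4}}
merged = merge_sensitivity_dicts([run_a, run_b])
print(merged)
```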
@@ -487,22 +487,22 @@
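- sensitivities(dict) - The merged sensitivity information, in the following format:
- paddleslim.prune.load_sensitivities(sensitivities_file)[source] +
- paddleslim.prune.load_sensitivities(sensitivities_file)source
-
Loads sensitivity information from a file.
@@ -518,7 +518,7 @@
- paddleslim.prune.get_ratios_by_loss(sensitivities, loss)[source] +
- paddleslim.prune.get_ratios_by_loss(sensitivities, loss)source
-
Computes a set of pruning ratios from the sensitivities and an accuracy-loss threshold. For a parameter w, its pruning ratio is the largest ratio that keeps the accuracy loss below loss.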
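The selection rule stated above can be sketched directly on a sensitivities dict. The helper `ratios_under_loss` is hypothetical and only considers the ratios that were actually measured; whether the library also interpolates between measured points is not covered here.

```python
def ratios_under_loss(sensitivities, loss):
    """For each parameter, pick the largest measured ratio whose accuracy loss is below `loss`."""
    ratios = {}
    for param, losses in sensitivities.items():
        allowed = [ratio for ratio, drop in losses.items() if drop < loss]
        if allowed:                 # skip parameters with no admissible ratio
            ratios[param] = max(allowed)
    return ratios


sensitivities = {
    "weight_0": {0.1: 0.22, 0.2: 0.33},
    "weight_1": {0.1: 0.21, 0.2: 0.4},
}
print(ratios_under_loss(sensitivities, 0.35))
```

With a threshold of 0.35, weight_0 can be pruned at ratio 0.2 (loss 0.33) while weight_1 is held at 0.1, because its loss at 0.2 is 0.4.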
diff --git a/api/quantization_api/index.html b/api/quantization_api/index.html
index 8e7d1707364f12f0eb7c8bde4120fa55eba4c838..87d88f93a84917f84685111c5f6f698cb9aa5b81 100644
--- a/api/quantization_api/index.html
+++ b/api/quantization_api/index.html
@@ -172,7 +172,7 @@
- Quantization
@@ -184,29 +184,50 @@
- weight_quantize_type(str) - Weight quantization method. One of 'abs_max', 'channel_wise_abs_max', 'range_abs_max', 'moving_average_abs_max'. Default: 'abs_max'.
- - activation_quantize_type(str) - Activation quantization method. One of 'abs_max', 'range_abs_max', 'moving_average_abs_max'. Default: 'abs_max'.
+ - weight_quantize_type(str) - Weight quantization method. One of 'abs_max', 'channel_wise_abs_max', 'range_abs_max', 'moving_average_abs_max'. If the quantized model will be loaded by TensorRT for inference, use 'channel_wise_abs_max'. Default: 'channel_wise_abs_max'.
+ - activation_quantize_type(str) - Activation quantization method. One of 'abs_max', 'range_abs_max', 'moving_average_abs_max'. If the quantized model will be loaded by TensorRT for inference, use 'range_abs_max' or 'moving_average_abs_max'. Default: 'moving_average_abs_max'.
- weight_bits(int) - Number of bits for weight quantization. Default 8; 8 is recommended.
- activation_bits(int) - Number of bits for activation quantization. Default 8; 8 is recommended.
- not_quant_pattern(str | list[str]) - Any op whose name_scope contains a 'not_quant_pattern' string is not quantized. For how to set this, see fluid.name_scope.
@@ -214,6 +235,14 @@
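- dtype(int8) - Parameter type after quantization. Default int8; only int8 is currently supported.
- window_size(int) - Window size for the 'range_abs_max' quantization method. Default 10000.
- moving_rate(float) - Decay coefficient for the 'moving_average_abs_max' quantization method. Default 0.9.
+ - for_tensorrt(bool) - Whether the quantized model will use TensorRT for inference. If so, the quantized op types are TENSORRT_OP_TYPES. Default False.
+ - is_full_quantize(bool) - Whether to quantize all supported op types. Default False. +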
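- Currently, the only ops with int8 kernels for acceleration in Paddle-Lite are ['conv2d', 'depthwise_conv2d', 'mul']; int8 kernels for other ops will be added over time.
- This interface changes the structure of program and may add some persistable variables, so when loading model parameters make sure they match the corresponding program.
- Internally this interface goes through fluid.Program -> fluid.framework.IrGraph -> fluid.Program. fluid.framework.IrGraph has no concept of Parameter; a Variable is only either persistable or not persistable. Use fluid.io.save_persistables and fluid.io.load_persistables to save and load parameters.
- Because this interface adds ops to program according to its structure and the quantization config, some Paddle strategies that speed up training through fuse op cannot be used. The following strategies are known to require False when quantization is used: fuse_all_reduce_ops, sync_batch_norm.
- If the program passed in contains a Variable that is not connected to any op, it is optimized away during quantization.
- paddleslim.quant.convert(program, place, config, scope=None, save_int8=False)[source] @@ -266,10 +295,10 @@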
- paddleslim.quant.quant_post(executor, model_dir, quantize_model_path,sample_generator, model_filename=None, params_filename=None, batch_size=16,batch_nums=None, scope=None, algo='KL', quantizable_op_type=["conv2d", "depthwise_conv2d", "mul"])[source] +
- paddleslim.quant.quant_post(executor, model_dir, quantize_model_path,sample_generator, model_filename=None, params_filename=None, batch_size=16,batch_nums=None, scope=None, algo='KL', quantizable_op_type=["conv2d", "depthwise_conv2d", "mul"], is_full_quantize=False, is_use_cache_file=False, cache_dir="./temp_post_training")[source]
-
Quantizes the model saved under ${model_dir}, using data from sample_generator to calibrate the parameters.
@@ -329,18 +358,24 @@
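- scope(fluid.Scope, optional) - Used to get and write Variable. If None, fluid.global_scope() is used. Default None.
- algo(str) - Name of the algorithm used for quantization; either 'KL' or 'direct'. This parameter applies only to activation quantization, because weights are quantized with 'channel_wise_abs_max'. With algo set to 'direct', the maximum absolute activation value over the calibration data is used as the Scale; with 'KL', the Scale is computed with the KL divergence method. Default 'KL'.
- quantizable_op_type(list[str]) - List of op types to quantize. Default ["conv2d", "depthwise_conv2d", "mul"].
+ - is_full_quantize(bool) - Whether to quantize all supported op types. If False, quantization follows the 'quantizable_op_type' setting.
+ - is_use_cache_file(bool) - Whether to store intermediate results on disk. If False, intermediate results are kept in memory. +
- cache_dir(str) - If 'is_use_cache_file' is True, intermediate results are stored under the path given by this parameter.
- Because this interface collects all activation values of the calibration data, set 'is_use_cache_file' to True when there are many calibration images, so that intermediate results are stored on disk. Computing the 'KL' divergence is also fairly time-consuming.
+ - Currently, the only ops with int8 kernels for acceleration in Paddle-Lite are ['conv2d', 'depthwise_conv2d', 'mul']; int8 kernels for other ops will be added over time.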
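The 'direct' option of algo, where the largest absolute activation seen during calibration becomes the Scale, can be sketched as below. The helpers are hypothetical plain-Python stand-ins, the clamp to 127 levels for 8 bits is an assumption, and the 'KL' option is not shown.

```python
def direct_scale(calibration_batches):
    """Scale = maximum absolute activation over all calibration batches."""
    return max(abs(v) for batch in calibration_batches for v in batch)


def quantize_activation(values, scale, bits=8):
    """Quantize activations to signed integers, clamping values beyond the scale."""
    levels = 2 ** (bits - 1) - 1      # assumed: 127 levels for 8 bits
    return [max(-levels, min(levels, round(v / scale * levels))) for v in values]


calibration = [[0.1, -2.0], [1.5, 0.3]]
scale = direct_scale(calibration)
print(scale, quantize_activation([2.0, -2.0, 0.0, 5.0], scale))
```

Because the scale is fixed after calibration, an activation larger than anything seen during calibration (5.0 above) saturates at the largest level.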
+ - Knowledge distillation
@@ -184,9 +184,9 @@
- paddleslim.dist.merge(teacher_program, student_program, data_name_map, place, scope=fluid.global_scope(), name_prefix='teacher_') [source] +
- paddleslim.dist.merge(teacher_program, student_program, data_name_map, place, scope=fluid.global_scope(), name_prefix='teacher_') [source]
-
-
merge fuses two paddle programs (teacher_program, student_program) into one program and returns the fused program. In the fused program, distillation loss functions can be added between suitable teacher and student feature maps, so that the teacher model's dark knowledge guides the student model's learning.
+merge fuses teacher_program into student_program. In the fused program, distillation loss functions can be added between suitable teacher and student feature maps, so that the teacher model's dark knowledge guides the student model's learning.
- scope(Scope) - The variable scope used by the program; if not specified, the default global scope is used. Default: fluid.global_scope()
- name_prefix(str) - The name prefix that merge adds uniformly to the teacher's Variables. Default: 'teacher_' -
- teacher_var_name(str): name of teacher_var. +
- teacher_var_name(str): name of teacher_var.
- student_var_name(str): name of student_var.
- program(Program): the fluid program used for distillation training. Default: fluid.default_main_program()
- teacher_var_name(str): name of teacher_var. -
- student_var_name(str): name of student_var. +
- teacher_var_name(str): name of teacher_var. +
- student_var_name(str): name of student_var.
- program(Program): the fluid program used for distillation training. Default: fluid.default_main_program() -
- teacher_temperature(float): temperature used to soften teacher_var; the higher the temperature, the smoother the resulting feature map -
- student_temperature(float): temperature used to soften student_var; the higher the temperature, the smoother the resulting feature map +
- teacher_temperature(float): temperature used to soften teacher_var; the higher the temperature, the smoother the resulting feature map +
- student_temperature(float): temperature used to soften student_var; the higher the temperature, the smoother the resulting feature map
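The role of the two temperature parameters can be illustrated with a temperature-scaled softmax and a cross-entropy between the softened teacher and student outputs. This is a common formulation of a soft-label distillation loss written in plain Python; the function names are hypothetical, and whether PaddleSlim computes its loss in exactly this form is an assumption.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax of logits / temperature; a higher temperature gives a smoother distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                          # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_cross_entropy(teacher_logits, student_logits,
                       teacher_temperature=1.0, student_temperature=1.0):
    """Cross-entropy of the softened student distribution against the softened teacher."""
    teacher = softmax(teacher_logits, teacher_temperature)
    student = softmax(student_logits, student_temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher, student))


teacher_logits = [4.0, 1.0, 0.5]
student_logits = [2.5, 1.5, 0.2]
sharp = softmax(teacher_logits, temperature=1.0)
smooth = softmax(teacher_logits, temperature=4.0)
loss = soft_cross_entropy(teacher_logits, student_logits, 2.0, 2.0)
print(sharp, smooth, loss)
```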
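- loss_func(python function): custom loss function; takes teacher var and student var as input and returns the custom loss +
- loss_func(python function): custom loss function; takes teacher var and student var as input and returns the custom loss
- program(Program): the fluid program used for distillation training. Default: fluid.default_main_program()
- **kwargs: mapping from loss_func input names to the corresponding variable names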
- Home
@@ -211,15 +211,15 @@
- Install the develop version
- Install the latest official release
- 1. Image classification +
- 1. Image classification
- Model zoo
@@ -200,7 +200,7 @@
{"weight_0": - {0.1: 0.22, - 0.2: 0.33 - }, - "weight_1": - {0.1: 0.21, - 0.2: 0.4 - } -} +
{"weight_0": + {0.1: 0.22, + 0.2: 0.33 + }, + "weight_1": + {0.1: 0.21, + 0.2: 0.4 + } +}
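Here, weight_0 is the name of a convolution layer's parameter; in sensitivities['weight_0'], each key is a pruning ratio and each value is the corresponding accuracy loss ratio.
Example:
Click AIStudio to run the example code below.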
-import paddle -import numpy as np -import paddle.fluid as fluid -from paddle.fluid.param_attr import ParamAttr -from paddleslim.prune import sensitivity -import paddle.dataset.mnist as reader - -def conv_bn_layer(input, - num_filters, - filter_size, - name, - stride=1, - groups=1, - act=None): - conv = fluid.layers.conv2d( - input=input, - num_filters=num_filters, - filter_size=filter_size, - stride=stride, - padding=(filter_size - 1) // 2, - groups=groups, - act=None, - param_attr=ParamAttr(name=name + "_weights"), - bias_attr=False, - name=name + "_out") - bn_name = name + "_bn" - return fluid.layers.batch_norm( - input=conv, - act=act, - name=bn_name + '_output', - param_attr=ParamAttr(name=bn_name + '_scale'), - bias_attr=ParamAttr(bn_name + '_offset'), - moving_mean_name=bn_name + '_mean', - moving_variance_name=bn_name + '_variance', ) - -main_program = fluid.Program() -startup_program = fluid.Program() -# X X O X O -# conv1-->conv2-->sum1-->conv3-->conv4-->sum2-->conv5-->conv6 -# | ^ | ^ -# |____________| |____________________| -# -# X: prune output channels -# O: prune input channels -image_shape = [1,28,28] -with fluid.program_guard(main_program, startup_program): - image = fluid.data(name='image', shape=[None]+image_shape, dtype='float32') - label = fluid.data(name='label', shape=[None, 1], dtype='int64') - conv1 = conv_bn_layer(image, 8, 3, "conv1") - conv2 = conv_bn_layer(conv1, 8, 3, "conv2") - sum1 = conv1 + conv2 - conv3 = conv_bn_layer(sum1, 8, 3, "conv3") - conv4 = conv_bn_layer(conv3, 8, 3, "conv4") - sum2 = conv4 + sum1 - conv5 = conv_bn_layer(sum2, 8, 3, "conv5") - conv6 = conv_bn_layer(conv5, 8, 3, "conv6") - out = fluid.layers.fc(conv6, size=10, act="softmax") -# cost = fluid.layers.cross_entropy(input=out, label=label) -# avg_cost = fluid.layers.mean(x=cost) - acc_top1 = fluid.layers.accuracy(input=out, label=label, k=1) -# acc_top5 = fluid.layers.accuracy(input=out, label=label, k=5) - - -place = fluid.CPUPlace() -exe = 
fluid.Executor(place) -exe.run(startup_program) - -val_reader = paddle.batch(reader.test(), batch_size=128) -val_feeder = feeder = fluid.DataFeeder( - [image, label], place, program=main_program) - -def eval_func(program): - - acc_top1_ns = [] - for data in val_reader(): - acc_top1_n = exe.run(program, - feed=val_feeder.feed(data), - fetch_list=[acc_top1.name]) - acc_top1_ns.append(np.mean(acc_top1_n)) - return np.mean(acc_top1_ns) -param_names = [] -for param in main_program.global_block().all_parameters(): - if "weights" in param.name: - param_names.append(param.name) -sensitivities = sensitivity(main_program, - place, - param_names, - eval_func, - sensitivities_file="./sensitive.data", - pruned_ratios=[0.1, 0.2, 0.3]) -print(sensitivities) +
import paddle +import numpy as np +import paddle.fluid as fluid +from paddle.fluid.param_attr import ParamAttr +from paddleslim.prune import sensitivity +import paddle.dataset.mnist as reader + +def conv_bn_layer(input, + num_filters, + filter_size, + name, + stride=1, + groups=1, + act=None): + conv = fluid.layers.conv2d( + input=input, + num_filters=num_filters, + filter_size=filter_size, + stride=stride, + padding=(filter_size - 1) // 2, + groups=groups, + act=None, + param_attr=ParamAttr(name=name + "_weights"), + bias_attr=False, + name=name + "_out") + bn_name = name + "_bn" + return fluid.layers.batch_norm( + input=conv, + act=act, + name=bn_name + '_output', + param_attr=ParamAttr(name=bn_name + '_scale'), + bias_attr=ParamAttr(bn_name + '_offset'), + moving_mean_name=bn_name + '_mean', + moving_variance_name=bn_name + '_variance', ) + +main_program = fluid.Program() +startup_program = fluid.Program() +# X X O X O +# conv1-->conv2-->sum1-->conv3-->conv4-->sum2-->conv5-->conv6 +# | ^ | ^ +# |____________| |____________________| +# +# X: prune output channels +# O: prune input channels +image_shape = [1,28,28] +with fluid.program_guard(main_program, startup_program): + image = fluid.data(name='image', shape=[None]+image_shape, dtype='float32') + label = fluid.data(name='label', shape=[None, 1], dtype='int64') + conv1 = conv_bn_layer(image, 8, 3, "conv1") + conv2 = conv_bn_layer(conv1, 8, 3, "conv2") + sum1 = conv1 + conv2 + conv3 = conv_bn_layer(sum1, 8, 3, "conv3") + conv4 = conv_bn_layer(conv3, 8, 3, "conv4") + sum2 = conv4 + sum1 + conv5 = conv_bn_layer(sum2, 8, 3, "conv5") + conv6 = conv_bn_layer(conv5, 8, 3, "conv6") + out = fluid.layers.fc(conv6, size=10, act="softmax") +# cost = fluid.layers.cross_entropy(input=out, label=label) +# avg_cost = fluid.layers.mean(x=cost) + acc_top1 = fluid.layers.accuracy(input=out, label=label, k=1) +# acc_top5 = fluid.layers.accuracy(input=out, label=label, k=5) + + +place = fluid.CPUPlace() +exe = fluid.Executor(place) 
+exe.run(startup_program) + +val_reader = paddle.batch(reader.test(), batch_size=128) +val_feeder = feeder = fluid.DataFeeder( + [image, label], place, program=main_program) + +def eval_func(program): + + acc_top1_ns = [] + for data in val_reader(): + acc_top1_n = exe.run(program, + feed=val_feeder.feed(data), + fetch_list=[acc_top1.name]) + acc_top1_ns.append(np.mean(acc_top1_n)) + return np.mean(acc_top1_ns) +param_names = [] +for param in main_program.global_block().all_parameters(): + if "weights" in param.name: + param_names.append(param.name) +sensitivities = sensitivity(main_program, + place, + param_names, + eval_func, + sensitivities_file="./sensitive.data", + pruned_ratios=[0.1, 0.2, 0.3]) +print(sensitivities)
merge_sensitive#
-
-
{"weight_0": - {0.1: 0.22, - 0.2: 0.33 - }, - "weight_1": - {0.1: 0.21, - 0.2: 0.4 - } -} +
{"weight_0": + {0.1: 0.22, + 0.2: 0.33 + }, + "weight_1": + {0.1: 0.21, + 0.2: 0.4 + } +}
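Here, weight_0 is the name of a convolution layer's parameter; in sensitivities['weight_0'], each key is a pruning ratio and each value is the corresponding accuracy loss ratio.
Example: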
load_sensitivities#
-
-
Example:
get_ratios_by_loss#
-
-
Quantization configuration#
Quantization parameters are configured through a dictionary
-quant_config_default = { - 'weight_quantize_type': 'abs_max', - 'activation_quantize_type': 'abs_max', - 'weight_bits': 8, - 'activation_bits': 8, - # ops of name_scope in not_quant_pattern list, will not be quantized - 'not_quant_pattern': ['skip_quant'], - # ops of type in quantize_op_types, will be quantized - 'quantize_op_types': - ['conv2d', 'depthwise_conv2d', 'mul', 'elementwise_add', 'pool2d'], - # data type after quantization, such as 'uint8', 'int8', etc. default is 'int8' - 'dtype': 'int8', - # window size for 'range_abs_max' quantization. defaulf is 10000 - 'window_size': 10000, - # The decay coefficient of moving average, default is 0.9 - 'moving_rate': 0.9, +
TENSORRT_OP_TYPES = [ + 'mul', 'conv2d', 'pool2d', 'depthwise_conv2d', 'elementwise_add', + 'leaky_relu' +] +TRANSFORM_PASS_OP_TYPES = ['conv2d', 'depthwise_conv2d', 'mul'] + +QUANT_DEQUANT_PASS_OP_TYPES = [ + "pool2d", "elementwise_add", "concat", "softmax", "argmax", "transpose", + "equal", "gather", "greater_equal", "greater_than", "less_equal", + "less_than", "mean", "not_equal", "reshape", "reshape2", + "bilinear_interp", "nearest_interp", "trilinear_interp", "slice", + "squeeze", "elementwise_sub", "relu", "relu6", "leaky_relu", "tanh", "swish" + ] + +_quant_config_default = { + # weight quantize type, default is 'channel_wise_abs_max' + 'weight_quantize_type': 'channel_wise_abs_max', + # activation quantize type, default is 'moving_average_abs_max' + 'activation_quantize_type': 'moving_average_abs_max', + # weight quantize bit num, default is 8 + 'weight_bits': 8, + # activation quantize bit num, default is 8 + 'activation_bits': 8, + # ops of name_scope in not_quant_pattern list, will not be quantized + 'not_quant_pattern': ['skip_quant'], + # ops of type in quantize_op_types, will be quantized + 'quantize_op_types': ['conv2d', 'depthwise_conv2d', 'mul'], + # data type after quantization, such as 'uint8', 'int8', etc. default is 'int8' + 'dtype': 'int8', + # window size for 'range_abs_max' quantization. defaulf is 10000 + 'window_size': 10000, + # The decay coefficient of moving average, default is 0.9 + 'moving_rate': 0.9, + # if True, 'quantize_op_types' will be TENSORRT_OP_TYPES + 'for_tensorrt': False, + # if True, 'quantoze_op_types' will be TRANSFORM_PASS_OP_TYPES + QUANT_DEQUANT_PASS_OP_TYPES + 'is_full_quantize': False }
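To show how a 'moving_rate' of 0.9 could shape the activation scale under 'moving_average_abs_max', the sketch below tracks a bias-corrected moving average of per-batch abs_max values. The class is a hypothetical plain-Python illustration; the exact update rule used inside Paddle is an assumption here.

```python
class MovingAverageAbsMax:
    """Tracks a moving average of per-batch abs_max values as a quantization scale."""

    def __init__(self, moving_rate=0.9):
        self.moving_rate = moving_rate
        self.accum = 0.0    # decayed sum of batch abs_max values
        self.state = 0.0    # decayed count, used to correct the start-up bias

    def update(self, batch):
        batch_max = max(abs(v) for v in batch)
        self.accum = self.moving_rate * self.accum + batch_max
        self.state = self.moving_rate * self.state + 1.0
        return self.accum / self.state


tracker = MovingAverageAbsMax(moving_rate=0.9)
first_scale = tracker.update([1.0, -2.0])     # only one batch seen so far
second_scale = tracker.update([4.0, 0.5])     # blends the old and new abs_max
print(first_scale, second_scale)
```

A moving_rate closer to 1 makes the scale react more slowly to a batch with unusually large activations.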
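Parameters:
-
-
++Notes
+-
+
quant_aware#
-
@@ -237,13 +266,13 @@
Notes
+convert#
Notes
-Because this interface deletes and modifies op and Variable accordingly, it can only be called after training is complete. To convert an intermediate model from training, load the corresponding parameters first and then use this interface.
Because this interface deletes and modifies op and Variable accordingly, it can only be called after training is complete. To convert an intermediate model from training, load the corresponding parameters first and then use this interface.
Code example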
-#encoding=utf8 +
#encoding=utf8 import paddle.fluid as fluid import paddleslim.quant as quant @@ -311,7 +340,7 @@
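For more detailed usage, see the quantization training demo.
quant_post#
-
-
Returns
None.
Notes
-Because this interface collects all activation values of the calibration data, do not use too many calibration images. Computing the 'KL' divergence is also fairly time-consuming.-
+
Code example
Note: this example cannot be run directly, because it needs to load the model under ${model_dir}.
import paddle.fluid as fluid +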
import paddle.fluid as fluid import paddle.dataset.mnist as reader from paddleslim.quant import quant_post val_reader = reader.train() @@ -383,7 +418,7 @@
Return type
fluid.Program
Code example -
import paddle.fluid as fluid +
import paddle.fluid as fluid import paddleslim.quant as quant train_program = fluid.Program() diff --git a/api/single_distiller_api/index.html b/api/single_distiller_api/index.html index 334c738d15c0c23c78fd4050a96f012ca220c810..11bb3de84331c403ca527ddb20718897d178e2d2 100644 --- a/api/single_distiller_api/index.html +++ b/api/single_distiller_api/index.html @@ -172,7 +172,7 @@
merge#
-
-
Parameters:
@@ -198,13 +198,13 @@
-Returns: the program obtained by merging student_program and teacher_program
+Returns: None
Note
data_name_map maps teacher_var names to student_var names; if the mapping is written in the reverse direction, the merge may not be performed correctly.
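The direction of `data_name_map` can be illustrated with plain variable names. The sketch below is not PaddleSlim code: `merged_names` is a hypothetical helper, and the `teacher_` prefix is an assumption inferred from names such as `teacher_t1.tmp_1` in the usage examples on this page.

```python
def merged_names(teacher_vars, student_vars, data_name_map, prefix='teacher_'):
    """Hypothetical helper: list the variable names after a merge.

    Teacher inputs listed in data_name_map reuse the mapped student
    input; every other teacher variable is copied under the prefix.
    """
    merged = list(student_vars)
    for name in teacher_vars:
        if name in data_name_map:
            if data_name_map[name] not in student_vars:
                raise KeyError(data_name_map[name])
            continue  # shared with the student input, nothing to copy
        merged.append(prefix + name)
    return merged


teacher = ['y', 't1.tmp_1']
student = ['x', 's1.tmp_1']

correct = merged_names(teacher, student, {'y': 'x'})
reversed_map = merged_names(teacher, student, {'x': 'y'})
print(correct)
print(reversed_map)
```

With the reversed map, the teacher input `y` is no longer tied to the student input `x` and ends up as a separate `teacher_y` variable, which is the kind of mismatch the note warns about.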
Usage example:
-import paddle.fluid as fluid +
@@ -241,7 +241,7 @@import paddle.fluid as fluid import paddleslim.dist as dist student_program = fluid.Program() with fluid.program_guard(student_program): @@ -220,7 +220,7 @@ data_name_map = {'y':'x'} USE_GPU = False place = fluid.CUDAPlace(0) if USE_GPU else fluid.CPUPlace() -main_program = dist.merge(teacher_program, student_program, +dist.merge(teacher_program, student_program, data_name_map, place)
Returns: the fsp_loss computed from teacher_var1, teacher_var2, student_var1, and student_var2
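As a rough illustration of what an FSP-style loss combines, the sketch below builds the FSP matrix of two feature maps (the spatially averaged inner product of each channel pair) and takes the mean squared difference between the teacher and student matrices. It uses nested lists for a single sample and assumes the standard FSP definition; it is not the library's implementation.

```python
def fsp_matrix(feat_a, feat_b):
    """feat_a: [C1][H][W], feat_b: [C2][H][W] -> FSP matrix [C1][C2]."""
    height, width = len(feat_a[0]), len(feat_a[0][0])
    area = float(height * width)
    return [
        [
            sum(ch_a[h][w] * ch_b[h][w]
                for h in range(height) for w in range(width)) / area
            for ch_b in feat_b
        ]
        for ch_a in feat_a
    ]


def fsp_loss(teacher_1, teacher_2, student_1, student_2):
    """Mean squared difference between teacher and student FSP matrices."""
    g_teacher = fsp_matrix(teacher_1, teacher_2)
    g_student = fsp_matrix(student_1, student_2)
    diffs = [
        (s - t) ** 2
        for row_t, row_s in zip(g_teacher, g_student)
        for t, s in zip(row_t, row_s)
    ]
    return sum(diffs) / len(diffs)


t1 = [[[1.0, 2.0]]]  # one channel, 1 x 2 spatial size
t2 = [[[3.0, 4.0]]]
s1 = [[[1.0, 0.0]]]
s2 = [[[1.0, 1.0]]]
print(fsp_loss(t1, t2, s1, s2))
```

Both variables of a pair need the same spatial size, and the teacher and student pairs need matching channel counts so that the two matrices can be compared element by element.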
Usage example:
-import paddle.fluid as fluid +
@@ -272,13 +272,13 @@import paddle.fluid as fluid import paddleslim.dist as dist student_program = fluid.Program() with fluid.program_guard(student_program): @@ -257,8 +257,8 @@ data_name_map = {'y':'x'} USE_GPU = False place = fluid.CUDAPlace(0) if USE_GPU else fluid.CPUPlace() -main_program = merge(teacher_program, student_program, data_name_map, place) -with fluid.program_guard(main_program): +merge(teacher_program, student_program, data_name_map, place) +with fluid.program_guard(student_program): distillation_loss = dist.fsp_loss('teacher_t1.tmp_1', 'teacher_t2.tmp_1', 's1.tmp_1', 's2.tmp_1', main_program)
Parameters:
-
-
Returns: the l2_loss computed from teacher_var and student_var
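A minimal sketch of an L2 distillation loss, assuming it is the mean of the squared element-wise differences between the teacher and student values. Flat Python lists stand in for tensors; the function is illustrative only.

```python
def l2_loss(teacher_values, student_values):
    """Mean squared difference between two equally sized value lists."""
    if len(teacher_values) != len(student_values):
        raise ValueError("teacher and student must have the same size")
    squared = [(s - t) ** 2 for t, s in zip(teacher_values, student_values)]
    return sum(squared) / len(squared)


print(l2_loss([1.0, 2.0], [1.0, 4.0]))
```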
Usage example:
-import paddle.fluid as fluid +
@@ -309,15 +309,15 @@import paddle.fluid as fluid import paddleslim.dist as dist student_program = fluid.Program() with fluid.program_guard(student_program): @@ -294,8 +294,8 @@ data_name_map = {'y':'x'} USE_GPU = False place = fluid.CUDAPlace(0) if USE_GPU else fluid.CPUPlace() -main_program = merge(teacher_program, student_program, data_name_map, place) -with fluid.program_guard(main_program): +merge(teacher_program, student_program, data_name_map, place) +with fluid.program_guard(student_program): distillation_loss = dist.l2_loss('teacher_t2.tmp_1', 's2.tmp_1', main_program)
Parameters:
-
-
Returns: the soft_label_loss computed from teacher_var and student_var
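The sketch below shows one common form of a soft-label loss for a single sample: both sets of logits are divided by a temperature, passed through softmax, and compared with cross-entropy using the teacher distribution as the target. It assumes that the two trailing `1.` arguments in the usage example are the teacher and student temperatures; the function names are illustrative.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits / temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_label_loss(teacher_logits, student_logits,
                    teacher_temperature=1.0, student_temperature=1.0):
    """Cross-entropy of the student distribution against soft teacher labels."""
    teacher_probs = softmax(teacher_logits, teacher_temperature)
    student_probs = softmax(student_logits, student_temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))


matched = soft_label_loss([2.0, 0.0], [2.0, 0.0])
mismatched = soft_label_loss([2.0, 0.0], [0.0, 2.0])
print(matched, mismatched)
```

A higher temperature flattens the distribution, so the student also receives signal from the classes the teacher considers unlikely.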
Usage example:
-import paddle.fluid as fluid +
@@ -348,13 +348,13 @@import paddle.fluid as fluid import paddleslim.dist as dist student_program = fluid.Program() with fluid.program_guard(student_program): @@ -333,8 +333,8 @@ data_name_map = {'y':'x'} USE_GPU = False place = fluid.CUDAPlace(0) if USE_GPU else fluid.CPUPlace() -main_program = merge(teacher_program, student_program, data_name_map, place) -with fluid.program_guard(main_program): +merge(teacher_program, student_program, data_name_map, place) +with fluid.program_guard(student_program): distillation_loss = dist.soft_label_loss('teacher_t2.tmp_1', 's2.tmp_1', main_program, 1., 1.)
Parameters:
-
-
Returns: the loss computed by the user-defined loss function
Usage example:
-import paddle.fluid as fluid +
diff --git a/index.html b/index.html index 50fd8ff300c708a82f23e82fcbccc4d40e25c726..f006c752332664dcc875cb52a8d1e649f5c0b0c9 100644 --- a/index.html +++ b/index.html @@ -168,7 +168,7 @@import paddle.fluid as fluid import paddleslim.dist as dist student_program = fluid.Program() with fluid.program_guard(student_program): @@ -370,13 +370,13 @@ data_name_map = {'y':'x'} USE_GPU = False place = fluid.CUDAPlace(0) if USE_GPU else fluid.CPUPlace() -main_program = merge(teacher_program, student_program, data_name_map, place) +merge(teacher_program, student_program, data_name_map, place) def adaptation_loss(t_var, s_var): teacher_channel = t_var.shape[1] s_hint = fluid.layers.conv2d(s_var, teacher_channel, 1) hint_loss = fluid.layers.reduce_mean(fluid.layers.square(s_hint - t_var)) return hint_loss -with fluid.program_guard(main_program): +with fluid.program_guard(student_program): distillation_loss = dist.loss(main_program, adaptation_loss, t_var='teacher_t2.tmp_1', s_var='s2.tmp_1')
git clone https://github.com/PaddlePaddle/PaddleSlim.git -cd PaddleSlim -python setup.py install +
git clone https://github.com/PaddlePaddle/PaddleSlim.git +cd PaddleSlim +python setup.py install
pip install paddleslim -i https://pypi.org/simple +
pip install paddleslim -i https://pypi.org/simple
-
@@ -289,5 +289,5 @@
diff --git a/model_zoo/index.html b/model_zoo/index.html
index 7721f390c729834cd4ea8ae3f58b5d009c198734..a39d7a00a301bf5d20283162e3f88058c459afa8 100644
--- a/model_zoo/index.html
+++ b/model_zoo/index.html
@@ -58,7 +58,7 @@
Model Zoo
-
-
-
@@ -190,7 +190,7 @@
-1. Image Classification (图像分类)#
+1. Image Classification (图象分类)#
Dataset: ImageNet (1000 classes)
1.1 Quantization#
Returns: a model architecture instance built from the tokens passed in.
Example code: -
import paddle.fluid as fluid -input = fluid.data(name='input', shape=[None, 3, 32, 32], dtype='float32') -archs = sanas.token2arch(tokens) -for arch in archs: - output = arch(input) - input = output +import paddle.fluid as fluid +input = fluid.data(name='input', shape=[None, 3, 32, 32], dtype='float32') +archs = sanas.token2arch(tokens) +for arch in archs: + output = arch(input) + input = output
Returns: a list of model architecture instances.
Example code: -
import paddle.fluid as fluid -input = fluid.data(name='input', shape=[None, 3, 32, 32], dtype='float32') -archs = sanas.next_archs() -for arch in archs: - output = arch(input) - input = output +import paddle.fluid as fluid +input = fluid.data(name='input', shape=[None, 3, 32, 32], dtype='float32') +archs = sanas.next_archs() +for arch in archs: + output = arch(input) + input = output
Pruner#
-
-
Returns: an instance of the Pruner class
Example code:
-from paddleslim.prune import Pruner -pruner = Pruner() +from paddleslim.prune import Pruner +pruner = Pruner()
-
-
program (paddle.fluid.Program) - The target network to prune. For more about Program, see: Introduction to the Program concept.