paddleslim.prune package

class paddleslim.prune.Pruner(criterion='l1_norm')

The pruner used to prune the channels of convolutions.

Parameters:criterion (str) – The criterion used to sort channels for pruning. Only ‘l1_norm’ is currently supported.
prune(program, scope, params, ratios, place=None, lazy=False, only_graph=False, param_backup=False, param_shape_backup=False)

Prune the given parameters.

Parameters:
  • program (fluid.Program) – The program to be pruned.
  • scope (fluid.Scope) – The scope storing the parameters to be pruned.
  • params (list<str>) – A list of parameter names to be pruned.
  • ratios (list<float>) – A list of ratios used to prune the parameters, one per parameter in params.
  • place (fluid.Place) – The device place of filter parameters. Default: None.
  • lazy (bool) – True means setting the pruned elements to zero. False means removing the pruned elements. Default: False.
  • only_graph (bool) – True means only modifying the graph. False means modifying graph and variables in scope. Default: False.
  • param_backup (bool) – Whether to return a dict to backup the values of parameters. Default: False.
  • param_shape_backup (bool) – Whether to return a dict to backup the shapes of parameters. Default: False.
Returns:

(pruned_program, param_backup, param_shape_backup). pruned_program is the pruned program, param_backup is a dict backing up the values of parameters, and param_shape_backup is a dict backing up the shapes of parameters.

Return type:

tuple
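
A minimal usage sketch of Pruner.prune on a toy network follows. The network definition, the parameter name conv1_weights, and the surrounding fluid calls are illustrative assumptions, not part of this API reference.

import paddle.fluid as fluid
from paddleslim.prune import Pruner

# Build a toy program with a single convolution whose filters will be pruned.
main_prog = fluid.Program()
startup_prog = fluid.Program()
with fluid.program_guard(main_prog, startup_prog):
    image = fluid.data(name='image', shape=[None, 3, 32, 32], dtype='float32')
    conv = fluid.layers.conv2d(
        image, num_filters=32, filter_size=3,
        param_attr=fluid.ParamAttr(name='conv1_weights'))

place = fluid.CPUPlace()
exe = fluid.Executor(place)
exe.run(startup_prog)
scope = fluid.global_scope()

# Prune 50% of the channels of conv1_weights by l1_norm and keep backups.
pruner = Pruner(criterion='l1_norm')
pruned_program, param_backup, param_shape_backup = pruner.prune(
    main_prog, scope,
    params=['conv1_weights'],
    ratios=[0.5],
    place=place,
    param_backup=True,
    param_shape_backup=True)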

class paddleslim.prune.AutoPruner(program, scope, place, params=[], init_ratios=None, pruned_flops=0.5, pruned_latency=None, server_addr=('', 0), init_temperature=100, reduce_rate=0.85, max_try_times=300, max_client_num=10, search_steps=300, max_ratios=[0.9], min_ratios=[0], key='auto_pruner', is_server=True)

Bases: object

Search for a group of ratios used to prune the program.

Parameters:
  • program (Program) – The program to be pruned.
  • scope (Scope) – The scope to be pruned.
  • place (fluid.Place) – The device place of parameters.
  • params (list<str>) – The names of parameters to be pruned.
  • init_ratios (list<float>|float) – Initial ratios used to prune the parameters in params. A list means one ratio for each parameter in params, and its length should equal the length of params. A scalar means all the parameters in params are pruned by the same ratio. None means a group of initial ratios is derived from pruned_flops or pruned_latency. Default: None.
  • pruned_flops (float) – The percentage of FLOPs to be pruned. Default: 0.5.
  • pruned_latency (float) – The percentage of latency to be pruned. Default: None.
  • server_addr (tuple) – A tuple of the server IP and port of the controller server.
  • init_temperature (float) – The initial temperature used in the simulated annealing search strategy.
  • reduce_rate (float) – The decay rate used in the simulated annealing search strategy.
  • max_try_times (int) – The maximum number of attempts to generate legal tokens.
  • max_client_num (int) – The maximum number of connections to the controller server.
  • search_steps (int) – The number of search steps.
  • max_ratios (float|list<float>) – Maximum ratios used to prune the parameters in params. A list means one maximum ratio for each parameter in params, and its length should equal the length of params. A scalar is used for all the parameters in params.
  • min_ratios (float|list<float>) – Minimum ratios used to prune the parameters in params. A list means one minimum ratio for each parameter in params, and its length should equal the length of params. A scalar is used for all the parameters in params.
  • key (str) – Identity used in communication between the controller server and clients.
  • is_server (bool) – Whether the current host is the controller server. Default: True.
prune(program, eval_program=None)

Prune the program with the latest tokens generated by the controller.

Parameters:program (fluid.Program) – The program to be pruned.
Returns:The pruned program.
Return type:paddle.fluid.Program
reward(score)

Report the reward of the current pruned program to the controller.

Parameters:score (float) – The score of the pruned program.
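
The sketch below shows the typical prune/reward loop of AutoPruner. train_program, place, the parameter names, and the eval_top1 helper are assumptions for illustration, not part of this API reference.

import paddle.fluid as fluid
from paddleslim.prune import AutoPruner

# Assumed to exist: train_program, place, and an evaluation helper eval_top1
# that runs the test set on a program and returns a scalar accuracy.
pruner = AutoPruner(
    train_program, fluid.global_scope(), place,
    params=['conv1_weights', 'conv2_weights'],
    init_ratios=[0.3, 0.3],
    pruned_flops=0.5,
    search_steps=100,
    is_server=True)

for step in range(100):
    pruned_program = pruner.prune(train_program)
    # Optionally fine-tune pruned_program for a few epochs here.
    score = eval_top1(pruned_program)   # hypothetical evaluation helper
    pruner.reward(score)                # feed the score back to the controller
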
class paddleslim.prune.SensitivePruner(place, eval_func, scope=None, checkpoints=None)

Bases: object

A pruner that prunes parameters iteratively according to the sensitivities of the parameters in each step.

Parameters:
  • place (fluid.CUDAPlace | fluid.CPUPlace) – The device place where the program executes.
  • eval_func (function) – A callback function used to evaluate the pruned program. It takes the pruned program as its argument and returns a score for that program.
  • scope (fluid.Scope) – The scope used to execute the program.
  • checkpoints (str) – The directory used to save and load checkpoints. Default: None.
get_ratios_by_sensitive(sensitivities, pruned_flops, eval_program)

Search for a group of ratios to reach the target pruned FLOPs.

Parameters:
  • sensitivities (dict) – The sensitivities used to generate a group of pruning ratios. The key of the dict is the name of the parameter to be pruned. The value is a list of tuples in the format (pruned_ratio, accuracy_loss).
  • pruned_flops (float) – The percentage of FLOPs to be pruned.
  • eval_program (Program) – The program whose FLOPS is considered.
Returns:

A group of ratios. The key of the dict is the parameter name and the value is the ratio to be pruned.

Return type:

dict

greedy_prune(train_program, eval_program, params, pruned_flops_rate, topk=1)
prune(train_program, eval_program, params, pruned_flops)

Prune the parameters of the training and evaluation networks according to their sensitivities in the current step.

Parameters:
  • train_program (fluid.Program) – The training program to be pruned.
  • eval_program (fluid.Program) – The evaluation program to be pruned. It is also used to calculate the sensitivities of parameters.
  • params (list<str>) – The parameters to be pruned.
  • pruned_flops (float) – The ratio of FLOPs to be pruned in the current step.
Returns:

A tuple of the pruned training program and the pruned evaluation program.

Return type:

tuple

restore(checkpoints=None)
save_checkpoint(train_program, eval_program)
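
A sketch of the iterative pruning loop with SensitivePruner follows. train_program, eval_program, place, the parameter names, and the eval_top1 callback body are illustrative assumptions.

import paddle.fluid as fluid
from paddleslim.prune import SensitivePruner

def eval_top1(program):
    # Placeholder: run the test set on `program` and return top-1 accuracy.
    return 0.9

pruner = SensitivePruner(place, eval_top1, scope=fluid.global_scope())

# Prune 10% of FLOPs per step, fine-tune, and checkpoint between steps.
for step in range(5):
    train_program, eval_program = pruner.prune(
        train_program, eval_program,
        params=['conv1_weights', 'conv2_weights'],
        pruned_flops=0.1)
    # Fine-tune train_program here before the next pruning step.
    pruner.save_checkpoint(train_program, eval_program)
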
paddleslim.prune.sensitivity(program, place, param_names, eval_func, sensitivities_file=None, pruned_ratios=None)

Compute the sensitivities of the convolutions in a model. The sensitivity of a convolution is the loss of accuracy on the test dataset under different pruning ratios. The sensitivities can be used to find a group of best ratios under a given condition. This function returns a dict storing sensitivities as below:

{"weight_0":
    {0.1: 0.22,
     0.2: 0.33
    },
  "weight_1":
    {0.1: 0.21,
     0.2: 0.4
    }
}

weight_0 is the parameter name of a convolution. sensitivities['weight_0'] is a dict whose keys are pruned ratios and whose values are the percentage of accuracy loss.

Parameters:
  • program (paddle.fluid.Program) – The program to be analyzed.
  • place (fluid.CPUPlace | fluid.CUDAPlace) – The device place of filter parameters.
  • param_names (list) – The parameter names of the convolutions to be analyzed.
  • eval_func (function) – The callback function used to evaluate the model. It should accept an instance of paddle.fluid.Program as its argument and return a score on the test dataset.
  • sensitivities_file (str) – The file used to save the sensitivities. The latest computed sensitivities are appended to this file, and sensitivities already stored in the file will not be computed again. The file can be loaded with the pickle library.
  • pruned_ratios (list) – The ratios to be pruned. Default: [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9].
Returns:

A dict storing sensitivities.

Return type:

dict
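
A minimal sketch of computing sensitivities follows. eval_program, place, and the parameter names are assumed to come from an existing evaluation network, and the eval_func body is a placeholder.

from paddleslim.prune import sensitivity

def eval_func(program):
    # Placeholder evaluation: replace with a real test-set pass that
    # returns a scalar score such as top-1 accuracy.
    return 0.9

sensitivities = sensitivity(
    eval_program, place,
    param_names=['conv1_weights', 'conv2_weights'],
    eval_func=eval_func,
    sensitivities_file='sensitivities.data',
    pruned_ratios=[0.1, 0.2, 0.3])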

paddleslim.prune.flops_sensitivity(program, place, param_names, eval_func, sensitivities_file=None, pruned_flops_rate=0.1)
paddleslim.prune.load_sensitivities(sensitivities_file)

Load sensitivities from file.

Parameters:sensitivities_file (str) – The file storing sensitivities.
Returns:A dict storing sensitivities.
Return type:dict
paddleslim.prune.merge_sensitive(sensitivities)

Merge sensitivities.

Parameters:sensitivities (list<dict> | list<str>) – The sensitivities to be merged. It can be a list of sensitivity files or a list of dicts.
Returns:A dict storing sensitivities.
Return type:dict
paddleslim.prune.get_ratios_by_loss(sensitivities, loss)

Get the maximum pruning ratio of each parameter. The accuracy loss must be less than the given loss when a single parameter is pruned by its maximum ratio.

Parameters:
  • sensitivities (dict) – The sensitivities used to generate a group of pruning ratios. The key of the dict is the name of the parameter to be pruned. The value is a list of tuples in the format (pruned_ratio, accuracy_loss).
  • loss (float) – The threshold of accuracy loss.
Returns:

A group of ratios. The key of the dict is the parameter name and the value is the ratio to be pruned.

Return type:

dict
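
The sketch below combines load_sensitivities, merge_sensitive, and get_ratios_by_loss. The file names and the 1% loss threshold are illustrative assumptions.

from paddleslim.prune import (load_sensitivities, merge_sensitive,
                              get_ratios_by_loss)

# Merge sensitivities computed by two workers, then pick the largest
# ratio per parameter that keeps the expected accuracy loss under 1%.
merged = merge_sensitive([load_sensitivities('sensitivities_0.data'),
                          load_sensitivities('sensitivities_1.data')])
ratios = get_ratios_by_loss(merged, loss=0.01)
# ratios is a dict: {parameter_name: pruned_ratio}, ready to pass to
# Pruner.prune as params=list(ratios.keys()), ratios=list(ratios.values()).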

class paddleslim.prune.conv2d(op, pruned_params, visited={})

Bases: paddleslim.prune.prune_walker.PruneWorker

paddleslim.prune.save_model(exe, graph, dirname)

Save the weights of the model and its shape information to the filesystem.

Parameters:
  • exe (paddle.fluid.Executor) – The executor used to save model.
  • graph (Program|Graph) – The graph to be saved.
  • dirname (str) – The directory the model is saved into.
paddleslim.prune.load_model(exe, graph, dirname)

Load the weights of the model and its shape information from the filesystem.

Parameters:
  • exe (paddle.fluid.Executor) – The executor used to load the model.
  • graph (Program|Graph) – The graph to be updated with the loaded information.
  • dirname (str) – The directory the model is loaded from.
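
A short sketch of persisting and restoring a pruned program; exe, pruned_program, and the directory name are assumptions carried over from the Pruner sketch above.

from paddleslim.prune import save_model, load_model

# Save the pruned weights and their (pruned) shapes, then restore them.
save_model(exe, pruned_program, 'pruned_model')
load_model(exe, pruned_program, 'pruned_model')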

Submodules

paddleslim.prune.auto_pruner module

class paddleslim.prune.auto_pruner.AutoPruner(program, scope, place, params=[], init_ratios=None, pruned_flops=0.5, pruned_latency=None, server_addr=('', 0), init_temperature=100, reduce_rate=0.85, max_try_times=300, max_client_num=10, search_steps=300, max_ratios=[0.9], min_ratios=[0], key='auto_pruner', is_server=True)

Bases: object

Search for a group of ratios used to prune the program.

Parameters:
  • program (Program) – The program to be pruned.
  • scope (Scope) – The scope to be pruned.
  • place (fluid.Place) – The device place of parameters.
  • params (list<str>) – The names of parameters to be pruned.
  • init_ratios (list<float>|float) – Initial ratios used to prune the parameters in params. A list means one ratio for each parameter in params, and its length should equal the length of params. A scalar means all the parameters in params are pruned by the same ratio. None means a group of initial ratios is derived from pruned_flops or pruned_latency. Default: None.
  • pruned_flops (float) – The percentage of FLOPs to be pruned. Default: 0.5.
  • pruned_latency (float) – The percentage of latency to be pruned. Default: None.
  • server_addr (tuple) – A tuple of the server IP and port of the controller server.
  • init_temperature (float) – The initial temperature used in the simulated annealing search strategy.
  • reduce_rate (float) – The decay rate used in the simulated annealing search strategy.
  • max_try_times (int) – The maximum number of attempts to generate legal tokens.
  • max_client_num (int) – The maximum number of connections to the controller server.
  • search_steps (int) – The number of search steps.
  • max_ratios (float|list<float>) – Maximum ratios used to prune the parameters in params. A list means one maximum ratio for each parameter in params, and its length should equal the length of params. A scalar is used for all the parameters in params.
  • min_ratios (float|list<float>) – Minimum ratios used to prune the parameters in params. A list means one minimum ratio for each parameter in params, and its length should equal the length of params. A scalar is used for all the parameters in params.
  • key (str) – Identity used in communication between the controller server and clients.
  • is_server (bool) – Whether the current host is the controller server. Default: True.
prune(program, eval_program=None)

Prune the program with the latest tokens generated by the controller.

Parameters:program (fluid.Program) – The program to be pruned.
Returns:The pruned program.
Return type:paddle.fluid.Program
reward(score)

Report the reward of the current pruned program to the controller.

Parameters:score (float) – The score of the pruned program.

paddleslim.prune.prune_io module

paddleslim.prune.prune_io.save_model(exe, graph, dirname)

Save the weights of the model and its shape information to the filesystem.

Parameters:
  • exe (paddle.fluid.Executor) – The executor used to save model.
  • graph (Program|Graph) – The graph to be saved.
  • dirname (str) – The directory the model is saved into.
paddleslim.prune.prune_io.load_model(exe, graph, dirname)

Load the weights of the model and its shape information from the filesystem.

Parameters:
  • exe (paddle.fluid.Executor) – The executor used to load the model.
  • graph (Program|Graph) – The graph to be updated with the loaded information.
  • dirname (str) – The directory the model is loaded from.

paddleslim.prune.prune_walker module

class paddleslim.prune.prune_walker.conv2d(op, pruned_params, visited={})

Bases: paddleslim.prune.prune_walker.PruneWorker

paddleslim.prune.pruner module

class paddleslim.prune.pruner.Pruner(criterion='l1_norm')

The pruner used to prune the channels of convolutions.

Parameters:criterion (str) – The criterion used to sort channels for pruning. Only ‘l1_norm’ is currently supported.
prune(program, scope, params, ratios, place=None, lazy=False, only_graph=False, param_backup=False, param_shape_backup=False)

Prune the given parameters.

Parameters:
  • program (fluid.Program) – The program to be pruned.
  • scope (fluid.Scope) – The scope storing the parameters to be pruned.
  • params (list<str>) – A list of parameter names to be pruned.
  • ratios (list<float>) – A list of ratios used to prune the parameters, one per parameter in params.
  • place (fluid.Place) – The device place of filter parameters. Default: None.
  • lazy (bool) – True means setting the pruned elements to zero. False means removing the pruned elements. Default: False.
  • only_graph (bool) – True means only modifying the graph. False means modifying graph and variables in scope. Default: False.
  • param_backup (bool) – Whether to return a dict to backup the values of parameters. Default: False.
  • param_shape_backup (bool) – Whether to return a dict to backup the shapes of parameters. Default: False.
Returns:

(pruned_program, param_backup, param_shape_backup). pruned_program is the pruned program, param_backup is a dict backing up the values of parameters, and param_shape_backup is a dict backing up the shapes of parameters.

Return type:

tuple

paddleslim.prune.sensitive module

paddleslim.prune.sensitive.sensitivity(program, place, param_names, eval_func, sensitivities_file=None, pruned_ratios=None)

Compute the sensitivities of the convolutions in a model. The sensitivity of a convolution is the loss of accuracy on the test dataset under different pruning ratios. The sensitivities can be used to find a group of best ratios under a given condition. This function returns a dict storing sensitivities as below:

{"weight_0":
    {0.1: 0.22,
     0.2: 0.33
    },
  "weight_1":
    {0.1: 0.21,
     0.2: 0.4
    }
}

weight_0 is the parameter name of a convolution. sensitivities['weight_0'] is a dict whose keys are pruned ratios and whose values are the percentage of accuracy loss.

Parameters:
  • program (paddle.fluid.Program) – The program to be analyzed.
  • place (fluid.CPUPlace | fluid.CUDAPlace) – The device place of filter parameters.
  • param_names (list) – The parameter names of the convolutions to be analyzed.
  • eval_func (function) – The callback function used to evaluate the model. It should accept an instance of paddle.fluid.Program as its argument and return a score on the test dataset.
  • sensitivities_file (str) – The file used to save the sensitivities. The latest computed sensitivities are appended to this file, and sensitivities already stored in the file will not be computed again. The file can be loaded with the pickle library.
  • pruned_ratios (list) – The ratios to be pruned. Default: [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9].
Returns:

A dict storing sensitivities.

Return type:

dict

paddleslim.prune.sensitive.flops_sensitivity(program, place, param_names, eval_func, sensitivities_file=None, pruned_flops_rate=0.1)
paddleslim.prune.sensitive.load_sensitivities(sensitivities_file)

Load sensitivities from file.

Parameters:sensitivities_file (str) – The file storing sensitivities.
Returns:A dict storing sensitivities.
Return type:dict
paddleslim.prune.sensitive.merge_sensitive(sensitivities)

Merge sensitivities.

Parameters:sensitivities (list<dict> | list<str>) – The sensitivities to be merged. It can be a list of sensitivity files or a list of dicts.
Returns:A dict storing sensitivities.
Return type:dict
paddleslim.prune.sensitive.get_ratios_by_loss(sensitivities, loss)

Get the maximum pruning ratio of each parameter. The accuracy loss must be less than the given loss when a single parameter is pruned by its maximum ratio.

Parameters:
  • sensitivities (dict) – The sensitivities used to generate a group of pruning ratios. The key of the dict is the name of the parameter to be pruned. The value is a list of tuples in the format (pruned_ratio, accuracy_loss).
  • loss (float) – The threshold of accuracy loss.
Returns:

A group of ratios. The key of the dict is the parameter name and the value is the ratio to be pruned.

Return type:

dict

paddleslim.prune.sensitive_pruner module

class paddleslim.prune.sensitive_pruner.SensitivePruner(place, eval_func, scope=None, checkpoints=None)

Bases: object

A pruner that prunes parameters iteratively according to the sensitivities of the parameters in each step.

Parameters:
  • place (fluid.CUDAPlace | fluid.CPUPlace) – The device place where the program executes.
  • eval_func (function) – A callback function used to evaluate the pruned program. It takes the pruned program as its argument and returns a score for that program.
  • scope (fluid.Scope) – The scope used to execute the program.
  • checkpoints (str) – The directory used to save and load checkpoints. Default: None.
get_ratios_by_sensitive(sensitivities, pruned_flops, eval_program)

Search for a group of ratios to reach the target pruned FLOPs.

Parameters:
  • sensitivities (dict) – The sensitivities used to generate a group of pruning ratios. The key of the dict is the name of the parameter to be pruned. The value is a list of tuples in the format (pruned_ratio, accuracy_loss).
  • pruned_flops (float) – The percentage of FLOPs to be pruned.
  • eval_program (Program) – The program whose FLOPS is considered.
Returns:

A group of ratios. The key of the dict is the parameter name and the value is the ratio to be pruned.

Return type:

dict

greedy_prune(train_program, eval_program, params, pruned_flops_rate, topk=1)
prune(train_program, eval_program, params, pruned_flops)

Prune the parameters of the training and evaluation networks according to their sensitivities in the current step.

Parameters:
  • train_program (fluid.Program) – The training program to be pruned.
  • eval_program (fluid.Program) – The evaluation program to be pruned. It is also used to calculate the sensitivities of parameters.
  • params (list<str>) – The parameters to be pruned.
  • pruned_flops (float) – The ratio of FLOPs to be pruned in the current step.
Returns:

A tuple of the pruned training program and the pruned evaluation program.

Return type:

tuple

restore(checkpoints=None)
save_checkpoint(train_program, eval_program)