
# Unstructured Sparsity: Dynamic-Graph Pruning (Threshold and Ratio Modes)

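## Introduction

Model compression commonly uses two kinds of sparsity: structured and unstructured. Structured sparsity prunes along a specific dimension (feature channels, convolution kernels, and so on). Unstructured sparsity prunes individual parameters, so its speedup depends more heavily on how well the hardware accelerates sparse matrix operations. This directory contains an unstructured sparsity algorithm built on PaddlePaddle and PaddleSlim. In the MobileNetV1 experiment on ImageNet, a pruning ratio of 55.19% is reached with essentially no loss in accuracy.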
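To make the distinction concrete, here is a minimal sketch in plain Python. It is not PaddleSlim code: nested lists stand in for a weight matrix, rows stand in for output channels, and the function names `unstructured_prune` and `structured_prune` are made up for illustration.

```python
def unstructured_prune(weights, ratio):
    """Zero the `ratio` fraction of individual entries with the smallest magnitude."""
    magnitudes = sorted(abs(w) for row in weights for w in row)
    k = int(len(magnitudes) * ratio)
    if k == 0:
        return [list(row) for row in weights]
    threshold = magnitudes[k - 1]  # the k-th smallest magnitude is the cut-off
    return [[0.0 if abs(w) <= threshold else w for w in row] for row in weights]


def structured_prune(weights, ratio):
    """Zero whole rows (stand-ins for output channels) with the smallest L1 norm."""
    num_pruned = int(len(weights) * ratio)
    by_norm = sorted(range(len(weights)),
                     key=lambda i: sum(abs(w) for w in weights[i]))
    pruned = set(by_norm[:num_pruned])
    return [[0.0] * len(row) if i in pruned else list(row)
            for i, row in enumerate(weights)]


weights = [[0.9, -0.1, 0.4, 0.05],
           [0.2, -0.8, 0.03, 0.6],
           [0.01, 0.3, -0.7, 0.15],
           [-0.5, 0.02, 0.25, -0.35]]

sparse_unstructured = unstructured_prune(weights, 0.5)  # zeros scattered over all rows
sparse_structured = structured_prune(weights, 0.5)      # two entire rows zeroed
```

Both results contain the same number of zeros, but only the structured result can be shrunk to a smaller dense matrix. The unstructured result keeps its shape and needs sparse kernels to run faster.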

## Requirements
```bash
python3.5+
paddlepaddle>=2.0.0
paddleslim>=2.1.0
```

Please follow the GitHub instructions to install [paddlepaddle](https://github.com/PaddlePaddle/Paddle) and [paddleslim](https://github.com/PaddlePaddle/PaddleSlim).

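## Usage

Before training:
- After downloading the training data, adapt `../imagenet_reader.py` to load it, and call the reader from `train.py`/`evaluate.py`.
- Developers can define their own unstructured sparsity strategy by overriding `UnstructuredPruner.mask_parameters()` and `UnstructuredPruner.update_threshold()` in `paddleslim.dygraph.prune.unstructured_pruner.py` (the current strategy prunes the parameters with the smallest absolute values).
- Developers can pass a custom `skip_params_func` when initializing `UnstructuredPruner` to define which parameters are excluded from pruning. A sample `skip_params_func` is shown below (path: `paddleslim.dygraph.prune.unstructured_pruner._get_skip_params()`). By default, the parameters of all normalization layers are excluded from pruning.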

```python
import paddle


def _get_skip_params(model):
    """
    This function is used to check whether the given model's layers are valid to be pruned.
    Usually, the convolutions are to be pruned while we skip the normalization-related parameters.
    Developers could replace this function by passing their own when initializing the UnstructuredPruner instance.

    Args:
      - model(paddle.nn.Layer): the current model waiting to be checked.
    Return:
      - skip_params(set<String>): a set of names (each skipped sub-layer's full_name())
    """
    skip_params = set()
    for _, sub_layer in model.named_sublayers():
        if type(sub_layer).__name__.split('.')[-1] in paddle.nn.norm.__all__:
            skip_params.add(sub_layer.full_name())
    return skip_params
```
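As a sketch of what a custom function might look like, the example below also skips layers by name in addition to normalization layers. The function name `skip_norm_and_named` and its `extra_layer_names` argument are hypothetical. It assumes, following the sample above, that the pruner expects a set of `full_name()` strings and calls the function with the model as its only argument.

```python
def skip_norm_and_named(model, extra_layer_names=("fc",)):
    """Hypothetical skip_params_func: skip normalization layers plus any
    sub-layer whose attribute name contains one of `extra_layer_names`."""
    skip_params = set()
    for name, sub_layer in model.named_sublayers():
        is_norm = "Norm" in type(sub_layer).__name__  # e.g. BatchNorm2D, LayerNorm
        is_extra = any(key in name for key in extra_layer_names)
        if is_norm or is_extra:
            skip_params.add(sub_layer.full_name())
    return skip_params


# Assumed usage, based on the description above:
# pruner = UnstructuredPruner(model, mode='ratio', ratio=0.5,
#                             skip_params_func=skip_norm_and_named)
```

Matching on the class name keeps the sketch independent of any particular Paddle module layout. Adjust the check if your model uses custom normalization layers with different names.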

Training:
```bash
python3 train.py --data cifar10 --lr 0.1 --pruning_mode ratio --ratio=0.5
```

Inference:
```bash
python3 evaluate.py --pruned_model models/ --data cifar10
```

Example training code with pruning:
```python
model = mobilenet_v1(num_classes=class_dim, pretrained=True)
# STEP1: initialize the pruner
pruner = UnstructuredPruner(model, mode='ratio', ratio=0.5)

for epoch in range(epochs):
    for batch_id, data in enumerate(train_loader):
        loss = calculate_loss()
        loss.backward()
        opt.step()
        opt.clear_grad()
        # STEP2: update the pruner's threshold given the updated parameters
        pruner.step()

    if epoch % args.test_period == 0:
        # STEP3: before evaluating during training, re-apply the cached masks to zero out the non-zero values that opt.step() wrote into pruned positions.
        pruner.update_params()
        eval(epoch)

    if epoch % args.model_period == 0:
        # STEP4: same purpose as STEP3
        pruner.update_params()
        paddle.save(model.state_dict(), "model-pruned.pdparams")
        paddle.save(opt.state_dict(), "opt-pruned.pdopt")
```
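The interplay between `opt.step()`, `pruner.step()` and `pruner.update_params()` can be illustrated with a toy model. This is a simplified sketch, not the PaddleSlim implementation: `ToyRatioPruner` is a made-up class, a flat list of floats stands in for the model parameters, and the threshold rule (the k-th smallest magnitude) is an assumption about how ratio mode might pick its cut-off.

```python
class ToyRatioPruner:
    """Illustrative stand-in for a ratio-mode magnitude pruner (not the PaddleSlim class)."""

    def __init__(self, params, ratio):
        self.params = params  # flat list of floats, mutated in place
        self.ratio = ratio
        self.threshold = 0.0
        self.masks = [1.0] * len(params)
        self.step()

    def step(self):
        """Recompute the threshold and masks from the current parameter values."""
        magnitudes = sorted(abs(p) for p in self.params)
        k = int(len(magnitudes) * self.ratio)
        self.threshold = magnitudes[k - 1] if k > 0 else -1.0
        self.masks = [0.0 if abs(p) <= self.threshold else 1.0 for p in self.params]

    def update_params(self):
        """Apply the cached masks so that pruned entries are exactly zero."""
        for i, mask in enumerate(self.masks):
            self.params[i] *= mask


params = [0.9, -0.05, 0.4, 0.02]
pruner = ToyRatioPruner(params, ratio=0.5)
pruner.update_params()  # the two smallest-magnitude entries become zero

# Simulate opt.step(): a dense update touches every entry, pruned ones included.
for i, delta in enumerate([-0.01, 0.01, -0.01, 0.01]):
    params[i] += delta
pruner.step()  # refresh the threshold and masks from the updated values
nonzero_before = sum(p != 0.0 for p in params)

pruner.update_params()  # re-apply the masks before evaluating or saving
nonzero_after = sum(p != 0.0 for p in params)
```

In this toy run the optimizer update makes all four entries non-zero again, and `update_params()` restores the 50% sparsity. That is why the loop above calls it before both evaluation and checkpointing.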

Example test code after pruning:
```python
model = mobilenet_v1(num_classes=class_dim, pretrained=True)
model.set_state_dict(paddle.load("model-pruned.pdparams"))
print(UnstructuredPruner.total_sparse(model))  # Note: total_sparse is a static method, so it can be called without creating an instance, which is convenient for test-only scripts.
test()
```
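For intuition about the number printed above, the sketch below computes sparsity as the fraction of entries that are exactly zero. This definition is an assumption: the README does not spell out how `total_sparse` counts parameters. The helper name `total_sparsity` and the list-of-lists input are illustrative only.

```python
def total_sparsity(param_tensors):
    """Fraction of entries that are exactly zero across all given parameter lists."""
    total = sum(len(p) for p in param_tensors)
    zeros = sum(1 for p in param_tensors for v in p if v == 0.0)
    return zeros / total if total else 0.0


# Two flattened "layers" with 2 and 3 zero entries out of 4 each.
layers = [[0.0, 0.5, 0.0, -0.3],
          [0.0, 0.0, 0.2, 0.0]]
sparsity = total_sparsity(layers)  # 5 zeros out of 8 entries
```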

For more options, see the shell scripts or run the following commands:
```bash
python train.py --help
python evaluate.py --help
```

## Experimental Results (validation on the dynamic-graph code has just started; the results below come from the static-graph code)

| Model | Dataset | Compression method | Compression ratio | Top-1/Top-5 Acc | lr | threshold | epoch |
|:--:|:---:|:--:|:--:|:--:|:--:|:--:|:--:|
| MobileNetV1 | ImageNet | Baseline | - | 70.99%/89.68% | - | - | - |
| MobileNetV1 | ImageNet | ratio | -55.19% | 70.87%/89.80% (-0.12%/+0.12%) | 0.005 | - | 68 |
| YOLO v3 | VOC | - | - | 76.24% | - | - | - |
| YOLO v3 | VOC | threshold | -41.35% | 75.29% (-0.95%) | 0.005 | 0.05 | 100k |
| YOLO v3 | VOC | threshold | -53.00% | 75.00% (-1.24%) | 0.005 | 0.075 | 100k |

## TODO

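- [ ] Finish the experiments, validate the results on the dynamic graph, and obtain the compressed models.
- [ ] Add more ways of measuring parameter importance (currently only the absolute value is used).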