Training LeNet with MNIST dataset in MindSpore with quantization aware training.
This is a simple, basic tutorial for constructing a network in MindSpore with quantization aware training.
# [LeNet Description](#contents)
LeNet, proposed in 1998, is a typical convolutional neural network. It was used for digit recognition and achieved great success.

In this tutorial, you will:
1. Train a MindSpore fusion model for MNIST from scratch using `nn.Conv2dBnAct` and `nn.DenseBnAct` (a sketch of such a fusion network appears at the end of this section).
2. Fine-tune the fusion model by applying the quantization aware training network converter API `convert_quant_network`, and export a quantization aware model checkpoint file after the network converges.
3. Use the quantization aware model to create an actually quantized model for the Ascend inference backend.
4. See the persistence of accuracy in the inference backend and a 4x smaller model. To see the latency benefits, try out the Ascend inference backend examples.
[Paper](https://ieeexplore.ieee.org/document/726791): Y. Lecun, L. Bottou, Y. Bengio, P. Haffner. Gradient-Based Learning Applied to Document Recognition. *Proceedings of the IEEE*. 1998.
This is the quantized network of LeNet.
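As a concrete picture of what step 1 means, here is a minimal sketch of a fusion LeNet-5 built from `nn.Conv2dBnAct` and `nn.DenseBnAct`. It only illustrates the idea behind `LeNet5Fusion`; the exact layer settings in the repository's model definition may differ, and 32x32 grayscale MNIST inputs are assumed.

```python
import mindspore.nn as nn

class LeNet5FusionSketch(nn.Cell):
    """Illustrative fusion LeNet-5: conv/dense layers wrapped so that
    fake-quant operators can later be inserted by convert_quant_network."""
    def __init__(self, num_classes=10):
        super(LeNet5FusionSketch, self).__init__()
        self.conv1 = nn.Conv2dBnAct(1, 6, kernel_size=5, pad_mode='valid', activation='relu')
        self.conv2 = nn.Conv2dBnAct(6, 16, kernel_size=5, pad_mode='valid', activation='relu')
        self.fc1 = nn.DenseBnAct(16 * 5 * 5, 120, activation='relu')
        self.fc2 = nn.DenseBnAct(120, 84, activation='relu')
        self.fc3 = nn.DenseBnAct(84, num_classes)
        self.max_pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.flatten = nn.Flatten()

    def construct(self, x):
        x = self.max_pool(self.conv1(x))   # 32x32 -> 28x28 -> 14x14
        x = self.max_pool(self.conv2(x))   # 14x14 -> 10x10 -> 5x5
        x = self.flatten(x)
        x = self.fc1(x)
        x = self.fc2(x)
        return self.fc3(x)
```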
# [Model Architecture](#contents)

LeNet is very simple, containing 5 layers: 2 convolutional layers and 3 fully connected layers.

You will apply quantization aware training to the whole model, and "fake quant" operators are inserted throughout the model. All layers are then prepared with "fake quant" operators.

Note that the resulting model is quantization aware but not quantized (e.g. the weights are float32 instead of int8).

## Train fusion model

### Install

Install MindSpore for the Ascend or GPU device by following the instructions on the [MindSpore](https://www.mindspore.cn/install/en) installation page.

After installing MindSpore via the official website, you can start training and evaluation as follows:
```python
# define fusion network
network = LeNet5Fusion(cfg.num_classes)
# enter the ../lenet directory and train the LeNet network, then a '.ckpt' file will be generated
```

The model checkpoint will be saved in the current directory.
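For orientation, step 1 (training the fusion model from scratch) follows the standard MindSpore training loop. The sketch below is hedged: `create_dataset` is a placeholder for the repository's MNIST dataset helper, and the hyperparameters and checkpoint prefix are illustrative assumptions.

```python
import mindspore.nn as nn
from mindspore.train import Model
from mindspore.train.callback import ModelCheckpoint, LossMonitor

# fusion LeNet-5 with 10 MNIST classes
network = LeNet5Fusion(10)
loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
opt = nn.Momentum(network.trainable_params(), learning_rate=0.01, momentum=0.9)

model = Model(network, loss_fn=loss, optimizer=opt, metrics={'acc'})
train_ds = create_dataset("Data/train", batch_size=32)  # placeholder dataset helper
model.train(10, train_ds,
            callbacks=[ModelCheckpoint(prefix="checkpoint_lenet"), LossMonitor()])
```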
## [Evaluation Process](#contents)

### Evaluation

The evaluation procedure for the quantization aware model differs from the normal one: because the checkpoint was created by the quantization aware model, the fusion model checkpoint must be loaded before the fusion model is converted into the quantization aware model.

```python
# define fusion network
network = LeNet5Fusion(cfg.num_classes)
# load quantization aware network checkpoint
param_dict = load_checkpoint(args.ckpt_path)
load_param_into_net(network, param_dict)
# convert fusion network to quantization aware network
network = quant.convert_quant_network(network)
```

Before running the command below, please check the checkpoint path used for evaluation. You can also simply run this command instead:

```bash
python eval.py --data_path Data --ckpt_path ckpt/checkpoint_lenet-1_937.ckpt > log.txt 2>&1 &
```
You can view the results in the file "log.txt". The accuracy on the test dataset is reported there.
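To obtain the actually quantized model for the Ascend inference backend (step 3), the converted network is typically exported with MindSpore's `export` interface. This is a hedged sketch: the 1x1x32x32 input shape, the output file name, and the AIR format choice are assumptions, not the repository's exact export script.

```python
import numpy as np
import mindspore as ms
from mindspore.train.quant import quant
from mindspore.train.serialization import export, load_checkpoint, load_param_into_net

# rebuild and convert the network exactly as in the evaluation step above
network = LeNet5Fusion(10)
load_param_into_net(network, load_checkpoint("ckpt/checkpoint_lenet-1_937.ckpt"))
network = quant.convert_quant_network(network)

# export with a dummy MNIST-shaped input (assumed 1 x 1 x 32 x 32)
dummy_input = ms.Tensor(np.ones([1, 1, 32, 32]), ms.float32)
export(network, dummy_input, file_name="lenet_quant", file_format="AIR")
```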
# Contents

- [Script and Sample Code](#script-and-sample-code)
- [Training Process](#training-process)
- [Evaluation Process](#evaluation-process)
    - [Evaluation](#evaluation)
- [Model Description](#model-description)
    - [Performance](#performance)
        - [Training Performance](#evaluation-performance)
        - [Inference Performance](#evaluation-performance)
- [Description of Random Situation](#description-of-random-situation)
- [ModelZoo Homepage](#modelzoo-homepage)

# [MobileNetV2 Description](#contents)

Training MobileNetV2 with the ImageNet dataset in MindSpore with quantization aware training.
This is a simple, basic tutorial for constructing a network in MindSpore with quantization aware training.

MobileNetV2 is a significant improvement over MobileNetV1 and pushes the state of the art for mobile visual recognition, including classification, object detection and semantic segmentation. MobileNetV2 builds upon the ideas of MobileNetV1, using depthwise separable convolutions as efficient building blocks. However, V2 introduces two new features to the architecture: 1) linear bottlenecks between the layers, and 2) shortcut connections between the bottlenecks.
In this readme tutorial, you will:
1. Train a MindSpore fusion MobileNetV2 model for ImageNet from scratch using `nn.Conv2dBnAct` and `nn.DenseBnAct`.
2. Fine-tune the fusion model by applying the quantization aware training network converter API `convert_quant_network`, and export a quantization aware model checkpoint file after the network converges (a conversion sketch appears at the end of this section).
This is the quantized network of MobileNetV2.
[Paper](https://arxiv.org/pdf/1801.04381): Sandler, Mark, et al. "MobileNetV2: Inverted Residuals and Linear Bottlenecks." *Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition*. 2018.
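The fine-tuning step from item 2 above can be sketched as follows. This is a hedged outline, not the exact contents of `train.py`: the `mobilenetV2` constructor and the checkpoint name are placeholders standing in for the definitions in `src/mobilenetV2.py` and an already-trained fusion checkpoint.

```python
from mindspore.train.quant import quant
from mindspore.train.serialization import load_checkpoint, load_param_into_net

# placeholder: fusion MobileNetV2 built from Conv2dBnAct / DenseBnAct blocks
network = mobilenetV2(num_classes=1000)

# load the checkpoint of the converged fusion model (placeholder file name)
param_dict = load_checkpoint("mobilenet_199.ckpt")
load_param_into_net(network, param_dict)

# insert fake-quant operators: the network is now quantization aware
network = quant.convert_quant_network(network)
# ...continue with a few epochs of fine-tuning, then save the new checkpoint
```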
# [Model architecture](#contents)

The overall network architecture of MobileNetV2 is shown below:

[Link](https://arxiv.org/pdf/1801.04381)

# [Dataset](#contents)

Dataset used: ImageNet
- Dataset size: about 125G
    - Train: 120G, 1,281,167 images in 1000 directories (one per class)
    - Test: 5G, 50,000 images; the images should be classified into 1000 directories first, just like the training images
- Data format: RGB images.
- Note: Data will be processed in src/dataset.py
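As noted above, the data pipeline lives in `src/dataset.py`. A minimal, hedged sketch of what such an ImageNet pipeline typically looks like, assuming a MindSpore 1.x-style dataset API (operator choices, crop sizes and the batch size are assumptions, not the exact repository settings; older versions name the loader `ImageFolderDatasetV2`):

```python
import mindspore.dataset as ds
import mindspore.dataset.vision.c_transforms as C
import mindspore.dataset.transforms.c_transforms as C2
import mindspore.common.dtype as mstype

def create_imagenet_dataset(dataset_path, do_train=True, batch_size=32):
    """Create an ImageNet dataset from a directory of 1000 class folders."""
    data = ds.ImageFolderDataset(dataset_path, shuffle=do_train)
    if do_train:
        trans = [C.RandomCropDecodeResize(224), C.RandomHorizontalFlip()]
    else:
        trans = [C.Decode(), C.Resize(256), C.CenterCrop(224)]
    trans += [
        C.Normalize(mean=[0.485 * 255, 0.456 * 255, 0.406 * 255],
                    std=[0.229 * 255, 0.224 * 255, 0.225 * 255]),
        C.HWC2CHW(),
    ]
    data = data.map(operations=trans, input_columns="image")
    data = data.map(operations=C2.TypeCast(mstype.int32), input_columns="label")
    return data.batch(batch_size, drop_remainder=do_train)
```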
# [Features](#contents)
## [Mixed Precision(Ascend)](#contents)
The [mixed precision](https://www.mindspore.cn/tutorial/zh-CN/master/advanced_use/mixed_precision.html) training method accelerates the deep learning neural network training process by using both the single-precision and half-precision data formats, and maintains the network precision achieved by the single-precision training at the same time. Mixed precision training can accelerate the computation process, reduce memory usage, and enable a larger model or batch size to be trained on specific hardware.
For FP16 operators, if the input data type is FP32, the backend of MindSpore will automatically handle it with reduced precision. Users could check the reduced-precision operators by enabling INFO log and then searching ‘reduce precision’.
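For reference, mixed precision training in MindSpore is usually switched on through the `amp_level` argument of `Model` rather than by rewriting the network. A hedged sketch (the loss function, optimizer and the "O3" level are illustrative choices, and `network` stands for a model defined elsewhere):

```python
import mindspore.nn as nn
from mindspore.train import Model

# `network` is the fusion or quantization aware network defined elsewhere
loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
opt = nn.Momentum(network.trainable_params(), learning_rate=0.01, momentum=0.9)

# amp_level "O2"/"O3" run most computation in float16 while keeping numerically
# sensitive parts in float32; FP32 inputs to FP16 operators are handled automatically
model = Model(network, loss_fn=loss, optimizer=opt, metrics={'acc'}, amp_level="O3")
```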
# [Environment Requirements](#contents)
- Hardware: Ascend
    - Prepare hardware environment with Ascend processor. If you want to try Ascend, please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources.
## [Script and Sample Code](#contents)

```
├── Readme.md              # descriptions about MobileNetV2-Quant
├── scripts
│   ├── run_train_quant.sh # shell script for training on Ascend
│   ├── run_infer_quant.sh # shell script for evaluation on Ascend
├── src
│   ├── config.py          # parameter configuration
│   ├── dataset.py         # creating dataset
│   ├── launch.py          # start python script
│   ├── lr_generator.py    # learning rate config
│   ├── mobilenetV2.py     # MobileNetV2 architecture
│   ├── utils.py           # supply the monitor module
├── train.py               # training script
├── eval.py                # evaluation script
├── export.py              # export checkpoint files into air/onnx
```
## [Training process](#contents)

### Fine-tune for quantization aware training

Fine-tune the fusion model by applying the quantization aware training network converter API `convert_quant_network`, and export a quantization aware model checkpoint file after the network converges.

### Usage

You can start training using python or shell scripts. The usage of the shell scripts is as follows:

- Ascend: sh run_train_quant.sh Ascend [DEVICE_NUM] [VISIABLE_DEVICES(0,1,2,3,4,5,6,7)] [RANK_TABLE_FILE] [DATASET_PATH] [CKPT_PATH]

You can also just run the command in the launch example below instead.
### Launch
```bash
# training example
>>> sh run_train_quant.sh Ascend 4 192.168.0.1 0,1,2,3 ~/imagenet/train/ ~/mobilenet.ckpt
>>> sh run_train_quant.sh Ascend 8 10.222.223.224 0,1,2,3,4,5,6,7 ~/imagenet/train/ mobilenet_199.ckpt
```
### Result
Training results will be stored in the example path. Checkpoints will be stored at `./checkpoint` by default, and the training log will be redirected to `./train/train.log`.
## Running the example
The ResNet50-Quant example is organized as follows:

```
├── scripts
│   ├── run_train.sh       # shell script for training on Ascend
│   ├── run_infer.sh       # shell script for evaluation on Ascend
├── model
│   ├── resnet_quant.py    # define the network model of resnet50-quant
├── src
│   ├── config.py          # parameter configuration
│   ├── dataset.py         # creating dataset
│   ├── launch.py          # start python script
│   ├── lr_generator.py    # learning rate config
│   ├── crossentropy.py    # define the cross entropy of resnet50-quant
├── train.py               # training script
├── eval.py                # evaluation script
```
## [Training process](#contents)
### Usage
You can start training using python or shell scripts. The usage of the shell scripts is as follows:

- Ascend: sh run_train.sh Ascend [DEVICE_NUM] [SERVER_IP(x.x.x.x)] [VISIABLE_DEVICES(0,1,2,3,4,5,6,7)] [DATASET_PATH] [CKPT_PATH]
### Launch
```
# training example
shell:
    Ascend: sh run_train.sh Ascend 8 192.168.0.1 0,1,2,3,4,5,6,7 ~/imagenet/train/
    Ascend: sh run_train.sh Ascend 8 10.222.223.224 0,1,2,3,4,5,6,7 ~/resnet/train/ Resnet50-90_5004.ckpt
```
### Result
Training results will be stored in the example path. Checkpoints will be stored at `./checkpoint` by default, and the training log will be redirected to `./train/train.log`, with content like the following:
```
epoch: 1 step: 5004, loss is 4.8995576
epoch: 2 step: 5004, loss is 3.9235563
epoch: 3 step: 5004, loss is 3.833077
...
...
epoch: 4 step: 5004, loss is 3.2795618
epoch: 5 step: 5004, loss is 3.1978393
```
## [Eval process](#contents)
### Usage
You can start evaluation using python or shell scripts. The usage of the shell scripts is as follows:
- Ascend: sh run_infer.sh Ascend [DATASET_PATH] [CHECKPOINT_PATH]
### Launch
```
# infer example
shell:
    Ascend: sh run_infer.sh Ascend ~/imagenet/val/ ~/checkpoint/resnet50-110_5004.ckpt
    Ascend: sh run_infer.sh Ascend ~/imagenet/val/ ~/train/Resnet50-30_5004.ckpt
```
> The checkpoint can be produced during the training process.
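If you prefer to run the evaluation from Python rather than through the shell script, the flow typically looks like the hedged sketch below. The `resnet50_quant` constructor and the `create_imagenet_dataset` helper are placeholders for the definitions in `model/resnet_quant.py` and `src/dataset.py`, and the metric names are common defaults rather than the script's exact configuration.

```python
import mindspore.nn as nn
from mindspore.train import Model
from mindspore.train.serialization import load_checkpoint, load_param_into_net

# placeholder constructor for the quantized ResNet50 from model/resnet_quant.py
network = resnet50_quant(class_num=1000)
load_param_into_net(network, load_checkpoint("./train/Resnet50-30_5004.ckpt"))
network.set_train(False)

# placeholder helper for the ImageNet validation pipeline in src/dataset.py
dataset = create_imagenet_dataset("~/imagenet/val/", do_train=False)
loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
model = Model(network, loss_fn=loss, metrics={'top_1_accuracy', 'top_5_accuracy'})
print(model.eval(dataset))
```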
### Result
The inference result will be stored in the example path; you can find results like the following in `./eval/infer.log`.