VGG, a very deep convolutional network for large-scale image recognition, was proposed in 2014 and won 1st place in the object localization task and 2nd place in the image classification task of the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).
- Download the CIFAR-10 binary version dataset.
[Paper](https://arxiv.org/abs/1409.1556): Simonyan K, Zisserman A. Very Deep Convolutional Networks for Large-Scale Image Recognition[J]. arXiv preprint arXiv:1409.1556, 2014.
# [Model Architecture](#contents)
The VGG 16 network mainly consists of several basic modules (each built from convolution and pooling layers) followed by three consecutive Dense layers.
The basic modules mainly use two basic operations: **3×3 conv** and **2×2 max pooling**.
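As a rough illustration of this structure, the sketch below walks through the VGG-16 layout in plain Python and counts feature-map sizes and parameters. The channel widths follow configuration D of the paper; the 224×224 input, the 4096-unit dense layers, and the helper name `describe_vgg16` are assumptions made for this example and may differ from what this repository builds (for instance for CIFAR-10).

```python
# VGG-16 layout (configuration D in the paper): an integer is the number of
# output channels of a 3x3 convolution, "M" is a 2x2 max pooling layer.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]


def describe_vgg16(image_size=224, in_channels=3, num_classes=1000):
    """Return (final feature-map size, final channels, parameter count).

    Hypothetical helper for illustration only; it is not part of this repo.
    """
    size, channels, params = image_size, in_channels, 0
    for item in VGG16_CFG:
        if item == "M":
            size //= 2  # 2x2 max pooling with stride 2 halves height and width
        else:
            # 3x3 conv with padding 1 keeps the spatial size: weights + biases
            params += 3 * 3 * channels * item + item
            channels = item
    # Three consecutive dense layers: flatten -> 4096 -> 4096 -> num_classes
    fan_in = size * size * channels
    for fan_out in (4096, 4096, num_classes):
        params += fan_in * fan_out + fan_out
        fan_in = fan_out
    return size, channels, params


if __name__ == "__main__":
    print(describe_vgg16())
```

Under these assumptions, five pooling steps reduce a 224×224 input to a 7×7 feature map with 512 channels, and most of the parameters sit in the first dense layer.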
- Dataset size: ~146G, 1.28 million color images in 1000 classes
- Train: 140G, 1,281,167 images
- Test: 6.4G, 50,000 images
- Data format: RGB images
- Note: Data will be processed in src/dataset.py
#### Dataset organization
CIFAR-10
> Unzip the CIFAR-10 dataset to any path you want and the folder structure should be as follows:
> ```
> .
> ├── cifar-10-batches-bin # train dataset
> └── cifar-10-verify-bin # infer dataset
> ```
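For orientation, the CIFAR-10 binary version stores each sample as one label byte followed by 3072 pixel bytes (32×32 red, then green, then blue values, each plane in row-major order). The sketch below parses such records from raw bytes; the function names are hypothetical, and the repository's own loading logic lives in src/dataset.py and may work differently.

```python
IMAGE_SIDE = 32
PLANE_BYTES = IMAGE_SIDE * IMAGE_SIDE   # 1024 bytes per color plane
RECORD_BYTES = 1 + 3 * PLANE_BYTES      # 1 label byte + 3072 pixel bytes


def parse_cifar10_records(raw):
    """Split raw bytes into (label, pixels) tuples.

    Hypothetical helper for illustration; `raw` is the content of one
    binary batch file.
    """
    if len(raw) % RECORD_BYTES != 0:
        raise ValueError("input length is not a multiple of the record size")
    records = []
    for offset in range(0, len(raw), RECORD_BYTES):
        label = raw[offset]                               # class index 0..9
        pixels = raw[offset + 1:offset + RECORD_BYTES]    # 3072 channel-major bytes
        records.append((label, pixels))
    return records


def pixel_at(pixels, channel, row, col):
    """Read one value; channel 0/1/2 is red/green/blue."""
    return pixels[channel * PLANE_BYTES + row * IMAGE_SIDE + col]
```

A batch file from `cifar-10-batches-bin` could be read with `open(path, "rb").read()` and passed to `parse_cifar10_records`.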
ImageNet2012
> Unzip the ImageNet2012 dataset to any path you want and the folder should include train and eval dataset as follows:
>
> ```
> .
> └─dataset
>   ├─ilsvrc                # train dataset
>   └─validation_preprocess # evaluate dataset
> ```
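Before starting a long run, it can help to confirm that the unzipped folder matches this layout. The following is a minimal sketch of such a check; the function name is hypothetical and the repository itself does not necessarily perform this validation.

```python
from pathlib import Path

# Sub-directories expected under the `dataset` folder, per the tree above.
EXPECTED_SUBDIRS = ("ilsvrc", "validation_preprocess")


def missing_imagenet_dirs(dataset_root):
    """Return the expected sub-directory names that are absent.

    Hypothetical helper for illustration; an empty list means the
    layout looks complete.
    """
    root = Path(dataset_root)
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]
```

For example, `missing_imagenet_dirs("/path/to/dataset")` returns an empty list when both the train and evaluation folders are present.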
# [Features](#contents)
## Mixed Precision
The [mixed precision](https://www.mindspore.cn/tutorial/zh-CN/master/advanced_use/mixed_precision.html) training method accelerates the deep learning neural network training process by using both the single-precision and half-precision data formats, and maintains the network precision achieved by the single-precision training at the same time. Mixed precision training can accelerate the computation process, reduce memory usage, and enable a larger model or batch size to be trained on specific hardware.
For FP16 operators, if the input data type is FP32, the backend of MindSpore will automatically handle it with reduced precision. Users could check the reduced-precision operators by enabling INFO log and then searching ‘reduce precision’.
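To make the trade-off between the two formats concrete, the sketch below uses only the Python standard library to round a value to single and half precision and to compare storage sizes. It illustrates the number formats only and does not use or describe MindSpore's mixed precision API; the helper names are made up for this example.

```python
import struct


def to_half(value):
    """Round a Python float to the nearest IEEE 754 half-precision (FP16) value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def to_single(value):
    """Round a Python float to the nearest IEEE 754 single-precision (FP32) value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


BYTES_PER_HALF = struct.calcsize("<e")     # storage per FP16 element
BYTES_PER_SINGLE = struct.calcsize("<f")   # storage per FP32 element

if __name__ == "__main__":
    x = 0.1
    print("FP32 rounding error:", abs(to_single(x) - x))
    print("FP16 rounding error:", abs(to_half(x) - x))
    print("bytes per element:", BYTES_PER_SINGLE, "vs", BYTES_PER_HALF)
```

Half precision halves the storage per element at the cost of a larger rounding error, which is why precision-sensitive parts of training are typically kept in single precision.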
# [Environment Requirements](#contents)
- Hardware (Ascend/GPU)
- Prepare the hardware environment with an Ascend or GPU processor. If you want to try Ascend, please send the [application form](https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/file/other/Ascend%20Model%20Zoo%E4%BD%93%E9%AA%8C%E8%B5%84%E6%BA%90%E7%94%B3%E8%AF%B7%E8%A1%A8.docx) to ascend@huawei.com. Once approved, you can get the resources.
- Framework
- [MindSpore](https://www.mindspore.cn/install/en)
- For more information, please check the resources below:
The python command above runs in the background; you can view the results in the file `out.train.log`.
### Distributed Training
After training, you'll get some checkpoint files under the specified ckpt_path, which defaults to the ./output directory.
You will get loss values like the following:
```
# grep "loss is " output.train.log
epoch: 1 step: 781, loss is 2.093086
epoch: 2 step: 781, loss is 1.827582
...
```
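If you want to track the loss programmatically rather than with `grep`, lines in the format shown above can be parsed with a regular expression. The sketch below is one possible approach; the function name is hypothetical, and the pattern assumes the log keeps the `<label>: N step: M, loss is X` shape shown in the sample.

```python
import re

# Matches "<label>: 1 step: 781, loss is 2.093086". The label before the
# first number is not matched, so the pattern does not depend on its spelling.
LOSS_LINE = re.compile(r"(\d+)\s+step:\s*(\d+),\s*loss is\s*([-+0-9.eE]+)")


def parse_loss_lines(lines):
    """Return a list of (epoch, step, loss) tuples from log lines.

    Hypothetical helper for illustration; lines without a loss entry
    are skipped.
    """
    results = []
    for line in lines:
        match = LOSS_LINE.search(line)
        if match:
            epoch, step, loss = match.groups()
            results.append((int(epoch), int(step), float(loss)))
    return results
```

Passing an open log file object to `parse_loss_lines` works as well, since iterating over a file yields its lines.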
- Distributed Training
```
sh run_distribute_train.sh rank_table.json your_data_path
```
```
...
train_parallel1/log:epoch: 2 step: 97, loss is 1.7133579
...
```
> About rank_table.json, you can refer to the [distributed training tutorial](https://www.mindspore.cn/tutorial/en/master/advanced_use/distributed_training.html).
- The above python command runs in the background; you can view the results in the file `output.eval.log`. You will get accuracy output like the following:
```
Usage: sh script/run_distribute_train.sh [MINDSPORE_HCCL_CONFIG_PATH] [DATA_PATH]