# LRC Local Rademacher Complexity Regularization
This directory contains an image classification model based on a novel regularizer rooted in Local Rademacher Complexity (LRC). The model combines LRC regularization with [DARTS]( and achieves 97.3% accuracy on the CIFAR-10 dataset.

# Table of Contents

- [Installation](#installation)
- [Data preparation](#data-preparation)
- [Training](#training)
- [Model performances](#model-performances)

## Installation

Running the sample code in this directory requires PaddlePaddle Fluid v1.2.0 or later. If the PaddlePaddle version on your device is older than this, please follow the instructions in the [installation document]( and update it.

## Data preparation

To use the CIFAR-10 dataset for the first time, download it with:

    sh ./dataset/

Please make sure your environment has an internet connection.

The dataset will be downloaded to `dataset/cifar/cifar-10-batches-py` in the same directory as the ``. If automatic download fails, you can download cifar-10-python.tar.gz from and decompress it to the location mentioned above.
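
After a manual download, it can help to confirm that the archive was decompressed to the expected place. The following is a minimal sketch, not part of this repository: the helper name `missing_cifar_files` is hypothetical, the file list is the standard content of the CIFAR-10 Python archive, and the default path assumes the script is run from the directory that contains `dataset/`.

```python
import os

# Data files shipped inside the standard cifar-10-python.tar.gz archive.
EXPECTED_FILES = (
    ["data_batch_%d" % i for i in range(1, 6)] + ["test_batch", "batches.meta"]
)


def missing_cifar_files(data_dir="dataset/cifar/cifar-10-batches-py"):
    """Return the expected CIFAR-10 files that are absent from data_dir.

    Hypothetical helper for illustration; the default path is the
    location named in this README, relative to the working directory.
    """
    return [
        name
        for name in EXPECTED_FILES
        if not os.path.isfile(os.path.join(data_dir, name))
    ]
```

An empty list means all five training batches, the test batch, and the metadata file are present.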

## Training

After preparing the data, start training with:

    python -u \
        --batch_size=80 \
        --auxiliary \
        --weight_decay=0.0003 \
        --learning_rate=0.025 \
        --lrc_loss_lambda=0.7 \
- Set ```export CUDA_VISIBLE_DEVICES=0``` to specify a single GPU for training.
- For more help on arguments:

    python --help

**data reader introduction:**

* Data reader is defined in ``.
* Images are reshaped to 32 * 32.
* In the training stage, images are padded to 40 * 40 and randomly cropped back to the original size.
* In the training stage, images are randomly flipped horizontally.
* Images are standardized to (0, 1).
* In the training stage, cutout is applied to images at random.
* Shuffle the order of the input images during training.
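
The training-time steps listed above can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the repository's data reader: zero padding, a flip probability of 0.5, a 16-pixel cutout patch, and scaling by 255 are all assumed, and the function name `augment` is hypothetical.

```python
import numpy as np


def augment(image, rng, pad=4, cutout_size=16):
    """Augment one HWC image of shape (32, 32, 3) with values in [0, 1].

    pad=4 turns 32x32 into 40x40; cutout_size=16 is an assumed value.
    """
    h, w, _ = image.shape
    # Zero-pad to 40x40, then crop a random 32x32 window.
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)), mode="constant")
    top = rng.integers(0, 2 * pad + 1)
    left = rng.integers(0, 2 * pad + 1)
    out = padded[top:top + h, left:left + w].copy()
    # Flip horizontally with probability 0.5 (assumed).
    if rng.random() < 0.5:
        out = out[:, ::-1].copy()
    # Cutout: zero a square patch centred on a random pixel,
    # clipped where it crosses the image border.
    cy, cx = rng.integers(0, h), rng.integers(0, w)
    y0, y1 = max(cy - cutout_size // 2, 0), min(cy + cutout_size // 2, h)
    x0, x1 = max(cx - cutout_size // 2, 0), min(cx + cutout_size // 2, w)
    out[y0:y1, x0:x1] = 0.0
    return out


rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(32, 32, 3))    # stand-in for a CIFAR-10 image
image = raw.astype("float32") / 255.0           # scale pixel values into [0, 1]
augmented = augment(image, rng)
```

The output keeps the 32 * 32 shape, so the network input size is unchanged by augmentation.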

**model configuration:**

* Use auxiliary loss and auxiliary\_weight=0.4.
* Use dropout and drop\_path\_prob=0.2.
* Set lrc\_loss\_lambda=0.7.
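
One plausible reading of these weights is a weighted sum of three loss terms. The sketch below assumes exactly that; the README does not state how the terms are combined, so the function `total_loss` and its formula are assumptions rather than the training script's actual code.

```python
def total_loss(main_loss, aux_loss, lrc_loss,
               auxiliary_weight=0.4, lrc_loss_lambda=0.7):
    """Combine the loss terms as a weighted sum (assumed form).

    main_loss: classification loss of the main head
    aux_loss:  classification loss of the auxiliary head
    lrc_loss:  value of the LRC regularization term
    """
    return main_loss + auxiliary_weight * aux_loss + lrc_loss_lambda * lrc_loss
```

Under this assumption, setting `lrc_loss_lambda=0` would switch the LRC regularizer off while leaving the auxiliary loss in place.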

**training strategy:**

*  Use momentum optimizer with momentum=0.9.
*  Weight decay is 0.0003.
*  Use cosine decay with init\_lr=0.025.
*  Train for 600 epochs in total.
*  Use the Xavier initializer for conv2d weights, the Constant initializer for batch norm weights, and the Normal initializer for fc weights.
*  Initialize the biases in batch norm and fc to zero, and do not add a bias to conv2d.
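
To make the learning-rate schedule concrete, here is a small sketch of cosine decay using the values listed above. It assumes the rate is updated once per epoch, with no warm-up and a final value of zero; the training script may step per iteration instead, and the function name `cosine_decay_lr` is hypothetical.

```python
import math


def cosine_decay_lr(epoch, init_lr=0.025, total_epochs=600):
    """Learning rate at a given epoch under cosine decay (assumed per-epoch)."""
    return init_lr * 0.5 * (1.0 + math.cos(math.pi * epoch / total_epochs))
```

With these defaults the rate starts at 0.025, is half that value at epoch 300, and approaches zero at epoch 600.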

## Model performances
Below is the accuracy on the CIFAR-10 dataset:

| model | avg top1 | avg top5 |
| ----- | -------- | -------- |
| [DARTS-LRC]( | 97.34 | 99.75 |
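
For readers unfamiliar with the column names, top-k accuracy counts a prediction as correct when the true label is among the k highest-scoring classes. The sketch below shows the metric on toy data; it is not the evaluation code of this repository, and how the "avg" in the table is computed (for example, over runs or over batches) is not stated here.

```python
def topk_accuracy(scores, labels, k=1):
    """Percentage of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the top k.
        topk = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in topk
    return 100.0 * hits / len(labels)


# Toy example: 3 samples, 3 classes.
scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
labels = [1, 1, 2]
top1 = topk_accuracy(scores, labels, k=1)
top2 = topk_accuracy(scores, labels, k=2)
```

In the toy data the second sample is missed at k=1 but recovered at k=2, which is why top-5 figures are always at least as high as top-1 figures.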