# Contents

- [LSTM Description](#lstm-description)
- [Model Architecture](#model-architecture)
- [Dataset](#dataset)
- [Environment Requirements](#environment-requirements)
- [Quick Start](#quick-start)
- [Script Description](#script-description)
    - [Script and Sample Code](#script-and-sample-code)
    - [Script Parameters](#script-parameters)
    - [Dataset Preparation](#dataset-preparation)
    - [Training Process](#training-process)
    - [Evaluation Process](#evaluation-process)
- [Model Description](#model-description)
    - [Performance](#performance)
        - [Training Performance](#training-performance)
        - [Evaluation Performance](#evaluation-performance)
- [Description of Random Situation](#description-of-random-situation)
- [ModelZoo Homepage](#modelzoo-homepage)


# [LSTM Description](#contents)

This example trains and evaluates an LSTM model for sentiment classification on the IMDb movie review dataset.

[Paper](https://www.aclweb.org/anthology/P11-1015/):  Andrew L. Maas, Raymond E. Daly, Peter T. Pham, Dan Huang, Andrew Y. Ng, Christopher Potts. [Learning Word Vectors for Sentiment Analysis](https://www.aclweb.org/anthology/P11-1015/). Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. 2011

# [Model Architecture](#contents)

The model consists of an embedding module, an encoder module, and a decoder module. The encoder is a (stacked, optionally bidirectional) LSTM; the decoder is a fully-connected layer that maps the encoder output to class scores.
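
For concreteness, here is a minimal MindSpore sketch of that architecture. It is illustrative only: the parameter names follow `config.py`, but the real implementation lives in `src/lstm.py` and may differ in detail (for example, in how the encoder output is pooled before the decoder).

```python
import numpy as np
import mindspore as ms
import mindspore.nn as nn

class SentimentNet(nn.Cell):
    """Embedding -> stacked (bi)LSTM encoder -> fully-connected decoder."""

    def __init__(self, vocab_size, embed_size, num_hiddens, num_layers,
                 bidirectional, num_classes, batch_size):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_size)
        self.encoder = nn.LSTM(input_size=embed_size,
                               hidden_size=num_hiddens,
                               num_layers=num_layers,
                               bidirectional=bidirectional,
                               batch_first=True)
        num_directions = 2 if bidirectional else 1
        # Zero initial states; older MindSpore versions require passing them in.
        state_shape = (num_layers * num_directions, batch_size, num_hiddens)
        self.h0 = ms.Tensor(np.zeros(state_shape, np.float32))
        self.c0 = ms.Tensor(np.zeros(state_shape, np.float32))
        self.decoder = nn.Dense(num_hiddens * num_directions, num_classes)

    def construct(self, inputs):
        embeddings = self.embedding(inputs)     # (batch, seq_len, embed_size)
        output, _ = self.encoder(embeddings, (self.h0, self.c0))
        return self.decoder(output[:, -1, :])   # logits from the last step
```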


# [Dataset](#contents)

- aclImdb_v1 for training and evaluation: [Large Movie Review Dataset](http://ai.stanford.edu/~amaas/data/sentiment/)
- GloVe: pre-trained vector representations for words: [GloVe: Global Vectors for Word Representation](https://nlp.stanford.edu/projects/glove/)


# [Environment Requirements](#contents)

- Hardware (GPU/CPU)
- Framework
  - [MindSpore](https://gitee.com/mindspore/mindspore)
- For more information, please check the resources below:
  - [MindSpore tutorials](https://www.mindspore.cn/tutorial/en/master/index.html)
  - [MindSpore API](https://www.mindspore.cn/api/en/master/index.html)


# [Quick Start](#contents)

- running on GPU

  ```bash
  # run training example
  bash run_train_gpu.sh 0 ./aclimdb ./glove_dir

  # run evaluation example
  bash run_eval_gpu.sh 0 ./aclimdb ./glove_dir lstm-20_390.ckpt
  ```

- running on CPU

  ```bash
  # run training example
  bash run_train_cpu.sh ./aclimdb ./glove_dir

  # run evaluation example
  bash run_eval_cpu.sh ./aclimdb ./glove_dir lstm-20_390.ckpt
  ```


# [Script Description](#contents)

## [Script and Sample Code](#contents)

```shell
.
├── lstm
    ├── README.md               # descriptions about LSTM
    ├── script
    │   ├── run_eval_gpu.sh     # shell script for evaluation on GPU
    │   ├── run_eval_cpu.sh     # shell script for evaluation on CPU
    │   ├── run_train_gpu.sh    # shell script for training on GPU
    │   └── run_train_cpu.sh    # shell script for training on CPU
    ├── src
    │   ├── config.py           # parameter configuration
    │   ├── dataset.py          # dataset preprocessing
    │   ├── imdb.py             # IMDb dataset reading script
    │   └── lstm.py             # sentiment model definition
    ├── eval.py                 # evaluation script on both GPU and CPU
    └── train.py                # training script on both GPU and CPU
```


## [Script Parameters](#contents)

### Training Script Parameters

```python
usage: train.py  [-h] [--preprocess {true, false}] [--aclimdb_path ACLIMDB_PATH]
                 [--glove_path GLOVE_PATH] [--preprocess_path PREPROCESS_PATH]
                 [--ckpt_path CKPT_PATH] [--pre_trained PRE_TRAINED]
                 [--device_target {GPU, CPU}]

MindSpore LSTM Example

options:
  -h, --help                          # show this help message and exit
  --preprocess {true, false}          # whether to preprocess data.
  --aclimdb_path ACLIMDB_PATH         # path where the dataset is stored.
  --glove_path GLOVE_PATH             # path where the GloVe is stored.
  --preprocess_path PREPROCESS_PATH   # path where the preprocessed data is stored.
  --ckpt_path CKPT_PATH               # the path to save the checkpoint file.
  --pre_trained                       # the pretrained checkpoint file path.
  --device_target                     # the target device to run, support "GPU", "CPU". Default: "GPU".
```


### Running Options

```python
config.py:
    num_classes                   # number of classes
    learning_rate                 # value of learning rate
    momentum                      # value of momentum
    num_epochs                    # number of training epochs
    batch_size                    # batch size of input dataset
    embed_size                    # the size of each embedding vector
    num_hiddens                   # number of features of hidden layer
    num_layers                    # number of layers of stacked LSTM
    bidirectional                 # specifies whether it is a bidirectional LSTM
    save_checkpoint_steps         # steps for saving checkpoint files
```
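
For illustration, `src/config.py` plausibly follows the `EasyDict` pattern used across the model zoo. The values below are assumptions except where noted in the comments; check the file itself for the authoritative defaults.

```python
# Hypothetical src/config.py; values are illustrative (see note above).
from easydict import EasyDict as edict

lstm_cfg = edict({
    'num_classes': 2,          # positive / negative reviews
    'learning_rate': 0.1,
    'momentum': 0.9,
    'num_epochs': 20,          # matches the performance table
    'batch_size': 64,          # matches the performance table
    'embed_size': 300,         # matches glove.6B.300d.txt used below
    'num_hiddens': 100,
    'num_layers': 2,
    'bidirectional': True,
    'save_checkpoint_steps': 390,  # one epoch is 390 steps at batch_size 64
})
```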

### Network Parameters

The network hyperparameters (`embed_size`, `num_hiddens`, `num_layers`, `bidirectional`) are set in `config.py`, as listed above.


## [Dataset Preparation](#contents)

- Download the aclImdb_v1 dataset.

> Unzip the aclImdb_v1 dataset to any path you want; the folder structure should be as follows:
> ```
> .
> ├── train  # train dataset
> └── test   # infer dataset
> ```

- Download the GloVe file.

> Unzip the glove.6B.zip to any path you want; the folder structure should be as follows:
> ```
> .
> ├── glove.6B.100d.txt
> ├── glove.6B.200d.txt
> ├── glove.6B.300d.txt    # we will use this one later.
> └── glove.6B.50d.txt
> ```

> Add a new line at the beginning of the file `glove.6B.300d.txt`.
> It states that the file contains a total of 400,000 words, each represented by a 300-dimensional word vector:
> ```
> 400000    300
> ```
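
The added header turns the GloVe file into the word2vec text format, so tools that read that format can load it directly. A quick sanity check, assuming `gensim` is installed (the repository's own loader in `src/imdb.py` may work differently):

```python
# Verify the modified GloVe file parses as word2vec text format.
from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format(
    './glove_dir/glove.6B.300d.txt', binary=False)
print(vectors.vector_size)       # expect 300
print(vectors['movie'].shape)    # expect (300,)
```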

## [Training Process](#contents)

- Set options in `config.py`, including learning rate and network hyperparameters.

- running on GPU

  Run the `run_train_gpu.sh` script for training.

  ```bash
  bash run_train_gpu.sh 0 ./aclimdb ./glove_dir
  ```

  The above shell script will run training in the background. You will get the loss value as follows:

  ```shell
  # grep "loss is " log.txt
  epoch: 1 step: 390, loss is 0.6003723
  epoch: 2 step: 390, loss is 0.35312173
  ...
  ```

- running on CPU

  Run the `run_train_cpu.sh` script for training.

  ```bash
  bash run_train_cpu.sh ./aclimdb ./glove_dir
  ```

  The above shell script will run training in the background. You will get the loss value as follows:

  ```shell
  # grep "loss is " log.txt
  epoch: 1 step: 390, loss is 0.6003723
  epoch: 2 step: 390, loss is 0.35312173
  ...
  ```
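
Both scripts wrap `train.py`. For readers who want to see the moving parts, below is a condensed, hypothetical sketch of the training loop it presumably implements, assuming the standard MindSpore `Model` API; `SentimentNet` refers to the sketch in the Model Architecture section, and the toy dataset stands in for the aclImdb pipeline built by `src/dataset.py`. Consult `train.py` for the authoritative code.

```python
import numpy as np
import mindspore.nn as nn
import mindspore.dataset as ds
from mindspore import Model
from mindspore.train.callback import (CheckpointConfig, LossMonitor,
                                      ModelCheckpoint)

# Toy stand-ins so the sketch is self-contained; the real pipeline reads
# aclImdb (src/dataset.py) and the real network lives in src/lstm.py.
network = SentimentNet(vocab_size=1000, embed_size=300, num_hiddens=100,
                       num_layers=2, bidirectional=True, num_classes=2,
                       batch_size=64)
features = np.random.randint(0, 1000, (64, 500)).astype(np.int32)
labels = np.random.randint(0, 2, (64,)).astype(np.int32)
train_ds = ds.NumpySlicesDataset({'feature': features, 'label': labels},
                                 shuffle=True).batch(64)

# Loss and optimizer match the performance tables below; the learning-rate
# and momentum values here are illustrative.
loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
opt = nn.Momentum(network.trainable_params(), learning_rate=0.1, momentum=0.9)
model = Model(network, loss, opt, metrics={'acc'})

# Save a checkpoint every 390 steps (one epoch of aclImdb at batch_size=64).
ckpt_cb = ModelCheckpoint(prefix='lstm', directory='./ckpt',
                          config=CheckpointConfig(save_checkpoint_steps=390))
model.train(20, train_ds, callbacks=[LossMonitor(), ckpt_cb])
```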


## [Evaluation Process](#contents)

- evaluation on GPU

  Run the `run_eval_gpu.sh` script for evaluation.

  ```bash
  bash run_eval_gpu.sh 0 ./aclimdb ./glove_dir lstm-20_390.ckpt
  ```

- evaluation on CPU

  Run the `run_eval_cpu.sh` script for evaluation.

  ```bash
  bash run_eval_cpu.sh ./aclimdb ./glove_dir lstm-20_390.ckpt
  ```
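
Both scripts wrap `eval.py`, which restores the trained checkpoint and reports accuracy on the test split. A condensed, hypothetical sketch, assuming a recent MindSpore (older versions expose the checkpoint helpers under `mindspore.train.serialization`); `network`, `loss`, and a test pipeline `test_ds` are built as in the training sketch above. See `eval.py` for the real code.

```python
from mindspore import Model, load_checkpoint, load_param_into_net

# `network`, `loss`, and `test_ds` are assumed to be built as in the
# training sketch above.
param_dict = load_checkpoint('lstm-20_390.ckpt')   # trained weights
load_param_into_net(network, param_dict)

model = Model(network, loss, metrics={'acc'})
print(model.eval(test_ds))   # e.g. {'acc': 0.84} on GPU
```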

# [Model Description](#contents)

## [Performance](#contents)

### Training Performance

| Parameters                 | LSTM (GPU)                                                     | LSTM (CPU)                 |
| -------------------------- | -------------------------------------------------------------- | -------------------------- |
| Resource                   | Tesla V100-SMX2-16GB                                           | Ubuntu X86-i7-8565U-16GB   |
| Uploaded Date              | 08/06/2020 (month/day/year)                                    | 08/06/2020 (month/day/year)|
| MindSpore Version          | 0.6.0-beta                                                     | 0.6.0-beta                 |
| Dataset                    | aclimdb_v1                                                     | aclimdb_v1                 |
| Training Parameters        | epoch=20, batch_size=64                                        | epoch=20, batch_size=64    |
| Optimizer                  | Momentum                                                       | Momentum                   |
| Loss Function              | Softmax Cross Entropy                                          | Softmax Cross Entropy      |
| Speed                      | 1022 (1 device)                                                | 20                         |
| Loss                       | 0.12                                                           | 0.12                       |
| Params (M)                 | 6.45                                                           | 6.45                       |
| Checkpoint for inference   | 292.9M (.ckpt file)                                            | 292.9M (.ckpt file)        |
| Scripts                    | [lstm script](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/nlp/lstm) | [lstm script](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/nlp/lstm) |


### Evaluation Performance

| Parameters          | LSTM (GPU)                  | LSTM (CPU)                   |
| ------------------- | --------------------------- | ---------------------------- |
| Resource            | Tesla V100-SMX2-16GB        | Ubuntu X86-i7-8565U-16GB     |
| Uploaded Date       | 08/06/2020 (month/day/year) | 08/06/2020 (month/day/year)  |
| MindSpore Version   | 0.6.0-beta                  | 0.6.0-beta                   |
| Dataset             | aclimdb_v1                  | aclimdb_v1                   |
| batch_size          | 64                          | 64                           |
| Accuracy            | 84%                         | 83%                          |


# [Description of Random Situation](#contents)

There are two sources of randomness:
- Shuffle of the dataset.
- Initialization of some model weights.
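
If reproducible runs are needed, both sources can be pinned by fixing the global and dataset seeds. A minimal sketch, assuming a reasonably recent MindSpore (the exact API location differs across versions):

```python
import mindspore as ms
import mindspore.dataset as ds

ms.set_seed(1)         # fixes weight initialization and other global RNGs
ds.config.set_seed(1)  # fixes the dataset shuffle order
```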


# [ModelZoo Homepage](#contents)

Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo).