# Getting Started

For setting up the running environment, please refer to the [installation instructions](INSTALL.md).


## Training

#### Single-GPU Training


```bash
export CUDA_VISIBLE_DEVICES=0
export PYTHONPATH=$PYTHONPATH:.
python tools/train.py -c configs/faster_rcnn_r50_1x.yml
```

#### Multi-GPU Training

```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
export PYTHONPATH=$PYTHONPATH:.
python tools/train.py -c configs/faster_rcnn_r50_1x.yml
```

#### CPU Training

```bash
export CPU_NUM=8
export PYTHONPATH=$PYTHONPATH:.
python tools/train.py -c configs/faster_rcnn_r50_1x.yml -o use_gpu=false
```

##### Optional arguments

- `-r` or `--resume_checkpoint`: Checkpoint path for resuming training, e.g. `-r output/faster_rcnn_r50_1x/10000`.
- `--eval`: Whether to run evaluation during training. Default is `False`.
- `--output_eval`: Directory for evaluation output when evaluating during training. Default is the current directory.
- `-d` or `--dataset_dir`: Dataset path, same as `dataset_dir` in the config file, e.g. `-d dataset/coco`.
- `-c`: Config file to use. All config files are stored in `configs/`.
- `-o`: Override options from the config file, e.g. `-o max_iters=180000`. Values set with `-o` take priority over those in the file given by `-c`.
- `--use_tb`: Whether to record data with [tb-paddle](https://github.com/linshuliang/tb-paddle) for display in Tensorboard. Default is `False`.
- `--tb_log_dir`: tb-paddle logging directory for scalars. Default is `tb_log_dir/scalar`.
- `--fp16`: Whether to enable mixed precision training (requires GPU). Default is `False`.
- `--loss_scale`: Loss scaling factor for mixed precision training. Default is `8.0`.

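To illustrate how `-o` takes priority over the config file, here is a minimal Python sketch. It is not the actual PaddleDetection implementation: the helpers `parse_value` and `apply_overrides`, the flat dictionary layout, and the sample config values are all assumptions made for illustration.

```python
import ast

def parse_value(text):
    """Interpret a command-line value as bool, number, list, or plain string."""
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Not a Python literal (e.g. a file path): keep it as a string.
        return text

def apply_overrides(config, overrides):
    """Return a copy of `config` where each `key=value` override wins."""
    merged = dict(config)
    for item in overrides:
        key, _, value = item.partition("=")
        merged[key.strip()] = parse_value(value.strip())
    return merged

# Hypothetical values as they might be read from the file given by -c.
file_config = {"max_iters": 90000, "use_gpu": True}

# Equivalent of passing: -o max_iters=180000 use_gpu=false
merged = apply_overrides(file_config, ["max_iters=180000", "use_gpu=false"])
print(merged)
```

Options that are not overridden keep the value from the config file.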

##### Examples

- Perform evaluation during training
```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
export PYTHONPATH=$PYTHONPATH:.
python -u tools/train.py -c configs/faster_rcnn_r50_1x.yml --eval
```

To alternate between training and evaluation, simply pass `--eval`. Evaluation then runs every `snapshot_iter` iterations, which can be changed via `snapshot_iter` in the configuration file. If the evaluation dataset is large and slows training down, we suggest evaluating less often or only after training. When evaluating during training, the model with the highest mAP so far is saved at each `snapshot_iter`. `best_model` is saved in the same directory as `model_final`.


- Configure dataset path
```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
export PYTHONPATH=$PYTHONPATH:.
python -u tools/train.py -c configs/faster_rcnn_r50_1x.yml \
                         -d dataset/coco
```

- Fine-tune on another task

When fine-tuning a pre-trained model on another task, the pre-trained parameters to exclude can be set with `finetune_exclude_pretrained_params` in the YAML config or with `-o finetune_exclude_pretrained_params` on the command line.

```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
export PYTHONPATH=$PYTHONPATH:.
python -u tools/train.py -c configs/faster_rcnn_r50_1x.yml \
                         -o pretrain_weights=output/faster_rcnn_r50_1x/model_final/ \
                            finetune_exclude_pretrained_params=['cls_score','bbox_pred']
```

- Mixed Precision Training

Mixed precision training can be enabled with the `--fp16` flag. Currently Faster-FPN, Mask-FPN and Yolov3 have been verified to work with little to no loss of precision (less than 0.2 mAP).

To speed up mixed precision training, it is recommended to train in multi-process mode, for example:

```bash
export PYTHONPATH=$PYTHONPATH:.
python -m paddle.distributed.launch --selected_gpus 0,1,2,3,4,5,6,7 tools/train.py --fp16 -c configs/faster_rcnn_r50_fpn_1x.yml
```

If the loss becomes `NaN` during training, try tweaking the `--loss_scale` value. Please refer to the NVIDIA [documentation](https://docs.nvidia.com/deeplearning/sdk/mixed-precision-training/index.html#mptrain) on mixed precision training for details.

Also, please note mixed precision training currently requires changing `norm_type` from `affine_channel` to `bn`.

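To see why the value of `--loss_scale` matters, the following standalone Python sketch rounds numbers to half precision using only the standard library. It does not use PaddlePaddle, and the gradient values are hypothetical. A scale that is too small lets tiny gradients underflow to zero, while a scale that is too large pushes big values past the fp16 range, which can surface as `NaN` losses.

```python
import math
import struct

def to_fp16(value):
    """Round a Python float to the nearest IEEE half-precision value."""
    try:
        return struct.unpack("<e", struct.pack("<e", value))[0]
    except OverflowError:
        # struct refuses values beyond the fp16 range; model that as infinity.
        return math.copysign(math.inf, value)

loss_scale = 8.0    # the documented default of --loss_scale
tiny_grad = 1e-8    # hypothetical gradient, too small for fp16
large_grad = 1e4    # hypothetical gradient that is already large

print(to_fp16(tiny_grad))                # underflows to 0.0 without scaling
print(to_fp16(tiny_grad * loss_scale))   # stays non-zero once scaled
print(to_fp16(large_grad * loss_scale))  # exceeds the fp16 range: inf
```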

##### NOTES

- `CUDA_VISIBLE_DEVICES` selects which GPUs are used, e.g. `export CUDA_VISIBLE_DEVICES=0,1,2,3`. For how to adjust the settings to the number of GPUs, see the [FAQ](#faq).
- The dataset is stored in `dataset/coco` by default (configurable).
- The dataset is downloaded automatically and cached in `~/.cache/paddle/dataset` if it is not found locally.
- The pretrained model is downloaded automatically and cached in `~/.cache/paddle/weights`.
- Model checkpoints are saved in `output` by default (configurable).
- When fine-tuning, users can set `pretrain_weights` to the models published by PaddlePaddle. Parameters matched by fields in `finetune_exclude_pretrained_params` are skipped when loading; the fields support wildcard matching. For detailed information, please refer to [Transfer Learning](TRANSFER_LEARNING.md).
- To check the hyperparameters used, please refer to the [configs](../configs).
- Training RCNN models on CPU is not supported on PaddlePaddle<=1.5.1; this will be fixed in a later version.



## Evaluation

```bash
# run on GPU with:
export PYTHONPATH=$PYTHONPATH:.
export CUDA_VISIBLE_DEVICES=0
python tools/eval.py -c configs/faster_rcnn_r50_1x.yml
```

#### Optional arguments

- `-d` or `--dataset_dir`: Dataset path, same as `dataset_dir` in the config file, e.g. `-d dataset/coco`.
- `--output_eval`: Directory for evaluation output. Default is the current directory.
- `-o`: Override options from the config file, e.g. `-o weights=output/faster_rcnn_r50_1x/model_final`.
- `--json_eval`: Whether to evaluate with an existing `bbox.json` or `mask.json`. Default is `False`. The directory containing the JSON file is given by the `-f` argument.

#### Examples

- Evaluate with a specified weights path and dataset path
```bash
# run on GPU with:
export PYTHONPATH=$PYTHONPATH:.
export CUDA_VISIBLE_DEVICES=0
python -u tools/eval.py -c configs/faster_rcnn_r50_1x.yml \
                        -o weights=output/faster_rcnn_r50_1x/model_final \
                        -d dataset/coco
```

- Evaluate with an existing JSON result file
```bash
# run on GPU with:
export PYTHONPATH=$PYTHONPATH:.
export CUDA_VISIBLE_DEVICES=0
python tools/eval.py -c configs/faster_rcnn_r50_1x.yml \
             --json_eval \
             -f evaluation/
```

The JSON file must be named `bbox.json` or `mask.json` and placed in the `evaluation/` directory. If `-f` is omitted, the current directory is used.

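As a rough sketch of how such a file could be prepared by hand, the Python snippet below writes a `bbox.json` into a target directory. It assumes the file follows the COCO detection results format (a list of records with `image_id`, `category_id`, `bbox` as `[x, y, width, height]`, and `score`); that format, the helper `write_bbox_json`, and the sample values are assumptions for illustration, so check them against the output of your own evaluation run.

```python
import json
import os

def write_bbox_json(results, output_dir="evaluation"):
    """Write detection results to <output_dir>/bbox.json and return the path."""
    os.makedirs(output_dir, exist_ok=True)
    path = os.path.join(output_dir, "bbox.json")
    with open(path, "w") as f:
        json.dump(results, f)
    return path

# Hypothetical detection record in COCO results style.
sample_results = [
    {"image_id": 570688, "category_id": 1,
     "bbox": [100.0, 50.0, 80.0, 120.0], "score": 0.92},
]

if __name__ == "__main__":
    print(write_bbox_json(sample_results, "evaluation"))
```
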
#### NOTES

- Checkpoints are loaded from `output` by default (configurable).
- Multi-GPU evaluation for R-CNN and SSD models is not supported at the moment, but it is a planned feature.


## Inference


- Run inference on a single image:

```bash
# run on GPU with:
export PYTHONPATH=$PYTHONPATH:.
export CUDA_VISIBLE_DEVICES=0
python tools/infer.py -c configs/faster_rcnn_r50_1x.yml --infer_img=demo/000000570688.jpg
```

- Multi-image inference:

```bash
# run on GPU with:
export PYTHONPATH=$PYTHONPATH:.
export CUDA_VISIBLE_DEVICES=0
python tools/infer.py -c configs/faster_rcnn_r50_1x.yml --infer_dir=demo
```

#### Optional arguments

- `--output_dir`: Directory for storing the output visualization files.
- `--draw_threshold`: Score threshold for keeping results in the visualization. Default is 0.5.
- `--save_inference_model`: Save the inference model in `output_dir` if True.
- `--use_tb`: Whether to record data with [tb-paddle](https://github.com/linshuliang/tb-paddle) for display in Tensorboard. Default is `False`.
- `--tb_log_dir`: tb-paddle logging directory for images. Default is `tb_log_dir/image`.

#### Examples

- Specify the output directory and set the drawing threshold

```bash
# run on GPU with:
export PYTHONPATH=$PYTHONPATH:.
export CUDA_VISIBLE_DEVICES=0
python tools/infer.py -c configs/faster_rcnn_r50_1x.yml \
                      --infer_img=demo/000000570688.jpg \
                      --output_dir=infer_output/ \
                      --draw_threshold=0.5 \
                      -o weights=output/faster_rcnn_r50_1x/model_final \
                      --use_tb=True
```

The visualization files are saved in `output` by default. To specify a different path, simply add the `--output_dir=` flag.
`--draw_threshold` is an optional argument. Default is 0.5.
Different thresholds will produce different results depending on the calculation of [NMS](https://ieeexplore.ieee.org/document/1699659).
To run inference with a custom model, set `-o weights` to its path.
`--use_tb` is an optional argument. If `--use_tb` is `True`, tb-paddle records data in the directory given by `--tb_log_dir`, so users can view the results in Tensorboard.

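The effect of `--draw_threshold` can be pictured as a simple score filter applied before drawing. The Python sketch below is illustrative only: the helper `filter_detections`, the record layout, the sample scores, and the inclusive comparison are assumptions, not the tool's actual code.

```python
def filter_detections(detections, draw_threshold=0.5):
    """Keep only detections whose score reaches the threshold."""
    return [det for det in detections if det["score"] >= draw_threshold]

# Hypothetical detections for a single image.
detections = [
    {"label": "person", "score": 0.91},
    {"label": "dog", "score": 0.48},
    {"label": "bicycle", "score": 0.63},
]

for threshold in (0.5, 0.7):
    kept = [det["label"] for det in filter_detections(detections, threshold)]
    print(threshold, kept)
```
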
- Save inference model

```bash
# run on GPU with:
export CUDA_VISIBLE_DEVICES=0
export PYTHONPATH=$PYTHONPATH:.
python tools/infer.py -c configs/faster_rcnn_r50_1x.yml \
                      --infer_img=demo/000000570688.jpg \
                      --save_inference_model
```

Setting `--save_inference_model` saves an inference model that can be loaded by the PaddlePaddle prediction library.


## FAQ

**Q:**  Why do I get `NaN` loss values during single-GPU training? <br/>
**A:**  The default learning rate is tuned for multi-GPU training (8 GPUs) and must
be adapted for single-GPU training accordingly (e.g., divided by 8).
The following settings are equivalent: <br/>


| GPU number  | Learning rate  | Max_iters | Milestones       |
| :---------: | :------------: | :-------: | :--------------: |
| 2           | 0.0025         | 720000    | [480000, 640000] |
| 4           | 0.005          | 360000    | [240000, 320000] |
| 8           | 0.01           | 180000    | [120000, 160000] |

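The rows of the table follow a linear scaling pattern: the learning rate grows with the number of GPUs, while `Max_iters` and the milestones shrink by the same factor. As a minimal sketch, the values for other GPU counts (including a single GPU) can be derived from the 8-GPU row. The helper `scale_schedule` and its argument names are illustrative and not part of the toolkit.

```python
def scale_schedule(num_gpus, base_gpus=8, base_lr=0.01,
                   base_max_iters=180000, base_milestones=(120000, 160000)):
    """Rescale the 8-GPU settings from the table to `num_gpus` GPUs."""
    if num_gpus < 1 or base_gpus % num_gpus != 0:
        raise ValueError("num_gpus should evenly divide base_gpus")
    factor = base_gpus // num_gpus  # e.g. 8 when training on a single GPU
    return {
        # Fewer GPUs: smaller learning rate, proportionally more iterations.
        "learning_rate": base_lr / factor,
        "max_iters": base_max_iters * factor,
        "milestones": [m * factor for m in base_milestones],
    }

print(scale_schedule(1))
print(scale_schedule(4))
```

The resulting numbers would then be written into the config file or passed with `-o`, using the option names defined in the chosen config.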

**Q:**  How to reduce GPU memory usage? <br/>
**A:**  Setting the environment variable `FLAGS_conv_workspace_size_limit` to a smaller
number can reduce the GPU memory footprint without affecting training speed.
Taking Mask-RCNN (R50) as an example, by setting `export FLAGS_conv_workspace_size_limit=512`,
the batch size can reach 4 per GPU (Tesla V100 16GB).


**Q:**  How to change data preprocessing? <br/>
**A:**  Set `sample_transform` in the configuration. Note that **the complete list of transforms** must be given in the configuration,
for example `DecodeImage`, `NormalizeImage` and `Permute` for RCNN models. For a detailed description, please refer
to [config_example](config_example).