# Getting Started

To set up the running environment, please refer to the [installation instructions](INSTALL.md).


## Training


#### Single-GPU Training


```bash
export CUDA_VISIBLE_DEVICES=0
python tools/train.py -c configs/faster_rcnn_r50_1x.yml
```

#### Multi-GPU Training


```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
python tools/train.py -c configs/faster_rcnn_r50_1x.yml
```

- Datasets are stored in `dataset/coco` by default (configurable).
- Datasets are downloaded automatically and cached in `~/.cache/paddle/dataset` if not found locally.
- Pretrained models are downloaded automatically and cached in `~/.cache/paddle/weights`.
- Model checkpoints are saved in `output` by default (configurable).
- To check the hyperparameters used, refer to the config file.

Training epochs can be alternated with evaluation runs by passing `--eval`
(tested with the `SSD` detector on Pascal-VOC; not recommended for two-stage
models or for training sessions on the COCO dataset).
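
When runs are launched from a script, the training command can be assembled programmatically. The following is a minimal sketch and not part of the repository: `build_train_command` is a hypothetical helper, and the config path in the usage line is a placeholder. Only the `tools/train.py` entry point, the `-c` option, and the `--eval` flag are taken from this guide.

```python
import sys


def build_train_command(config_path, eval_during_training=False):
    """Assemble the argv list for tools/train.py (hypothetical helper)."""
    cmd = [sys.executable, "tools/train.py", "-c", config_path]
    if eval_during_training:
        # --eval alternates training epochs with evaluation runs.
        cmd.append("--eval")
    return cmd


if __name__ == "__main__":
    # Placeholder config path; substitute a real config file.
    # The command is only printed here, not executed.
    print(" ".join(build_train_command("path/to/config.yml", eval_during_training=True)))
```

The returned list can be passed to `subprocess.run` from the repository root to start the run.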


## Evaluation


```bash
export CUDA_VISIBLE_DEVICES=0
# or run on CPU with:
# export CPU_NUM=1
python tools/eval.py -c configs/faster_rcnn_r50_1x.yml
```

- The checkpoint is loaded from `output` by default (configurable).
- Multi-GPU evaluation for R-CNN and SSD models is not supported at the
moment, but it is a planned feature.


## Inference


- Run inference on a single image:

```bash
export CUDA_VISIBLE_DEVICES=0
# or run on CPU with:
# export CPU_NUM=1
python tools/infer.py -c configs/faster_rcnn_r50_1x.yml --infer_img=demo/000000570688.jpg
```

- Batch inference:

```bash
export CUDA_VISIBLE_DEVICES=0
# or run on CPU with:
# export CPU_NUM=1
python tools/infer.py -c configs/faster_rcnn_r50_1x.yml --infer_dir=demo
```

The visualization files are saved in `output` by default. To specify a different
path, add the `--output_dir=` flag.

- Save inference model:

```bash
export CUDA_VISIBLE_DEVICES=0
# or run on CPU with:
# export CPU_NUM=1
python tools/infer.py -c configs/faster_rcnn_r50_1x.yml --infer_img=demo/000000570688.jpg \
                      --save_inference_model
```

Setting `--save_inference_model` saves an inference model that can be loaded by the PaddlePaddle prediction library.


## FAQ

**Q:**  Why do I get `NaN` loss values during single-GPU training? <br/>
**A:**  The default learning rate is tuned for multi-GPU training (8 GPUs). It must
be adapted for single-GPU training accordingly (e.g., divide by 8).
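
The adjustment in the answer above can be written as a small calculation. This is a sketch that assumes the learning rate scales linearly with the number of GPUs, which matches the "divide by 8" advice for a single GPU but is an assumption for other GPU counts. `scale_learning_rate` is a hypothetical helper, and the value `0.01` is illustrative rather than taken from any config file.

```python
def scale_learning_rate(base_lr, num_gpus, base_num_gpus=8):
    """Linearly rescale a learning rate tuned for base_num_gpus GPUs."""
    if num_gpus < 1 or base_num_gpus < 1:
        raise ValueError("GPU counts must be positive")
    return base_lr * num_gpus / base_num_gpus


if __name__ == "__main__":
    # Illustrative base learning rate, rescaled for a single GPU.
    print(scale_learning_rate(0.01, num_gpus=1))
```

The result would then be written into the learning rate setting of the config file in use.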


**Q:**  How to reduce GPU memory usage? <br/>
**A:**  Setting the environment variable `FLAGS_conv_workspace_size_limit` to a smaller
value can reduce the GPU memory footprint without affecting training speed.
For example, with `export FLAGS_conv_workspace_size_limit=512`, Mask-RCNN (R50)
can reach a batch size of 4 per GPU (Tesla V100 16GB).
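
If training is started from a Python launcher rather than a shell, the same variable can be set on the child process environment. This is a minimal sketch under that assumption: `env_with_workspace_limit` is a hypothetical helper, and only the variable name and the value `512` come from the answer above.

```python
import os


def env_with_workspace_limit(limit, base_env=None):
    """Return a copy of the environment with FLAGS_conv_workspace_size_limit set."""
    env = dict(os.environ if base_env is None else base_env)
    # Environment variable values must be strings.
    env["FLAGS_conv_workspace_size_limit"] = str(limit)
    return env


if __name__ == "__main__":
    env = env_with_workspace_limit(512)
    print(env["FLAGS_conv_workspace_size_limit"])
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when launching `tools/train.py`. The calling process's own environment is left unchanged.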