# Getting Started

For setting up the test environment, please refer to [installation
instructions](INSTALL.md).


## Training


#### Single-GPU Training
```bash
export CUDA_VISIBLE_DEVICES=0
python tools/train.py -c configs/faster_rcnn_r50_1x.yml
```

#### Multi-GPU Training

```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
python tools/train.py -c configs/faster_rcnn_r50_1x.yml
```

- Datasets are stored in `dataset/coco` by default (configurable).
- Pretrained models are downloaded automatically and cached in `~/.cache/paddle/weights`.
- Model checkpoints are saved in `output` by default (configurable).
- For the hyperparameters used, please refer to the config file.
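If your version of the training tool supports `-o` key=value overrides of config entries (check `python tools/train.py --help` to confirm), the defaults above can be changed without editing the config file. This is a sketch; `save_dir` is an assumed key name taken from typical config files:

```bash
export CUDA_VISIBLE_DEVICES=0
# Hypothetical override: write checkpoints to my_output instead of output
python tools/train.py -c configs/faster_rcnn_r50_1x.yml \
                      -o save_dir=my_output
```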

Alternating between training epochs and evaluation runs is possible: simply pass
`--eval=True` to do so (tested with the `SSD` detector on Pascal VOC; not
recommended for two-stage models or training sessions on the COCO dataset).


## Evaluation


```bash
export CUDA_VISIBLE_DEVICES=0
# or run on CPU with:
# export CPU_NUM=1
python tools/eval.py -c configs/faster_rcnn_r50_1x.yml
```

- Checkpoints are loaded from `output` by default (configurable).
- Multi-GPU evaluation for R-CNN and SSD models is not supported at the
  moment, but it is a planned feature.


## Inference


- Run inference on a single image:

```bash
export CUDA_VISIBLE_DEVICES=0
# or run on CPU with:
# export CPU_NUM=1
python tools/infer.py -c configs/faster_rcnn_r50_1x.yml --infer_img=demo/000000570688.jpg
```

- Batch inference:

```bash
export CUDA_VISIBLE_DEVICES=0
# or run on CPU with:
# export CPU_NUM=1
python tools/infer.py -c configs/faster_rcnn_r50_1x.yml --infer_dir=demo
```

Visualization files are saved in `output` by default; to specify a different
path, simply add the `--save_file=` flag.
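As a sketch, saving visualizations to a custom directory (the directory name `inference_vis` is hypothetical) would look like:

```bash
export CUDA_VISIBLE_DEVICES=0
# Write visualized detections to inference_vis instead of output
python tools/infer.py -c configs/faster_rcnn_r50_1x.yml \
                      --infer_img=demo/000000570688.jpg \
                      --save_file=inference_vis
```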
- Save inference model

```bash
export CUDA_VISIBLE_DEVICES=0
# or run on CPU with:
# export CPU_NUM=1
python tools/infer.py -c configs/faster_rcnn_r50_1x.yml --infer_img=demo/000000570688.jpg \
                      --save_inference_model
```

The inference model is saved by setting `--save_inference_model`.


## FAQ

**Q:**  Why do I get `NaN` loss values during single-GPU training? <br/>
**A:**  The default learning rate is tuned for multi-GPU training (8 GPUs); it
must be adapted for single-GPU training accordingly (e.g., divided by 8).
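Assuming the config exposes the base learning rate under a key such as `LearningRate.base_lr` and the tools accept `-o` overrides (both are assumptions; check your config file and `--help` output), scaling an 8-GPU default down for one GPU could look like:

```bash
export CUDA_VISIBLE_DEVICES=0
# If the 8-GPU default in the config is 0.01, divide by 8: 0.01 / 8 = 0.00125
python tools/train.py -c configs/faster_rcnn_r50_1x.yml \
                      -o LearningRate.base_lr=0.00125
```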


**Q:**  How can I reduce GPU memory usage? <br/>
**A:**  Setting the environment variable `FLAGS_conv_workspace_size_limit` to a smaller
number can reduce the GPU memory footprint without affecting training speed.
Take Mask R-CNN (R50) as an example: by setting `export FLAGS_conv_workspace_size_limit=512`,
the batch size can reach 4 per GPU (Tesla V100, 16GB).
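Putting the answer above together as a runnable sketch (the Mask R-CNN config name is assumed; adjust it to match your checkout):

```bash
export CUDA_VISIBLE_DEVICES=0
# Cap the cuDNN convolution workspace (in MB) to shrink the memory footprint
export FLAGS_conv_workspace_size_limit=512
python tools/train.py -c configs/mask_rcnn_r50_1x.yml
```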