Commit 214c538a, authored by Guanghua Yu, committed by wangguanzhong

[PaddleDetection] update GETTING_STARTED.md (#2982)

* update GETTING_STARTED.md
Parent ccb2ce8c
......@@ -6,45 +6,122 @@ instructions](INSTALL.md).
## Training
#### Single-GPU Training
```bash
export CUDA_VISIBLE_DEVICES=0
export PYTHONPATH=$PYTHONPATH:.
python tools/train.py -c configs/faster_rcnn_r50_1x.yml
```
#### Multi-GPU Training
```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
export PYTHONPATH=$PYTHONPATH:.
python tools/train.py -c configs/faster_rcnn_r50_1x.yml
```
#### CPU Training
```bash
export CPU_NUM=8
export PYTHONPATH=$PYTHONPATH:.
python tools/train.py -c configs/faster_rcnn_r50_1x.yml
```
##### Optional arguments
- `-r` or `--resume_checkpoint`: Checkpoint path for resuming training, for example: `-r output/faster_rcnn_r50_1x/10000`
- `--eval`: Whether to run evaluation during training. Default is `False`.
- `-p` or `--output_eval`: When evaluating during training, sets the directory where evaluation results are saved. Default is the current directory.
- `-d` or `--dataset_dir`: Dataset path, same as `dataset_dir` in the config file, for example: `-d dataset/coco`
- `-o`: Set options from the config file, for example: `-o weights=output/faster_rcnn_r50_1x/model_final`
##### Examples
- Evaluate during training
```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
export PYTHONPATH=$PYTHONPATH:.
python -u tools/train.py -c configs/faster_rcnn_r50_1x.yml --eval
```
Passing `--eval` alternates training with evaluation: an evaluation runs every `snapshot_iter` iterations.
The interval can be changed through `snapshot_iter` in the configuration file. If the evaluation dataset is large and
slows training down, we suggest evaluating less often or evaluating after training finishes.
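
To gauge how much work `--eval` adds, it can help to count how many evaluation runs a schedule triggers. The sketch below is only an illustration: `evals_per_run` is a hypothetical helper, the `max_iters` value is the 8-GPU figure from the FAQ table, and the `snapshot_iter` value of 10000 is an assumption that should be checked against the config file.

```bash
# Hypothetical helper: number of evaluation runs when one evaluation
# happens every snapshot_iter iterations.
# Usage: evals_per_run MAX_ITERS SNAPSHOT_ITER
evals_per_run() {
  echo $(( $1 / $2 ))
}

# 180000 is the 8-GPU max_iters from the FAQ table;
# 10000 is an assumed snapshot_iter, not a value confirmed by this guide.
evals_per_run 180000 10000   # prints 18
```

Doubling `snapshot_iter` halves the number of evaluation runs, which is one way to follow the advice above.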
- Set configuration options and specify the dataset path
```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
export PYTHONPATH=$PYTHONPATH:.
python -u tools/train.py -c configs/faster_rcnn_r50_1x.yml \
-o weights=output/faster_rcnn_r50_1x/model_final \
-d dataset/coco
```
##### NOTES
- `CUDA_VISIBLE_DEVICES` selects which GPUs are used, for example: `export CUDA_VISIBLE_DEVICES=0,1,2,3`. For the GPU calculation rules, see the [FAQ](#faq).
- The dataset is stored in `dataset/coco` by default (configurable).
- The dataset is downloaded automatically and cached in `~/.cache/paddle/dataset` if it is not found locally.
- The pretrained model is downloaded automatically and cached in `~/.cache/paddle/weights`.
- Model checkpoints are saved in `output` by default (configurable).
- To check the hyperparameters used, please refer to the config file.
- Training RCNN models on CPU is not supported on PaddlePaddle<=1.5.1; this will be fixed in a later version.
## Evaluation
```bash
export CUDA_VISIBLE_DEVICES=0
export PYTHONPATH=$PYTHONPATH:.
# or run on CPU with:
# export CPU_NUM=1
python tools/eval.py -c configs/faster_rcnn_r50_1x.yml
```
#### Optional arguments
- `-d` or `--dataset_dir`: Dataset path, same as `dataset_dir` in the config file, for example: `-d dataset/coco`
- `-p` or `--output_eval`: Directory where evaluation results are saved. Default is the current directory.
- `-o`: Set options from the config file, for example: `-o weights=output/faster_rcnn_r50_1x/model_final`
- `--json_eval`: Whether to evaluate from an existing bbox.json or mask.json. Default is `False`. The directory containing the JSON file is given by the `-f` argument.
#### Examples
- Set configuration options and specify the dataset path
```bash
export CUDA_VISIBLE_DEVICES=0
export PYTHONPATH=$PYTHONPATH:.
# or run on CPU with:
# export CPU_NUM=1
python -u tools/eval.py -c configs/faster_rcnn_r50_1x.yml \
-o weights=output/faster_rcnn_r50_1x/model_final \
-d dataset/coco
```
- Evaluate from an existing JSON file
```bash
export CUDA_VISIBLE_DEVICES=0
export PYTHONPATH=$PYTHONPATH:.
# or run on CPU with:
# export CPU_NUM=1
python tools/eval.py -c configs/faster_rcnn_r50_1x.yml \
--json_eval \
-f evaluation/
```
The JSON file must be named bbox.json or mask.json and placed in the `evaluation/` directory. If `-f` is omitted, the current directory is used.
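
Because `--json_eval` depends on a file with a fixed name, a quick check before launching the evaluation can save a failed run. The following is a minimal sketch, not part of PaddleDetection: `check_json_eval_dir` is a hypothetical helper that only tests for the two file names mentioned above.

```bash
# Hypothetical helper: succeed only if the directory holds bbox.json or mask.json.
# Usage: check_json_eval_dir [DIR]   (DIR defaults to the current directory)
check_json_eval_dir() {
  json_eval_dir="${1:-.}"
  if [ -f "$json_eval_dir/bbox.json" ] || [ -f "$json_eval_dir/mask.json" ]; then
    echo "ok: result file found in $json_eval_dir"
    return 0
  fi
  echo "missing: $json_eval_dir must contain bbox.json or mask.json" >&2
  return 1
}

# Example of chaining the check with the command shown above:
# check_json_eval_dir evaluation/ && \
#   python tools/eval.py -c configs/faster_rcnn_r50_1x.yml --json_eval -f evaluation/
```

The check only looks at file names; it does not validate the JSON content.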
#### NOTES
- Checkpoints are loaded from `output` by default (configurable).
- Multi-GPU evaluation for R-CNN and SSD models is not supported at the
moment, but it is a planned feature
......@@ -57,30 +134,54 @@ moment, but it is a planned feature
```bash
export CUDA_VISIBLE_DEVICES=0
export PYTHONPATH=$PYTHONPATH:.
# or run on CPU with:
# export CPU_NUM=1
python tools/infer.py -c configs/faster_rcnn_r50_1x.yml --infer_img=demo/000000570688.jpg
```
- Multi-image inference:
```bash
export CUDA_VISIBLE_DEVICES=0
export PYTHONPATH=$PYTHONPATH:.
# or run on CPU with:
# export CPU_NUM=1
python tools/infer.py -c configs/faster_rcnn_r50_1x.yml --infer_dir=demo
```
#### Optional arguments
- `--output_dir`: Directory for storing the output visualization files.
- `--draw_threshold`: Threshold for keeping a result in the visualization. Default is 0.5.
- `--save_inference_model`: Save the inference model in `output_dir` if True.
#### Examples
- Specify the output directory and set the threshold
```bash
export CUDA_VISIBLE_DEVICES=0
export PYTHONPATH=$PYTHONPATH:.
# or run on CPU with:
# export CPU_NUM=1
python tools/infer.py -c configs/faster_rcnn_r50_1x.yml \
--infer_img=demo/000000570688.jpg \
--output_dir=infer_output/ \
--draw_threshold=0.5
```
The visualization files are saved in `output` by default. To specify a different
path, add the `--output_dir=` flag.
`--draw_threshold` is an optional argument with a default of 0.5. Different thresholds produce different results, depending on the [NMS](https://ieeexplore.ieee.org/document/1699659) calculation.
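
Since the effect of `--draw_threshold` is easiest to judge side by side, one option is to run the same image at several thresholds into separate output directories. The sketch below only prints the commands (a dry run); `sweep_thresholds` and the `infer_output/thr_<value>/` directory layout are hypothetical, while the flags are the ones documented above.

```bash
# Hypothetical helper: print one infer command per threshold (dry run).
# Usage: sweep_thresholds IMAGE THRESHOLD...
sweep_thresholds() {
  sweep_img="$1"
  shift
  for sweep_t in "$@"; do
    # Each threshold gets its own output directory so results are not overwritten.
    echo "python tools/infer.py -c configs/faster_rcnn_r50_1x.yml --infer_img=${sweep_img} --output_dir=infer_output/thr_${sweep_t}/ --draw_threshold=${sweep_t}"
  done
}

sweep_thresholds demo/000000570688.jpg 0.3 0.5 0.7
```

Once the printed commands look right, they can be executed by piping the output to `sh`.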
- Save inference model
```bash
export CUDA_VISIBLE_DEVICES=0
export PYTHONPATH=$PYTHONPATH:.
# or run on CPU with:
# export CPU_NUM=1
python tools/infer.py -c configs/faster_rcnn_r50_1x.yml \
--infer_img=demo/000000570688.jpg \
--save_inference_model
```
......@@ -91,8 +192,15 @@ Save inference model by set `--save_inference_model`, which can be loaded by Pad
**Q:** Why do I get `NaN` loss values during single GPU training? </br>
**A:** The default learning rate is tuned for multi-GPU training (8 GPUs) and must
be adapted for single-GPU training accordingly (e.g., divided by 8).
The calculation rules are listed below; the rows are equivalent: </br>
| GPU number | Learning rate | Max_iters | Milestones |
| :---------: | :------------: | :-------: | :--------------: |
| 2 | 0.0025 | 720000 | [480000, 640000] |
| 4 | 0.005 | 360000 | [240000, 320000] |
| 8 | 0.01 | 180000 | [120000, 160000] |
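
The rows above follow a simple pattern: relative to the 8-GPU row, the learning rate is divided by the same factor that the iteration counts are multiplied by. The sketch below reproduces that pattern; `count_visible_gpus` and `scale_schedule` are hypothetical helpers, and extending the pattern to GPU counts not listed in the table (such as 1) is an assumption based on the "divide by 8" advice.

```bash
# Baseline: the 8-GPU row of the table above.
BASE_GPUS=8
BASE_LR=0.01
BASE_ITERS=180000
BASE_M1=120000
BASE_M2=160000

# Hypothetical helper: count the GPU ids listed in CUDA_VISIBLE_DEVICES.
# An unset or empty variable is reported as 0 here.
count_visible_gpus() {
  printf '%s\n' "${CUDA_VISIBLE_DEVICES:-}" | awk -F, '{ print NF }'
}

# Hypothetical helper: print the scaled schedule for N GPUs.
# Usage: scale_schedule N
scale_schedule() {
  awk -v g="$1" -v bg="$BASE_GPUS" -v lr="$BASE_LR" -v it="$BASE_ITERS" \
      -v m1="$BASE_M1" -v m2="$BASE_M2" 'BEGIN {
    if (g < 1) { print "need at least 1 GPU"; exit 1 }
    k = bg / g   # how many times fewer GPUs than the baseline
    printf "learning_rate=%g max_iters=%.0f milestones=[%.0f, %.0f]\n", lr / k, it * k, m1 * k, m2 * k
  }'
}

scale_schedule 4   # matches the 4-GPU row of the table
scale_schedule 1   # extrapolated single-GPU schedule (assumption)
```

With `export CUDA_VISIBLE_DEVICES=0,1,2,3`, calling `scale_schedule "$(count_visible_gpus)"` prints the 4-GPU row. The printed values still have to be copied into the config file by hand.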
**Q:** How to reduce GPU memory usage? </br>
**A:** Setting environment variable FLAGS_conv_workspace_size_limit to a smaller
......
......@@ -11,6 +11,7 @@
```bash
export CUDA_VISIBLE_DEVICES=0
export PYTHONPATH=$PYTHONPATH:.
python tools/train.py -c configs/faster_rcnn_r50_1x.yml
```
......@@ -19,11 +20,49 @@ python tools/train.py -c configs/faster_rcnn_r50_1x.yml
```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
# or run on CPU with:
# export CPU_NUM=8
export PYTHONPATH=$PYTHONPATH:.
python tools/train.py -c configs/faster_rcnn_r50_1x.yml
```
#### CPU Training
```bash
export CPU_NUM=8
export PYTHONPATH=$PYTHONPATH:.
python tools/train.py -c configs/faster_rcnn_r50_1x.yml
```
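##### Optional arguments
- `-r` or `--resume_checkpoint`: Resume training from a checkpoint, for example: `-r output/faster_rcnn_r50_1x/10000`
- `--eval`: Whether to evaluate during training. Default is `False`.
- `-p` or `--output_eval`: When evaluating during training, sets the path where the evaluation JSON is saved. Default is the current directory.
- `-d` or `--dataset_dir`: Dataset path, same as `dataset_dir` in the config file, for example: `-d dataset/coco`
- `-o`: Set options from the config file, for example: `-o weights=output/faster_rcnn_r50_1x/model_final`
##### Examples
- Evaluate during training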
```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
export PYTHONPATH=$PYTHONPATH:.
python -u tools/train.py -c configs/faster_rcnn_r50_1x.yml --eval
```
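Passing `--eval` alternates training with evaluation: an evaluation starts every `snapshot_iter` iterations. The interval can be changed through `snapshot_iter` in the configuration file.
If the validation set is large, evaluation is time-consuming and slows training down; we suggest evaluating less often or evaluating after training finishes.
- Set configuration options and specify the dataset path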
```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
export PYTHONPATH=$PYTHONPATH:.
python -u tools/train.py -c configs/faster_rcnn_r50_1x.yml \
-o weights=output/faster_rcnn_r50_1x/model_final \
-d dataset/coco
```
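##### NOTES
- `CUDA_VISIBLE_DEVICES` selects which GPUs are used, for example: `export CUDA_VISIBLE_DEVICES=0,1,2,3`. For the GPU calculation rules, see the [FAQ](#faq).
- The dataset is stored in `dataset/coco` by default (configurable).
- If the dataset is not found locally, it is downloaded automatically and saved in `~/.cache/paddle/dataset`.
- The pretrained model is downloaded automatically and saved in `~/.cache/paddle/weights`.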
......@@ -32,9 +71,6 @@ python tools/train.py -c configs/faster_rcnn_r50_1x.yml
- Training RCNN models on CPU is not supported on PaddlePaddle 1.5.1 and earlier; this will be fixed in the next version.
## Evaluation
......@@ -45,6 +81,41 @@ export CUDA_VISIBLE_DEVICES=0
python tools/eval.py -c configs/faster_rcnn_r50_1x.yml
```
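#### Optional arguments
- `-d` or `--dataset_dir`: Dataset path, same as `dataset_dir` in the config file, for example: `-d dataset/coco`
- `-p` or `--output_eval`: Path where the evaluation JSON is saved. Default is the current directory.
- `-o`: Set options from the config file, for example: `-o weights=output/faster_rcnn_r50_1x/model_final`
- `--json_eval`: Whether to evaluate from an existing bbox.json or mask.json. Default is `False`. The path of the JSON file is set with the `-f` argument.
#### Examples
- Set configuration options and specify the dataset path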
```bash
export CUDA_VISIBLE_DEVICES=0
export PYTHONPATH=$PYTHONPATH:.
# or run on CPU with:
# export CPU_NUM=1
python -u tools/eval.py -c configs/faster_rcnn_r50_1x.yml \
-o weights=output/faster_rcnn_r50_1x/model_final \
-d dataset/coco
```
- Evaluate from an existing JSON file
```bash
export CUDA_VISIBLE_DEVICES=0
export PYTHONPATH=$PYTHONPATH:.
# or run on CPU with:
# export CPU_NUM=1
python tools/eval.py -c configs/faster_rcnn_r50_1x.yml \
--json_eval \
-f evaluation/
```
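The JSON file must be named bbox.json or mask.json and placed in the `evaluation/` directory. If `-f` is omitted, the current directory is used.
#### NOTES
- Checkpoints are loaded from `output` by default (configurable).
- Multi-GPU evaluation for R-CNN and SSD models is not supported at the moment; support is planned for a later version.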
......@@ -70,7 +141,29 @@ export CUDA_VISIBLE_DEVICES=0
python tools/infer.py -c configs/faster_rcnn_r50_1x.yml --infer_dir=demo
```
#### Optional arguments
- `--output_dir`: Directory for the visualization files produced by inference.
- `--draw_threshold`: Threshold for inference results. Default is 0.5.
- `--save_inference_model`: Save the inference model in `output_dir` if True.
#### Examples
- Specify the output directory and set the inference threshold
```bash
export CUDA_VISIBLE_DEVICES=0
export PYTHONPATH=$PYTHONPATH:.
# or run on CPU with:
# export CPU_NUM=1
python tools/infer.py -c configs/faster_rcnn_r50_1x.yml \
--infer_img=demo/000000570688.jpg \
--output_dir=infer_output/ \
--draw_threshold=0.5
```
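The visualization files are saved in `output` by default; a different output path can be specified with `--output_dir=`.
`--draw_threshold` is an optional argument. Different thresholds produce different results, depending on the [NMS](https://ieeexplore.ieee.org/document/1699659) calculation.
- Save inference model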
......@@ -88,7 +181,15 @@ python tools/infer.py -c configs/faster_rcnn_r50_1x.yml --infer_img=demo/0000005
## FAQ
**Q:** Why do I get `NaN` loss values during single-GPU training? </br>
**A:** The default learning rate is tuned for multi-GPU training (8 GPUs). For single-GPU training, the learning rate must be adjusted accordingly (e.g., divided by 8).
The calculation rules are listed below; the rows are equivalent: </br>
| GPU number | Learning rate | Max_iters | Milestones |
| :---------: | :------------: | :-------: | :--------------: |
| 2 | 0.0025 | 720000 | [480000, 640000] |
| 4 | 0.005 | 360000 | [240000, 320000] |
| 8 | 0.01 | 180000 | [120000, 160000] |
**Q:** How to reduce GPU memory usage? </br>
......