Commit e70839a0, authored by qingqing01, committed by GitHub

Update README.md (#2628)

* Update README.md and GETTING_STARTED.md for PaddleDetection.
Parent 1199c33b
@@ -13,23 +13,26 @@ flexible, catering to research needs.
 ## Introduction
-Design Principles:
+Features:
 - Production Ready:
 Key operations are implemented in C++ and CUDA, together with PaddlePaddle's
 highly efficient inference engine, enables easy deployment in server environments.
 - Highly Flexible:
 Components are designed to be modular. Model architectures, as well as data
 preprocess pipelines, can be easily customized with simple configuration
 changes.
 - Performance Optimized:
 With the help of the underlying PaddlePaddle framework, faster training and
 reduced GPU memory footprint is achieved. Notably, Yolo V3 training is
 much faster compared to other frameworks. Another example is Mask-RCNN
-(ResNet50), we managed to fit up to 5 images per GPU (V100 16GB) during
-training.
+(ResNet50), we managed to fit up to 4 images per GPU (Tesla V100 16GB) during
+multi-GPU training.
 Supported Architectures:
@@ -44,7 +47,7 @@ Supported Architectures:
 | Yolov3 | ✓ | ✗ | ✗ | ✗ | ✓ | ✓ |
 | SSD | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ |
-<a name="vd">[1]</a> ResNet-vd models offer much improved accuracy with negligible performance cost.
+<a name="vd">[1]</a> [ResNet-vd](https://arxiv.org/pdf/1812.01187) models offer much improved accuracy with negligible performance cost.
 Advanced Features:
@@ -67,7 +70,7 @@ Please follow the [installation guide](docs/INSTALL.md).
 ## Get Started
 For inference, simply run the following command and the visualized result will
-be saved in `output/`.
+be saved in `output`.
```bash ```bash
export PYTHONPATH=`pwd`:$PYTHONPATH export PYTHONPATH=`pwd`:$PYTHONPATH
@@ -102,6 +105,7 @@ Some of the planned features include:
 ## Updates
 #### Initial release (7/3/2019)
 - Initial release of PaddleDetection and detection model zoo
 - Models included: Faster R-CNN, Mask R-CNN, Faster R-CNN+FPN, Mask
 R-CNN+FPN, Cascade-Faster-RCNN+FPN, RetinaNet, Yolo v3, and SSD.
......
@@ -75,8 +75,13 @@ path, simply add a `--save_file=` flag.
 ## FAQ
-Q: Why do I get `NaN` loss values during single GPU training?
-A: The default learning rate is tuned to multi-GPU training (8x GPUs), it must
-be adapted for single GPU training accordingly (e.g., divide by 8).
+**Q:** Why do I get `NaN` loss values during single GPU training? </br>
+**A:** The default learning rate is tuned to multi-GPU training (8x GPUs), it must
+be adapted for single GPU training accordingly (e.g., divide by 8).
+
+**Q:** How to reduce GPU memory usage? </br>
+**A:** Setting environment variable FLAGS_conv_workspace_size_limit to a smaller
+number can reduce GPU memory footprint without affecting training speed.
+Take Mask-RCNN (R50) as example, by setting `export FLAGS_conv_workspace_size_limit=512`,
+batch size could reach 4 per GPU (Tesla V100 16GB).
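
Note on the two FAQ answers in this hunk: the following is a minimal Python sketch of both adjustments, not code from the repository. The helper name `scale_learning_rate`, the base learning rate of `0.01`, and the `training_env` dictionary are all hypothetical and chosen only for illustration. The sketch assumes the "divide by 8" advice generalizes linearly to other GPU counts, which the FAQ itself states only for the single-GPU case.

```python
import os


def scale_learning_rate(base_lr: float, base_gpus: int = 8, num_gpus: int = 1) -> float:
    """Scale a learning rate tuned for `base_gpus` GPUs down (or up) to `num_gpus`.

    Hypothetical helper: linear scaling is an assumption extrapolated from the
    FAQ's "divide by 8" advice for moving from 8 GPUs to 1 GPU.
    """
    if base_gpus <= 0 or num_gpus <= 0:
        raise ValueError("GPU counts must be positive")
    return base_lr * num_gpus / base_gpus


# 0.01 is an illustrative value, not a default taken from any config file.
single_gpu_lr = scale_learning_rate(0.01, base_gpus=8, num_gpus=1)

# Equivalent of `export FLAGS_conv_workspace_size_limit=512` for a child process:
# copy the current environment and add the flag, leaving this process untouched.
training_env = dict(os.environ, FLAGS_conv_workspace_size_limit="512")
```

The `training_env` mapping could then be passed as the `env` argument of `subprocess.run` when launching a training script, which has the same effect as the `export` line quoted in the FAQ.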