diff --git a/PaddleCV/object_detection/README.md b/PaddleCV/object_detection/README.md
index 909d709e347b7a6450dc382c28304bfbbd969ca3..383f0d895253e94c9c8962b54cedca192e1d1f06 100644
--- a/PaddleCV/object_detection/README.md
+++ b/PaddleCV/object_detection/README.md
@@ -1,4 +1,105 @@
-PaddlePaddle Object Detection
-===
+# PaddlePaddle Object Detection
-The document will be coming soon.
+This object detection framework is based on PaddlePaddle. It provides both classic and state-of-the-art algorithms for generic object detection as well as for specific detection targets. We aim to make the framework easy to extend, train, and deploy, and easy to use in both research and production.
+
+
+
+
+
+
+## Introduction
+
+Major features:
+
+- Easy to Deploy:
+  All inference-related operations are implemented in C++ and CUDA, so detection models can be deployed in server-side products without Python, on top of PaddlePaddle's highly efficient inference engine.
+  We also release detection models with ResNet-VD backbones; for example, a Faster R-CNN + FPN model based on ResNet50-VD reaches accuracy close to its ResNet101 counterpart while being smaller and faster.
+
+- Easy to Customize:
+  All components, including the data transforms, are modular, so any module can be plugged in or swapped out easily. For example, users can switch backbones or add mixup data augmentation with little effort.
+
+- High Efficiency:
+  Built on the efficient PaddlePaddle framework, training requires less memory. For example, Mask R-CNN with a ResNet50 backbone can train with a batch size of 5 per Tesla V100 (16 GB), and YOLOv3 trains faster than in other frameworks.
+
+The supported architectures are as follows:
+
+|                    | ResNet | ResNet-VD | ResNeXt | SENet | MobileNet | DarkNet |
+|--------------------|:------:|:---------:|:-------:|:-----:|:---------:|:-------:|
+| Faster R-CNN       | ✓      | ✓         | ✓       | ✓     | ✗         | ✗       |
+| Faster R-CNN + FPN | ✓      | ✓         | ✓       | ✓     | ✗         | ✗       |
+| Mask R-CNN         | ✓      | ✓         | ✓       | ✓     | ✗         | ✗       |
+| Mask R-CNN + FPN   | ✓      | ✓         | ✓       | ✓     | ✗         | ✗       |
+| Cascade R-CNN      | ✓      | ✓         | ✓       | ✓     | ✗         | ✗       |
+| RetinaNet          | ✓      | ✓         | ✓       | ✓     | ✗         | ✗       |
+| YOLOv3             | ✓      | ✗         | ✗       | ✗     | ✓         | ✓       |
+| SSD                | ✗      | ✗         | ✗       | ✗     | ✓         | ✗       |
+
+Extended capabilities:
+
+- [x] **Synchronized Batch Norm**: used in YOLOv3.
+- [x] **Group Norm**: the operator is supported; pretrained models will be added later.
+- [x] **Modulated Deformable Convolution**: the operator is supported; pretrained models will be added later.
+- [x] **Deformable PSRoI Pooling**: the operator is supported; pretrained models will be added later.
+
+
+#### Work in Progress and To Do
+
+- Framework:
+  - Mixed-precision training and distributed training.
+  - 8-bit deployment.
+  - Easier customization with user-defined functions.
+
+- Algorithms:
+  - More SOTA models.
+  - More deployment-friendly models.
+
+We are glad to receive your feedback.
+
+## Model Zoo
+
+Trained models are available in the PaddlePaddle [detection model zoo](docs/MODEL_ZOO.md).
+
+## Installation
+
+Please follow the [installation instructions](docs/INSTALL.md) to install PaddlePaddle and prepare the environment.
+
+## Get Started
+
+For a quick start, run inference on an image:
+
+```bash
+export PYTHONPATH=`pwd`:$PYTHONPATH
+python tools/infer.py -c configs/mask_rcnn_r50_1x.yml \
+    -o weights=http:// \
+    --infer_img=demo/000000523957.jpg
+```
+
+The predicted and visualized result is saved to `output/000000523957.jpg`.
+
+For a more detailed training and evaluation pipeline, please refer to [GETTING_STARTED.md](docs/GETTING_STARTED.md).
+
+
+More documentation:
+
+- [How to configure an object detection pipeline.](docs/CONFIG.md)
+- [How to use a customized dataset and add data preprocessing.](docs/DATA.md)
+
+
+## Deploy
+
+Examples of deploying detection models with PaddlePaddle will be added later.
+
+## Updates
+
+Major updates:
+
+#### 2019-07-03
+- Released the unified detection framework.
+- Supported algorithms: Faster R-CNN, Mask R-CNN, Faster R-CNN + FPN, Mask R-CNN + FPN, Cascade Faster R-CNN + FPN, RetinaNet, YOLOv3, and SSD.
+- Released the first version of the model zoo.
+
+
+## Contributing
+
+We appreciate everyone's contributions!
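The quick-start section above notes that the visualized result of `demo/000000523957.jpg` lands in `output/000000523957.jpg`. A minimal sketch of that naming convention (the helper function here is illustrative, not part of the toolkit):

```python
import os


def visualization_path(infer_img: str, save_dir: str = "output") -> str:
    """Illustrative helper: the visualized result keeps the source image's
    basename and is written under the save directory, so
    demo/000000523957.jpg maps to output/000000523957.jpg."""
    return os.path.join(save_dir, os.path.basename(infer_img))


print(visualization_path("demo/000000523957.jpg"))  # output/000000523957.jpg
```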
diff --git a/PaddleCV/object_detection/demo/000000000139.jpg b/PaddleCV/object_detection/demo/000000000139.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..19023f718333c56c70776c79201dc03d742c1ed3
Binary files /dev/null and b/PaddleCV/object_detection/demo/000000000139.jpg differ
diff --git a/PaddleCV/object_detection/demo/000000523957.jpg b/PaddleCV/object_detection/demo/000000523957.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..864ccd864fcde3497b894356d79ece92dd0b5ae1
Binary files /dev/null and b/PaddleCV/object_detection/demo/000000523957.jpg differ
diff --git a/PaddleCV/object_detection/demo/output/000000523957.jpg b/PaddleCV/object_detection/demo/output/000000523957.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..bc40ed4aeb71dec7bc93e507ad80db401e036260
Binary files /dev/null and b/PaddleCV/object_detection/demo/output/000000523957.jpg differ
diff --git a/PaddleCV/object_detection/docs/GETTING_STARTED.md b/PaddleCV/object_detection/docs/GETTING_STARTED.md
new file mode 100644
index 0000000000000000000000000000000000000000..a28b216d8c43690792a16a2e43cc42c90ee477d1
--- /dev/null
+++ b/PaddleCV/object_detection/docs/GETTING_STARTED.md
@@ -0,0 +1,74 @@
+# Getting Started
+
+Please refer to the [installation instructions](INSTALL.md) to install PaddlePaddle and prepare the dataset first.
+
+
+## Train a Model
+
+#### Single-Device Training
+
+```bash
+export CUDA_VISIBLE_DEVICES=0
+# export CPU_NUM=1 # for CPU training
+python tools/train.py -c configs/faster_rcnn_r50_1x.yml
+```
+
+#### Multi-Device Training
+
+```bash
+export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 # set devices
+# export CPU_NUM=8 # for CPU training
+python tools/train.py -c configs/faster_rcnn_r50_1x.yml
+```
+
+- The default dataset directory is `dataset/coco`; users can also change it in the config file.
+- Pretrained weights are downloaded automatically and cached in `~/.cache/paddle/weights`.
+- Checkpoints are saved to `output/faster_rcnn_r50_1x` by default; users can also change this in the config file.
+- All hyperparameters are read from the config file passed with `-c`.
+- To train other models, switch to the corresponding config file.
+- For more help, please run `python tools/train.py --help`.
+
+
+For SSD on the Pascal VOC dataset, set `--eval=True` to evaluate during training.
+For the other models, which are trained on the COCO dataset, evaluation during training is not fully verified; it is better to evaluate after training.
+
+
+## Evaluate with Pretrained Models
+
+
+```bash
+export CUDA_VISIBLE_DEVICES=0
+# export CPU_NUM=1 # for CPU evaluation
+python tools/eval.py -c configs/faster_rcnn_r50_1x.yml
+```
+
+- The default model directory is `output/faster_rcnn_r50_1x`; you can also specify a different one.
+- Multi-device evaluation is not yet supported for R-CNN and SSD models; it will be added in a future version.
+- For more help, please run `python tools/eval.py --help`.
+
+
+## Inference with Pretrained Models
+
+- Infer one image:
+
+```bash
+export CUDA_VISIBLE_DEVICES=0
+python tools/infer.py -c configs/faster_rcnn_r50_1x.yml --infer_img=demo/000000000139.jpg
+```
+
+- Infer several images:
+
+```bash
+export CUDA_VISIBLE_DEVICES=0
+python tools/infer.py -c configs/faster_rcnn_r50_1x.yml --infer_dir=demo
+```
+
+The predicted and visualized images are saved in `output` by default; users can change the save directory by specifying `--savefile=`. For more help, please run `python tools/infer.py --help`.
+
+
+## FAQ
+
+
+Q: Why can the loss become NaN when training on a single GPU?
+
+A: The default learning rate is tuned for multi-device training. When training on a single GPU with a smaller total batch size, decrease `base_lr` proportionally.
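The "decrease `base_lr` proportionally" advice in the FAQ is the linear scaling rule: scale the learning rate by the ratio of the total batch size you actually use to the one the config was tuned for. A minimal sketch, with hypothetical numbers (check your own config for the real `base_lr` and device count):

```python
def scale_base_lr(base_lr: float, default_devices: int, used_devices: int) -> float:
    """Linear scaling rule sketch: assuming the per-device batch size is
    unchanged, the total batch size shrinks by used_devices/default_devices,
    so base_lr should shrink by the same factor."""
    return base_lr * used_devices / default_devices


# Hypothetical example: a config tuned for 8 GPUs with base_lr=0.02,
# run on a single GPU.
print(scale_base_lr(0.02, default_devices=8, used_devices=1))  # 0.0025
```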