English | [简体中文](README_cn.md)

<div align="center">
<p align="center">
  <img src="https://user-images.githubusercontent.com/48054808/160532560-34cf7a1f-d950-435e-90d2-4b0a679e5119.png" align="middle" width = "800" />
</p>

**A High-Efficiency Development Toolkit for Object Detection Based on [PaddlePaddle](https://github.com/paddlepaddle/paddle).**

<p align="center">
    <a href="./LICENSE"><img src="https://img.shields.io/badge/license-Apache%202-dfd.svg"></a>
    <a href="https://github.com/PaddlePaddle/PaddleDetection/releases"><img src="https://img.shields.io/github/v/release/PaddlePaddle/PaddleDetection?color=ffa"></a>
    <a href=""><img src="https://img.shields.io/badge/python-3.7+-aff.svg"></a>
    <a href=""><img src="https://img.shields.io/badge/os-linux%2C%20win%2C%20mac-pink.svg"></a>
    <a href="https://github.com/PaddlePaddle/PaddleDetection/stargazers"><img src="https://img.shields.io/github/stars/PaddlePaddle/PaddleDetection?color=ccf"></a>

</div>

<div  align="center">
  <img src="docs/images/ppdet.gif" width="800"/>

</div>

## <img src="https://user-images.githubusercontent.com/48054808/157793354-6e7f381a-0aa6-4bb7-845c-9acf2ecc05c3.png" width="20"/> Latest News

- 🔥 **2022.8.09: Release [YOLO series model zoo](https://github.com/nemonameless/PaddleDetection_YOLOSeries)**
  - Comprehensive coverage of classic and recent YOLO-series models: YOLOv3, Paddle's real-time object detection model PP-YOLOE, and the frontier detection algorithms YOLOv4, YOLOv5, YOLOX, MT-YOLOv6 and YOLOv7
  - Better model performance: the implementations upgrade the various YOLO algorithms, shortening training time by a factor of 5-8 and generally improving accuracy by 1%-5% mAP. Model compression yields a 30% speed improvement without loss of precision
  - Complete end-to-end development support: an end-to-end development pipeline covering training, evaluation, inference, model compression and deployment on various hardware. Switching between algorithms is flexible, and customized development can be implemented efficiently

- 🔥 **2022.8.01: Release [PP-TinyPose plus](./configs/keypoint/tiny_pose/). End-to-end precision improves by 9.1% AP on a dataset of fitness and dance scenes**
  - More sports-scene training data significantly improves recognition of complex actions such as turning sideways, lying down, jumping and raising legs
  - The detection model is PP-PicoDet plus, whose precision on the COCO dataset is improved by 3.1% mAP
  - Keypoint stability is enhanced: a filter-based stabilization method makes video prediction results more stable and smooth

- 2022.7.14: Release [pedestrian analysis tool PP-Human v2](./deploy/pipeline)
  - Four major functions: high-performance, flexible recognition of five complex actions; real-time human attribute recognition; visitor flow statistics; and high-accuracy multi-camera tracking.
  - High-performance algorithms: pedestrian detection, tracking and attribute recognition that are robust to the number of targets and to variations in background and lighting.
  - Highly flexible: complete documentation of end-to-end development and optimization strategies, simple deployment commands, and compatibility with different input formats.

- 2022.3.24: PaddleDetection released the [release/2.4 version](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.4)
  - Release the high-performance SOTA object detection model [PP-YOLOE](configs/ppyoloe). It targets both cloud and edge devices and comes in S/M/L/X versions. In particular, the L version reaches 51.4% accuracy on the COCO test 2017 dataset at an inference speed of 78.1 FPS on a single Tesla V100. It supports mixed precision training and trains 33% faster than PP-YOLOv2. Its full range of model sizes can meet different hardware compute requirements and adapts to servers, edge-device GPUs and other AI accelerator cards on servers.
  - Release the ultra-lightweight SOTA object detection model [PP-PicoDet Plus](configs/picodet) with a 2% improvement in accuracy and a 63% improvement in CPU inference speed. Add the PicoDet-XS model with 0.7M parameters, and provide model sparsification and quantization for acceleration. No hardware-specific post-processing module is required, which simplifies deployment.
  - Release the real-time pedestrian analysis tool [PP-Human](deploy/pphuman). It has four major functions: pedestrian tracking, visitor flow statistics, human attribute recognition and fall detection. Fall detection is optimized on real-life data and accurately recognizes various falling postures. It adapts to different backgrounds, lighting conditions and camera angles.
  - Add the [YOLOX](configs/yolox) object detection model in nano/tiny/S/M/L/X versions. The X version reaches 51.8% accuracy on the COCO val2017 dataset.

- [More releases](https://github.com/PaddlePaddle/PaddleDetection/releases)

## <img title="" src="https://user-images.githubusercontent.com/48054808/157795569-9fc77c85-732f-4870-9be0-99a7fe2cff27.png" alt="" width="20"> Introduction

PaddleDetection is an end-to-end object detection development kit based on PaddlePaddle. It implements a variety of mainstream object detection, instance segmentation, tracking and keypoint detection algorithms in a modular design, with configurable modules such as network components, data augmentations and losses. It releases many SOTA industry-practice models and integrates model compression and cross-platform high-performance deployment to help developers through the whole development process faster and better.

#### PaddleDetection provides image processing capabilities such as object detection, instance segmentation, multi-object tracking and keypoint detection.

<div  align="center">
  <img src="docs/images/ppdet.gif" width="800"/>
</div>

#### PaddleDetection covers scenarios in industry, smart cities, security & protection, retail, medical care and more.

<div  align="center">
  <img src="https://user-images.githubusercontent.com/48054808/157826886-2e101a71-25a2-42f5-bf5e-30a97be28f46.gif" width="800"/>
</div>

## <img src="https://user-images.githubusercontent.com/48054808/157799599-e6a66855-bac6-4e75-b9c0-96e13cb9612f.png" width="20"/> Features

- **Rich Models**

  PaddleDetection provides a rich set of models, including **250+ pre-trained models** for **object detection**, **instance segmentation**, **face detection**, **keypoint detection**, **multi-object tracking** and more, covering a variety of **global competition champion** solutions.

- **Highly Flexible**

  Components are designed to be modular. Model architectures, as well as data preprocessing pipelines and optimization strategies, can be easily customized with simple configuration changes.

- **Production Ready**

  The whole workflow is connected end to end, from data augmentation, model construction, training and compression to deployment, with complete support for multi-architecture, multi-device deployment on **cloud and edge devices**.

- **High Performance**

  Built on the high-performance core of PaddlePaddle, it offers clear advantages in training speed and memory usage. FP16 training and multi-machine training are supported as well.
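
The modular, configuration-driven design described under "Highly Flexible" can be illustrated with a minimal sketch. It assumes a registry-plus-config pattern, which is one common way to build such systems; every name in it (`REGISTRY`, `register`, `build`, `ToyBackbone`, `ToyLoss`, `ToyDetector`) is hypothetical and is not part of PaddleDetection's API.

```python
# Illustrative only: a tiny registry-plus-config pattern.
# All names here are hypothetical, not PaddleDetection classes or functions.
from dataclasses import dataclass

REGISTRY = {}


def register(cls):
    """Make a component class discoverable by its class name."""
    REGISTRY[cls.__name__] = cls
    return cls


@register
@dataclass
class ToyBackbone:
    depth: int = 50


@register
@dataclass
class ToyLoss:
    kind: str = "giou"


@register
@dataclass
class ToyDetector:
    backbone: object
    loss: object


def build(config, name):
    """Instantiate `name` from `config`.

    String values that match a registered component name are built recursively;
    all other values are passed through as constructor arguments.
    """
    kwargs = {}
    for key, value in config.get(name, {}).items():
        if isinstance(value, str) and value in REGISTRY:
            kwargs[key] = build(config, value)
        else:
            kwargs[key] = value
    return REGISTRY[name](**kwargs)


# The "configuration file": which components to use and with which settings.
config = {
    "ToyDetector": {"backbone": "ToyBackbone", "loss": "ToyLoss"},
    "ToyBackbone": {"depth": 101},
    "ToyLoss": {"kind": "ciou"},
}

detector = build(config, "ToyDetector")
print(detector)
```

In this sketch, changing the backbone depth or the loss type only requires editing the `config` dictionary; no component code changes.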

## <img title="" src="https://user-images.githubusercontent.com/48054808/157800467-2a9946ad-30d1-49a9-b9db-ba33413d9c90.png" alt="" width="20"> Community

- If you have any problems or suggestions regarding PaddleDetection, please open an issue on [GitHub Issues](https://github.com/PaddlePaddle/PaddleDetection/issues).

- You are welcome to join the PaddleDetection QQ group and WeChat group (reply "Det").

  <div align="center">
  <img src="https://user-images.githubusercontent.com/22989727/183843004-baebf75f-af7c-4a7c-8130-1497b9a3ec7e.png"  width = "200" />  
  <img src="https://user-images.githubusercontent.com/34162360/177678712-4655747d-4290-4ad9-b7a1-4564a5418ac6.jpg"  width = "200" />  
  </div>

## <img src="https://user-images.githubusercontent.com/48054808/157827140-03ffaff7-7d14-48b4-9440-c38986ea378c.png" width="20"/> Overview of Kit Structures

<table align="center">
  <tbody>
    <tr align="center" valign="bottom">
      <td>
        <b>Architectures</b>
      </td>
      <td>
        <b>Backbones</b>
      </td>
      <td>
        <b>Components</b>
      </td>
      <td>
        <b>Data Augmentation</b>
      </td>
    </tr>
    <tr valign="top">
      <td>
        <ul>
          <li><b>Object Detection</b></li>
          <ul>
            <li>Faster RCNN</li>
            <li>FPN</li>
            <li>Cascade-RCNN</li>
            <li>Libra RCNN</li>
            <li>Hybrid Task RCNN</li>
            <li>PSS-Det</li>
            <li>RetinaNet</li>
            <li>YOLOv3</li>
            <li>YOLOv4</li>
            <li>PP-YOLOv1/v2</li>
            <li>PP-YOLO-Tiny</li>
            <li>PP-YOLOE</li>
            <li>YOLOX</li>
            <li>SSD</li>
            <li>CornerNet-Squeeze</li>
            <li>FCOS</li>
            <li>TTFNet</li>
            <li>PP-PicoDet</li>
            <li>DETR</li>
            <li>Deformable DETR</li>
            <li>Swin Transformer</li>
            <li>Sparse RCNN</li>
        </ul>
        <li><b>Instance Segmentation</b></li>
        <ul>
            <li>Mask RCNN</li>
            <li>SOLOv2</li>
        </ul>
        <li><b>Face Detection</b></li>
        <ul>
            <li>FaceBoxes</li>
            <li>BlazeFace</li>
            <li>BlazeFace-NAS</li>
        </ul>
        <li><b>Multi-Object-Tracking</b></li>
        <ul>
            <li>JDE</li>
            <li>FairMOT</li>
            <li>DeepSORT</li>
        </ul>
        <li><b>KeyPoint-Detection</b></li>
        <ul>
            <li>HRNet</li>
            <li>HigherHRNet</li>
        </ul>
      </ul>
      </td>
      <td>
        <ul>
          <li>ResNet(&vd)</li>
          <li>ResNeXt(&vd)</li>
          <li>SENet</li>
          <li>Res2Net</li>
          <li>HRNet</li>
          <li>Hourglass</li>
          <li>CBNet</li>
          <li>GCNet</li>
          <li>DarkNet</li>
          <li>CSPDarkNet</li>
          <li>VGG</li>
          <li>MobileNetv1/v3</li>  
          <li>GhostNet</li>
          <li>Efficientnet</li>  
          <li>BlazeNet</li>  
        </ul>
      </td>
      <td>
        <ul><li><b>Common</b></li>
          <ul>
            <li>Sync-BN</li>
            <li>Group Norm</li>
            <li>DCNv2</li>
            <li>Non-local</li>
          </ul>  
        </ul>
        <ul><li><b>KeyPoint</b></li>
          <ul>
            <li>DarkPose</li>
          </ul>  
        </ul>
        <ul><li><b>FPN</b></li>
          <ul>
            <li>BiFPN</li>
            <li>BFP</li>  
            <li>HRFPN</li>
            <li>ACFPN</li>
          </ul>  
        </ul>  
        <ul><li><b>Loss</b></li>
          <ul>
            <li>Smooth-L1</li>
            <li>GIoU/DIoU/CIoU</li>  
            <li>IoUAware</li>
          </ul>  
        </ul>  
        <ul><li><b>Post-processing</b></li>
          <ul>
            <li>SoftNMS</li>
            <li>MatrixNMS</li>  
          </ul>  
        </ul>
        <ul><li><b>Speed</b></li>
          <ul>
            <li>FP16 training</li>
            <li>Multi-machine training </li>  
          </ul>  
        </ul>  
      </td>
      <td>
        <ul>
          <li>Resize</li>  
          <li>Lighting</li>  
          <li>Flipping</li>  
          <li>Expand</li>
          <li>Crop</li>
          <li>Color Distort</li>  
          <li>Random Erasing</li>  
          <li>Mixup </li>
          <li>Mosaic</li>
          <li>AugmentHSV</li>
          <li>Cutmix </li>
          <li>Grid Mask</li>
          <li>Auto Augment</li>  
          <li>Random Perspective</li>  
        </ul>  
      </td>  
    </tr>

</td>
    </tr>
  </tbody>
</table>
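
Among the loss components listed in the table, GIoU is compact enough to show in full. The following plain-Python sketch computes IoU and GIoU for two axis-aligned boxes, assuming the standard GIoU definition (IoU minus the fraction of the smallest enclosing box that the union does not cover). The function name `iou_and_giou` and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration; this is not PaddleDetection's implementation.

```python
def iou_and_giou(box_a, box_b):
    """Return (IoU, GIoU) for two boxes given as (x1, y1, x2, y2).

    Assumes x1 < x2 and y1 < y2 for both boxes, so areas are positive.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection rectangle; width and height are clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box that encloses both inputs.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    enclose = enclose_w * enclose_h

    giou = iou - (enclose - union) / enclose
    return iou, giou


# Two 2x2 boxes that overlap in a 1x1 corner.
print(iou_and_giou((0, 0, 2, 2), (1, 1, 3, 3)))
```

A GIoU loss is then commonly taken as `1 - giou`. Unlike plain IoU, it still provides a training signal when the boxes do not overlap, because `giou` becomes more negative as the boxes move apart.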

## <img src="https://user-images.githubusercontent.com/48054808/157801371-9a9a8c65-1690-4123-985a-e0559a7f9494.png" width="20"/> Overview of Model Performance

The chart below shows COCO mAP versus FPS on a Tesla V100 for representative models of each server-side architecture and backbone.

<div align="center">
  <img src="docs/images/fps_map.png" />
</div>

**NOTE:**

- `CBResNet` stands for `Cascade-Faster-RCNN-CBResNet200vd-FPN`, which has the highest mAP on COCO, 53.3%

- `Cascade-Faster-RCNN` stands for `Cascade-Faster-RCNN-ResNet50vd-DCN`, which has been optimized in PaddleDetection to reach 20 FPS inference speed at a COCO mAP of 47.8%

- `PP-YOLO` achieves 45.9% mAP on COCO at 72.9 FPS on Tesla V100, surpassing [YOLOv4](https://arxiv.org/abs/2004.10934) in both precision and speed

- `PP-YOLO v2` is an optimized version of `PP-YOLO` with 49.5% mAP at 68.9 FPS on Tesla V100

- `PP-YOLOE` is an optimized version of `PP-YOLO v2` with 51.6% mAP at 78.1 FPS on Tesla V100

- All these models are available in the [Model Zoo](#ModelZoo)
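
To make the speed and accuracy trade-off in these notes concrete, the sketch below converts the quoted FPS figures into per-image latency and checks which models are not beaten on both axes by another model. The numbers are the ones quoted in the notes; the helper names `latency_ms` and `pareto_front` are hypothetical. `CBResNet` is left out because the notes give no FPS figure for it.

```python
# (name, COCO mAP in %, FPS on Tesla V100), as quoted in the notes.
models = [
    ("Cascade-Faster-RCNN", 47.8, 20.0),
    ("PP-YOLO", 45.9, 72.9),
    ("PP-YOLO v2", 49.5, 68.9),
    ("PP-YOLOE", 51.6, 78.1),
]


def latency_ms(fps):
    """Convert throughput in frames per second to milliseconds per image."""
    return 1000.0 / fps


def pareto_front(entries):
    """Return names of models that no other model matches or beats on both mAP and FPS."""
    front = []
    for name, m_ap, fps in entries:
        dominated = any(
            o_map >= m_ap and o_fps >= fps and (o_map > m_ap or o_fps > fps)
            for _, o_map, o_fps in entries
        )
        if not dominated:
            front.append(name)
    return front


for name, m_ap, fps in models:
    print(f"{name}: {m_ap}% mAP, {latency_ms(fps):.1f} ms per image")

print("Not dominated:", pareto_front(models))
```

With only these four entries, `PP-YOLOE` is the single non-dominated model, since it has both the highest mAP and the highest FPS among them.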

The chart below shows COCO mAP versus FPS on a Qualcomm Snapdragon 865 for representative mobile-side models.

<div align="center">
  <img src="docs/images/mobile_fps_map.png" width=600/>
</div>

**NOTE:**

- All data were measured on a Qualcomm Snapdragon 865 (4\*A77 + 4\*A55) processor with a batch size of 1 and 4 CPU threads, using the NCNN library; the benchmark scripts are published at [MobileDetBenchmark](https://github.com/JiweiMaster/MobileDetBenchmark)
- [PP-PicoDet](configs/picodet) and [PP-YOLO-Tiny](configs/ppyolo) are developed and released by PaddleDetection; the other models are not provided in PaddleDetection.
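
For readers unfamiliar with how a batch-size-1 FPS figure is typically obtained, here is a minimal timing sketch. It is not the MobileDetBenchmark script and does not use NCNN; `measure_fps` and `fake_inference` are hypothetical stand-ins, and the workload is a dummy computation rather than a real model.

```python
import time


def measure_fps(run_once, warmup=5, iterations=50):
    """Time repeated single-image calls and return throughput in frames per second."""
    # Warm-up runs are excluded from timing so one-off startup costs do not skew the result.
    for _ in range(warmup):
        run_once()

    start = time.perf_counter()
    for _ in range(iterations):
        run_once()
    elapsed = time.perf_counter() - start
    return iterations / elapsed


def fake_inference():
    """Stand-in for one forward pass on one image (batch size 1)."""
    sum(i * i for i in range(10_000))


fps = measure_fps(fake_inference)
print(f"{fps:.1f} FPS, {1000.0 / fps:.2f} ms per image")
```

The printed values depend entirely on the machine running the sketch, so they are not comparable to the figures in the chart.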

## <img src="https://user-images.githubusercontent.com/48054808/157828296-d5eb0ccb-23ea-40f5-9957-29853d7d13a9.png" width="20"/> Tutorials

### Get Started

- [Installation Guide](docs/tutorials/INSTALL.md)
- [Prepare Dataset](docs/tutorials/PrepareDataSet_en.md)
- [Quick Start on PaddleDetection](docs/tutorials/GETTING_STARTED.md)

### Advanced Tutorials

- Parameter Configuration

  - [Parameter configuration for RCNN model](docs/tutorials/config_annotation/faster_rcnn_r50_fpn_1x_coco_annotation_en.md)
  - [Parameter configuration for PP-YOLO model](docs/tutorials/config_annotation/ppyolo_r50vd_dcn_1x_coco_annotation_en.md)

- Model Compression (based on [PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim))

  - [Prune/Quant/Distill](configs/slim)

- Inference and Deployment

  - [Export model for inference](deploy/EXPORT_MODEL_en.md)
  - [Paddle Inference](deploy/README_en.md)
    - [Python inference](deploy/python)
    - [C++ inference](deploy/cpp)
  - [Paddle-Lite](deploy/lite)
  - [Paddle Serving](deploy/serving)
  - [Export ONNX model](deploy/EXPORT_ONNX_MODEL_en.md)
  - [Inference benchmark](deploy/BENCHMARK_INFER_en.md)
  - [Exporting to ONNX and using OpenVINO for inference](docs/advanced_tutorials/openvino_inference/README.md)

- Advanced Development

  - [New data augmentations](docs/advanced_tutorials/READER_en.md)
  - [New detection algorithms](docs/advanced_tutorials/MODEL_TECHNICAL.md)

## <img src="https://user-images.githubusercontent.com/48054808/157829890-a535b8a6-631c-4c87-b861-64d4b32b2d6a.png" width="20"/> Model Zoo

- General Object Detection
  - [Model library and baselines](docs/MODEL_ZOO_cn.md)
  - [PP-YOLOE](configs/ppyoloe/README_cn.md)
  - [PP-YOLO](configs/ppyolo/README.md)
  - [PP-PicoDet](configs/picodet/README.md)
  - [Enhanced Anchor Free model--TTFNet](configs/ttfnet/README_en.md)
  - [Mobile models](static/configs/mobile/README_en.md)
  - [676 classes of object detection](static/docs/featured_model/LARGE_SCALE_DET_MODEL_en.md)
  - [Two-stage practical PSS-Det](configs/rcnn_enhance/README_en.md)
  - [SSLD pretrained models](docs/feature_models/SSLD_PRETRAINED_MODEL_en.md)
- General Instance Segmentation
  - [SOLOv2](configs/solov2/README.md)
- Rotated Object Detection
  - [S2ANet](configs/dota/README_en.md)
- [Keypoint Detection](configs/keypoint)
  - [PP-TinyPose](configs/keypoint/tiny_pose)
  - HigherHRNet
  - HRNet
  - LiteHRNet
- [Multi-Object Tracking](configs/mot/README.md)
  - [PP-Tracking](deploy/pptracking/README_en.md)
  - [DeepSORT](configs/mot/deepsort/README.md)
  - [JDE](configs/mot/jde/README.md)
  - [FairMOT](configs/mot/fairmot/README.md)
  - [ByteTrack](configs/mot/bytetrack/README.md)
- Practical Specific Models
  - [Face detection](configs/face_detection/README_en.md)
  - [Pedestrian detection](configs/pedestrian/README.md)
  - [Vehicle detection](configs/vehicle/README.md)
- Scenario Solution
  - [Real-Time Human Analysis Tool PP-Human](deploy/pphuman)
- Competition Solution
  - [Objects365 2019 Challenge champion model](static/docs/featured_model/champion_model/CACascadeRCNN_en.md)
  - [Best single model of Open Images 2019-Object Detection](static/docs/featured_model/champion_model/OIDV5_BASELINE_MODEL_en.md)

## <img title="" src="https://user-images.githubusercontent.com/48054808/157836473-1cf451fa-f01f-4148-ba68-b6d06d5da2f9.png" alt="" width="20"> Applications

- [Christmas portrait automatic generation tool](static/application/christmas)
- [Android Fitness Demo](https://github.com/zhiboniu/pose_demo_android)

## <img src="https://user-images.githubusercontent.com/48054808/157835981-ef6057b4-6347-4768-8fcc-cd07fcc3d8b0.png" width="20"/> Updates

For details of version updates, please refer to the [Version Update Doc](docs/CHANGELOG.md).

## <img title="" src="https://user-images.githubusercontent.com/48054808/157835345-f5d24128-abaf-4813-b793-d2e5bdc70e5a.png" alt="" width="20"> License

PaddleDetection is released under the [Apache 2.0 license](LICENSE).

## <img src="https://user-images.githubusercontent.com/48054808/157835796-08d4ffbc-87d9-4622-89d8-cf11a44260fc.png" width="20"/> Contribution

Contributions are highly welcome, and we would really appreciate your feedback!

- Thanks to [Mandroide](https://github.com/Mandroide) for cleaning up the code and unifying some function interfaces.
- Thanks to [FL77N](https://github.com/FL77N/) for contributing the code of the `Sparse-RCNN` model.
- Thanks to [Chen-Song](https://github.com/Chen-Song) for contributing the code of the `Swin Faster-RCNN` model.
- Thanks to [yangyudong](https://github.com/yangyudong2020) and [hchhtc123](https://github.com/hchhtc123) for contributing the PP-Tracking GUI.
- Thanks to [Shigure19](https://github.com/Shigure19) for contributing the PP-TinyPose fitness app.
- Thanks to [manangoel99](https://github.com/manangoel99) for contributing Wandblogger for visualization of training and evaluation metrics.

## <img src="https://user-images.githubusercontent.com/48054808/157835276-9aab9d1c-1c46-446b-bdd4-5ab75c5cfa48.png" width="20"/> Citation

```
@misc{ppdet2019,
title={PaddleDetection, Object detection and instance segmentation toolkit based on PaddlePaddle.},
author={PaddlePaddle Authors},
howpublished = {\url{https://github.com/PaddlePaddle/PaddleDetection}},
year={2019}
}
```