README_en.md 17.3 KB
Newer Older
W
wangguanzhong 已提交
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113
[简体中文](README.md) | English

# Real Time Pedestrian Analysis Tool PP-Human

**PP-Human is the industry's first open-sourced real-time pedestrian analysis tool based on PaddlePaddle deep learning framework. It has three major features: rich functions, wide application, and efficient deployment.**



![](https://user-images.githubusercontent.com/22989727/178965250-14be25c1-125d-4d90-8642-7a9b01fecbe2.gif)



PP-Human supports various inputs such as images, single-camera, and multi-camera videos. It covers multi-object tracking, attributes recognition, behavior analysis, visitor traffic statistics, and trace records. PP-Human can be applied to fields including Smart Transportation, Smart Community, and industrial inspections. It can also be deployed on server sides and TensorRT accelerator. On the T4 server, it could achieve real-time analysis.

## 📣 Updates

- 🔥 **2022.7.13:PP-Human v2 launched with a full upgrade of four industrial features: behavior analysis, attributes recognition, visitor traffic statistics and ReID. It provides a strong core algorithm for pedestrian detection, tracking and attribute analysis with a simple and detailed development process and model optimization strategy.**
- 2022.4.18: Add  PP-Human practical tutorials, including training, deployment, and action expansion. Details for AIStudio project please see [Link](https://aistudio.baidu.com/aistudio/projectdetail/3842982)

- 2022.4.10: Add PP-Human examples; empower refined management of intelligent community management. A quick start for AIStudio [Link](https://aistudio.baidu.com/aistudio/projectdetail/3679564)
- 2022.4.5: Launch the real-time pedestrian analysis tool PP-Human. It supports pedestrian tracking, visitor traffic statistics, attributes recognition, and falling detection. Due to its specific optimization of real-scene data, it can accurately recognize various falling gestures, and adapt to different environmental backgrounds, light and camera angles.

## 🔮 Features and demonstration

| ⭐ Feature                                          | 💟 Advantages                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          | 💡Example                                                                                                                                     |
| -------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------- |
| **ReID**                                           | Extraordinary performance: special optimization for technical challenges such as target occlusion, uncompleted and blurry objects to achieve mAP 98.8, 1.5ms/person                                                                                                                                                                                                                                                                                                                                                    | <img src="https://user-images.githubusercontent.com/48054808/173037607-0a5deadc-076e-4dcc-bd96-d54eea205f1f.png" title="" alt="" width="191"> |
| **Attribute analysis**                             | Compatible with a variety of data formats: support for images, video input<br/><br/>High performance: Integrated open-sourced datasets with real enterprise data for training, achieved mAP 94.86, 2ms/person<br/><br/>Support 26 attributes: gender, age, glasses, tops, shoes, hats, backpacks and other 26 high-frequency attributes                                                                                                                                                                                | <img src="https://user-images.githubusercontent.com/48054808/173036043-68b90df7-e95e-4ada-96ae-20f52bc98d7c.png" title="" alt="" width="207"> |
| **Behaviour detection**                            | Rich function: support five high-frequency anomaly behavior detection of falling, fighting, smoking, telephoning, and intrusion<br/><br/>Robust: unlimited by different environmental backgrounds, light, and camera angles.<br/><br/>High performance: Compared with video recognition technology, it takes significantly smaller computation resources; support localization and service-oriented rapid deployment<br/><br/>Fast training: only takes 15 minutes to produce high precision behavior detection models | <img src="https://user-images.githubusercontent.com/48054808/173034825-623e4f78-22a5-4f14-9b83-dc47aa868478.gif" title="" alt="" width="209"> |
| **Visitor traffic statistics**<br>**Trace record** | Simple and easy to use: single parameter to initiate functions of visitor traffic statistics and trace record                                                                                                                                                                                                                                                                                                                                                                                                          | <img src="https://user-images.githubusercontent.com/22989727/174736440-87cd5169-c939-48f8-90a1-0495a1fcb2b1.gif" title="" alt="" width="200"> |

## 🗳 Model Zoo

<details>
<summary><b> Single model results (click to expand) </b></summary>

| Task                                        | Application                             | Accuracy        | Inference speed(ms)  | Model size | Inference deployment model                                                                              |
|:-------------------------------------------:|:---------------------------------------:|:--------------- |:--------------------:|:----------:|:-------------------------------------------------------------------------------------------------------:|
| Object detection (high precision)           | Image input                             | mAP: 57.8       | 25.1ms               | 182M       | [Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip)               |
| Object detection (Lightweight)              | Image input                             | mAP: 53.2       | 16.2ms               | 27M        | [Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_s_36e_pipeline.zip)               |
| Object tracking (high precision)            | Video input                             | MOTA: 82.2      | 31.8ms               | 182M       | [Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip)               |
| Object tracking (high precision)            | Video input                             | MOTA: 73.9      | 21.0ms               | 27M        | [Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_s_36e_pipeline.zip)               |
| Attribute recognition (high precision)      | Image/Video input Attribute recognition | mA: 95.4        | Single person 4.2ms  | 86M        | [Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/PPHGNet_small_person_attribute_954_infer.zip) |
| Attribute recognition (Lightweight)         | Image/Video input Attribute recognition | mA: 94.5        | Single person 2.9ms  | 7.2M       | [Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/PPLCNet_x1_0_person_attribute_945_infer.zip)  |
| Keypoint detection                          | Video input Attribute recognition       | AP: 87.1        | Single person 5.7ms  | 101M       | [Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/dark_hrnet_w32_256x192.zip)                   |
| Classification based on key point sequences | Video input Attribute recognition       | Accuracy: 96.43 | Single person 0.07ms | 21.8M      | [Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/STGCN.zip)                                    |
| Detection based on Human ID                 | Video input Attribute recognition       | Accuracy: 86.85 | Single person 1.8ms  | 45M        | [Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/PPHGNet_tiny_calling_halfbody.zip)            |
| Detection based on Human ID                 | Video input Attribute recognition       | AP50: 79.5      | Single person 10.9ms | 27M        | [Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/ppyoloe_crn_s_80e_smoking_visdrone.zip)       |
| Video classification                        | Video input Attribute recognition       | Accuracy: 89.0  | 19.7ms/1s Video      | 90M        | [Link](https://videotag.bj.bcebos.com/PaddleVideo-release2.3/ppTSM_fight.pdparams)                      |
| ReID                                        | Video input ReID                        | mAP: 98.8       | Single person 0.23ms | 85M        | [Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/reid_model.zip)                               |

</details>

<details>
<summary><b>End-to-end model results (click to expand)</b></summary>

| Task                                   | End-to-End Speed(ms) | Model                                                                                                                                                                                                                                                                                                                           | Size                                                                                                   |
|:--------------------------------------:|:--------------------:|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------:|
| Pedestrian detection (high precision)  | 25.1ms               | [Multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip)                                                                                                                                                                                                                      | 182M                                                                                                   |
| Pedestrian detection (lightweight)     | 16.2ms               | [Multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_s_36e_pipeline.zip)                                                                                                                                                                                                                      | 27M                                                                                                    |
| Pedestrian tracking (high precision)   | 31.8ms               | [Multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip)                                                                                                                                                                                                                      | 182M                                                                                                   |
| Pedestrian tracking (lightweight)      | 21.0ms               | [Multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_s_36e_pipeline.zip)                                                                                                                                                                                                                      | 27M                                                                                                    |
| Attribute recognition (high precision) | Single person8.5ms   | [Object detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip)<br> [Attribute recognition](https://bj.bcebos.com/v1/paddledet/models/pipeline/strongbaseline_r50_30e_pa100k.zip)                                                                                                         | Object detection:182M<br>Attribute recognition:86M                                                     |
| Attribute recognition (lightweight)    | Single person 7.1ms  | [Object detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip)<br> [Attribute recognition](https://bj.bcebos.com/v1/paddledet/models/pipeline/strongbaseline_r50_30e_pa100k.zip)                                                                                                         | Object detection:182M<br>Attribute recognition:86M                                                     |
| Falling detection                      | Single person 10ms   | [Multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) <br> [Keypoint detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/dark_hrnet_w32_256x192.zip) <br> [Behavior detection based on key points](https://bj.bcebos.com/v1/paddledet/models/pipeline/STGCN.zip) | Multi-object tracking:182M<br>Keypoint detection:101M<br>Behavior detection based on key points: 21.8M |
| Intrusion detection                    | 31.8ms               | [Multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip)                                                                                                                                                                                                                      | 182M                                                                                                   |
| Fighting detection                     | 19.7ms               | [Video classification](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip)                                                                                                                                                                                                                       | 90M                                                                                                    |
| Smoking detection                      | Single person 15.1ms | [Object detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip)<br>[Object detection based on Human Id](https://bj.bcebos.com/v1/paddledet/models/pipeline/ppyoloe_crn_s_80e_smoking_visdrone.zip)                                                                                        | Object detection:182M<br>Object detection based on Human ID: 27M                                       |
| Phoning detection                      | Single person ms     | [Object detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip)<br>[Image classification based on Human ID](https://bj.bcebos.com/v1/paddledet/models/pipeline/PPHGNet_tiny_calling_halfbody.zip)                                                                                         | Object detection:182M<br>Image classification based on Human ID:45M                                    |

</details>

Click to download the model, then unzip and save it in the `. /output_inference`.

## 📚 Doc Tutorials

### [A Quick Start](docs/tutorials/QUICK_STARTED.md)

### Pedestrian attribute/feature recognition

* [A quick start](docs/tutorials/attribute.md)
* [Customized development tutorials](../../docs/advanced_tutorials/customization/attribute.md)
  * Data Preparation
  * Model Optimization
  * New Attributes

### Behavior detection

* [A quick start](docs/tutorials/action.md)
  * Falling detection
  * Fighting detection
* [Customized development tutorials](../../docs/advanced_tutorials/customization/action_recognotion/README.md)
  * Solution Selection
  * Data Preparation
  * Model Optimization
  * New Attributes

### ReID

* [A quick start](docs/tutorials/mtmct.md)
* [Customized development tutorials](../../docs/advanced_tutorials/customization/mtmct.md)
  * Data Preparation
  * Model Optimization

### Pedestrian tracking, visitor traffic statistics, trace records

* [A quick start](docs/tutorials/mot.md)
  * Pedestrian tracking,
  * Visitor traffic statistics
  * Regional intrusion diagnosis and counting
* [Customized development tutorials](../../docs/advanced_tutorials/customization/mot.md)
  * Data Preparation
  * Model Optimization