1. For installation details, please refer to [Installation Tutorials](../../../../docs/tutorials/INSTALL.md)
2. If you need TensorRT inference acceleration (for speed measurement), please install a PaddlePaddle build with TensorRT support. You can download it from the [PaddlePaddle Installation Packages](https://paddleinference.paddlepaddle.org.cn/v2.2/user_guides/download_lib.html#python), follow the [Instructions](https://www.paddlepaddle.org.cn/inference/master/optimize/paddle_trt.html), use Docker, or compile from source to prepare the environment.
## Model Download
PP-Human provides object detection, attribute recognition, behaviour recognition and ReID pre-trained models for different applications. Developers can download them directly.
| Task | End-to-End Speed (ms) | Model Solution | Model Size |
|:----:|:---------------------:|:--------------:|:----------:|
| Smoking Detection | Single Person 15.1ms | [Object Detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip)<br>[Object Detection Based On Body ID](https://bj.bcebos.com/v1/paddledet/models/pipeline/ppyoloe_crn_s_80e_smoking_visdrone.zip) | Object Detection: 182M<br>Object Detection Based On Body ID: 27M |
| Phone-calling Detection | Single Person 6.0ms | [Object Detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip)<br>[Image Classification Based On Body ID](https://bj.bcebos.com/v1/paddledet/models/pipeline/PPHGNet_tiny_calling_halfbody.zip) | Object Detection: 182M<br>Image Classification Based On Body ID: 45M |
Download the model and unzip it into the `./output_inference` folder.
In the configuration file, the model path defaults to the download path of the model. If the user does not change it, the corresponding model will be downloaded automatically upon inference.
**Note:**
- Model accuracy is tested on fused datasets, which contain both open source and enterprise datasets.
- ReID model accuracy is tested on the Market1501 dataset
- Prediction speed is obtained at T4 with TensorRT FP16 enabled, which includes data pre-processing, model inference and post-processing.
## Configuration
The PP-Human-related configuration is located in ``deploy/pipeline/config/infer_cfg_pphuman.yml``, and this configuration file contains all the features currently supported by PP-Human. If you want to see the configuration for a specific feature, please refer to the relevant configuration in ``deploy/pipeline/config/examples/``. In addition, the contents of the configuration file can be modified with the `-o` command line parameter. E.g. to modify the model directory of an attribute, developers can run `-o ATTR.model_dir="DIR_PATH"`.
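For instance, a hedged sketch of such an override at launch (the `DIR_PATH` and video path are placeholders):

```
# Override the attribute model directory; command-line values take priority over the config file
python deploy/pipeline/pipeline.py --config deploy/pipeline/config/infer_cfg_pphuman.yml \
    -o ATTR.model_dir="DIR_PATH" --video_file=test_video.mp4 --device=gpu
```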
The features and corresponding task types are as follows.
| Input | Feature | Task Type | Config |
|:-----:|:-------:|:---------:|:------:|
| Single-camera video | Attribute Recognition | Multi-Object Tracking, Attribute Recognition | MOT ATTR |
| Single-camera video | Behaviour Recognition | Multi-Object Tracking, Keypoint Detection, Falling Detection | MOT KPT SKELETON_ACTION |
Take attribute recognition based on video input as an example: its task type includes multi-object tracking and attribute recognition. The specific configuration is as follows.
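A minimal sketch of the relevant entries in `infer_cfg_pphuman.yml` (field values are illustrative; see the shipped config for the authoritative defaults):

```
MOT:
  model_dir: output_inference/mot_ppyoloe_l_36e_pipeline/  # tracking model path
  batch_size: 1
  enable: True   # must be True for video input

ATTR:
  model_dir: output_inference/my_attr_model/  # attribute model path (illustrative name)
  batch_size: 8
  enable: True
```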
1. Use the default configuration directly or the configuration file in examples, or modify the configuration in `infer_cfg_pphuman.yml`
2. Use the command line to enable functions or change the model path.
```
# Example: Pedestrian tracking; specify the config file path, model path and test video. The model path specified on the command line takes priority over the config file. (Paths are illustrative.)
python deploy/pipeline/pipeline.py --config deploy/pipeline/config/infer_cfg_pphuman.yml \
    -o MOT.enable=True MOT.model_dir=ppyoloe/ --video_file=test_video.mp4 --device=gpu
```
3. For an rtsp stream, use the `--rtsp RTSP [RTSP ...]` parameter to specify one or more rtsp streams. Separate multiple addresses with spaces (alternatively, pass the rtsp stream address directly as the `--video_file` value). Examples follow:
```
# Example: Single video stream for pedestrian attribute recognition (the rtsp address below is a placeholder)
python deploy/pipeline/pipeline.py --config deploy/pipeline/config/examples/infer_cfg_human_attr.yml \
    -o visual=False --rtsp rtsp://[YOUR_RTSP_SITE]:8554/test --device=gpu
```
Due to the large gap in computing power of the Jetson platform compared to the server, we suggest:
1. Choose a lightweight model, especially for the tracking model; [ppyoloe_s](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_s_36e_pipeline.zip) is recommended
2. Enable frame skipping for tracking; a value of 2 or 3 is recommended, e.g. `skip_frame_num: 3`
With this recommended configuration, higher speeds can be achieved on the TX2 platform: in the attribute-recognition case, speeds of up to 20 fps were measured. The configuration file can be modified directly (recommended) or via the command line (not recommended, as the fields are long).
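A minimal sketch of the corresponding config changes (the model path assumes the archive above is unzipped under `./output_inference`):

```
MOT:
  model_dir: output_inference/mot_ppyoloe_s_36e_pipeline/  # lightweight tracking model
  skip_frame_num: 3   # skip frames to trade accuracy for speed; 2-3 recommended
  enable: True
```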
| Parameter | Necessity | Description |
|:---------:|:---------:|:------------|
| -o | Option | Overwrite the corresponding configuration in the configuration file |
| --image_file | Option | Images to be predicted |
| --image_dir | Option | Path to the images folder to be predicted |
| --video_file | Option | Video to be predicted, or rtsp stream address (rtsp parameter recommended) |
| --rtsp | Option | rtsp video stream address, supports one or more simultaneous streams input |
| --camera_id | Option | ID of the camera to predict from; the default is -1 (no camera prediction). It can be set to 0 - (number of cameras - 1). Press `q` in the visualization window during prediction to save the result to output/output.mp4 |
| --device | Option | Running device, options include `CPU/GPU/XPU`, and the default is `CPU`. |
| --output_dir | Option | The root directory for the visualization results, and the default is output/ |
| --run_mode | Option | For GPU, the default is paddle, with (paddle/trt_fp32/trt_fp16/trt_int8) as optional |
| --enable_mkldnn | Option | Whether to enable MKLDNN acceleration in CPU prediction, the default is False |
| --cpu_threads | Option | Set the number of cpu threads, and the default is 1 |
| --trt_calib_mode | Option | Whether TensorRT uses the calibration function, and the default is False; set to True when using TensorRT's int8 function and False when using the PaddleSlim quantized model |
| --do_entrance_counting | Option | Whether to count entrance/exit traffic flows, the default is False |
| --draw_center_traj | Option | Whether to draw the center trajectory, the default is False |
| --region_type | Option | 'horizontal' (default), 'vertical': traffic count direction; 'custom': set break-in area |
| --region_polygon | Option | Set the coordinates of the polygon multipoint in the break-in area. No default. |
| --do_break_in_counting | Option | Whether to count break-ins into the area, the default is False |
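As a worked example, a break-in count over a custom quadrilateral region might be launched like this (the polygon coordinates are placeholders for your video's resolution):

```
# Illustrative: count break-ins into a custom polygon region
python deploy/pipeline/pipeline.py --config deploy/pipeline/config/infer_cfg_pphuman.yml \
    --video_file=test_video.mp4 --device=gpu \
    --do_break_in_counting --region_type=custom \
    --region_polygon 200 200 400 200 300 400 100 400
```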
## Solutions
The overall solution for PP-Human v2 is shown in the graph below:
### Pedestrian detection
- Take PP-YOLOE L as the object detection model
- For detailed documentation, please refer to [PP-YOLOE](../../../../configs/ppyoloe/) and [Multi-Object Tracking](pphuman_mot_en.md)
### Pedestrian tracking
- Pedestrian tracking by the SDE solution
- Adopt PP-YOLOE L (high precision) and S (lightweight) as the detection models
- Adopt the OC-SORT solution for the tracking module
- Refer to [OC-SORT](../../../../configs/mot/ocsort) and [Multi-Object Tracking](pphuman_mot_en.md) for details
### Multi-camera & multi-pedestrian tracking
- Use PP-YOLOE & OC-SORT to acquire single-camera multi-object tracking trajectory
- Extract features for each frame using ReID (StrongBaseline network).
- Match multi-camera trajectory features to obtain multi-camera tracking results.
- Refer to [Multi-camera & multi-pedestrian tracking](pphuman_mtmct_en.md) for details.
### Attribute Recognition
- Use PP-YOLOE + OC-SORT to track the human body.
- Use PP-HGNet and PP-LCNet (multi-class classification models) to complete attribute recognition. The main attributes include age, gender, hat, glasses, top and bottom dressing style, and backpack.
- Refer to [attribute recognition](pphuman_attribute_en.md) for details.
### Behaviour Recognition
- Four behaviour recognition solutions are provided:
- 1. Behaviour recognition based on skeletal points, e.g. falling recognition
- 2. Behaviour recognition based on image classification, e.g. phone call recognition
- 3. Behaviour recognition based on detection, e.g. smoking recognition
- 4. Behaviour recognition based on Video classification, e.g. fighting recognition
- For details, please refer to [Behaviour Recognition](pphuman_action_en.md)
1. For installation details, please refer to [Installation Tutorials](../../../../docs/tutorials/INSTALL.md)
2. If you need TensorRT inference acceleration (for speed measurement), please install a PaddlePaddle build with TensorRT support. You can download it from the [PaddlePaddle Installation Packages](https://paddleinference.paddlepaddle.org.cn/v2.2/user_guides/download_lib.html#python), follow the [Instructions](https://www.paddlepaddle.org.cn/inference/master/optimize/paddle_trt.html), use Docker, or compile from source to prepare the environment.
## Model Download
PP-Vehicle provides object detection, attribute recognition, behaviour recognition and ReID pre-trained models for different applications. Developers can download them directly.
| Task | End-to-End Speed (ms) | Model Solution | Model Size |
|:----:|:---------------------:|:--------------:|:----------:|
Download the model and unzip it into the `./output_inference` folder.
In the configuration file, the model path defaults to the download path of the model. If the user does not change it, the corresponding model will be downloaded automatically upon inference.
**Notes:**
- The accuracy of the detection/tracking model is obtained on the joint dataset PPVehicle (an integration of the public datasets BDD100K-MOT and UA-DETRAC). For more details, please refer to [PP-Vehicle](../../../../configs/ppvehicle)
- Inference speed is obtained at T4 with TensorRT FP16 enabled, which includes data pre-processing, model inference and post-processing.
## Configuration
The PP-Vehicle-related configuration is located in ``deploy/pipeline/config/infer_cfg_ppvehicle.yml``. Developers need to set specific task types to use different features.
The features and corresponding task types are as follows.
| Input | Feature | Task Type | Config |
|:-----:|:-------:|:---------:|:------:|
| Single-camera video | Attribute Recognition | Multi-Object Tracking, Attribute Recognition | MOT ATTR |
| Single-camera video | License Plate Recognition | Multi-Object Tracking, License Plate Recognition | MOT VEHICLEPLATE |
Take attribute recognition based on video input as an example: its task type includes multi-object tracking and attribute recognition. The specific configuration is as follows.
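A minimal sketch of the relevant entries in `infer_cfg_ppvehicle.yml` (field values are illustrative; check the shipped config for the authoritative defaults):

```
MOT:
  model_dir: output_inference/mot_ppyoloe_l_36e_ppvehicle/  # tracking model path
  batch_size: 1
  enable: True   # must be True for video input

VEHICLE_ATTR:
  model_dir: output_inference/vehicle_attribute_model/  # attribute model path (illustrative name)
  batch_size: 8
  enable: True
```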
- If the developer needs to carry out different tasks, set the corresponding enable option to True in the configuration file.
- If the developer only needs to modify the model file path, run the command line with `-o MOT.model_dir=ppyoloe/` after --config, or manually modify the corresponding model path in the configuration file. For more details, please refer to the following parameter descriptions.
## Inference Deployment
1. Use the default configuration directly or the configuration file in examples, or modify the configuration in `infer_cfg_ppvehicle.yml`
```
# Example: In vehicle detection, specify the configuration file path and a test image (paths are illustrative)
python deploy/pipeline/pipeline.py --config deploy/pipeline/config/infer_cfg_ppvehicle.yml \
    --image_file=test_image.jpg --device=gpu
```
2. Use the command line to enable functions or change the model path.
```
# Example: In vehicle tracking, specify the configuration file path and a test video, enable the MOT model and modify its path on the command line; the command-line model path takes priority over the configuration file
python deploy/pipeline/pipeline.py --config deploy/pipeline/config/infer_cfg_ppvehicle.yml \
    -o MOT.enable=True MOT.model_dir=ppyoloe/ --video_file=test_video.mp4 --device=gpu
```
3. For an rtsp stream, use the `--rtsp RTSP [RTSP ...]` parameter to specify one or more rtsp streams. Separate multiple addresses with spaces (alternatively, pass the rtsp stream address directly as the `--video_file` value). Examples follow:
```
# Example: Single video stream for vehicle attribute recognition (the rtsp address below is a placeholder)
python deploy/pipeline/pipeline.py --config deploy/pipeline/config/examples/infer_cfg_vehicle_attr.yml \
    -o visual=False --rtsp rtsp://[YOUR_RTSP_SITE]:8554/test --device=gpu
```
Due to the large gap in computing power of the Jetson platform compared to the server, we suggest:
1. Choose a lightweight model, especially for the tracking model; [ppyoloe_s](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_s_36e_pipeline.zip) is recommended
2. Enable frame skipping for tracking; a value of 2 or 3 is recommended, e.g. `skip_frame_num: 3`
With this recommended configuration, higher speeds can be achieved on the TX2 platform: in the attribute-recognition case, speeds of up to 20 fps were measured. The configuration file can be modified directly (recommended) or via the command line (not recommended, as the fields are long).
| Parameter | Necessity | Description |
|:---------:|:---------:|:------------|
| -o | Option | Overwrite the corresponding configuration in the configuration file |
| --image_file | Option | Images to be predicted |
| --image_dir | Option | Path to the images folder to be predicted |
| --video_file | Option | Video to be predicted, or rtsp stream address (rtsp parameter recommended) |
| --rtsp | Option | rtsp video stream address, supports one or more simultaneous streams input |
| --camera_id | Option | ID of the camera to predict from; the default is -1 (no camera prediction). It can be set to 0 - (number of cameras - 1). Press `q` in the visualization window during prediction to save the result to output/output.mp4 |
| --device | Option | Running device, options include `CPU/GPU/XPU`, and the default is `CPU`. |
| --output_dir | Option | The root directory for the visualization results, and the default is output/ |
| --run_mode | Option | For GPU, the default is paddle, with (paddle/trt_fp32/trt_fp16/trt_int8) as optional |
| --enable_mkldnn | Option | Whether to enable MKLDNN acceleration in CPU prediction, the default is False |
| --cpu_threads | Option | Set the number of cpu threads, and the default is 1 |
| --trt_calib_mode | Option | Whether TensorRT uses the calibration function, and the default is False; set to True when using TensorRT's int8 function and False when using the PaddleSlim quantized model |
| --do_entrance_counting | Option | Whether to count entrance/exit traffic flows, the default is False |
| --draw_center_traj | Option | Whether to draw center trajectory, the default is False |
| --region_type | Option | 'horizontal' (default), 'vertical': traffic count direction; 'custom': set illegal parking area |
| --region_polygon | Option | Set the coordinates of the polygon multipoint in the illegal parking area. No default. |
| --illegal_parking_time | Option | Set the time threshold for illegal parking in seconds (s), -1 (default) indicates no check |
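As a worked example, illegal parking detection over a custom region might be launched like this (the example config name and polygon coordinates are illustrative):

```
# Illustrative: flag vehicles that stay inside the custom region longer than 5 seconds
python deploy/pipeline/pipeline.py --config deploy/pipeline/config/examples/infer_cfg_illegal_parking.yml \
    --video_file=test_video.mp4 --device=gpu \
    --draw_center_traj --illegal_parking_time=5 \
    --region_type=custom --region_polygon 100 300 300 300 300 500 100 500
```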
## Solutions
The overall solution for PP-Vehicle v2 is shown in the graph below:
<div width="1000" align="center">
  <img src="../../../../docs/images/ppvehicle.png"/>
</div>
### Vehicle detection
- Take PP-YOLOE L as the object detection model
- For detailed documentation, please refer to [PP-YOLOE](../../../../configs/ppyoloe/) and [Multi-Object Tracking](ppvehicle_mot_en.md)
### Vehicle tracking
- Vehicle tracking by SDE solution
- Adopt PP-YOLOE L (high precision) and S (lightweight) for detection models
- Adopt the OC-SORT solution for the tracking module
- Refer to [OC-SORT](../../../../configs/mot/ocsort) and [Multi-Object Tracking](ppvehicle_mot_en.md) for details
### Attribute Recognition
- Use PP-LCNet provided by PaddleClas to recognize vehicle colours and model attributes.
- For details, please refer to [Attribute Recognition](ppvehicle_attribute_en.md)
### License plate recognition
- Use ch_PP-OCRv3_det+ch_PP-OCRv3_rec model to recognize license plate number
- For details, please refer to [Plate Recognition](ppvehicle_plate_en.md)
### Illegal Parking Detection
- Use the high-precision vehicle tracking model (PP-YOLOE L) to determine whether parking is illegal based on the vehicle's trajectory and the designated illegal parking area; if it is, display the illegal parking plate number.
- For details, please refer to [Illegal Parking Detection](ppvehicle_illegal_parking_en.md)
Vehicle detection and tracking are widely used in traffic monitoring and autonomous driving. The detection and tracking module is integrated in PP-Vehicle, providing a solid foundation for tasks including license plate detection and vehicle attribute recognition. We provide pre-trained models that can be directly used by developers.
1. The detection/tracking model uses the PPVehicle dataset (which integrates BDD100K-MOT and UA-DETRAC), merging `car, truck, bus, van` from BDD100K-MOT and `car, bus, van` from UA-DETRAC into a single class, `vehicle`. The detection accuracy (mAP) was tested on the PPVehicle test set, and the tracking accuracy (MOTA) was obtained on the BDD100K-MOT test set (`car, truck, bus, van` combined into one class `vehicle`). For more details about the training procedure, please refer to [ppvehicle](../../../../configs/ppvehicle).
2. Inference speed is obtained at T4 with TensorRT FP16 enabled, which includes data pre-processing, model inference and post-processing.
## How To Use
【Config】
The parameters associated with the attributes in the configuration file are as follows.
```
DET:
  model_dir: output_inference/mot_ppyoloe_l_36e_ppvehicle/ # Vehicle detection model path
  batch_size: 1 # Batch size for model inference
MOT:
  model_dir: output_inference/mot_ppyoloe_l_36e_ppvehicle/ # Vehicle tracking model path
  batch_size: 1 # Batch size for model inference; must be 1 for the tracking task
  skip_frame_num: -1 # Number of frames to skip; -1 means no skipping, and skipping at most 3 frames is recommended
  enable: False # Whether to enable this function; it must be set to True before tracking
```
【Usage】
1. Download the model from the link in the table above and unzip it to ``./output_inference``, then change the model path in the configuration file. By default the model is downloaded automatically, in which case no changes are needed.
2. Image input will start a pure detection task (see the sample start commands after this list).
3. Video input will start a tracking task. Please set `enable: True` under the MOT configuration in infer_cfg_ppvehicle.yml. If frame skipping is needed for faster detection and tracking, `skip_frame_num: 2` is recommended, and the maximum should not exceed 3 (see the sample commands after this list).
- Config different model paths in ```./deploy/pipeline/config/infer_cfg_ppvehicle.yml```. The detection and tracking models correspond to the `DET` and `MOT` fields respectively; modify the path under the corresponding field to the actual path.
- **[Recommended]** Add `-o MOT.model_dir=[YOUR_DETMODEL_PATH]` after the config in the command line to modify the model path
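Minimal sketches of the start commands for steps 2 and 3 (file names are illustrative):

```
# Step 2: pure detection on an image
python deploy/pipeline/pipeline.py --config deploy/pipeline/config/infer_cfg_ppvehicle.yml \
    --image_file=test_image.jpg --device=gpu

# Step 3: tracking on a video (MOT.enable set on the command line for convenience)
python deploy/pipeline/pipeline.py --config deploy/pipeline/config/infer_cfg_ppvehicle.yml \
    -o MOT.enable=True --video_file=test_video.mp4 --device=gpu
```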
4. Entrance/exit traffic flow counting (a sample command follows this item)
- `--do_entrance_counting`: Whether to count entrance/exit traffic flows; the default is False
- `--draw_center_traj`: Whether to draw the center trajectory; the default is False. The input video should preferably be taken from a still camera
- `--region_type`: The region type for traffic counting. With `--do_entrance_counting`, it can be `horizontal` (default) or `vertical`, meaning the horizontal or vertical center line of the video frame serves as the entrance/exit. When the center point of the same object's frame moves from one side of that line to the other between two adjacent seconds, the count increases by 1.
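A hedged example of the counting command (paths are illustrative):

```
# Count entrance/exit flows across the horizontal center line and draw trajectories
python deploy/pipeline/pipeline.py --config deploy/pipeline/config/infer_cfg_ppvehicle.yml \
    --video_file=test_video.mp4 --device=gpu \
    --draw_center_traj --do_entrance_counting
```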
5. Regional break-in and counting
Please set `enable: True` under the MOT config in `infer_cfg_ppvehicle.yml` before running the start command (a sample command follows the parameter notes below):
- The test video for area break-in counting must be taken from a still camera, with no shaking or movement.
- `--do_break_in_counting`: Whether to count break-ins into the area; the default is False.
- `--region_type`: The region type for counting. With `--do_break_in_counting`, only `custom` can be selected (also the default), meaning a customized region is used as the entrance/exit. When the midpoint of the lower boundary of the same object's frame moves into the region between two adjacent seconds, the count increases by 1.
- `--region_polygon`: A sequence of point coordinates for the polygon of the customized region. Every two values form one point (x, y); the points are connected **in clockwise order** into a **closed region**, and at least 3 points (6 integers) are needed. The default value is `[]`, so developers must set the coordinates manually. For a quadrilateral region, the coordinate order is `top left, top right, bottom right, bottom left`. Developers can run [this code](../../tools/get_video_info.py) to get the resolution and frames of the predicted video; it also supports customizing and adjusting the visualisation of the polygon area.
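A hedged example of the break-in counting command (the polygon coordinates are placeholders for your video's resolution):

```
# Count break-ins into a custom polygon region
python deploy/pipeline/pipeline.py --config deploy/pipeline/config/infer_cfg_ppvehicle.yml \
    --video_file=test_video.mp4 --device=gpu \
    --do_break_in_counting --region_type=custom \
    --region_polygon 200 200 400 200 300 400 100 400
```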
The code for the visualisation of the customized polygon area runs as follows.
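A sketch of that visualisation run (the coordinates are placeholders):

```
# Visualize a candidate region_polygon on a frame of the video
python deploy/pipeline/tools/get_video_info.py --video_file=test_video.mp4 \
    --region_polygon 200 200 400 200 300 400 100 400
```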
A quick tip for drawing a customized area: grab any frame as a picture and open it with a drawing tool; hovering the mouse over a point in the area displays its coordinates. Record and round them, use them as the `region_polygon` parameter for the visualisation code above, run the visualisation again, and fine-tune the coordinates.
【Showcase】
<div width="1000" align="center">
  <img src="../images/mot_vehicle.gif"/>
</div>
## Solution
【Solution and feature】
- PP-YOLOE is adopted as the detector for vehicle object detection and multi-object tracking in picture/video input. For details, please refer to [PP-YOLOE](../../../../configs/ppyoloe/README_cn.md) and [PPVehicle](../../../../configs/ppvehicle)
- [OC-SORT](https://arxiv.org/pdf/2203.14360.pdf) is adopted as the multi-object tracking model, with PP-YOLOE replacing YOLOX as the detector and OCSORTTracker as the tracker. For more details, please refer to [OC-SORT](../../../../configs/mot/ocsort)
## Reference
```
@article{cao2022observation,
  title={Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking},
  author={Cao, Jinkun and Weng, Xinshuo and Khirodkar, Rawal and Pang, Jiangmiao and Kitani, Kris},
  journal={arXiv preprint arXiv:2203.14360},
  year={2022}
}
```
1. The tracking model uses the PPVehicle dataset (which integrates BDD100K-MOT and UA-DETRAC), merging `car, truck, bus, van` from BDD100K-MOT and `car, bus, van` from UA-DETRAC into a single class, `vehicle`.
2. License plate detection and recognition models were obtained from fine-tuned PP-OCRv3 model on the CCPD2019 and CCPD2020 mixed license plate datasets.
## How to Use
1. Download models from the table above and unzip them to ```PaddleDetection/output_inference```, then modify the model path in the configuration file. Models can also be downloaded automatically by default. To enable license plate recognition, set `enable: True` for `VEHICLE_PLATE` in `deploy/pipeline/config/infer_cfg_ppvehicle.yml`
2. For picture input, the start command is shown after the notes below (for more descriptions of the command parameters, please refer to [Quick Start - Parameter Description](./PPVehicle_QUICK_STARTED.md#41-parameter-description)).
- Config the license plate recognition model by modifying the `VEHICLE_PLATE` field in ```./deploy/pipeline/config/infer_cfg_ppvehicle.yml```
- **[Recommended]** Add `-o VEHICLE_PLATE.det_model_dir=[YOUR_DETMODEL_PATH] VEHICLE_PLATE.rec_model_dir=[YOUR_RECMODEL_PATH]` after the config in the command line to modify the model paths
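A hedged example of the start command (the image path is illustrative):

```
# Enable license plate recognition on an image input
python deploy/pipeline/pipeline.py --config deploy/pipeline/config/infer_cfg_ppvehicle.yml \
    -o VEHICLE_PLATE.enable=True --image_file=test_image.jpg --device=gpu
```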
The test results are as follows:
<div width="1000" align="center">
  <img src="../images/ppvehicleplate.jpg"/>
</div>
## Solutions
1. PP-YOLOE is adopted for vehicle object detection and multi-object tracking in picture/video input. For details, please refer to [PP-YOLOE](../../../../configs/ppyoloe/README_cn.md)
2. Using the coordinates of the vehicle detection boxes, each vehicle's image is cropped from the input image
3. Use the license plate detection model to locate the license plate area in each vehicle crop. The PP-OCRv3_det model, fine-tuned on the CCPD license plate dataset, is adopted as the solution.
4. Use a character recognition model to read the characters on the license plate. The PP-OCRv3_rec model, fine-tuned on the CCPD license plate dataset, is adopted as the solution.
**Performance optimization measures:**
1. Use a frame skipping strategy to detect license plates every 10 frames to reduce the computing workload.
2. Use the license plate result stabilization strategy to avoid the volatility of single frame results; use all historical license plate recognition results of the same id to gain the most likely result for that id.
## Reference
1. PaddleDetection featured detection model [PP-YOLOE](../../../../configs/ppyoloe).
2. Paddle OCR model library [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR).