[cherry-pick] update pphuman mot doc and add ppyoloe-s (#6211)

* add ppyoloe-s, update mot doc, test=document_fix
* fix doc typo, test=document_fix

e98e5f36 · Feng Ni · GitHub · eaf2dbe0
5 changed files
--- a/deploy/pphuman/README.md
+++ b/deploy/pphuman/README.md
@@ -42,8 +42,10 @@ PP-Human provides pre-trained models for object detection, attribute recognition, action recognition, and ReID
 | Task            | Scenario | Precision | Inference Speed (ms) | Model Weights | Inference Deployment Model |
 | :---------:     |:---------:     |:---------------     | :-------:  |  :------:      | :------:      |
-| Object Detection        | Image Input | mAP: 56.3  | 28.0ms          |[Download Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.pdparams) |[Download Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) |
+| Object Detection (high-precision) | Image Input | mAP: 56.6  | 28.0ms          |[Download Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.pdparams) |[Download Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) |
-| Object Tracking        | Video Input | MOTA: 72.0  | 33.1ms           |[Download Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.pdparams) |[Download Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) |
+| Object Detection (light-weight) | Image Input | mAP: 53.2  | 22.1ms          |[Download Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_s_36e_pipeline.pdparams) |[Download Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_s_36e_pipeline.zip) |
+| Object Tracking (high-precision) | Video Input | MOTA: 79.5  | 33.1ms           |[Download Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.pdparams) |[Download Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) |
+| Object Tracking (light-weight) | Video Input | MOTA: 69.1  | 27.2ms           |[Download Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_s_36e_pipeline.pdparams) |[Download Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_s_36e_pipeline.zip) |
 | Attribute Recognition    | Image/Video Input, Attribute Recognition  | mA: 94.86 |  2ms per person     | - |[Download Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/strongbaseline_r50_30e_pa100k.zip) |
 | Keypoint Detection    | Video Input, Action Recognition | AP: 87.1 | 2.9ms per person        |[Download Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/dark_hrnet_w32_256x192.pdparams) |[Download Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/dark_hrnet_w32_256x192.zip)
 | Action Recognition   |  Video Input, Action Recognition  | Accuracy: 96.43 |  2.7ms per person      | - |[Download Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/STGCN.zip) |

--- a/deploy/pphuman/README_en.md
+++ b/deploy/pphuman/README_en.md
@@ -43,8 +43,10 @@ To make users have access to models of different scenarios, PP-Human provides pr
 | Task            | Scenario | Precision | Inference Speed (ms) | Model Weights |Model Inference and Deployment |
 | :---------:     |:---------:     |:---------------     | :-------:  | :------:      | :------:      |
-| Object Detection        | Image/Video Input | mAP: 56.3  | 28.0ms           |[Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.pdparams) |[Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) |
+| Object Detection(high-precision)        | Image/Video Input | mAP: 56.6  | 28.0ms           |[Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.pdparams) |[Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) |
-| Object Tracking       | Image/Video Input | MOTA: 72.0  | 33.1ms           |[Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.pdparams) |[Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) |
+| Object Detection(light-weight)        | Image/Video Input | mAP: 53.2  | 22.1ms           |[Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_s_36e_pipeline.pdparams) |[Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_s_36e_pipeline.zip) |
+| Object Tracking(high-precision)       | Image/Video Input | MOTA: 79.5  | 33.1ms           |[Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.pdparams) |[Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) |
+| Object Tracking(light-weight)       | Image/Video Input | MOTA: 69.1  | 27.2ms           |[Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_s_36e_pipeline.pdparams) |[Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_s_36e_pipeline.zip) |
 | Attribute Recognition    | Image/Video Input  Attribute Recognition | mA: 94.86 |  2ms per person       | - |[Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/strongbaseline_r50_30e_pa100k.zip) |
 | Keypoint Detection    | Video Input  Action Recognition | AP: 87.1 | 2.9ms per person        | [Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/dark_hrnet_w32_256x192.pdparams) |[Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/dark_hrnet_w32_256x192.zip)
 | Action Recognition   |  Video Input  Action Recognition  | Precision 96.43 |  2.7ms per person          | - |[Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/STGCN.zip) |
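
The speed column above lists per-frame latency in milliseconds. As a rough aid for comparing the high-precision and light-weight models, the sketch below converts those latencies into an approximate frame rate. It assumes a single stream processed one frame at a time with no batching or pipelining, so the FPS values are derived estimates rather than measured results; the dictionary keys and the `latency_to_fps` helper are names introduced here for illustration.

```python
# Per-frame latencies (ms) copied from the table above.
# The FPS values are derived from them, not measured.
LATENCY_MS = {
    "detection_high_precision": 28.0,
    "detection_light_weight": 22.1,
    "tracking_high_precision": 33.1,
    "tracking_light_weight": 27.2,
}


def latency_to_fps(latency_ms: float) -> float:
    """Convert a per-frame latency in milliseconds to frames per second."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


if __name__ == "__main__":
    for name, ms in LATENCY_MS.items():
        print(f"{name}: {ms} ms/frame ~ {latency_to_fps(ms):.1f} FPS")
```

Under these assumptions the light-weight tracking model (27.2 ms) leaves more headroom for a 30 FPS video stream than the high-precision one (33.1 ms).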

--- a/deploy/pphuman/config/tracker_config.yml
+++ b/deploy/pphuman/config/tracker_config.yml
@@ -2,7 +2,7 @@
 # The tracker of MOT JDE Detector (such as FairMOT) is exported together with the model.
 # Here 'min_box_area' and 'vertical_ratio' are set for pedestrian, you can modify for other objects tracking.
-type: JDETracker # 'JDETracker' or 'DeepSORTTracker'
+type: JDETracker
 # BYTETracker
 JDETracker:
@@ -13,14 +13,3 @@ JDETracker:
  match_thres: 0.9
  min_box_area: 0
  vertical_ratio: 0 # 1.6 for pedestrian
-DeepSORTTracker:
-  input_size: [64, 192]
-  min_box_area: 0
-  vertical_ratio: -1
-  budget: 100
-  max_age: 70
-  n_init: 3
-  metric_type: cosine
-  matching_threshold: 0.2
-  max_iou_distance: 0.9
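
The config comment notes that `min_box_area` and `vertical_ratio` are tuned for pedestrians. A minimal sketch of how such thresholds are commonly applied to tracker output follows. It assumes boxes in `(x, y, w, h)` form, that a non-positive `vertical_ratio` disables the ratio check, and that boxes wider than `vertical_ratio` times their height are dropped; `keep_box` and `filter_boxes` are hypothetical helpers, not functions from the PP-Human pipeline.

```python
from typing import List, Tuple

# (x, y, w, h): top-left corner plus width and height, in pixels.
Box = Tuple[float, float, float, float]


def keep_box(box: Box, min_box_area: float = 0, vertical_ratio: float = 0) -> bool:
    """Return True if the box passes the area and width/height checks."""
    _, _, w, h = box
    if w <= 0 or h <= 0:
        return False  # degenerate box
    if w * h <= min_box_area:
        return False  # too small to be a reliable target
    if vertical_ratio > 0 and w / h > vertical_ratio:
        return False  # too wide for an upright pedestrian
    return True


def filter_boxes(boxes: List[Box], min_box_area: float = 0,
                 vertical_ratio: float = 0) -> List[Box]:
    """Keep only the boxes that pass both checks."""
    return [b for b in boxes if keep_box(b, min_box_area, vertical_ratio)]


if __name__ == "__main__":
    boxes = [(0, 0, 50, 120), (0, 0, 200, 100), (0, 0, 3, 3)]
    # Defaults from the config above (0 and 0): every valid box is kept.
    print(filter_boxes(boxes))
    # Pedestrian-style setting: drop tiny boxes and boxes wider than 1.6x their height.
    print(filter_boxes(boxes, min_box_area=10, vertical_ratio=1.6))
```

With the shipped values (`min_box_area: 0`, `vertical_ratio: 0`) nothing is filtered, which keeps the config usable for non-pedestrian objects.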
--- a/deploy/pphuman/docs/mot.md
+++ b/deploy/pphuman/docs/mot.md
@@ -6,9 +6,10 @@
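 | Task                 | Algorithm | Precision | Inference Speed (ms) | Download Link                                                                               |
 |:---------------------|:---------:|:------:|:------:| :---------------------------------------------------------------------------------: |
-| Pedestrian Detection/Tracking    |  PP-YOLOE | mAP: 56.3 <br> MOTA: 72.0 | Detection: 28ms <br> Tracking: 33.1ms | [Download Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) |
+| Pedestrian Detection/Tracking    |  PP-YOLOE-l | mAP: 56.6 <br> MOTA: 79.5 | Detection: 28.0ms <br> Tracking: 33.1ms | [Download Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) |
+| Pedestrian Detection/Tracking    |  PP-YOLOE-s | mAP: 53.2 <br> MOTA: 69.1 | Detection: 22.1ms <br> Tracking: 27.2ms | [Download Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_s_36e_pipeline.zip) |
-1. The detection/tracking model precision is obtained by training and testing on a combination of MOT17, CrowdHuman, HIEVE, and some business data
+1. The detection/tracking model precision is obtained by training and testing on a combination of [COCO-Person](http://cocodataset.org/), [CrowdHuman](http://www.crowdhuman.org/), [HIEVE](http://humaninevents.org/), and some business data; the validation set is business data
 2. Inference speed is measured on a T4 machine using TensorRT FP16 and covers the full pipeline: data pre-processing, model inference, and post-processing
 ## How to Use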
@@ -53,8 +54,8 @@ python deploy/pphuman/pipeline.py --config deploy/pphuman/config/infer_cfg.yml \
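 ## Solution Description
-1. Object detection/multi-object tracking obtains the pedestrian detection boxes from the image/video input. The model is PP-YOLOE; for details, see [PP-YOLOE](../../../configs/ppyoloe/README_cn.md)
+1. Object detection/multi-object tracking obtains the pedestrian detection boxes from the image/video input. The model is PP-YOLOE; for details, see [PP-YOLOE](../../../configs/ppyoloe)
-2. The multi-object tracking solution is based on [ByteTrack](https://arxiv.org/pdf/2110.06864.pdf), with PP-YOLOE replacing the original YOLOX as the detector and BYTETracker as the tracker.
+2. The multi-object tracking solution is based on [ByteTrack](https://arxiv.org/pdf/2110.06864.pdf), with PP-YOLOE replacing the original YOLOX as the detector and BYTETracker as the tracker; for details, see [ByteTrack](../../../configs/mot/bytetrack)
 ## Reference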
 ```

--- a/deploy/pphuman/docs/mot_en.md
+++ b/deploy/pphuman/docs/mot_en.md
@@ -6,9 +6,10 @@ Pedestrian detection and tracking is widely used in the intelligent community, i
 | Task                 | Algorithm | Precision | Inference Speed(ms) | Download Link                                                                               |
 |:---------------------|:---------:|:------:|:------:| :---------------------------------------------------------------------------------: |
-| Pedestrian Detection/ Tracking    |  PP-YOLOE | mAP: 56.3 <br> MOTA: 72.0 | Detection: 28ms <br> Tracking：33.1ms | [Download Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) |
+| Pedestrian Detection/ Tracking    |  PP-YOLOE-l | mAP: 56.6 <br> MOTA: 79.5 | Detection: 28.0ms <br> Tracking：33.1ms | [Download Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) |
+| Pedestrian Detection/ Tracking    |  PP-YOLOE-s | mAP: 53.2 <br> MOTA: 69.2 | Detection: 22.1ms <br> Tracking：27.2ms | [Download Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_s_36e_pipeline.zip) |
-1. The precision of the pedestrian detection/tracking model is obtained by training and testing on [MOT17](https://motchallenge.net/), [CrowdHuman](http://www.crowdhuman.org/), [HIEVE](http://humaninevents.org/) and some business data.
+1. The precision of the pedestrian detection/tracking model is obtained by training and testing on [COCO-Person](http://cocodataset.org/), [CrowdHuman](http://www.crowdhuman.org/), [HIEVE](http://humaninevents.org/) and some business data.
 2. The inference speed is measured with TensorRT FP16 on a T4 and covers the whole pipeline: data pre-processing, model inference, and post-processing.
 ## How to Use
@@ -57,7 +58,7 @@ Data source and copyright owner：Skyinfor Technology. Thanks for the provision
 1. Get the pedestrian detection box of the image/ video input through object detection and multi-object tracking. The adopted model is PP-YOLOE, and for details, please refer to [PP-YOLOE](../../../configs/ppyoloe).
-2. The multi-object tracking solution is based on [ByteTrack](https://arxiv.org/pdf/2110.06864.pdf), replacing the original YOLOX with PP-YOLOE as the detector and using BYTETracker as the tracker.
+2. The multi-object tracking solution is based on [ByteTrack](https://arxiv.org/pdf/2110.06864.pdf), replacing the original YOLOX with PP-YOLOE as the detector and using BYTETracker as the tracker. For details, please refer to [ByteTrack](../../../configs/mot/bytetrack).
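
The core idea of ByteTrack's association step is to match high-score detections to existing tracks first, then give the still-unmatched tracks a second chance against low-score detections. The sketch below illustrates only that two-pass idea. It is simplified on purpose: it uses greedy IoU matching instead of the Hungarian assignment and Kalman-predicted boxes used by the original method, and the threshold defaults and the `byte_associate` / `greedy_match` names are hypothetical, not values or APIs from BYTETracker or the PP-Human config.

```python
from typing import Dict, List, Set, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two corner-format boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_match(tracks: Dict[int, Box], dets: List[Tuple[int, Box]],
                 iou_thres: float) -> Dict[int, int]:
    """Pair track ids with detection indices in order of descending IoU."""
    pairs = sorted(((iou(tb, db), tid, di)
                    for tid, tb in tracks.items() for di, db in dets),
                   reverse=True)
    matches: Dict[int, int] = {}
    used: Set[int] = set()
    for score, tid, di in pairs:
        if score < iou_thres:
            break  # remaining pairs overlap even less
        if tid in matches or di in used:
            continue
        matches[tid] = di
        used.add(di)
    return matches


def byte_associate(tracks: Dict[int, Box], detections: List[Tuple[Box, float]],
                   high_thres: float = 0.6, low_thres: float = 0.1,
                   iou_thres: float = 0.3):
    """Return (matches, unmatched_track_ids, new_track_detection_indices)."""
    high = [(i, b) for i, (b, s) in enumerate(detections) if s >= high_thres]
    low = [(i, b) for i, (b, s) in enumerate(detections) if low_thres <= s < high_thres]
    # Pass 1: high-score detections against all tracks.
    matches = greedy_match(tracks, high, iou_thres)
    # Pass 2: low-score detections against tracks left unmatched by pass 1.
    remaining = {tid: b for tid, b in tracks.items() if tid not in matches}
    matches.update(greedy_match(remaining, low, iou_thres))
    unmatched = [tid for tid in tracks if tid not in matches]
    matched_dets = set(matches.values())
    # Only unmatched high-score detections are allowed to start new tracks.
    new_tracks = [i for i, _ in high if i not in matched_dets]
    return matches, unmatched, new_tracks


if __name__ == "__main__":
    tracks = {1: (0, 0, 10, 10), 2: (20, 20, 30, 30)}
    detections = [((1, 1, 11, 11), 0.9), ((21, 21, 31, 31), 0.3), ((50, 50, 60, 60), 0.8)]
    print(byte_associate(tracks, detections))
```

In this toy input, track 2 is recovered only through the low-score pass, which is the case the two-pass scheme is designed to handle, for example a partially occluded pedestrian.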
 ## Reference
 ```