{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. PP-YOLOv2模型简介\n", "YOLO系列作为目标检测的重要算法,采用单阶段(one-stage)方法使得检测速度大幅提升,但是速度的提升也牺牲了部分准确率作为代价。因此,如何在提升YOLOv3的准确性的同时保持推理速度成为了其实际应用时的关键问题。为同时满足准确性与高效性,PP-YOLOv2作者团队做了大量优化工作,PP-YOLOv2(R50)在COCO test数据集mAP从45.9%达到了49.5%,相较v1提升了3.6个百分点,FP32 FPS高达68.9FPS,FP16 FPS高达106.5FPS,超越了YOLOv4甚至YOLOv5!如果使用RestNet101作为骨架网络,PP-YOLOv2(R101)的mAP更高达50.3%,并且比同等精度下的YOLOv5x快15.9%!\n", "\n", "PP-YOLO模型由飞桨官方出品,是PaddleDetection优化和改进的YOLOv3的模型。\n", "更多关于PaddleDetection可以点击https://github.com/PaddlePaddle/PaddleDetection 进行了解。\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. 模型效果及应用场景\n", "### 2.1 目标检测任务:\n", "\n", "#### 2.1.1 数据集:\n", "\n", "数据集以COCO格式为主,分为训练集和测试集。\n", "\n", "#### 2.1.2 模型效果速览:\n", "\n", "PP-YOLOv2在图片上的检测效果为:\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", "\n", "
\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. 模型如何使用\n", "\n", "### 3.1 模型推理:\n", "* 下载 \n", "\n", "(不在Jupyter Notebook上运行时需要将\"!\"或者\"%\"去掉。)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "scrolled": true, "tags": [] }, "outputs": [], "source": [ "# 克隆PaddleDetection仓库\n", "%mkdir -p ~/work\n", "%cd ~/work/\n", "!git clone https://github.com/PaddlePaddle/PaddleDetection.git\n", "\n", "# 安装其他依赖\n", "%cd PaddleDetection\n", "%mkdir -p demo_input demo_output\n", "!pip install -r requirements.txt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* 验证是否安装成功\n", "如果报错,只需执行上一步操作。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# 测试是否安装成功\n", "!python ppdet/modeling/tests/test_architectures.py" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* 快速体验\n", "\n", "恭喜! 您已经成功安装了PaddleDetection,接下来快速体验目标检测效果" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true, "tags": [] }, "outputs": [], "source": [ "# 在GPU上预测一张图片\n", "!export CUDA_VISIBLE_DEVICES=0\n", "!wget -P demo_input -N https://paddledet.bj.bcebos.com/modelcenter/images/General/000000014439.jpg\n", "!python tools/infer.py -c configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml -o use_gpu=true weights=https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams --infer_img=demo_input/000000014439.jpg" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "会在output文件夹下生成一个画有预测结果的同名图像。\n", "\n", "结果如下图:\n", "
\n", "\n", "
\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3.2 模型训练:\n", "* 克隆PaddleDetection仓库(详见3.1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* 准备数据集\n", "\n", "这里需要开发者自行准备数据集,以下举例假设开发者已经准备好wider_face数据集并且解压到 PaddleDetection/dataset/wider_face/下\n", "通过以下命令确认数据集已经准备完成。\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* 修改yaml配置文件\n", "\n", "\n", "\n", "修改配置文件``` configs/runtime.yml```\n", "\n", "```\n", "use_gpu: true # 是否使用GPU\n", "log_iter: 20 # 每多少个迭代次显示\n", "save_dir: output # 模型保存目录\n", "snapshot_epoch: 1 # 多少个epoch保存一次\n", "print_flops: false\n", "\n", "```\n", "修改配置文件``` configs/datasets/coco_detection.yml```\n", "\n", "```\n", "metric: COCO \n", "num_classes: 1 # 分类个数 \n", "\n", "TrainDataset:\n", " !COCODataSet\n", " image_dir: WIDER_train/images # 训练图像数据基于数据集根目录的相对路径\n", " anno_path: WIDERFaceTrainCOCO.json # 训练标注文件基于数据集根目录的相对路径\n", " dataset_dir: dataset/wider_face # 数据集根目录\n", " data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']\n", "\n", "EvalDataset:\n", " !COCODataSet\n", " image_dir: WIDER_val/images # 测试图像数据基于数据集根目录的相对路径\n", " anno_path: WIDERFaceValCOCO.json # 测试标注文件基于数据集根目录的相对路径\n", " dataset_dir: dataset/wider_face\n", "\n", "TestDataset:\n", " !ImageFolder\n", " anno_path: WIDERFaceValCOCO.json\n", " \n", "```\n", "修改配置文件``` configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml```\n", "\n", "```\n", "_BASE_: [\n", " '../datasets/coco_detection.yml',\n", " '../runtime.yml',\n", " './_base_/ppyolov2_r50vd_dcn.yml',\n", " './_base_/optimizer_365e.yml',\n", " './_base_/ppyolov2_reader.yml',\n", "]\n", "\n", "snapshot_epoch: 8 #每训练多少个epoch保存一次模型\n", "weights: output/ppyolov2_r50vd_dcn_365e_coco/model_final\n", "```\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* 训练模型" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true, "tags": [] }, "outputs": [], "source": [ "%cd ~/work/PaddleDetection/\n", "%env CUDA_VISIBLE_DEVICES=0\n", "#开始训练\n", "!python tools/train.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml --use_vdl=true " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* 模型评估\n", "\n", "我们提供了```configs/ppyolo/ppyolo_test.yml```用于评估COCO test-dev2017数据集的效果,评估COCO test-dev2017数据集的效果须先从COCO数据集下载页下载test-dev2017数据集,解压到```configs/ppyolo/ppyolo_test.yml```中EvalReader.dataset中配置的路径,并使用如下命令进行评估(需附上模型平均精度AP和AR评估指标,提供图片或者表格)。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "%cd ~/work/PaddleDetection/\n", "%env CUDA_VISIBLE_DEVICES=0\n", "#训练完以后,进行评估\n", "!python tools/eval.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml -o use_gpu=true" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4. 模型原理" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* 采用 Path Aggregation Network(路径聚合网络)设计 Detection Net\n", "\n", "PP-YOLOv2 采用了 FPN 的变形之一—PAN(Path Aggregation Network)来从上至下的聚合特征信息。\n", "\n", "
\n", "\n", "
\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* 采用 Mish 激活函数\n", "\n", "PP-YOLOv2的mish 激活函数应用在了 detection neck 而不是骨架网络上。\n", "\n", "* 更大的输入尺寸\n", "\n", "增加输入尺寸直接带来了目标面积的扩大。这样,网络可以更容易捕捉到小尺幅目标的信息,得到更高的性能。然而,更大的输入会带来更多的内存占用。所以在使用这个策略的同时,PP-YOLOv2同时减少 Batch Size。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 5. 注意事项\n", "\n", "不管是 PP-YOLO 还是 PP-YOLOv2,都是在寻找在产业实践中最高性价比的目标检测方案,而不是单纯的以提升单阶段目标检测的精度去堆网络和策略。关于PP-YOLOv2的论文中也特别提到,是以实验报告的角度来为业界开发者展示更多网络优化的方法,这些策略也可以被应用在其他网络的优化上,希望在给业界开发者带来更好的网络的同时,也带来更多的算法优化启发。同时,在使用PP-YOLO系列时也应当注意:\n", "\n", "\n", "* PP-YOLO模型使用COCO数据集中的train2017作为训练集,使用val2017和test-dev2017作为测试集,Box APtest为mAP(IoU=0.5:0.95)评估结果。\n", "* PP-YOLO模型训练过程中使用8 GPUs,每GPU batch size为24进行训练,如训练GPU数和batch size不使用上述配置,须参考FAQ调整学习率和迭代次数。\n", "* PP-YOLO模型推理速度测试采用单卡V100,batch size=1进行测试,使用CUDA 10.2, CUDNN 7.5.1,TensorRT推理速度测试使用TensorRT 5.1.2.2。\n", "* PP-YOLO模型FP32的推理速度测试数据为使用tools/export_model.py脚本导出模型后,使用deploy/python/infer.py脚本中的--run_benchnark参数使用Paddle预测库进行推理速度benchmark测试结果, 且测试的均为不包含数据预处理和模型输出后处理(NMS)的数据(与YOLOv4(AlexyAB)测试方法一致)。\n", "* TensorRT FP16的速度测试相比于FP32去除了yolo_box(bbox解码)部分耗时,即不包含数据预处理,bbox解码和NMS(与YOLOv4(AlexyAB)测试方法一致)。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 6. 相关论文以及引用信息\n", "(如果本模型有相关论文发表,或者是基于某些论文的结果,可以在这里\n", "提供Bibtex格式的参考文献。)\n", "```\n", "@article{huang2021pp,\n", " title={PP-YOLOv2: A Practical Object Detector},\n", " author={Huang, Xin and Wang, Xinxin and Lv, Wenyu and Bai, Xiaying and Long, Xiang and Deng, Kaipeng and Dang, Qingqing and Han, Shumin and Liu, Qiwen and Hu, Xiaoguang and others},\n", " journal={arXiv preprint arXiv:2104.10419},\n", " year={2021}\n", "}\n", "```\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3.8.13 ('paddle_env')", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.13" }, "vscode": { "interpreter": { "hash": "864bc28e4d94d9c1c4bd0747e4313c0ab41718ab445ced17dbe1a405af5ecc64" } } }, "nbformat": 4, "nbformat_minor": 4 }