{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. PP-YOLOv2模型简介\n",
    "YOLO系列作为目标检测的重要算法,采用单阶段(one-stage)方法使得检测速度大幅提升,但是速度的提升也牺牲了部分准确率作为代价。因此,如何在提升YOLOv3的准确性的同时保持推理速度成为了其实际应用时的关键问题。为同时满足准确性与高效性,PP-YOLOv2作者团队做了大量优化工作,PP-YOLOv2(R50)在COCO test数据集mAP从45.9%达到了49.5%,相较v1提升了3.6个百分点,FP32 FPS高达68.9FPS,FP16 FPS高达106.5FPS,超越了YOLOv4甚至YOLOv5!如果使用RestNet101作为骨架网络,PP-YOLOv2(R101)的mAP更高达50.3%,并且比同等精度下的YOLOv5x快15.9%!\n",
    "\n",
    "PP-YOLO模型由飞桨官方出品,是PaddleDetection优化和改进的YOLOv3的模型。\n",
    "更多关于PaddleDetection可以点击https://github.com/PaddlePaddle/PaddleDetection 进行了解。\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. 模型效果及应用场景\n",
    "### 2.1 目标检测任务:\n",
    "\n",
    "#### 2.1.1 数据集:\n",
    "\n",
    "数据集以COCO格式为主,分为训练集和测试集。\n",
    "\n",
    "#### 2.1.2 模型效果速览:\n",
    "\n",
    "PP-YOLOv2在图片上的检测效果为:\n",
    "\n",
    "<div align=\"center\">\n",
    "<img src=\"https://user-images.githubusercontent.com/23690325/198869600-b7a549db-2cc6-49b1-8009-937fb5abe992.png\"  width = \"80%\"  />\n",
    "</div>\n",
    "\n",
    "<div align=\"center\">\n",
    "<img src=\"https://user-images.githubusercontent.com/23690325/198869611-451eda5f-eda6-4717-902c-b9b06070bc72.png\"  width = \"80%\"  />\n",
    "</div>\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. 模型如何使用\n",
    "\n",
    "### 3.1 模型推理:\n",
    "* 下载 \n",
    "\n",
    "(不在Jupyter Notebook上运行时需要将\"!\"或者\"%\"去掉。)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "scrolled": true,
    "tags": []
   },
   "outputs": [],
   "source": [
    "%cd ~/work\n",
    "# 克隆PaddleDetection(从gitee上更快),本项目以做持久化处理,不用克隆了。\n",
    "!git clone https://gitee.com/paddlepaddle/PaddleDetection"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* 安装"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true,
    "tags": []
   },
   "outputs": [],
   "source": [
    "# 运行脚本需在PaddleDetection目录下\n",
    "%cd ~/work/PaddleDetection/\n",
    "\n",
    "# 安装所需依赖项【已经做持久化处理,无需再安装】\n",
    "!pip install pyzmq   \n",
    "!pip install -r requirements.txt\n",
    "\n",
    "# 运行脚本需在PaddleDetection目录下\n",
    "%cd ~/work/PaddleDetection/\n",
    "# 设置python运行目录\n",
    "%env PYTHONPATH=.:$PYTHONPATH\n",
    "# 设置GPU\n",
    "%env CUDA_VISIBLE_DEVICES=0\n",
    "\n",
    "# 经简单测试,提前安装所需依赖,比直接使用setup.py更快\n",
    "!pip install pycocotools  \n",
    "!pip install cython-bbox      \n",
    "!pip install xmltodict  \n",
    "!pip install terminaltables    \n",
    "!pip intall motmetrics  \n",
    "!pip install lap    \n",
    "!pip install shapely      \n",
    "!pip install pytest-benchmark    \n",
    "!pip install pytest    \n",
    "\n",
    "\n",
    "# 开始安装PaddleDetection \n",
    "!python setup.py install  #如果安装过程中长时间卡住,可中断后继续重新执行,"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* 验证是否安装成功\n",
    "如果报错,只需执行上一步操作。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# 测试是否安装成功\n",
    "!python ppdet/modeling/tests/test_architectures.py"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* 快速体验\n",
    "\n",
    "恭喜! 您已经成功安装了PaddleDetection,接下来快速体验目标检测效果"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true,
    "tags": []
   },
   "outputs": [],
   "source": [
    "# 在GPU上预测一张图片\n",
156 157
    "!export CUDA_VISIBLE_DEVICES=0\n",
    "!python tools/infer.py -c configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml -o use_gpu=true weights=https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams --infer_img=demo/000000014439.jpg"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "会在output文件夹下生成一个画有预测结果的同名图像。\n",
    "\n",
    "结果如下图:\n",
    "<div align=\"center\">\n",
    "<img src=\"https://bj.bcebos.com/v1/paddledet/modelcenter/images/ppyolov2_infer.jpg\"  width = \"60%\"  />\n",
    "</div>\n",
    "\n",
    "\n"
   ]
  },
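  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The next cell is an optional, minimal sketch for viewing the saved prediction directly in the notebook. It assumes the demo image 000000014439.jpg used above, so the output file is output/000000014439.jpg; adjust the path if you ran inference on a different image.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Optional sketch: display the annotated image written by tools/infer.py.\n",
    "# The path assumes the demo image used above; change it if needed.\n",
    "from PIL import Image\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "img = Image.open('output/000000014439.jpg')\n",
    "plt.figure(figsize=(8, 6))\n",
    "plt.imshow(img)\n",
    "plt.axis('off')\n",
    "plt.show()"
   ]
  },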
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3.2 模型训练:\n",
    "* 克隆PaddleDetection仓库(详见3.1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* 准备数据集\n",
    "\n",
    "这里需要开发者自行准备数据集,以下举例假设开发者已经准备好wider_face数据集并且解压到 PaddleDetection/dataset/wider_face/下\n",
    "通过以下命令确认数据集已经准备完成。\n",
    " "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# 查看解压目录\n",
    "#%cd ~/work/PaddleDetection/\n",
    "#!tree -d dataset/wider_face"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* 修改yaml配置文件\n",
    "\n",
    "\n",
    "\n",
    "修改配置文件``` configs/runtime.yml```\n",
    "\n",
    "```\n",
    "use_gpu: true # 是否使用GPU\n",
    "log_iter: 20  # 每多少个迭代次显示\n",
    "save_dir: output # 模型保存目录\n",
    "snapshot_epoch: 1 # 多少个epoch保存一次\n",
    "print_flops: false\n",
    "\n",
    "```\n",
    "修改配置文件``` configs/datasets/coco_detection.yml```\n",
    "\n",
    "```\n",
    "metric: COCO    \n",
    "num_classes: 1  # 分类个数 \n",
    "\n",
    "TrainDataset:\n",
    "  !COCODataSet\n",
    "    image_dir: WIDER_train/images   # 训练图像数据基于数据集根目录的相对路径\n",
    "    anno_path: WIDERFaceTrainCOCO.json  # 训练标注文件基于数据集根目录的相对路径\n",
    "    dataset_dir: dataset/wider_face # 数据集根目录\n",
    "    data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']\n",
    "\n",
    "EvalDataset:\n",
    "  !COCODataSet\n",
    "    image_dir: WIDER_val/images     # 测试图像数据基于数据集根目录的相对路径\n",
    "    anno_path: WIDERFaceValCOCO.json     # 测试标注文件基于数据集根目录的相对路径\n",
    "    dataset_dir: dataset/wider_face\n",
    "\n",
    "TestDataset:\n",
    "  !ImageFolder\n",
    "    anno_path: WIDERFaceValCOCO.json\n",
    "    \n",
    "```\n",
    "修改配置文件``` configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml```\n",
    "\n",
    "```\n",
    "_BASE_: [\n",
    "  '../datasets/coco_detection.yml',\n",
    "  '../runtime.yml',\n",
    "  './_base_/ppyolov2_r50vd_dcn.yml',\n",
    "  './_base_/optimizer_365e.yml',\n",
    "  './_base_/ppyolov2_reader.yml',\n",
    "]\n",
    "\n",
    "snapshot_epoch: 8   #每训练多少个epoch保存一次模型\n",
    "weights: output/ppyolov2_r50vd_dcn_365e_coco/model_final\n",
    "```\n"
   ]
  },
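  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before training, it can help to check that the COCO-format annotation file matches the paths and num_classes configured above. The next cell is a minimal sketch using pycocotools (installed earlier); the file name WIDERFaceTrainCOCO.json follows the example configuration above and must be replaced with your own annotation file.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Optional sketch: sanity-check the COCO-format annotations referenced in\n",
    "# configs/datasets/coco_detection.yml (paths follow the example above).\n",
    "from pycocotools.coco import COCO\n",
    "\n",
    "coco = COCO('dataset/wider_face/WIDERFaceTrainCOCO.json')\n",
    "print('num_classes =', len(coco.getCatIds()))   # should match num_classes in the yml\n",
    "print('num_images  =', len(coco.getImgIds()))\n",
    "print('num_boxes   =', len(coco.getAnnIds()))"
   ]
  },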
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* 训练模型"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true,
    "tags": []
   },
   "outputs": [],
   "source": [
    "%cd ~/work/PaddleDetection/\n",
    "%env CUDA_VISIBLE_DEVICES=0\n",
    "#开始训练\n",
    "!python  tools/train.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml  --use_vdl=true "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* 模型评估\n",
    "\n",
    "我们提供了```configs/ppyolo/ppyolo_test.yml```用于评估COCO test-dev2017数据集的效果,评估COCO test-dev2017数据集的效果须先从COCO数据集下载页下载test-dev2017数据集,解压到```configs/ppyolo/ppyolo_test.yml```中EvalReader.dataset中配置的路径,并使用如下命令进行评估(需附上模型平均精度AP和AR评估指标,提供图片或者表格)。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "%cd ~/work/PaddleDetection/\n",
    "%env CUDA_VISIBLE_DEVICES=0\n",
    "#训练完以后,进行评估\n",
    "!python tools/eval.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml  -o use_gpu=true"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. 模型原理\n",
    "\n",
    "\n",
    "(必须带有图片)\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* 采用 Path Aggregation Network(路径聚合网络)设计 Detection Net\n",
    "\n",
    "PP-YOLOv2 采用了 FPN 的变形之一—PAN(Path Aggregation Network)来从上至下的聚合特征信息。\n",
    "\n",
    "<div align=\"center\">\n",
    "<img src=\"https://user-images.githubusercontent.com/23690325/198870025-f7f0e976-0fb0-4496-b359-b0e2a751cab1.png\"  width = \"80%\"  />\n",
    "</div>\n"
   ]
  },
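  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The next cell is a minimal, schematic sketch of the PAN idea: an FPN-style top-down path followed by an extra bottom-up path. It is written with plain paddle.nn layers for illustration only; the class name, channel settings, and fusion convolutions are assumptions made for this example and it is not PP-YOLOv2's actual neck implementation.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Schematic PAN sketch (illustration only, not the PP-YOLOv2 implementation).\n",
    "import paddle\n",
    "import paddle.nn as nn\n",
    "import paddle.nn.functional as F\n",
    "\n",
    "class TinyPAN(nn.Layer):\n",
    "    def __init__(self, in_channels=(256, 512, 1024), out_channel=256):\n",
    "        super().__init__()\n",
    "        # 1x1 lateral convs unify the channel count of each pyramid level\n",
    "        self.lateral = nn.LayerList([nn.Conv2D(c, out_channel, 1) for c in in_channels])\n",
    "        # 3x3 convs that fuse features on the top-down and bottom-up paths\n",
    "        self.td_conv = nn.LayerList([nn.Conv2D(out_channel * 2, out_channel, 3, padding=1) for _ in in_channels[:-1]])\n",
    "        self.bu_conv = nn.LayerList([nn.Conv2D(out_channel * 2, out_channel, 3, padding=1) for _ in in_channels[:-1]])\n",
    "\n",
    "    def forward(self, feats):  # feats: [C3, C4, C5] at strides 8/16/32\n",
    "        lat = [conv(f) for conv, f in zip(self.lateral, feats)]\n",
    "        # top-down path (as in FPN): upsample deeper features and fuse downwards\n",
    "        td = [lat[-1]]\n",
    "        for i in range(len(lat) - 2, -1, -1):\n",
    "            up = F.interpolate(td[0], scale_factor=2, mode='nearest')\n",
    "            td.insert(0, self.td_conv[i](paddle.concat([lat[i], up], axis=1)))\n",
    "        # extra bottom-up path (the PAN addition): downsample and fuse upwards again\n",
    "        out = [td[0]]\n",
    "        for i in range(1, len(td)):\n",
    "            down = F.max_pool2d(out[-1], kernel_size=2, stride=2)\n",
    "            out.append(self.bu_conv[i - 1](paddle.concat([td[i], down], axis=1)))\n",
    "        return out\n",
    "\n",
    "# quick shape check with random feature maps\n",
    "feats = [paddle.rand([1, 256, 80, 80]), paddle.rand([1, 512, 40, 40]), paddle.rand([1, 1024, 20, 20])]\n",
    "print([tuple(o.shape) for o in TinyPAN()(feats)])"
   ]
  },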
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* 采用 Mish 激活函数\n",
    "\n",
    "PP-YOLOv2的mish 激活函数应用在了 detection neck 而不是骨架网络上。\n",
    "\n",
    "* 更大的输入尺寸\n",
    "\n",
    "增加输入尺寸直接带来了目标面积的扩大。这样,网络可以更容易捕捉到小尺幅目标的信息,得到更高的性能。然而,更大的输入会带来更多的内存占用。所以在使用这个策略的同时,PP-YOLOv2同时减少 Batch Size。"
   ]
  },
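  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick illustration, the next cell evaluates the Mish activation, mish(x) = x * tanh(softplus(x)), on a few values. This is only a standalone sketch of the function itself, not code taken from the PP-YOLOv2 neck.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Standalone sketch of the Mish activation: mish(x) = x * tanh(softplus(x))\n",
    "import paddle\n",
    "import paddle.nn.functional as F\n",
    "\n",
    "def mish(x):\n",
    "    return x * paddle.tanh(F.softplus(x))\n",
    "\n",
    "x = paddle.linspace(-5.0, 5.0, 11)\n",
    "print(mish(x).numpy())  # smooth and non-monotonic, with a small negative dip below zero"
   ]
  },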
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5. 注意事项\n",
    "\n",
    "不管是 PP-YOLO 还是 PP-YOLOv2,都是在寻找在产业实践中最高性价比的目标检测方案,而不是单纯的以提升单阶段目标检测的精度去堆网络和策略。关于PP-YOLOv2的论文中也特别提到,是以实验报告的角度来为业界开发者展示更多网络优化的方法,这些策略也可以被应用在其他网络的优化上,希望在给业界开发者带来更好的网络的同时,也带来更多的算法优化启发。同时,在使用PP-YOLO系列时也应当注意:\n",
    "\n",
    "\n",
    "* PP-YOLO模型使用COCO数据集中的train2017作为训练集,使用val2017和test-dev2017作为测试集,Box APtest为mAP(IoU=0.5:0.95)评估结果。\n",
    "* PP-YOLO模型训练过程中使用8 GPUs,每GPU batch size为24进行训练,如训练GPU数和batch size不使用上述配置,须参考FAQ调整学习率和迭代次数。\n",
    "* PP-YOLO模型推理速度测试采用单卡V100,batch size=1进行测试,使用CUDA 10.2, CUDNN 7.5.1,TensorRT推理速度测试使用TensorRT 5.1.2.2。\n",
    "* PP-YOLO模型FP32的推理速度测试数据为使用tools/export_model.py脚本导出模型后,使用deploy/python/infer.py脚本中的--run_benchnark参数使用Paddle预测库进行推理速度benchmark测试结果, 且测试的均为不包含数据预处理和模型输出后处理(NMS)的数据(与YOLOv4(AlexyAB)测试方法一致)。\n",
    "* TensorRT FP16的速度测试相比于FP32去除了yolo_box(bbox解码)部分耗时,即不包含数据预处理,bbox解码和NMS(与YOLOv4(AlexyAB)测试方法一致)。"
   ]
  },
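  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The next cell is a small, hedged sketch of the linear scaling rule commonly used when the GPU count or batch size differs from the reference setting of 8 GPUs with a per-GPU batch size of 24. The reference learning rate below is only a placeholder; take the actual base_lr and the exact adjustment procedure from the optimizer yml and the PaddleDetection FAQ.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Hedged sketch of the linear scaling rule for the learning rate.\n",
    "# ref_lr is a placeholder; read the real base_lr from the optimizer yml.\n",
    "ref_gpus, ref_bs, ref_lr = 8, 24, 0.005\n",
    "my_gpus, my_bs = 1, 12  # your actual training setup\n",
    "\n",
    "scaled_lr = ref_lr * (my_gpus * my_bs) / (ref_gpus * ref_bs)\n",
    "print(f'scaled base_lr = {scaled_lr:.6f}')"
   ]
  },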
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 6. 相关论文以及引用信息\n",
    "(如果本模型有相关论文发表,或者是基于某些论文的结果,可以在这里\n",
    "提供Bibtex格式的参考文献。)\n",
    "```\n",
    "@article{huang2021pp,\n",
    "  title={PP-YOLOv2: A Practical Object Detector},\n",
    "  author={Huang, Xin and Wang, Xinxin and Lv, Wenyu and Bai, Xiaying and Long, Xiang and Deng, Kaipeng and Dang, Qingqing and Han, Shumin and Liu, Qiwen and Hu, Xiaoguang and others},\n",
    "  journal={arXiv preprint arXiv:2104.10419},\n",
    "  year={2021}\n",
    "}\n",
    "```\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}