{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. PP-YOLOv2 Introduction\n",
    "\n",
    "The YOLO series is an important family of object detection algorithms. Its one-stage design greatly improves detection speed, but at some cost in accuracy. Improving the accuracy of YOLOv3 while maintaining inference speed has therefore become a key issue in practical applications. PP-YOLOv2 (R50) raises mAP on the COCO test dataset from 45.9% to 49.5%, an increase of 3.6 percentage points over v1. It reaches up to 68.9 FPS in FP32 and up to 106.5 FPS in FP16, surpassing YOLOv4 and even YOLOv5. With ResNet101 as the backbone network, PP-YOLOv2 (R101) reaches up to 50.3% mAP and is 15.9% faster than YOLOv5x at the same accuracy.\n",
    "\n",
    "PP-YOLO is an official PaddlePaddle model: a version of YOLOv3 optimized and improved by PaddleDetection. More information about PaddleDetection is available at https://github.com/PaddlePaddle/PaddleDetection.\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Model Effects and Application Scenarios\n",
    "### 2.1 Object Detection Tasks:\n",
    "\n",
    "#### 2.1.1 Datasets:\n",
    "\n",
    "The datasets are mainly in COCO format and are divided into a training set and a test set.\n",
    "\n",
    "#### 2.1.2 Model Effects:\n",
    "\n",
    "\n",
    "Sample detection results of PP-YOLOv2 on images:\n",
    "\n",
    "<div align=\"center\">\n",
    "<img src=\"https://user-images.githubusercontent.com/23690325/198869600-b7a549db-2cc6-49b1-8009-937fb5abe992.png\"  width = \"80%\"  />\n",
    "</div>\n",
    "\n",
    "<div align=\"center\">\n",
    "<img src=\"https://user-images.githubusercontent.com/23690325/198869611-451eda5f-eda6-4717-902c-b9b06070bc72.png\"  width = \"80%\"  />\n",
    "</div>\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. How to Use the Model\n",
    "\n",
    "### 3.1 Model Inference:\n",
    "* Download PaddleDetection\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "%cd /home/aistudio/work\n",
    "\n",
    "!git clone https://gitee.com/paddlepaddle/PaddleDetection"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* Installation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# The following commands need to be run in the PaddleDetection directory.\n",
    "%cd /home/aistudio/work/PaddleDetection/\n",
    "\n",
    "# On AI Studio with paddlepaddle 2.2.2, installing the requirements fails unless pyzmq is installed first.\n",
    "# These dependencies are persisted, so they do not need to be installed again.\n",
    "!pip install pyzmq -t /home/aistudio/external-libraries\n",
    "!pip install -r requirements.txt\n",
    "\n",
    "# Add the current directory to the Python path.\n",
    "%env PYTHONPATH=.:$PYTHONPATH\n",
    "# Select the GPU to use.\n",
    "%env CUDA_VISIBLE_DEVICES=0\n",
    "\n",
    "!pip install pycocotools\n",
    "!pip install cython-bbox\n",
    "!pip install xmltodict\n",
    "!pip install terminaltables\n",
    "!pip install motmetrics\n",
    "!pip install lap\n",
    "!pip install shapely\n",
    "!pip install pytest-benchmark\n",
    "!pip install pytest\n",
    "\n",
    "\n",
    "# Install PaddleDetection.\n",
    "!python setup.py install"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* Verify that the installation was successful.\n",
    "If an error is reported, re-run the previous step."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Run the architecture unit tests to verify the installation.\n",
    "!python ppdet/modeling/tests/test_architectures.py"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* Quick experience\n",
    "\n",
    "Congratulations! Now that you have successfully installed PaddleDetection, let's quickly try out object detection."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Predict a picture on the GPU.\n",
    "%env CUDA_VISIBLE_DEVICES=0\n",
    "!python tools/infer.py -c configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml -o use_gpu=true weights=https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams --infer_img=demo/000000014439.jpg"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "An image with the predicted result is generated under the output folder.\n",
    "\n",
    "The result is as follows:\n",
    "<div align=\"center\">\n",
    "<img src=\"https://bj.bcebos.com/v1/paddledet/modelcenter/images/ppyolov2_infer.jpg\"  width = \"60%\"  />\n",
    "</div>\n"
   ]
  },
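  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To check which result images were written, a minimal sketch such as the following may help. It assumes the results land in a folder named `output` relative to the PaddleDetection directory, as described above; `list_result_images` is a hypothetical helper name and is not part of PaddleDetection.\n",
    "\n",
    "```python\n",
    "from pathlib import Path\n",
    "\n",
    "\n",
    "def list_result_images(output_dir='output'):\n",
    "    # Return the image files found directly under output_dir, sorted by path.\n",
    "    # An empty list is returned when the folder does not exist yet.\n",
    "    root = Path(output_dir)\n",
    "    if not root.is_dir():\n",
    "        return []\n",
    "    exts = {'.jpg', '.jpeg', '.png', '.bmp'}\n",
    "    return sorted(p for p in root.iterdir() if p.suffix.lower() in exts)\n",
    "\n",
    "\n",
    "for path in list_result_images():\n",
    "    print(path)\n",
    "```\n"
   ]
  },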
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3.2 Model Training\n",
    "* Clone the PaddleDetection repository (see 3.1 for details)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* Prepare the datasets."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Return to /home/aistudio.\n",
    "%cd ~\n",
    "\n",
    "# Inspect the extracted dataset directory.\n",
    "%cd /home/aistudio/work/PaddleDetection/\n",
    "!tree -d dataset/wider_face"
   ]
  },
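  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before training, it can be useful to sanity-check a COCO-format annotation file. The sketch below assumes the standard COCO layout with top-level `images`, `annotations`, and `categories` lists. `summarize_coco` is a hypothetical helper, and the annotation path is an assumption based on the dataset configuration used in this tutorial.\n",
    "\n",
    "```python\n",
    "import json\n",
    "from pathlib import Path\n",
    "\n",
    "\n",
    "def summarize_coco(anno_path):\n",
    "    # Count images, annotations and categories in a COCO-format JSON file.\n",
    "    with open(anno_path, 'r', encoding='utf-8') as f:\n",
    "        data = json.load(f)\n",
    "    return {\n",
    "        'images': len(data.get('images', [])),\n",
    "        'annotations': len(data.get('annotations', [])),\n",
    "        'categories': len(data.get('categories', [])),\n",
    "    }\n",
    "\n",
    "\n",
    "# Assumed location of the training annotations; adjust to your dataset.\n",
    "anno = Path('dataset/wider_face/WIDERFaceTrainCOCO.json')\n",
    "if anno.exists():\n",
    "    print(summarize_coco(anno))\n",
    "```\n",
    "\n",
    "For a single-class dataset, the `categories` count would be expected to match `num_classes` in the dataset configuration."
   ]
  },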
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* Modify the YAML configuration files.\n",
    "\n",
    "\n",
    "\n",
    "Edit ```configs/runtime.yml```:\n",
    "\n",
    "```\n",
    "use_gpu: true\n",
    "log_iter: 20\n",
    "save_dir: output\n",
    "snapshot_epoch: 1\n",
    "print_flops: false\n",
    "\n",
    "```\n",
    "Edit ```configs/datasets/coco_detection.yml```:\n",
    "\n",
    "```\n",
    "metric: COCO\n",
    "num_classes: 1\n",
    "\n",
    "TrainDataset:\n",
    "  !COCODataSet\n",
    "    image_dir: WIDER_train/images\n",
    "    anno_path: WIDERFaceTrainCOCO.json\n",
    "    dataset_dir: dataset/wider_face\n",
    "    data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']\n",
    "\n",
    "EvalDataset:\n",
    "  !COCODataSet\n",
    "    image_dir: WIDER_val/images\n",
    "    anno_path: WIDERFaceValCOCO.json\n",
    "    dataset_dir: dataset/wider_face\n",
    "\n",
    "TestDataset:\n",
    "  !ImageFolder\n",
    "    anno_path: WIDERFaceValCOCO.json\n",
    "\n",
    "```\n",
    "Edit ```configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml```:\n",
    "\n",
    "```\n",
    "_BASE_: [\n",
    "  '../datasets/coco_detection.yml',\n",
    "  '../runtime.yml',\n",
    "  './_base_/ppyolov2_r50vd_dcn.yml',\n",
    "  './_base_/optimizer_365e.yml',\n",
    "  './_base_/ppyolov2_reader.yml',\n",
    "]\n",
    "\n",
    "snapshot_epoch: 8   \n",
    "weights: output/ppyolov2_r50vd_dcn_365e_coco/model_final\n",
    "```\n",
    "\n"
   ]
  },
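  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `_BASE_` list pulls in shared settings, and keys written in the top-level file take precedence over inherited ones, which is why `snapshot_epoch: 8` replaces the value `1` from `configs/runtime.yml`. The following is a simplified illustration of that override behaviour using plain dictionaries. `merge_config` here is a hypothetical function written for this example; it is not PaddleDetection's actual loader.\n",
    "\n",
    "```python\n",
    "def merge_config(base, override):\n",
    "    # Recursively merge override into a copy of base; override wins on conflicts.\n",
    "    merged = dict(base)\n",
    "    for key, value in override.items():\n",
    "        if isinstance(value, dict) and isinstance(merged.get(key), dict):\n",
    "            merged[key] = merge_config(merged[key], value)\n",
    "        else:\n",
    "            merged[key] = value\n",
    "    return merged\n",
    "\n",
    "\n",
    "# Values copied from the YAML snippets above.\n",
    "runtime = {'use_gpu': True, 'log_iter': 20, 'save_dir': 'output', 'snapshot_epoch': 1}\n",
    "top_level = {'snapshot_epoch': 8}\n",
    "print(merge_config(runtime, top_level)['snapshot_epoch'])  # prints 8\n",
    "```\n"
   ]
  },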
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* Train the model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "%cd /home/aistudio/work/PaddleDetection/\n",
    "%env CUDA_VISIBLE_DEVICES=0\n",
    "# Start training.\n",
    "!python tools/train.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml --use_vdl=true"
   ]
  },
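  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If training runs with a different number of GPUs or a different per-GPU batch size than the reference setup, the learning rate usually needs to change as well. A common heuristic is the linear scaling rule sketched below. Whether it matches the FAQ's recommendation exactly is an assumption, so treat the FAQ as authoritative. `scale_learning_rate` and the example base learning rate are hypothetical; the reference values of 8 GPUs and a batch size of 24 per GPU come from the notes in section 5.\n",
    "\n",
    "```python\n",
    "def scale_learning_rate(base_lr, num_gpus, batch_per_gpu, ref_gpus=8, ref_batch_per_gpu=24):\n",
    "    # Scale the learning rate in proportion to the total batch size.\n",
    "    if num_gpus <= 0 or batch_per_gpu <= 0:\n",
    "        raise ValueError('num_gpus and batch_per_gpu must be positive')\n",
    "    return base_lr * (num_gpus * batch_per_gpu) / (ref_gpus * ref_batch_per_gpu)\n",
    "\n",
    "\n",
    "# Placeholder base learning rate of 0.01; it is not taken from the config files.\n",
    "print(scale_learning_rate(0.01, num_gpus=1, batch_per_gpu=24))\n",
    "```\n"
   ]
  },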
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* Model evaluation\n",
    "\n",
    "We provide ```configs/ppyolo/ppyolo_test.yml``` for evaluating on the COCO test-dev2017 dataset. To do so, first download the test-dev2017 dataset from the COCO dataset download page and extract it to the path configured in EvalReader.dataset in ```configs/ppyolo/ppyolo_test.yml```, then run the evaluation with the following command."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "%cd /home/aistudio/work/PaddleDetection/\n",
    "%env CUDA_VISIBLE_DEVICES=0\n",
    "\n",
    "!python tools/eval.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml  -o use_gpu=true"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Model Principles\n",
    "* Design the detection neck with a Path Aggregation Network\n",
    "\n",
    "PP-YOLOv2 uses PAN (Path Aggregation Network), a variant of FPN, to aggregate feature information from top to bottom.\n",
    "\n",
    "![](https://ai-studio-static-online.cdn.bcebos.com/5f047e2e5f3c47efbb81c6cf3d81415e531133c1feff4f36a2cc13f88210ab69)\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* Use the Mish activation function\n",
    "\n",
    "PP-YOLOv2 applies the Mish activation function in the detection neck rather than in the backbone network.\n",
    "\n",
    "* Larger input size\n",
    "\n",
    "Increasing the input size directly enlarges the area occupied by targets, which makes it easier for the network to capture information about small targets and improves performance. However, larger inputs also increase memory usage, so PP-YOLOv2 reduces the batch size when using this strategy."
   ]
  },
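  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For reference, Mish is commonly defined as `mish(x) = x * tanh(softplus(x))` with `softplus(x) = ln(1 + e^x)`. The snippet below is a minimal scalar sketch of that definition in plain Python. It is meant only to illustrate the formula and is not the PaddleDetection implementation.\n",
    "\n",
    "```python\n",
    "import math\n",
    "\n",
    "\n",
    "def softplus(x):\n",
    "    # Numerically stable ln(1 + exp(x)).\n",
    "    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))\n",
    "\n",
    "\n",
    "def mish(x):\n",
    "    # Mish activation: x * tanh(softplus(x)).\n",
    "    return x * math.tanh(softplus(x))\n",
    "\n",
    "\n",
    "for value in (-2.0, 0.0, 2.0):\n",
    "    print(value, mish(value))\n",
    "```\n",
    "\n",
    "Unlike ReLU, Mish is smooth and lets small negative values pass through, while behaving almost like the identity for large positive inputs."
   ]
  },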
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5. Notes\n",
    "\n",
    "Both PP-YOLO and PP-YOLOv2 aim for the most cost-effective object detection solution in industrial practice, rather than simply stacking networks and strategies to improve the accuracy of single-stage object detection. The PP-YOLOv2 paper also notes that it is written in the style of an experimental report in order to show industry developers more network optimization methods. These strategies can also be applied to other networks, in the hope of giving industry developers better networks and more inspiration for algorithm optimization. When using the PP-YOLO series, also note the following:\n",
    "\n",
    "\n",
    "* The PP-YOLO model uses train2017 from the COCO dataset as the training set and val2017 and test-dev2017 as the test sets. Box AP<sup>test</sup> is the evaluation result of mAP (IoU=0.5:0.95).\n",
    "* PP-YOLO is trained on 8 GPUs with a batch size of 24 per GPU. If your number of GPUs or batch size differs from this configuration, refer to the FAQ to adjust the learning rate and the number of iterations.\n",
    "* PP-YOLO inference speed is tested on a single V100 with batch size=1, CUDA 10.2, and CUDNN 7.5.1; the TensorRT inference speed test uses TensorRT 5.1.2.2.\n",
    "* The FP32 inference speed of the PP-YOLO model is the benchmark result obtained by exporting the model with the tools/export_model.py script and then running the Paddle prediction library with the --run_benchmark parameter of the deploy/python/infer.py script. The measurement excludes data preprocessing and model output post-processing (NMS), consistent with the YOLOv4 (AlexeyAB) test method.\n",
    "* Compared with FP32, the TensorRT FP16 speed test additionally excludes the time spent in yolo_box (bbox decoding); that is, it excludes data preprocessing, bbox decoding, and NMS, consistent with the YOLOv4 (AlexeyAB) test method.\n",
    "\n"
   ]
  },
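  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `IoU=0.5:0.95` notation refers to averaging AP over IoU thresholds from 0.5 to 0.95 in steps of 0.05. As a small illustration, the sketch below computes the IoU of two axis-aligned boxes and lists those thresholds. It assumes boxes in `(x1, y1, x2, y2)` format and uses hypothetical function names; actual evaluation is done by the COCO tooling, not by this code.\n",
    "\n",
    "```python\n",
    "def box_iou(box_a, box_b):\n",
    "    # Intersection over union of two boxes given as (x1, y1, x2, y2).\n",
    "    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])\n",
    "    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])\n",
    "    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)\n",
    "    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])\n",
    "    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])\n",
    "    union = area_a + area_b - inter\n",
    "    return inter / union if union > 0 else 0.0\n",
    "\n",
    "\n",
    "# The ten IoU thresholds: 0.50, 0.55, ..., 0.95.\n",
    "thresholds = [round(0.5 + 0.05 * i, 2) for i in range(10)]\n",
    "\n",
    "iou = box_iou((0, 0, 2, 2), (1, 1, 3, 3))\n",
    "# Thresholds at which this pair of boxes would count as a match.\n",
    "print(iou, [t for t in thresholds if iou >= t])\n",
    "```\n"
   ]
  },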
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 6. Related papers and citations\n",
    "\n",
    "```\n",
    "@article{huang2021pp,\n",
    "  title={PP-YOLOv2: A Practical Object Detector},\n",
    "  author={Huang, Xin and Wang, Xinxin and Lv, Wenyu and Bai, Xiaying and Long, Xiang and Deng, Kaipeng and Dang, Qingqing and Han, Shumin and Liu, Qiwen and Hu, Xiaoguang and others},\n",
    "  journal={arXiv preprint arXiv:2104.10419},\n",
    "  year={2021}\n",
    "}\n",
    "```\n",
    "\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}