introduction_en.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Introduction\n",
    "P-YOLOE is an excellent single-stage anchor-free model based on PP-YOLOv2, surpassing a variety of popular YOLO models. PP-YOLOE has a series of models, named s/m/l/x, which are configured through width multiplier and depth multiplier. PP-YOLOE avoids using special operators, such as Deformable Convolution or Matrix NMS, to be deployed friendly on various hardware. For more details, please refer to [official documentation](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5/configs/ppyoloe/README.md)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Model Effects\n",
    "PP-YOLOE-l achieves 51.6 mAP on COCO test-dev2017 dataset with 78.1 FPS on Tesla V100. While using TensorRT FP16, PP-YOLOE-l can be further accelerated to 149.2 FPS. PP-YOLOE-s/m/x also have excellent accuracy and speed performance as shown below.\n",
    "\n",
    "<div align=\"center\">\n",
    "  <img src=\"https://paddledet.bj.bcebos.com/modelcenter/images/ppyoloe_map_fps.png\" width=500 />\n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. How to use the model\n",
    "Clone PaddleDetection firstly and put the COCO-style dataset in `dataset/coco`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%mkdir -p ~/work\n",
    "%cd ~/work/\n",
    "!git clone https://github.com/PaddlePaddle/PaddleDetection.git\n",
    "\n",
    "%cd PaddleDetection\n",
    "%mkdir -p demo_input demo_output\n",
    "!pip install -r requirements.txt"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3.1 Training\n",
    "Training PP-YOLOE with mixed precision on 8 GPUs with following command"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# training with single GPU\n",
    "!python tools/train.py -c configs/ppyoloe/ppyoloe_crn_l_300e_coco.yml --eval --amp\n",
    "\n",
    "# training with mutiple GPUs\n",
    "!python -m paddle.distributed.launch --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/ppyoloe/ppyoloe_crn_l_300e_coco.yml --eval --amp"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Notes:**\n",
    "- If you need to evaluate while training, please add `--eval`.\n",
    "- PP-YOLOE supports mixed precision training, please add `--amp`."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3.2 Inference\n",
    "#### 3.2.1 Inference with Paddle-TRT\n",
    "Next, we will introduce how to use Paddle Inference to deploy PP-YOLOE models in TensorRT FP16 mode.\n",
    "\n",
    "First, refer to [Paddle Inference Docs](https://www.paddlepaddle.org.cn/inference/master/user_guides/download_lib.html#python), download and install packages corresponding to CUDA, CUDNN and TensorRT version.\n",
    "\n",
    "Then, Exporting PP-YOLOE for Paddle Inference **with TensorRT**, use following command."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!python tools/export_model.py -c configs/ppyoloe/ppyoloe_crn_l_300e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams trt=True"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, inference in TensorRT FP16 mode."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# inference single image\n",
    "!wget -P demo_input -N https://paddledet.bj.bcebos.com/modelcenter/images/General/000000014439.jpg\n",
    "!CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_crn_l_300e_coco --image_file=demo_input/000000014439.jpg --device=gpu --run_mode=trt_fp16 --output_dir=demo_output"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Notes:**\n",
    "- TensorRT will perform optimization for the current hardware platform according to the definition of the network, generate an inference engine and serialize it into a file. This inference engine is only applicable to the current hardware hardware platform. If your hardware and software platform has not changed, you can set `use_static=True` in [enable_tensorrt_engine](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/deploy/python/infer.py#L857). In this way, the serialized file generated will be saved in the `output_inference` folder, and the saved serialized file will be loaded the next time when TensorRT is executed.\n",
    "- PaddleDetection release/2.4 and later versions will support NMS calling TensorRT, which requires PaddlePaddle release/2.3 and later versions."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 3.2.2 Inference with PaddleInference\n",
    "For some AI acceleration hardware that does not support TensorRT, we can directly use PaddleInference to deploy. First run the following command to export the model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!python tools/export_model.py -c configs/ppyoloe/ppyoloe_crn_l_300e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Inference with PaddleInference directly."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# inference single image\n",
    "!CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_crn_l_300e_coco --image_file=demo_input/000000014439.jpg --device=gpu --run_mode=paddle --output_dir=demo_output"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Model principle\n",
    "The overall archtecture of PP-YOLOE is shown as follows:\n",
    "<div align=\"center\">\n",
    "  <img src=\"https://paddledet.bj.bcebos.com/modelcenter/images/PP-YOLOE-Arch.png\" width=70% />\n",
    "</div>\n",
    "\n",
    "PP-YOLOE is composed of following methods:\n",
    "- Scalable backbone and neck\n",
    "- [Task Alignment Learning](https://arxiv.org/abs/2108.07755)\n",
    "- Efficient Task-aligned head with [DFL](https://arxiv.org/abs/2006.04388) and [VFL](https://arxiv.org/abs/2008.13367)\n",
    "- [SiLU(Swish) activation function](https://arxiv.org/abs/1710.05941)\n",
    "\n",
    "For more details, please refer to our technical report: https://arxiv.org/abs/2203.16250"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5. Attention\n",
    "**All commands run on AI Studio's `jupyter` by default. If running on a terminal, remove the % or ! at the beginning of the command.**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 6. Related papers and citations\n",
    "```\n",
    "@article{xu2022pp,\n",
    "  title={PP-YOLOE: An evolved version of YOLO},\n",
    "  author={Xu, Shangliang and Wang, Xinxin and Lv, Wenyu and Chang, Qinyao and Cui, Cheng and Deng, Kaipeng and Wang, Guanzhong and Dang, Qingqing and Wei, Shengyu and Du, Yuning and others},\n",
    "  journal={arXiv preprint arXiv:2203.16250},\n",
    "  year={2022}\n",
    "}\n",
    "```"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3.10.6 64-bit",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.10.6"
  },
  "orig_nbformat": 4,
  "vscode": {
   "interpreter": {
    "hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}