{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "source": [
    "## 1. Introduction\n",
    "PP-ShiTuv2 is a practical lightweight general image recognition system improved on PP-ShitUV1. It is composed of three modules: mainbody detection, feature extraction and vector search. Compared with PP-ShiTuV1, PP-ShiTuV2 has higher recognition accuracy, stronger generalization and similar inference speed *. This paper mainly optimize in training dataset, feature extraction with better backbone network, loss function and training strategy, which significantly improved the retrieval performance of PP-ShiTuV2 in multiple practical application scenarios.\n",
    "\n",
    "The PP-ShiTuV2 model is officially produced by PaddleClas, which is an optimized and improved recognition retrieval model by PaddleClas. More about PaddleClas can be found at https://github.com/PaddlePaddle/PaddleClas.\n",
    "\n",
    "## 2. Preview and application scenarios\n",
    "### 2.1 product recognition:\n",
    "\n",
    "#### 2.1.1 dataset:\n",
    "\n",
    "The datasets include Aliproduct and GLDv2. For details, please refer to the [PP-ShiTuV2 Experiment Section](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.5/docs/zh_CN/training/PP-ShiTu/feature_extraction.md#4-%E5%AE%9E%E9%AA%8C%E9%83%A8%E5%88%86).\n",
    "\n",
    "#### 2.1.2 output preview:\n",
    "\n",
    "for example, the output of PP-ShiTuV2 on the picture is as follows\n",
    "\n",
    "![](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.5/docs/images/recognition/drink_data_demo/output/100.jpeg?raw=true)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. How to use\n",
    "\n",
    "### 3.1 model inference:\n",
    "\n",
    "- Install PaddleClas and its dependencies"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true,
    "execution": {
     "iopub.execute_input": "2022-11-08T08:24:16.514016Z",
     "iopub.status.busy": "2022-11-08T08:24:16.513368Z",
     "iopub.status.idle": "2022-11-08T08:25:00.630629Z",
     "shell.execute_reply": "2022-11-08T08:25:00.629113Z",
     "shell.execute_reply.started": "2022-11-08T08:24:16.513971Z"
    },
    "jupyter": {
     "outputs_hidden": true
    },
    "scrolled": true,
    "tags": []
   },
   "source": [
    "Install paddlepaddle-gpu if using GPU"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Install paddlepaddle if using CPU"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Install the paddleclas whl package"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install paddleclas"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- Quick start\n",
    "\n",
    "Congratulations! You have successfully installed PaddleClas. You can now try image recognition by following the steps below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-11-08T08:26:11.091828Z",
     "iopub.status.busy": "2022-11-08T08:26:11.090376Z",
     "iopub.status.idle": "2022-11-08T08:29:06.202735Z",
     "shell.execute_reply": "2022-11-08T08:29:06.201197Z",
     "shell.execute_reply.started": "2022-11-08T08:26:11.091754Z"
    },
    "scrolled": true,
    "tags": []
   },
   "outputs": [],
   "source": [
    "# download and unzip demo data\n",
    "!wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/drink_dataset_v2.0.tar && tar -xf drink_dataset_v2.0.tar\n",
    "\n",
    "# build gallery\n",
    "!paddleclas --build_gallery=True --model_name=\"PP-ShiTuV2\" \\\n",
    "-o IndexProcess.image_root=./drink_dataset_v2.0/gallery/ \\\n",
    "-o IndexProcess.index_dir=./drink_dataset_v2.0/index \\\n",
    "-o IndexProcess.data_file=./drink_dataset_v2.0/gallery/drink_label.txt\n",
    "\n",
    "# run recognition\n",
    "!paddleclas --model_name=\"PP-ShiTuV2\" --predict_type=shitu \\\n",
    "-o Global.infer_imgs='./drink_dataset_v2.0/test_images/100.jpeg' \\\n",
    "-o IndexProcess.index_dir='./drink_dataset_v2.0/index'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The result is shown below:\n",
    "```log\n",
    "ppcls INFO: [{'bbox': [437, 71, 660, 728], 'rec_docs': '元气森林', 'rec_scores': 0.7740249}, {'bbox': [221, 72, 449, 701], 'rec_docs': '元气森林', 'rec_scores': 0.6950992}, {'bbox': [794, 104, 979, 652], 'rec_docs': '元气森林', 'rec_scores': 0.6305153}], filename: ./drink_dataset_v2.0/test_images/100.jpeg\n",
    "ppcls INFO: Predict complete!\n",
    "```\n",
    "\n",
    "If you need to save and visualize the result as an image, please refer to [PP-ShiTuV2 inference](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.5/docs/en/PPShiTu/PPShiTuV2_introduction.md#4-inference-deployment), which runs inference with the full PaddleClas code."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3.2 Model training\n",
    "\n",
    "- clone PaddleClas repo(refer to [3.1 model inference](#31-model-inference))\n",
    "- For the dataset preparation, training, evaluation and other steps of the mainbody detection model, please refer to [PP-ShiTuV2 mainbody detection doc](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.5/docs/zh_CN/training/PP-ShiTu/mainbody_detection.md)\n",
    "- For the dataset preparation, training, evaluation and other steps of the feature extraction model, please refer to [PP-ShiTuV2 feature extraction doc](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.5/docs/zh_CN/training/PP-ShiTu/feature_extraction.md#5-%E8%87%AA%E5%AE%9A%E4%B9%89%E7%89%B9%E5%BE%81%E6%8F%90%E5%8F%96)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Algorithm\n",
    "PP-ShiTu series recognition systems, including PP-ShiTuV2 introduced in this document, consists of three modules to complete the entire recognition process, as shown in the figure below\n",
    "\n",
    "![PP-ShiTu System](https://github.com/PaddlePaddle/PaddleClas/raw/develop/docs/images/structure.jpg)\n",
    "\n",
    "- Mainbody detection: The blue colored module in the figure above detects potential targets in the input image, and then cropping these targets, filtering unimportant backgrounds, and reducing background interference. In fact, this practice of retaining the mainbody and filtering the background is a simple, effective and widely used method in practice.\n",
    "- Feature extraction: Receive the cropped image containing the target mainbody output by the **mainbody detection** module, and input it into the feature extraction model to obtain the corresponding feature vector, which is used as the representation feature of the image for subsequent retrieval.\n",
    "- Vector retrieval: Receive one or more feature vectors output by the **feature extraction** module, and retrieve them one by one in the vector library, finally return the retrieval result. This module does not require additional training and can be used by installing the third-party open source faiss retrieval library\n",
    "\n",
    "In the recognition system, one of the most important modules is the feature extraction model. The generalization of feature extraction model directly affects the quality of the vectors in the retrieval library and the vectors to be retrieved. Therefore, we will introduce feature extraction model below in 5 parts.\n",
    "\n",
    "- Backbone\n",
    "\n",
    "    The Backbone adopts PP-LCNetV2_base. On the basis of PPLCNet_V1, it adds multiple optimization points including Rep strategy, PW convolution, Shortcut, activation function improvement, SE module improvement, etc., so that the final classification accuracy is similar to PPLCNet_x2_5, but the inference latency has been reduced by 40%\\*. During the experiment, we made appropriate tweaks for PPLCNetV2_base, and make higher performance in recognition tasks while keeping the inference speed basically unchanged. Including: remove ReLU and FC layer at the end of the PPLCNetV2_base, and change the stride of the last stage (RepDepthwise Separated) to 1.\n",
    "\n",
    "    **Note**: \\*The inference environment is based on Intel(R) Xeon(R) Gold 6271C CPU @ 2.60GHz hardware platform, OpenVINO inference platform.\n",
    "\n",
    "- Neck\n",
    "\n",
    "    The Neck adopts BN Neck to standardize each dimension of the features extracted by Backbone, which reduces the difficulty of optimizing the metric learning loss function and the classification loss function at the same time, accelerates the convergence speed, and reduces the difference between IDLoss and TripletLoss due to optimization goals.\n",
    "\n",
    "- Head\n",
    "\n",
    "    The Head adopts FC Layer, as classification head to convert features into logits for subsequent calculation of classification loss (generally using cross entropy loss, called CELoss or IDLoss).\n",
    "\n",
    "- Loss\n",
    "\n",
    "    The Loss adopts cross entropy loss and TripletAngularMarginLoss, so the network is optimized during training with both a classification loss and a cosine-similarity-based triplet loss. TripletAngularMarginLoss improves on the original TripletLoss (Hard Triplet Loss): the optimization objective is moved from L2 Euclidean space to cosine space, and a hard distance constraint is added between the anchor and the positive/negative samples. This brings the training and testing objectives closer and improves the generalization ability of the model. For the detailed configuration, please refer to [GeneralRecognitionV2_PPLCNetV2_base.yaml](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.5/ppcls/configs/GeneralRecognitionV2/GeneralRecognitionV2_PPLCNetV2_base.yaml#L63-L77).\n",
    "\n",
    "- Data Augmentation\n",
    "\n",
    "    In real scenes the mainbody may be rotated to some extent rather than upright when the photo is taken, so we add [RandomRotation](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.5/ppcls/configs/GeneralRecognitionV2/GeneralRecognitionV2_PPLCNetV2_base.yaml#L117-L120) to the data augmentation to improve the generalization ability of the model in real scenes.\n",
    "\n",
    "## 5. Note\n",
    "PP-ShiTuV2 is looking for the most cost-effective image recognition solution in industrial practice. However, considering that the datasets of different recognition scenarios have their own distribution characteristics, as well as the limitations of software and hardware during training, it is difficult to integrate all datasets at one time. Therefore, it is recommended that users, after understanding the characteristics of your actual datasets, fine-tune or even make an further development on your own datasets based on the PP-ShiTuV2 pre-training model and training configuration, in order to obtain better performance and generalization.\n",
    "\n",
    "## 6. Reference\n",
    "```log\n",
    "@article{cui2021pp,\n",
    "    title={PP-LCNet: A Lightweight CPU Convolutional Neural Network},\n",
    "    author={Cui, Cheng and Gao, Tingquan and Wei, Shengyu and Du, Yuning and Guo, Ruoyu and Dong, Shuilong and Lu, Bin and Zhou, Ying and Lv, Xueying and Liu, Qiwen and others},\n",
    "    journal={arXiv preprint arXiv:2109.15099},\n",
    "    year={2021}\n",
    "}\n",
    "\n",
    "@InProceedings{Luo_2019_CVPR_Workshops,\n",
    "    author = {Luo, Hao and Gu, Youzhi and Liao, Xingyu and Lai, Shenqi and Jiang, Wei},\n",
    "    title = {Bag of Tricks and a Strong Baseline for Deep Person Re-Identification},\n",
    "    booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},\n",
    "    month = {June},\n",
    "    year = {2019}\n",
    "}\n",
    "\n",
    "@ARTICLE{Luo_2019_Strong_TMM,\n",
    "    author={H. {Luo} and W. {Jiang} and Y. {Gu} and F. {Liu} and X. {Liao} and S. {Lai} and J. {Gu}},\n",
    "    journal={IEEE Transactions on Multimedia},\n",
    "    title={A Strong Baseline and Batch Normalization Neck for Deep Person Re-identification},\n",
    "    year={2019},\n",
    "    pages={1-1},\n",
    "    doi={10.1109/TMM.2019.2958756},\n",
    "    ISSN={1941-0077},\n",
    "}\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "请点击[此处](https://ai.baidu.com/docs#/AIStudio_Project_Notebook/a38e5576)查看本环境基本用法.  <br>\n",
    "Please click [here ](https://ai.baidu.com/docs#/AIStudio_Project_Notebook/a38e5576) for more detailed instructions. "
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "py35-paddle1.2.0"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}