introduction_cn.ipynb 12.6 KB
Notebook
Newer Older
H
HydrogenSulfate 已提交
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "source": [
    "## 1. PP-ShiTuV2模型简介\n",
    "PP-ShiTuV2 是基于 PP-ShiTuV1 改进的一个实用轻量级通用图像识别系统,由主体检测、特征提取、向量检索三个模块构成,相比 PP-ShiTuV1 具有更高的识别精度、更强的泛化能力以及相近的推理速度*。主要针对训练数据集、特征提取两个部分进行优化,使用了更优的骨干网络、损失函数与训练策略,使得 PP-ShiTuV2 在多个实际应用场景上的检索性能有显著提升。\n",
    "\n",
    "PP-ShiTuV2模型由飞桨官方出品,是PaddleClas优化和改进的识别检索模型。 更多关于PaddleClas可以点击 https://github.com/PaddlePaddle/PaddleClas 进行了解。\n",
    "\n",
    "## 2. 模型效果及应用场景\n",
    "### 2.1 商品识别任务:\n",
    "\n",
    "#### 2.1.1 数据集:\n",
    "\n",
21
    "PP-ShiTuV2的训练数据集以 Aliproduct、GLDv2等数据集为主,详细信息可参考 [PP-ShiTuV2 实验部分](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.5/docs/zh_CN/training/PP-ShiTu/feature_extraction.md#4-%E5%AE%9E%E9%AA%8C%E9%83%A8%E5%88%86)\n",
H
HydrogenSulfate 已提交
22 23 24
    "\n",
    "#### 2.1.2 模型效果速览:\n",
    "\n",
25
    "PP-ShiTuV2 在图片上的识别效果如下\n",
H
HydrogenSulfate 已提交
26
    "\n",
27
    "![](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.5/docs/images/recognition/drink_data_demo/output/100.jpeg?raw=true)"
H
HydrogenSulfate 已提交
28 29 30 31 32 33 34 35 36 37
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. 模型如何使用\n",
    "\n",
    "### 3.1 模型推理:\n",
    "\n",
38 39 40 41 42 43 44 45
    "- 安装 PaddleClas 及其依赖包"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "如果您的机器安装了 CUDA9、CUDA10 或 CUDA11,请运行以下命令安装 paddle"
H
HydrogenSulfate 已提交
46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true,
    "execution": {
     "iopub.execute_input": "2022-11-08T08:24:16.514016Z",
     "iopub.status.busy": "2022-11-08T08:24:16.513368Z",
     "iopub.status.idle": "2022-11-08T08:25:00.630629Z",
     "shell.execute_reply": "2022-11-08T08:25:00.629113Z",
     "shell.execute_reply.started": "2022-11-08T08:24:16.513971Z"
    },
    "jupyter": {
     "outputs_hidden": true
    },
    "scrolled": true,
    "tags": []
   },
   "outputs": [],
   "source": [
68
    "!pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple"
H
HydrogenSulfate 已提交
69 70 71 72 73 74
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
75
    "如果您的机器是CPU,请运行以下命令安装"
H
HydrogenSulfate 已提交
76 77 78 79 80
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
81
   "metadata": {},
H
HydrogenSulfate 已提交
82 83 84
   "outputs": [],
   "source": [
    "\n",
85
    "!pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple"
H
HydrogenSulfate 已提交
86 87 88 89 90 91
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
92
    "安装 paddleclas whl包"
H
HydrogenSulfate 已提交
93 94 95 96 97
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
98
   "metadata": {},
H
HydrogenSulfate 已提交
99 100
   "outputs": [],
   "source": [
101 102 103 104 105 106 107 108
    "!pip install paddleclas"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- 快速体验\n",
H
HydrogenSulfate 已提交
109
    "\n",
110
    "恭喜! 您已经成功安装了 PaddleClas,接下来快速体验图像识别效果"
H
HydrogenSulfate 已提交
111 112 113 114 115 116 117
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "execution": {
118 119 120 121 122
     "iopub.execute_input": "2022-11-08T08:26:11.091828Z",
     "iopub.status.busy": "2022-11-08T08:26:11.090376Z",
     "iopub.status.idle": "2022-11-08T08:29:06.202735Z",
     "shell.execute_reply": "2022-11-08T08:29:06.201197Z",
     "shell.execute_reply.started": "2022-11-08T08:26:11.091754Z"
H
HydrogenSulfate 已提交
123 124 125 126 127 128
    },
    "scrolled": true,
    "tags": []
   },
   "outputs": [],
   "source": [
129 130 131 132 133 134 135 136 137 138 139 140 141
    "# 下载并解压demo数据\n",
    "!wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/drink_dataset_v2.0.tar && tar -xf drink_dataset_v2.0.tar\n",
    "\n",
    "# 在命令行中构建索引库\n",
    "!paddleclas --build_gallery=True --model_name=\"PP-ShiTuV2\" \\\n",
    "-o IndexProcess.image_root=./drink_dataset_v2.0/gallery/ \\\n",
    "-o IndexProcess.index_dir=./drink_dataset_v2.0/index \\\n",
    "-o IndexProcess.data_file=./drink_dataset_v2.0/gallery/drink_label.txt\n",
    "\n",
    "# 执行图像识别命令\n",
    "!paddleclas --model_name=\"PP-ShiTuV2\" --predict_type=shitu \\\n",
    "-o Global.infer_imgs='./drink_dataset_v2.0/test_images/100.jpeg' \\\n",
    "-o IndexProcess.index_dir='./drink_dataset_v2.0/index'"
H
HydrogenSulfate 已提交
142 143 144 145 146 147
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
148 149 150 151 152
    "识别的结果如下所示\n",
    "```log\n",
    "ppcls INFO: [{'bbox': [437, 71, 660, 728], 'rec_docs': '元气森林', 'rec_scores': 0.7740249}, {'bbox': [221, 72, 449, 701], 'rec_docs': '元气森林', 'rec_scores': 0.6950992}, {'bbox': [794, 104, 979, 652], 'rec_docs': '元气森林', 'rec_scores': 0.6305153}], filename: ./drink_dataset_v2.0/test_images/100.jpeg\n",
    "ppcls INFO: Predict complete!\n",
    "```\n",
H
HydrogenSulfate 已提交
153
    "\n",
154
    "如想保存识别结果为如本文档开头所示的图片,请参考 [PP-ShiTuV2 推理部署](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.5/docs/zh_CN/models/PP-ShiTu/README.md#4-%E6%8E%A8%E7%90%86%E9%83%A8%E7%BD%B2),使用完整 PaddleClas 代码进行推理预测"
H
HydrogenSulfate 已提交
155 156 157 158 159 160 161 162 163
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3.2 模型训练\n",
    "\n",
    "- 克隆 PaddleClas 仓库(参考 3.1模型推理 - 下载PaddleClas)\n",
164 165
    "- 主体检测模型的数据集准备、开始训练、模型评估等步骤,请参考 [PP-ShiTu 主体检测 文档](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.5/docs/zh_CN/training/PP-ShiTu/mainbody_detection.md)\n",
    "- 特征提取模型的数据集准备、开始训练、模型评估等步骤,请参考 [PP-ShiTu 特征提取 文档](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.5/docs/zh_CN/training/PP-ShiTu/feature_extraction.md)\n"
H
HydrogenSulfate 已提交
166 167 168 169 170 171 172 173 174
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. 模型原理\n",
    "PP-ShiTu 系列识别系统,包括本文档介绍的 PP-ShiTuV2,均由3个模块串联完成整个识别过程,如下图所示\n",
    "\n",
175
    "![PP-ShiTu系统](https://raw.githubusercontent.com/PaddlePaddle/PaddleClas/release/2.5/docs/images/structure.jpg)\n",
H
HydrogenSulfate 已提交
176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198
    "\n",
    "- 主体检测:上图中的蓝色模块,主要负责检测出用户输入图片中可能的识别目标,进而裁剪出这些目标,过滤不重要的背景,减少背景的干扰。事实上这种保留主体,过滤背景的做法是实践中会采用的一种简单而有效的方法。\n",
    "- 特征提取:接收 **主体检测** 模块输出的含有目标主体的裁剪后的图片,将其输入到特征提取模型中,得到对应的特征向量,作为该图片的表示特征用于接下来的检索步骤。\n",
    "- 向量检索:接收 **特征提取** 模块输出的一个或多个特征向量,逐个地在向量库中检索,将检索库中最邻近(一般以相似度表示邻近程度)的向量的类别,作为检索向量的类别,最后返回检索结果。该模块不需要额外训练,安装第三方开源的faiss检索库即可使用\n",
    "\n",
    "在检索系统中,最重要的模块之一就是特征提取模型,其特征提取能力好坏直接影响检索库内向量和待检索向量的质量,因此接下来分5个部分,重点介绍 PP-ShiTuV2 所使用的特征提取模型。\n",
    "\n",
    "- Backbone\n",
    "\n",
    "    Backbone 部分采用了 PP-LCNetV2_base,其在 PPLCNet_V1 的基础上,加入了包括Rep 策略、PW 卷积、Shortcut、激活函数改进、SE 模块改进等多个优化点,使得最终分类精度与 PPLCNet_x2_5 相近,且推理延时减少了40%\\*。在实验过程中我们对 PPLCNetV2_base 进行了适当的改进,在保持速度基本不变的情况下,让其在识别任务中得到更高的性能,包括:去掉 PPLCNetV2_base 末尾的 ReLU 和 FC、将最后一个 stage(RepDepthwiseSeparable) 的 stride 改为1。\n",
    "\n",
    "    **注**: \\*推理环境基于 Intel(R) Xeon(R) Gold 6271C CPU @ 2.60GHz 硬件平台,OpenVINO 推理平台。\n",
    "\n",
    "- Neck\n",
    "\n",
    "    Neck 部分采用了 BN Neck,对 Backbone 抽取得到的特征的每个维度进行标准化操作,减少了同时优化度量学习损失函数和分类损失函数的难度,加快收敛速度的同时减少 IDLoss 和 TripletLoss 之间由于优化目标所在空间不同带来的影响。\n",
    "\n",
    "- Head\n",
    "\n",
    "    Head 部分选用 FC Layer,使用分类头将 feature 转换成 logits 供后续计算分类损失(一般使用交叉熵损失,称之为CELoss或IDLoss)。\n",
    "\n",
    "- Loss\n",
    "\n",
199
    "    Loss 部分选用 Cross entropy loss 和 TripletAngularMarginLoss,在训练时以分类损失和基于角度的三元组损失来指导网络进行优化。我们基于原始的 TripletLoss (困难三元组损失)进行了改进,将优化目标从 L2 欧几里得空间更换成余弦空间,并加入了 anchor 与 positive/negtive 之间的硬性距离约束,让训练与测试的目标更加接近,提升模型的泛化能力。详细的配置文件见 [GeneralRecognitionV2_PPLCNetV2_base.yaml](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.5/ppcls/configs/GeneralRecognitionV2/GeneralRecognitionV2_PPLCNetV2_base.yaml#L63-L77)。\n",
H
HydrogenSulfate 已提交
200 201 202
    "\n",
    "- Data Augmentation\n",
    "\n",
203
    "    我们考虑到实际相机拍摄时目标主体可能出现一定的旋转而不一定能保持正立状态,因此我们在数据增强中加入了适当的 [随机旋转增强](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.5/ppcls/configs/GeneralRecognitionV2/GeneralRecognitionV2_PPLCNetV2_base.yaml#L117-L120),以提升模型在真实场景中的检索能力。\n",
H
HydrogenSulfate 已提交
204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267
    "\n",
    "## 5. 注意事项\n",
    "PP-ShiTuV2 是在寻找在产业实践中最高性价比的图像识别方案,但考虑到不同识别场景的数据集均有各自的分布特点,以及训练时的软硬件限制,无法一次性将所有的数据集全部纳入到训练集中,经过权衡才使用了目前这套训练数据集的组合。因此推荐用户在了解自己实际业务数据集的特点之后,基于 PP-ShiTuV2 的预训练模型以及训练配置,在自己的业务数据集上进行微调甚至二次开发,以获得性能更好,更适配自己数据集的识别模型。\n",
    "\n",
    "## 6. 相关论文以及引用信息\n",
    "```log\n",
    "@article{cui2021pp,\n",
    "    title={PP-LCNet: A Lightweight CPU Convolutional Neural Network},\n",
    "    author={Cui, Cheng and Gao, Tingquan and Wei, Shengyu and Du, Yuning and Guo, Ruoyu and Dong, Shuilong and Lu, Bin and Zhou, Ying and Lv, Xueying and Liu, Qiwen and others},\n",
    "    journal={arXiv preprint arXiv:2109.15099},\n",
    "    year={2021}\n",
    "}\n",
    "\n",
    "@InProceedings{Luo_2019_CVPR_Workshops,\n",
    "    author = {Luo, Hao and Gu, Youzhi and Liao, Xingyu and Lai, Shenqi and Jiang, Wei},\n",
    "    title = {Bag of Tricks and a Strong Baseline for Deep Person Re-Identification},\n",
    "    booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},\n",
    "    month = {June},\n",
    "    year = {2019}\n",
    "}\n",
    "\n",
    "@ARTICLE{Luo_2019_Strong_TMM,\n",
    "    author={H. {Luo} and W. {Jiang} and Y. {Gu} and F. {Liu} and X. {Liao} and S. {Lai} and J. {Gu}},\n",
    "    journal={IEEE Transactions on Multimedia},\n",
    "    title={A Strong Baseline and Batch Normalization Neck for Deep Person Re-identification},\n",
    "    year={2019},\n",
    "    pages={1-1},\n",
    "    doi={10.1109/TMM.2019.2958756},\n",
    "    ISSN={1941-0077},\n",
    "}\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "请点击[此处](https://ai.baidu.com/docs#/AIStudio_Project_Notebook/a38e5576)查看本环境基本用法.  <br>\n",
    "Please click [here ](https://ai.baidu.com/docs#/AIStudio_Project_Notebook/a38e5576) for more detailed instructions. "
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "py35-paddle1.2.0"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}