{ "cells": [ { "cell_type": "markdown", "metadata": { "jupyter": { "outputs_hidden": false } }, "source": [ "## 1. Introduction of PP-ShiTu\n", "PP-ShiTu is a practical lightweight general-purpose image recognition system, which mainly consists of three modules: subject detection, feature learning and vector retrieval. The system adopts a variety of strategies from 8 aspects of backbone network selection and adjustment, loss function selection, data enhancement, learning rate transformation strategy, regularization parameter selection, use of pre-trained models, and model pruning and quantification Optimization, and finally a system that can complete the image recognition of the 10w+ library in only 0.2s on the CPU.\n", "For more details, please refer to [PP-ShiTu Technical Solution](https://arxiv.org/pdf/2111.00775.pdf).\n", "\n", "Learn more about PaddleClas at https://github.com/PaddlePaddle/PaddleClas.\n", "\n", "## 2. Preview and application scenarios\n", "### 2.1 product recognition:\n", "\n", "#### 2.1.1 dataset:\n", "\n", "The training data set and test set of PP-ShiTu are composed of 7 data sets such as Aliproduct and GLDv2. For details, please refer to [PP-ShiTu Experimental Part](https://github.com/PaddlePaddle/PaddleClas/blob/release/ 2.4/docs/zh_CN/image_recognition_pipeline/feature_extraction.md#4-%E5%AE%9E%E9%AA%8C%E9%83%A8%E5%88%86)\n", "\n", "#### 2.1.2 output preview:\n", "\n", "The detection effect of PP-ShiTu on the picture is as follows\n", "\n", "![](https://github.com/PaddlePaddle/PaddleClas/raw/release/2.4/docs/images/recognition/drink_data_demo/output/nongfu_spring.jpeg)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. How to use\n", "\n", "### 3.1 model inference:\n", "\n", "- download PaddleClas" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "execution": { "iopub.execute_input": "2022-11-08T08:24:16.514016Z", "iopub.status.busy": "2022-11-08T08:24:16.513368Z", "iopub.status.idle": "2022-11-08T08:25:00.630629Z", "shell.execute_reply": "2022-11-08T08:25:00.629113Z", "shell.execute_reply.started": "2022-11-08T08:24:16.513971Z" }, "jupyter": { "outputs_hidden": true }, "scrolled": true, "tags": [] }, "outputs": [], "source": [ "# 不在Jupyter Notebook上运行时需要将含 \"!\" 和 \"%\" 的语句注释,不需要运行。\n", "%cd ~/work\n", "\n", "# 克隆 PaddleClas(gitee上克隆速度较快)\n", "!git clone https://gitee.com/paddlepaddle/PaddleClas" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Install PaddleClas and its dependencies" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "execution": { "iopub.execute_input": "2022-11-08T08:26:02.622321Z", "iopub.status.busy": "2022-11-08T08:26:02.621656Z", "iopub.status.idle": "2022-11-08T08:26:05.016413Z", "shell.execute_reply": "2022-11-08T08:26:05.015052Z", "shell.execute_reply.started": "2022-11-08T08:26:02.622277Z" }, "jupyter": { "outputs_hidden": true }, "scrolled": true, "tags": [] }, "outputs": [], "source": [ "# 进入 PaddleClas 目录\n", "%cd ~/work/PaddleClas/\n", "\n", "# 切换到2.4分支\n", "!git checkout release/2.4\n", "\n", "# 安装所需依赖项\n", "!pip install -r requirements.txt\n", "\n", "# 设置GPU\n", "# %env CUDA_VISIBLE_DEVICES=0" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Quick start\n", "\n", "Congratulations! You have successfully installed PaddleClas, now you can experience the image recognition as guided below." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": { "iopub.execute_input": "2022-11-08T08:26:11.091828Z", "iopub.status.busy": "2022-11-08T08:26:11.090376Z", "iopub.status.idle": "2022-11-08T08:29:06.202735Z", "shell.execute_reply": "2022-11-08T08:29:06.201197Z", "shell.execute_reply.started": "2022-11-08T08:26:11.091754Z" }, "scrolled": true, "tags": [] }, "outputs": [], "source": [ "# 进入 PaddleClas 目录\n", "%cd ~/work/PaddleClas/\n", "\n", "# 创建存放主体检测、特征提取推理模型的文件夹\n", "%mkdir -p deploy/models\n", "\n", "# 进入该文件夹\n", "%cd deploy/models\n", "\n", "# 下载主体检测inference模型并解压\n", "!wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer.tar && tar -xf picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer.tar\n", "\n", "# 下载特征提取inference模型并解压\n", "!wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/general_PPLCNet_x2_5_lite_v1.0_infer.tar && tar -xf general_PPLCNet_x2_5_lite_v1.0_infer.tar\n", "\n", "# 返回至deploy文件夹\n", "%cd ~/work/PaddleClas/deploy/\n", "\n", "# 下载测试数据 drink_dataset_v1.0 并解压\n", "!wget -nc https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/drink_dataset_v1.0.tar && tar -xf drink_dataset_v1.0.tar" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": { "iopub.execute_input": "2022-11-08T08:31:30.364422Z", "iopub.status.busy": "2022-11-08T08:31:30.363351Z", "iopub.status.idle": "2022-11-08T08:31:37.682006Z", "shell.execute_reply": "2022-11-08T08:31:37.680563Z", "shell.execute_reply.started": "2022-11-08T08:31:30.364378Z" }, "scrolled": true, "tags": [] }, "outputs": [], "source": [ "# 进入 PaddleClas 目录\n", "%cd ~/work/PaddleClas/\n", "\n", "# 进入deploy文件夹\n", "%cd ./deploy\n", "\n", "# 用 general_PPLCNet_x2_5_lite_v1.0 推理模型提取gallery图片的特征,制作成检索库\n", "!python3.7 python/build_gallery.py -c configs/inference_general.yaml -o Global.rec_inference_model_dir=./models/general_PPLCNet_x2_5_lite_v1.0_infer\n", "\n", "# 对 nongfu_spring.jpeg 图片进行识别推理(GPU推理)\n", "!python3.7 python/predict_system.py -c configs/inference_general.yaml\n", "# 对 nongfu_spring.jpeg 图片进行识别推理(CPU推理)\n", "!python3.7 python/predict_system.py -c configs/inference_general.yaml -o Global.use_gpu=False" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "At the same time, the recognition results (with detection bounding box, predicted class name and similarity) will be saved to `PaddleClas/deploy/output/nongfu_spring.jpeg`, as [2.1.2 output preview](#212-output-preview) displayed.\n", "\n", "![](https://github.com/PaddlePaddle/PaddleClas/raw/release/2.4/docs/images/recognition/drink_data_demo/output/nongfu_spring.jpeg)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3.2 Model training\n", "\n", "- clone PaddleClas repo(refer to [3.1 model inference](#31-model-inference)), and checkout to release/2.4 branch\n", "- For the dataset preparation, training, evaluation and other steps of the mainbody detection model, please refer to [PP-ShiTu mainbody detection doc](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.4/docs/zh_CN/image_recognition_pipeline/mainbody_detection.md)\n", "- For the dataset preparation, training, evaluation and other steps of the mainbody detection model, please refer to [PP-ShiTu feature extraction doc](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.4/docs/zh_CN/image_recognition_pipeline/feature_extraction.md#51-%E6%95%B0%E6%8D%AE%E5%87%86%E5%A4%87)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4. 模型原理\n", "PP-ShiTu 系列识别系统,包括本文档介绍的 PP-ShiTu,均由3个模块串联完成整个识别过程,如下图所示\n", "\n", "![PP-ShiTu系统](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.4/docs/images/structure.jpg?raw=true)\n", "\n", "- 主体检测:上图中的蓝色模块,主要负责检测出用户输入图片中可能的识别目标,进而裁剪出这些目标,过滤不重要的背景,减少背景的干扰。事实上这种保留主体,过滤背景的做法是实践中会采用的一种简单而有效的方法。\n", "- 特征提取:接收 **主体检测** 模块输出的含有目标主体的裁剪后的图片,将其输入到特征提取模型中,得到对应的特征向量,作为该图片的表示特征用于接下来的检索步骤。\n", "- 向量检索:接收 **特征提取** 模块输出的一个或多个特征向量,逐个地在向量库中检索,将检索库中最邻近(一般以相似度表示邻近程度)的向量的类别,作为检索向量的类别,最后返回检索结果。该模块不需要额外训练,安装第三方开源的faiss检索库即可使用\n", "\n", "在检索系统中,最重要的模块之一就是特征提取模型,其特征提取能力好坏直接影响检索库内向量和待检索向量的质量,因此接下来分5个部分,重点介绍 PP-ShiTu 所使用的特征提取模型。\n", "\n", "- Backbone\n", " Backbone 部分采用了 PP_LCNet_x2_5,其针对Intel CPU端的性能优化探索了多个有效的结构设计方案,最终实现了在不增加推理时间的情况下,进一步提升模型的性能,最终大幅度超越现有的 SOTA 模型。\n", "\n", "- Neck\n", "\n", " Neck 部分采用了 FC Layer,对 Backbone 抽取得到的特征进行降维,减少了特征存储的成本与计算量。\n", "\n", "- Head\n", "\n", " Head 部分选用 ArcMargin,在训练时通过指定margin,增大同类特征之间的角度差异再进行分类,进一步提升抽取特征的表征能力。\n", "\n", "- Loss\n", "\n", " Loss 部分选用 Cross entropy loss,在训练时以分类任务的损失函数来指导网络进行优化。详细的配置文件见通用识别配置文件。\n", "\n", "## 5. 注意事项\n", "PP-ShiTu 是在寻找在产业实践中最高性价比的图像识别方案,但考虑到不同识别场景的数据集均有各自的分布特点,以及训练时的软硬件限制,无法一次性将所有的数据集全部纳入到训练集中,经过权衡才使用了目前这套训练数据集的组合。因此推荐用户在了解自己实际业务数据集的特点之后,基于 PP-ShiTu 的预训练模型以及训练配置,在自己的业务数据集上进行微调甚至二次开发,以获得性能更好,更适配自己数据集的识别模型。\n", "\n", "## 6. 相关论文以及引用信息\n", "```log\n", "@article{cui2021pp,\n", " title={PP-LCNet: A Lightweight CPU Convolutional Neural Network},\n", " author={Cui, Cheng and Gao, Tingquan and Wei, Shengyu and Du, Yuning and Guo, Ruoyu and Dong, Shuilong and Lu, Bin and Zhou, Ying and Lv, Xueying and Liu, Qiwen and others},\n", " journal={arXiv preprint arXiv:2109.15099},\n", " year={2021}\n", "}\n", "\n", "@article{wei2021pp,\n", " title={PP-ShiTu: A Practical Lightweight Image Recognition System},\n", " author={Wei, Shengyu and Guo, Ruoyu and Cui, Cheng and Lu, Bin and Dong, Shuilong and Gao, Tingquan and Du, Yuning and Zhou, Ying and Lyu, Xueying and Liu, Qiwen and others},\n", " journal={arXiv preprint arXiv:2111.00775},\n", " year={2021}\n", "}\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "请点击[此处](https://ai.baidu.com/docs#/AIStudio_Project_Notebook/a38e5576)查看本环境基本用法.
\n", "Please click [here ](https://ai.baidu.com/docs#/AIStudio_Project_Notebook/a38e5576) for more detailed instructions. " ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "py35-paddle1.2.0" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.4" } }, "nbformat": 4, "nbformat_minor": 4 }