Unverified commit 25c32c0e, authored by cuicheng01, committed by GitHub

Merge branch 'PaddlePaddle:develop' into develop

简体中文 | [English](README_en.md)
# PaddleClas
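## Introduction
PaddleClas is PaddlePaddle's image recognition toolset for industry and academia, helping users train better vision models and put them into production.
**Recent updates**
- 2021.06.22, 23, 24: The official PaddleClas R&D team presents a three-day live course of in-depth technical explanations, at 20:30 on June 22, 23 and 24. [Live stream](https://live.bilibili.com/21689802)
- 🔥🔥🔥: 2021.06.16 PaddleClas v2.2 released, integrating metric learning, vector search and other components. Adds 4 image recognition applications: product recognition, cartoon character recognition, vehicle recognition and logo recognition. Adds 30 pretrained models from the LeViT, Twins, TNT, DLA, HarDNet and RedNet series.
- 2021.05.14 Added the `SwinTransformer` series models.
- 2021.04.15 Added the `MixNet_L` and `ReXNet_3_0` series models.
[more](./docs/zh_CN/update_history.md)
## Features
- Practical image recognition system: integrates object detection, feature learning, image retrieval and other modules, and applies broadly to image recognition tasks.
Four scenario examples are provided: product recognition, vehicle recognition, logo recognition and cartoon character recognition.
- Rich library of pretrained models: 164 ImageNet pretrained models across 35 series, of which 6 selected series support quick structural modification.
- Comprehensive, easy-to-use feature learning components: 12 metric learning methods such as arcmargin and triplet loss are integrated and can be freely combined and switched through configuration files.
- SSLD knowledge distillation: 14 classification pretrained models, with accuracy generally improved by more than 3%; among them, the ResNet50_vd model reaches 84.0% Top-1 accuracy on the ImageNet-1k dataset,
and the Res2Net200_vd pretrained model reaches 85.1% Top-1 accuracy.
- Data augmentation: 8 data augmentation algorithms such as AutoAugment, Cutout and Cutmix are supported, with detailed introductions, code reproduction and evaluation in a unified experimental environment.
## Image Recognition System Demo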
<div align="center">
<img src="./docs/images/recognition.gif" width = "400" />
</div>
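For more demo images, see the [recognition demo page](./docs/zh_CN/more_demo.md).
## Join the Technical Exchange Group
* You can scan the WeChat QR code below to join the PaddleClas WeChat group, get more efficient answers to your questions and exchange ideas with developers from all walks of life. We look forward to having you.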
<div align="center">
<img src="./docs/images/wx_group.png" width = "200" />
</div>
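## Quick Start
Quick start for image recognition: [click here](./docs/zh_CN/tutorials/quick_start_recognition.md)
## Tutorials
- [Quick installation](./docs/zh_CN/tutorials/install.md)
- [Quick start for image recognition](./docs/zh_CN/tutorials/quick_start_recognition.md)
- [Introduction to the image recognition system](#图像识别系统介绍)
- [Recognition demo](#识别效果展示)
- Algorithm introduction
- [Backbone networks and pretrained model library](./docs/zh_CN/ImageNet_models_cn.md)
- [Mainbody detection](./docs/zh_CN/application/mainbody_detection.md)
- [Image classification](./docs/zh_CN/tutorials/image_classification.md)
- [Feature learning](./docs/zh_CN/application/feature_learning.md)
- [Product recognition](./docs/zh_CN/application/product_recognition.md)
- [Vehicle recognition](./docs/zh_CN/application/vehicle_recognition.md)
- [Logo recognition](./docs/zh_CN/application/logo_recognition.md)
- [Cartoon character recognition](./docs/zh_CN/application/cartoon_character_recognition.md)
- [Vector search](./deploy/vector_search/README.md)
- Model training/evaluation
- [Image classification tasks](./docs/zh_CN/tutorials/getting_started.md)
- [Feature learning tasks](./docs/zh_CN/tutorials/getting_started_retrieval.md)
- Model inference
- [Inference with the Python prediction engine](./docs/zh_CN/inference.md)
- [Inference with the C++ prediction engine](./deploy/cpp/readme.md) (currently supports image classification only; image recognition support is in progress)
- Model deployment (currently supports image classification only; image recognition support is in progress)
- [Serving deployment](./deploy/hubserving/readme.md)
- [On-device deployment](./deploy/lite/readme.md)
- [Inference with the whl package](./docs/zh_CN/whl.md)
- Advanced usage
- [Knowledge distillation](./docs/zh_CN/advanced_tutorials/distillation/distillation.md)
- [Model quantization](./docs/zh_CN/extension/paddle_quantization.md)
- [Data augmentation](./docs/zh_CN/advanced_tutorials/image_augmentation/ImageAugment.md)
- FAQ (no longer updated)
- [Image classification FAQ](docs/zh_CN/faq.md)
- [License](#许可证书)
- [Contributing](#贡献代码)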
<a name="图像识别系统介绍"></a>
## Introduction to the Image Recognition System
<a name="图像识别系统介绍"></a>
<div align="center">
<img src="./docs/images/structure.png" width = "400" />
</div>
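The image recognition system works in three steps: (1) an object detection model detects candidate object regions in the image; (2) features are extracted from each candidate region; (3) the features are matched against the images in the retrieval gallery to obtain the recognition result.
For a new, unknown category, the model does not need to be retrained; adding images of that category to the retrieval gallery and rebuilding the gallery is enough to recognize it.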
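The matching step and the "no retraining" property can be illustrated with a minimal sketch. It assumes that features are plain non-zero float vectors and that matching uses cosine similarity; the gallery labels and vectors are made up for illustration, and this is not the PaddleClas vector search implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def recognize(query_feature, gallery):
    """Return the (label, score) of the gallery entry most similar to the query."""
    scored = ((label, cosine_similarity(query_feature, feat)) for label, feat in gallery)
    return max(scored, key=lambda pair: pair[1])


# Hypothetical gallery of (label, feature) pairs.
gallery = [
    ("logo_a", [1.0, 0.0, 0.0]),
    ("logo_b", [0.0, 1.0, 0.0]),
]

# Hypothetical feature extracted from one detected region.
query = [0.1, 0.0, 0.9]
before = recognize(query, gallery)

# Supporting a new category only extends the gallery; no model is retrained.
gallery.append(("logo_c", [0.0, 0.0, 1.0]))
after = recognize(query, gallery)
```

Before the gallery is extended, the query can only be matched to the closest known label with a low score; after `logo_c` is added, the same feature matches the new category.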
<a name="识别效果展示"></a>
## Recognition Demo [more](./docs/zh_CN/more_demo.md)
- Product recognition
<div align="center">
<img src="./docs/images/recognition/more_demo_images/output_product/channelhandle_5.jpg" width = "400" />
</div>
<div align="center">
<img src="./docs/images/recognition/more_demo_images/output_product/daoxiangcunjinzhubing_10.jpg" width = "400" />
</div>
- Cartoon character recognition
<div align="center">
<img src="./docs/images/recognition/more_demo_images/output_cartoon/labixiaoxin-005.jpeg" width = "400" />
</div>
<div align="center">
<img src="./docs/images/recognition/more_demo_images/output_cartoon/liuchuanfeng-010.jpeg" width = "400" />
</div>
- Logo recognition
<div align="center">
<img src="./docs/images/recognition/more_demo_images/output_logo/cctv_4.jpg" width = "400" />
</div>
<div align="center">
<img src="./docs/images/recognition/more_demo_images/output_logo/mangguo_8.jpeg" width = "400" />
</div>
- Vehicle recognition
<div align="center">
<img src="./docs/images/recognition/more_demo_images/output_vehicle/audia5-115.jpeg" width = "400" />
</div>
<div align="center">
<img src="./docs/images/recognition/more_demo_images/output_vehicle/bentian-yage-101.jpeg" width = "400" />
</div>
<a name="许可证书"></a>
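## License
This project is released under the <a href="https://github.com/PaddlePaddle/PaddleCLS/blob/master/LICENSE">Apache 2.0 license</a>.
<a name="贡献代码"></a>
## Contributing
We warmly welcome your code contributions to PaddleClas and greatly appreciate your feedback.
- Many thanks to [nblib](https://github.com/nblib) for fixing the RandErasing data augmentation configuration file in PaddleClas.
- Many thanks to [chenpy228](https://github.com/chenpy228) for fixing some typos in the PaddleClas documentation.
- Many thanks to [jm12138](https://github.com/jm12138) for adding the ViT, DeiT and RepVGG series models to PaddleClas.
- Many thanks to [FutureSI](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/76563) for the analysis and summary of the PaddleClas code.
We warmly welcome your code contributions to PaddleClas and greatly appreciate your feedback.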
README_en.md
\ No newline at end of file
简体中文 | [English](README_en.md)
# PaddleClas
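## Introduction
PaddleClas is PaddlePaddle's image recognition toolset for industry and academia, helping users train better vision models and put them into production.
**Recent updates**
- 2021.06.22, 23, 24: The official PaddleClas R&D team presents a three-day live course of in-depth technical explanations, at 20:30 on June 22, 23 and 24. [Live stream](https://live.bilibili.com/21689802)
- 🔥🔥🔥: 2021.06.16 PaddleClas v2.2 released, integrating metric learning, vector search and other components. Adds 4 image recognition applications: product recognition, cartoon character recognition, vehicle recognition and logo recognition. Adds 30 pretrained models from the LeViT, Twins, TNT, DLA, HarDNet and RedNet series.
- [more](./docs/zh_CN/update_history.md)
## Features
- Practical image recognition system: integrates object detection, feature learning, image retrieval and other modules, and applies broadly to image recognition tasks.
Four scenario examples are provided: product recognition, vehicle recognition, logo recognition and cartoon character recognition.
- Rich library of pretrained models: 164 ImageNet pretrained models across 35 series, of which 6 selected series support quick structural modification.
- Comprehensive, easy-to-use feature learning components: 12 metric learning methods such as arcmargin and triplet loss are integrated and can be freely combined and switched through configuration files.
- SSLD knowledge distillation: 14 classification pretrained models, with accuracy generally improved by more than 3%; among them, the ResNet50_vd model reaches 84.0% Top-1 accuracy on the ImageNet-1k dataset,
and the Res2Net200_vd pretrained model reaches 85.1% Top-1 accuracy.
- Data augmentation: 8 data augmentation algorithms such as AutoAugment, Cutout and Cutmix are supported, with detailed introductions, code reproduction and evaluation in a unified experimental environment.
## Image Recognition System Demo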
<div align="center">
<img src="./docs/images/recognition.gif" width = "400" />
</div>
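For more demo images, see the [recognition demo page](./docs/zh_CN/more_demo.md).
## Join the Technical Exchange Group
* You can scan the WeChat QR code below to join the PaddleClas WeChat group, get more efficient answers to your questions and exchange ideas with developers from all walks of life. We look forward to having you.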
<div align="center">
<img src="./docs/images/wx_group.jpeg" width = "200" />
</div>
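## Quick Start
Quick start for image recognition: [click here](./docs/zh_CN/tutorials/quick_start_recognition.md)
## Tutorials
- [Quick installation](./docs/zh_CN/tutorials/install.md)
- [Quick start for image recognition](./docs/zh_CN/tutorials/quick_start_recognition.md)
- [Introduction to the image recognition system](#图像识别系统介绍)
- [Recognition demo](#识别效果展示)
- Algorithm introduction
- [Backbone networks and pretrained model library](./docs/zh_CN/ImageNet_models_cn.md)
- [Mainbody detection](./docs/zh_CN/application/mainbody_detection.md)
- [Image classification](./docs/zh_CN/tutorials/image_classification.md)
- [Feature learning](./docs/zh_CN/application/feature_learning.md)
- [Product recognition](./docs/zh_CN/application/product_recognition.md)
- [Vehicle recognition](./docs/zh_CN/application/vehicle_recognition.md)
- [Logo recognition](./docs/zh_CN/application/logo_recognition.md)
- [Cartoon character recognition](./docs/zh_CN/application/cartoon_character_recognition.md)
- [Vector search](./deploy/vector_search/README.md)
- Model training/evaluation
- [Image classification tasks](./docs/zh_CN/tutorials/getting_started.md)
- [Feature learning tasks](./docs/zh_CN/tutorials/getting_started_retrieval.md)
- Model inference
- [Inference with the Python prediction engine](./docs/zh_CN/inference.md)
- [Inference with the C++ prediction engine](./deploy/cpp/readme.md) (currently supports image classification only; image recognition support is in progress)
- Model deployment (currently supports image classification only; image recognition support is in progress)
- [Serving deployment](./deploy/hubserving/readme.md)
- [On-device deployment](./deploy/lite/readme.md)
- [Inference with the whl package](./docs/zh_CN/whl.md)
- Advanced usage
- [Knowledge distillation](./docs/zh_CN/advanced_tutorials/distillation/distillation.md)
- [Model quantization](./docs/zh_CN/extension/paddle_quantization.md)
- [Data augmentation](./docs/zh_CN/advanced_tutorials/image_augmentation/ImageAugment.md)
- FAQ (no longer updated)
- [Image classification FAQ](docs/zh_CN/faq.md)
- [License](#许可证书)
- [Contributing](#贡献代码)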
<a name="图像识别系统介绍"></a>
## Introduction to the Image Recognition System
<a name="图像识别系统介绍"></a>
<div align="center">
<img src="./docs/images/structure.png" width = "400" />
</div>
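The image recognition system works in three steps: (1) an object detection model detects candidate object regions in the image; (2) features are extracted from each candidate region; (3) the features are matched against the images in the retrieval gallery to obtain the recognition result.
For a new, unknown category, the model does not need to be retrained; adding images of that category to the retrieval gallery and rebuilding the gallery is enough to recognize it.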
<a name="识别效果展示"></a>
## Recognition Demo [more](./docs/zh_CN/more_demo.md)
- Product recognition
<div align="center">
<img src="./docs/images/recognition/more_demo_images/output_product/channelhandle_5.jpg" width = "400" />
</div>
<div align="center">
<img src="./docs/images/recognition/more_demo_images/output_product/daoxiangcunjinzhubing_10.jpg" width = "400" />
</div>
- Cartoon character recognition
<div align="center">
<img src="./docs/images/recognition/more_demo_images/output_cartoon/labixiaoxin-005.jpeg" width = "400" />
</div>
<div align="center">
<img src="./docs/images/recognition/more_demo_images/output_cartoon/liuchuanfeng-010.jpeg" width = "400" />
</div>
- Logo recognition
<div align="center">
<img src="./docs/images/recognition/more_demo_images/output_logo/cctv_4.jpg" width = "400" />
</div>
<div align="center">
<img src="./docs/images/recognition/more_demo_images/output_logo/mangguo_8.jpeg" width = "400" />
</div>
- Vehicle recognition
<div align="center">
<img src="./docs/images/recognition/more_demo_images/output_vehicle/audia5-115.jpeg" width = "400" />
</div>
<div align="center">
<img src="./docs/images/recognition/more_demo_images/output_vehicle/bentian-yage-101.jpeg" width = "400" />
</div>
<a name="许可证书"></a>
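## License
This project is released under the <a href="https://github.com/PaddlePaddle/PaddleCLS/blob/master/LICENSE">Apache 2.0 license</a>.
<a name="贡献代码"></a>
## Contributing
We warmly welcome your code contributions to PaddleClas and greatly appreciate your feedback.
- Many thanks to [nblib](https://github.com/nblib) for fixing the RandErasing data augmentation configuration file in PaddleClas.
- Many thanks to [chenpy228](https://github.com/chenpy228) for fixing some typos in the PaddleClas documentation.
- Many thanks to [jm12138](https://github.com/jm12138) for adding the ViT, DeiT and RepVGG series models to PaddleClas.
- Many thanks to [FutureSI](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/76563) for the analysis and summary of the PaddleClas code.
We warmly welcome your code contributions to PaddleClas and greatly appreciate your feedback.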
[简体中文](README.md) | English
[简体中文](README_ch.md) | English
# PaddleClas
......@@ -8,17 +8,7 @@ PaddleClas is an image recognition toolset for industry and academia, helping us
**Recent updates**
- 🔥🔥🔥: 2021.06.16 PaddleClas release/2.2.
- Add metric learning and vector search modules.
- Add product recognition, animation character recognition, vehicle recognition and logo recognition.
- Added 30 pretrained models of LeViT, Twins, TNT, DLA, HarDNet, and RedNet, and the accuracy is roughly the same as that of the paper.
- 2021.05.14
- Add `SwinTransformer` series pretrained models, whose Top-1 Acc on ImageNet-1k dataset reaches 87.19%.
- 2021.04.15
- Add `MixNet` and `ReXNet` pretrained models, `MixNet_L`'s Top-1 Acc on ImageNet-1k reaches 78.6% and `ReXNet_3_0` reaches 82.09%.
- 🔥🔥🔥: 2021.06.16 PaddleClas release/2.2. Add metric learning and vector search modules. Add product recognition, animation character recognition, vehicle recognition and logo recognition. Added 30 pretrained models of LeViT, Twins, TNT, DLA, HarDNet, and RedNet, and the accuracy is roughly the same as that of the paper.
- [more](./docs/en/update_history_en.md)
## Features
......@@ -34,27 +24,29 @@ Four sample solutions are provided, including product recognition, vehicle recog
- Data augmentation: Provide 8 data augmentation algorithms such as AutoAugment, Cutout, Cutmix, etc. with detailed introduction, code replication and evaluation of effectiveness in a unified experimental environment.
## Image Recognition System Effect Demonstration
<div align="center">
<img src="./docs/images/recognition.gif" width = "400" />
<img src="./docs/images/recognition_en.gif" width = "400" />
</div>
For more effect pictures, please see [Demo images](./docs/en/more_demo.md).
## Welcome to Join the Technical Exchange Group
* You can also scan the QR code below to join the PaddleClas WeChat group to get more efficient answers to your questions and to communicate with developers from all walks of life. We look forward to hearing from you.
<div align="center">
<img src="./docs/images/wx_group.png" width = "200" />
<img src="./docs/images/wx_group.jpeg" width = "200" />
</div>
## Quick Start
## Quick Start
Quick experience of image recognition: [Link](./docs/en/tutorials/quick_start_recognition_en.md)
## Tutorials
- [Quick Installatiopn](./docs/en/tutorials/install_en.md)
- [Quick Installation](./docs/en/tutorials/install_en.md)
- [Quick Start of Recognition](./docs/en/tutorials/quick_start_recognition_en.md)
- [Introduction to Image Recognition Systems](#Introduction_to_Image_Recognition_Systems)
- [Demo images](#Demo_images)
......@@ -89,7 +81,7 @@ Quick experience of image recognition:[Link](./docs/en/tutorials/quick_start_r
## Introduction to Image Recognition Systems
<div align="center">
<img src="./docs/images/structure.png" width = "400" />
<img src="./docs/images/mainpage/recognition_pipeline_en.png" width = "400" />
</div>
Image recognition can be divided into three steps:
......@@ -103,34 +95,34 @@ For a new unknown category, there is no need to retrain the model, just prepare
## Demo images [more](./docs/en/more_demo.md)
- Product recognition
<div align="center">
<img src="./docs/images/recognition/more_demo_images/output_product/channelhandle_5.jpg" width = "400" />
<img src="./docs/images/recognition/more_demo_images/output_product/channelhandle_5_en.jpg" width = "400" />
</div>
<div align="center">
<img src="./docs/images/recognition/more_demo_images/output_product/daoxiangcunjinzhubing_10.jpg" width = "400" />
<img src="./docs/images/recognition/more_demo_images/output_product/daoxiangcunjinzhubing_10_en.jpg" width = "400" />
</div>
- Cartoon character recognition
<div align="center">
<img src="./docs/images/recognition/more_demo_images/output_cartoon/labixiaoxin-005.jpeg" width = "400" />
<img src="./docs/images/recognition/more_demo_images/output_cartoon/labixiaoxin-005_en.jpeg" width = "400" />
</div>
<div align="center">
<img src="./docs/images/recognition/more_demo_images/output_cartoon/liuchuanfeng-010.jpeg" width = "400" />
<img src="./docs/images/recognition/more_demo_images/output_cartoon/liuchuanfeng-010_en.jpeg" width = "400" />
</div>
- Logo recognition
<div align="center">
<img src="./docs/images/recognition/more_demo_images/output_logo/cctv_4.jpg" width = "400" />
<img src="./docs/images/recognition/more_demo_images/output_logo/cctv_4_en.jpg" width = "400" />
</div>
<div align="center">
<img src="./docs/images/recognition/more_demo_images/output_logo/mangguo_8.jpeg" width = "400" />
<img src="./docs/images/recognition/more_demo_images/output_logo/mangguo_8_en.jpeg" width = "400" />
</div>
- Car recognition
<div align="center">
<img src="./docs/images/recognition/more_demo_images/output_vehicle/audia5-115.jpeg" width = "400" />
<img src="./docs/images/recognition/more_demo_images/output_vehicle/audia5-115_en.jpeg" width = "400" />
</div>
<div align="center">
<img src="./docs/images/recognition/more_demo_images/output_vehicle/bentian-yage-101.jpeg" width = "400" />
<img src="./docs/images/recognition/more_demo_images/output_vehicle/bentian-yage-101_en.jpeg" width = "400" />
</div>
<a name="License"></a>
......
......@@ -39,7 +39,8 @@ def draw_bbox_results(image,
xmin, ymin, xmax, ymax = result["bbox"]
text = "{}, {:.2f}".format(result["rec_docs"], result["rec_scores"])
th = font_size
tw = int(len(result["rec_docs"]) * font_size) + 60
tw = font.getsize(text)[0]
# tw = int(len(result["rec_docs"]) * font_size) + 60
start_y = max(0, ymin - th)
draw.rectangle(
......
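The `draw_bbox_results` change above replaces a character-count estimate of the label width (`len(...) * font_size + 60`) with a measurement taken from the font itself. The estimate treats every character as roughly one `font_size` wide, which likely overshoots for Latin labels. Pillow's release notes list `getsize` as deprecated in 9.2.0 and removed in 10.0.0 in favor of `getbbox`, so a version-tolerant helper could look like the sketch below. The helper name `text_width` and the `FixedWidthFont` stand-in are hypothetical and exist only so the example runs without a font file.

```python
def text_width(font, text):
    """Return the rendered width of `text` in pixels for a Pillow-style font."""
    if hasattr(font, "getbbox"):
        # Newer Pillow: getbbox() returns (left, top, right, bottom).
        left, _top, right, _bottom = font.getbbox(text)
        return right - left
    # Older Pillow releases only expose getsize(), as used in the diff above.
    return font.getsize(text)[0]


class FixedWidthFont:
    """Hypothetical stand-in font: every character is 10 px wide, 12 px tall."""

    def getbbox(self, text):
        return (0, 0, 10 * len(text), 12)


# Same label format as in draw_bbox_results.
label = "{}, {:.2f}".format("Chanel Handbag", 0.6163763999938965)
width = text_width(FixedWidthFont(), label)
```

With a real font, `font` would be an `ImageFont.truetype(...)` object and the returned width would depend on the glyphs in the label.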
## Demo images
- Product recognition
<div align="center">
<img src="../images/recognition/more_demo_images/output_product/channelhandle_5.jpg" width = "400" />
<img src="../images/recognition/more_demo_images/output_product/channelhandle_5_en.jpg" width = "400" />
</div>
<div align="center">
<img src="../images/recognition/more_demo_images/output_product/cliniqueblush_1.jpg" width = "400" />
<img src="../images/recognition/more_demo_images/output_product/cliniqueblush_1_en.jpg" width = "400" />
</div>
<div align="center">
<img src="../images/recognition/more_demo_images/output_product/daoxiangcunjinzhubing_10.jpg" width = "400" />
<img src="../images/recognition/more_demo_images/output_product/daoxiangcunjinzhubing_10_en.jpg" width = "400" />
</div>
<div align="center">
<img src="../images/recognition/more_demo_images/output_product/gannidress_10.jpg" width = "400" />
<img src="../images/recognition/more_demo_images/output_product/gannidress_10_en.jpg" width = "400" />
</div>
<div align="center">
<img src="../images/recognition/more_demo_images/output_product/gbyingerche_15.jpg" width = "400" />
<img src="../images/recognition/more_demo_images/output_product/gbyingerche_15_en.jpg" width = "400" />
</div>
<div align="center">
<img src="../images/recognition/more_demo_images/output_product/lafiolewine_03.jpg" width = "400" />
<img src="../images/recognition/more_demo_images/output_product/lafiolewine_03_en.jpg" width = "400" />
</div>
<div align="center">
<img src="../images/recognition/more_demo_images/output_product/taochunqiu_8.jpg" width = "400" />
<img src="../images/recognition/more_demo_images/output_product/taochunqiu_8_en.jpg" width = "400" />
</div>
<div align="center">
<img src="../images/recognition/more_demo_images/output_product/weiduomeinaiyougege_10.jpg" width = "400" />
<img src="../images/recognition/more_demo_images/output_product/weiduomeinaiyougege_10_en.jpg" width = "400" />
</div>
- Cartoon character recognition
<div align="center">
<img src="../images/recognition/more_demo_images/output_cartoon/labixiaoxin-005.jpeg" width = "400" />
<img src="../images/recognition/more_demo_images/output_cartoon/labixiaoxin-005_en.jpeg" width = "400" />
</div>
<div align="center">
<img src="../images/recognition/more_demo_images/output_cartoon/liuchuanfeng-010.jpeg" width = "400" />
<img src="../images/recognition/more_demo_images/output_cartoon/liuchuanfeng-010_en.jpeg" width = "400" />
</div>
<div align="center">
<img src="../images/recognition/more_demo_images/output_cartoon/zhangchulan-007.jpeg" width = "400" />
<img src="../images/recognition/more_demo_images/output_cartoon/zhangchulan-007_en.jpeg" width = "400" />
</div>
- Logo recognition
<div align="center">
<img src="../images/recognition/more_demo_images/output_logo/cctv_4.jpg" width = "400" />
<img src="../images/recognition/more_demo_images/output_logo/cctv_4_en.jpg" width = "400" />
</div>
<div align="center">
<img src="../images/recognition/more_demo_images/output_logo/mangguo_8.jpeg" width = "400" />
<img src="../images/recognition/more_demo_images/output_logo/mangguo_8_en.jpeg" width = "400" />
</div>
<div align="center">
<img src="../images/recognition/more_demo_images/output_logo/zhongshiyou-007.jpg" width = "400" />
<img src="../images/recognition/more_demo_images/output_logo/zhongshiyou-007_en.jpg" width = "400" />
</div>
- Car recognition
<div align="center">
<img src="../images/recognition/more_demo_images/output_vehicle/audia5-115.jpeg" width = "400" />
<img src="../images/recognition/more_demo_images/output_vehicle/audia5-115_en.jpeg" width = "400" />
</div>
<div align="center">
<img src="../images/recognition/more_demo_images/output_vehicle/bentian-yage-101.jpeg" width = "400" />
<img src="../images/recognition/more_demo_images/output_vehicle/bentian-yage-101_en.jpeg" width = "400" />
</div>
<div align="center">
<img src="../images/recognition/more_demo_images/output_vehicle/bmw-m340-107.jpeg" width = "400" />
<img src="../images/recognition/more_demo_images/output_vehicle/bmw-m340-107_en.jpeg" width = "400" />
</div>
[More demo images](../images/recognition/more_demo_images)
......@@ -41,7 +41,7 @@ You can also visit [DockerHub](https://hub.docker.com/r/paddlepaddle/paddle/tags
```
# Use Ctrl+P+Q to exit the container; to re-enter it, use the following command:
sudo docker container exec -it ppcls /bin/bash
sudo docker exec -it ppcls /bin/bash
```
### 1.3 Install PaddlePaddle using pip
......
......@@ -40,10 +40,10 @@ The detection model with the recognition inference model for the 4 directions (L
| Logo Recognition Model | Logo Scenario | [Model Download Link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/logo_rec_ResNet50_Logo3K_v1.0_infer.tar) | [inference_logo.yaml](../../../deploy/configs/inference_logo.yaml) | [build_logo.yaml](../../../deploy/configs/build_logo.yaml) |
| Cartoon Face Recognition Model| Cartoon Face Scenario | [Model Download Link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/cartoon_rec_ResNet50_iCartoon_v1.0_infer.tar) | [inference_cartoon.yaml](../../../deploy/configs/inference_cartoon.yaml) | [build_cartoon.yaml](../../../deploy/configs/build_cartoon.yaml) |
| Vehicle Subclassification Model | Vehicle Scenario | [Model Download Link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/vehicle_cls_ResNet50_CompCars_v1.0_infer.tar) | [inference_vehicle.yaml](../../../deploy/configs/inference_vehicle.yaml) | [build_vehicle.yaml](../../../deploy/configs/build_vehicle.yaml) |
| Product Recognition Model | Product Scenario | [Model Download Link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/product_ResNet50_vd_Inshop_v1.0_infer.tar) | [inference_inshop.yaml](../../../deploy/configs/) | [build_inshop.yaml](../../../deploy/configs/build_inshop.yaml) |
| Product Recognition Model | Product Scenario | [Model Download Link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/product_ResNet50_vd_Inshop_v1.0_infer.tar) | [inference_product.yaml](../../../deploy/configs/inference_product.yaml) | [build_product.yaml](../../../deploy/configs/build_product.yaml) |
Demo data in this tutorial can be downloaded here: [download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/recognition_demo_data_v1.0.tar).
Demo data in this tutorial can be downloaded here: [download link](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/recognition_demo_data_en_v1.0.tar).
**Attention**
......@@ -89,7 +89,7 @@ wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/infere
cd ..
# Download the demo data and unzip it
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/recognition_demo_data_v1.0.tar && tar -xf recognition_demo_data_v1.0.tar
wget https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/data/recognition_demo_data_en_v1.0.tar && tar -xf recognition_demo_data_en_v1.0.tar
```
Once unpacked, the `recognition_demo_data_v1.0` folder should have the following file structure.
......@@ -123,7 +123,7 @@ The `models` folder should have the following file structure.
```
<a name="Product_recognition_and_retrival"></a>
### 2.2 Product Recognition and Retrival
### 2.2 Product Recognition and Retrieval
The product recognition demo below illustrates the recognition and retrieval process. To try another scenario, download and unzip its demo data and model, then use the corresponding configuration file to run the prediction.
......@@ -137,7 +137,7 @@ Run the following command to identify and retrieve the image `./recognition_demo
# use the following command to predict using GPU.
python3.7 python/predict_system.py -c configs/inference_product.yaml
# use the following command to predict using CPU
python3.7 python/predict_system.py -c configs/inference_product.yaml
python3.7 python/predict_system.py -c configs/inference_product.yaml -o Global.use_gpu=False
```
**Note:** The library used to build the index was compiled on our machine. If an error occurs because of your environment, refer to the [vector search tutorial](../../../deploy/vector_search/README.md) to rebuild the library.
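The CPU command above passes `-o Global.use_gpu=False` to override one value from the YAML configuration. As a rough mental model only, a dotted-key override can be applied to a nested dictionary as sketched below. The function `apply_override` and the config layout are hypothetical and are not taken from the PaddleClas source.

```python
import ast


def apply_override(config, override):
    """Apply a single 'Section.key=value' override to a nested dict in place."""
    key_path, raw_value = override.split("=", 1)
    try:
        # Interpret literals such as False, 1 or 0.5; fall back to the raw string.
        value = ast.literal_eval(raw_value)
    except (ValueError, SyntaxError):
        value = raw_value
    *parents, leaf = key_path.split(".")
    node = config
    for name in parents:
        node = node.setdefault(name, {})
    node[leaf] = value
    return config


# Assumed shape of the loaded YAML; only the key named in the command is shown.
config = {"Global": {"use_gpu": True}}
apply_override(config, "Global.use_gpu=False")
```

After the call, `config["Global"]["use_gpu"]` holds the boolean `False` rather than the string `"False"`, which is the distinction that matters when the value is later used in a condition.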
......@@ -153,7 +153,7 @@ The image to be retrieved is shown below.
The final output is shown below.
```
[{'bbox': [287, 129, 497, 326], 'rec_docs': '稻香村金猪饼', 'rec_scores': 0.8309420943260193}, {'bbox': [99, 242, 313, 426], 'rec_docs': '稻香村金猪饼', 'rec_scores': 0.7245652079582214}]
[{'bbox': [287, 129, 497, 326], 'rec_docs': 'Daoxaingcun Golden Piggie Cake', 'rec_scores': 0.8309420347213745}, {'bbox': [99, 242, 313, 426], 'rec_docs': 'Daoxaingcun Golden Piggie Cake', 'rec_scores': 0.7245651483535767}]
```
......@@ -163,7 +163,7 @@ where bbox indicates the location of the detected object, rec_docs indicates the
The detection result is also saved in the `output` folder. For this image, the visualization result is as follows.
<div align="center">
<img src="../../images/recognition/product_demo/result/daoxiangcunjinzhubing_6.jpg" width = "400" />
<img src="../../images/recognition/product_demo/result/daoxiangcunjinzhubing_6_en.jpg" width = "400" />
</div>
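The printed result is a list of dictionaries with the keys `bbox`, `rec_docs` and `rec_scores`, and `draw_bbox_results` in this commit unpacks `bbox` as `xmin, ymin, xmax, ymax`. A minimal sketch of post-processing such a list is shown below; the helper name `confident_hits` and the 0.8 threshold are assumptions chosen for illustration.

```python
# Result list copied from the output shown above.
results = [
    {"bbox": [287, 129, 497, 326], "rec_docs": "Daoxaingcun Golden Piggie Cake",
     "rec_scores": 0.8309420347213745},
    {"bbox": [99, 242, 313, 426], "rec_docs": "Daoxaingcun Golden Piggie Cake",
     "rec_scores": 0.7245651483535767},
]


def confident_hits(results, min_score):
    """Keep results scoring at least `min_score` and add the box width and height."""
    hits = []
    for item in results:
        if item["rec_scores"] < min_score:
            continue
        xmin, ymin, xmax, ymax = item["bbox"]
        hits.append({
            "label": item["rec_docs"],
            "score": round(item["rec_scores"], 2),
            "width": xmax - xmin,
            "height": ymax - ymin,
        })
    return hits


hits = confident_hits(results, min_score=0.8)
```

Only the first detection passes the hypothetical threshold; the second is dropped because its score is about 0.72.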
......@@ -182,13 +182,12 @@ The results on the screen are shown as following.
```
...
[{'bbox': [37, 29, 123, 89], 'rec_docs': '香奈儿包', 'rec_scores': 0.6163763999938965}, {'bbox': [153, 96, 235, 175], 'rec_docs': '香奈儿包', 'rec_scores': 0.5279821157455444}]
[{'bbox': [735, 562, 1133, 851], 'rec_docs': '香奈儿包', 'rec_scores': 0.5588355660438538}]
[{'bbox': [124, 50, 230, 129], 'rec_docs': '香奈儿包', 'rec_scores': 0.6980369687080383}]
[{'bbox': [0, 0, 275, 183], 'rec_docs': '香奈儿包', 'rec_scores': 0.5818190574645996}]
[{'bbox': [400, 1179, 905, 1537], 'rec_docs': '香奈儿包', 'rec_scores': 0.9814301133155823}]
[{'bbox': [544, 4, 1482, 932], 'rec_docs': '香奈儿包', 'rec_scores': 0.5143815279006958}]
[{'bbox': [29, 42, 194, 183], 'rec_docs': '香奈儿包', 'rec_scores': 0.9543638229370117}]
[{'bbox': [37, 29, 123, 89], 'rec_docs': 'Chanel Handbag', 'rec_scores': 0.6163763999938965}, {'bbox': [153, 96, 235, 175], 'rec_docs': 'Chanel Handbag', 'rec_scores': 0.5279821157455444}]
[{'bbox': [735, 562, 1133, 851], 'rec_docs': 'Chanel Handbag', 'rec_scores': 0.5588355660438538}]
[{'bbox': [124, 50, 230, 129], 'rec_docs': 'Chanel Handbag', 'rec_scores': 0.6980369687080383}]
[{'bbox': [0, 0, 275, 183], 'rec_docs': 'Chanel Handbag', 'rec_scores': 0.5818190574645996}]
[{'bbox': [400, 1179, 905, 1537], 'rec_docs': 'Chanel Handbag', 'rec_scores': 0.9814301133155823}, {'bbox': [295, 713, 820, 1046], 'rec_docs': 'Chanel Handbag', 'rec_scores': 0.9496176242828369}, {'bbox': [153, 236, 694, 614], 'rec_docs': 'Chanel Handbag', 'rec_scores': 0.8395382761955261}]
[{'bbox': [544, 4, 1482, 932], 'rec_docs': 'Chanel Handbag', 'rec_scores': 0.5143815279006958}]
...
```
......@@ -238,12 +237,12 @@ cp recognition_demo_data_v1.0/gallery_product/data_file.txt recognition_demo_dat
Then add some new lines into the new label file, which is shown as follows.
```
gallery/anmuxi/001.jpg 安慕希酸奶
gallery/anmuxi/002.jpg 安慕希酸奶
gallery/anmuxi/003.jpg 安慕希酸奶
gallery/anmuxi/004.jpg 安慕希酸奶
gallery/anmuxi/005.jpg 安慕希酸奶
gallery/anmuxi/006.jpg 安慕希酸奶
gallery/anmuxi/001.jpg Anmuxi Ambrosial Yogurt
gallery/anmuxi/002.jpg Anmuxi Ambrosial Yogurt
gallery/anmuxi/003.jpg Anmuxi Ambrosial Yogurt
gallery/anmuxi/004.jpg Anmuxi Ambrosial Yogurt
gallery/anmuxi/005.jpg Anmuxi Ambrosial Yogurt
gallery/anmuxi/006.jpg Anmuxi Ambrosial Yogurt
```
Each line can be split into two fields. The first field is the relative image path and the second is its label. The delimiter is a tab character.
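Because the English labels contain spaces, splitting on whitespace would break them apart, so the tab delimiter matters. The sketch below parses lines in this format. The function name `parse_label_lines` is hypothetical, and the sample lines are written with explicit `\t` characters to make the delimiter visible.

```python
def parse_label_lines(lines, delimiter="\t"):
    """Split each non-empty line into (relative_image_path, label)."""
    entries = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Split once so that a label containing spaces stays intact.
        path, label = line.split(delimiter, 1)
        entries.append((path, label))
    return entries


sample = [
    "gallery/anmuxi/001.jpg\tAnmuxi Ambrosial Yogurt\n",
    "gallery/anmuxi/002.jpg\tAnmuxi Ambrosial Yogurt\n",
    "\n",
]
entries = parse_label_lines(sample)
```

In practice the lines would come from the label file itself, for example by iterating over an open file handle.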
......@@ -274,11 +273,11 @@ python3.7 python/predict_system.py -c configs/inference_product.yaml -o Global.i
The output is as follows:
```
[{'bbox': [243, 80, 523, 522], 'rec_docs': '安慕希酸奶', 'rec_scores': 0.5570770502090454}]
[{'bbox': [243, 80, 523, 522], 'rec_docs': 'Anmuxi Ambrosial Yogurt', 'rec_scores': 0.5570770502090454}]
```
The final recognition result is `安慕希酸奶`, which is correct. The visualization result is as follows.
The final recognition result is `Anmuxi Ambrosial Yogurt`, which is correct. The visualization result is as follows.
<div align="center">
<img src="../../images/recognition/product_demo/result/anmuxi.jpg" width = "400" />
<img src="../../images/recognition/product_demo/result/anmuxi_en.jpg" width = "400" />
</div>