Commit ce166b6c · Author: Tingquan Gao · Committer: Tingquan Gao

docs: fix invalid links

Parent 52ce65b0
...@@ -309,7 +309,7 @@ sh tools/train.sh ...@@ -309,7 +309,7 @@ sh tools/train.sh
* After the use of data augmentation, the model may tend to be underfitting. It is recommended to reduce `l2_decay` for better performance on validation set. * After the use of data augmentation, the model may tend to be underfitting. It is recommended to reduce `l2_decay` for better performance on validation set.
* hyperparameters exist in almost all agmenatation methods. Here we provide hyperparameters for ImageNet1k dataset. User may need to finetune the hyperparameters on specified dataset. More training tricks can be referred to [**Tricks**](../../../zh_CN/models/Tricks.md). * hyperparameters exist in almost all agmenatation methods. Here we provide hyperparameters for ImageNet1k dataset. User may need to finetune the hyperparameters on specified dataset. More training tricks can be referred to [Tricks](../models_training/train_strategy_en.md).
> If this document is helpful to you, welcome to star our project: [https://github.com/PaddlePaddle/PaddleClas](https://github.com/PaddlePaddle/PaddleClas) > If this document is helpful to you, welcome to star our project: [https://github.com/PaddlePaddle/PaddleClas](https://github.com/PaddlePaddle/PaddleClas)
......
...@@ -169,7 +169,7 @@ python3.7 tools/export.py \ ...@@ -169,7 +169,7 @@ python3.7 tools/export.py \
The exported model can be deployed directly using inference, please refer to [inference deployment](../inference_deployment/). The exported model can be deployed directly using inference, please refer to [inference deployment](../inference_deployment/).
You can also use PaddleLite's opt tool to convert the inference model to a mobile model for its mobile deployment. Please refer to [Mobile Model Deployment](../inference_deployment/paddle_lite_deploy_en.md ) for more details. You can also use PaddleLite's opt tool to convert the inference model to a mobile model for its mobile deployment. Please refer to [Mobile Model Deployment](../inference_deployment/paddle_lite_deploy_en.md) for more details.
<a name="5"></a> <a name="5"></a>
......
# Multilabel classification quick start # Multilabel classification quick start
Based on the [NUS-WIDE-SCENE](https://lms.comp.nus.edu.sg/wp-content/uploads/2019/research/nuswide/NUS-WIDE.html) dataset which is a subset of NUS-WIDE dataset, you can experience multilabel of PaddleClas, include training, evaluation and prediction. Please refer to [Installation](install.md) to install at first. Based on the [NUS-WIDE-SCENE](https://lms.comp.nus.edu.sg/wp-content/uploads/2019/research/nuswide/NUS-WIDE.html) dataset which is a subset of NUS-WIDE dataset, you can experience multilabel of PaddleClas, include training, evaluation and prediction. Please refer to [Installation](../../installation/) to install at first.
## Preparation ## Preparation
......
...@@ -16,7 +16,7 @@ ...@@ -16,7 +16,7 @@
- [2.2.4 GridMask](#2.2.4) - [2.2.4 GridMask](#2.2.4)
- [2.3 Image mix](#2.3) - [2.3 Image mix](#2.3)
- [2.3.1 Mixup](#2.3.1) - [2.3.1 Mixup](#2.3.1)
- [2.3.2 Cutmix](#3.2.2) - [2.3.2 Cutmix](#2.3.2)
<a name="1"></a> <a name="1"></a>
## 1. Introduction to data augmentation ## 1. Introduction to data augmentation
......
...@@ -3,7 +3,7 @@ ...@@ -3,7 +3,7 @@
- [1. Dataset Introduction](#1) - [1. Dataset Introduction](#1)
- [1.1 ImageNet-1k](#1.1) - [1.1 ImageNet-1k](#1.1)
- [1.2 CIFAR-10/CIFAR-100](#1.2) - [1.2 CIFAR-10/CIFAR-100](#1.2)
- [2. Image Classification Process](2) - [2. Image Classification Process](#2)
- [2.1 Data and its Preprocessing](#2.1) - [2.1 Data and its Preprocessing](#2.1)
- [2.2 Prepare the model](#2.2) - [2.2 Prepare the model](#2.2)
- [2.3 Train the model](#2.3) - [2.3 Train the model](#2.3)
......
...@@ -30,7 +30,7 @@ Two learning paradigms are adopted in Metric Learning: ...@@ -30,7 +30,7 @@ Two learning paradigms are adopted in Metric Learning:
<a name="3.1"></a> <a name="3.1"></a>
### 3.1 Classification based: ### 3.1 Classification based:
This refers to methods based on classification labels. They learn the effective feature representation by classifying each sample into the correct category and require the participation of the explicit labels of each sample in the Loss calculation during the learning process. Common algorithms include [L2-Softmax](https://arxiv.org/abs/1703.09507), [Large-margin Softmax](https://arxiv.org/abs/1612.02295), [Angular Softmax]( https://arxiv.org/pdf/1704.08063.pdf), [NormFace](https://arxiv.org/abs/1704.06369), [AM-Softmax](https://arxiv.org/abs/1801.05599), [CosFace](https://arxiv.org/abs/1801.09414), [ArcFace](https://arxiv.org/abs/1801.07698), etc. These methods are also called proxy-based, because what they optimize is essentially the similarity between a sample and a set of proxies. This refers to methods based on classification labels. They learn the effective feature representation by classifying each sample into the correct category and require the participation of the explicit labels of each sample in the Loss calculation during the learning process. Common algorithms include [L2-Softmax](https://arxiv.org/abs/1703.09507), [Large-margin Softmax](https://arxiv.org/abs/1612.02295), [Angular Softmax](https://arxiv.org/pdf/1704.08063.pdf), [NormFace](https://arxiv.org/abs/1704.06369), [AM-Softmax](https://arxiv.org/abs/1801.05599), [CosFace](https://arxiv.org/abs/1801.09414), [ArcFace](https://arxiv.org/abs/1801.07698), etc. These methods are also called proxy-based, because what they optimize is essentially the similarity between a sample and a set of proxies.
<a name="3.2"></a> <a name="3.2"></a>
### 3.2 Pairwise based: ### 3.2 Pairwise based:
......
# Cartoon Character Recognition
Since the 1970s, face recognition has been one of the most important topics in computer vision and biometrics. In recent years, traditional face recognition methods have been replaced by deep learning methods based on convolutional neural networks (CNNs). Face recognition is now widely used in security, commerce, finance, intelligent self-service terminals, entertainment and other fields. Driven by strong industrial demand, animation media has received growing attention, and face recognition of animation characters has become a new research field.
## 1 Pipeline
See the pipeline of [feature learning](./feature_learning_en.md) for details. It is worth noting that the `Neck` module is not used in this process.
The config file: [ResNet50_icartoon.yaml](../../../ppcls/configs/Cartoonface/ResNet50_icartoon.yaml)
The details are as follows.
### 1.1 Data Augmentation
- `RandomCrop`: 224x224
- `RandomFlip`
- `Normalize`: normalize images to 0~1
### 1.2 Backbone
`ResNet50` is used as the backbone, and a large model is used for distillation.
### 1.3 Metric Learning Losses
`CELoss` is used for training.
## 2 Experiment
This method is validated on the iCartoonFace [1] dataset. The dataset consists of 389,678 images of 5,013 cartoon characters, annotated with ID, bounding box, pose and other auxiliary attributes. It is the largest cartoon media dataset in the field of image recognition.
Compared with other datasets, iCartoonFace has clear advantages in both the number of images and the number of entities. The training set includes 5,013 classes and 389,678 images; the query set has 2,500 images and the gallery set has 20,000 images.
![icartoon](../../images/icartoon1.png)
It is worth noting that, compared with the face recognition task, the accessories, props, hairstyle and other factors of cartoon characters' head portraits can significantly improve the recognition accuracy. Therefore, based on the annotation box of the original dataset, we double the length and width of bbox to get a more comprehensive cartoon character image.
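
The box enlargement described above can be sketched as follows. This is a minimal illustration, not the project's preprocessing code: the function name `expand_bbox`, scaling about the box center, and clipping to the image borders are all assumptions, since the text only states that the width and height of the bbox are doubled.

```python
def expand_bbox(bbox, img_w, img_h, scale=2.0):
    """Scale a box (x1, y1, x2, y2) about its center and clip it to the image.

    Assumption: the enlargement is centered on the original box and the
    result is clipped to the image borders.
    """
    x1, y1, x2, y2 = bbox
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w = (x2 - x1) * scale / 2.0
    half_h = (y2 - y1) * scale / 2.0
    return (
        max(0.0, cx - half_w),
        max(0.0, cy - half_h),
        min(float(img_w), cx + half_w),
        min(float(img_h), cy + half_h),
    )


# A 20x40 face box becomes a 40x80 box, clipped at the bottom edge of the image.
print(expand_bbox((40, 40, 60, 80), img_w=200, img_h=100))
```

Clipping means that boxes near the border end up smaller than twice the original size, which may or may not match how the original crops were produced.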
On this dataset, the Recall@1 of this method reaches 83.24%.
## 3 References
[1] Cartoon Face Recognition: A Benchmark Dataset. 2020. [download](https://github.com/luxiangju-PersonAI/iCartoonFace)
# Feature Learning
This part mainly explains the training mode of feature learning, which is the `RecModel` training mode in code. The main purpose of feature learning is to support applications that need robust features to identify objects, such as vehicle recognition (vehicle fine-grained classification, vehicle ReID), logo recognition, cartoon character recognition and product recognition. Different from training a classification network on ImageNet, feature learning mainly has the following features:
- Support truncating the `backbone`, so that features of any intermediate layer can be extracted
- Support adding configurable layers after the `backbone` output, namely the `Neck`
- Support `Arcface Loss` and other `metric learning` loss functions to improve feature learning ability
## 1 Pipeline
![](../../images/recognition/rec_pipeline.png)
The overall structure of feature learning is shown in the figure above. It mainly includes `Data Augmentation`, `Backbone`, `Neck`, `Metric Learning` and so on. The `Neck` is a freely added set of layers, such as an `Embedding layer`, and can be omitted if not needed. During training, the `Metric Learning` loss is used to optimize the model. Generally, the output of the `Neck` is used as the feature output in the inference stage.
## 2 Config Description
The feature learning config file description can be found in [yaml description](../tutorials/config_en.md).
## 3 Pretrained Model
The following are the pretrained models trained on different datasets.
- Vehicle Fine-Grained Classification: [CompCars](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/pretrain/vehicle_cls_ResNet50_CompCars_v1.2_pretrained.pdparams)
- Vehicle ReID: [VERI-Wild](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/pretrain/vehicle_reid_ResNet50_VERIWild_v1.1_pretrained.pdparams)
- Cartoon Character Recognition: [iCartoon](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/pretrain/cartoon_rec_ResNet50_iCartoon_v1.0_pretrained.pdparams)
- Logo Recognition: [Logo 3K](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/pretrain/logo_rec_ResNet50_Logo3K_v1.1_pretrained.pdparams)
- Product Recognition: [Inshop](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/pretrain/product_ResNet50_vd_Inshop_pretrained_v1.1.pdparams), [Aliproduct](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/pretrain/product_ResNet50_vd_Aliproduct_v1.0_pretrained.pdparams)
application
================================
.. toctree::
   :maxdepth: 2

   transfer_learning_en.md
   object_detection_en.md
# Logo Recognition
Logo recognition is widely used in real life, for example to detect whether the Adidas or Nike logo appears in a photo, or whether the Starbucks or Coca-Cola logo appears on a cup. When the number of logo categories is large, a two-stage approach of detection followed by recognition is usually used. The detection module detects potential logo areas and feeds them to the recognition module, which identifies the category. The recognition module mostly adopts a retrieval-based method: it ranks the gallery by similarity to the query to obtain the predicted category. This document mainly introduces the feature learning part.
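
The retrieval step can be illustrated with a small sketch. It assumes cosine similarity as the similarity measure and uses made-up 3-dimensional features and hypothetical labels (`logo_a`, `logo_b`, `logo_c`); the actual features come from the trained model and are much higher-dimensional.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_gallery(query, gallery):
    """Return (label, similarity) pairs sorted from most to least similar."""
    scored = [(label, cosine_similarity(query, feat)) for label, feat in gallery]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy gallery: one feature vector per labeled logo image (illustrative values).
gallery = [
    ("logo_a", [1.0, 0.0, 0.0]),
    ("logo_b", [0.0, 1.0, 0.0]),
    ("logo_c", [0.6, 0.8, 0.0]),
]
ranking = rank_gallery([0.1, 0.9, 0.0], gallery)
predicted_label = ranking[0][0]  # label of the most similar gallery item
print(predicted_label)
```

The predicted category is simply the label of the top-ranked gallery item, so adding a new logo category only requires adding its features to the gallery.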
## 1 Pipeline
See the pipeline of [feature learning](./feature_learning_en.md) for details.
The config file of logo recognition: [ResNet50_ReID.yaml](../../../ppcls/configs/Logo/ResNet50_ReID.yaml).
The details are as follows.
### 1.1 Data Augmentation
Different from classification, this part mainly uses the following methods:
- `Resize` to 224. The input image has already been cropped by a logo detector using the bbox.
- [AugMix](https://arxiv.org/abs/1912.02781v1): Simulate lighting changes, camera position changes and other real scenes.
- [RandomErasing](https://arxiv.org/pdf/1708.04896v2.pdf): Simulate occlusion.
### 1.2 Backbone
`ResNet50` is used as the backbone, with the following modifications:
- Set the last stage stride to 1, which keeps the final output feature map at 14x14. At the cost of a small amount of extra computation, this greatly improves the feature representation ability.
- Use ImageNet pretrained weights
Code: [ResNet50_last_stage_stride1](../../../ppcls/arch/backbone/variant_models/resnet_variant.py)
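
The 14x14 figure follows from the downsampling factors of the network. The sketch below is a simplified calculation, assuming a 224x224 input and the usual ResNet50 layout (stride-2 stem convolution, stride-2 max pooling, then four stages with strides 1, 2, 2, 2); padding details are ignored, and `feature_map_size` is a helper introduced here for illustration.

```python
def feature_map_size(input_size, strides):
    """Spatial size after applying a sequence of downsampling strides."""
    size = input_size
    for stride in strides:
        size //= stride
    return size


# Standard ResNet50: overall stride 32, so a 224 input gives a 7x7 map.
standard = feature_map_size(224, [2, 2, 1, 2, 2, 2])
# Last stage stride changed from 2 to 1: overall stride 16, giving 14x14.
modified = feature_map_size(224, [2, 2, 1, 2, 2, 1])
print(standard, modified)
```

The final feature map therefore has four times as many spatial positions, which is where the extra computation in the last stage comes from.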
### 1.3 Neck
In order to reduce the complexity of calculating feature distance in inference, an embedding convolution layer is added, and the feature dimension is set to 512.
### 1.4 Metric Learning Losses
[PairwiseCosface](../../../ppcls/loss/pairwisecosface.py) and [CircleMargin](../../../ppcls/arch/gears/circlemargin.py) [1] are used. The weight ratio of the two losses is 1:1.
## 2 Experiment
<img src="../../images/logo/logodet3k.jpg" style="zoom:50%;" />
LogoDet-3K[2] dataset is used for experiments. The dataset is fully labeled, with 3000 logo categories, about 200,000 high-quality manually labeled logo objects and 158,652 images.
Since the dataset was originally designed for the detection task, only the cropped logo areas are used in the logo recognition stage. The labeled bbox annotations are used to crop the logo areas to form the training set, eliminating the influence of the background in the recognition stage. After cropping, the dataset is split into 155,427 training images covering 3000 logo categories (also used as the gallery during testing) and 3225 test images, which are used as the query set. The cropped dataset is available for [download here](https://arxiv.org/abs/2008.05359).
On this dataset, a single model reaches a Recall@1 of 89.8%.
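
For reference, Recall@1 in a retrieval setting is commonly computed as the fraction of queries whose top-ranked gallery item has the correct label. The sketch below uses this common definition with hypothetical labels and rankings; it is not taken from the PaddleClas evaluation code.

```python
def recall_at_k(query_labels, ranked_gallery_labels, k=1):
    """Fraction of queries with a correct label among their top-k gallery results.

    ranked_gallery_labels[i] lists gallery labels for query i, most similar first.
    """
    hits = 0
    for label, ranked in zip(query_labels, ranked_gallery_labels):
        if label in ranked[:k]:
            hits += 1
    return hits / len(query_labels)


# Hypothetical labels and rankings for four query images.
queries = ["nike", "adidas", "starbucks", "nike"]
rankings = [
    ["nike", "adidas"],
    ["nike", "adidas"],
    ["starbucks", "nike"],
    ["adidas", "nike"],
]
print(recall_at_k(queries, rankings, k=1))
print(recall_at_k(queries, rankings, k=2))
```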
## 3 References
[1] Circle loss: A unified perspective of pair similarity optimization. *Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition*. 2020.
[2] LogoDet-3K: A Large-Scale Image Dataset for Logo Detection[J]. arXiv preprint arXiv:2008.05359, 2020.
# Mainbody Detection
Mainbody detection is a widely used detection technology. It detects one or several mainbody objects in an image and crops the corresponding regions for recognition, thereby completing the entire recognition process. Mainbody detection is the first step of the recognition task and can effectively improve recognition accuracy.
This tutorial will introduce the dataset and model training for mainbody detection in PaddleClas.
## 1. Dataset
The datasets we used for mainbody detection task are shown in the following table.
| Dataset | Image number | Image number used in <br>mainbody detection | Scenarios | Dataset link |
| ------------ | ------------- | -------| ------- | -------- |
| Objects365 | 1,700k | 6k | General Scenarios | [link](https://www.objects365.org/overview.html) |
| COCO2017 | 120k | 5k | General Scenarios | [link](https://cocodataset.org/) |
| iCartoonFace | 2k | 2k | Cartoon Face | [link](https://github.com/luxiangju-PersonAI/iCartoonFace) |
| LogoDet-3k | 3k | 2k | Logo | [link](https://github.com/Wangjing1551/LogoDet-3K-Dataset) |
| RPC | 3k | 3k | Product | [link](https://rpc-dataset.github.io/) |
In the actual training process, all datasets are mixed together. Categories of all the labeled boxes are modified to the category `foreground`, and the detection model we trained just contains one category (`foreground`).
## 2. Model Selection
There are many types of object detection methods such as the commonly used two-stage detectors (FasterRCNN series, etc.), single-stage detectors (YOLO, SSD, etc.), anchor-free detectors (FCOS, etc.) and so on.
PP-YOLO is proposed by [PaddleDetection](https://github.com/PaddlePaddle/PaddleDetection). It deeply optimizes the YOLOv3 model from multiple perspectives such as backbone, data augmentation, regularization strategy, loss function, and post-processing, and reaches the state of the art in the speed-accuracy trade-off. Specifically, the optimization strategies are as follows.
- Better backbone: ResNet50vd-DCN
- Larger training batch size: 8 GPUs and mini-batch size as 24 on each GPU
- [Drop Block](https://arxiv.org/abs/1810.12890)
- [Exponential Moving Average](https://www.investopedia.com/terms/e/ema.asp)
- [IoU Loss](https://arxiv.org/pdf/1902.09630.pdf)
- [Grid Sensitive](https://arxiv.org/abs/2004.10934)
- [Matrix NMS](https://arxiv.org/pdf/2003.10152.pdf)
- [CoordConv](https://arxiv.org/abs/1807.03247)
- [Spatial Pyramid Pooling](https://arxiv.org/abs/1406.4729)
- Better ImageNet pretrain weights
For more information about PP-YOLO, you can refer to the [PP-YOLO tutorial](https://github.com/PaddlePaddle/PaddleDetection/blob/release%2F2.1/configs/ppyolo/README.md).
In the mainbody detection task, we use `ResNet50vd-DCN` as the backbone for better performance. The model is trained with the config file [ppyolov2_r50vd_dcn_365e_coco.yml](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml), in which the dataset path is changed to the mainbody detection dataset.
The final inference model can be downloaded [here](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/inference/ppyolov2_r50vd_dcn_mainbody_v1.0_infer.tar).
## 3. Model training
This section mainly talks about how to train your own mainbody detection model using PaddleDetection on your own dataset.
### 3.1 Prepare for the environment
Download PaddleDetection and install the requirements.
```shell
cd <path/to/clone/PaddleDetection>
git clone https://github.com/PaddlePaddle/PaddleDetection.git
cd PaddleDetection
# install requirements
pip install -r requirements.txt
```
For more installation tutorials, please refer to [Installation tutorial](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/docs/tutorials/INSTALL.md)
### 3.2 Prepare for the dataset
For a customized dataset, you should convert it to COCO format. Please refer to the [Customized dataset tutorial](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/static/docs/tutorials/Custom_DataSet.md) to build your own dataset in COCO format.
In the mainbody detection task, all objects belong to the foreground. Therefore, the `category_id` of every object in the annotation file should be set to 1, and the `categories` map should be modified as follows, so that only the `foreground` class is included.
```json
[{"id": 1, "name": "foreground", "supercategory": "foreground"}]
```
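
The annotation rewrite described above could be scripted along these lines. This is a sketch that operates on an in-memory COCO-style dictionary; the function name `to_single_foreground_class` and the sample annotations are hypothetical, and reading or writing the actual annotation file is left out.

```python
import json


def to_single_foreground_class(coco):
    """Return a copy of a COCO-style dict in which every object is `foreground`."""
    merged = dict(coco)
    # Point every annotation at category 1 without mutating the input.
    merged["annotations"] = [
        dict(ann, category_id=1) for ann in coco.get("annotations", [])
    ]
    merged["categories"] = [
        {"id": 1, "name": "foreground", "supercategory": "foreground"}
    ]
    return merged


# Illustrative annotation content, not taken from any real dataset.
sample = {
    "images": [{"id": 1, "file_name": "a.jpg", "width": 640, "height": 480}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 17, "bbox": [10, 20, 100, 80]},
        {"id": 2, "image_id": 1, "category_id": 3, "bbox": [200, 50, 60, 60]},
    ],
    "categories": [{"id": 3, "name": "car"}, {"id": 17, "name": "cat"}],
}
converted = to_single_foreground_class(sample)
print(json.dumps(converted["categories"]))
```

Applying the same conversion to each source dataset before mixing them keeps the merged training set consistent with a single-class detector.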
### 3.3 Configuration files
You can use `configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml` to train the model. More details are as follows.
<div align='center'>
<img src='../../images/det/PaddleDetection_config.png' width='400'/>
</div>
`ppyolov2_r50vd_dcn_365e_coco.yml` depends on other configuration files. Their meanings are as follows.
```
coco_detection.yml:num_class of the model, and train/eval/test dataset.
runtime.yml:public runtime parameters, use_gpu, save_interval, etc.
optimizer_365e.yml:learning rate and optimizer.
ppyolov2_r50vd_dcn.yml:model architecture.
ppyolov2_reader.yml:train/eval/test reader.
```
In mainbody detection task, you need to modify `num_classes` in `datasets/coco_detection.yml` to 1 (just `foreground` is included). Dataset path should also be updated.
### 3.4 Begin the training process
PaddleDetection supports several training modes.
* Training using single GPU
```bash
# not needed for windows and Mac
export CUDA_VISIBLE_DEVICES=0
python tools/train.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml
```
* Training using multiple GPUs
```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml --eval
```
`--eval`: evaluate during training.
* (**Recommended**) Model fine-tuning
If you want to fine-tune the model on your own dataset, you can run the following command.
```bash
export CUDA_VISIBLE_DEVICES=0
# assign pretrain_weights, load the general mainbody-detection pretrained model
python tools/train.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml -o pretrain_weights=https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/pretrain/ppyolov2_r50vd_dcn_mainbody_v1.0_pretrained.pdparams
```
* Resume training: you can use `-r` to load checkpoints and resume training.
```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml --eval -r output/ppyolov2_r50vd_dcn_365e_coco/10000
```
Note:
If an `out of memory` error occurs, try decreasing `batch_size` in `ppyolov2_reader.yml`.
### 3.5 Model prediction
Use the following command to finish the prediction process.
```bash
export CUDA_VISIBLE_DEVICES=0
python tools/infer.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml --infer_img=your_image_path.jpg --output_dir=infer_output/ --draw_threshold=0.5 -o weights=output/ppyolov2_r50vd_dcn_365e_coco/model_final
```
`--draw_threshold` is an optional parameter.
### 3.6 Export model and inference.
Use the following to export the inference model.
```bash
python tools/export_model.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml --output_dir=./inference -o weights=output/ppyolov2_r50vd_dcn_365e_coco/model_final.pdparams
```
The inference model will be saved in the folder `inference/ppyolov2_r50vd_dcn_365e_coco`, which contains `model.pdiparams`, `model.pdiparams.info`, `model.pdmodel` and `infer_cfg.yml` (optional for mainbody detection).
* Note: The inference model exported by `PaddleDetection` is named `model.xxx`. To keep it consistent with `PaddleClas`, you can rename `model.xxx` to `inference.xxx` for subsequent inference.
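
The renaming mentioned in the note can be done by hand or with a short script. The sketch below is one possible way, assuming the exported files sit directly in the export folder; `rename_exported_model` is a helper name introduced here, not part of PaddleDetection or PaddleClas.

```python
from pathlib import Path


def rename_exported_model(model_dir):
    """Rename every `model.*` file in model_dir to `inference.*`.

    Other files, such as `infer_cfg.yml`, are left untouched.
    Returns the new file names in sorted order.
    """
    renamed = []
    # sorted() materializes the listing before any file is renamed.
    for path in sorted(Path(model_dir).glob("model.*")):
        target = path.with_name("inference" + path.name[len("model"):])
        path.rename(target)
        renamed.append(target.name)
    return renamed


# Example call (the path is the export folder named in the text above):
# rename_exported_model("inference/ppyolov2_r50vd_dcn_365e_coco")
```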
For more details on model export, please refer to [EXPORT_MODEL](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/deploy/EXPORT_MODEL.md).
Now you have the latest model trained on your own dataset. In the recognition process, you can replace the detection model path with yours. For a quick start of the recognition process, please refer to the [tutorial](../tutorials/quick_start_recognition_en.md).
# General object detection
## Practical Server-side detection method based on RCNN
### Introduction
* In recent years, object detection tasks have attracted widespread attention. [PaddleClas](https://github.com/PaddlePaddle/PaddleClas) open-sourced the ResNet50_vd_SSLD pretrained model based on ImageNet (Top-1 Acc 82.4%). Based on this pretrained model, PaddleDetection provides PSS-DET (Practical Server-side detection), built with the rich set of operators in PaddleDetection. The inference speed reaches 61 FPS on a single V100 GPU at a COCO mAP of 41.6%, and 20 FPS at a COCO mAP of 47.8%.
* We take the standard `Faster RCNN ResNet50_vd FPN` as an example. The following table shows ablation study of PSS-DET.
| Trick | Train scale | Test scale | COCO mAP | Infer speed/FPS |
|- |:-: |:-: | :-: | :-: |
| `baseline` | 640x640 | 640x640 | 36.4% | 43.589 |
| +`test proposal=pre/post topk 500/300` | 640x640 | 640x640 | 36.2% | 52.512 |
| +`fpn channel=64` | 640x640 | 640x640 | 35.1% | 67.450 |
| +`ssld pretrain` | 640x640 | 640x640 | 36.3% | 67.450 |
| +`ciou loss` | 640x640 | 640x640 | 37.1% | 67.450 |
| +`DCNv2` | 640x640 | 640x640 | 39.4% | 60.345 |
| +`3x, multi-scale training` | 640x640 | 640x640 | 41.0% | 60.345 |
| +`auto augment` | 640x640 | 640x640 | 41.4% | 60.345 |
| +`libra sampling` | 640x640 | 640x640 | 41.6% | 60.345 |
Based on the ablation experiments, Cascade RCNN and a larger inference scale (1000x1500) are used for better performance, giving a final COCO mAP of 47.8%. The following figure shows the `mAP-Speed` curves for some common detectors.
![pssdet](../../images/det/pssdet.png)
**Note**
> For fair comparison, inference time for PSS-DET models on V100 GPU is transformed to Titan V GPU by multiplying by 1.2 times.
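
The conversion in the note can be made concrete with a short calculation. The sketch assumes that the factor 1.2 is applied to the per-image inference time, so the equivalent FPS is the V100 FPS divided by 1.2; `v100_fps_to_titan_v_fps` is a helper introduced here for illustration.

```python
def v100_fps_to_titan_v_fps(fps_v100, time_factor=1.2):
    """Convert V100 throughput to the Titan V equivalent used for comparison.

    Assumption: the factor scales per-image inference time, so FPS shrinks
    by the same factor.
    """
    latency_v100_ms = 1000.0 / fps_v100
    latency_titan_v_ms = latency_v100_ms * time_factor
    return 1000.0 / latency_titan_v_ms


# Last row of the ablation table: 60.345 FPS measured on V100.
print(round(v100_fps_to_titan_v_fps(60.345), 2))
```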
For more detailed information, you can refer to [PaddleDetection](https://github.com/PaddlePaddle/PaddleDetection/tree/master/configs/rcnn_server_side_det).
## Practical Mobile-side detection method based on RCNN
* This part is coming soon!
# Product Recognition
Product recognition is now widely used. Shopping by taking a photo has been adopted by many people, and unmanned checkout platforms, which are also supported by product recognition technology, have entered major supermarkets. The technology follows the process of "product detection + product identification". The product detection module detects potential product areas, and the product identification model identifies the main body detected by the product detection module. The recognition module uses a retrieval method to rank the products in the database by their similarity to the query image. This document mainly introduces the feature extraction part for product pictures.
## 1 Pipeline
See the pipeline of [feature learning](./feature_learning_en.md) for details.
The config file: [ResNet50_vd_Aliproduct.yaml](../../../ppcls/configs/Products/ResNet50_vd_Aliproduct.yaml)
The details are as follows.
### 1.1 Data Augmentation
- `RandomCrop`: 224x224
- `RandomFlip`
- `Normalize`: normalize images to 0~1
### 1.2 Backbone
`ResNet50_vd` is used as the backbone, which is pretrained on ImageNet.
### 1.3 Neck
A 512 dimensional embedding FC layer without batchnorm and activation is used.
### 1.4 Metric Learning Losses
At present, `CELoss` is used. To obtain more robust features, other losses will be used for training in the future.
## 2 Experiment
This scheme is tested on the Aliproduct [1] dataset, an open source dataset from a Tianchi competition and currently the largest open source product dataset. It has more than 50,000 identification categories and about 2.5 million training pictures.
On this dataset, a single model reaches a Top-1 accuracy of 85.67%.
## 3 References
[1] Weakly Supervised Learning with Side Information for Noisy Labeled Images. ECCV, 2020.
# Transfer learning in image classification
Transfer learning is an important part of machine learning and is widely used in fields such as text and images. Here we mainly introduce transfer learning for image classification, often called domain transfer, for example transferring an ImageNet classification model to a specific image classification task such as flower classification.
## Hyperparameter search
ImageNet is a widely used dataset for image classification, and a set of empirical hyperparameters that give high accuracy on it has been summarized. However, when applied to a specific dataset, these hyperparameters may not be optimal. Two commonly used hyperparameter search methods can help obtain better model hyperparameters.
### Grid search
Grid search, also called exhaustive search, determines the optimal value by evaluating every solution in the search space and picking the best one. The method is simple and effective, but when the search space is large, it requires huge computing resources.
### Bayesian search
Bayesian search, also called Bayesian optimization, starts by randomly selecting a group of hyperparameters in the search space. A Gaussian process is then used to update the hyperparameters: based on the performance of the previous hyperparameters, it computes the expected mean and variance of candidate points. The larger the expected mean, the greater the probability of being close to the optimal solution; the larger the expected variance, the greater the uncertainty. Choosing a hyperparameter point with a large expected mean is usually called `exploitation`, and choosing a point with a large variance is called `exploration`. An acquisition function is defined to balance the expected mean and variance. The hyperparameter point selected in this way is regarded as the most probable location of the optimum.
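
To illustrate how an acquisition function trades off mean against variance, the sketch below uses the upper confidence bound (UCB), one common choice. The text does not say which acquisition function was actually used, so UCB, the `kappa` weight, and the candidate means and standard deviations are all assumptions made up for illustration.

```python
def upper_confidence_bound(mean, std, kappa=2.0):
    """UCB acquisition: a larger kappa favors uncertain points (exploration)."""
    return mean + kappa * std


# Hypothetical posterior (expected accuracy, standard deviation) per candidate.
candidates = {
    "lr=0.01": (0.92, 0.01),
    "lr=0.003": (0.90, 0.03),
    "lr=0.1": (0.85, 0.02),
}

scores = {name: upper_confidence_bound(m, s) for name, (m, s) in candidates.items()}
next_point = max(scores, key=scores.get)

# With kappa=0 the choice is pure exploitation: the highest expected mean wins.
greedy_scores = {
    name: upper_confidence_bound(m, s, kappa=0.0) for name, (m, s) in candidates.items()
}
greedy_point = max(greedy_scores, key=greedy_scores.get)
print(next_point, greedy_point)
```

In this toy setting the more uncertain `lr=0.003` candidate is tried next under UCB, while a purely greedy rule would pick `lr=0.01`.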
Based on the two search schemes above, we ran experiments with a fixed scheme and the two search schemes on 8 open source datasets. Following the experimental setup of [1], we search over 4 hyperparameters. The search space and the experimental results are as follows:
- Fixed scheme.
```
lr=0.003,l2 decay=1e-4,label smoothing=False,mixup=False
```
- Search space of the hyperparameters.
```
lr: [0.1, 0.03, 0.01, 0.003, 0.001, 0.0003, 0.0001]
l2 decay: [1e-3, 3e-4, 1e-4, 3e-5, 1e-5, 3e-6, 1e-6]
label smoothing: [False, True]
mixup: [False, True]
```
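
The size of this search space can be checked by enumerating it. The sketch below builds every combination with `itertools.product`; the dictionary keys are illustrative names chosen here, not configuration keys from PaddleClas.

```python
from itertools import product

# Search space copied from the listing above; key names are illustrative.
search_space = {
    "lr": [0.1, 0.03, 0.01, 0.003, 0.001, 0.0003, 0.0001],
    "l2_decay": [1e-3, 3e-4, 1e-4, 3e-5, 1e-5, 3e-6, 1e-6],
    "label_smoothing": [False, True],
    "mixup": [False, True],
}

# One dict per grid point: 7 * 7 * 2 * 2 combinations in total.
grid = [dict(zip(search_space, values)) for values in product(*search_space.values())]
print(len(grid))
print(grid[0])
```

Each grid point corresponds to one full training run, which is why exhaustive search becomes expensive as more values or hyperparameters are added.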
Grid search takes 196 trials (7 x 7 x 2 x 2 combinations), while Bayesian search needs about 10 times fewer. The baseline is trained from the ImageNet1k pretrained ResNet50_vd model with the fixed scheme. The experimental results are shown below.
| Dataset | Fixed scheme | Grid search | Grid search trials | Bayesian search | Bayesian search trials |
| ------------------ | -------- | -------- | -------- | -------- | ---------- |
| Oxford-IIIT-Pets | 93.64% | 94.55% | 196 | 94.04% | 20 |
| Oxford-102-Flowers | 96.08% | 97.69% | 196 | 97.49% | 20 |
| Food101 | 87.07% | 87.52% | 196 | 87.33% | 23 |
| SUN397 | 63.27% | 64.84% | 196 | 64.55% | 20 |
| Caltech101 | 91.71% | 92.54% | 196 | 92.16% | 14 |
| DTD | 76.87% | 77.53% | 196 | 77.47% | 13 |
| Stanford Cars | 85.14% | 92.72% | 196 | 92.72% | 25 |
| FGVC Aircraft | 80.32% | 88.45% | 196 | 88.36% | 20 |
- The above experiments show that, compared with grid search, Bayesian search reduces the number of searches by about 10 times while lowering accuracy by only 0 to about 0.5 percentage points.
- The search space can be expanded easily using Bayesian search.
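
The numbers behind the first conclusion can be recomputed directly from the table. The sketch below copies the grid search accuracy, Bayesian search accuracy and Bayesian search count for each dataset and derives the accuracy gap and the average reduction in the number of searches.

```python
# (grid search Top-1 %, Bayesian search Top-1 %, Bayesian search count) from the table.
results = {
    "Oxford-IIIT-Pets": (94.55, 94.04, 20),
    "Oxford-102-Flowers": (97.69, 97.49, 20),
    "Food101": (87.52, 87.33, 23),
    "SUN397": (64.84, 64.55, 20),
    "Caltech101": (92.54, 92.16, 14),
    "DTD": (77.53, 77.47, 13),
    "Stanford Cars": (92.72, 92.72, 25),
    "FGVC Aircraft": (88.45, 88.36, 20),
}
GRID_TRIALS = 196

# Accuracy lost by Bayesian search relative to grid search, in percentage points.
gaps = {name: round(grid - bayes, 2) for name, (grid, bayes, _) in results.items()}
worst_dataset = max(gaps, key=gaps.get)

mean_bayes_trials = sum(trials for _, _, trials in results.values()) / len(results)
reduction = GRID_TRIALS / mean_bayes_trials

print(worst_dataset, gaps[worst_dataset])
print(mean_bayes_trials, round(reduction, 1))
```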
## Large-scale image classification
In practical applications, due to the lack of training data, the classification model trained on the ImageNet1k data set is often used as the pretrained model for other image classification tasks. In order to further help solve practical problems, based on ResNet50_vd, Baidu open sourced a self-developed large-scale classification pretrained model, in which the training data contains 100,000 categories and 43 million pictures. The pretrained model can be downloaded as follows:[**download link**](https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ResNet50_vd_10w_pretrained.pdparams)
We conducted transfer learning experiments on 6 self-collected datasets, using a set of fixed parameters and a grid search method. The number of training epochs was set to 20, the ResNet50_vd model was used, and its ImageNet pretraining accuracy was 79.12%. The dataset statistics and the comparison of model accuracy are as follows:
Fixed scheme:
```
lr=0.001,l2 decay=1e-4,label smoothing=False,mixup=False
```
| Dataset | Statistics | **Pretrained model on ImageNet <br />Top-1(fixed)/Top-1(search)** | **Pretrained model on large-scale dataset<br />Top-1(fixed)/Top-1(search)** |
| --------------- | ----------------------------------------- | -------------------------------------------------------- | --------------------------------------------------------- |
| Flowers | class:102<br />train:5789<br />valid:2396 | 0.7779/0.9883 | 0.9892/0.9954 |
| Hand-painted stick figures | Class:18<br />train:1007<br />valid:432 | 0.8795/0.9196 | 0.9107/0.9219 |
| Leaves | class:6<br />train:5256<br />valid:2278 | 0.8212/0.8482 | 0.8385/0.8659 |
| Container vehicle | Class:115<br />train:4879<br />valid:2094 | 0.6230/0.9556 | 0.9524/0.9702 |
| Chair | class:5<br />train:169<br />valid:78 | 0.8557/0.9688 | 0.9077/0.9792 |
| Geology | class:4<br />train:671<br />valid:296 | 0.5719/0.8094 | 0.6781/0.8219 |
- The above experiments verified that for fixed parameters, compared with the pretrained model on ImageNet, using the large-scale classification model as a pretrained model can help us improve the model performance on a new dataset in most cases. Parameter search can be further helpful to the model performance.
## Reference
[1] Kornblith, Simon, Jonathon Shlens, and Quoc V. Le. "Do better imagenet models transfer better?." *Proceedings of the IEEE conference on computer vision and pattern recognition*. 2019.
[2] Kolesnikov, Alexander, et al. "Large Scale Learning of General Visual Representations for Transfer." *arXiv preprint arXiv:1912.11370* (2019).
# Vehicle Recognition
This section covers two tasks: vehicle fine-grained classification and vehicle ReID.
The goal of fine-grained classification is to recognize images belonging to multiple subordinate categories of a super-category, e.g., different species of animals/plants, different models of cars, different kinds of retail products. Fine-grained vehicle classification therefore classifies vehicles into their subcategories.
Vehicle ReID aims to retrieve images of the same vehicle across non-overlapping camera views, given a query image. It has many practical applications, such as analyzing and managing traffic flow in intelligent transport systems. Extracting robust features is particularly important for this task.
In this document, the same training scheme is applied to both tasks.
## 1 Pipeline
See the pipeline of [feature learning](./feature_learning_en.md) for details.
The config file of Vehicle ReID: [ResNet50_ReID.yaml](../../../ppcls/configs/Vehicle/ResNet50_ReID.yaml).
The config file of Vehicle fine-grained classification: [ResNet50.yaml](../../../ppcls/configs/Vehicle/ResNet50.yaml).
The details are as follows.
### 1.1 Data Augmentation
Unlike in classification, this part mainly uses the following methods:
- `Resize` to 224. For ReID in particular, the vehicle image has already been cropped by the detector using the bounding box, so applying `CenterCrop` would discard more vehicle information.
- [AugMix](https://arxiv.org/abs/1912.02781v1): simulates lighting changes, camera position changes, and other real-world conditions.
- [RandomErasing](https://arxiv.org/pdf/1708.04896v2.pdf): simulates occlusion.
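
To illustrate how erasing simulates occlusion, here is a minimal, dependency-free sketch. It is a simplification, not the PaddleClas implementation: it works on a 2-D list of pixel values, always erases a square of fixed relative area, and fills it with a constant. The function name and the `area_frac` and `fill` parameters are assumptions made for this example.

```python
import random


def random_erase(image, area_frac=0.2, fill=0, rng=random):
    """Return a copy of a 2-D image (list of rows) with one random rectangle set to `fill`."""
    height, width = len(image), len(image[0])
    # Side lengths of the erased rectangle, chosen so it covers roughly `area_frac` of the image.
    erase_h = max(1, round(height * area_frac ** 0.5))
    erase_w = max(1, round(width * area_frac ** 0.5))
    # Random top-left corner such that the rectangle stays inside the image.
    top = rng.randint(0, height - erase_h)
    left = rng.randint(0, width - erase_w)
    erased = [row[:] for row in image]  # copy so the input is not modified
    for y in range(top, top + erase_h):
        for x in range(left, left + erase_w):
            erased[y][x] = fill
    return erased


image = [[255] * 8 for _ in range(8)]
erased = random_erase(image, area_frac=0.25, rng=random.Random(0))
erased_pixels = sum(row.count(0) for row in erased)
print(erased_pixels)  # a 4x4 block of the 8x8 image is erased
```

In practice the erased area, aspect ratio, and fill values are usually sampled randomly for each image, and the operation is applied only with some probability.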
### 1.2 Backbone
`ResNet50` is used as the backbone, with the following modification:
- The stride of the last stage is set to 1, which keeps the final output feature map at 14x14. At the cost of a small amount of extra computation, this greatly improves the feature representation ability.
code: [ResNet50_last_stage_stride1](../../../ppcls/arch/backbone/variant_models/resnet_variant.py)
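
The effect of the last-stage stride on the output size can be checked with simple arithmetic. The sketch below assumes the standard ResNet50 layout (a stride-2 stem convolution, a stride-2 max pooling, then four stages with strides 1, 2, 2, 2) and a 224x224 input; the helper name is hypothetical.

```python
def resnet50_feature_map_size(input_size=224, last_stage_stride=2):
    """Spatial size of the final feature map for a square input."""
    # Stem conv, max pooling, then stages 1 to 4.
    strides = [2, 2, 1, 2, 2, last_stage_stride]
    size = input_size
    for stride in strides:
        size //= stride
    return size


default_size = resnet50_feature_map_size(last_stage_stride=2)
modified_size = resnet50_feature_map_size(last_stage_stride=1)
print(default_size, modified_size)  # 7 14
```

With stride 1 in the last stage, the feature map keeps four times as many spatial positions (14x14 instead of 7x7), which is where the extra computation comes from.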
### 1.3 Neck
To reduce the cost of computing feature distances at inference time, an embedding convolution layer is added, and the feature dimension is set to 512.
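
As a rough illustration of why a smaller embedding helps, the sketch below compares the number of multiply-adds needed to match one query against a gallery with 2048-dimensional features (the output width of a standard ResNet50) and with 512-dimensional embeddings. The gallery size and helper names are assumptions for this example, and cosine similarity is used only as a typical distance measure.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def multiply_adds(gallery_size, dim):
    """Multiply-adds for the dot products of one query against the whole gallery."""
    return gallery_size * dim


BACKBONE_DIM = 2048   # standard ResNet50 output feature dimension
EMBEDDING_DIM = 512   # embedding dimension used in this document
gallery_size = 10_000  # hypothetical gallery size

saving = multiply_adds(gallery_size, BACKBONE_DIM) / multiply_adds(gallery_size, EMBEDDING_DIM)
print(f"distance computation is {saving:.0f}x cheaper with 512-d embeddings")
```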
### 1.4 Metric Learning Losses
In vehicle ReID and vehicle fine-grained classification, [SupConLoss](../../../ppcls/loss/supconloss.py) and [ArcLoss](../../../ppcls/arch/gears/arcmargin.py) are used. The two losses are weighted 1:1.
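
The following sketch shows the general idea of an additive angular margin and of combining two losses with equal weights. It is not the PaddleClas code linked above: the `margin` and `scale` values, the function names, and the example cosine values are assumptions, and edge cases that real implementations handle (for example angles that exceed pi after adding the margin) are ignored.

```python
import math


def arc_margin_logits(cosines, label, margin=0.5, scale=30.0):
    """Add an angular margin to the target class, then scale all logits."""
    logits = []
    for class_id, cos_theta in enumerate(cosines):
        if class_id == label:
            theta = math.acos(max(-1.0, min(1.0, cos_theta)))
            cos_theta = math.cos(theta + margin)  # penalize the target class
        logits.append(scale * cos_theta)
    return logits


def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for one sample."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum - logits[label]


def total_loss(losses, weights):
    """Weighted sum of individual loss values."""
    return sum(w * value for w, value in zip(weights, losses))


# Hypothetical cosine similarities between one embedding and three class centers.
cosines = [0.8, 0.1, -0.2]
plain_loss = cross_entropy([30.0 * c for c in cosines], label=0)
arc_loss = cross_entropy(arc_margin_logits(cosines, label=0), label=0)

# 1:1 weighting of two loss terms; the first value stands in for a SupConLoss result.
combined = total_loss([1.0, arc_loss], weights=[1.0, 1.0])
print(plain_loss < arc_loss, combined)
```

The margin makes the target class harder to satisfy, so the same embedding yields a larger loss than plain softmax, which pushes embeddings of the same class closer together.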
## 2 Experiment
### 2.1 Vehicle ReID
<img src="../../images/recognition/vehicle/cars.JPG" style="zoom:50%;" />
This method is evaluated on the VERI-Wild dataset. The dataset was captured by a large CCTV surveillance system in an unconstrained scenario over one month (30 * 24 hours). The system consists of 174 cameras distributed over an area of more than 200 square kilometers. The original image set contains 12 million vehicle images. After data cleaning and labeling, 416314 images of 40671 vehicle IDs were collected. [See the paper for details](https://github.com/PKU-IMRE/VERI-Wild).
| **Methods** | **Small** | | |
| :--------------------------: | :-------: | :-------: | :-------: |
| | mAP | Top1 | Top5 |
| Strong baseline(Resnet50)[1] | 76.61 | 90.83 | 97.29 |
| HPGN(Resnet50+PGN)[2] | 80.42 | 91.37 | - |
| GLAMOR(Resnet50+PGN)[3] | 77.15 | 92.13 | 97.43 |
| PVEN(Resnet50)[4] | 79.8 | 94.01 | 98.06 |
| SAVER(VAE+Resnet50)[5] | 80.9 | 93.78 | 97.93 |
| PaddleClas baseline | 80.57 | **93.81** | **98.06** |
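
For readers unfamiliar with the metrics in the table, the sketch below shows how Top-k and average precision can be computed for a single query from a ranked gallery; mAP is the mean of the average precision over all queries. The IDs are made up, and dataset-specific evaluation rules (such as filtering gallery images by camera) are not modeled.

```python
def top_k_hit(ranked_ids, query_id, k):
    """True if a gallery image with the query's vehicle ID appears in the first k results."""
    return query_id in ranked_ids[:k]


def average_precision(ranked_ids, query_id):
    """Average of the precision values at each rank where a correct match occurs."""
    hits, precision_sum = 0, 0.0
    for rank, gallery_id in enumerate(ranked_ids, start=1):
        if gallery_id == query_id:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


# Hypothetical ranking: gallery vehicle IDs sorted by descending similarity to the query.
ranked_ids = ["car_a", "car_b", "car_a", "car_c"]
ap = average_precision(ranked_ids, "car_a")
print(top_k_hit(ranked_ids, "car_a", k=1), round(ap, 4))  # True 0.8333
```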
### 2.2 Vehicle Fine-grained Classification
In this application, we use [CompCars](http://mmlab.ie.cuhk.edu.hk/datasets/comp_cars/index.html) as the training dataset.
![](../../images/recognition/vehicle/CompCars.png)
The images in the dataset come mainly from the web and from surveillance data. The web data covers 163 car makes and 1716 car models, with **136726** full-vehicle images and **27618** partial-vehicle images. The web data also provides bounding boxes, viewpoints, and five attributes (maximum speed, displacement, number of doors, number of seats, and car type) for the vehicles. The surveillance data contains **50000** front-view images.
Note that labels for this dataset must be generated according to your own needs. For example, in this demo, vehicles of the same model produced in different years are treated as the same category, which gives 431 categories in total.
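
The label generation described above can be sketched as follows. The record layout `(make, model, year)` and the sample values are hypothetical; the point is only that the year is ignored when class IDs are assigned.

```python
def build_label_map(samples):
    """Map each (make, model) pair to a contiguous class ID, ignoring the production year."""
    label_map = {}
    for make, model, _year in samples:
        # setdefault assigns the next free ID only the first time a pair is seen.
        label_map.setdefault((make, model), len(label_map))
    return label_map


# Hypothetical annotation records: (make, model, year).
samples = [
    ("make_a", "model_x", 2012),
    ("make_a", "model_x", 2014),
    ("make_b", "model_y", 2013),
]
label_map = build_label_map(samples)
labels = [label_map[(make, model)] for make, model, _year in samples]
print(labels)  # [0, 0, 1]
```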
| **Methods** | Top1 Acc |
| :-----------------------------: | :--------: |
| ResNet101-swp[6] | 97.6% |
| Fine-Tuning DARTS[7] | 95.9% |
| Resnet50 + COOC[8] | 95.6% |
| A3M[9] | 95.4% |
| PaddleClas baseline (ResNet50) | **97.37**% |
## 3 References
[1] Bag of Tricks and a Strong Baseline for Deep Person Re-Identification. CVPR Workshop 2019.
[2] Exploring Spatial Significance via Hybrid Pyramidal Graph Network for Vehicle Re-identification. arXiv preprint arXiv:2005.14684.
[3] GLAMORous: Vehicle Re-Id in Heterogeneous Cameras Networks with Global and Local Attention. arXiv preprint arXiv:2002.02256.
[4] Parsing-based View-aware Embedding Network for Vehicle Re-Identification. CVPR 2020.
[5] The Devil is in the Details: Self-Supervised Attention for Vehicle Re-Identification. ECCV 2020.
[6] Deep CNNs With Spatially Weighted Pooling for Fine-Grained Car Recognition. IEEE Transactions on Intelligent Transportation Systems, 2017.
[7] Fine-Tuning DARTS for Image Classification. 2020.
[8] Fine-Grained Vehicle Classification with Unsupervised Parts Co-occurrence Learning. 2018.
[9] Attribute-Aware Attention Model for Fine-grained Representation Learning. 2019.
...@@ -4,7 +4,7 @@ ...@@ -4,7 +4,7 @@
[Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite) is a set of lightweight inference engine which is fully functional, easy to use and then performs well. Lightweighting is reflected in the use of fewer bits to represent the weight and activation of the neural network, which can greatly reduce the size of the model, solve the problem of limited storage space of the mobile device, and the inference speed is better than other frameworks on the whole. [Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite) is a set of lightweight inference engine which is fully functional, easy to use and then performs well. Lightweighting is reflected in the use of fewer bits to represent the weight and activation of the neural network, which can greatly reduce the size of the model, solve the problem of limited storage space of the mobile device, and the inference speed is better than other frameworks on the whole.
In [PaddleClas](https://github.com/PaddlePaddle/PaddleClas), we uses Paddle-Lite to [evaluate the performance on the mobile device](../models/Mobile.md), in this section we uses the `MobileNetV1` model trained on the `ImageNet1k` dataset as an example to introduce how to use `Paddle-Lite` to evaluate the model speed on the mobile terminal (evaluated on SD855) In [PaddleClas](https://github.com/PaddlePaddle/PaddleClas), we uses Paddle-Lite to [evaluate the performance on the mobile device](../models/Mobile_en.md), in this section we uses the `MobileNetV1` model trained on the `ImageNet1k` dataset as an example to introduce how to use `Paddle-Lite` to evaluate the model speed on the mobile terminal (evaluated on SD855)
## Evaluation Steps ## Evaluation Steps
......
...@@ -9,4 +9,4 @@ By using this toolkit, [PaddleClas](https://github.com/PaddlePaddle/PaddleClas) ...@@ -9,4 +9,4 @@ By using this toolkit, [PaddleClas](https://github.com/PaddlePaddle/PaddleClas)
After quantized, the prediction speed is accelerated from 19.308ms to 14.395ms on SD855. After quantized, the prediction speed is accelerated from 19.308ms to 14.395ms on SD855.
The storage size is reduced from 21M to 10M. The storage size is reduced from 21M to 10M.
The top1 recognition accuracy rate is 75.9%. The top1 recognition accuracy rate is 75.9%.
For specific training methods, please refer to [PaddleSlim quant aware](../../../deploy/slim/quant/README_en.md) For specific training methods, please refer to [PaddleSlim quant aware](../../../deploy/slim/README_en.md)
...@@ -33,7 +33,7 @@ Functions of the above modules : ...@@ -33,7 +33,7 @@ Functions of the above modules :
<a name="3"></a> <a name="3"></a>
## 3.General Recognition Models ## 3.General Recognition Models
In PP-Shitu, we have [PP_LCNet_x2_5](../models/PP-LCNet.md) as the backbone network, Linear Layer for Neck, [ArcMargin](../../../ppcls/arch/gears/arcmargin.py) for Head, and CELoss for Loss. See the details in [General Recognition_configuration files](../.././ppcls/configs/GeneralRecognition/). The involved training data covers the following seven public datasets: In PP-Shitu, we have [PP_LCNet_x2_5](../models/PP-LCNet.md) as the backbone network, Linear Layer for Neck, [ArcMargin](../../../ppcls/arch/gears/arcmargin.py) for Head, and CELoss for Loss. See the details in [General Recognition_configuration files](../../../ppcls/configs/GeneralRecognition/). The involved training data covers the following seven public datasets:
| Datasets | Data Size | Class Number | Scenarios | URL | | Datasets | Data Size | Class Number | Scenarios | URL |
| ------------ | --------- | ------------ | ------------------ | ------------------------------------------------------------ | | ------------ | --------- | ------------ | ------------------ | ------------------------------------------------------------ |
......
...@@ -227,7 +227,7 @@ The final directory contains `inference/ppyolov2_r50vd_dcn_365e_coco`, `inferen ...@@ -227,7 +227,7 @@ The final directory contains `inference/ppyolov2_r50vd_dcn_365e_coco`, `inferen
After exporting the model, the path of the detection model can be changed to the inference model path to complete the prediction task. After exporting the model, the path of the detection model can be changed to the inference model path to complete the prediction task.
Take product recognition as an example,you can modify the field `Global.det_inference_model_dir` in its config file [inference_product.yaml](../../../deploy/configs/inference_product.yaml) to the directory of exported inference model, and then finish the detection and recognition of the product with reference to [Quick Start for Image Recognition](./quick_start/quick_start_recognition_en.md). Take product recognition as an example,you can modify the field `Global.det_inference_model_dir` in its config file [inference_product.yaml](../../../deploy/configs/inference_product.yaml) to the directory of exported inference model, and then finish the detection and recognition of the product with reference to [Quick Start for Image Recognition](../quick_start/quick_start_recognition_en.md).
## FAQ ## FAQ
......
...@@ -26,8 +26,6 @@ This tutorial will introduce the detailed steps of deploying the PaddleClas clas ...@@ -26,8 +26,6 @@ This tutorial will introduce the detailed steps of deploying the PaddleClas clas
- Linux, docker is recommended. - Linux, docker is recommended.
- Windows, compilation based on `Visual Studio 2019 Community` is supported. In addition, you can refer to [How to use PaddleDetection to make a complete project](https://zhuanlan.zhihu.com/p/145446681) to compile by generating the `sln solution`. - Windows, compilation based on `Visual Studio 2019 Community` is supported. In addition, you can refer to [How to use PaddleDetection to make a complete project](https://zhuanlan.zhihu.com/p/145446681) to compile by generating the `sln solution`.
- This document mainly introduces the compilation and inference of PaddleClas using C++ in Linux environment. - This document mainly introduces the compilation and inference of PaddleClas using C++ in Linux environment.
- If you need to use the Inference Library in Windows environment, please refer to [The compilation tutorial in Windows](./docs/windows_vs2019_build.md) for detailed information.
<a name="1.1"></a> <a name="1.1"></a>
### 1.1 Compile opencv ### 1.1 Compile opencv
...@@ -254,7 +252,7 @@ After executing the above commands, the dynamic link libraries (`libcls.so` and ...@@ -254,7 +252,7 @@ After executing the above commands, the dynamic link libraries (`libcls.so` and
<a name="3.1"></a> <a name="3.1"></a>
### 3.1 Prepare the inference model ### 3.1 Prepare the inference model
* You can refer to [Model inference](../../tools/export_model.py),export the inference model. After the model is exported, assuming it is placed in the `inference` directory, the directory structure is as follows. * You can refer to [Model inference](../../../tools/export_model.py),export the inference model. After the model is exported, assuming it is placed in the `inference` directory, the directory structure is as follows.
``` ```
inference/ inference/
......
...@@ -95,9 +95,9 @@ The exporting model command will generate the following three files: ...@@ -95,9 +95,9 @@ The exporting model command will generate the following three files:
The inference model exported is used to deployment by using prediction engine. You can refer the following docs according to different deployment modes / platforms The inference model exported is used to deployment by using prediction engine. You can refer the following docs according to different deployment modes / platforms
* [Python inference](./python_deploy.md) * [Python inference](./python_deploy_en.md)
* [C++ inference](./cpp_deploy.md)(Only support classification) * [C++ inference](./cpp_deploy_en.md)(Only support classification)
* [Python Whl inference](./whl_deploy.md)(Only support classification) * [Python Whl inference](./whl_deploy_en.md)(Only support classification)
* [PaddleHub Serving inference](./paddle_hub_serving_deploy.md)(Only support classification) * [PaddleHub Serving inference](./paddle_hub_serving_deploy_en.md)(Only support classification)
* [PaddleServing inference](./paddle_serving_deploy.md) * [PaddleServing inference](./paddle_serving_deploy_en.md)
* [PaddleLite inference](./paddle_lite_deploy.md)(Only support classification) * [PaddleLite inference](./paddle_lite_deploy_en.md)(Only support classification)
...@@ -16,6 +16,9 @@ PaddleClas supports rapid service deployment through Paddlehub. At present, it s ...@@ -16,6 +16,9 @@ PaddleClas supports rapid service deployment through Paddlehub. At present, it s
- [6. Send prediction requests](#6) - [6. Send prediction requests](#6)
- [7. User defined service module modification](#7) - [7. User defined service module modification](#7)
<a name="1"></a>
## 1. Introduction
HubServing service pack contains 3 files, the directory is as follows: HubServing service pack contains 3 files, the directory is as follows:
``` ```
...@@ -45,7 +48,7 @@ Before installing the service module, you need to prepare the inference model an ...@@ -45,7 +48,7 @@ Before installing the service module, you need to prepare the inference model an
**Notice**: **Notice**:
* The model file path can be viewed and modified in `PaddleClas/deploy/hubserving/clas/params.py`. * The model file path can be viewed and modified in `PaddleClas/deploy/hubserving/clas/params.py`.
* It should be noted that the prefix of model structure file and model parameters file must be `inference`. * It should be noted that the prefix of model structure file and model parameters file must be `inference`.
* More models provided by PaddleClas can be obtained from the [model library](../../docs/en/models/models_intro_en.md). You can also use models trained by yourself. * More models provided by PaddleClas can be obtained from the [model library](../models/models_intro_en.md). You can also use models trained by yourself.
<a name="4"></a> <a name="4"></a>
## 4. Install Service Module ## 4. Install Service Module
...@@ -230,4 +233,4 @@ Common parameters can be modified in params.py: ...@@ -230,4 +233,4 @@ Common parameters can be modified in params.py:
'class_id_map_file': 'class_id_map_file':
``` ```
In order to avoid unnecessary delay and be able to predict in batch, the preprocessing (include resize, crop and other) is completed in the client, so modify [test_hubserving.py](../../deploy/hubserving/test_hubserving.py#L35-L52) if necessary. In order to avoid unnecessary delay and be able to predict in batch, the preprocessing (include resize, crop and other) is completed in the client, so modify [test_hubserving.py](../../../deploy/hubserving/test_hubserving.py#L35-L52) if necessary.
...@@ -4,7 +4,7 @@ This tutorial will introduce how to use [Paddle-Lite](https://github.com/PaddleP ...@@ -4,7 +4,7 @@ This tutorial will introduce how to use [Paddle-Lite](https://github.com/PaddleP
Paddle-Lite is a lightweight inference engine for PaddlePaddle. It provides efficient inference capabilities for mobile phones and IoTs, and extensively integrates cross-platform hardware to provide lightweight deployment solutions for mobile-side deployment issues. Paddle-Lite is a lightweight inference engine for PaddlePaddle. It provides efficient inference capabilities for mobile phones and IoTs, and extensively integrates cross-platform hardware to provide lightweight deployment solutions for mobile-side deployment issues.
If you only want to test speed, please refer to [The tutorial of Paddle-Lite mobile-side benchmark test](../../docs/zh_CN/extension/paddle_mobile_inference.md). If you only want to test speed, please refer to [The tutorial of Paddle-Lite mobile-side benchmark test](../extension/paddle_mobile_inference_en.md).
--- ---
...@@ -127,7 +127,7 @@ git checkout develop ...@@ -127,7 +127,7 @@ git checkout develop
After the compilation is complete, the `opt` file is located under `build.opt/lite/api/`. After the compilation is complete, the `opt` file is located under `build.opt/lite/api/`.
`opt` tool is used in the same way as `paddle_lite_opt` , please refer to [4.1](#4.1). `opt` tool is used in the same way as `paddle_lite_opt` , please refer to [2.1.1](#2.1.1).
<a name="2.1.3"></a> <a name="2.1.3"></a>
### 2.1.3 Demo of get the optimized model ### 2.1.3 Demo of get the optimized model
...@@ -202,7 +202,7 @@ cp ../../../cxx/lib/libpaddle_light_api_shared.so ./debug/ ...@@ -202,7 +202,7 @@ cp ../../../cxx/lib/libpaddle_light_api_shared.so ./debug/
The `prepare.sh` take `PaddleClas/deploy/lite/imgs/tabby_cat.jpg` as the test image, and copy it to the `demo/cxx/clas/debug/` directory. The `prepare.sh` take `PaddleClas/deploy/lite/imgs/tabby_cat.jpg` as the test image, and copy it to the `demo/cxx/clas/debug/` directory.
You should put the model that optimized by `paddle_lite_opt` under the `demo/cxx/clas/debug/` directory. In this example, use `MobileNetV3_large_x1_0.nb` model file generated in [2.1.3](#4.3). You should put the model that optimized by `paddle_lite_opt` under the `demo/cxx/clas/debug/` directory. In this example, use `MobileNetV3_large_x1_0.nb` model file generated in [2.1.3](#2.1.3).
The structure of the clas demo is as follows after the above command is completed: The structure of the clas demo is as follows after the above command is completed:
......
...@@ -139,4 +139,4 @@ python python/predict_rec.py -c configs/inference_rec.yaml -o Global.use_gpu=Fa ...@@ -139,4 +139,4 @@ python python/predict_rec.py -c configs/inference_rec.yaml -o Global.use_gpu=Fa
<a name="4"></a> <a name="4"></a>
## 4. Concatenation of mainbody detection, feature extraction and vector search ## 4. Concatenation of mainbody detection, feature extraction and vector search
Please refer to [Quick Start of Recognition](./tutorials/quick_start_recognition_en.md) Please refer to [Quick Start of Recognition](../quick_start/quick_start_recognition_en.md)
...@@ -37,17 +37,17 @@ pip3 install dist/* ...@@ -37,17 +37,17 @@ pip3 install dist/*
<a name="2"></a> <a name="2"></a>
## 2. Quick Start ## 2. Quick Start
* Using the `ResNet50` model provided by PaddleClas, the following image(`'docs/images/whl/demo.jpg'`) as an example. * Using the `ResNet50` model provided by PaddleClas, the following image(`'docs/images/inference_deployment/whl_demo.jpg'`) as an example.
<div align="center"> <div align="center">
<img src="../images/whl/demo.jpg" width = "400" /> <img src="../images/inference_deployment/whl_demo.jpg" width = "400" />
</div> </div>
* Python * Python
```python ```python
from paddleclas import PaddleClas from paddleclas import PaddleClas
clas = PaddleClas(model_name='ResNet50') clas = PaddleClas(model_name='ResNet50')
infer_imgs='docs/images/whl/demo.jpg' infer_imgs='docs/images/inference_deployment/whl_demo.jpg'
result=clas.predict(infer_imgs) result=clas.predict(infer_imgs)
print(next(result)) print(next(result))
``` ```
...@@ -61,12 +61,12 @@ print(next(result)) ...@@ -61,12 +61,12 @@ print(next(result))
* CLI * CLI
```bash ```bash
paddleclas --model_name=ResNet50 --infer_imgs="docs/images/whl/demo.jpg" paddleclas --model_name=ResNet50 --infer_imgs="docs/images/inference_deployment/whl_demo.jpg"
``` ```
``` ```
>>> result >>> result
filename: docs/images/whl/demo.jpg, top-5, class_ids: [8, 7, 136, 80, 84], scores: [0.79368, 0.16329, 0.01853, 0.00959, 0.00239], label_names: ['hen', 'cock', 'European gallinule, Porphyrio porphyrio', 'black grouse', 'peacock'] filename: docs/images/inference_deployment/whl_demo.jpg, top-5, class_ids: [8, 7, 136, 80, 84], scores: [0.79368, 0.16329, 0.01853, 0.00959, 0.00239], label_names: ['hen', 'cock', 'European gallinule, Porphyrio porphyrio', 'black grouse', 'peacock']
Predict complete! Predict complete!
``` ```
...@@ -95,7 +95,7 @@ The following parameters can be specified in Command Line or used as parameters ...@@ -95,7 +95,7 @@ The following parameters can be specified in Command Line or used as parameters
* CLI: * CLI:
```bash ```bash
from paddleclas import PaddleClas, get_default_confg from paddleclas import PaddleClas, get_default_confg
paddleclas --model_name=ViT_base_patch16_384 --infer_imgs='docs/images/whl/demo.jpg' --resize_short=384 --crop_size=384 paddleclas --model_name=ViT_base_patch16_384 --infer_imgs='docs/images/inference_deployment/whl_demo.jpg' --resize_short=384 --crop_size=384
``` ```
* Python: * Python:
...@@ -127,14 +127,14 @@ You can use the inference model provided by PaddleClas to predict, and only need ...@@ -127,14 +127,14 @@ You can use the inference model provided by PaddleClas to predict, and only need
```python ```python
from paddleclas import PaddleClas from paddleclas import PaddleClas
clas = PaddleClas(model_name='ResNet50') clas = PaddleClas(model_name='ResNet50')
infer_imgs = 'docs/images/whl/demo.jpg' infer_imgs = 'docs/images/inference_deployment/whl_demo.jpg'
result=clas.predict(infer_imgs) result=clas.predict(infer_imgs)
print(next(result)) print(next(result))
``` ```
* CLI * CLI
```bash ```bash
paddleclas --model_name='ResNet50' --infer_imgs='docs/images/whl/demo.jpg' paddleclas --model_name='ResNet50' --infer_imgs='docs/images/inference_deployment/whl_demo.jpg'
``` ```
<a name="4.3"></a> <a name="4.3"></a>
...@@ -145,14 +145,14 @@ You can use the local model files trained by yourself to predict, and only need ...@@ -145,14 +145,14 @@ You can use the local model files trained by yourself to predict, and only need
```python ```python
from paddleclas import PaddleClas from paddleclas import PaddleClas
clas = PaddleClas(inference_model_dir='./inference/') clas = PaddleClas(inference_model_dir='./inference/')
infer_imgs = 'docs/images/whl/demo.jpg' infer_imgs = 'docs/images/inference_deployment/whl_demo.jpg'
result=clas.predict(infer_imgs) result=clas.predict(infer_imgs)
print(next(result)) print(next(result))
``` ```
* CLI * CLI
```bash ```bash
paddleclas --inference_model_dir='./inference/' --infer_imgs='docs/images/whl/demo.jpg' paddleclas --inference_model_dir='./inference/' --infer_imgs='docs/images/inference_deployment/whl_demo.jpg'
``` ```
<a name="4.4"></a> <a name="4.4"></a>
...@@ -182,26 +182,26 @@ You can predict the Internet image, only need to specify URL of Internet image b ...@@ -182,26 +182,26 @@ You can predict the Internet image, only need to specify URL of Internet image b
```python ```python
from paddleclas import PaddleClas from paddleclas import PaddleClas
clas = PaddleClas(model_name='ResNet50') clas = PaddleClas(model_name='ResNet50')
infer_imgs = 'https://raw.githubusercontent.com/paddlepaddle/paddleclas/release/2.2/docs/images/whl/demo.jpg' infer_imgs = 'https://raw.githubusercontent.com/paddlepaddle/paddleclas/release/2.2/docs/images/inference_deployment/whl_demo.jpg'
result=clas.predict(infer_imgs) result=clas.predict(infer_imgs)
print(next(result)) print(next(result))
``` ```
* CLI * CLI
```bash ```bash
paddleclas --model_name='ResNet50' --infer_imgs='https://raw.githubusercontent.com/paddlepaddle/paddleclas/release/2.2/docs/images/whl/demo.jpg' paddleclas --model_name='ResNet50' --infer_imgs='https://raw.githubusercontent.com/paddlepaddle/paddleclas/release/2.2/docs/images/inference_deployment/whl_demo.jpg'
``` ```
<a name="4.6"></a> <a name="4.6"></a>
### 4.6 Prediction of NumPy.array format image ### 4.6 Prediction of NumPy.array format image
In Python code, you can predict the NumPy.array format image, only need to use the `infer_imgs` to transfer variable of image data. Note that the image data must be 3 channels. In Python code, you can predict the `NumPy.array` format image, only need to use the `infer_imgs` to transfer variable of image data. Note that the models in PaddleClas only support to predict 3 channels image data, and channels order is `RGB`.
* python * python
```python ```python
import cv2 import cv2
from paddleclas import PaddleClas from paddleclas import PaddleClas
clas = PaddleClas(model_name='ResNet50') clas = PaddleClas(model_name='ResNet50')
infer_imgs = cv2.imread("docs/images/whl/demo.jpg") infer_imgs = cv2.imread("docs/en/inference_deployment/whl_deploy_en.md")[:, :, ::-1]
result=clas.predict(infer_imgs) result=clas.predict(infer_imgs)
print(next(result)) print(next(result))
``` ```
...@@ -214,14 +214,14 @@ You can save the prediction result(s) as pre-label, only need to use `pre_label_ ...@@ -214,14 +214,14 @@ You can save the prediction result(s) as pre-label, only need to use `pre_label_
```python ```python
from paddleclas import PaddleClas from paddleclas import PaddleClas
clas = PaddleClas(model_name='ResNet50', save_dir='./output_pre_label/') clas = PaddleClas(model_name='ResNet50', save_dir='./output_pre_label/')
infer_imgs = 'docs/images/whl/' # it can be infer_imgs folder path which contains all of images you want to predict. infer_imgs = 'docs/images/inference_deployment/whl_' # it can be infer_imgs folder path which contains all of images you want to predict.
result=clas.predict(infer_imgs) result=clas.predict(infer_imgs)
print(next(result)) print(next(result))
``` ```
* CLI * CLI
```bash ```bash
paddleclas --model_name='ResNet50' --infer_imgs='docs/images/whl/' --save_dir='./output_pre_label/' paddleclas --model_name='ResNet50' --infer_imgs='docs/images/inference_deployment/whl_' --save_dir='./output_pre_label/'
``` ```
<a name="4.8"></a> <a name="4.8"></a>
...@@ -247,12 +247,12 @@ For example: ...@@ -247,12 +247,12 @@ For example:
```python ```python
from paddleclas import PaddleClas from paddleclas import PaddleClas
clas = PaddleClas(model_name='ResNet50', class_id_map_file='./ppcls/utils/imagenet1k_label_list.txt') clas = PaddleClas(model_name='ResNet50', class_id_map_file='./ppcls/utils/imagenet1k_label_list.txt')
infer_imgs = 'docs/images/whl/demo.jpg' infer_imgs = 'docs/images/inference_deployment/whl_demo.jpg'
result=clas.predict(infer_imgs) result=clas.predict(infer_imgs)
print(next(result)) print(next(result))
``` ```
* CLI * CLI
```bash ```bash
paddleclas --model_name='ResNet50' --infer_imgs='docs/images/whl/demo.jpg' --class_id_map_file='./ppcls/utils/imagenet1k_label_list.txt' paddleclas --model_name='ResNet50' --infer_imgs='docs/images/inference_deployment/whl_demo.jpg' --class_id_map_file='./ppcls/utils/imagenet1k_label_list.txt'
``` ```
...@@ -11,6 +11,8 @@ ...@@ -11,6 +11,8 @@
At present, **PaddleClas** requires **PaddlePaddle** version **>=2.0**. Docker is recomended to run Paddleclas, for more detailed information about docker and nvidia-docker, you can refer to the [tutorial](https://docs.docker.com/get-started/). If you do not want to use docker, you can skip section [2. (Recommended) Prepare a docker environment](#2), and go into section [3. Install PaddlePaddle using pip](#3). At present, **PaddleClas** requires **PaddlePaddle** version **>=2.0**. Docker is recomended to run Paddleclas, for more detailed information about docker and nvidia-docker, you can refer to the [tutorial](https://docs.docker.com/get-started/). If you do not want to use docker, you can skip section [2. (Recommended) Prepare a docker environment](#2), and go into section [3. Install PaddlePaddle using pip](#3).
<a name="1"></a>
## 1. Environment requirements ## 1. Environment requirements
- python 3.x - python 3.x
......
...@@ -56,7 +56,7 @@ The SE module is a channel attention mechanism proposed by SENet, which can effe ...@@ -56,7 +56,7 @@ The SE module is a channel attention mechanism proposed by SENet, which can effe
The option in the third row of the table was chosen for the location of the SE module in PP-LCNet. The option in the third row of the table was chosen for the location of the SE module in PP-LCNet.
<a name="3.3 "></a> <a name="3.3"></a>
### 3.3 Larger Convolution Kernels ### 3.3 Larger Convolution Kernels
In the paper of MixNet, the author analyzes the effect of convolutional kernel size on model performance and concludes that larger convolutional kernels within a certain range can improve the performance of the model, but beyond this range will be detrimental to the model’s performance. So the author forms MixConv with split-concat paradigm combined, which can improve the performance of the model but is not conducive to inference. We experimentally summarize the role of some larger convolutional kernels at different positions that are similar to those of the SE module, and find that larger convolutional kernels display more prominent roles in the middle and tail of the network. The following table shows the effect of the position of the 5x5 convolutional kernels on the accuracy: In the paper of MixNet, the author analyzes the effect of convolutional kernel size on model performance and concludes that larger convolutional kernels within a certain range can improve the performance of the model, but beyond this range will be detrimental to the model’s performance. So the author forms MixConv with split-concat paradigm combined, which can improve the performance of the model but is not conducive to inference. We experimentally summarize the role of some larger convolutional kernels at different positions that are similar to those of the SE module, and find that larger convolutional kernels display more prominent roles in the middle and tail of the network. The following table shows the effect of the position of the 5x5 convolutional kernels on the accuracy:
......
...@@ -138,6 +138,8 @@ cd ../../ ...@@ -138,6 +138,8 @@ cd ../../
For training and evaluation on a single GPU, the `tools/train.py` and `tools/eval.py` scripts are recommended. For training and evaluation on a single GPU, the `tools/train.py` and `tools/eval.py` scripts are recommended.
<a name="2.2.1"></a>
#### 2.2.1 Model Training #### 2.2.1 Model Training
Once you have prepared the configuration file, you can start training the image retrieval task in the following way. the method used by PaddleClas to train the image retrieval is metric learning, referring to [metric learning](#metric learning) for more explanations. Once you have prepared the configuration file, you can start training the image retrieval task in the following way. the method used by PaddleClas to train the image retrieval is metric learning, referring to [metric learning](#metric learning) for more explanations.
...@@ -188,6 +190,8 @@ Loss: ...@@ -188,6 +190,8 @@ Loss:
The final total Loss is a weighted sum of all Losses, where weight defines the weight of a particular Loss in the final total. If you want to replace other Losses, you can also change the Loss field in the configuration file, for the currently supported Losses please refer to [Loss](../../../ppcls/loss). The final total Loss is a weighted sum of all Losses, where weight defines the weight of a particular Loss in the final total. If you want to replace other Losses, you can also change the Loss field in the configuration file, for the currently supported Losses please refer to [Loss](../../../ppcls/loss).
<a name="2.2.2"></a>
#### 2.2.2 Resume Training #### 2.2.2 Resume Training
If the training task is terminated for some reasons, it can be recovered by loading the checkpoints weights file and continue training: If the training task is terminated for some reasons, it can be recovered by loading the checkpoints weights file and continue training:
...@@ -226,6 +230,8 @@ There is no need to modify the configuration file, just set the `Global.checkpoi ...@@ -226,6 +230,8 @@ There is no need to modify the configuration file, just set the `Global.checkpoi
. .
``` ```
<a name="2.2.3"></a>
#### 2.2.3 Model Evaluation #### 2.2.3 Model Evaluation
Model evaluation can be carried out with the following commands. Model evaluation can be carried out with the following commands.
......
...@@ -16,7 +16,7 @@ ...@@ -16,7 +16,7 @@
[Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite) is a set of lightweight inference engine which is fully functional, easy to use and then performs well. Lightweighting is reflected in the use of fewer bits to represent the weight and activation of the neural network, which can greatly reduce the size of the model, solve the problem of limited storage space of the mobile device, and the inference speed is better than other frameworks on the whole. [Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite) is a set of lightweight inference engine which is fully functional, easy to use and then performs well. Lightweighting is reflected in the use of fewer bits to represent the weight and activation of the neural network, which can greatly reduce the size of the model, solve the problem of limited storage space of the mobile device, and the inference speed is better than other frameworks on the whole.
In [PaddleClas](https://github.com/PaddlePaddle/PaddleClas), we uses Paddle-Lite to [evaluate the performance on the mobile device](../models/Mobile.md), in this section we uses the `MobileNetV1` model trained on the `ImageNet1k` dataset as an example to introduce how to use `Paddle-Lite` to evaluate the model speed on the mobile terminal (evaluated on SD855) In [PaddleClas](https://github.com/PaddlePaddle/PaddleClas), we uses Paddle-Lite to [evaluate the performance on the mobile device](../models/Mobile_en.md), in this section we uses the `MobileNetV1` model trained on the `ImageNet1k` dataset as an example to introduce how to use `Paddle-Lite` to evaluate the model speed on the mobile terminal (evaluated on SD855)
<a name='2'></a> <a name='2'></a>
## 2. Evaluation Steps ## 2. Evaluation Steps
......
...@@ -2,26 +2,26 @@ ...@@ -2,26 +2,26 @@
This tutorial contains 3 parts: Environment Preparation, Image Recognition Experience, and Unknown Category Image Recognition Experience. This tutorial contains 3 parts: Environment Preparation, Image Recognition Experience, and Unknown Category Image Recognition Experience.
If the image category already exists in the image index database, then you can take a reference to chapter [Image Recognition Experience](#image_recognition_experience),to complete the progress of image recognition;If you wish to recognize unknow category image, which is not included in the index database,you can take a reference to chapter [Unknown Category Image Recognition Experience](#unkonw_category_image_recognition_experience),to complete the process of creating an index to recognize it。 If the image category already exists in the image index database, then you can take a reference to chapter [Image Recognition Experience](#2),to complete the progress of image recognition;If you wish to recognize unknow category image, which is not included in the index database,you can take a reference to chapter [Unknown Category Image Recognition Experience](#3),to complete the process of creating an index to recognize it。
## Catalogue ## Catalogue
* [1. Enviroment Preparation](#enviroment_preperation ) * [1. Enviroment Preparation](#1)
* [2. Image Recognition Experience](#image_recognition_experience) * [2. Image Recognition Experience](#2)
* [2.1 Download and Unzip the Inference Model and Demo Data](#download_and_unzip_the_inference_model_and_demo_data) * [2.1 Download and Unzip the Inference Model and Demo Data](#2.1)
* [2.2 Product Recognition and Retrieval](#Product_recognition_and_retrival) * [2.2 Product Recognition and Retrieval](#2.2)
* [2.2.1 Single Image Recognition](#recognition_of_single_image) * [2.2.1 Single Image Recognition](#2.2.1)
* [2.2.2 Folder-based Batch Recognition](#folder_based_batch_recognition) * [2.2.2 Folder-based Batch Recognition](#2.2.2)
* [3. Unknown Category Image Recognition Experience](#unkonw_category_image_recognition_experience) * [3. Unknown Category Image Recognition Experience](#3)
* [3.1 Prepare for the new images and labels](#3.1) * [3.1 Prepare for the new images and labels](#3.1)
* [3.2 Build a new Index Library](#build_a_new_index_library) * [3.2 Build a new Index Library](#3.2)
* [3.3 Recognize the Unknown Category Images](#Image_differentiation_based_on_the_new_index_library) * [3.3 Recognize the Unknown Category Images](#3.3)
<a name="enviroment_preparation"></a> <a name="1"></a>
## 1. Enviroment Preparation ## 1. Enviroment Preparation
* Installation:Please take a reference to [Quick Installation ](./install_en.md)to configure the PaddleClas environment. * Installation:Please take a reference to [Quick Installation ](../installation/)to configure the PaddleClas environment.
* Using the following command to enter Folder `deploy`. All content and commands in this section need to be run in folder `deploy`. * Using the following command to enter Folder `deploy`. All content and commands in this section need to be run in folder `deploy`.
...@@ -29,7 +29,7 @@ If the image category already exists in the image index database, then you can t ...@@ -29,7 +29,7 @@ If the image category already exists in the image index database, then you can t
cd deploy cd deploy
``` ```
<a name="image_recognition_experience"></a> <a name="2"></a>
## 2. Image Recognition Experience ## 2. Image Recognition Experience
The detection model with the recognition inference model for the 4 directions (Logo, Cartoon Face, Vehicle, Product), the address for downloading the test data and the address of the corresponding configuration file are as follows. The detection model with the recognition inference model for the 4 directions (Logo, Cartoon Face, Vehicle, Product), the address for downloading the test data and the address of the corresponding configuration file are as follows.
...@@ -81,7 +81,7 @@ wget {Data download link} && tar -xf {Name of the tar archive} ...@@ -81,7 +81,7 @@ wget {Data download link} && tar -xf {Name of the tar archive}
``` ```
<a name="download_and_unzip_the_inference_model_and_demo_data"></a> <a name="2.1"></a>
### 2.1 Download and Unzip the Inference Model and Demo Data ### 2.1 Download and Unzip the Inference Model and Demo Data
Take the product recognition as an example, download the detection model, recognition model and product recognition demo data with the following commands. Take the product recognition as an example, download the detection model, recognition model and product recognition demo data with the following commands.
...@@ -136,7 +136,7 @@ If you want to use the lightweight generic recognition model, you need to re-ext ...@@ -136,7 +136,7 @@ If you want to use the lightweight generic recognition model, you need to re-ext
python3.7 python/build_gallery.py -c configs/build_product.yaml -o Global.rec_inference_model_dir=./models/general_PPLCNet_x2_5_lite_v1.0_infer python3.7 python/build_gallery.py -c configs/build_product.yaml -o Global.rec_inference_model_dir=./models/general_PPLCNet_x2_5_lite_v1.0_infer
``` ```
<a name="Product_recognition_and_retrival"></a> <a name="2.2"></a>
### 2.2 Product Recognition and Retrieval ### 2.2 Product Recognition and Retrieval
Take the product recognition demo as an example to show the recognition and retrieval process (if you wish to try other scenarios of recognition and retrieval, replace the corresponding configuration file after downloading and unzipping the corresponding demo data and model to complete the prediction). Take the product recognition demo as an example to show the recognition and retrieval process (if you wish to try other scenarios of recognition and retrieval, replace the corresponding configuration file after downloading and unzipping the corresponding demo data and model to complete the prediction).
...@@ -149,7 +149,7 @@ pip install faiss-cpu==1.7.1post2 ...@@ -149,7 +149,7 @@ pip install faiss-cpu==1.7.1post2
If error happens when using `import faiss`, please uninstall `faiss` and reinstall it, especially on `Windows`. If error happens when using `import faiss`, please uninstall `faiss` and reinstall it, especially on `Windows`.
<a name="recognition_of_single_image"></a> <a name="2.2.1"></a>
#### 2.2.1 Single Image Recognition #### 2.2.1 Single Image Recognition
...@@ -187,7 +187,7 @@ The detection result is also saved in the folder `output`, for this image, the v ...@@ -187,7 +187,7 @@ The detection result is also saved in the folder `output`, for this image, the v
</div> </div>
<a name="folder_based_batch_recognition"></a> <a name="2.2.2"></a>
#### 2.2.2 Folder-based Batch Recognition #### 2.2.2 Folder-based Batch Recognition
If you want to predict the images in the folder, you can directly modify the `Global.infer_imgs` field in the configuration file, or you can also modify the corresponding configuration through the following `-o` parameter. If you want to predict the images in the folder, you can directly modify the `Global.infer_imgs` field in the configuration file, or you can also modify the corresponding configuration through the following `-o` parameter.
...@@ -217,7 +217,7 @@ All the visualization results are also saved in folder `output`. ...@@ -217,7 +217,7 @@ All the visualization results are also saved in folder `output`.
Furthermore, the recognition inference model path can be changed by modifying the `Global.rec_inference_model_dir` field, and the path of the index to the index databass can be changed by modifying the `IndexProcess.index_dir` field. Furthermore, the recognition inference model path can be changed by modifying the `Global.rec_inference_model_dir` field, and the path of the index to the index databass can be changed by modifying the `IndexProcess.index_dir` field.
<a name="unkonw_category_image_recognition_experience"></a> <a name="3"></a>
## 3. Recognize Images of Unknown Category ## 3. Recognize Images of Unknown Category
To recognize the image `./recognition_demo_data_v1.1/test_product/anmuxi.jpg`, run the command as follows: To recognize the image `./recognition_demo_data_v1.1/test_product/anmuxi.jpg`, run the command as follows:
...@@ -268,7 +268,7 @@ gallery/anmuxi/006.jpg Anmuxi Ambrosial Yogurt ...@@ -268,7 +268,7 @@ gallery/anmuxi/006.jpg Anmuxi Ambrosial Yogurt
Each line can be splited into two fields. The first field denotes the relative image path, and the second field denotes its label. The `delimiter` is `tab` here. Each line can be splited into two fields. The first field denotes the relative image path, and the second field denotes its label. The `delimiter` is `tab` here.
<a name="build_a_new_index_library"></a> <a name="3.2"></a>
### 3.2 Build a new Index Base Library ### 3.2 Build a new Index Base Library
Use the following command to build the index to accelerate the retrieval process after recognition. Use the following command to build the index to accelerate the retrieval process after recognition.
...@@ -280,8 +280,8 @@ python3.7 python/build_gallery.py -c configs/build_product.yaml -o IndexProcess. ...@@ -280,8 +280,8 @@ python3.7 python/build_gallery.py -c configs/build_product.yaml -o IndexProcess.
Finally, the new index information is stored in the folder`./recognition_demo_data_v1.1/gallery_product/index_update`. Use the new index database for the above index. Finally, the new index information is stored in the folder`./recognition_demo_data_v1.1/gallery_product/index_update`. Use the new index database for the above index.
<a name="Image_differentiation_based_on_the_new_index_library"></a> <a name="3.3"></a>
### 3.2 Recognize the Unknown Category Images ### 3.3 Recognize the Unknown Category Images
To recognize the image `./recognition_demo_data_v1.1/test_product/anmuxi.jpg`, run the command as follows. To recognize the image `./recognition_demo_data_v1.1/test_product/anmuxi.jpg`, run the command as follows.
......
# Data
---
## Introduction
This document introduces how to prepare the ImageNet1k and flowers102 datasets.
## Dataset
| Dataset | train dataset size | valid dataset size | category |
|:------:|:---------------:|:---------------------:|:--------:|
| [flowers102](https://www.robots.ox.ac.uk/~vgg/data/flowers/102/) | 1k | 6k | 102 |
| [ImageNet1k](http://www.image-net.org/challenges/LSVRC/2012/) | 1.2M | 50k | 1000 |
* Data format
Please organize the data as described below. Two list files are required: `train_list.txt` and `val_list.txt`.
```shell
# delimiter: "space"
# the following is the content of train_list.txt
train/n01440764/n01440764_10026.JPEG 0
...
# the following is the content of val_list.txt
val/ILSVRC2012_val_00000001.JPEG 65
...
```
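As a rough illustration of the list format above, the following Python sketch splits one line into an image path and an integer label. The helper name `parse_list_line` is hypothetical and is not part of PaddleClas; it only assumes that the label is the last space-separated field, as in the examples above.

```python
def parse_list_line(line, delimiter=" "):
    """Split one line of train_list.txt / val_list.txt into (path, label).

    Assumes the label is the last delimiter-separated field.
    Illustrative helper only, not a PaddleClas API.
    """
    path, label = line.strip().rsplit(delimiter, 1)
    return path, int(label)


sample = "train/n01440764/n01440764_10026.JPEG 0"
print(parse_list_line(sample))
```

Splitting from the right keeps the path intact even if it happens to contain the delimiter.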
### ImageNet1k
After downloading the data, please organize the data directory as below:
```bash
PaddleClas/dataset/ILSVRC2012/
|_ train/
| |_ n01440764
| | |_ n01440764_10026.JPEG
| | |_ ...
| |_ ...
| |
| |_ n15075141
| |_ ...
| |_ n15075141_9993.JPEG
|_ val/
| |_ ILSVRC2012_val_00000001.JPEG
| |_ ...
| |_ ILSVRC2012_val_00050000.JPEG
|_ train_list.txt
|_ val_list.txt
```
### Flowers102 Dataset
Download [Data](https://www.robots.ox.ac.uk/~vgg/data/flowers/102/) then decompress:
```shell
jpg/
setid.mat
imagelabels.mat
```
Please put all the files under `PaddleClas/dataset/flowers102`.
Then use `generate_flowers102_list.py` to generate `train_list.txt` and `val_list.txt`:
```bash
python generate_flowers102_list.py jpg train > train_list.txt
python generate_flowers102_list.py jpg valid > val_list.txt
```
Please organize the data directory as below:
```bash
PaddleClas/dataset/flowers102/
|_ jpg/
| |_ image_03601.jpg
| |_ ...
| |_ image_02355.jpg
|_ train_list.txt
|_ val_list.txt
```
# Getting Started
---
Please refer to [Installation](install_en.md) to set up the environment first, and prepare the flowers102 dataset by following the instructions in [Quick Start](quick_start_en.md).
## 1. Training and Evaluation on CPU or Single GPU
If training and evaluation are performed on CPU or a single GPU, it is recommended to use `tools/train.py` and `tools/eval.py`.
For training and evaluation in a multi-GPU environment on Linux, please refer to [2. Training and evaluation on Linux+GPU](#2-training-and-evaluation-on-linuxgpu).
<a name="1.1"></a>
### 1.1 Model training
After preparing the configuration file, the training process can be started in the following way.
```
python tools/train.py \
-c ./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml \
-o Arch.pretrained=False \
-o Global.device=gpu
```
Here, `-c` specifies the path of the configuration file, and `-o` specifies parameters to be modified or added. `-o Arch.pretrained=False` means that no pretrained model is used.
`-o Global.device=gpu` means that the GPU is used for training. To train on the CPU, set `Global.device` to `cpu`.
You can also modify the configuration file directly to update the configuration. For the specific configuration parameters, please refer to the [Configuration Document](config_description_en.md).
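To make the effect of `-o` more concrete, here is a minimal Python sketch of how a dotted override such as `Arch.pretrained=False` could be applied to a nested configuration dictionary. This is an assumption-laden illustration: the function `apply_override` is hypothetical and does not reproduce the actual PaddleClas implementation.

```python
import ast


def apply_override(config, option):
    """Apply one 'Dotted.Key=value' override to a nested dict in place.

    Illustrative only: mimics the effect of `-o`, not PaddleClas code.
    """
    key, raw_value = option.split("=", 1)
    try:
        # Turn text such as 'False', '5' or '0.1' into Python values.
        value = ast.literal_eval(raw_value)
    except (ValueError, SyntaxError):
        # Anything else (e.g. 'gpu' or a file path) is kept as a string.
        value = raw_value
    *parents, leaf = key.split(".")
    node = config
    for name in parents:
        # Create intermediate sections when the key is being added.
        node = node.setdefault(name, {})
    node[leaf] = value
    return config


config = {"Arch": {"name": "MobileNetV3_large_x1_0", "pretrained": True},
          "Global": {"device": "cpu"}}
apply_override(config, "Arch.pretrained=False")
apply_override(config, "Global.device=gpu")
print(config)
```

Each part of the dotted key selects one nesting level, which is why the same syntax can both modify an existing parameter and add a new one.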
* The output log examples are as follows:
* If mixup or cutmix is used in training, top-1 and top-k (k is 5 by default) will not be printed for training steps:
```
...
epoch:0 , train step:20 , loss: 4.53660, lr: 0.003750, batch_cost: 1.23101 s, reader_cost: 0.74311 s, ips: 25.99489 images/sec, eta: 0:12:43
...
END epoch:1 valid top1: 0.01569, top5: 0.06863, loss: 4.61747, batch_cost: 0.26155 s, reader_cost: 0.16952 s, batch_cost_sum: 10.72348 s, ips: 76.46772 images/sec.
...
```
* If mixup or cutmix is not used during training, top-1 and top-k (k is 5 by default) will also be printed in the log, in addition to the above information:
```
...
epoch:0 , train step:30 , top1: 0.06250, top5: 0.09375, loss: 4.62766, lr: 0.003728, batch_cost: 0.64089 s, reader_cost: 0.18857 s, ips: 49.93080 images/sec, eta: 0:06:18
...
END epoch:0 train top1: 0.01310, top5: 0.04738, loss: 4.65124, batch_cost: 0.64089 s, reader_cost: 0.18857 s, batch_cost_sum: 13.45863 s, ips: 49.93080 images/sec.
...
```
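If you want to post-process such logs, the metrics can be pulled out with a small regular expression. The sketch below is only an illustration based on the log lines shown above; the helper `parse_metrics` is hypothetical, and the pattern assumes the `name: value` layout with decimal values seen in these examples.

```python
import re

# Matches 'name: 1.234' pairs; integer fields such as 'epoch:0' and the
# 'eta: 0:06:18' field are intentionally skipped.
METRIC_PATTERN = re.compile(r"(\w+): (\d+\.\d+)")


def parse_metrics(log_line):
    """Return a dict of the decimal-valued metrics found in one log line."""
    return {name: float(value)
            for name, value in METRIC_PATTERN.findall(log_line)}


line = ("epoch:0 , train step:30 , top1: 0.06250, top5: 0.09375, "
        "loss: 4.62766, lr: 0.003728, batch_cost: 0.64089 s, "
        "reader_cost: 0.18857 s, ips: 49.93080 images/sec, eta: 0:06:18")
print(parse_metrics(line))
```

Checking whether `top1` is among the returned keys is a quick way to tell the two log variants apart.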
During training, you can view loss changes in real time through `VisualDL`; see [VisualDL](../extension/VisualDL_en.md) for details.
### 1.2 Model finetuning
After preparing the configuration file, you can finetune the model by loading the pretrained weights. The command is shown below.
```
python tools/train.py \
-c ./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml \
-o Arch.pretrained=True \
-o Global.device=gpu
```
Here, `-o Arch.pretrained` can be set to the path of the pretrained weights to load; replace it with the path to your own pretrained weights, or modify the path directly in the configuration file. You can also set it to `True` to use weights pretrained on ImageNet1k.
We also provide many pretrained models trained on the ImageNet-1k dataset. For the model list and download addresses, please refer to the [model library overview](../models/models_intro_en.md).
### 1.3 Resume Training
If the training process is terminated for some reason, you can load the checkpoints to continue training.
```
python tools/train.py \
-c ./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml \
-o Global.checkpoints="./output/MobileNetV3_large_x1_0/epoch_5" \
-o Global.device=gpu
```
The configuration file does not need to be modified. You only need to add the `Global.checkpoints` parameter when training, which gives the path of the checkpoints. The parameter weights, learning rate, optimizer and other information will be loaded from this path.
**Note**:
* The `-o Global.checkpoints` parameter does not need to include the suffix of the checkpoint files. The above training command generates the checkpoints shown below during training. To continue training from epoch `5`, just set `Global.checkpoints` to `./output/MobileNetV3_large_x1_0/epoch_5`; PaddleClas will automatically fill in the `pdopt` and `pdparams` suffixes.
```shell
output
├── MobileNetV3_large_x1_0
│ ├── best_model.pdopt
│ ├── best_model.pdparams
│ ├── best_model.pdstates
│ ├── epoch_1.pdopt
│ ├── epoch_1.pdparams
│ ├── epoch_1.pdstates
.
.
.
```
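The prefix convention in the note above can be pictured with a short Python sketch. It is only an illustration: `checkpoint_files` is a hypothetical helper, and the suffix list is taken from the note (`pdparams` and `pdopt`), not from the PaddleClas source.

```python
def checkpoint_files(prefix, suffixes=(".pdparams", ".pdopt")):
    """Expand a checkpoint prefix into the file names it stands for.

    The default suffixes follow the note above; this helper is a sketch
    and is not how PaddleClas resolves checkpoints internally.
    """
    return [prefix + suffix for suffix in suffixes]


print(checkpoint_files("./output/MobileNetV3_large_x1_0/epoch_5"))
```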
### 1.4 Model evaluation
The model evaluation process can be started as follows.
```bash
python tools/eval.py \
-c ./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml \
-o Global.pretrained_model=./output/MobileNetV3_large_x1_0/best_model
```
The above command uses `./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml` as the configuration file to evaluate the model `./output/MobileNetV3_large_x1_0/best_model`. You can also configure the evaluation by changing the parameters in the configuration file, or by updating the configuration with the `-o` parameter, as shown above.
Some of the configurable evaluation parameters are described as follows:
* `Arch.name`: Model name
* `Global.pretrained_model`: The path of the model file to be evaluated
**Note:** If the model is a dygraph model, you only need to specify the prefix of the model file when loading it, without the suffix, as in [1.3 Resume Training](#13-resume-training).
<a name="2"></a>
## 2. Training and evaluation on Linux+GPU
If you want to run PaddleClas on Linux with GPU, it is highly recommended to use `paddle.distributed.launch` to start the model training script (`tools/train.py`) and the evaluation script (`tools/eval.py`), which makes it more convenient to start in a multi-GPU environment.
### 2.1 Model training
After preparing the configuration file, the training process can be started in the following way. `paddle.distributed.launch` specifies which GPU cards to use through the `gpus` option:
```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
--gpus="0,1,2,3" \
tools/train.py \
-c ./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml
```
The format of the output log is the same as above; see [1.1 Model training](#11-model-training) for details.
### 2.2 Model finetuning
After preparing the configuration file, you can finetune the model by loading the pretrained weights. The command is shown below.
```
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
--gpus="0,1,2,3" \
tools/train.py \
-c ./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml \
-o Arch.pretrained=True
```
Here, `Arch.pretrained` can be set to `True` or `False`. It can also be set to the path of the pretrained weights to load; in that case, replace it with the path to your own pretrained weights, or modify the path directly in the configuration file.
There are many examples of model finetuning in [Quick Start](./quick_start_en.md). You can refer to that tutorial to finetune the model on a specific dataset.
<a name="model_resume"></a>
### 2.3 Resume Training
If the training process is terminated for some reason, you can load the checkpoints to continue training.
```
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
--gpus="0,1,2,3" \
tools/train.py \
-c ./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml \
-o Global.checkpoints="./output/MobileNetV3_large_x1_0/epoch_5" \
-o Global.device=gpu
```
The configuration file does not need to be modified. You only need to add the `Global.checkpoints` parameter when training, which gives the path of the checkpoints. The parameter weights, learning rate, optimizer and other information will be loaded from this path, as described in [1.3 Resume training](#13-resume-training).
### 2.4 Model evaluation
The model evaluation process can be started as follows.
```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m paddle.distributed.launch \
tools/eval.py \
-c ./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml \
-o Global.pretrained_model=./output/MobileNetV3_large_x1_0/best_model
```
For a description of the parameters, see [1.4 Model evaluation](#14-model-evaluation).
<a name="model_infer"></a>
## 3. Use the pre-trained model to predict
After training is completed, you can run prediction with the model weights obtained from training, as follows:
```bash
python3 tools/infer.py \
-c ./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml \
-o Infer.infer_imgs=dataset/flowers102/jpg/image_00001.jpg \
-o Global.pretrained_model=./output/MobileNetV3_large_x1_0/best_model
```
Among them:
+ `Infer.infer_imgs`: The path of the image file or folder to be predicted;
+ `Global.pretrained_model`: Weight file path, such as `./output/MobileNetV3_large_x1_0/best_model`;
## 4. Use the inference model to predict
PaddlePaddle supports inference using prediction engines, which are introduced next.
First, export the inference model using `tools/export_model.py`.
```bash
python3 tools/export_model.py \
-c ./ppcls/configs/quick_start/MobileNetV3_large_x1_0.yaml \
-o Global.pretrained_model=output/MobileNetV3_large_x1_0/best_model
```
Here, the `Global.pretrained_model` parameter specifies the model file path, which does not need to include the file suffix.
The above command generates the model structure file (`inference.pdmodel`) and the model weight file (`inference.pdiparams`), which the inference engine can then use for inference.
Go to the deploy directory:
```
cd deploy
```
Run inference with the inference engine. Because the mapping file of the ImageNet1k dataset is used by default, we should set `PostProcess.Topk.class_id_map_file` to `None`.
```bash
python3 python/predict_cls.py \
-c configs/inference_cls.yaml \
-o Global.infer_imgs=../dataset/flowers102/jpg/image_00001.jpg \
-o Global.inference_model_dir=../inference/ \
-o PostProcess.Topk.class_id_map_file=None
```
Among them:
+ `Global.infer_imgs`: The path of the image file to be predicted;
+ `Global.inference_model_dir`: The directory of the inference model files, such as `../inference/`, which contains `inference.pdmodel`;
+ `Global.use_tensorrt`: Whether to use TensorRT, `False` by default;
+ `Global.use_gpu`: Whether to use the GPU, `True` by default;
+ `Global.enable_mkldnn`: Whether to use `MKL-DNN`, `False` by default. It takes effect only when `Global.use_gpu` is `False`;
+ `Global.use_fp16`: Whether to enable FP16, `False` by default;
**Note**: If you want to use Transformer series models, such as `DeiT_***_384` or `ViT_***_384`, pay attention to the input size of the model; you need to set `resize_short=384` and `resize=384`.
If you want to evaluate the speed of the model, it is recommended to enable TensorRT for acceleration on GPU and MKL-DNN on CPU.
# Quick Start
---
First, please refer to the [Installation Guide](./install_en.md) to prepare your environment.
PaddleClas image retrieval supports the following training/evaluation environments:
```shell
└── CPU/Single GPU
   ├── Linux
   └── Windows
```
## Content
* [1. Data Preparation](#Data-Preparation)
* [2. Training and Evaluation on Single GPU](#Training-and-Evaluation-on-Single-GPU)
* [2.1 Model Training](#Model-Training)
* [2.2 Resume Training](#Resume-Training)
* [2.3 Model Evaluation](#Model-Evaluation)
* [3. Export Inference Model](#Export-Inference-Model)
<a name="Data-Preparation"></a>
## 1. Data Preparation
* Go to the PaddleClas directory.
```bash
## linux or mac, $path_to_PaddleClas indicates the root directory of PaddleClas, which the user needs to modify according to their real directory
cd $path_to_PaddleClas
```
* Go to the `dataset` directory. To quickly try out the image retrieval module of PaddleClas, we use the [CUB_200_2011](http://vision.ucsd.edu/sites/default/files/WelinderEtal10_CUB-200.pdf) dataset, a fine-grained dataset with 200 different species of birds. First, download the dataset; for download instructions, please refer to the [Official Website](http://www.vision.caltech.edu/visipedia/CUB-200-2011.html).
```shell
# linux or mac
cd dataset
# Copy the downloaded data into a directory.
cp {Data storage path}/CUB_200_2011.tgz .
# Unzip
tar -xzvf CUB_200_2011.tgz
#go to `CUB_200_2011`
cd CUB_200_2011
```
When using this dataset for image retrieval, we usually use the first 100 classes as the training set and the last 100 classes as the testing set, so the data needs to be processed to fit the training of the image retrieval model.
```shell
#Create train and test directories
mkdir train && mkdir test
#Divide data into training set with the first 100 classes and testing set with the last 100 classes.
ls images | awk -F "." '{if(int($1)<101)print "mv images/"$0" train/"int($1)}' | sh
ls images | awk -F "." '{if(int($1)>100)print "mv images/"$0" test/"int($1)}' | sh
#Generate train_list.txt test_list.txt
tree -r -i -f train | grep jpg | awk -F "/" '{print $0" "int($2) " "NR}' > train_list.txt
tree -r -i -f test | grep jpg | awk -F "/" '{print $0" "int($2) " "NR}' > test_list.txt
```
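The splitting rule used by the `awk` commands above can be restated in Python as follows. This is a sketch for readers who find the one-liners hard to read: `target_dir` is a hypothetical helper, and the folder name `001.Black_footed_Albatross` is only an assumed example of the `<number>.<name>` pattern that the `awk -F "."` split relies on.

```python
def target_dir(class_dir):
    """Return the destination ('train/N' or 'test/N') for a class folder.

    Mirrors the awk rule above: the integer before the first '.' decides
    the split (below 101 goes to train, the rest to test) and also names
    the target directory.
    """
    class_id = int(class_dir.split(".", 1)[0])
    split = "train" if class_id < 101 else "test"
    return f"{split}/{class_id}"


# Assumed example folder names following the '<number>.<name>' pattern.
for name in ["001.Black_footed_Albatross", "100.Some_Class", "101.Other_Class"]:
    print(name, "->", target_dir(name))
```

Note that `int()` drops the leading zeros, which is why the resulting directories are named `1`, `10`, `99` and so on.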
So far, we have the training set (in the `train` directory) and the testing set (in the `test` directory) of `CUB_200_2011`.
After data preparation, the `train` directory of `CUB_200_2011` should be:
```
├── 1
│   ├── Black_Footed_Albatross_0001_796111.jpg
│   ├── Black_Footed_Albatross_0002_55.jpg
...
├── 10
│   ├── Red_Winged_Blackbird_0001_3695.jpg
│   ├── Red_Winged_Blackbird_0005_5636.jpg
...
```
`train_list.txt` should be:
```
train/99/Ovenbird_0137_92639.jpg 99 1
train/99/Ovenbird_0136_92859.jpg 99 2
train/99/Ovenbird_0135_93168.jpg 99 3
train/99/Ovenbird_0131_92559.jpg 99 4
train/99/Ovenbird_0130_92452.jpg 99 5
...
```
Fields are separated by spaces. The three columns are the image path, the label, and the unique id of each training sample.
The format of the testing set is the same as that of the training set.
**Note**
* When the gallery dataset and the query dataset are the same, the first retrieved result must be removed (a retrieved image does not need to be evaluated against itself), so each sample needs a unique id for the subsequent evaluation of metrics such as mAP and recall@1. Please refer to [Introduction to image retrieval datasets](#Introduction to image retrieval datasets) for the analysis of gallery datasets and query datasets, and [Image retrieval evaluation metrics](#Image retrieval evaluation metrics) for the evaluation of mAP, recall@1, etc.
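The role of the unique id can be illustrated with a small recall@1 computation. The sketch below is a simplified assumption of how self-matches might be excluded; `recall_at_1` and its input layout are hypothetical and do not reflect the PaddleClas evaluation code.

```python
def recall_at_1(query_items, ranked_results):
    """Compute recall@1 when the gallery and the query set are identical.

    query_items[i]    : (unique_id, label) of query i
    ranked_results[i] : gallery entries (unique_id, label) for query i,
                        sorted from most to least similar
    """
    hits = 0
    for (query_id, query_label), ranked in zip(query_items, ranked_results):
        # Drop the query image itself, identified by its unique id.
        candidates = [item for item in ranked if item[0] != query_id]
        if candidates and candidates[0][1] == query_label:
            hits += 1
    return hits / len(query_items)


queries = [(1, 99), (2, 99), (3, 7)]
ranked = [
    [(1, 99), (2, 99), (3, 7)],   # query 1: best non-self match has label 99
    [(2, 99), (3, 7), (1, 99)],   # query 2: best non-self match has label 7
    [(3, 7), (1, 99), (2, 99)],   # query 3: best non-self match has label 99
]
print(recall_at_1(queries, ranked))
```

Without the id-based filter, every query would trivially retrieve itself first and recall@1 would always be 1.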
Go back to the `PaddleClas` root directory:
```shell
# linux or mac
cd ../../
```
<a name="Training-and-Evaluation-on-Single-GPU"></a>
## 2. Single GPU-based Training and Evaluation
For training and evaluation on a single GPU, the `tools/train.py` and `tools/eval.py` scripts are recommended.
<a name="Model-Training"></a>
### 2.1 Model Training
Once you have prepared the configuration file, you can start training the image retrieval task in the following way. PaddleClas trains image retrieval models with metric learning; refer to [metric learning](#Metric-Learning) for an explanation.
```
python3 tools/train.py \
-c ./ppcls/configs/quick_start/MobileNetV1_retrieval.yaml \
-o Arch.Backbone.pretrained=True \
-o Global.device=gpu
```
`-c` specifies the path to the configuration file, and `-o` specifies the parameters to be modified or added. `-o Arch.Backbone.pretrained=True` indicates that the Backbone uses a pretrained model; `Arch.Backbone.pretrained` can also be set to the path of a specific model weight file, in which case replace it with the path to your own pretrained weights. `-o Global.device=gpu` indicates that the GPU is used for training. To train on the CPU, set `Global.device` to `cpu`.
For more detailed training configuration, you can also modify the corresponding configuration file of the model directly. Refer to the [configuration document](config_description_en.md) for specific configuration parameters.
Run the above command and check the output log. An example is as follows:
```
...
[Train][Epoch 1/50][Avg]CELoss: 6.59110, TripletLossV2: 0.54044, loss: 7.13154
...
[Eval][Epoch 1][Avg]recall1: 0.46962, recall5: 0.75608, mAP: 0.21238
...
```
The Backbone here is MobileNetV1. To use another backbone, override the parameter `Arch.Backbone.name`, for example by adding `-o Arch.Backbone.name={other Backbone}` to the command. In addition, since the input dimension of the `Neck` section differs between models, replacing the Backbone may also require overriding the Neck's input size in a similar way.
For the training loss, [CELoss](../../../ppcls/loss/celoss.py) and [TripletLossV2](../../../ppcls/loss/triplet.py) are used here, configured as follows.
```
Loss:
  Train:
    - CELoss:
        weight: 1.0
    - TripletLossV2:
        weight: 1.0
        margin: 0.5
```
The final total Loss is a weighted sum of all Losses, where `weight` defines the weight of a particular Loss in the total. If you want to use other Losses, change the Loss field in the configuration file. For the currently supported Losses, please refer to [Loss](../../../ppcls/loss).
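As a worked example of the weighted sum, the Python sketch below combines the two per-loss averages from the training log shown earlier, using the weights from the configuration above. The function `total_loss` is a hypothetical helper written for illustration, not PaddleClas code.

```python
def total_loss(losses, weights):
    """Return the weighted sum of the individual loss values."""
    return sum(weights[name] * value for name, value in losses.items())


# Values taken from the example log line; weights from the Loss config.
losses = {"CELoss": 6.59110, "TripletLossV2": 0.54044}
weights = {"CELoss": 1.0, "TripletLossV2": 1.0}
print(round(total_loss(losses, weights), 5))
```

With both weights set to 1.0, the result agrees with the `loss: 7.13154` value reported in that log line.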
<a name="Resume-Training"></a>
### 2.2 Resume Training
If a training task is interrupted for any reason, it can be resumed by loading the checkpoint weights file and continuing training.
```
python3 tools/train.py \
-c ./ppcls/configs/quick_start/MobileNetV1_retrieval.yaml \
-o Global.checkpoints="./output/RecModel/epoch_5" \
-o Global.device=gpu
```
There is no need to modify the configuration file. When resuming training, just set the `Global.checkpoints` parameter to the path of the checkpoint to load. This parameter loads the saved checkpoint weights together with the learning rate, optimizer state, and other training information.
**Note**
* The `-o Global.checkpoints` parameter does not need to include the file suffix of the checkpoint. During training, the command above generates checkpoint files as shown below. To continue training from epoch `5`, set `Global.checkpoints` to `"./output/RecModel/epoch_5"` and PaddleClas will automatically append the suffix.
```shell
output/
└── RecModel
    ├── best_model.pdopt
    ├── best_model.pdparams
    ├── best_model.pdstates
    ├── epoch_1.pdopt
    ├── epoch_1.pdparams
    ├── epoch_1.pdstates
    .
    .
    .
```
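The suffix-free prefix can be expanded into concrete file names as sketched below. The three suffixes come from the directory listing above; the helper names are hypothetical, and which of these files PaddleClas actually reads when resuming is not confirmed here.

```python
from pathlib import Path

# Suffixes seen in the directory listing above.
CHECKPOINT_SUFFIXES = (".pdparams", ".pdopt", ".pdstates")


def checkpoint_files(prefix):
    """Expand a suffix-free checkpoint prefix into the files it refers to."""
    # Append the suffix instead of using Path.with_suffix, so a prefix
    # that already contains dots is left intact.
    return [Path(prefix + suffix) for suffix in CHECKPOINT_SUFFIXES]


def missing_checkpoint_files(prefix):
    """Return the expected checkpoint files that do not exist on disk."""
    return [path for path in checkpoint_files(prefix) if not path.is_file()]


for path in checkpoint_files("./output/RecModel/epoch_5"):
    print(path.name)
```

Calling `missing_checkpoint_files` before resuming is a cheap way to catch a mistyped epoch number.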
<a name="Model-Evaluation"></a>
### 2.3 Model Evaluation
Model evaluation can be carried out with the following commands.
```bash
python3 tools/eval.py \
-c ./ppcls/configs/quick_start/MobileNetV1_retrieval.yaml \
-o Global.pretrained_model=./output/RecModel/best_model
```
The above command uses `./ppcls/configs/quick_start/MobileNetV1_retrieval.yaml` as the configuration file to evaluate the model `./output/RecModel/best_model` obtained from the training above. You can also configure the evaluation by changing parameters in the configuration file, or update the configuration with the `-o` parameter, as shown above.
Some of the configurable evaluation parameters are introduced as follows.
* `Arch.name`: the name of the model
* `Global.pretrained_model`: path to the pre-trained weights of the model to be evaluated. Unlike `Arch.Backbone.pretrained`, which holds only the weights of the Backbone part, this file holds the weights of the whole model. Model evaluation requires the weights of the whole model to be loaded.
* `Metric.Eval`: the metrics to evaluate; by default recall@1, recall@5 and mAP. To skip a metric, remove the corresponding entry from the configuration file. To add a metric, refer to the [Metric](../../../ppcls/metric/metrics.py) section and add the relevant metric to `Metric.Eval` in the configuration file.
**Note:**
* When loading the model to be evaluated, specify the path to the model file without the file suffix. PaddleClas automatically appends the `.pdparams` suffix, as in [2.2 Resume Training](#Resume-Training).
* Metric learning models are generally not evaluated with TopkAcc.
<a name="Export-Inference-Model"></a>
## 3. Export Inference Model
PaddlePaddle supports exporting a trained model as an inference model, which can then be used for prediction with the inference engine.
```bash
python3 tools/export_model.py \
-c ./ppcls/configs/quick_start/MobileNetV1_retrieval.yaml \
-o Global.pretrained_model=output/RecModel/best_model \
-o Global.save_inference_dir=./inference
```
`Global.pretrained_model` specifies the model file path, which again does not need to include the file suffix (as in [2.2 Resume Training](#Resume-Training)). `Global.save_inference_dir` specifies the directory to which the inference model is exported. When executed, the command generates the `./inference` directory, which contains the `inference.pdiparams`, `inference.pdiparams.info`, and `inference.pdmodel` files. The inference model saved here is truncated at the embedding feature level, i.e. the final output of the model is an n-dimensional embedding feature.
The above command will generate the model structure file (`inference.pdmodel`) and the model weights file (`inference.pdiparams`), which can then be used for inference using the inference engine. The process of inference using the inference model can be found in [Predictive inference based on the Python prediction engine](@shengyu).
## Basic knowledge
Image retrieval means that, given a query image containing a specific instance (e.g. a specific target, scene, item, etc.), the system finds images containing the same instance in an image database. Unlike image classification, image retrieval solves an open-set problem, where the training set may not contain the class of the image being recognised. The overall process of image retrieval is as follows: first, the images are represented as suitable feature vectors; second, a nearest neighbour search is performed on these feature vectors using Euclidean or cosine distance to find similar images in the gallery; finally, post-processing techniques can be used to fine-tune the retrieval results and determine information such as the category of the image being recognised. Therefore, the performance of an image retrieval algorithm depends mainly on the quality of the feature vectors that represent the images.
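The nearest neighbour step can be sketched in a few lines of plain Python. The feature vectors, labels, and the function names `cosine_similarity` and `search` below are made up for illustration; a real system would use embeddings produced by the trained model and usually a dedicated vector index.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query, gallery, top_k=2):
    """Return the top_k (label, similarity) pairs, most similar first."""
    scored = [(label, cosine_similarity(query, feature)) for label, feature in gallery]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Toy gallery of (label, feature vector) pairs; the values are made up.
gallery = [("cat", [0.9, 0.1, 0.0]), ("dog", [0.1, 0.9, 0.0]), ("car", [0.0, 0.1, 0.9])]
query = [0.8, 0.2, 0.1]
print(search(query, gallery))
```

The label of the best match can then serve as the recognition result, subject to any post-processing.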
<a name="Metric-Learning"></a>
- Metric Learning
Metric learning studies how to learn a distance function for a particular task so that it helps nearest-neighbour based algorithms (kNN, k-means, etc.) achieve better performance. Deep Metric Learning is a form of metric learning that aims to learn a mapping from the original features to a low-dimensional dense vector space (embedding space) such that, under commonly used distance functions (Euclidean distance, cosine distance, etc.), objects of the same class are close together in the embedding space, while objects of different classes are relatively far apart. Deep metric learning has been applied very successfully in computer vision, for example in face recognition, commodity recognition, image retrieval, and pedestrian re-identification.
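One common way to express "same class close, different class far" as a training objective is the triplet margin loss. The sketch below shows the standard textbook form; it is not taken from `ppcls/loss/triplet.py`, whose `TripletLossV2` may add sample mining or normalisation. The default `margin=0.5` simply mirrors the value in the loss configuration shown earlier.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.5):
    """Zero once the negative is at least `margin` farther away than the positive."""
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)


anchor, positive = [0.0, 0.0], [0.3, 0.4]   # anchor-positive distance is about 0.5
near_negative = [0.6, 0.0]                   # distance about 0.6: violates the margin
far_negative = [3.0, 4.0]                    # distance 5.0: satisfies the margin
print(triplet_loss(anchor, positive, near_negative))
print(triplet_loss(anchor, positive, far_negative))
```

Only the first triplet yields a positive loss, so only it would push the embeddings to change.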
<a name="Introduction to Image Retrieval Datasets"></a>
- Introduction to image retrieval datasets
- Training Dataset: used to train the model so that it can learn the image features of the collection.
- Gallery Dataset: used to provide the gallery data for the image retrieval task. The gallery dataset can be the same as the training set or the test set, or different.
- Test Set (Query Dataset): used to test the quality of the model. Usually, features are extracted from each test image and matched against the features of the gallery data to obtain recognition results, and the metrics for the whole test set are then calculated from these results.
<a name="Image Retrieval Evaluation Metrics"></a>
- Image Retrieval Evaluation Metrics
<a name="Recall"></a>
- recall: number of predicted positive cases with positive labels / number of cases with positive labels
- recall@1: number of predicted positive cases in the top-1 results with positive labels / number of cases with positive labels
- recall@5: number of predicted positive cases in the top-5 results with positive labels / number of cases with positive labels
<a name="mean Average Precision"></a>
- mean Average Precision(mAP)
- AP: AP refers to the average precision on different recall rates
- mAP: Average of the APs for all images in the test set
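The definitions above can be written out as a short sketch. It follows the formulas exactly as stated in this list, with AP computed as the mean of the precision values at each rank where a relevant item appears. The function names are illustrative, and the implementation in `ppcls/metric/metrics.py` may differ in detail.

```python
def recall_at_k(ranked_relevance, num_relevant, k):
    """Relevant items found in the top-k results / all relevant items."""
    return sum(ranked_relevance[:k]) / num_relevant


def average_precision(ranked_relevance, num_relevant):
    """Mean of the precision measured at each rank holding a relevant item."""
    hits, precision_sum = 0, 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / num_relevant


def mean_average_precision(queries):
    """Average AP over (ranked_relevance, num_relevant) pairs, one per query."""
    return sum(average_precision(r, n) for r, n in queries) / len(queries)


# Ranked results for one query: 1 = same instance as the query, 0 = different.
ranked = [1, 0, 1, 0, 0]
print(recall_at_k(ranked, num_relevant=2, k=1))   # 0.5
print(average_precision(ranked, num_relevant=2))  # (1/1 + 2/3) / 2
```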
# Image Classification
---
Image Classification is a fundamental task that classifies the image by semantic information and assigns it to a specific label. Image Classification is the foundation of Computer Vision tasks, such as object detection, image segmentation, object tracking and behavior analysis. Image Classification has comprehensive applications, including face recognition and smart video analysis in the security and protection field, traffic scenario recognition in the traffic field, image retrieval and electronic photo album classification in the internet industry, and image recognition in the medical industry.
Generally speaking, Image Classification attempts to comprehend an image as a whole through feature engineering and assigns labels with a classifier. Hence, how to extract image features is the essential part. Before deep learning, the most widely used classification method was the Bag of Words model. Image Classification based on deep learning, however, can learn hierarchical feature descriptions through supervised or unsupervised learning, replacing manual feature selection. In recent years, Convolutional Neural Networks (CNNs) have performed very well in the image field. A CNN takes pixel information directly as input, preserving as much of the input image's information as possible. The model extracts features through convolution and outputs the classification result directly. This end-to-end approach achieves strong performance and is widely applied.
Image Classification is a very basic but important field in the subject of computer vision. Its research results have always influenced the development of computer vision and even deep learning. Image classification has many sub-fields, such as multi-label image classification and fine-grained image classification. Here is only a brief description of single-label image classification.
## 1 Dataset Introduction
### 1.1 ImageNet-1k
The ImageNet project is a large-scale visual database for research on visual object recognition software. More than 14 million images have been manually annotated to indicate the objects they contain, and bounding boxes are provided for at least 1 million of them. ImageNet-1k is a subset of the ImageNet dataset that contains 1000 categories. The training set contains 1,281,167 images, and the validation set contains 50,000 images. Since 2010, the ImageNet project has held an annual image classification competition, the ImageNet Large-scale Visual Recognition Challenge (ILSVRC). The dataset used in the challenge is ImageNet-1k. ImageNet-1k has become one of the most important datasets in computer vision and has driven the development of the whole field. Many downstream computer vision tasks initialize their models from weights trained on this dataset.
### 1.2 CIFAR-10/CIFAR-100
The CIFAR-10 data set consists of 60,000 color images in 10 categories, with an image resolution of 32x32, and each category has 6000 images, including 5000 in the training set and 1000 in the validation set. 10 different classes represent airplanes, cars, birds, cats, deer, dogs, frogs, horses, ships and trucks. The CIFAR-100 data set is an extension of CIFAR-10. It consists of 60,000 color images in 100 classes, with an image resolution of 32x32, and each class has 600 images, including 500 in the training set and 100 in the validation set. Researchers can try different algorithms quickly because these two data sets are small in scale. These two data sets are also commonly used data sets for testing the quality of models in the image classification field.
## 2 Image Classification Process
The prepared training data is preprocessed and then passed through the image classification model. The model output and the ground-truth label are fed into a cross-entropy loss function, which describes the direction in which the model should converge. As the image data is fed through the model, an optimizer performs gradient descent on the loss: the gradient information is propagated back to the model and the model weights are updated. Traversing the data repeatedly in this way finally yields an image classification model.
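The loss and update step can be illustrated with a deliberately simplified sketch. It computes the softmax cross-entropy for one sample and applies a single gradient descent step directly to the logits; a real model would instead propagate the gradient back to its weights. All names and numbers here are illustrative.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities."""
    peak = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Negative log-probability assigned to the true class."""
    return -math.log(softmax(logits)[label])


def gradient_step(logits, label, lr=1.0):
    """One descent step; d(loss)/d(logit_i) = p_i - 1 if i is the label, else p_i."""
    probs = softmax(logits)
    return [z - lr * (p - (1.0 if i == label else 0.0))
            for i, (z, p) in enumerate(zip(logits, probs))]


logits, label = [1.0, 2.0, 0.5], 0
before = cross_entropy(logits, label)
after = cross_entropy(gradient_step(logits, label), label)
print(before, after)  # the loss is lower after the step
```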
### 2.1 Data and its preprocessing
The quality and quantity of data often determine the performance of a model. In the field of image classification, data includes images and labels. In most cases, labeled data is scarce, so the amount of data rarely reaches the point at which the model is saturated. To enable the model to learn more image features, extensive image transformation or data augmentation is applied before the images enter the model. This ensures the diversity of the input image data and ultimately gives the model better generalization. PaddleClas provides standard image transformations for training on ImageNet-1k, and also provides 8 data augmentation methods. For the related code, please refer to [data preprocess](../../../ppcls/data/preprocess). For the configuration files, refer to [Data Augmentation Configuration File](../../../ppcls/configs/ImageNet/DataAugment).
### 2.2 Prepare the model
Once the data is fixed, the model often determines the upper limit of the final accuracy. In the field of image classification, classic models keep emerging. PaddleClas provides 35 series and a total of 164 ImageNet pre-trained models. For accuracy, speed and other indicators, please refer to [Backbone network introduction](../models).
### 2.3 Train the model
After preparing the data and model, you can start training the model and update the parameters of the model. After many iterations, a trained model can finally be obtained for image classification tasks. The training process of image classification requires a lot of experience and involves the setting of many hyperparameters. PaddleClas provides a series of [training tuning methods](../models/Tricks_en.md), which can quickly help you obtain a high-precision model.
### 2.4 Evaluate the model
After a model is trained, the evaluation results of the model on the validation set can determine the performance of the model. The evaluation index is generally Top1-Acc or Top5-Acc. The higher the index, the better the model performance.
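Top-k accuracy is simple to compute from per-class scores, as the sketch below shows. The scores and labels are made up, and `topk_accuracy` is an illustrative helper rather than the PaddleClas metric class.

```python
def topk_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    correct = 0
    for sample_scores, label in zip(scores, labels):
        ranked = sorted(range(len(sample_scores)),
                        key=lambda c: sample_scores[c], reverse=True)
        correct += label in ranked[:k]
    return correct / len(labels)


# Made-up class scores for three images over four classes.
scores = [[0.1, 0.6, 0.2, 0.1], [0.5, 0.3, 0.1, 0.1], [0.2, 0.2, 0.5, 0.1]]
labels = [1, 1, 3]
print(topk_accuracy(scores, labels, k=1))
print(topk_accuracy(scores, labels, k=2))
```

Top1-Acc and Top5-Acc correspond to `k=1` and `k=5`. Top-k accuracy never decreases as `k` grows.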
## 3 Main Algorithms Introduction
- LeNet: Yann LeCun et al. first applied convolutional neural networks to image classification tasks in the 1990s and proposed LeNet, which achieved great success in handwritten digit recognition.
- AlexNet: Alex Krizhevsky et al. proposed AlexNet in 2012, applied it to ImageNet, and won the 2012 ImageNet classification competition. Since then, deep learning has become popular.
- VGG: Simonyan and Zisserman proposed the VGG network structure in 2014. It builds the entire network by stacking smaller convolution kernels and achieves better performance in ImageNet classification, providing new ideas for subsequent network structure design.
- GoogLeNet: Christian Szegedy et al. proposed GoogLeNet in 2014. This network uses a multi-branch structure and a global average pooling layer (GAP). While maintaining the accuracy of the model, the amount of model storage and calculation is greatly reduced. The network won the 2014 ImageNet classification competition.
- ResNet: Kaiming He et al. proposed ResNet in 2015, which deepened the depth of the network by introducing a residual module. In the end, the network reduced the recognition error rate of ImageNet classification to 3.6%, which exceeded the recognition accuracy of normal human eyes for the first time.
- DenseNet: Huang Gao et al. proposed DenseNet in 2017. The network designed a denser connected block and achieved higher performance with a smaller amount of parameters.
- EfficientNet: Mingxing Tan et al. proposed EfficientNet in 2019. This network balances the width of the network, the depth of the network, and the resolution of the input image. With the same FLOPs and number of parameters, its accuracy reached the state of the art.
For more algorithm introduction, please refer to [Algorithm Introduction](../models).
tutorials
================================
.. toctree::
   :maxdepth: 1

   install_en.md
   quick_start_en.md
   data_en.md
   getting_started_en.md
   config_en.md
# Installation
---
This tutorial introduces how to install PaddleClas and its requirements.
## 1. Install PaddlePaddle
`PaddlePaddle 2.1` or later is required for PaddleClas. You can use the following steps to install PaddlePaddle.
### 1.1 Environment requirements
- python 3.x
- cuda >= 10.1 (necessary if you want to use paddlepaddle-gpu)
- cudnn >= 7.6.4 (necessary if you want to use paddlepaddle-gpu)
- nccl >= 2.1.2 (necessary if you want to use distributed training/eval)
- gcc >= 8.2
Docker is recommended for running PaddleClas. For more detailed information about docker and nvidia-docker, you can refer to the [tutorial](https://www.runoob.com/docker/docker-tutorial.html).
When you use CUDA 10.1, the driver version needs to be greater than or equal to 418.39. When you use CUDA 10.2, the driver version needs to be greater than or equal to 440.33. For other CUDA versions and their required driver versions, refer to the [link](https://docs.nvidia.com/deploy/cuda-compatibility/index.html).
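The driver requirement can be checked mechanically. The sketch below encodes only the two minimum versions quoted above and compares them against a driver version string that you supply yourself (for example, one read from the `nvidia-smi` output). The table and function names are hypothetical.

```python
# Minimum driver versions quoted in the text; other CUDA versions are not covered.
MIN_DRIVER = {"10.1": (418, 39), "10.2": (440, 33)}


def parse_version(text):
    """Turn a dotted version string such as "440.33.01" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def driver_is_sufficient(cuda_version, driver_version):
    """True if driver_version meets the minimum quoted for cuda_version."""
    # Compare only the first two components, matching the quoted minimums.
    return parse_version(driver_version)[:2] >= MIN_DRIVER[cuda_version]


print(driver_is_sufficient("10.2", "440.33.01"))
print(driver_is_sufficient("10.1", "410.48"))
```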
If you do not want to use docker, you can skip section 1.2 and go into section 1.3 directly.
### 1.2 (Recommended) Prepare a docker environment
The first time you use this docker image, it will be downloaded automatically. Please be patient.
```
# Switch to the working directory
cd /home/Projects
# You need to create a docker container for the first run, and do not need to run the current command when you run it again
# Create a docker container named ppcls and map the current directory to the /paddle directory of the container
# It is recommended to set a shared memory greater than or equal to 8G through the --shm-size parameter
sudo docker run --name ppcls -v $PWD:/paddle --shm-size=8G --network=host -it paddlepaddle/paddle:2.1.0 /bin/bash
# Use the following command to create a container if you want to use GPU in the container
sudo nvidia-docker run --name ppcls -v $PWD:/paddle --shm-size=8G --network=host -it paddlepaddle/paddle:2.1.0-gpu-cuda10.2-cudnn7 /bin/bash
```
You can also visit [DockerHub](https://hub.docker.com/r/paddlepaddle/paddle/tags/) to get more docker images.
```
# Press Ctrl+P+Q to detach from the container; to re-enter it, run the following command:
sudo docker exec -it ppcls /bin/bash
```
### 1.3 Install PaddlePaddle using pip
If you want to use PaddlePaddle on GPU, you can use the following command to install PaddlePaddle.
```bash
pip3 install paddlepaddle-gpu --upgrade -i https://mirror.baidu.com/pypi/simple
```
If you want to use PaddlePaddle on CPU, you can use the following command to install PaddlePaddle.
```bash
pip3 install paddlepaddle --upgrade -i https://mirror.baidu.com/pypi/simple
```
**Note:**
* If you have already installed CPU version of PaddlePaddle and want to use GPU version now, you should uninstall CPU version of PaddlePaddle and then install GPU version to avoid package confusion.
* You can also compile PaddlePaddle from source code. Please refer to the [PaddlePaddle Installation tutorial](http://www.paddlepaddle.org.cn/install/quick) for more compilation options.
### 1.4 Verify Installation process
```python
import paddle
paddle.utils.run_check()
```
Check PaddlePaddle version:
```bash
python3 -c "import paddle; print(paddle.__version__)"
```
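The printed version can also be compared against the required minimum programmatically. The sketch below is a hypothetical helper that works on any version string, so it runs without PaddlePaddle installed. The default minimum of `(2, 1)` reflects the "PaddlePaddle 2.1 or later" requirement stated above.

```python
import re


def meets_minimum(version, minimum=(2, 1)):
    """True if a version string such as "2.1.0" is at least `minimum`."""
    # Keep only the leading numeric components, e.g. "2.1.0" -> (2, 1).
    numbers = tuple(int(n) for n in re.findall(r"\d+", version)[:len(minimum)])
    return numbers >= minimum


print(meets_minimum("2.1.0"))
print(meets_minimum("2.0.2"))
```

In practice you would pass `paddle.__version__` to the function. Note that a version string of `0.0.0` would be reported as too old by this simple check.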
Note:
- If you compile from source, make sure the source code version is later than PaddlePaddle 2.0.
- Specify **WITH_DISTRIBUTE=ON** when compiling. Please refer to [Instruction](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/install/Tables.html#id3) for more details.
- When running in docker, set the parameter `--shm-size=8G` when creating the container to ensure that it has enough shared memory for Paddle's dataloader acceleration. If conditions permit, you can set it to a larger value.
## 2. Install PaddleClas
### 2.1 Clone PaddleClas source code
```
git clone https://github.com/PaddlePaddle/PaddleClas.git -b develop
```
If it is too slow for you to download from github, you can download PaddleClas from gitee. The command is as follows.
```bash
git clone https://gitee.com/paddlepaddle/PaddleClas.git -b develop
```
### 2.2 Install requirements
PaddleClas dependencies are listed in the file `requirements.txt`. You can use the following command to install them.
```
pip3 install --upgrade -r requirements.txt -i https://mirror.baidu.com/pypi/simple
```
...@@ -307,7 +307,7 @@ sh tools/train.sh ...@@ -307,7 +307,7 @@ sh tools/train.sh
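* After data augmentation is applied, the model may tend to underfit. It is recommended to reduce `l2_decay` appropriately to obtain higher validation accuracy. * After data augmentation is applied, the model may tend to underfit. It is recommended to reduce `l2_decay` appropriately to obtain higher validation accuracy.
* Almost every type of image augmentation has hyperparameters. We only provide hyperparameters based on ImageNet-1k; for other datasets, users need to tune the hyperparameters themselves. For the meaning of specific hyperparameters, users can read the relevant papers; for tuning methods, refer to the training tricks chapter * Almost every type of image augmentation has hyperparameters. We only provide hyperparameters based on ImageNet-1k; for other datasets, users need to tune the hyperparameters themselves. For the meaning of specific hyperparameters, users can read the relevant papers; for tuning methods, refer to [Training Tricks](../models_training/train_strategy.md)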
<a name="4"></a> <a name="4"></a>
## 4. Experimental Results ## 4. Experimental Results
......
...@@ -85,7 +85,7 @@ ...@@ -85,7 +85,7 @@
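- Use distillation to improve the capability of the small model; see [Model Distillation](../algorithm_introduction/knowledge_distillation.md) - Use distillation to improve the capability of the small model; see [Model Distillation](../algorithm_introduction/knowledge_distillation.md)
- Supplement the dataset. Add bad-case data for the misrecognized samples - Supplement the dataset. Add bad-case data for the misrecognized samples
After model training is complete, update the retrieval gallery by following [1.2 Gallery Update](#1.2 检索库更新). Also test the whole pipeline; if the accuracy does not meet expectations, repeat this step. After model training is complete, update the retrieval gallery by following [1.2 Gallery Update](#1.2). Also test the whole pipeline; if the accuracy does not meet expectations, repeat this step.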
<a name="3"></a> <a name="3"></a>
......
...@@ -15,7 +15,7 @@ ...@@ -15,7 +15,7 @@
- [2.2.4 GridMask](#2.2.4) - [2.2.4 GridMask](#2.2.4)
- [2.3 Image Mixing](#2.3) - [2.3 Image Mixing](#2.3)
- [2.3.1 Mixup](#2.3.1) - [2.3.1 Mixup](#2.3.1)
- [2.3.2 Cutmix](#3.2.2) - [2.3.2 Cutmix](#2.3.2)
<a name="1"></a> <a name="1"></a>
## 1. Introduction to Data Augmentation ## 1. Introduction to Data Augmentation
......
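...@@ -51,7 +51,7 @@ The CIFAR-10 dataset consists of 60000 color images in 10 classes, with an image resolution ...@@ -51,7 +51,7 @@ The CIFAR-10 dataset consists of 60000 color images in 10 classes, with an image resolution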
<a name="2.3"></a> <a name="2.3"></a>
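### 2.3 Model Training ### 2.3 Model Training
After the data and model are prepared, you can start iterating the model and updating its parameters. After many iterations, a trained model is obtained for the image classification task. Training an image classification model requires a lot of experience and involves setting many hyperparameters. PaddleClas provides a series of [training tuning methods](../models/Tricks.md), which can quickly help you obtain a high-accuracy model. After the data and model are prepared, you can start iterating the model and updating its parameters. After many iterations, a trained model is obtained for the image classification task. Training an image classification model requires a lot of experience and involves setting many hyperparameters. PaddleClas provides a series of [training tuning methods](../models_training/train_strategy.md), which can quickly help you obtain a high-accuracy model.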
<a name="2.4"></a> <a name="2.4"></a>
### 2.4 Model Evaluation ### 2.4 Model Evaluation
......
...@@ -192,14 +192,14 @@ paddleclas --model_name='ResNet50' --infer_imgs='https://raw.githubusercontent.c ...@@ -192,14 +192,14 @@ paddleclas --model_name='ResNet50' --infer_imgs='https://raw.githubusercontent.c
<a name="4.6"></a> <a name="4.6"></a>
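### 4.6 Prediction on `NumPy.ndarray` Data ### 4.6 Prediction on `NumPy.ndarray` Data
In Python, you can run prediction on image data in `Numpy.ndarray` format by passing it through the `infer_imgs` parameter. Note that the image data must be 3-channel image data In Python, you can run prediction on image data in `Numpy.ndarray` format by passing it through the `infer_imgs` parameter. Note that the models provided by PaddleClas support only 3-channel image data, with the channel order `RGB`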
* python * python
```python ```python
import cv2 import cv2
from paddleclas import PaddleClas from paddleclas import PaddleClas
clas = PaddleClas(model_name='ResNet50') clas = PaddleClas(model_name='ResNet50')
infer_imgs = cv2.imread("docs/images/inference_deployment/whl_demo.jpg") infer_imgs = cv2.imread("docs/images/inference_deployment/whl_demo.jpg")[:, :, ::-1]
result=clas.predict(infer_imgs) result=clas.predict(infer_imgs)
print(next(result)) print(next(result))
``` ```
......
...@@ -2,8 +2,8 @@ ...@@ -2,8 +2,8 @@
--- ---
## Contents ## Contents
* [1. Overview](#) * [1. Overview](#1)
* [2. Accuracy, FLOPs and Parameters](#FLOPs) * [2. Accuracy, FLOPs and Parameters](#2)
<a name='1'></a> <a name='1'></a>
......
...@@ -99,7 +99,7 @@ DataLoader: ...@@ -99,7 +99,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -101,7 +101,7 @@ DataLoader: ...@@ -101,7 +101,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -100,7 +100,7 @@ DataLoader: ...@@ -100,7 +100,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -100,7 +100,7 @@ DataLoader: ...@@ -100,7 +100,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -100,7 +100,7 @@ DataLoader: ...@@ -100,7 +100,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -100,7 +100,7 @@ DataLoader: ...@@ -100,7 +100,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -100,7 +100,7 @@ DataLoader: ...@@ -100,7 +100,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -100,7 +100,7 @@ DataLoader: ...@@ -100,7 +100,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -100,7 +100,7 @@ DataLoader: ...@@ -100,7 +100,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -100,7 +100,7 @@ DataLoader: ...@@ -100,7 +100,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -100,7 +100,7 @@ DataLoader: ...@@ -100,7 +100,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -100,7 +100,7 @@ DataLoader: ...@@ -100,7 +100,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -102,7 +102,7 @@ DataLoader: ...@@ -102,7 +102,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -102,7 +102,7 @@ DataLoader: ...@@ -102,7 +102,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -102,7 +102,7 @@ DataLoader: ...@@ -102,7 +102,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -102,7 +102,7 @@ DataLoader: ...@@ -102,7 +102,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -102,7 +102,7 @@ DataLoader: ...@@ -102,7 +102,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -102,7 +102,7 @@ DataLoader: ...@@ -102,7 +102,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -99,7 +99,7 @@ DataLoader: ...@@ -99,7 +99,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -98,7 +98,7 @@ DataLoader: ...@@ -98,7 +98,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -101,7 +101,7 @@ DataLoader: ...@@ -101,7 +101,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -101,7 +101,7 @@ DataLoader: ...@@ -101,7 +101,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -104,7 +104,7 @@ DataLoader: ...@@ -104,7 +104,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -99,7 +99,7 @@ DataLoader: ...@@ -99,7 +99,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -101,7 +101,7 @@ DataLoader: ...@@ -101,7 +101,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -101,7 +101,7 @@ DataLoader: ...@@ -101,7 +101,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -104,7 +104,7 @@ DataLoader: ...@@ -104,7 +104,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -127,7 +127,7 @@ DataLoader: ...@@ -127,7 +127,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -127,7 +127,7 @@ DataLoader: ...@@ -127,7 +127,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -127,7 +127,7 @@ DataLoader: ...@@ -127,7 +127,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -127,7 +127,7 @@ DataLoader: ...@@ -127,7 +127,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -127,7 +127,7 @@ DataLoader: ...@@ -127,7 +127,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -127,7 +127,7 @@ DataLoader: ...@@ -127,7 +127,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -127,7 +127,7 @@ DataLoader: ...@@ -127,7 +127,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -127,7 +127,7 @@ DataLoader: ...@@ -127,7 +127,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -100,7 +100,7 @@ DataLoader: ...@@ -100,7 +100,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -100,7 +100,7 @@ DataLoader: ...@@ -100,7 +100,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -100,7 +100,7 @@ DataLoader: ...@@ -100,7 +100,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -100,7 +100,7 @@ DataLoader: ...@@ -100,7 +100,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -100,7 +100,7 @@ DataLoader: ...@@ -100,7 +100,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -122,7 +122,7 @@ DataLoader: ...@@ -122,7 +122,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: "docs/images/whl/demo.jpg" infer_imgs: "docs/images/inference_deployment/whl_demo.jpg"
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -99,7 +99,7 @@ DataLoader: ...@@ -99,7 +99,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -99,7 +99,7 @@ DataLoader: ...@@ -99,7 +99,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -99,7 +99,7 @@ DataLoader: ...@@ -99,7 +99,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -99,7 +99,7 @@ DataLoader: ...@@ -99,7 +99,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -103,7 +103,7 @@ DataLoader: ...@@ -103,7 +103,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -103,7 +103,7 @@ DataLoader: ...@@ -103,7 +103,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -103,7 +103,7 @@ DataLoader: ...@@ -103,7 +103,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -103,7 +103,7 @@ DataLoader: ...@@ -103,7 +103,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -103,7 +103,7 @@ DataLoader: ...@@ -103,7 +103,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -103,7 +103,7 @@ DataLoader: ...@@ -103,7 +103,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -103,7 +103,7 @@ DataLoader: ...@@ -103,7 +103,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -103,7 +103,7 @@ DataLoader: ...@@ -103,7 +103,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -100,7 +100,7 @@ DataLoader: ...@@ -100,7 +100,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......
...@@ -100,7 +100,7 @@ DataLoader: ...@@ -100,7 +100,7 @@ DataLoader:
use_shared_memory: True use_shared_memory: True
Infer: Infer:
infer_imgs: docs/images/whl/demo.jpg infer_imgs: docs/images/inference_deployment/whl_demo.jpg
batch_size: 10 batch_size: 10
transforms: transforms:
- DecodeImage: - DecodeImage:
......