# PaddleClas FAQ Summary - 2021 Season 2

## Before You Read

- We collect some frequently asked questions from issues and user groups since PaddleClas was open-sourced, and provide brief answers here, hoping to offer a reference and save you from unnecessary detours.
- Image classification, recognition, and retrieval are fields with many experts, and models and papers are updated quickly. The answers here mainly rely on our limited project experience, so they cannot cover every aspect. We sincerely hope that knowledgeable readers will help supplement and correct the content; many thanks.

## Contents

- [Recent Updates](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/faq_series/faq_2021_s2.md#近期更新)(2021.09.08)
- [Selection](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/faq_series/faq_2021_s2.md#精选)
- [1. Theory](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/faq_series/faq_2021_s2.md#1)
  - [1.1 Basic Knowledge of PaddleClas](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/faq_series/faq_2021_s2.md#1.1)
  - [1.2 Backbone Network and Pre-trained Model Library](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/faq_series/faq_2021_s2.md#1.2)
  - [1.3 Image Classification](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/faq_series/faq_2021_s2.md#1.3)
  - [1.4 General Detection](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/faq_series/faq_2021_s2.md#1.4)
  - [1.5 Image Recognition](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/faq_series/faq_2021_s2.md#1.5)
  - [1.6 Vector Search](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/faq_series/faq_2021_s2.md#1.6)
- [2. Practice](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/faq_series/faq_2021_s2.md#2)
  - [2.1 Common Problems in Training and Evaluation](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/faq_series/faq_2021_s2.md#2.1)
  - [2.2 Image Classification](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/faq_series/faq_2021_s2.md#2.2)
  - [2.3 General Detection](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/faq_series/faq_2021_s2.md#2.3)
  - [2.4 Image Recognition](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/faq_series/faq_2021_s2.md#2.4)
  - [2.5 Vector Search](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/faq_series/faq_2021_s2.md#2.5)
  - [2.6 Model Inference Deployment](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/faq_series/faq_2021_s2.md#2.6)



## Recent Updates

#### Q2.1.7: How to tackle the reported error `ERROR: Unexpected segmentation fault encountered in DataLoader workers.` during training?

**A**

Try setting the field `num_workers` in the training configuration file to `0`; try making the field `batch_size` in the configuration file smaller; and make sure that the dataset format and the dataset path in the configuration file are correct.

#### Q2.1.8: How to use `Mixup` and `Cutmix` during training?

**A**

- For `Mixup`, please refer to [Mixup](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/ppcls/configs/ImageNet/DataAugment/ResNet50_Mixup.yaml#L63-L65); for `Cutmix`, please refer to [Cutmix](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/ppcls/configs/ImageNet/DataAugment/ResNet50_Cutmix.yaml#L63-L65).
- The training accuracy (Acc) metric cannot be calculated when using `Mixup` or `Cutmix` for training, so you need to remove the field `Metric.Train.TopkAcc` from the configuration file; please refer to [Metric.Train.TopkAcc](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/ppcls/configs/ImageNet/DataAugment/ResNet50_Cutmix.yaml#L125-L128) for more details.

#### Q2.1.9: What are the fields `Global.pretrain_model` and `Global.checkpoints` used for in the training configuration file yaml?

**A**

- When `fine-tune` is required, the path of the pre-trained model weights file can be configured via the field `Global.pretrain_model`; the file usually has the suffix `.pdparams`.
- During training, the training program automatically saves the breakpoint information at the end of each epoch, including the optimizer information `.pdopt` and the model weights `.pdparams`. If the training process is unexpectedly interrupted and needs to be resumed, the breakpoint information file saved during training can be configured via the field `Global.checkpoints`, for example `checkpoints: ./output/ResNet18/epoch_18` to restore the breakpoint information saved at the end of epoch 18. PaddleClas will then automatically load `epoch_18.pdopt` and `epoch_18.pdparams` and continue training from epoch 19.

#### Q2.6.3: How to convert the model to `ONNX` format?

**A**: Paddle supports two ways, both relying on the `paddle2onnx` tool, so `paddle2onnx` needs to be installed first.

```
pip install paddle2onnx
```

- From inference model to ONNX format model.

  Take the `combined` format inference model (containing `.pdmodel` and `.pdiparams` files) exported from the dynamic graph as an example, run the following command to convert the model format:

  ```
  paddle2onnx --model_dir ${model_path}  --model_filename  ${model_path}/inference.pdmodel --params_filename ${model_path}/inference.pdiparams --save_file ${save_path}/model.onnx --enable_onnx_checker True
  ```

  In the above commands:

  - `model_dir`: this parameter needs to contain `.pdmodel` and `.pdiparams` files.
  - `model_filename`: this parameter is used to specify the path of the `.pdmodel` file under the parameter `model_dir`.
  - `params_filename`: this parameter is used to specify the path of the `.pdiparams` file under the parameter `model_dir`.
  - `save_file`: this parameter is used to specify the save path of the converted model.

  For the conversion of a non-`combined` format inference model exported from a static graph (usually containing the file `__model__` and multiple parameter files), and for more parameter descriptions, please refer to the "Parameter options" section of the official [paddle2onnx](https://github.com/PaddlePaddle/Paddle2ONNX/blob/develop/README_zh.md) documentation.

- Exporting ONNX format models directly from the model networking code.

  Take the model networking code of dynamic graphs as an example, the model class is a subclass that inherits from `paddle.nn.Layer` and the code is shown below.

  ```
  import paddle
  from paddle.static import InputSpec

  class SimpleNet(paddle.nn.Layer):
      def __init__(self):
          super().__init__()
          # build the real model structure here; a single conv layer is used as a placeholder
          self.conv = paddle.nn.Conv2D(in_channels=3, out_channels=8, kernel_size=3)

      def forward(self, x):
          # implement the real forward computation here
          return self.conv(x)

  net = SimpleNet()
  x_spec = InputSpec(shape=[None, 3, 224, 224], dtype='float32', name='x')
  paddle.onnx.export(layer=net, path="./SimpleNet", input_spec=[x_spec])
  ```

  In the above code:

  - The `InputSpec()` function is used to describe the signature of the model input, including the `shape`, `dtype` and `name` of the input data (can be omitted).
  - The `paddle.onnx.export()` function needs to be given the network object `net` to be exported, the save path of the exported model (`path`), and the description of the model input (`input_spec`).

  Note that `paddlepaddle` `2.0.0` or above is required. See [paddle.onnx.export](https://www.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/onnx/) for more details on the parameters of the `paddle.onnx.export()` function.

#### Q2.5.4: How to set the parameter `pq_size` when build searches the base library?

**A**

`pq_size` is a parameter of the PQ search algorithm. PQ search can be roughly understood as a "tiered" search, and `pq_size` is the "capacity" of each tier, so the setting of this parameter affects search performance. However, when the total amount of data in the base library is not large (less than 10,000 entries), this parameter has little impact on performance, so for most application scenarios there is no need to modify it when building the base library. For more details on the PQ search algorithm, see the related [paper](https://lear.inrialpes.fr/pubs/2011/JDS11/jegou_searching_with_quantization.pdf).



## Selection



## 1. Theory



### 1.1 Basic Knowledge of PaddleClas

#### Q1.1.1 Differences between PaddleClas and PaddleDetection

**A**: PaddleClas is an image recognition repo that integrates mainbody detection, image classification, and image retrieval to solve most image recognition problems, and can be easily adopted by users to solve small-sample and multi-category problems in their own fields. PaddleDetection provides capabilities such as object detection, keypoint detection, and multi-object tracking, which makes it convenient to locate points and regions of interest in images; it is widely used in industrial quality inspection, remote sensing image detection, unmanned inspection, and other projects.

#### Q1.1.3: What does the parameter momentum mean in the Momentum optimizer?

**A**:

The Momentum optimizer builds on the SGD optimizer and introduces the concept of "momentum". In the SGD optimizer, the update of the parameter `w` at time `t+1` can be expressed as

```
w_t+1 = w_t - lr * grad
```

where `lr` is the learning rate and `grad` is the gradient of the parameter `w` at the current step. With the introduction of momentum, the update of the parameter `w` can be expressed as

```
v_t+1 = m * v_t + lr * grad
w_t+1 = w_t - v_t+1
```

Here `m` is the `momentum`, i.e. the weight given to the accumulated momentum, generally set to `0.9`. Since this value is less than `1`, the earlier a gradient is, the smaller its influence on the current update. For example, with `m = 0.9`, at time `t` the gradient from time `t-5` is weighted by `0.9 ^ 5 = 0.59049`, while the gradient from time `t-2` is weighted by `0.9 ^ 2 = 0.81`. Intuitively, gradient information from too "far away" is of little significance for the current update, while "recent" historical gradient information matters more.

[![img](https://github.com/PaddlePaddle/PaddleClas/raw/release/2.3/docs/images/faq/momentum.jpeg)](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/images/faq/momentum.jpeg)

By introducing the concept of momentum, the effect of historical updates is taken into account in parameter updates, which speeds up convergence and alleviates the loss oscillation caused by the `SGD` optimizer.
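
To make the update rule above concrete, here is a minimal NumPy sketch of the two formulas on a toy loss `f(w) = w^2`; it only illustrates the update rule and is not PaddleClas optimizer code.

```
import numpy as np

lr, m = 0.1, 0.9                 # learning rate and momentum
w = np.array([1.0])              # parameter
v = np.array([0.0])              # accumulated momentum

for step in range(5):
    grad = 2 * w                 # gradient of the toy loss f(w) = w^2
    v = m * v + lr * grad        # v_t+1 = m * v_t + lr * grad
    w = w - v                    # w_t+1 = w_t - v_t+1
```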

#### Q1.1.4: Does PaddleClas have an implementation of the paper `Fixing the train-test resolution discrepancy`?

**A**: Currently, it is not implemented. If needed, you can try to implement it yourself by modifying the code. In brief, the idea proposed in this paper is to fine-tune the final FC layer of the trained model using a larger input resolution. Specifically, first train the network on a lower-resolution dataset, then set `stop_gradient=True` for the weights of all layers of the network except the final FC layer, and finally fine-tune the network with larger-resolution inputs.
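
A minimal sketch of the freezing step is shown below. It uses `paddle.vision.models.resnet50` only as a stand-in for a trained classification model, and assumes the final FC layer parameters are named with the prefix `fc` (the actual name depends on the network definition).

```
import paddle
from paddle.vision.models import resnet50

# stand-in for the model already trained at the lower resolution
model = resnet50()

# freeze all layers except the final FC layer
for name, param in model.named_parameters():
    if not name.startswith("fc"):
        param.stop_gradient = True

# then run the normal training loop again with larger-resolution inputs,
# so that only the FC weights are updated
```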



### 1.2 Backbone Network and Pre-trained Model Library



### 1.3 Image Classification

#### Q1.3.1: Does PaddleClas provide data augmentation for adjusting image brightness, contrast, saturation, hue, etc.?

**A**

PaddleClas provides a variety of data augmentation methods, which can be divided into 3 categories:

1. Image transformation: AutoAugment, RandAugment;
2. Image cropping: CutOut, RandErasing, HideAndSeek, GridMask;
3. Image mixing: Mixup, Cutmix.

Among them, RandAugment provides random combinations of multiple data augmentation methods, which can cover brightness, contrast, saturation, hue, and other aspects.
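
If you only need simple brightness/contrast/saturation/hue jitter and want to call a transform directly rather than through the PaddleClas configuration operators, a rough sketch with Paddle's generic `ColorJitter` transform (a general Paddle API, not a PaddleClas-specific operator; the jitter ranges below are only examples) looks like this:

```
from PIL import Image
from paddle.vision.transforms import ColorJitter

img = Image.open("demo.jpg")   # any example image

# randomly jitter brightness, contrast, saturation and hue
color_jitter = ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4, hue=0.1)
augmented = color_jitter(img)
```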



### 1.4 General Detection

#### Q1.4.1 Does the mainbody detection only export one subject detection box at a time?

**A**: The number of outputs of mainbody detection is configurable through the configuration file: `Global.threshold` controls the detection threshold, and boxes with scores below this threshold are discarded; `Global.max_det_results` controls the maximum number of returned results. Together, the two determine the number of output detection boxes.
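
As a toy illustration of how the two settings jointly determine the number of output boxes (this is only a sketch of the logic, not the actual PaddleClas post-processing code):

```
import numpy as np

threshold, max_det_results = 0.2, 5
scores = np.array([0.9, 0.6, 0.3, 0.15, 0.05])   # detection scores, sorted descending

kept = scores[scores >= threshold]               # boxes below the threshold are discarded
kept = kept[:max_det_results]                    # at most max_det_results boxes are returned
```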

#### Q1.4.2 How is the data selected for training the mainbody detection model? Will it harm the accuracy to switch to a smaller model?

**A**

The training data is a randomly selected subset of publicly available datasets such as COCO, Object365, RPC, and LogoDet. We are currently introducing an ultra-lightweight mainbody detection model in version 2.3, which can be found in the "Model Selection" section of [Mainbody Detection](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/image_recognition_pipeline/mainbody_detection.md).

#### Q1.4.3: Are there any false detections in some scenarios with the current mainbody detection model?

**A**: The current mainbody detection model is trained on publicly available datasets such as COCO, Object365, RPC, and LogoDet. If the data to be detected, such as industrial quality inspection data, differs greatly from these common categories, it is necessary to fine-tune the current detection model on your own detection data.



### 1.5 Image Recognition

#### Q1.5.1 Is `triplet loss` still needed when using `circle loss`?

**A**

`circle loss` unifies the two forms of pair-wise learning and classification learning; if the classification form is used, `triplet loss` can be added.

#### Q1.5.2 Which recognition model should be used if the images to be recognized do not belong to the four open-source recognition domains?

**A**

The product recognition model is recommended, for two reasons. First, the range of products is broad, so the image to be recognized is more likely to be a product. Second, the product recognition model is trained on data covering 50,000 categories, so it has better generalization ability and more robust features.

#### Q1.5.3 Why is a 512-dimensional vector adopted instead of 1024 or some other dimension?

**A**

Vectors of lower dimension are preferred; in practice, 128-dimensional or even smaller vectors are often used to speed up computation. In general, 512 dimensions are already large enough to adequately represent the features.



### 1.6 Vector Search

#### Q1.6.1 Does the Möbius vector search algorithm currently used by PaddleClas support index.add() similar to the one used by faiss? Also, do I have to train every time I build a new graph? Is the train here to speed up the search or to build similar graphs?

**A**: The faiss retrieval module is supported starting from the release/2.3 branch, and Möbius is no longer supported. Möbius provides a graph-based approximate nearest neighbor search algorithm and currently supports two types of distance computation: inner product and L2 distance. However, Möbius does not support the `index.add` function provided in faiss, so if you need to add content to the search library, you need to rebuild the index from scratch. Each time the index is built, the search algorithm internally performs a train-like process, which is different from the `train` interface provided by faiss. Therefore, if you need the faiss module, you can use the release/2.3 branch; if you need Möbius, you need to fall back to the release/2.2 branch for now.

#### Q1.6.2: What exactly are the `Query` and `Gallery` configurations used for in the Eval configuration file of PaddleClas image recognition?

**A**:

Both `Query` and `Gallery` are dataset configurations: `Gallery` configures the base library data, and `Query` configures the validation set. During Eval, the model first runs forward computation on the `Gallery` data to extract feature vectors and build the base library; it then runs forward computation on the `Query` validation set to extract feature vectors, and computes metrics such as recall against the base library.
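
The following toy NumPy sketch illustrates this evaluation flow; it assumes L2-normalized features (so the inner product equals cosine similarity) and random data, and is not the actual PaddleClas metric code.

```
import numpy as np

# features extracted by the model for the Gallery (base library) and the Query (validation set)
gallery_feats = np.random.rand(1000, 512)
gallery_labels = np.random.randint(0, 50, 1000)
query_feats = np.random.rand(100, 512)
query_labels = np.random.randint(0, 50, 100)

# L2-normalize so that the inner product is the cosine similarity
gallery_feats /= np.linalg.norm(gallery_feats, axis=1, keepdims=True)
query_feats /= np.linalg.norm(query_feats, axis=1, keepdims=True)

sim = query_feats @ gallery_feats.T               # similarity of each query to each gallery image
top1_labels = gallery_labels[sim.argmax(axis=1)]  # label of the most similar gallery image
recall_at_1 = (top1_labels == query_labels).mean()
```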



## 2. Practice



### 2.1 Common Problems in Training and Evaluation

#### Q2.1.1 Where is the `train_log` file in PaddleClas?

**A**: `train.log` is saved in the same directory where the model weights are saved.

#### Q2.1.2 Why does the model output nan during training?

**A**: 1. Make sure the pre-trained model is loaded correctly; the easiest way is to add the parameter `-o Arch.pretrained=True`. 2. When fine-tuning the model, the learning rate should not be too large; for example, set it to 0.001.

#### Q2.1.3 Is it possible to perform frame-by-frame prediction in a video?

**A**: Yes, but PaddleClas does not currently support video input directly. You can modify the PaddleClas code, or first extract the video into frames and then run PaddleClas on them.

#### Q2.1.4: In data preprocessing, how can cropping of the input data be avoided? And how can the crop size be set?

**A**: The data preprocessing operators supported by PaddleClas can be viewed in `ppcls/data/preprocess/__init__.py`, and all supported operators can be configured in the configuration file. The name of an operator in the configuration must match the operator class name, and its parameters must match the constructor parameters of the corresponding operator class. If you do not need to crop the image, remove `CropImage` and `RandCropImage` and replace them with `ResizeImage`; its parameters provide different resize behaviors: the `size` parameter scales the image directly to a fixed size, while the `resize_short` parameter scales the image while maintaining its aspect ratio. To set the crop size, use the `size` parameter of the `CropImage` operator or the `size` parameter of the `RandCropImage` operator.
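
As a rough sketch, the operators can also be called directly on an image; the class names and the `size` / `resize_short` parameters follow the description above, but please check `ppcls/data/preprocess/__init__.py` for the exact signatures in your version.

```
import cv2
from ppcls.data.preprocess import ResizeImage, CropImage

img = cv2.imread("demo.jpg")

resize_short = ResizeImage(resize_short=256)   # keep aspect ratio, scale the short side to 256
resize_fixed = ResizeImage(size=224)           # scale directly to a fixed 224x224, no crop needed
center_crop = CropImage(size=224)              # crop a 224x224 region

img_no_crop = resize_fixed(img)                # "no crop" pipeline
img_cropped = center_crop(resize_short(img))   # resize-then-crop pipeline
```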

#### Q2.1.5: Why do I get a usage error after PaddlePaddle installation and cannot import any modules under paddle (import paddle.xxx)?

**A**:

You can first test if Paddle is installed correctly by using the following code.

```
import paddle
paddle.utils.install_check.run_check()
```

When installed correctly, the following prompts will be displayed.

```
PaddlePaddle is installed successfully! Let's start deep learning with PaddlePaddle now.
```

Otherwise, the relevant error will be prompted. In addition, if you have installed both the CPU and the GPU versions of Paddle, the two versions conflict, so you need to uninstall both and reinstall the required version.

#### Q2.1.6: How to save the optimal model during training?

**A**:

PaddleClas saves/updates the following three types of models during training.

1. the latest model (`latest.pdopt`, `latest.pdparams`, `latest.pdstates`), which can be used to resume training when it is unexpectedly interrupted.
2. the best model (`best_model.pdopt`, `best_model.pdparams`, `best_model.pdstates`).
3. breakpoints at the end of an epoch during training (`epoch_xxx.pdopt`, `epoch_xxx.pdparams`, `epoch_xxx.pdstates`). The `Global.save_interval` field in the training configuration file indicates the save interval for these models; if it is set larger than the total number of epochs, intermediate breakpoint models will no longer be saved.

#### Q2.1.7: How to address the `ERROR: Unexpected segmentation fault encountered in DataLoader workers.` during training?

**A**: Try setting the field `num_workers` in the training configuration file to `0`; try making the field `batch_size` in the configuration file smaller; and make sure that the dataset format and the dataset path in the configuration file are correct.

#### Q2.1.8: How to use `Mixup` and `Cutmix` during training?

**A**

- For `Mixup`, please refer to [Mixup](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/ppcls/configs/ImageNet/DataAugment/ResNet50_Mixup.yaml#L63-L65); for `Cutmix`, please refer to [Cutmix](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/ppcls/configs/ImageNet/DataAugment/ResNet50_Cutmix.yaml#L63-L65).
- The training accuracy (Acc) metric cannot be calculated when using `Mixup` or `Cutmix` for training, so you need to remove the field `Metric.Train.TopkAcc` from the configuration file; please refer to [Metric.Train.TopkAcc](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/ppcls/configs/ImageNet/DataAugment/ResNet50_Cutmix.yaml#L125-L128) for more details.

#### Q2.1.9: What are the fields `Global.pretrain_model` and `Global.checkpoints` used for in the training configuration file yaml?

**A**

- When `fine-tune` is required, the path of the pre-trained model weights file can be configured via the field `Global.pretrain_model`; the file usually has the suffix `.pdparams`.
- During training, the training program automatically saves the breakpoint information at the end of each epoch, including the optimizer information `.pdopt` and the model weights `.pdparams`. If the training process is unexpectedly interrupted and needs to be resumed, the breakpoint information file saved during training can be configured via the field `Global.checkpoints`, for example `checkpoints: ./output/ResNet18/epoch_18` to restore the breakpoint information saved at the end of epoch 18. PaddleClas will then automatically load `epoch_18.pdopt` and `epoch_18.pdparams` and continue training from epoch 19.



### 2.2 Image Classification

#### Q2.2.1 In SSLD, what are the steps to distill the small model after pre-training the large model on the 5 million-image dataset, and then fine-tune it by distillation on the 1 million-image data?

**A**:The steps are as follows:

1. Obtain the `ResNet50-vd` model by distillation from the open-source `ResNeXt101-32x16d-wsl` model released by Facebook.
2. Use this `ResNet50-vd` to distill `MobileNetV3` on the 5 million-image dataset.
3. Considering that the distribution of the 5 million-image dataset is not exactly the same as that of the 1 million-image data, the model is fine-tuned on the 1 million-image data to slightly improve the accuracy.

#### Q2.2.2 Why does nan appear in the loss when training SwinTransformer?

**A**: When training SwinTransformer, please use `Paddle` `2.1.1` or above and load the pre-trained model we provide. Also, keep the learning rate at an appropriate level.



### 2.3 General Detection

#### Q2.3.1 Why is the detection result of some images just the original image?

**A**: The mainbody detection model returns detection boxes, but in order to make the subsequent recognition more accurate, the original image is also returned along with the detection boxes. Afterwards, the original image and the detection boxes are ranked by their similarity to the images in the library, and the label of the most similar library image is taken as the label of the recognized image.

#### Q2.3.2: In a live-streaming scenario, we need real-time recognition on the live view that can locate the target object and draw a box around it within a delay of a few seconds. Is this achievable?

**A**: Real-time detection places high requirements on detection speed. PP-YOLO is a lightweight object detection model provided by the Paddle team that strikes a good balance between detection speed and accuracy; you can try PP-YOLO for detection. For the usage of PP-YOLO, please refer to [PaddleDetection](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/configs/ppyolo/README_cn.md).

#### Q2.3.3: For unknown labels, adding them to the gallery dataset enables subsequent recognition without retraining, but if the upstream detection model cannot locate and detect these unknown objects, is it still necessary to retrain the detection model?

**A**: If the detection model does not perform well on your own dataset, you need to fine-tune it again on your own detection dataset.



### 2.4 Image Recognition

#### Q2.4.1: Why is `Illegal instruction` reported during the recognition inference?

**A**: If you are using the release/2.2 branch, it is recommended to update to the release/2.3 branch, where we replaced the Möbius search module with the faiss search module, as described in the [Vector Search Tutorial](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/deploy/vector_search/README.md). If you still have problems, you can contact us in the user WeChat group or raise an issue on GitHub.

#### Q2.4.2: How can recognition models be fine-tuned to train on the basis of pre-trained models?

**A**: The fine-tuning of the recognition model is similar to that of the classification model. The recognition model can load the provided product pre-trained model, and the training process can be found in [recognition model training](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/models_training/recognition.md); we will continue to refine the documentation.

#### Q2.4.3: Why does it fail to run all mini-batches in each epoch when training metric learning?

**A**: When training metric learning, the Sampler used is DistributedRandomIdentitySampler, which does not sample all the images, so each epoch only samples part of the data; it is therefore normal that not all of the displayed mini-batches are run. This issue has been optimized in the release/2.3 branch; please update to release/2.3 to use it.

#### Q2.4.4: Why do some images have no recognition results?

**A**: In the configuration file (e.g. inference_product.yaml), `IndexProcess.score_thres` controls the minimum cosine similarity between the recognized image and the images in the library. When the cosine similarity is less than this value, the result will not be printed. You can adjust this value according to your actual data.



### 2.5 Vector Search

#### Q2.5.1: Why is the error `assert text_num >= 2` reported after adding an image to the index?

**A**: Make sure that the image path and the image name in data_file.txt are separated by a single tab character rather than spaces.

#### Q2.5.2: Do I need to rebuild the index to add new base data?

**A**: Starting from the release/2.3 branch, we have replaced the Möbius search module with the faiss search module, which supports adding base library data without rebuilding the base library, as described in the [Vector Search Tutorial](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/deploy/vector_search/README.md).

#### Q2.5.3: How to deal with the error `clang: error: unsupported option '-fopenmp'` when recompiling `index.so` on Mac?

**A**

If you are using the release/2.2 branch, it is recommended to update to the release/2.3 branch, where we replaced the Möbius search module with the faiss search module, as described in the [Vector Search Tutorial](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/deploy/vector_search/README.md). If you still have problems, you can contact us in the user WeChat group or raise an issue on GitHub.

#### Q2.5.4: How to set the parameter `pq_size` when build searches the base library?

**A**

`pq_size` is a parameter of the PQ search algorithm. PQ search can be roughly understood as a "tiered" search, and `pq_size` is the "capacity" of each tier, so the setting of this parameter affects search performance. However, when the total amount of data in the base library is not large (less than 10,000 entries), this parameter has little impact on performance, so for most application scenarios there is no need to modify it when building the base library. For more details on the PQ search algorithm, see the related [paper](https://lear.inrialpes.fr/pubs/2011/JDS11/jegou_searching_with_quantization.pdf).

### 2.6 Model Inference Deployment

#### Q2.6.1: How to add parameters for a module started via hub serving?

**A**:See [hub serving parameters](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/deploy/hubserving/clas/params.py) for more details.

#### Q2.6.2: Why is the result not accurate enough when exporting the inference model for inference deployment?

**A**:

This problem is usually caused by the incorrect loading of the model parameters when exporting. First check the export log for something like the following.

```
UserWarning: Skip loading for ***. *** is not found in the provided dict.
```

If such a message exists, the model weights were not loaded successfully. Please further check whether the `Global.pretrained_model` field in the configuration file correctly points to the model weights file. The model weights file usually has the suffix `.pdparams`; note that the suffix should not be included when configuring this path.

#### Q2.6.3: How to convert the model to `ONNX` format?

**A**

Paddle supports two ways, both relying on the `paddle2onnx` tool, so `paddle2onnx` needs to be installed first.

```
pip install paddle2onnx
```

- From inference model to ONNX format model:

  Take the `combined` format inference model (containing `.pdmodel` and `.pdiparams` files) exported from the dynamic graph as an example, run the following command to convert the model format:

  ```
  paddle2onnx --model_dir ${model_path}  --model_filename  ${model_path}/inference.pdmodel --params_filename ${model_path}/inference.pdiparams --save_file ${save_path}/model.onnx --enable_onnx_checker True
  ```

  In the above commands:

  - `model_dir`: this parameter needs to contain `.pdmodel` and `.pdiparams` files.
  - `model_filename`: this parameter is used to specify the path of the `.pdmodel` file under the parameter `model_dir`.
  - `params_filename`: this parameter is used to specify the path of the `.pdiparams` file under the parameter `model_dir`.
  - `save_file`: this parameter is used to specify the save path of the converted model.

  For the conversion of a non-`combined` format inference model exported from a static graph (usually containing the file `__model__` and multiple parameter files), and for more parameter descriptions, please refer to the "Parameter options" section of the official [paddle2onnx](https://github.com/PaddlePaddle/Paddle2ONNX/blob/develop/README_zh.md) documentation.

- Exporting ONNX format models directly from the model networking code.

  Take the model networking code of dynamic graphs as an example, the model class is a subclass that inherits from `paddle.nn.Layer` and the code is shown below:

  ```
  import paddle
  from paddle.static import InputSpec

  class SimpleNet(paddle.nn.Layer):
      def __init__(self):
          super().__init__()
          # build the real model structure here; a single conv layer is used as a placeholder
          self.conv = paddle.nn.Conv2D(in_channels=3, out_channels=8, kernel_size=3)

      def forward(self, x):
          # implement the real forward computation here
          return self.conv(x)

  net = SimpleNet()
  x_spec = InputSpec(shape=[None, 3, 224, 224], dtype='float32', name='x')
  paddle.onnx.export(layer=net, path="./SimpleNet", input_spec=[x_spec])
  ```

  In the above code:

  - The `InputSpec()` function is used to describe the signature of the model input, including the `shape`, `dtype` and `name` of the input data (can be omitted).
  - The `paddle.onnx.export()` function needs to be given the network object `net` to be exported, the save path of the exported model (`path`), and the description of the model input (`input_spec`).

  Note that `paddlepaddle` `2.0.0` or above is required. See [paddle.onnx.export](https://www.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/onnx/) for more details on the parameters of the `paddle.onnx.export()` function.