The inference model (the model saved by fluid.io.save_inference_model) is generally a solidified model saved after the model training is completed, and is mostly used to give prediction in deployment.
The inference model (the model saved by `fluid.io.save_inference_model`) is generally a solidified model saved after the model training is completed, and is mostly used to give prediction in deployment.
The model saved during the training process is the checkpoints model, which saves the parameters of the model and is mostly used to resume training.
The model saved during the training process is the checkpoints model, which saves the parameters of the model and is mostly used to resume training.
...
@@ -9,7 +9,31 @@ Compared with the checkpoints model, the inference model will additionally save
...
@@ -9,7 +9,31 @@ Compared with the checkpoints model, the inference model will additionally save
Next, we first introduce how to convert a trained model into an inference model, and then we will introduce text detection, text recognition, and the concatenation of them based on inference model.
Next, we first introduce how to convert a trained model into an inference model, and then we will introduce text detection, text recognition, and the concatenation of them based on inference model.
-[CONVERT TRAINING MODEL TO INFERENCE MODEL](#CONVERT)
-[Convert detection model to inference model](#Convert_detection_model)
-[Convert recognition model to inference model](#Convert_recognition_model)
-[TEXT DETECTION MODEL INFERENCE](#DETECTION_MODEL_INFERENCE)
-[1. LIGHTWEIGHT CHINESE DETECTION MODEL INFERENCE](#LIGHTWEIGHT_DETECTION)
-[2. DB TEXT DETECTION MODEL INFERENCE](#DB_DETECTION)
-[3. EAST TEXT DETECTION MODEL INFERENCE](#EAST_DETECTION)
-[4. SAST TEXT DETECTION MODEL INFERENCE](#SAST_DETECTION)
-[TEXT RECOGNITION MODEL INFERENCE](#RECOGNITION_MODEL_INFERENCE)
-[1. LIGHTWEIGHT CHINESE MODEL](#LIGHTWEIGHT_RECOGNITION)
-[2. CTC-BASED TEXT RECOGNITION MODEL INFERENCE](#CTC-BASED_RECOGNITION)
-[3. ATTENTION-BASED TEXT RECOGNITION MODEL INFERENCE](#ATTENTION-BASED_RECOGNITION)
-[4. TEXT RECOGNITION MODEL INFERENCE USING CUSTOM CHARACTERS DICTIONARY](#USING_CUSTOM_CHARACTERS)
-[TEXT DETECTION AND RECOGNITION INFERENCE CONCATENATION](#CONCATENATION)
-[1. LIGHTWEIGHT CHINESE MODEL](#LIGHTWEIGHT_CHINESE_MODEL)
-[2. OTHER MODELS](#OTHER_MODELS)
<aname="CONVERT"></a>
## CONVERT TRAINING MODEL TO INFERENCE MODEL
## CONVERT TRAINING MODEL TO INFERENCE MODEL
<aname="Convert_detection_model"></a>
### Convert detection model to inference model
### Convert detection model to inference model
Download the lightweight Chinese detection model:
Download the lightweight Chinese detection model:
...
@@ -35,6 +59,7 @@ inference/det_db/
...
@@ -35,6 +59,7 @@ inference/det_db/
└─ params Check the parameter file of the inference model
└─ params Check the parameter file of the inference model
```
```
<aname="Convert_recognition_model"></a>
### Convert recognition model to inference model
### Convert recognition model to inference model
Download the lightweight Chinese recognition model:
Download the lightweight Chinese recognition model:
...
@@ -62,11 +87,13 @@ After the conversion is successful, there are two files in the directory:
...
@@ -62,11 +87,13 @@ After the conversion is successful, there are two files in the directory:
└─ params Identify the parameter files of the inference model
└─ params Identify the parameter files of the inference model
```
```
<aname="DETECTION_MODEL_INFERENCE"></a>
## TEXT DETECTION MODEL INFERENCE
## TEXT DETECTION MODEL INFERENCE
The following will introduce the lightweight Chinese detection model inference, DB text detection model inference and EAST text detection model inference. The default configuration is based on the inference setting of the DB text detection model.
The following will introduce the lightweight Chinese detection model inference, DB text detection model inference and EAST text detection model inference. The default configuration is based on the inference setting of the DB text detection model.
Because EAST and DB algorithms are very different, when inference, it is necessary to **adapt the EAST text detection algorithm by passing in corresponding parameters**.
Because EAST and DB algorithms are very different, when inference, it is necessary to **adapt the EAST text detection algorithm by passing in corresponding parameters**.
<aname="LIGHTWEIGHT_DETECTION"></a>
### 1. LIGHTWEIGHT CHINESE DETECTION MODEL INFERENCE
### 1. LIGHTWEIGHT CHINESE DETECTION MODEL INFERENCE
For lightweight Chinese detection model inference, you can execute the following commands:
For lightweight Chinese detection model inference, you can execute the following commands:
...
@@ -90,6 +117,7 @@ If you want to use the CPU for prediction, execute the command as follows
...
@@ -90,6 +117,7 @@ If you want to use the CPU for prediction, execute the command as follows
First, convert the model saved in the DB text detection training process into an inference model. Taking the model based on the Resnet50_vd backbone network and trained on the ICDAR2015 English dataset as an example ([model download link](https://paddleocr.bj.bcebos.com/det_r50_vd_db.tar)), you can use the following command to convert:
First, convert the model saved in the DB text detection training process into an inference model. Taking the model based on the Resnet50_vd backbone network and trained on the ICDAR2015 English dataset as an example ([model download link](https://paddleocr.bj.bcebos.com/det_r50_vd_db.tar)), you can use the following command to convert:
...
@@ -114,6 +142,7 @@ The visualized text detection results are saved to the `./inference_results` fol
...
@@ -114,6 +142,7 @@ The visualized text detection results are saved to the `./inference_results` fol
**Note**: Since the ICDAR2015 dataset has only 1,000 training images, mainly for English scenes, the above model has very poor detection result on Chinese text images.
**Note**: Since the ICDAR2015 dataset has only 1,000 training images, mainly for English scenes, the above model has very poor detection result on Chinese text images.
<aname="EAST_DETECTION"></a>
### 3. EAST TEXT DETECTION MODEL INFERENCE
### 3. EAST TEXT DETECTION MODEL INFERENCE
First, convert the model saved in the EAST text detection training process into an inference model. Taking the model based on the Resnet50_vd backbone network and trained on the ICDAR2015 English dataset as an example ([model download link](https://paddleocr.bj.bcebos.com/det_r50_vd_east.tar)), you can use the following command to convert:
First, convert the model saved in the EAST text detection training process into an inference model. Taking the model based on the Resnet50_vd backbone network and trained on the ICDAR2015 English dataset as an example ([model download link](https://paddleocr.bj.bcebos.com/det_r50_vd_east.tar)), you can use the following command to convert:
...
@@ -126,23 +155,64 @@ First, convert the model saved in the EAST text detection training process into
...
@@ -126,23 +155,64 @@ First, convert the model saved in the EAST text detection training process into
For EAST text detection model inference, you need to set the parameter det_algorithm, specify the detection algorithm type to EAST, run the following command:
**For EAST text detection model inference, you need to set the parameter ``--det_algorithm="EAST"``**, run the following command:
The visualized text detection results are saved to the `./inference_results` folder by default, and the name of the result file is prefixed with 'det_res'. Examples of results are as follows:
The visualized text detection results are saved to the `./inference_results` folder by default, and the name of the result file is prefixed with 'det_res'. Examples of results are as follows:
![](../imgs_results/det_res_img_10_east.jpg)
![](../imgs_results/det_res_img_10_east.jpg)
**Note**: The Python version of NMS in EAST post-processing used in this codebase so the prediction speed is quite slow. If you use the C++ version, there will be a significant speedup.
**Note**: EAST post-processing locality aware NMS has two versions: Python and C++. The speed of C++ version is obviously faster than that of Python version. Due to the compilation version problem of NMS of C++ version, C++ version NMS will be called only in Python 3.5 environment, and python version NMS will be called in other cases.
<aname="SAST_DETECTION"></a>
### 4. SAST TEXT DETECTION MODEL INFERENCE
#### (1). Quadrangle text detection model (ICDAR2015)
First, convert the model saved in the SAST text detection training process into an inference model. Taking the model based on the Resnet50_vd backbone network and trained on the ICDAR2015 English dataset as an example ([model download link](https://paddleocr.bj.bcebos.com/SAST/sast_r50_vd_icdar2015.tar)), you can use the following command to convert:
The visualized text detection results are saved to the `./inference_results` folder by default, and the name of the result file is prefixed with 'det_res'. Examples of results are as follows:
![](../imgs_results/det_res_img_10_sast.jpg)
#### (2). Curved text detection model (Total-Text)
First, convert the model saved in the SAST text detection training process into an inference model. Taking the model based on the Resnet50_vd backbone network and trained on the Total-Text English dataset as an example ([model download link](https://paddleocr.bj.bcebos.com/SAST/sast_r50_vd_total_text.tar)), you can use the following command to convert:
**For SAST curved text detection model inference, you need to set the parameter `--det_algorithm="SAST"` and `--det_sast_polygon=True`**, run the following command:
The visualized text detection results are saved to the `./inference_results` folder by default, and the name of the result file is prefixed with 'det_res'. Examples of results are as follows:
![](../imgs_results/det_res_img_10_east.jpg)
**Note**: SAST post-processing locality aware NMS has two versions: Python and C++. The speed of C++ version is obviously faster than that of Python version. Due to the compilation version problem of NMS of C++ version, C++ version NMS will be called only in Python 3.5 environment, and python version NMS will be called in other cases.
<aname="RECOGNITION_MODEL_INFERENCE"></a>
## TEXT RECOGNITION MODEL INFERENCE
## TEXT RECOGNITION MODEL INFERENCE
The following will introduce the lightweight Chinese recognition model inference, other CTC-based and Attention-based text recognition models inference. For Chinese text recognition, it is recommended to choose the recognition model based on CTC loss. In practice, it is also found that the result of the model based on Attention loss is not as good as the one based on CTC loss. In addition, if the characters dictionary is modified during training, make sure that you use the same characters set during inferencing. Please check below for details.
The following will introduce the lightweight Chinese recognition model inference, other CTC-based and Attention-based text recognition models inference. For Chinese text recognition, it is recommended to choose the recognition model based on CTC loss. In practice, it is also found that the result of the model based on Attention loss is not as good as the one based on CTC loss. In addition, if the characters dictionary is modified during training, make sure that you use the same characters set during inferencing. Please check below for details.
<aname="LIGHTWEIGHT_RECOGNITION"></a>
### 1. LIGHTWEIGHT CHINESE TEXT RECOGNITION MODEL REFERENCE
### 1. LIGHTWEIGHT CHINESE TEXT RECOGNITION MODEL REFERENCE
For lightweight Chinese recognition model inference, you can execute the following commands:
For lightweight Chinese recognition model inference, you can execute the following commands:
...
@@ -158,6 +228,7 @@ After executing the command, the prediction results (recognized text and score)
...
@@ -158,6 +228,7 @@ After executing the command, the prediction results (recognized text and score)
Predicts of ./doc/imgs_words/ch/word_4.jpg:['实力活力', 0.89552695]
Predicts of ./doc/imgs_words/ch/word_4.jpg:['实力活力', 0.89552695]
<aname="CTC-BASED_RECOGNITION"></a>
### 2. CTC-BASED TEXT RECOGNITION MODEL INFERENCE
### 2. CTC-BASED TEXT RECOGNITION MODEL INFERENCE
Taking STAR-Net as an example, we introduce the recognition model inference based on CTC loss. CRNN and Rosetta are used in a similar way, by setting the recognition algorithm parameter `rec_algorithm`.
Taking STAR-Net as an example, we introduce the recognition model inference based on CTC loss. CRNN and Rosetta are used in a similar way, by setting the recognition algorithm parameter `rec_algorithm`.
...
@@ -178,6 +249,7 @@ For STAR-Net text recognition model inference, execute the following commands:
...
@@ -178,6 +249,7 @@ For STAR-Net text recognition model inference, execute the following commands:
### 4. TEXT RECOGNITION MODEL INFERENCE USING CUSTOM CHARACTERS DICTIONARY
### 4. TEXT RECOGNITION MODEL INFERENCE USING CUSTOM CHARACTERS DICTIONARY
If the chars dictionary is modified during training, you need to specify the new dictionary path by setting the parameter `rec_char_dict_path` when using your inference model to predict.
If the chars dictionary is modified during training, you need to specify the new dictionary path by setting the parameter `rec_char_dict_path` when using your inference model to predict.
...
@@ -203,8 +276,10 @@ If the chars dictionary is modified during training, you need to specify the new
...
@@ -203,8 +276,10 @@ If the chars dictionary is modified during training, you need to specify the new
## TEXT DETECTION AND RECOGNITION INFERENCE CONCATENATION
## TEXT DETECTION AND RECOGNITION INFERENCE CONCATENATION
<aname="LIGHTWEIGHT_CHINESE_MODEL"></a>
### 1. LIGHTWEIGHT CHINESE MODEL
### 1. LIGHTWEIGHT CHINESE MODEL
When performing prediction, you need to specify the path of a single image or a folder of images through the parameter `image_dir`, the parameter `det_model_dir` specifies the path to detect the inference model, and the parameter `rec_model_dir` specifies the path to identify the inference model. The visualized recognition results are saved to the `./inference_results` folder by default.
When performing prediction, you need to specify the path of a single image or a folder of images through the parameter `image_dir`, the parameter `det_model_dir` specifies the path to detect the inference model, and the parameter `rec_model_dir` specifies the path to identify the inference model. The visualized recognition results are saved to the `./inference_results` folder by default.
...
@@ -217,9 +292,14 @@ After executing the command, the recognition result image is as follows:
...
@@ -217,9 +292,14 @@ After executing the command, the recognition result image is as follows:
![](../imgs_results/2.jpg)
![](../imgs_results/2.jpg)
<aname="OTHER_MODELS"></a>
### 2. OTHER MODELS
### 2. OTHER MODELS
If you want to try other detection algorithms or recognition algorithms, please refer to the above text detection model inference and text recognition model inference, update the corresponding configuration and model, the following command uses the combination of the EAST text detection and STAR-Net text recognition:
If you want to try other detection algorithms or recognition algorithms, please refer to the above text detection model inference and text recognition model inference, update the corresponding configuration and model.
**Note: due to the limitation of rotation logic of detected box, SAST curved text detection model (using the parameter `det_sast_polygon=True`) is not supported for model combination yet.**
The following command uses the combination of the EAST text detection and STAR-Net text recognition: