# Inference based on prediction engine

The inference model (the model saved by `fluid.io.save_inference_model`) is generally the frozen model saved after training is complete, and it is mostly used for prediction and deployment.

The model saved during training is the checkpoints model, which stores the model parameters and is mostly used to resume training.

Compared with the checkpoints model, the inference model additionally stores the structural information of the model. It performs well in deployment and accelerated inference, is flexible and convenient, and is suitable for integration with real systems. For a more detailed introduction, see the document [Classification prediction framework](https://paddleclas.readthedocs.io/zh_CN/latest/extension/paddle_inference.html).

Next, we first describe how to convert a trained model into an inference model, and then cover text detection, text recognition, and the two chained together, all based on the prediction engine.

## Training model to inference model
### Detection model to inference model

Download the ultra-lightweight Chinese detection model:
```
wget -P ./ch_lite/ https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db.tar && tar xf ./ch_lite/ch_det_mv3_db.tar -C ./ch_lite/
```

The model above uses the DB algorithm with MobileNetV3 as the backbone. To convert the trained model into an inference model, run the following command:
```
python3 tools/export_model.py -c configs/det/det_mv3_db.yml -o Global.checkpoints=./ch_lite/det_mv3_db/best_accuracy Global.save_inference_dir=./inference/det_db/
```

When converting to an inference model, use the same configuration file that was used for training. In addition, set the `Global.checkpoints` and `Global.save_inference_dir` parameters (passed with `-o` in the command above).
`Global.checkpoints` points to the model parameter file saved during training, and `Global.save_inference_dir` is the directory where the generated inference model will be saved.
After a successful conversion, the `save_inference_dir` directory contains two files:
```
inference/det_db/
  └─  model     Program file of the detection inference model
  └─  params    Parameter file of the detection inference model
```
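
A quick sanity check after exporting can catch a failed conversion early. The sketch below relies only on what is stated above, namely that the export directory contains files named `model` and `params`. The helper name `missing_inference_files` is hypothetical and is not part of the repository.

```python
from pathlib import Path

# File names taken from the directory listing above.
EXPECTED_FILES = ("model", "params")


def missing_inference_files(inference_dir):
    """Return the expected files that are absent from inference_dir."""
    root = Path(inference_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_inference_files("./inference/det_db/")
    if missing:
        print("Export looks incomplete, missing:", ", ".join(missing))
    else:
        print("Both inference files are present.")
```

The same check applies to any other export directory used in this document, since each is described as containing the same two files.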

### Recognition model to inference model

Download the ultra-lightweight Chinese recognition model:
```
wget -P ./ch_lite/ https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn.tar && tar xf ./ch_lite/ch_rec_mv3_crnn.tar -C ./ch_lite/
```

The recognition model is converted to an inference model in the same way as the detection model:
```
python3 tools/export_model.py -c configs/rec/rec_chinese_lite_train.yml -o Global.checkpoints=./ch_lite/rec_mv3_crnn/best_accuracy \
        Global.save_inference_dir=./inference/rec_crnn/
```

If the model was trained on your own dataset and you modified the Chinese character dictionary file, make sure that `character_dict_path` in the configuration file points to the required dictionary file.

After a successful conversion, the directory contains two files:
```
inference/rec_crnn/
  └─  model     Program file of the recognition inference model
  └─  params    Parameter file of the recognition inference model
```

## Text detection model inference

The following sections cover inference with the ultra-lightweight Chinese detection model, the DB text detection model, and the EAST text detection model. The default configuration is set up for DB text detection inference. Because the EAST and DB algorithms differ significantly, inference with EAST requires passing the corresponding parameters to adapt to that algorithm.

### 1. Ultra-lightweight Chinese detection model inference

To run inference with the ultra-lightweight Chinese detection model, execute the following command:

```
python3 tools/infer/predict_det.py --image_dir="./doc/imgs/2.jpg" --det_model_dir="./inference/det_db/"
```

The visualized text detection results are saved to the `./inference_results` folder by default, and result file names are prefixed with `det_res`. An example result:

![](imgs_results/det_res_2.jpg)

The parameter `det_max_side_len` controls the maximum side length used when normalizing images in the detection algorithm. When both the height and the width of the image are less than `det_max_side_len`, the original image is used for prediction; otherwise the image is scaled to this maximum before prediction. The default is `det_max_side_len=960`. If the input image has a high resolution and you want to predict at a larger resolution, execute the following command:

```
python3 tools/infer/predict_det.py --image_dir="./doc/imgs/2.jpg" --det_model_dir="./inference/det_db/" --det_max_side_len=1200
```
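
To make the effect of `det_max_side_len` concrete, here is a minimal sketch of the rule as described in this section. It assumes that "scaled to the maximum" means the longer side is resized to `det_max_side_len` with the aspect ratio preserved. The function name `target_size` is hypothetical, and the actual preprocessing in the repository may apply further adjustments (for example, rounding sides to a multiple required by the network) that this sketch does not model.

```python
def target_size(height, width, max_side_len=960):
    """Return the (height, width) used for prediction under the described rule."""
    longest = max(height, width)
    if longest <= max_side_len:
        # Both sides already fit, so the original size is kept.
        return height, width
    # Shrink so that the longer side equals max_side_len, keeping aspect ratio.
    ratio = max_side_len / longest
    return max(1, round(height * ratio)), max(1, round(width * ratio))


if __name__ == "__main__":
    print(target_size(720, 1280))         # default limit of 960
    print(target_size(720, 1280, 1200))   # raised limit, as in the command above
```

Under these assumptions, raising the limit to 1200 lets a 720x1280 image be processed closer to its native resolution, at the cost of more computation.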

To use the CPU for prediction, execute the following command:
```
python3 tools/infer/predict_det.py --image_dir="./doc/imgs/2.jpg" --det_model_dir="./inference/det_db/" --use_gpu=False
```

### 2. DB text detection model inference

First, convert the model saved during DB text detection training into an inference model. Taking the model based on the Resnet50_vd backbone network and trained on the ICDAR2015 English dataset as an example ([model download link](https://paddleocr.bj.bcebos.com/det_r50_vd_db.tar)), use the following command to convert it:

```
# The yml configuration file of the training algorithm follows -c.
# Global.checkpoints is the path of the trained model to convert, without the file suffix (.pdmodel, .pdopt or .pdparams).
# Global.save_inference_dir is the directory where the converted model will be saved.

python3 tools/export_model.py -c configs/det/det_r50_vd_db.yml -o Global.checkpoints="./models/det_r50_vd_db/best_accuracy" Global.save_inference_dir="./inference/det_db"
```
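
Because `Global.checkpoints` expects the path without a file suffix, it is easy to pass a full file name by mistake. The sketch below strips the three suffixes named in the comments above; the helper name `checkpoint_prefix` is hypothetical and not part of the repository.

```python
# Suffixes listed in the export command comments above.
CHECKPOINT_SUFFIXES = (".pdmodel", ".pdopt", ".pdparams")


def checkpoint_prefix(path):
    """Return path with a trailing checkpoint suffix removed, if present."""
    for suffix in CHECKPOINT_SUFFIXES:
        if path.endswith(suffix):
            return path[: -len(suffix)]
    return path


if __name__ == "__main__":
    print(checkpoint_prefix("./models/det_r50_vd_db/best_accuracy.pdparams"))
```

The returned prefix is the value to pass as `Global.checkpoints`.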

To run inference with the DB text detection model, execute the following command:

```
python3 tools/infer/predict_det.py --image_dir="./doc/imgs_en/img_10.jpg" --det_model_dir="./inference/det_db/"
```

The visualized text detection results are saved to the `./inference_results` folder by default, and result file names are prefixed with `det_res`. An example result:

![](imgs_results/det_res_img_10_db.jpg)

**Note**: Because the ICDAR2015 dataset has only 1,000 training images, mainly of English scenes, the model above performs very poorly on Chinese text images.

### 3. EAST text detection model inference

First, convert the model saved during EAST text detection training into an inference model. Taking the model based on the Resnet50_vd backbone network and trained on the ICDAR2015 English dataset as an example ([model download link](https://paddleocr.bj.bcebos.com/det_r50_vd_east.tar)), use the following command to convert it:

```
# The yml configuration file of the training algorithm follows -c.
# Global.checkpoints is the path of the trained model to convert, without the file suffix (.pdmodel, .pdopt or .pdparams).
# Global.save_inference_dir is the directory where the converted model will be saved.

python3 tools/export_model.py -c configs/det/det_r50_vd_east.yml -o Global.checkpoints="./models/det_r50_vd_east/best_accuracy" Global.save_inference_dir="./inference/det_east"
```

To run inference with the EAST text detection model, set the parameter `det_algorithm` to `EAST` to specify the detection algorithm type, then execute the following command:

```
python3 tools/infer/predict_det.py --image_dir="./doc/imgs_en/img_10.jpg" --det_model_dir="./inference/det_east/" --det_algorithm="EAST"
```

The visualized text detection results are saved to the `./inference_results` folder by default, and result file names are prefixed with `det_res`. An example result:

![](imgs_results/det_res_img_10_east.jpg)

**Note**: EAST post-processing in this codebase uses a Python implementation of NMS, so prediction is slow. Using a C++ implementation would give a significant speedup.
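
To illustrate why NMS written in pure Python is slow, here is a generic greedy NMS on axis-aligned boxes. This is a standard textbook variant swapped in for illustration; it is not the implementation this codebase uses for EAST, whose post-processing works on quadrilaterals and may differ in detail. The names `iou` and `greedy_nms` and the threshold value are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Every candidate is compared against every box kept so far; this
        # interpreted inner loop is what makes a pure-Python NMS expensive.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(greedy_nms(boxes, scores))
```

With many candidate boxes, the number of pairwise comparisons grows roughly quadratically, which is the kind of loop a compiled implementation handles much faster.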


## Text recognition model inference

The following sections cover inference with the ultra-lightweight Chinese recognition model and with recognition models based on CTC loss. **Inference for recognition models based on Attention loss is still being debugged.** For Chinese text recognition, the recognition model based on CTC loss is recommended. In practice, models based on Attention loss have also been found to perform worse than those based on CTC loss.

### 1. Ultra-lightweight Chinese recognition model inference

To run inference with the ultra-lightweight Chinese recognition model, execute the following command:

```
python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words/ch/word_4.jpg" --rec_model_dir="./inference/rec_crnn/"
```

![](imgs_words/ch/word_4.jpg)

After the command runs, the prediction result for the image above (recognized text and score) is printed to the screen:

Predicts of ./doc/imgs_words/ch/word_4.jpg:['实力活力', 0.89552695]
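
If the printed result needs to be consumed by another script, it can be parsed as sketched below. This assumes the output line has exactly the format shown above (`Predicts of <path>:[<text>, <score>]`); the function name `parse_prediction` is hypothetical, and the format is not guaranteed to be stable across versions.

```python
import ast

PREFIX = "Predicts of "


def parse_prediction(line):
    """Split a printed prediction line into (image_path, text, score)."""
    if not line.startswith(PREFIX):
        raise ValueError("not a prediction line: %r" % line)
    # The image path ends where the bracketed result list begins.
    path, sep, payload = line[len(PREFIX):].partition(":[")
    if not sep:
        raise ValueError("no result list found in: %r" % line)
    # The result list is a Python literal, so it can be evaluated safely.
    text, score = ast.literal_eval("[" + payload)
    return path, text, float(score)


if __name__ == "__main__":
    sample = "Predicts of ./doc/imgs_words/ch/word_4.jpg:['实力活力', 0.89552695]"
    print(parse_prediction(sample))
```

`ast.literal_eval` only evaluates literals, so it avoids the risks of calling `eval` on program output.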

### 2. Recognition model inference based on CTC loss

Taking STAR-Net as an example, this section describes inference with a recognition model based on CTC loss. CRNN and Rosetta are used in a similar way; there is no need to set the recognition algorithm parameter `rec_algorithm`.

First, convert the model saved during STAR-Net text recognition training into an inference model. Taking the model based on the Resnet34_vd backbone network and trained on two synthetic English text recognition datasets, MJSynth and SynthText, as an example ([model download link](https://paddleocr.bj.bcebos.com/rec_r34_vd_tps_bilstm_ctc.tar)), use the following command to convert it:

```
# The yml configuration file of the training algorithm follows -c.
# Global.checkpoints is the path of the trained model to convert, without the file suffix (.pdmodel, .pdopt or .pdparams).
# Global.save_inference_dir is the directory where the converted model will be saved.

python3 tools/export_model.py -c configs/rec/rec_r34_vd_tps_bilstm_ctc.yml -o Global.checkpoints="./models/rec_r34_vd_tps_bilstm_ctc/best_accuracy" Global.save_inference_dir="./inference/starnet"
```

To run inference with the STAR-Net text recognition model, execute the following command:

```
python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words_en/word_336.png" --rec_model_dir="./inference/starnet/" --rec_image_shape="3, 32, 100" --rec_char_type="en"
```

![](imgs_words_en/word_336.png)

After the command runs, the recognition result for the image above is:

Predicts of ./doc/imgs_words_en/word_336.png:['super', 0.9999555]

**Note**: Because the model above follows the text recognition training and evaluation process of [DTRB](https://arxiv.org/abs/1904.01906), it differs from the training of the ultra-lightweight Chinese recognition model in two respects:

- The image resolution used in training is different. The model above was trained at [3, 32, 100], whereas the Chinese model was trained at [3, 32, 320] to preserve recognition quality on long text. The default shape parameter of the inference program is the resolution used for Chinese training, that is [3, 32, 320]. Therefore, when running inference with the English model above, set the shape of the recognition image through the parameter `rec_image_shape`.

- The character list is different. The experiments in the DTRB paper cover only the 26 lowercase English letters and 10 digits, 36 characters in total. All uppercase and lowercase characters are converted to lowercase, and characters outside this list are ignored and treated as spaces. Therefore, no character dictionary file is passed in here; instead, the dictionary is generated by the code below, and the parameter `rec_char_type` must be set to `en` during inference.

```
self.character_str = "0123456789abcdefghijklmnopqrstuvwxyz"
dict_character = list(self.character_str)
```
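
The snippet above is an excerpt from a class. As a standalone illustration, the sketch below builds the same 36-character dictionary and normalizes a label as described in the note. Dropping out-of-list characters is one reading of "ignored"; the original evaluation code may handle them differently. The names `CHAR_TO_INDEX` and `normalize_label` are hypothetical.

```python
# Same character string as in the excerpt above: 10 digits + 26 lowercase letters.
CHARACTER_STR = "0123456789abcdefghijklmnopqrstuvwxyz"
DICT_CHARACTER = list(CHARACTER_STR)

# Map each character to its position in the dictionary.
CHAR_TO_INDEX = {ch: idx for idx, ch in enumerate(DICT_CHARACTER)}


def normalize_label(text):
    """Lowercase text and drop characters outside the 36-character list."""
    return "".join(ch for ch in text.lower() if ch in CHAR_TO_INDEX)


if __name__ == "__main__":
    print(len(DICT_CHARACTER))
    print(normalize_label("Super-8!"))
```

A label normalized this way contains only characters the English model can output, which makes comparisons against its predictions meaningful.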

## Text detection and recognition tandem inference

### 1. Ultra-lightweight Chinese OCR model inference

When running prediction, specify the path of a single image or a collection of images with the parameter `image_dir`, the path of the detection inference model with `det_model_dir`, and the path of the recognition inference model with `rec_model_dir`. The visualized recognition results are saved to the `./inference_results` folder by default.

```
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/2.jpg" --det_model_dir="./inference/det_db/"  --rec_model_dir="./inference/rec_crnn/"
```

After the command runs, the recognition result image looks like this:

![](imgs_results/2.jpg)
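
When the tandem command is launched from another script, assembling it as an argument list avoids shell quoting problems. The sketch below uses only the script path and flags that appear in this document; the helper name `build_predict_system_cmd` and its `extra_args` parameter are hypothetical.

```python
import shlex


def build_predict_system_cmd(image_dir, det_model_dir, rec_model_dir, extra_args=()):
    """Assemble the predict_system.py command line as an argument list."""
    cmd = [
        "python3",
        "tools/infer/predict_system.py",
        "--image_dir=" + image_dir,
        "--det_model_dir=" + det_model_dir,
        "--rec_model_dir=" + rec_model_dir,
    ]
    # Optional flags, for example those needed by non-default algorithms.
    cmd.extend(extra_args)
    return cmd


if __name__ == "__main__":
    cmd = build_predict_system_cmd(
        "./doc/imgs/2.jpg", "./inference/det_db/", "./inference/rec_crnn/"
    )
    # Only print the command here. To execute it, pass `cmd` to
    # subprocess.run(cmd, check=True) from the repository root.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Printing rather than running keeps the sketch usable outside the repository; inside it, the list can be handed to `subprocess.run` unchanged.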

### 2. Inference with other models

To try other detection or recognition algorithms, refer to the text detection model inference and text recognition model inference sections above and update the corresponding configuration and model. The following command runs EAST text detection together with STAR-Net text recognition:

```
python3 tools/infer/predict_system.py --image_dir="./doc/imgs_en/img_10.jpg" --det_model_dir="./inference/det_east/" --det_algorithm="EAST" --rec_model_dir="./inference/starnet/" --rec_image_shape="3, 32, 100" --rec_char_type="en"
```

After the command runs, the recognition result image looks like this:

![](imgs_results/img_10.jpg)