@@ -100,7 +100,7 @@ Considering that the features of some channels will be suppressed if the convolu
The recognition module of PP-OCRv3 is optimized based on the text recognition algorithm [SVTR](https://arxiv.org/abs/2205.00159). RNN is abandoned in SVTR, and the context information of the text line image is more effectively mined by introducing the Transformers structure, thereby improving the text recognition ability.
-The recognition accuracy of SVTR_inty outperforms PP-OCRv2 recognition model by 5.3%, while the prediction speed nearly 11 times slower. It takes nearly 100ms to predict a text line on CPU. Therefore, as shown in the figure below, PP-OCRv3 adopts the following six optimization strategies to accelerate the recognition model.
+The recognition accuracy of SVTR_tiny outperforms the PP-OCRv2 recognition model by 5.3%, but its prediction speed is nearly 11 times slower: it takes nearly 100ms to predict a text line on CPU. Therefore, as shown in the figure below, PP-OCRv3 adopts the following six optimization strategies to accelerate the recognition model.

diff --git a/doc/doc_en/algorithm_det_east_en.md b/doc/doc_en/algorithm_det_east_en.md
index 3955809a49a595aa59717bafcfbb23146ae96bd2..07c434a9b162d9d373f5f357522cbd752be1afc1 100644
--- a/doc/doc_en/algorithm_det_east_en.md
+++ b/doc/doc_en/algorithm_det_east_en.md
@@ -40,7 +40,7 @@ Please prepare your environment referring to [prepare the environment](./environ
The above EAST model is trained using the ICDAR2015 text detection public dataset. For the download of the dataset, please refer to [ocr_datasets](./dataset/ocr_datasets_en.md).
-After the data download is complete, please refer to [Text Detection Training Tutorial](./detection.md) for training. PaddleOCR has modularized the code structure, so that you only need to **replace the configuration file** to train different detection models.
+After the data download is complete, please refer to [Text Detection Training Tutorial](./detection_en.md) for training. PaddleOCR has modularized the code structure, so that you only need to **replace the configuration file** to train different detection models.
diff --git a/doc/doc_en/algorithm_det_fcenet_en.md b/doc/doc_en/algorithm_det_fcenet_en.md
index e15fb9a07ede3296d3de83c134457194d4639a1c..f3c51a91a486342b86828167de5c1b386b42cc66 100644
--- a/doc/doc_en/algorithm_det_fcenet_en.md
+++ b/doc/doc_en/algorithm_det_fcenet_en.md
@@ -37,7 +37,7 @@ Please prepare your environment referring to [prepare the environment](./environ
The above FCE model is trained using the CTW1500 text detection public dataset. For the download of the dataset, please refer to [ocr_datasets](./dataset/ocr_datasets_en.md).
-After the data download is complete, please refer to [Text Detection Training Tutorial](./detection.md) for training. PaddleOCR has modularized the code structure, so that you only need to **replace the configuration file** to train different detection models.
+After the data download is complete, please refer to [Text Detection Training Tutorial](./detection_en.md) for training. PaddleOCR has modularized the code structure, so that you only need to **replace the configuration file** to train different detection models.
## 4. Inference and Deployment
diff --git a/doc/doc_en/algorithm_det_psenet_en.md b/doc/doc_en/algorithm_det_psenet_en.md
index d4cb3ea7d1e82a3f9c261c6e44cd6df6b0f6bf1e..3977a156ace3beb899e105bc381e27af6e825d6a 100644
--- a/doc/doc_en/algorithm_det_psenet_en.md
+++ b/doc/doc_en/algorithm_det_psenet_en.md
@@ -39,7 +39,7 @@ Please prepare your environment referring to [prepare the environment](./environ
The above PSE model is trained using the ICDAR2015 text detection public dataset. For the download of the dataset, please refer to [ocr_datasets](./dataset/ocr_datasets_en.md).
-After the data download is complete, please refer to [Text Detection Training Tutorial](./detection.md) for training. PaddleOCR has modularized the code structure, so that you only need to **replace the configuration file** to train different detection models.
+After the data download is complete, please refer to [Text Detection Training Tutorial](./detection_en.md) for training. PaddleOCR has modularized the code structure, so that you only need to **replace the configuration file** to train different detection models.
## 4. Inference and Deployment
diff --git a/doc/doc_en/algorithm_e2e_pgnet_en.md b/doc/doc_en/algorithm_e2e_pgnet_en.md
index c7cb3221ccfd897e2fd9062a828c2fe0ceb42024..ab74c57bc3d4d97852641cd708a2dceea5732ba7 100644
--- a/doc/doc_en/algorithm_e2e_pgnet_en.md
+++ b/doc/doc_en/algorithm_e2e_pgnet_en.md
@@ -36,7 +36,7 @@ The results of detection and recognition are as follows:
## 2. Environment Configuration
-Please refer to [Operation Environment Preparation](./environment_en.md) to configure PaddleOCR operating environment first, refer to [PaddleOCR Overview and Project Clone](./paddleOCR_overview_en.md) to clone the project
+Please refer to [Operation Environment Preparation](./environment_en.md) to configure the PaddleOCR operating environment first, and refer to [Project Clone](./clone_en.md) to clone the project.
## 3. Quick Use
diff --git a/doc/doc_en/algorithm_overview_en.md b/doc/doc_en/algorithm_overview_en.md
index 383cbe39bbd2eb8ca85f497888920ce87cb1837e..bc96cdf2351f10454441e20d319e485019bbec00 100755
--- a/doc/doc_en/algorithm_overview_en.md
+++ b/doc/doc_en/algorithm_overview_en.md
@@ -65,6 +65,8 @@ Supported text recognition algorithms (Click the link to get the tutorial):
- [x] [SAR](./algorithm_rec_sar_en.md)
- [x] [SEED](./algorithm_rec_seed_en.md)
- [x] [SVTR](./algorithm_rec_svtr_en.md)
+- [x] [ViTSTR](./algorithm_rec_vitstr_en.md)
+- [x] [ABINet](./algorithm_rec_abinet_en.md)
Refer to [DTRB](https://arxiv.org/abs/1904.01906), the training and evaluation result of these above text recognition (using MJSynth and SynthText for training, evaluate on IIIT, SVT, IC03, IC13, IC15, SVTP, CUTE) is as follow:
@@ -83,6 +85,8 @@ Refer to [DTRB](https://arxiv.org/abs/1904.01906), the training and evaluation r
|SAR|Resnet31| 87.20% | rec_r31_sar | [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/rec/rec_r31_sar_train.tar) |
|SEED|Aster_Resnet| 85.35% | rec_resnet_stn_bilstm_att | [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.1/rec/rec_resnet_stn_bilstm_att.tar) |
|SVTR|SVTR-Tiny| 89.25% | rec_svtr_tiny_none_ctc_en | [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/rec_svtr_tiny_none_ctc_en_train.tar) |
+|ViTSTR|ViTSTR| 79.82% | rec_vitstr_none_ce | [trained model](https://paddleocr.bj.bcebos.com/rec_vitstr_none_none_train.tar) |
+|ABINet|Resnet45| 90.75% | rec_r45_abinet | [trained model](https://paddleocr.bj.bcebos.com/rec_r45_abinet_train.tar) |
diff --git a/doc/doc_en/algorithm_rec_abinet_en.md b/doc/doc_en/algorithm_rec_abinet_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..767ca65f6411a7bc071ccafacc09d12bc160e6b6
--- /dev/null
+++ b/doc/doc_en/algorithm_rec_abinet_en.md
@@ -0,0 +1,136 @@
+# ABINet
+
+- [1. Introduction](#1)
+- [2. Environment](#2)
+- [3. Model Training / Evaluation / Prediction](#3)
+ - [3.1 Training](#3-1)
+ - [3.2 Evaluation](#3-2)
+ - [3.3 Prediction](#3-3)
+- [4. Inference and Deployment](#4)
+ - [4.1 Python Inference](#4-1)
+ - [4.2 C++ Inference](#4-2)
+ - [4.3 Serving](#4-3)
+ - [4.4 More](#4-4)
+- [5. FAQ](#5)
+
+
+## 1. Introduction
+
+Paper:
+> [ABINet: Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition](https://openaccess.thecvf.com/content/CVPR2021/papers/Fang_Read_Like_Humans_Autonomous_Bidirectional_and_Iterative_Language_Modeling_for_CVPR_2021_paper.pdf)
+> Shancheng Fang and Hongtao Xie and Yuxin Wang and Zhendong Mao and Yongdong Zhang
+> CVPR, 2021
+
+Using the MJSynth and SynthText text recognition datasets for training, and evaluating on the IIIT, SVT, IC03, IC13, IC15, SVTP, and CUTE datasets, the algorithm reproduction effect is as follows:
+
+|Model|Backbone|config|Acc|Download link|
+| --- | --- | --- | --- | --- |
+|ABINet|ResNet45|[rec_r45_abinet.yml](../../configs/rec/rec_r45_abinet.yml)|90.75%|[pretrained & trained model](https://paddleocr.bj.bcebos.com/rec_r45_abinet_train.tar)|
+
+
+## 2. Environment
+Please refer to ["Environment Preparation"](./environment_en.md) to configure the PaddleOCR environment, and refer to ["Project Clone"](./clone_en.md) to clone the project code.
+
+
+
+## 3. Model Training / Evaluation / Prediction
+
+Please refer to [Text Recognition Tutorial](./recognition_en.md). PaddleOCR modularizes the code, and training different recognition models only requires **changing the configuration file**.
+
+Training:
+
+Specifically, after the data preparation is completed, the training can be started. The training command is as follows:
+
+```
+#Single GPU training (long training period, not recommended)
+python3 tools/train.py -c configs/rec/rec_r45_abinet.yml
+
+#Multi GPU training, specify the gpu number through the --gpus parameter
+python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/rec/rec_r45_abinet.yml
+```
+
+Evaluation:
+
+```
+# GPU evaluation
+python3 -m paddle.distributed.launch --gpus '0' tools/eval.py -c configs/rec/rec_r45_abinet.yml -o Global.pretrained_model={path/to/weights}/best_accuracy
+```
+
+Prediction:
+
+```
+# The configuration file used for prediction must match the training
+python3 tools/infer_rec.py -c configs/rec/rec_r45_abinet.yml -o Global.infer_img='./doc/imgs_words_en/word_10.png' Global.pretrained_model=./rec_r45_abinet_train/best_accuracy
+```
+
+
+## 4. Inference and Deployment
+
+
+### 4.1 Python Inference
+First, the model saved during the ABINet text recognition training process is converted into an inference model ([model download link](https://paddleocr.bj.bcebos.com/rec_r45_abinet_train.tar)). You can use the following command to convert:
+
+```
+python3 tools/export_model.py -c configs/rec/rec_r45_abinet.yml -o Global.pretrained_model=./rec_r45_abinet_train/best_accuracy Global.save_inference_dir=./inference/rec_r45_abinet
+```
+
+**Note:**
+- If you are training the model on your own dataset and have modified the dictionary file, please pay attention to modify the `character_dict_path` in the configuration file to the modified dictionary file.
+- If you modified the input size during training, please modify the `infer_shape` corresponding to ABINet in the `tools/export_model.py` file.
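+- For example, assuming your custom dictionary sits at `./my_dict.txt` (a hypothetical path), you can also override the dictionary on the command line with `-o Global.character_dict_path=./my_dict.txt` appended to the export command, instead of editing the configuration file.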
+
+After the conversion is successful, there are three files in the directory:
+```
+/inference/rec_r45_abinet/
+ ├── inference.pdiparams
+ ├── inference.pdiparams.info
+ └── inference.pdmodel
+```
+
+
+For ABINet text recognition model inference, the following commands can be executed:
+
+```
+python3 tools/infer/predict_rec.py --image_dir='./doc/imgs_words_en/word_10.png' --rec_model_dir='./inference/rec_r45_abinet/' --rec_algorithm='ABINet' --rec_image_shape='3,32,128' --rec_char_dict_path='./ppocr/utils/ic15_dict.txt'
+```
+
+
+
+After executing the command, the prediction result (recognized text and score) of the image above is printed to the screen. An example is as follows:
+```shell
+Predicts of ./doc/imgs_words_en/word_10.png:('pain', 0.9999995231628418)
+```
+
+
+### 4.2 C++ Inference
+
+Not supported
+
+
+### 4.3 Serving
+
+Not supported
+
+
+### 4.4 More
+
+Not supported
+
+
+## 5. FAQ
+
+1. Note that the MJSynth and SynthText datasets come from the [ABINet repo](https://github.com/FangShancheng/ABINet).
+2. We use the pre-trained model provided by the ABINet authors for fine-tuning.
+
+## Citation
+
+```bibtex
+@inproceedings{Fang2021ABINet,
+ title = {ABINet: Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition},
+ author = {Shancheng Fang and Hongtao Xie and Yuxin Wang and Zhendong Mao and Yongdong Zhang},
+ booktitle = {CVPR},
+ year = {2021},
+ url = {https://arxiv.org/abs/2103.06495},
+ pages = {7098-7107}
+}
+```
diff --git a/doc/doc_en/algorithm_rec_aster_en.md b/doc/doc_en/algorithm_rec_aster_en.md
index 1540681a19f94160e221c37173510395d0fd407f..b949cb5b37c985cc55a2aa01ea5ae4096946bb05 100644
--- a/doc/doc_en/algorithm_rec_aster_en.md
+++ b/doc/doc_en/algorithm_rec_aster_en.md
@@ -33,13 +33,13 @@ Using MJSynth and SynthText two text recognition datasets for training, and eval
## 2. Environment
-Please refer to ["Environment Preparation"](./environment.md) to configure the PaddleOCR environment, and refer to ["Project Clone"](./clone.md) to clone the project code.
+Please refer to ["Environment Preparation"](./environment_en.md) to configure the PaddleOCR environment, and refer to ["Project Clone"](./clone_en.md) to clone the project code.
## 3. Model Training / Evaluation / Prediction
-Please refer to [Text Recognition Tutorial](./recognition.md). PaddleOCR modularizes the code, and training different recognition models only requires **changing the configuration file**.
+Please refer to [Text Recognition Tutorial](./recognition_en.md). PaddleOCR modularizes the code, and training different recognition models only requires **changing the configuration file**.
Training:
diff --git a/doc/doc_en/algorithm_rec_crnn_en.md b/doc/doc_en/algorithm_rec_crnn_en.md
index 571569ee445d756ca7bdfeea6d5f960187a5a666..8548c2fa625b713d7e7e278506ff5c46713303ed 100644
--- a/doc/doc_en/algorithm_rec_crnn_en.md
+++ b/doc/doc_en/algorithm_rec_crnn_en.md
@@ -33,13 +33,13 @@ Using MJSynth and SynthText two text recognition datasets for training, and eval
## 2. Environment
-Please refer to ["Environment Preparation"](./environment.md) to configure the PaddleOCR environment, and refer to ["Project Clone"](./clone.md) to clone the project code.
+Please refer to ["Environment Preparation"](./environment_en.md) to configure the PaddleOCR environment, and refer to ["Project Clone"](./clone_en.md) to clone the project code.
## 3. Model Training / Evaluation / Prediction
-Please refer to [Text Recognition Tutorial](./recognition.md). PaddleOCR modularizes the code, and training different recognition models only requires **changing the configuration file**.
+Please refer to [Text Recognition Tutorial](./recognition_en.md). PaddleOCR modularizes the code, and training different recognition models only requires **changing the configuration file**.
Training:
diff --git a/doc/doc_en/algorithm_rec_nrtr_en.md b/doc/doc_en/algorithm_rec_nrtr_en.md
index 3f8fd0adee900cf889d70e8b78fb1122d54c7d08..309d7ab123065f15a599a1220ab3f39afffb9b60 100644
--- a/doc/doc_en/algorithm_rec_nrtr_en.md
+++ b/doc/doc_en/algorithm_rec_nrtr_en.md
@@ -12,6 +12,7 @@
- [4.3 Serving](#4-3)
- [4.4 More](#4-4)
- [5. FAQ](#5)
+- [6. Release Note](#6)
## 1. Introduction
@@ -25,17 +26,17 @@ Using MJSynth and SynthText two text recognition datasets for training, and eval
|Model|Backbone|config|Acc|Download link|
| --- | --- | --- | --- | --- |
-|NRTR|MTB|[rec_mtb_nrtr.yml](../../configs/rec/rec_mtb_nrtr.yml)|84.21%|[train model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mtb_nrtr_train.tar)|
+|NRTR|MTB|[rec_mtb_nrtr.yml](../../configs/rec/rec_mtb_nrtr.yml)|84.21%|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mtb_nrtr_train.tar)|
## 2. Environment
-Please refer to ["Environment Preparation"](./environment.md) to configure the PaddleOCR environment, and refer to ["Project Clone"](./clone.md) to clone the project code.
+Please refer to ["Environment Preparation"](./environment_en.md) to configure the PaddleOCR environment, and refer to ["Project Clone"](./clone_en.md) to clone the project code.
## 3. Model Training / Evaluation / Prediction
-Please refer to [Text Recognition Tutorial](./recognition.md). PaddleOCR modularizes the code, and training different recognition models only requires **changing the configuration file**.
+Please refer to [Text Recognition Tutorial](./recognition_en.md). PaddleOCR modularizes the code, and training different recognition models only requires **changing the configuration file**.
Training:
@@ -98,7 +99,7 @@ python3 tools/infer/predict_rec.py --image_dir='./doc/imgs_words_en/word_10.png'
After executing the command, the prediction result (recognized text and score) of the image above is printed to the screen, an example is as follows:
The result is as follows:
```shell
-Predicts of ./doc/imgs_words_en/word_10.png:('pain', 0.9265879392623901)
+Predicts of ./doc/imgs_words_en/word_10.png:('pain', 0.9465042352676392)
```
@@ -121,12 +122,146 @@ Not supported
1. In the `NRTR` paper, Beam search is used to decode characters, but the speed is slow. Beam search is not used by default here, and greedy search is used to decode characters.
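+
+For intuition, greedy search simply keeps the single most likely character at every decoding step, while beam search tracks several candidate sequences in parallel. A minimal sketch of greedy decoding (an illustration only, not PaddleOCR's actual decoder; it assumes per-step class probabilities as a NumPy array and an end-of-sequence symbol at index 0):
+
+```python
+import numpy as np
+
+def greedy_decode(step_probs, charset, eos_idx=0):
+    # step_probs: array of shape [seq_len, num_classes] with per-step scores.
+    chars = []
+    for probs in step_probs:
+        idx = int(np.argmax(probs))  # keep only the best character per step
+        if idx == eos_idx:           # stop at the assumed end-of-sequence symbol
+            break
+        chars.append(charset[idx])
+    return "".join(chars)
+```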
+
+## 6. Release Note
+
+1. The release/2.6 version updates the NRTR code structure. The new version of NRTR can load the model parameters of the old version (release/2.5 and before), and you may use the following code to convert the old version model parameters to the new version model parameters:
+
+```python
+import numpy as np
+import paddle
+
+# Assumes `model` has already been constructed as the new-version (release/2.6) NRTR network.
+params = paddle.load('path/' + '.pdparams')  # the old version parameters
+state_dict = model.state_dict()  # the new version model parameters
+new_state_dict = {}
+
+for k1, v1 in state_dict.items():
+
+    k = k1
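+    # Encoder self-attention: the old model stores Q/K/V as three 1x1 convs
+    # (conv1/conv2/conv3); the new model expects a single fused qkv projection.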
+    if 'encoder' in k and 'self_attn' in k and 'qkv' in k and 'weight' in k:
+        k_para = k[:13] + 'layers.' + k[13:]
+        q = params[k_para.replace('qkv', 'conv1')].transpose((1, 0, 2, 3))
+        k = params[k_para.replace('qkv', 'conv2')].transpose((1, 0, 2, 3))
+        v = params[k_para.replace('qkv', 'conv3')].transpose((1, 0, 2, 3))
+        new_state_dict[k1] = np.concatenate([q[:, :, 0, 0], k[:, :, 0, 0], v[:, :, 0, 0]], -1)
+
+    elif 'encoder' in k and 'self_attn' in k and 'qkv' in k and 'bias' in k:
+        k_para = k[:13] + 'layers.' + k[13:]
+        q = params[k_para.replace('qkv', 'conv1')]
+        k = params[k_para.replace('qkv', 'conv2')]
+        v = params[k_para.replace('qkv', 'conv3')]
+        new_state_dict[k1] = np.concatenate([q, k, v], -1)
+
+    elif 'encoder' in k and 'self_attn' in k and 'out_proj' in k:
+        k_para = k[:13] + 'layers.' + k[13:]
+        new_state_dict[k1] = params[k_para]
+
+    elif 'encoder' in k and 'norm3' in k:
+        k_para = k[:13] + 'layers.' + k[13:]
+        new_state_dict[k1] = params[k_para.replace('norm3', 'norm2')]
+
+    elif 'encoder' in k and 'norm1' in k:
+        k_para = k[:13] + 'layers.' + k[13:]
+        new_state_dict[k1] = params[k_para]
+
+
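+    # Decoder self-attention: the same conv1/conv2/conv3 -> fused qkv mapping as in the encoder.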
+    elif 'decoder' in k and 'self_attn' in k and 'qkv' in k and 'weight' in k:
+        k_para = k[:13] + 'layers.' + k[13:]
+        q = params[k_para.replace('qkv', 'conv1')].transpose((1, 0, 2, 3))
+        k = params[k_para.replace('qkv', 'conv2')].transpose((1, 0, 2, 3))
+        v = params[k_para.replace('qkv', 'conv3')].transpose((1, 0, 2, 3))
+        new_state_dict[k1] = np.concatenate([q[:, :, 0, 0], k[:, :, 0, 0], v[:, :, 0, 0]], -1)
+
+    elif 'decoder' in k and 'self_attn' in k and 'qkv' in k and 'bias' in k:
+        k_para = k[:13] + 'layers.' + k[13:]
+        q = params[k_para.replace('qkv', 'conv1')]
+        k = params[k_para.replace('qkv', 'conv2')]
+        v = params[k_para.replace('qkv', 'conv3')]
+        new_state_dict[k1] = np.concatenate([q, k, v], -1)
+
+    elif 'decoder' in k and 'self_attn' in k and 'out_proj' in k:
+        k_para = k[:13] + 'layers.' + k[13:]
+        new_state_dict[k1] = params[k_para]
+
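+    # Decoder cross-attention: previously named 'multihead_attn'; q is converted alone, k/v are fused.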
+    elif 'decoder' in k and 'cross_attn' in k and 'q' in k and 'weight' in k:
+        k_para = k[:13] + 'layers.' + k[13:]
+        k_para = k_para.replace('cross_attn', 'multihead_attn')
+        q = params[k_para.replace('q', 'conv1')].transpose((1, 0, 2, 3))
+        new_state_dict[k1] = q[:, :, 0, 0]
+
+    elif 'decoder' in k and 'cross_attn' in k and 'q' in k and 'bias' in k:
+        k_para = k[:13] + 'layers.' + k[13:]
+        k_para = k_para.replace('cross_attn', 'multihead_attn')
+        q = params[k_para.replace('q', 'conv1')]
+        new_state_dict[k1] = q
+
+    elif 'decoder' in k and 'cross_attn' in k and 'kv' in k and 'weight' in k:
+        k_para = k[:13] + 'layers.' + k[13:]
+        k_para = k_para.replace('cross_attn', 'multihead_attn')
+        k = params[k_para.replace('kv', 'conv2')].transpose((1, 0, 2, 3))
+        v = params[k_para.replace('kv', 'conv3')].transpose((1, 0, 2, 3))
+        new_state_dict[k1] = np.concatenate([k[:, :, 0, 0], v[:, :, 0, 0]], -1)
+
+    elif 'decoder' in k and 'cross_attn' in k and 'kv' in k and 'bias' in k:
+        k_para = k[:13] + 'layers.' + k[13:]
+        k_para = k_para.replace('cross_attn', 'multihead_attn')
+        k = params[k_para.replace('kv', 'conv2')]
+        v = params[k_para.replace('kv', 'conv3')]
+        new_state_dict[k1] = np.concatenate([k, v], -1)
+
+    elif 'decoder' in k and 'cross_attn' in k and 'out_proj' in k:
+        k_para = k[:13] + 'layers.' + k[13:]
+        k_para = k_para.replace('cross_attn', 'multihead_attn')
+        new_state_dict[k1] = params[k_para]
+
+    elif 'decoder' in k and 'norm' in k:
+        k_para = k[:13] + 'layers.' + k[13:]
+        new_state_dict[k1] = params[k_para]
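+    # Feed-forward (mlp) layers: old 1x1 conv weights ('conv') are squeezed into the new linear ('fc') weights.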
+    elif 'mlp' in k and 'weight' in k:
+        k_para = k[:13] + 'layers.' + k[13:]
+        k_para = k_para.replace('fc', 'conv')
+        k_para = k_para.replace('mlp.', '')
+        w = params[k_para].transpose((1, 0, 2, 3))
+        new_state_dict[k1] = w[:, :, 0, 0]
+
+    elif 'mlp' in k and 'bias' in k:
+        k_para = k[:13] + 'layers.' + k[13:]
+        k_para = k_para.replace('fc', 'conv')
+        k_para = k_para.replace('mlp.', '')
+        w = params[k_para]
+        new_state_dict[k1] = w
+
+    else:
+        new_state_dict[k1] = params[k1]
+
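+    # Sanity check: print any key whose converted tensor shape does not match the new model.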
+    if list(new_state_dict[k1].shape) != list(v1.shape):
+        print(k1)
+
+
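+# Final verification: report keys missing from the conversion (1) or with mismatched shapes (2).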
+for k, v1 in state_dict.items():
+    if k not in new_state_dict.keys():
+        print(1, k)
+    elif list(new_state_dict[k].shape) != list(v1.shape):
+        print(2, k)
+
+
+
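+# Load the converted parameters and save them as a new-version checkpoint.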
+model.set_state_dict(new_state_dict)
+paddle.save(model.state_dict(), 'nrtrnew_from_old_params.pdparams')
+
+```
+
+2. The new version has a cleaner code structure and improved inference speed compared with the old version.
+
## Citation
```bibtex
@article{Sheng2019NRTR,
title = {NRTR: A No-Recurrence Sequence-to-Sequence Model For Scene Text Recognition},
- author = {Fenfen Sheng and Zhineng Chen andBo Xu},
+ author = {Fenfen Sheng and Zhineng Chen and Bo Xu},
booktitle = {ICDAR},
year = {2019},
url = {http://arxiv.org/abs/1806.00926},
diff --git a/doc/doc_en/algorithm_rec_sar_en.md b/doc/doc_en/algorithm_rec_sar_en.md
index 8c1e6dbbfa5cc05da4d7423a535c6db74cf8f4c3..24b87c10c3b2839909392bf3de0e0c850112fcdc 100644
--- a/doc/doc_en/algorithm_rec_sar_en.md
+++ b/doc/doc_en/algorithm_rec_sar_en.md
@@ -31,13 +31,13 @@ Note:In addition to using the two text recognition datasets MJSynth and SynthTex
## 2. Environment
-Please refer to ["Environment Preparation"](./environment.md) to configure the PaddleOCR environment, and refer to ["Project Clone"](./clone.md) to clone the project code.
+Please refer to ["Environment Preparation"](./environment_en.md) to configure the PaddleOCR environment, and refer to ["Project Clone"](./clone_en.md) to clone the project code.
## 3. Model Training / Evaluation / Prediction
-Please refer to [Text Recognition Tutorial](./recognition.md). PaddleOCR modularizes the code, and training different recognition models only requires **changing the configuration file**.
+Please refer to [Text Recognition Tutorial](./recognition_en.md). PaddleOCR modularizes the code, and training different recognition models only requires **changing the configuration file**.
Training:
diff --git a/doc/doc_en/algorithm_rec_seed_en.md b/doc/doc_en/algorithm_rec_seed_en.md
index 21679f42fd6302228804db49d731f9b69ec692b2..f8d7ae6d3f34ab8a4f510c88002b22dbce7a10e8 100644
--- a/doc/doc_en/algorithm_rec_seed_en.md
+++ b/doc/doc_en/algorithm_rec_seed_en.md
@@ -31,13 +31,13 @@ Using MJSynth and SynthText two text recognition datasets for training, and eval
## 2. Environment
-Please refer to ["Environment Preparation"](./environment.md) to configure the PaddleOCR environment, and refer to ["Project Clone"](./clone.md) to clone the project code.
+Please refer to ["Environment Preparation"](./environment_en.md) to configure the PaddleOCR environment, and refer to ["Project Clone"](./clone_en.md) to clone the project code.
## 3. Model Training / Evaluation / Prediction
-Please refer to [Text Recognition Tutorial](./recognition.md). PaddleOCR modularizes the code, and training different recognition models only requires **changing the configuration file**.
+Please refer to [Text Recognition Tutorial](./recognition_en.md). PaddleOCR modularizes the code, and training different recognition models only requires **changing the configuration file**.
Training:
diff --git a/doc/doc_en/algorithm_rec_srn_en.md b/doc/doc_en/algorithm_rec_srn_en.md
index c022a81f9e5797c531c79de7e793d44d9a22552c..1d7fc07dc29e0de021165fc5656cbca704b45284 100644
--- a/doc/doc_en/algorithm_rec_srn_en.md
+++ b/doc/doc_en/algorithm_rec_srn_en.md
@@ -30,13 +30,13 @@ Using MJSynth and SynthText two text recognition datasets for training, and eval
## 2. Environment
-Please refer to ["Environment Preparation"](./environment.md) to configure the PaddleOCR environment, and refer to ["Project Clone"](./clone.md) to clone the project code.
+Please refer to ["Environment Preparation"](./environment_en.md) to configure the PaddleOCR environment, and refer to ["Project Clone"](./clone_en.md) to clone the project code.
## 3. Model Training / Evaluation / Prediction
-Please refer to [Text Recognition Tutorial](./recognition.md). PaddleOCR modularizes the code, and training different recognition models only requires **changing the configuration file**.
+Please refer to [Text Recognition Tutorial](./recognition_en.md). PaddleOCR modularizes the code, and training different recognition models only requires **changing the configuration file**.
Training:
diff --git a/doc/doc_en/algorithm_rec_starnet.md b/doc/doc_en/algorithm_rec_starnet.md
new file mode 100644
index 0000000000000000000000000000000000000000..dbb53a9c737c16fa249483fa97b0b49cf25b2137
--- /dev/null
+++ b/doc/doc_en/algorithm_rec_starnet.md
@@ -0,0 +1,139 @@
+# STAR-Net
+
+- [1. Introduction](#1)
+- [2. Environment](#2)
+- [3. Model Training / Evaluation / Prediction](#3)
+ - [3.1 Training](#3-1)
+ - [3.2 Evaluation](#3-2)
+ - [3.3 Prediction](#3-3)
+- [4. Inference and Deployment](#4)
+ - [4.1 Python Inference](#4-1)
+ - [4.2 C++ Inference](#4-2)
+ - [4.3 Serving](#4-3)
+ - [4.4 More](#4-4)
+- [5. FAQ](#5)
+
+
+## 1. Introduction
+
+Paper information:
+> [STAR-Net: a spatial attention residue network for scene text recognition.](http://www.bmva.org/bmvc/2016/papers/paper043/paper043.pdf)
+> Wei Liu, Chaofeng Chen, Kwan-Yee K. Wong, Zhizhong Su and Junyu Han.
+> BMVC, pages 43.1-43.13, 2016
+
+Refer to the [DTRB](https://arxiv.org/abs/1904.01906) text recognition training and evaluation process. Using the MJSynth and SynthText text recognition datasets for training, and evaluating on the IIIT, SVT, IC03, IC13, IC15, SVTP, and CUTE datasets, the algorithm reproduction effect is as follows:
+
+|Models|Backbone Networks|Avg Accuracy|Configuration Files|Download Links|
+| --- | --- | --- | --- | --- |
+|StarNet|Resnet34_vd|84.44%|[configs/rec/rec_r34_vd_tps_bilstm_ctc.yml](../../configs/rec/rec_r34_vd_tps_bilstm_ctc.yml)|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_tps_bilstm_ctc_v2.0_train.tar)|
+|StarNet|MobileNetV3|81.42%|[configs/rec/rec_mv3_tps_bilstm_ctc.yml](../../configs/rec/rec_mv3_tps_bilstm_ctc.yml)|[trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_tps_bilstm_ctc_v2.0_train.tar)|
+
+
+
+## 2. Environment
+Please refer to [Operating Environment Preparation](./environment_en.md) to configure the PaddleOCR operating environment, and refer to [Project Clone](./clone_en.md) to clone the project code.
+
+
+## 3. Model Training / Evaluation / Prediction
+
+Please refer to [Text Recognition Training Tutorial](./recognition_en.md). PaddleOCR modularizes the code, and training different recognition models only requires **changing the configuration file**. Take the model based on the Resnet34_vd backbone network as an example:
+
+
+### 3.1 Training
+After the data preparation is complete, the training can be started. The training command is as follows:
+
+```
+# Single card training (long training period, not recommended)
+python3 tools/train.py -c configs/rec/rec_r34_vd_tps_bilstm_ctc.yml
+
+# Multi-card training, specify the card number through the --gpus parameter
+python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/rec/rec_r34_vd_tps_bilstm_ctc.yml
+```
+
+
+### 3.2 Evaluation
+
+```
+# GPU evaluation, Global.pretrained_model is the model to be evaluated
+python3 -m paddle.distributed.launch --gpus '0' tools/eval.py -c configs/rec/rec_r34_vd_tps_bilstm_ctc.yml -o Global.pretrained_model={path/to/weights}/best_accuracy
+```
+
+
+### 3.3 Prediction
+
+```
+# The configuration file used for prediction must match the training
+python3 tools/infer_rec.py -c configs/rec/rec_r34_vd_tps_bilstm_ctc.yml -o Global.pretrained_model={path/to/weights}/best_accuracy Global.infer_img=doc/imgs_words/en/word_1.png
+```
+
+
+## 4. Inference and Deployment
+
+
+### 4.1 Python Inference
+First, convert the model saved during the STAR-Net text recognition training process into an inference model. Taking the model trained on the MJSynth and SynthText text recognition datasets with the Resnet34_vd backbone network as an example ([model download address](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_r34_vd_tps_bilstm_ctc_v2.0_train.tar)), it can be converted using the following command:
+
+```shell
+python3 tools/export_model.py -c configs/rec/rec_r34_vd_tps_bilstm_ctc.yml -o Global.pretrained_model=./rec_r34_vd_tps_bilstm_ctc_v2.0_train/best_accuracy Global.save_inference_dir=./inference/rec_starnet
+```
+
+For STAR-Net text recognition model inference, you can execute the following commands:
+
+```shell
+python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words_en/word_336.png" --rec_model_dir="./inference/rec_starnet/" --rec_image_shape="3, 32, 100" --rec_char_dict_path="./ppocr/utils/ic15_dict.txt"
+```
+
+
+
+The inference results are as follows:
+
+
+```bash
+Predicts of ./doc/imgs_words_en/word_336.png:('super', 0.9999073)
+```
+
+**Attention** Since the above model refers to the [DTRB](https://arxiv.org/abs/1904.01906) text recognition training and evaluation process, it is different from the ultra-lightweight Chinese recognition model training in two aspects:
+
+- The image resolutions used during training are different. The image resolution used for training the above models is [3, 32, 100], while for Chinese model training, in order to ensure the recognition effect of long texts, the image resolution used during training is [3, 32, 320]. The default shape parameter of the inference program is the image resolution used for training Chinese models, i.e. [3, 32, 320]. Therefore, when running inference with the above English model, it is necessary to set the shape of the recognized image through the parameter `rec_image_shape`.
+
+- Character list: the experiment in the DTRB paper targets only the 26 lowercase English letters and 10 digits, 36 characters in total. All uppercase characters are converted to lowercase, and characters outside this list are ignored and treated as spaces. Therefore, no character dictionary file is read in; instead, the dictionary is generated by the code below. Accordingly, the parameter `rec_char_dict_path` needs to be set during inference to the English dictionary "./ppocr/utils/ic15_dict.txt".
+
+```python
+# The 36-character set (10 digits + 26 lowercase letters) used by the DTRB setup.
+self.character_str = "0123456789abcdefghijklmnopqrstuvwxyz"
+dict_character = list(self.character_str)
+```
+
+
+### 4.2 C++ Inference
+
+After preparing the inference model, refer to the [cpp infer](../../deploy/cpp_infer/) tutorial to operate.
+
+
+### 4.3 Serving
+
+After preparing the inference model, refer to the [pdserving](../../deploy/pdserving/) tutorial for Serving deployment, including two modes: Python Serving and C++ Serving.
+
+
+### 4.4 More
+
+The STAR-Net model also supports the following inference deployment methods:
+
+- Paddle2ONNX Inference: After preparing the inference model, refer to the [paddle2onnx](../../deploy/paddle2onnx/) tutorial.
+
+
+## 5. FAQ
+
+## Citation
+
+```bibtex
+@inproceedings{liu2016star,
+ title={STAR-Net: a spatial attention residue network for scene text recognition.},
+ author={Liu, Wei and Chen, Chaofeng and Wong, Kwan-Yee K and Su, Zhizhong and Han, Junyu},
+ booktitle={BMVC},
+ volume={2},
+ pages={7},
+ year={2016}
+}
+```
+
+
diff --git a/doc/doc_en/algorithm_rec_svtr_en.md b/doc/doc_en/algorithm_rec_svtr_en.md
index 2e7deb4c077ce508773c4789e2e76bdda7dfe8c8..37cd35f35a2025cbb55ff85fe27b50e5d6e556aa 100644
--- a/doc/doc_en/algorithm_rec_svtr_en.md
+++ b/doc/doc_en/algorithm_rec_svtr_en.md
@@ -34,7 +34,7 @@ The accuracy (%) and model files of SVTR on the public dataset of scene text rec
## 2. Environment
-Please refer to ["Environment Preparation"](./environment.md) to configure the PaddleOCR environment, and refer to ["Project Clone"](./clone.md) to clone the project code.
+Please refer to ["Environment Preparation"](./environment_en.md) to configure the PaddleOCR environment, and refer to ["Project Clone"](./clone_en.md) to clone the project code.
#### Dataset Preparation
@@ -44,7 +44,7 @@ Please refer to ["Environment Preparation"](./environment.md) to configure the P
## 3. Model Training / Evaluation / Prediction
-Please refer to [Text Recognition Tutorial](./recognition.md). PaddleOCR modularizes the code, and training different recognition models only requires **changing the configuration file**.
+Please refer to [Text Recognition Tutorial](./recognition_en.md). PaddleOCR modularizes the code, and training different recognition models only requires **changing the configuration file**.
Training:
@@ -88,7 +88,6 @@ python3 tools/export_model.py -c configs/rec/rec_svtrnet.yml -o Global.pretraine
**Note:**
- If you are training the model on your own dataset and have modified the dictionary file, please pay attention to modify the `character_dict_path` in the configuration file to the modified dictionary file.
-- If you modified the input size during training, please modify the `infer_shape` corresponding to SVTR in the `tools/export_model.py` file.
After the conversion is successful, there are three files in the directory:
```
diff --git a/doc/doc_en/algorithm_rec_vitstr_en.md b/doc/doc_en/algorithm_rec_vitstr_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..a6f9e2f15df69e7949a4b9713274d9b83ff98f60
--- /dev/null
+++ b/doc/doc_en/algorithm_rec_vitstr_en.md
@@ -0,0 +1,134 @@
+# ViTSTR
+
+- [1. Introduction](#1)
+- [2. Environment](#2)
+- [3. Model Training / Evaluation / Prediction](#3)
+ - [3.1 Training](#3-1)
+ - [3.2 Evaluation](#3-2)
+ - [3.3 Prediction](#3-3)
+- [4. Inference and Deployment](#4)
+ - [4.1 Python Inference](#4-1)
+ - [4.2 C++ Inference](#4-2)
+ - [4.3 Serving](#4-3)
+ - [4.4 More](#4-4)
+- [5. FAQ](#5)
+
+
+## 1. Introduction
+
+Paper:
+> [Vision Transformer for Fast and Efficient Scene Text Recognition](https://arxiv.org/abs/2105.08582)
+> Rowel Atienza
+> ICDAR, 2021
+
+Using the MJSynth and SynthText text recognition datasets for training, and evaluating on the IIIT, SVT, IC03, IC13, IC15, SVTP, and CUTE datasets, the algorithm reproduction effect is as follows:
+
+|Model|Backbone|config|Acc|Download link|
+| --- | --- | --- | --- | --- |
+|ViTSTR|ViTSTR|[rec_vitstr_none_ce.yml](../../configs/rec/rec_vitstr_none_ce.yml)|79.82%|[trained model](https://paddleocr.bj.bcebos.com/rec_vitstr_none_none_train.tar)|
+
+
+## 2. Environment
+Please refer to ["Environment Preparation"](./environment_en.md) to configure the PaddleOCR environment, and refer to ["Project Clone"](./clone_en.md) to clone the project code.
+
+
+
+## 3. Model Training / Evaluation / Prediction
+
+Please refer to [Text Recognition Tutorial](./recognition_en.md). PaddleOCR modularizes the code, and training different recognition models only requires **changing the configuration file**.
+
+Training:
+
+Specifically, after the data preparation is completed, the training can be started. The training command is as follows:
+
+```
+#Single GPU training (long training period, not recommended)
+python3 tools/train.py -c configs/rec/rec_vitstr_none_ce.yml
+
+#Multi GPU training, specify the gpu number through the --gpus parameter
+python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/rec/rec_vitstr_none_ce.yml
+```
+
+Evaluation:
+
+```
+# GPU evaluation
+python3 -m paddle.distributed.launch --gpus '0' tools/eval.py -c configs/rec/rec_vitstr_none_ce.yml -o Global.pretrained_model={path/to/weights}/best_accuracy
+```
+
+Prediction:
+
+```
+# The configuration file used for prediction must match the training
+python3 tools/infer_rec.py -c configs/rec/rec_vitstr_none_ce.yml -o Global.infer_img='./doc/imgs_words_en/word_10.png' Global.pretrained_model=./rec_vitstr_none_ce_train/best_accuracy
+```
+
+
+## 4. Inference and Deployment
+
+
+### 4.1 Python Inference
+First, the model saved during the ViTSTR text recognition training process is converted into an inference model ([model download link](https://paddleocr.bj.bcebos.com/rec_vitstr_none_none_train.tar)). You can use the following command to convert:
+
+```
+python3 tools/export_model.py -c configs/rec/rec_vitstr_none_ce.yml -o Global.pretrained_model=./rec_vitstr_none_ce_train/best_accuracy Global.save_inference_dir=./inference/rec_vitstr
+```
+
+**Note:**
+- If you are training the model on your own dataset and have modified the dictionary file, please pay attention to modify the `character_dict_path` in the configuration file to the modified dictionary file.
+- If you modified the input size during training, please modify the `infer_shape` corresponding to ViTSTR in the `tools/export_model.py` file.
+
+After the conversion is successful, there are three files in the directory:
+```
+/inference/rec_vitstr/
+ ├── inference.pdiparams
+ ├── inference.pdiparams.info
+ └── inference.pdmodel
+```
+
+
+For ViTSTR text recognition model inference, the following commands can be executed:
+
+```
+python3 tools/infer/predict_rec.py --image_dir='./doc/imgs_words_en/word_10.png' --rec_model_dir='./inference/rec_vitstr/' --rec_algorithm='ViTSTR' --rec_image_shape='1,224,224' --rec_char_dict_path='./ppocr/utils/EN_symbol_dict.txt'
+```
+
+
+
+After executing the command, the prediction result (recognized text and score) of the image above is printed to the screen. An example is as follows:
+```shell
+Predicts of ./doc/imgs_words_en/word_10.png:('pain', 0.9998350143432617)
+```
+
+
+### 4.2 C++ Inference
+
+Not supported
+
+
+### 4.3 Serving
+
+Not supported
+
+
+### 4.4 More
+
+Not supported
+
+
+## 5. FAQ
+
+1. In the `ViTSTR` paper, weights pre-trained on ImageNet1k are used to initialize training. We did not use pre-trained weights in training, and the final accuracy was unchanged or even improved.
+
+## Citation
+
+```bibtex
+@inproceedings{Atienza2021ViTSTR,
+ title = {Vision Transformer for Fast and Efficient Scene Text Recognition},
+ author = {Rowel Atienza},
+ booktitle = {ICDAR},
+ year = {2021},
+ url = {https://arxiv.org/abs/2105.08582}
+}
+```
diff --git a/doc/doc_en/detection_en.md b/doc/doc_en/detection_en.md
index 76e0f8509b92dfaae62dce7ba2b4b73d39da1600..f85bf585cb66332d90de8d66ed315cb04ece7636 100644
--- a/doc/doc_en/detection_en.md
+++ b/doc/doc_en/detection_en.md
@@ -159,7 +159,7 @@ python3 -m paddle.distributed.launch --ips="xx.xx.xx.xx,xx.xx.xx.xx" --gpus '0,1
-o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained
```
-**Note:** When using multi-machine and multi-gpu training, you need to replace the ips value in the above command with the address of your machine, and the machines need to be able to ping each other. In addition, training needs to be launched separately on multiple machines. The command to view the ip address of the machine is `ifconfig`.
+**Note:** (1) When using multi-machine and multi-gpu training, you need to replace the ips value in the above command with the address of your machine, and the machines need to be able to ping each other. (2) Training needs to be launched separately on multiple machines. The command to view the ip address of the machine is `ifconfig`. (3) For more details about the distributed training speedup ratio, please refer to [Distributed Training Tutorial](./distributed_training_en.md).
### 2.6 Training with knowledge distillation
diff --git a/doc/doc_en/distributed_training.md b/doc/doc_en/distributed_training_en.md
similarity index 70%
rename from doc/doc_en/distributed_training.md
rename to doc/doc_en/distributed_training_en.md
index 2822ee5e4ea52720a458e4060d8a09be7b98846b..5a219ed2b494d6239096ff634dfdc702c4be9419 100644
--- a/doc/doc_en/distributed_training.md
+++ b/doc/doc_en/distributed_training_en.md
@@ -40,11 +40,17 @@ python3 -m paddle.distributed.launch \
## Performance comparison
-* Based on 26W public recognition dataset (LSVT, rctw, mtwi), training on single 8-card P40 and dual 8-card P40, the final time consumption is as follows.
+* On two machines with 8 P40 GPUs each, the final time consumption and speedup ratio for the public recognition dataset (LSVT, RCTW, MTWI) containing 260k images are as follows.
-| Model | Config file | Number of machines | Number of GPUs per machine | Training time | Recognition acc | Speedup ratio |
-| :-------: | :------------: | :----------------: | :----------------------------: | :------------------: | :--------------: | :-----------: |
-| CRNN | configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml | 1 | 8 | 60h | 66.7% | - |
-| CRNN | configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml | 2 | 8 | 40h | 67.0% | 150% |
-It can be seen that the training time is shortened from 60h to 40h, the speedup ratio can reach 150% (60h / 40h), and the efficiency is 75% (60h / (40h * 2)).
+| Model | Config file | Recognition acc | Single-node (8-GPU) training time | Two-node training time | Speedup ratio |
+|------|-----|--------|--------|--------|-----|
+| CRNN | [rec_chinese_lite_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml) | 67.0% | 2.50d | 1.67d | **1.5** |
+
+
+* On four machines with 8 V100 GPUs each, the final time consumption and speedup ratio for the full dataset are as follows.
+
+
+| Model | Config file | Recognition acc | Single-node (8-GPU) training time | Four-node training time | Speedup ratio |
+|------|-----|--------|--------|--------|-----|
+| SVTR | [ch_PP-OCRv3_rec_distillation.yml](../../configs/rec/PP-OCRv3/ch_PP-OCRv3_rec_distillation.yml) | 74.0% | 10d | 2.84d | **3.5** |
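+
+* The speedup ratio above is the single-node training time divided by the multi-node training time, e.g., 2.50d / 1.67d ≈ 1.5 with two nodes and 10d / 2.84d ≈ 3.5 with four nodes.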
diff --git a/doc/doc_en/knowledge_distillation_en.md b/doc/doc_en/knowledge_distillation_en.md
index bd36907c98c6d556fe1dea85712ece0e717fe426..52725e5c0586b7f7b3e8fdc86d0c24ea38030d53 100755
--- a/doc/doc_en/knowledge_distillation_en.md
+++ b/doc/doc_en/knowledge_distillation_en.md
@@ -438,10 +438,10 @@ Architecture:
```
If DML is used, that is, the method of two small models learning from each other, the Teacher network structure in the above configuration file needs to be set to the same configuration as the Student model.
-Refer to the configuration file for details. [ch_PP-OCRv3_det_dml.yml](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.4/configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_dml.yml)
+Refer to the configuration file for details. [ch_PP-OCRv3_det_dml.yml](../../configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_dml.yml)
-The following describes the configuration file parameters [ch_PP-OCRv3_det_cml.yml](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.4/configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_cml.yml):
+The following describes the configuration file parameters [ch_PP-OCRv3_det_cml.yml](../../configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_cml.yml):
```
Architecture:
diff --git a/doc/doc_en/models_list_en.md b/doc/doc_en/models_list_en.md
index 8e8c1f2fe11bcd0748d556d34fd184fed4b3a86f..c52f71dfe4124302b8cb308980a6228a89589bd6 100644
--- a/doc/doc_en/models_list_en.md
+++ b/doc/doc_en/models_list_en.md
@@ -20,7 +20,7 @@ The downloadable models provided by PaddleOCR include `inference model`, `traine
|model type|model format|description|
|--- | --- | --- |
-|inference model|inference.pdmodel、inference.pdiparams|Used for inference based on Paddle inference engine,[detail](./inference_en.md)|
+|inference model|inference.pdmodel、inference.pdiparams|Used for inference based on Paddle inference engine,[detail](./inference_ppocr_en.md)|
|trained model, pre-trained model|\*.pdparams、\*.pdopt、\*.states |The checkpoints model saved in the training process, which stores the parameters of the model, mostly used for model evaluation and continuous training.|
|nb model|\*.nb| Model optimized by Paddle-Lite, which is suitable for mobile-side deployment scenarios (Paddle-Lite is needed for nb model deployment). |
@@ -37,7 +37,7 @@ Relationship of the above models is as follows.
|model name|description|config|model size|download|
| --- | --- | --- | --- | --- |
-|ch_PP-OCRv3_det_slim| [New] slim quantization with distillation lightweight model, supporting Chinese, English, multilingual text detection |[ch_PP-OCRv3_det_cml.yml](../../configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_cml.yml)| 1.1M |[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_slim_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/ch/ch_PP-OCRv3_det_slim_distill_train.tar) / [nb model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_slim_infer.nb)|
+|ch_PP-OCRv3_det_slim| [New] slim quantization with distillation lightweight model, supporting Chinese, English, multilingual text detection |[ch_PP-OCRv3_det_cml.yml](../../configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_cml.yml)| 1.1M |[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_slim_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_slim_distill_train.tar) / [nb model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_slim_infer.nb)|
|ch_PP-OCRv3_det| [New] Original lightweight model, supporting Chinese, English, multilingual text detection |[ch_PP-OCRv3_det_cml.yml](../../configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_cml.yml)| 3.8M |[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_distill_train.tar)|
|ch_PP-OCRv2_det_slim| [New] slim quantization with distillation lightweight model, supporting Chinese, English, multilingual text detection|[ch_PP-OCRv2_det_cml.yml](../../configs/det/ch_PP-OCRv2/ch_PP-OCRv2_det_cml.yml)| 3M |[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_slim_quant_infer.tar)|
|ch_PP-OCRv2_det| [New] Original lightweight model, supporting Chinese, English, multilingual text detection|[ch_PP-OCRv2_det_cml.yml](../../configs/det/ch_PP-OCRv2/ch_PP-OCRv2_det_cml.yml)|3M|[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_distill_train.tar)|
@@ -75,7 +75,7 @@ Relationship of the above models is as follows.
|model name|description|config|model size|download|
| --- | --- | --- | --- | --- |
-|ch_PP-OCRv3_rec_slim | [New] Slim qunatization with distillation lightweight model, supporting Chinese, English text recognition |[ch_PP-OCRv3_rec_distillation.yml](../../configs/rec/PP-OCRv3/ch_PP-OCRv3_rec_distillation.yml)| 4.9M |[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_slim_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/ch/ch_PP-OCRv3_rec_slim_train.tar) / [nb model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_slim_infer.nb) |
+|ch_PP-OCRv3_rec_slim | [New] Slim quantization with distillation lightweight model, supporting Chinese, English text recognition |[ch_PP-OCRv3_rec_distillation.yml](../../configs/rec/PP-OCRv3/ch_PP-OCRv3_rec_distillation.yml)| 4.9M |[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_slim_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_slim_train.tar) / [nb model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_slim_infer.nb) |
|ch_PP-OCRv3_rec| [New] Original lightweight model, supporting Chinese, English, multilingual text recognition |[ch_PP-OCRv3_rec_distillation.yml](../../configs/rec/PP-OCRv3/ch_PP-OCRv3_rec_distillation.yml)| 12.4M |[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_train.tar) |
|ch_PP-OCRv2_rec_slim| Slim qunatization with distillation lightweight model, supporting Chinese, English text recognition|[ch_PP-OCRv2_rec.yml](../../configs/rec/ch_PP-OCRv2/ch_PP-OCRv2_rec.yml)| 9M |[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_slim_quant_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_slim_quant_train.tar) |
|ch_PP-OCRv2_rec| Original lightweight model, supporting Chinese, English, multilingual text recognition |[ch_PP-OCRv2_rec_distillation.yml](../../configs/rec/ch_PP-OCRv2/ch_PP-OCRv2_rec_distillation.yml)|8.5M|[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_train.tar) |
@@ -91,7 +91,7 @@ Relationship of the above models is as follows.
|model name|description|config|model size|download|
| --- | --- | --- | --- | --- |
-|en_PP-OCRv3_rec_slim | [New] Slim qunatization with distillation lightweight model, supporting english, English text recognition |[en_PP-OCRv3_rec.yml](../../configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml)| 3.2M |[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/english/PP-OCRv3_rec_slim_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_rec_slim_train.tar) / [nb model](https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_rec_slim_infer.nb) |
+|en_PP-OCRv3_rec_slim | [New] Slim quantization with distillation lightweight model, supporting English text recognition |[en_PP-OCRv3_rec.yml](../../configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml)| 3.2M |[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_rec_slim_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_rec_slim_train.tar) / [nb model](https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_rec_slim_infer.nb) |
|en_PP-OCRv3_rec| [New] Original lightweight model, supporting english, English, multilingual text recognition |[en_PP-OCRv3_rec.yml](../../configs/rec/PP-OCRv3/en_PP-OCRv3_rec.yml)| 9.6M |[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_rec_train.tar) |
|en_number_mobile_slim_v2.0_rec|Slim pruned and quantized lightweight model, supporting English and number recognition|[rec_en_number_lite_train.yml](../../configs/rec/multi_language/rec_en_number_lite_train.yml)| 2.7M | [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/en_number_mobile_v2.0_rec_slim_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/en_number_mobile_v2.0_rec_slim_train.tar) |
|en_number_mobile_v2.0_rec|Original lightweight model, supporting English and number recognition|[rec_en_number_lite_train.yml](../../configs/rec/multi_language/rec_en_number_lite_train.yml)|2.6M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/en_number_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/en_number_mobile_v2.0_rec_train.tar) |
@@ -108,7 +108,7 @@ Relationship of the above models is as follows.
| ka_PP-OCRv3_rec | ppocr/utils/dict/ka_dict.txt | Lightweight model for Kannada recognition |[ka_PP-OCRv3_rec.yml](../../configs/rec/PP-OCRv3/multi_language/ka_PP-OCRv3_rec.yml)|9.9M|[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/multilingual/ka_PP-OCRv3_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/multilingual/ka_PP-OCRv3_rec_train.tar) |
| ta_PP-OCRv3_rec | ppocr/utils/dict/ta_dict.txt |Lightweight model for Tamil recognition|[ta_PP-OCRv3_rec.yml](../../configs/rec/PP-OCRv3/multi_language/ta_PP-OCRv3_rec.yml)|9.6M|[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/multilingual/ta_PP-OCRv3_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/multilingual/ta_PP-OCRv3_rec_train.tar) |
| latin_PP-OCRv3_rec | ppocr/utils/dict/latin_dict.txt | Lightweight model for latin recognition | [latin_PP-OCRv3_rec.yml](../../configs/rec/PP-OCRv3/multi_language/latin_PP-OCRv3_rec.yml) |9.7M|[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/multilingual/latin_PP-OCRv3_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/multilingual/latin_PP-OCRv3_rec_train.tar) |
-| arabic_PP-OCRv3_rec | ppocr/utils/dict/arabic_dict.txt | Lightweight model for arabic recognition | [arabic_PP-OCRv3_rec.yml](../../configs/rec/PP-OCRv3/multi_language/rec_arabic_lite_train.yml) |9.6M|[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/multilingual/arabic_PP-OCRv3_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/multilingual/arabic_PP-OCRv3_rec_train.tar) |
+| arabic_PP-OCRv3_rec | ppocr/utils/dict/arabic_dict.txt | Lightweight model for arabic recognition | [arabic_PP-OCRv3_rec.yml](../../configs/rec/PP-OCRv3/multi_language/arabic_PP-OCRv3_rec.yml) |9.6M|[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/multilingual/arabic_PP-OCRv3_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/multilingual/arabic_PP-OCRv3_rec_train.tar) |
| cyrillic_PP-OCRv3_rec | ppocr/utils/dict/cyrillic_dict.txt | Lightweight model for cyrillic recognition | [cyrillic_PP-OCRv3_rec.yml](../../configs/rec/PP-OCRv3/multi_language/cyrillic_PP-OCRv3_rec.yml) |9.6M|[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/multilingual/cyrillic_PP-OCRv3_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/multilingual/cyrillic_PP-OCRv3_rec_train.tar) |
| devanagari_PP-OCRv3_rec | ppocr/utils/dict/devanagari_dict.txt | Lightweight model for devanagari recognition | [devanagari_PP-OCRv3_rec.yml](../../configs/rec/PP-OCRv3/multi_language/devanagari_PP-OCRv3_rec.yml) |9.9M|[inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/multilingual/devanagari_PP-OCRv3_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/multilingual/devanagari_PP-OCRv3_rec_train.tar) |
diff --git a/doc/doc_en/multi_languages_en.md b/doc/doc_en/multi_languages_en.md
index 4696a3e842242517d19bcac7d7bdef3b4c233b12..d9cb180f706eebdba4727f7909499487794545b9 100644
--- a/doc/doc_en/multi_languages_en.md
+++ b/doc/doc_en/multi_languages_en.md
@@ -187,10 +187,10 @@ In addition to installing the whl package for quick forecasting,
PPOCR also provides a variety of forecasting deployment methods.
If necessary, you can read related documents:
-- [Python Inference](./inference_en.md)
-- [C++ Inference](../../deploy/cpp_infer/readme_en.md)
+- [Python Inference](./inference_ppocr_en.md)
+- [C++ Inference](../../deploy/cpp_infer/readme.md)
- [Serving](../../deploy/hubserving/readme_en.md)
-- [Mobile](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/deploy/lite/readme_en.md)
+- [Mobile](../../deploy/lite/readme.md)
- [Benchmark](./benchmark_en.md)
diff --git a/doc/doc_en/ppocr_introduction_en.md b/doc/doc_en/ppocr_introduction_en.md
index 8fe6bc683ac69bdff0e3b4297f2eaa95b934fa17..d28ccb3529a46bdf0d3fd1d1c81f14137d10f2ea 100644
--- a/doc/doc_en/ppocr_introduction_en.md
+++ b/doc/doc_en/ppocr_introduction_en.md
@@ -29,16 +29,16 @@ PP-OCR pipeline is as follows:
PP-OCR system is in continuous optimization. At present, PP-OCR and PP-OCRv2 have been released:
-PP-OCR adopts 19 effective strategies from 8 aspects including backbone network selection and adjustment, prediction head design, data augmentation, learning rate transformation strategy, regularization parameter selection, pre-training model use, and automatic model tailoring and quantization to optimize and slim down the models of each module (as shown in the green box above). The final results are an ultra-lightweight Chinese and English OCR model with an overall size of 3.5M and a 2.8M English digital OCR model. For more details, please refer to the PP-OCR technical article (https://arxiv.org/abs/2009.09941).
+PP-OCR adopts 19 effective strategies from 8 aspects including backbone network selection and adjustment, prediction head design, data augmentation, learning rate transformation strategy, regularization parameter selection, pre-training model use, and automatic model tailoring and quantization to optimize and slim down the models of each module (as shown in the green box above). The final results are an ultra-lightweight Chinese and English OCR model with an overall size of 3.5M and a 2.8M English digital OCR model. For more details, please refer to [PP-OCR technical report](https://arxiv.org/abs/2009.09941).
#### PP-OCRv2
-On the basis of PP-OCR, PP-OCRv2 is further optimized in five aspects. The detection model adopts CML(Collaborative Mutual Learning) knowledge distillation strategy and CopyPaste data expansion strategy. The recognition model adopts LCNet lightweight backbone network, U-DML knowledge distillation strategy and enhanced CTC loss function improvement (as shown in the red box above), which further improves the inference speed and prediction effect. For more details, please refer to the technical report of PP-OCRv2 (https://arxiv.org/abs/2109.03144).
+On the basis of PP-OCR, PP-OCRv2 is further optimized in five aspects. The detection model adopts the CML (Collaborative Mutual Learning) knowledge distillation strategy and the CopyPaste data expansion strategy. The recognition model adopts the LCNet lightweight backbone network, the U-DML knowledge distillation strategy, and an enhanced CTC loss function (as shown in the red box above), which further improve the inference speed and prediction effect. For more details, please refer to [PP-OCRv2 technical report](https://arxiv.org/abs/2109.03144).
#### PP-OCRv3
PP-OCRv3 upgraded the detection model and recognition model in 9 aspects based on PP-OCRv2:
- PP-OCRv3 detector upgrades the CML (Collaborative Mutual Learning) text detection distillation strategy proposed in PP-OCRv2, further optimizing the teacher model and the student model respectively. For the teacher model, a PAN module with a large receptive field, LK-PAN, is proposed and the DML distillation strategy is adopted; for the student model, an FPN module with a residual attention mechanism, RSE-FPN, is proposed.
-- PP-OCRv3 recognizer is optimized based on text recognition algorithm [SVTR](https://arxiv.org/abs/2205.00159). SVTR no longer adopts RNN by introducing transformers structure, which can mine the context information of text line image more effectively, so as to improve the ability of text recognition. PP-OCRv3 adopts lightweight text recognition network SVTR_LCNet, guided training of CTC loss by attention loss, data augmentation strategy TextConAug, better pre-trained model by self-supervised TextRotNet, UDML(Unified Deep Mutual Learning), and UIM (Unlabeled Images Mining) to accelerate the model and improve the effect.
+- PP-OCRv3 recognizer is optimized based on the text recognition algorithm [SVTR](https://arxiv.org/abs/2205.00159). SVTR abandons RNN and introduces a Transformer structure, which mines the context information of the text line image more effectively and thus improves text recognition ability. PP-OCRv3 adopts the lightweight text recognition network SVTR_LCNet, guided training of CTC by attention, the data augmentation strategy TextConAug, a better pre-trained model obtained by self-supervised TextRotNet, UDML (Unified Deep Mutual Learning), and UIM (Unlabeled Images Mining) to accelerate the model and improve the effect.
PP-OCRv3 pipeline is as follows:
@@ -46,7 +46,7 @@ PP-OCRv3 pipeline is as follows:
-For more details, please refer to [PP-OCRv3 technical report](./PP-OCRv3_introduction_en.md).
+For more details, please refer to [PP-OCRv3 technical report](https://arxiv.org/abs/2206.03001v2).
## 2. Features
diff --git a/doc/doc_en/quickstart_en.md b/doc/doc_en/quickstart_en.md
index d7aeb7773021aa6cf8f4d71298588915e5938fab..c678dc47625f4289a93621144bf5577b059d52b3 100644
--- a/doc/doc_en/quickstart_en.md
+++ b/doc/doc_en/quickstart_en.md
@@ -119,7 +119,18 @@ If you do not use the provided test image, you can replace the following `--imag
['PAIN', 0.9934559464454651]
```
-If you need to use the 2.0 model, please specify the parameter `--ocr_version PP-OCR`, paddleocr uses the PP-OCRv3 model by default(`--ocr_version PP-OCRv3`). More whl package usage can be found in [whl package](./whl_en.md)
+**Version**
+paddleocr uses the PP-OCRv3 model by default (`--ocr_version PP-OCRv3`). If you want to use another version, set the parameter `--ocr_version`; the supported versions are described below:
+| version name | description |
+| --- | --- |
+| PP-OCRv3 | supports Chinese and English detection and recognition, the direction classifier, and multilingual recognition |
+| PP-OCRv2 | supports only Chinese and English detection and recognition and the direction classifier; the multilingual models have not been updated |
+| PP-OCR | supports Chinese and English detection and recognition, the direction classifier, and multilingual recognition |
+
+If you want to add your own trained model, add its download links and keys in [paddleocr](../../paddleocr.py) and rebuild the package.
+
+More whl package usage can be found in [whl package](./whl_en.md).
+
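The `--ocr_version` switch maps directly onto the Python API as well. A minimal sketch (the sample image path is an assumption; substitute any local image):

```python
from paddleocr import PaddleOCR

# Equivalent to the CLI flag `--ocr_version PP-OCRv2`; omit ocr_version
# to get the PP-OCRv3 default described in the table above.
ocr = PaddleOCR(ocr_version='PP-OCRv2', lang='en')
result = ocr.ocr('doc/imgs_en/img_12.jpg', cls=False)
for line in result:
    print(line)  # [box points, (text, confidence)]
```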
#### 2.1.2 Multi-language Model
diff --git a/doc/doc_en/recognition_en.md b/doc/doc_en/recognition_en.md
index 60b4a1b26b373adc562ab9624e55ffe59a775a35..7d31b0ffe28c59ad3397d06fa178bcf8cbb822e9 100644
--- a/doc/doc_en/recognition_en.md
+++ b/doc/doc_en/recognition_en.md
@@ -306,7 +306,7 @@ python3 -m paddle.distributed.launch --ips="xx.xx.xx.xx,xx.xx.xx.xx" --gpus '0,1
-o Global.pretrained_model=./pretrain_models/rec_mv3_none_bilstm_ctc_v2.0_train
```
-**Note:** When using multi-machine and multi-gpu training, you need to replace the ips value in the above command with the address of your machine, and the machines need to be able to ping each other. In addition, training needs to be launched separately on multiple machines. The command to view the ip address of the machine is `ifconfig`.
+**Note:** (1) When using multi-machine multi-GPU training, you need to replace the ips value in the above command with the addresses of your machines, and the machines must be able to ping each other. (2) Training needs to be launched separately on each machine. The command to view a machine's IP address is `ifconfig`. (3) For more details about the distributed training speedup ratio, please refer to [Distributed Training Tutorial](./distributed_training_en.md).
### 2.6 Training with Knowledge Distillation
diff --git a/doc/doc_en/update_en.md b/doc/doc_en/update_en.md
index a900219b2462524425fc4303ea3bd571efcbab8f..a44dd0d70c611e1b5fb59d1e58b382704d0bbae8 100644
--- a/doc/doc_en/update_en.md
+++ b/doc/doc_en/update_en.md
@@ -1,8 +1,8 @@
# RECENT UPDATES
- 2022.5.9 release PaddleOCR v2.5, including:
- - [PP-OCRv3](./doc/doc_en/ppocr_introduction_en.md#pp-ocrv3): With comparable speed, the effect of Chinese scene is further improved by 5% compared with PP-OCRv2, the effect of English scene is improved by 11%, and the average recognition accuracy of 80 language multilingual models is improved by more than 5%.
- - [PPOCRLabelv2](./PPOCRLabel): Add the annotation function for table recognition task, key information extraction task and irregular text image.
- - Interactive e-book [*"Dive into OCR"*](./doc/doc_en/ocr_book_en.md), covers the cutting-edge theory and code practice of OCR full stack technology.
+ - [PP-OCRv3](./ppocr_introduction_en.md#pp-ocrv3): With comparable speed, accuracy on Chinese scenes improves by a further 5% over PP-OCRv2, accuracy on English scenes improves by 11%, and the average recognition accuracy of the 80-language multilingual models improves by more than 5%.
+ - [PPOCRLabelv2](../../PPOCRLabel): Add annotation functions for the table recognition task, the key information extraction task, and irregular text images.
+ - Interactive e-book [*"Dive into OCR"*](./ocr_book_en.md), covering the cutting-edge theory and code practice of full-stack OCR technology.
- 2022.5.7 Add support for metric and model logging during training to [Weights & Biases](https://docs.wandb.ai/).
- 2021.12.21 OCR open source online course starts. The lesson starts at 8:30 every night and lasts for ten days. Free registration: https://aistudio.baidu.com/aistudio/course/introduce/25207
- 2021.12.21 release PaddleOCR v2.4, release 1 text detection algorithm (PSENet), 3 text recognition algorithms (NRTR, SEED, SAR), 1 key information extraction algorithm (SDMGR) and 3 DocVQA algorithms (LayoutLM, LayoutLMv2, LayoutXLM).
diff --git a/paddleocr.py b/paddleocr.py
index a1265f79def7018a5586be954127e5b7fdba011e..470dc60da3b15195bcd401aff5e50be5a2cfd13e 100644
--- a/paddleocr.py
+++ b/paddleocr.py
@@ -154,7 +154,13 @@ MODEL_URLS = {
'https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar',
'dict_path': './ppocr/utils/ppocr_keys_v1.txt'
}
- }
+ },
+ 'cls': {
+ 'ch': {
+ 'url':
+ 'https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar',
+ }
+ },
},
'PP-OCR': {
'det': {
diff --git a/ppocr/data/imaug/__init__.py b/ppocr/data/imaug/__init__.py
index 750a7541578770c15edcadafe38923198bff9fd6..d82176282839bf76d34ed8a60d5e2e13ac6bbce6 100644
--- a/ppocr/data/imaug/__init__.py
+++ b/ppocr/data/imaug/__init__.py
@@ -22,8 +22,11 @@ from .make_shrink_map import MakeShrinkMap
from .random_crop_data import EastRandomCropData, RandomCropImgMask
from .make_pse_gt import MakePseGt
-from .rec_img_aug import RecAug, RecConAug, RecResizeImg, ClsResizeImg, \
- SRNRecResizeImg, NRTRRecResizeImg, SARRecResizeImg, PRENResizeImg
+
+
+from .rec_img_aug import BaseDataAugmentation, RecAug, RecConAug, RecResizeImg, ClsResizeImg, \
+ SRNRecResizeImg, GrayRecResizeImg, SARRecResizeImg, PRENResizeImg, \
+ ABINetRecResizeImg, SVTRRecResizeImg, ABINetRecAug
from .ssl_img_aug import SSLRotateResize
from .randaugment import RandAugment
from .copy_paste import CopyPaste
diff --git a/ppocr/data/imaug/abinet_aug.py b/ppocr/data/imaug/abinet_aug.py
new file mode 100644
index 0000000000000000000000000000000000000000..eefdc75d5a5c0ac3f7136bf22a2adb31129bd313
--- /dev/null
+++ b/ppocr/data/imaug/abinet_aug.py
@@ -0,0 +1,407 @@
+# copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""
+This code is refer from:
+https://github.com/FangShancheng/ABINet/blob/main/transforms.py
+"""
+import math
+import numbers
+import random
+
+import cv2
+import numpy as np
+from paddle.vision.transforms import Compose, ColorJitter
+
+
+def sample_asym(magnitude, size=None):
+ return np.random.beta(1, 4, size) * magnitude
+
+
+def sample_sym(magnitude, size=None):
+ return (np.random.beta(4, 4, size=size) - 0.5) * 2 * magnitude
+
+
+def sample_uniform(low, high, size=None):
+ return np.random.uniform(low, high, size=size)
+
+
+def get_interpolation(type='random'):
+ if type == 'random':
+ choice = [
+ cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA
+ ]
+ interpolation = choice[random.randint(0, len(choice) - 1)]
+ elif type == 'nearest':
+ interpolation = cv2.INTER_NEAREST
+ elif type == 'linear':
+ interpolation = cv2.INTER_LINEAR
+ elif type == 'cubic':
+ interpolation = cv2.INTER_CUBIC
+ elif type == 'area':
+ interpolation = cv2.INTER_AREA
+ else:
+ raise TypeError(
+ 'Interpolation types only nearest, linear, cubic, area are supported!'
+ )
+ return interpolation
+
+
+class CVRandomRotation(object):
+ def __init__(self, degrees=15):
+ assert isinstance(degrees,
+ numbers.Number), "degree should be a single number."
+ assert degrees >= 0, "degree must be positive."
+ self.degrees = degrees
+
+ @staticmethod
+ def get_params(degrees):
+ return sample_sym(degrees)
+
+ def __call__(self, img):
+ angle = self.get_params(self.degrees)
+ src_h, src_w = img.shape[:2]
+ M = cv2.getRotationMatrix2D(
+ center=(src_w / 2, src_h / 2), angle=angle, scale=1.0)
+ abs_cos, abs_sin = abs(M[0, 0]), abs(M[0, 1])
+ dst_w = int(src_h * abs_sin + src_w * abs_cos)
+ dst_h = int(src_h * abs_cos + src_w * abs_sin)
+ M[0, 2] += (dst_w - src_w) / 2
+ M[1, 2] += (dst_h - src_h) / 2
+
+ flags = get_interpolation()
+ return cv2.warpAffine(
+ img,
+ M, (dst_w, dst_h),
+ flags=flags,
+ borderMode=cv2.BORDER_REPLICATE)
+
+
+class CVRandomAffine(object):
+ def __init__(self, degrees, translate=None, scale=None, shear=None):
+ assert isinstance(degrees,
+ numbers.Number), "degree should be a single number."
+ assert degrees >= 0, "degree must be positive."
+ self.degrees = degrees
+
+ if translate is not None:
+ assert isinstance(translate, (tuple, list)) and len(translate) == 2, \
+ "translate should be a list or tuple and it must be of length 2."
+ for t in translate:
+ if not (0.0 <= t <= 1.0):
+ raise ValueError(
+ "translation values should be between 0 and 1")
+ self.translate = translate
+
+ if scale is not None:
+ assert isinstance(scale, (tuple, list)) and len(scale) == 2, \
+ "scale should be a list or tuple and it must be of length 2."
+ for s in scale:
+ if s <= 0:
+ raise ValueError("scale values should be positive")
+ self.scale = scale
+
+ if shear is not None:
+ if isinstance(shear, numbers.Number):
+ if shear < 0:
+ raise ValueError(
+ "If shear is a single number, it must be positive.")
+ self.shear = [shear]
+ else:
+ assert isinstance(shear, (tuple, list)) and (len(shear) == 2), \
+ "shear should be a list or tuple and it must be of length 2."
+ self.shear = shear
+ else:
+ self.shear = shear
+
+ def _get_inverse_affine_matrix(self, center, angle, translate, scale,
+ shear):
+ # https://github.com/pytorch/vision/blob/v0.4.0/torchvision/transforms/functional.py#L717
+ from numpy import sin, cos, tan
+
+ if isinstance(shear, numbers.Number):
+ shear = [shear, 0]
+
+        if not (isinstance(shear, (tuple, list)) and len(shear) == 2):
+ raise ValueError(
+ "Shear should be a single value or a tuple/list containing " +
+ "two values. Got {}".format(shear))
+
+ rot = math.radians(angle)
+ sx, sy = [math.radians(s) for s in shear]
+
+ cx, cy = center
+ tx, ty = translate
+
+ # RSS without scaling
+ a = cos(rot - sy) / cos(sy)
+ b = -cos(rot - sy) * tan(sx) / cos(sy) - sin(rot)
+ c = sin(rot - sy) / cos(sy)
+ d = -sin(rot - sy) * tan(sx) / cos(sy) + cos(rot)
+
+ # Inverted rotation matrix with scale and shear
+ # det([[a, b], [c, d]]) == 1, since det(rotation) = 1 and det(shear) = 1
+ M = [d, -b, 0, -c, a, 0]
+ M = [x / scale for x in M]
+
+ # Apply inverse of translation and of center translation: RSS^-1 * C^-1 * T^-1
+ M[2] += M[0] * (-cx - tx) + M[1] * (-cy - ty)
+ M[5] += M[3] * (-cx - tx) + M[4] * (-cy - ty)
+
+ # Apply center translation: C * RSS^-1 * C^-1 * T^-1
+ M[2] += cx
+ M[5] += cy
+ return M
+
+ @staticmethod
+ def get_params(degrees, translate, scale_ranges, shears, height):
+ angle = sample_sym(degrees)
+ if translate is not None:
+ max_dx = translate[0] * height
+ max_dy = translate[1] * height
+ translations = (np.round(sample_sym(max_dx)),
+ np.round(sample_sym(max_dy)))
+ else:
+ translations = (0, 0)
+
+ if scale_ranges is not None:
+ scale = sample_uniform(scale_ranges[0], scale_ranges[1])
+ else:
+ scale = 1.0
+
+ if shears is not None:
+ if len(shears) == 1:
+ shear = [sample_sym(shears[0]), 0.]
+ elif len(shears) == 2:
+ shear = [sample_sym(shears[0]), sample_sym(shears[1])]
+ else:
+ shear = 0.0
+
+ return angle, translations, scale, shear
+
+ def __call__(self, img):
+ src_h, src_w = img.shape[:2]
+ angle, translate, scale, shear = self.get_params(
+ self.degrees, self.translate, self.scale, self.shear, src_h)
+
+ M = self._get_inverse_affine_matrix((src_w / 2, src_h / 2), angle,
+ (0, 0), scale, shear)
+ M = np.array(M).reshape(2, 3)
+
+ startpoints = [(0, 0), (src_w - 1, 0), (src_w - 1, src_h - 1),
+ (0, src_h - 1)]
+ project = lambda x, y, a, b, c: int(a * x + b * y + c)
+ endpoints = [(project(x, y, *M[0]), project(x, y, *M[1]))
+ for x, y in startpoints]
+
+ rect = cv2.minAreaRect(np.array(endpoints))
+        bbox = cv2.boxPoints(rect).astype(np.int32)  # np.int was removed in NumPy 1.24
+ max_x, max_y = bbox[:, 0].max(), bbox[:, 1].max()
+ min_x, min_y = bbox[:, 0].min(), bbox[:, 1].min()
+
+ dst_w = int(max_x - min_x)
+ dst_h = int(max_y - min_y)
+ M[0, 2] += (dst_w - src_w) / 2
+ M[1, 2] += (dst_h - src_h) / 2
+
+ # add translate
+ dst_w += int(abs(translate[0]))
+ dst_h += int(abs(translate[1]))
+ if translate[0] < 0: M[0, 2] += abs(translate[0])
+ if translate[1] < 0: M[1, 2] += abs(translate[1])
+
+ flags = get_interpolation()
+ return cv2.warpAffine(
+ img,
+ M, (dst_w, dst_h),
+ flags=flags,
+ borderMode=cv2.BORDER_REPLICATE)
+
+
+class CVRandomPerspective(object):
+ def __init__(self, distortion=0.5):
+ self.distortion = distortion
+
+ def get_params(self, width, height, distortion):
+        offset_h = sample_asym(
+            distortion * height / 2, size=4).astype(np.int32)
+        offset_w = sample_asym(
+            distortion * width / 2, size=4).astype(np.int32)
+ topleft = (offset_w[0], offset_h[0])
+ topright = (width - 1 - offset_w[1], offset_h[1])
+ botright = (width - 1 - offset_w[2], height - 1 - offset_h[2])
+ botleft = (offset_w[3], height - 1 - offset_h[3])
+
+ startpoints = [(0, 0), (width - 1, 0), (width - 1, height - 1),
+ (0, height - 1)]
+ endpoints = [topleft, topright, botright, botleft]
+ return np.array(
+ startpoints, dtype=np.float32), np.array(
+ endpoints, dtype=np.float32)
+
+ def __call__(self, img):
+ height, width = img.shape[:2]
+ startpoints, endpoints = self.get_params(width, height, self.distortion)
+ M = cv2.getPerspectiveTransform(startpoints, endpoints)
+
+ # TODO: more robust way to crop image
+ rect = cv2.minAreaRect(endpoints)
+        bbox = cv2.boxPoints(rect).astype(np.int32)
+ max_x, max_y = bbox[:, 0].max(), bbox[:, 1].max()
+ min_x, min_y = bbox[:, 0].min(), bbox[:, 1].min()
+ min_x, min_y = max(min_x, 0), max(min_y, 0)
+
+ flags = get_interpolation()
+ img = cv2.warpPerspective(
+ img,
+ M, (max_x, max_y),
+ flags=flags,
+ borderMode=cv2.BORDER_REPLICATE)
+ img = img[min_y:, min_x:]
+ return img
+
+
+class CVRescale(object):
+ def __init__(self, factor=4, base_size=(128, 512)):
+ """ Define image scales using gaussian pyramid and rescale image to target scale.
+
+ Args:
+ factor: the decayed factor from base size, factor=4 keeps target scale by default.
+ base_size: base size the build the bottom layer of pyramid
+ """
+ if isinstance(factor, numbers.Number):
+ self.factor = round(sample_uniform(0, factor))
+ elif isinstance(factor, (tuple, list)) and len(factor) == 2:
+ self.factor = round(sample_uniform(factor[0], factor[1]))
+ else:
+            raise Exception('factor must be a number or a list/tuple of length 2')
+ # assert factor is valid
+ self.base_h, self.base_w = base_size[:2]
+
+ def __call__(self, img):
+ if self.factor == 0: return img
+ src_h, src_w = img.shape[:2]
+ cur_w, cur_h = self.base_w, self.base_h
+ scale_img = cv2.resize(
+ img, (cur_w, cur_h), interpolation=get_interpolation())
+ for _ in range(self.factor):
+ scale_img = cv2.pyrDown(scale_img)
+ scale_img = cv2.resize(
+ scale_img, (src_w, src_h), interpolation=get_interpolation())
+ return scale_img
+
+
+class CVGaussianNoise(object):
+ def __init__(self, mean=0, var=20):
+ self.mean = mean
+ if isinstance(var, numbers.Number):
+ self.var = max(int(sample_asym(var)), 1)
+ elif isinstance(var, (tuple, list)) and len(var) == 2:
+ self.var = int(sample_uniform(var[0], var[1]))
+ else:
+            raise Exception('var must be a number or a list/tuple of length 2')
+
+ def __call__(self, img):
+ noise = np.random.normal(self.mean, self.var**0.5, img.shape)
+ img = np.clip(img + noise, 0, 255).astype(np.uint8)
+ return img
+
+
+class CVMotionBlur(object):
+ def __init__(self, degrees=12, angle=90):
+ if isinstance(degrees, numbers.Number):
+ self.degree = max(int(sample_asym(degrees)), 1)
+ elif isinstance(degrees, (tuple, list)) and len(degrees) == 2:
+ self.degree = int(sample_uniform(degrees[0], degrees[1]))
+ else:
+ raise Exception('degree must be number or list with length 2')
+ self.angle = sample_uniform(-angle, angle)
+
+ def __call__(self, img):
+ M = cv2.getRotationMatrix2D((self.degree // 2, self.degree // 2),
+ self.angle, 1)
+ motion_blur_kernel = np.zeros((self.degree, self.degree))
+ motion_blur_kernel[self.degree // 2, :] = 1
+ motion_blur_kernel = cv2.warpAffine(motion_blur_kernel, M,
+ (self.degree, self.degree))
+ motion_blur_kernel = motion_blur_kernel / self.degree
+ img = cv2.filter2D(img, -1, motion_blur_kernel)
+ img = np.clip(img, 0, 255).astype(np.uint8)
+ return img
+
+
+class CVGeometry(object):
+ def __init__(self,
+ degrees=15,
+ translate=(0.3, 0.3),
+ scale=(0.5, 2.),
+ shear=(45, 15),
+ distortion=0.5,
+ p=0.5):
+ self.p = p
+ type_p = random.random()
+ if type_p < 0.33:
+ self.transforms = CVRandomRotation(degrees=degrees)
+ elif type_p < 0.66:
+ self.transforms = CVRandomAffine(
+ degrees=degrees, translate=translate, scale=scale, shear=shear)
+ else:
+ self.transforms = CVRandomPerspective(distortion=distortion)
+
+ def __call__(self, img):
+ if random.random() < self.p:
+ return self.transforms(img)
+ else:
+ return img
+
+
+class CVDeterioration(object):
+ def __init__(self, var, degrees, factor, p=0.5):
+ self.p = p
+ transforms = []
+ if var is not None:
+ transforms.append(CVGaussianNoise(var=var))
+ if degrees is not None:
+ transforms.append(CVMotionBlur(degrees=degrees))
+ if factor is not None:
+ transforms.append(CVRescale(factor=factor))
+
+ random.shuffle(transforms)
+ transforms = Compose(transforms)
+ self.transforms = transforms
+
+ def __call__(self, img):
+ if random.random() < self.p:
+ return self.transforms(img)
+ else:
+ return img
+
+
+class CVColorJitter(object):
+ def __init__(self,
+ brightness=0.5,
+ contrast=0.5,
+ saturation=0.5,
+ hue=0.1,
+ p=0.5):
+ self.p = p
+ self.transforms = ColorJitter(
+ brightness=brightness,
+ contrast=contrast,
+ saturation=saturation,
+ hue=hue)
+
+ def __call__(self, img):
+ if random.random() < self.p: return self.transforms(img)
+ else: return img
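Taken together, these classes form the ABINet augmentation pipeline that the new `ABINetRecAug` op in `rec_img_aug.py` (below) wires up. A minimal usage sketch, assuming a PaddlePaddle build whose vision transforms accept numpy HWC images; the parameters mirror the `ABINetRecAug` defaults:

```python
import numpy as np
from paddle.vision.transforms import Compose
from ppocr.data.imaug.abinet_aug import (CVGeometry, CVDeterioration,
                                         CVColorJitter)

aug = Compose([
    CVGeometry(degrees=45, translate=(0.0, 0.0), scale=(0.5, 2.),
               shear=(45, 15), distortion=0.5, p=0.5),
    CVDeterioration(var=20, degrees=6, factor=4, p=0.25),
    CVColorJitter(brightness=0.5, contrast=0.5, saturation=0.5, hue=0.1, p=0.25),
])

img = (np.random.rand(32, 100, 3) * 255).astype(np.uint8)  # dummy HWC text crop
out = aug(img)
print(out.shape)  # geometry ops may change the spatial size
```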
diff --git a/ppocr/data/imaug/fce_targets.py b/ppocr/data/imaug/fce_targets.py
index 4d1903c0a7316989ca24d8c4a934d7b99a2a2ff6..8c64276e26665d2779d35154bf9cd77edddad580 100644
--- a/ppocr/data/imaug/fce_targets.py
+++ b/ppocr/data/imaug/fce_targets.py
@@ -107,17 +107,20 @@ class FCENetTargets:
for i in range(1, n):
current_line_len = i * delta_length
- while current_line_len >= length_cumsum[current_edge_ind + 1]:
+ while current_edge_ind + 1 < len(length_cumsum) and current_line_len >= length_cumsum[current_edge_ind + 1]:
current_edge_ind += 1
+
current_edge_end_shift = current_line_len - length_cumsum[
current_edge_ind]
+
+ if current_edge_ind >= len(length_list):
+ break
end_shift_ratio = current_edge_end_shift / length_list[
current_edge_ind]
current_point = line[current_edge_ind] + (line[current_edge_ind + 1]
- line[current_edge_ind]
) * end_shift_ratio
resampled_line.append(current_point)
-
resampled_line.append(line[-1])
resampled_line = np.array(resampled_line)
@@ -328,6 +331,8 @@ class FCENetTargets:
resampled_top_line, resampled_bot_line = self.resample_sidelines(
top_line, bot_line, self.resample_step)
resampled_bot_line = resampled_bot_line[::-1]
+ if len(resampled_top_line) != len(resampled_bot_line):
+ continue
center_line = (resampled_top_line + resampled_bot_line) / 2
line_head_shrink_len = norm(resampled_top_line[0] -
diff --git a/ppocr/data/imaug/label_ops.py b/ppocr/data/imaug/label_ops.py
index d63901a03df045709632db1d5403de4368422339..007927011a5134129b322940418df8f124746c99 100644
--- a/ppocr/data/imaug/label_ops.py
+++ b/ppocr/data/imaug/label_ops.py
@@ -23,7 +23,6 @@ import string
from shapely.geometry import LineString, Point, Polygon
import json
import copy
-
from ppocr.utils.logging import get_logger
@@ -74,9 +73,10 @@ class DetLabelEncode(object):
s = pts.sum(axis=1)
rect[0] = pts[np.argmin(s)]
rect[2] = pts[np.argmax(s)]
- diff = np.diff(pts, axis=1)
- rect[1] = pts[np.argmin(diff)]
- rect[3] = pts[np.argmax(diff)]
+ tmp = np.delete(pts, (np.argmin(s), np.argmax(s)), axis=0)
+ diff = np.diff(np.array(tmp), axis=1)
+ rect[1] = tmp[np.argmin(diff)]
+ rect[3] = tmp[np.argmax(diff)]
return rect
def expand_points_num(self, boxes):
@@ -157,37 +157,6 @@ class BaseRecLabelEncode(object):
return text_list
-class NRTRLabelEncode(BaseRecLabelEncode):
- """ Convert between text-label and text-index """
-
- def __init__(self,
- max_text_length,
- character_dict_path=None,
- use_space_char=False,
- **kwargs):
-
- super(NRTRLabelEncode, self).__init__(
- max_text_length, character_dict_path, use_space_char)
-
- def __call__(self, data):
- text = data['label']
- text = self.encode(text)
- if text is None:
- return None
- if len(text) >= self.max_text_len - 1:
- return None
- data['length'] = np.array(len(text))
- text.insert(0, 2)
- text.append(3)
- text = text + [0] * (self.max_text_len - len(text))
- data['label'] = np.array(text)
- return data
-
- def add_special_char(self, dict_character):
-        dict_character = ['blank', '<unk>', '<s>', '</s>'] + dict_character
- return dict_character
-
-
class CTCLabelEncode(BaseRecLabelEncode):
""" Convert between text-label and text-index """
@@ -290,15 +259,26 @@ class E2ELabelEncodeTrain(object):
class KieLabelEncode(object):
- def __init__(self, character_dict_path, norm=10, directed=False, **kwargs):
+ def __init__(self,
+ character_dict_path,
+ class_path,
+ norm=10,
+ directed=False,
+ **kwargs):
super(KieLabelEncode, self).__init__()
self.dict = dict({'': 0})
+ self.label2classid_map = dict()
with open(character_dict_path, 'r', encoding='utf-8') as fr:
idx = 1
for line in fr:
char = line.strip()
self.dict[char] = idx
idx += 1
+ with open(class_path, "r") as fin:
+ lines = fin.readlines()
+ for idx, line in enumerate(lines):
+ line = line.strip("\n")
+ self.label2classid_map[line] = idx
self.norm = norm
self.directed = directed
@@ -438,10 +418,10 @@ class KieLabelEncode(object):
texts.append(ann['transcription'])
text_ind = [self.dict[c] for c in text if c in self.dict]
text_inds.append(text_ind)
- if 'label' in anno.keys():
- labels.append(ann['label'])
- elif 'key_cls' in anno.keys():
- labels.append(anno['key_cls'])
+ if 'label' in ann.keys():
+ labels.append(self.label2classid_map[ann['label']])
+ elif 'key_cls' in ann.keys():
+ labels.append(ann['key_cls'])
else:
raise ValueError(
"Cannot found 'key_cls' in ann.keys(), please check your training annotation."
@@ -937,15 +917,16 @@ class VQATokenLabelEncode(object):
for info in ocr_info:
if train_re:
# for re
- if len(info["text"]) == 0:
+ if len(info["transcription"]) == 0:
empty_entity.add(info["id"])
continue
id2label[info["id"]] = info["label"]
relations.extend([tuple(sorted(l)) for l in info["linking"]])
# smooth_box
+ info["bbox"] = self.trans_poly_to_bbox(info["points"])
bbox = self._smooth_box(info["bbox"], height, width)
- text = info["text"]
+ text = info["transcription"]
encode_res = self.tokenizer.encode(
text, pad_to_max_seq_len=False, return_attention_mask=True)
@@ -1005,29 +986,29 @@ class VQATokenLabelEncode(object):
data['entity_id_to_index_map'] = entity_id_to_index_map
return data
- def _load_ocr_info(self, data):
- def trans_poly_to_bbox(poly):
- x1 = np.min([p[0] for p in poly])
- x2 = np.max([p[0] for p in poly])
- y1 = np.min([p[1] for p in poly])
- y2 = np.max([p[1] for p in poly])
- return [x1, y1, x2, y2]
+ def trans_poly_to_bbox(self, poly):
+ x1 = np.min([p[0] for p in poly])
+ x2 = np.max([p[0] for p in poly])
+ y1 = np.min([p[1] for p in poly])
+ y2 = np.max([p[1] for p in poly])
+ return [x1, y1, x2, y2]
+ def _load_ocr_info(self, data):
if self.infer_mode:
ocr_result = self.ocr_engine.ocr(data['image'], cls=False)
ocr_info = []
for res in ocr_result:
ocr_info.append({
- "text": res[1][0],
- "bbox": trans_poly_to_bbox(res[0]),
- "poly": res[0],
+ "transcription": res[1][0],
+ "bbox": self.trans_poly_to_bbox(res[0]),
+ "points": res[0],
})
return ocr_info
else:
info = data['label']
# read text info
info_dict = json.loads(info)
- return info_dict["ocr_info"]
+ return info_dict
def _smooth_box(self, bbox, height, width):
bbox[0] = int(bbox[0] * 1000.0 / width)
@@ -1038,7 +1019,7 @@ class VQATokenLabelEncode(object):
def _parse_label(self, label, encode_res):
gt_label = []
- if label.lower() == "other":
+ if label.lower() in ["other", "others", "ignore"]:
gt_label.extend([0] * len(encode_res["input_ids"]))
else:
gt_label.append(self.label2id_map[("b-" + label).upper()])
@@ -1075,3 +1056,99 @@ class MultiLabelEncode(BaseRecLabelEncode):
data_out['label_sar'] = sar['label']
data_out['length'] = ctc['length']
return data_out
+
+
+class NRTRLabelEncode(BaseRecLabelEncode):
+ """ Convert between text-label and text-index """
+
+ def __init__(self,
+ max_text_length,
+ character_dict_path=None,
+ use_space_char=False,
+ **kwargs):
+
+ super(NRTRLabelEncode, self).__init__(
+ max_text_length, character_dict_path, use_space_char)
+
+ def __call__(self, data):
+ text = data['label']
+ text = self.encode(text)
+ if text is None:
+ return None
+ if len(text) >= self.max_text_len - 1:
+ return None
+ data['length'] = np.array(len(text))
+ text.insert(0, 2)
+ text.append(3)
+ text = text + [0] * (self.max_text_len - len(text))
+ data['label'] = np.array(text)
+ return data
+
+ def add_special_char(self, dict_character):
+        dict_character = ['blank', '<unk>', '<s>', '</s>'] + dict_character
+ return dict_character
+
+
+class ViTSTRLabelEncode(BaseRecLabelEncode):
+ """ Convert between text-label and text-index """
+
+ def __init__(self,
+ max_text_length,
+ character_dict_path=None,
+ use_space_char=False,
+ ignore_index=0,
+ **kwargs):
+
+ super(ViTSTRLabelEncode, self).__init__(
+ max_text_length, character_dict_path, use_space_char)
+ self.ignore_index = ignore_index
+
+ def __call__(self, data):
+ text = data['label']
+ text = self.encode(text)
+ if text is None:
+ return None
+ if len(text) >= self.max_text_len:
+ return None
+ data['length'] = np.array(len(text))
+ text.insert(0, self.ignore_index)
+ text.append(1)
+ text = text + [self.ignore_index] * (self.max_text_len + 2 - len(text))
+ data['label'] = np.array(text)
+ return data
+
+ def add_special_char(self, dict_character):
+        dict_character = ['<s>', '</s>'] + dict_character
+ return dict_character
+
+
+class ABINetLabelEncode(BaseRecLabelEncode):
+ """ Convert between text-label and text-index """
+
+ def __init__(self,
+ max_text_length,
+ character_dict_path=None,
+ use_space_char=False,
+ ignore_index=100,
+ **kwargs):
+
+ super(ABINetLabelEncode, self).__init__(
+ max_text_length, character_dict_path, use_space_char)
+ self.ignore_index = ignore_index
+
+ def __call__(self, data):
+ text = data['label']
+ text = self.encode(text)
+ if text is None:
+ return None
+ if len(text) >= self.max_text_len:
+ return None
+ data['length'] = np.array(len(text))
+ text.append(0)
+ text = text + [self.ignore_index] * (self.max_text_len + 1 - len(text))
+ data['label'] = np.array(text)
+ return data
+
+ def add_special_char(self, dict_character):
+        dict_character = ['</s>'] + dict_character
+ return dict_character
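The three re-added encoders differ only in their reserved tokens and padding: NRTR reserves `'blank'`, `'<unk>'`, `'<s>'`, `'</s>'` (indices 0-3), wraps the text in `<s>`/`</s>` and pads with 0; ViTSTR reserves `'<s>'`/`'</s>'` and pads with `ignore_index`; ABINet reserves only `'</s>'` (index 0) and pads with `ignore_index=100`. A toy sketch of the NRTR layout (the helper and the character ids are hypothetical, not the library API):

```python
def nrtr_label_layout(char_ids, max_text_len=10):
    """Illustrates NRTRLabelEncode.__call__: [<s>=2] + ids + [</s>=3], blank-padded."""
    label = [2] + list(char_ids) + [3]
    return label + [0] * (max_text_len - len(label))

# Two characters mapped to hypothetical dictionary indices 4 and 5:
print(nrtr_label_layout([4, 5]))  # [2, 4, 5, 3, 0, 0, 0, 0, 0, 0]
```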
diff --git a/ppocr/data/imaug/operators.py b/ppocr/data/imaug/operators.py
index 09736515e7a388e191a12826e1e9e348e2fcde86..04cc2848fb4d25baaf553c6eda235ddb0e86511f 100644
--- a/ppocr/data/imaug/operators.py
+++ b/ppocr/data/imaug/operators.py
@@ -67,39 +67,6 @@ class DecodeImage(object):
return data
-class NRTRDecodeImage(object):
- """ decode image """
-
- def __init__(self, img_mode='RGB', channel_first=False, **kwargs):
- self.img_mode = img_mode
- self.channel_first = channel_first
-
- def __call__(self, data):
- img = data['image']
- if six.PY2:
- assert type(img) is str and len(
- img) > 0, "invalid input 'img' in DecodeImage"
- else:
- assert type(img) is bytes and len(
- img) > 0, "invalid input 'img' in DecodeImage"
- img = np.frombuffer(img, dtype='uint8')
-
- img = cv2.imdecode(img, 1)
-
- if img is None:
- return None
- if self.img_mode == 'GRAY':
- img = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
- elif self.img_mode == 'RGB':
- assert img.shape[2] == 3, 'invalid shape of image[%s]' % (img.shape)
- img = img[:, :, ::-1]
- img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
- if self.channel_first:
- img = img.transpose((2, 0, 1))
- data['image'] = img
- return data
-
-
class NormalizeImage(object):
""" normalize image such as substract mean, divide std
"""
@@ -238,9 +205,12 @@ class DetResizeForTest(object):
def __init__(self, **kwargs):
super(DetResizeForTest, self).__init__()
self.resize_type = 0
+ self.keep_ratio = False
if 'image_shape' in kwargs:
self.image_shape = kwargs['image_shape']
self.resize_type = 1
+ if 'keep_ratio' in kwargs:
+ self.keep_ratio = kwargs['keep_ratio']
elif 'limit_side_len' in kwargs:
self.limit_side_len = kwargs['limit_side_len']
self.limit_type = kwargs.get('limit_type', 'min')
@@ -270,6 +240,10 @@ class DetResizeForTest(object):
def resize_image_type1(self, img):
resize_h, resize_w = self.image_shape
ori_h, ori_w = img.shape[:2] # (h, w, c)
+ if self.keep_ratio is True:
+ resize_w = ori_w * resize_h / ori_h
+ N = math.ceil(resize_w / 32)
+ resize_w = N * 32
ratio_h = float(resize_h) / ori_h
ratio_w = float(resize_w) / ori_w
img = cv2.resize(img, (int(resize_w), int(resize_h)))
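The new `keep_ratio` branch keeps the aspect ratio at a fixed target height and then rounds the width up to a multiple of 32, the stride divisibility detection backbones typically require. A worked sketch with hypothetical numbers:

```python
import math

# Hypothetical input: a 640x480 (w x h) image, target height 736, keep_ratio=True.
ori_w, ori_h, resize_h = 640, 480, 736
resize_w = ori_w * resize_h / ori_h        # 981.33..., aspect ratio preserved
resize_w = math.ceil(resize_w / 32) * 32   # 992, rounded up to a stride-32 multiple
print(resize_w, resize_h)                  # 992 736
```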
diff --git a/ppocr/data/imaug/rec_img_aug.py b/ppocr/data/imaug/rec_img_aug.py
index 7483dffe5b6d9a0a2204702757fcb49762a1cc7a..26773d0a516dfb0877453c7a5c8c8a2b5da92045 100644
--- a/ppocr/data/imaug/rec_img_aug.py
+++ b/ppocr/data/imaug/rec_img_aug.py
@@ -19,16 +19,109 @@ import random
import copy
from PIL import Image
from .text_image_aug import tia_perspective, tia_stretch, tia_distort
+from .abinet_aug import CVGeometry, CVDeterioration, CVColorJitter
+from paddle.vision.transforms import Compose
class RecAug(object):
- def __init__(self, use_tia=True, aug_prob=0.4, **kwargs):
- self.use_tia = use_tia
- self.aug_prob = aug_prob
+ def __init__(self,
+ tia_prob=0.4,
+ crop_prob=0.4,
+ reverse_prob=0.4,
+ noise_prob=0.4,
+ jitter_prob=0.4,
+ blur_prob=0.4,
+ hsv_aug_prob=0.4,
+ **kwargs):
+ self.tia_prob = tia_prob
+ self.bda = BaseDataAugmentation(crop_prob, reverse_prob, noise_prob,
+ jitter_prob, blur_prob, hsv_aug_prob)
def __call__(self, data):
img = data['image']
- img = warp(img, 10, self.use_tia, self.aug_prob)
+ h, w, _ = img.shape
+
+ # tia
+ if random.random() <= self.tia_prob:
+ if h >= 20 and w >= 20:
+ img = tia_distort(img, random.randint(3, 6))
+ img = tia_stretch(img, random.randint(3, 6))
+ img = tia_perspective(img)
+
+ # bda
+ data['image'] = img
+ data = self.bda(data)
+ return data
+
+
+class BaseDataAugmentation(object):
+ def __init__(self,
+ crop_prob=0.4,
+ reverse_prob=0.4,
+ noise_prob=0.4,
+ jitter_prob=0.4,
+ blur_prob=0.4,
+ hsv_aug_prob=0.4,
+ **kwargs):
+ self.crop_prob = crop_prob
+ self.reverse_prob = reverse_prob
+ self.noise_prob = noise_prob
+ self.jitter_prob = jitter_prob
+ self.blur_prob = blur_prob
+ self.hsv_aug_prob = hsv_aug_prob
+
+ def __call__(self, data):
+ img = data['image']
+ h, w, _ = img.shape
+
+ if random.random() <= self.crop_prob and h >= 20 and w >= 20:
+ img = get_crop(img)
+
+ if random.random() <= self.blur_prob:
+ img = blur(img)
+
+ if random.random() <= self.hsv_aug_prob:
+ img = hsv_aug(img)
+
+ if random.random() <= self.jitter_prob:
+ img = jitter(img)
+
+ if random.random() <= self.noise_prob:
+ img = add_gasuss_noise(img)
+
+ if random.random() <= self.reverse_prob:
+ img = 255 - img
+
+ data['image'] = img
+ return data
+
+
+class ABINetRecAug(object):
+ def __init__(self,
+ geometry_p=0.5,
+ deterioration_p=0.25,
+ colorjitter_p=0.25,
+ **kwargs):
+ self.transforms = Compose([
+ CVGeometry(
+ degrees=45,
+ translate=(0.0, 0.0),
+ scale=(0.5, 2.),
+ shear=(45, 15),
+ distortion=0.5,
+ p=geometry_p), CVDeterioration(
+ var=20, degrees=6, factor=4, p=deterioration_p),
+ CVColorJitter(
+ brightness=0.5,
+ contrast=0.5,
+ saturation=0.5,
+ hue=0.1,
+ p=colorjitter_p)
+ ])
+
+ def __call__(self, data):
+ img = data['image']
+ img = self.transforms(img)
data['image'] = img
return data
@@ -87,46 +180,6 @@ class ClsResizeImg(object):
return data
-class NRTRRecResizeImg(object):
- def __init__(self, image_shape, resize_type, padding=False, **kwargs):
- self.image_shape = image_shape
- self.resize_type = resize_type
- self.padding = padding
-
- def __call__(self, data):
- img = data['image']
- img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
- image_shape = self.image_shape
- if self.padding:
- imgC, imgH, imgW = image_shape
- # todo: change to 0 and modified image shape
- h = img.shape[0]
- w = img.shape[1]
- ratio = w / float(h)
- if math.ceil(imgH * ratio) > imgW:
- resized_w = imgW
- else:
- resized_w = int(math.ceil(imgH * ratio))
- resized_image = cv2.resize(img, (resized_w, imgH))
- norm_img = np.expand_dims(resized_image, -1)
- norm_img = norm_img.transpose((2, 0, 1))
- resized_image = norm_img.astype(np.float32) / 128. - 1.
- padding_im = np.zeros((imgC, imgH, imgW), dtype=np.float32)
- padding_im[:, :, 0:resized_w] = resized_image
- data['image'] = padding_im
- return data
- if self.resize_type == 'PIL':
- image_pil = Image.fromarray(np.uint8(img))
- img = image_pil.resize(self.image_shape, Image.ANTIALIAS)
- img = np.array(img)
- if self.resize_type == 'OpenCV':
- img = cv2.resize(img, self.image_shape)
- norm_img = np.expand_dims(img, -1)
- norm_img = norm_img.transpose((2, 0, 1))
- data['image'] = norm_img.astype(np.float32) / 128. - 1.
- return data
-
-
class RecResizeImg(object):
def __init__(self,
image_shape,
@@ -207,6 +260,84 @@ class PRENResizeImg(object):
return data
+class GrayRecResizeImg(object):
+ def __init__(self,
+ image_shape,
+ resize_type,
+ inter_type='Image.ANTIALIAS',
+ scale=True,
+ padding=False,
+ **kwargs):
+ self.image_shape = image_shape
+ self.resize_type = resize_type
+ self.padding = padding
+ self.inter_type = eval(inter_type)
+ self.scale = scale
+
+ def __call__(self, data):
+ img = data['image']
+ img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
+ image_shape = self.image_shape
+ if self.padding:
+ imgC, imgH, imgW = image_shape
+ # todo: change to 0 and modified image shape
+ h = img.shape[0]
+ w = img.shape[1]
+ ratio = w / float(h)
+ if math.ceil(imgH * ratio) > imgW:
+ resized_w = imgW
+ else:
+ resized_w = int(math.ceil(imgH * ratio))
+ resized_image = cv2.resize(img, (resized_w, imgH))
+ norm_img = np.expand_dims(resized_image, -1)
+ norm_img = norm_img.transpose((2, 0, 1))
+ resized_image = norm_img.astype(np.float32) / 128. - 1.
+ padding_im = np.zeros((imgC, imgH, imgW), dtype=np.float32)
+ padding_im[:, :, 0:resized_w] = resized_image
+ data['image'] = padding_im
+ return data
+ if self.resize_type == 'PIL':
+ image_pil = Image.fromarray(np.uint8(img))
+ img = image_pil.resize(self.image_shape, self.inter_type)
+ img = np.array(img)
+ if self.resize_type == 'OpenCV':
+ img = cv2.resize(img, self.image_shape)
+ norm_img = np.expand_dims(img, -1)
+ norm_img = norm_img.transpose((2, 0, 1))
+ if self.scale:
+ data['image'] = norm_img.astype(np.float32) / 128. - 1.
+ else:
+ data['image'] = norm_img.astype(np.float32) / 255.
+ return data
+
+
+class ABINetRecResizeImg(object):
+ def __init__(self, image_shape, **kwargs):
+ self.image_shape = image_shape
+
+ def __call__(self, data):
+ img = data['image']
+ norm_img, valid_ratio = resize_norm_img_abinet(img, self.image_shape)
+ data['image'] = norm_img
+ data['valid_ratio'] = valid_ratio
+ return data
+
+
+class SVTRRecResizeImg(object):
+ def __init__(self, image_shape, padding=True, **kwargs):
+ self.image_shape = image_shape
+ self.padding = padding
+
+ def __call__(self, data):
+ img = data['image']
+
+ norm_img, valid_ratio = resize_norm_img(img, self.image_shape,
+ self.padding)
+ data['image'] = norm_img
+ data['valid_ratio'] = valid_ratio
+ return data
+
+
def resize_norm_img_sar(img, image_shape, width_downsample_ratio=0.25):
imgC, imgH, imgW_min, imgW_max = image_shape
h = img.shape[0]
@@ -325,6 +456,26 @@ def resize_norm_img_srn(img, image_shape):
return np.reshape(img_black, (c, row, col)).astype(np.float32)
+def resize_norm_img_abinet(img, image_shape):
+ imgC, imgH, imgW = image_shape
+
+ resized_image = cv2.resize(
+ img, (imgW, imgH), interpolation=cv2.INTER_LINEAR)
+ resized_w = imgW
+ resized_image = resized_image.astype('float32')
+ resized_image = resized_image / 255.
+
+ mean = np.array([0.485, 0.456, 0.406])
+ std = np.array([0.229, 0.224, 0.225])
+ resized_image = (
+ resized_image - mean[None, None, ...]) / std[None, None, ...]
+ resized_image = resized_image.transpose((2, 0, 1))
+ resized_image = resized_image.astype('float32')
+
+ valid_ratio = min(1.0, float(resized_w / imgW))
+ return resized_image, valid_ratio
+
+
def srn_other_inputs(image_shape, num_heads, max_text_length):
imgC, imgH, imgW = image_shape
@@ -359,7 +510,7 @@ def flag():
return 1 if random.random() > 0.5000001 else -1
-def cvtColor(img):
+def hsv_aug(img):
"""
cvtColor
"""
@@ -427,50 +578,6 @@ def get_crop(image):
return crop_img
-class Config:
- """
- Config
- """
-
- def __init__(self, use_tia):
- self.anglex = random.random() * 30
- self.angley = random.random() * 15
- self.anglez = random.random() * 10
- self.fov = 42
- self.r = 0
- self.shearx = random.random() * 0.3
- self.sheary = random.random() * 0.05
- self.borderMode = cv2.BORDER_REPLICATE
- self.use_tia = use_tia
-
- def make(self, w, h, ang):
- """
- make
- """
- self.anglex = random.random() * 5 * flag()
- self.angley = random.random() * 5 * flag()
- self.anglez = -1 * random.random() * int(ang) * flag()
- self.fov = 42
- self.r = 0
- self.shearx = 0
- self.sheary = 0
- self.borderMode = cv2.BORDER_REPLICATE
- self.w = w
- self.h = h
-
- self.perspective = self.use_tia
- self.stretch = self.use_tia
- self.distort = self.use_tia
-
- self.crop = True
- self.affine = False
- self.reverse = True
- self.noise = True
- self.jitter = True
- self.blur = True
- self.color = True
-
-
def rad(x):
"""
rad
@@ -554,48 +661,3 @@ def get_warpAffine(config):
rz = np.array([[np.cos(rad(anglez)), np.sin(rad(anglez)), 0],
[-np.sin(rad(anglez)), np.cos(rad(anglez)), 0]], np.float32)
return rz
-
-
-def warp(img, ang, use_tia=True, prob=0.4):
- """
- warp
- """
- h, w, _ = img.shape
- config = Config(use_tia=use_tia)
- config.make(w, h, ang)
- new_img = img
-
- if config.distort:
- img_height, img_width = img.shape[0:2]
- if random.random() <= prob and img_height >= 20 and img_width >= 20:
- new_img = tia_distort(new_img, random.randint(3, 6))
-
- if config.stretch:
- img_height, img_width = img.shape[0:2]
- if random.random() <= prob and img_height >= 20 and img_width >= 20:
- new_img = tia_stretch(new_img, random.randint(3, 6))
-
- if config.perspective:
- if random.random() <= prob:
- new_img = tia_perspective(new_img)
-
- if config.crop:
- img_height, img_width = img.shape[0:2]
- if random.random() <= prob and img_height >= 20 and img_width >= 20:
- new_img = get_crop(new_img)
-
- if config.blur:
- if random.random() <= prob:
- new_img = blur(new_img)
- if config.color:
- if random.random() <= prob:
- new_img = cvtColor(new_img)
- if config.jitter:
- new_img = jitter(new_img)
- if config.noise:
- if random.random() <= prob:
- new_img = add_gasuss_noise(new_img)
- if config.reverse:
- if random.random() <= prob:
- new_img = 255 - new_img
- return new_img
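With the old `Config`/`warp` pair removed, the photometric ops now live in `BaseDataAugmentation`, and `RecAug` simply prepends the optional TIA warps. A minimal sketch of the refactored entry point, with a random dummy crop standing in for a real text-line image:

```python
import numpy as np
from ppocr.data.imaug.rec_img_aug import RecAug

aug = RecAug(tia_prob=0.4)  # the per-op probabilities keep their 0.4 defaults
data = {'image': (np.random.rand(48, 320, 3) * 255).astype(np.uint8)}
out = aug(data)
print(out['image'].shape, out['image'].dtype)
```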
diff --git a/ppocr/data/imaug/vqa/__init__.py b/ppocr/data/imaug/vqa/__init__.py
index a5025e7985198e7ee40d6c92d8e1814eb1797032..bde175115536a3f644750260082204fe5f10dc05 100644
--- a/ppocr/data/imaug/vqa/__init__.py
+++ b/ppocr/data/imaug/vqa/__init__.py
@@ -13,7 +13,12 @@
# limitations under the License.
from .token import VQATokenPad, VQASerTokenChunk, VQAReTokenChunk, VQAReTokenRelation
+from .augment import DistortBBox
__all__ = [
- 'VQATokenPad', 'VQASerTokenChunk', 'VQAReTokenChunk', 'VQAReTokenRelation'
+ 'VQATokenPad',
+ 'VQASerTokenChunk',
+ 'VQAReTokenChunk',
+ 'VQAReTokenRelation',
+ 'DistortBBox',
]
diff --git a/ppocr/data/imaug/vqa/augment.py b/ppocr/data/imaug/vqa/augment.py
new file mode 100644
index 0000000000000000000000000000000000000000..fcdc9685e9855c3a2d8e9f6f5add270f95f15a6c
--- /dev/null
+++ b/ppocr/data/imaug/vqa/augment.py
@@ -0,0 +1,37 @@
+# copyright (c) 2022 PaddlePaddle Authors. All Rights Reserve.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import numpy as np
+import random
+
+
+class DistortBBox:
+ def __init__(self, prob=0.5, max_scale=1, **kwargs):
+ """Random distort bbox
+ """
+ self.prob = prob
+ self.max_scale = max_scale
+
+ def __call__(self, data):
+ if random.random() > self.prob:
+ return data
+ bbox = np.array(data['bbox'])
+ rnd_scale = (np.random.rand(*bbox.shape) - 0.5) * 2 * self.max_scale
+        bbox = np.round(bbox + rnd_scale).astype(bbox.dtype)
+        bbox = np.clip(bbox, 0, 1000)  # clip the jittered boxes, not the originals
+        data['bbox'] = bbox.tolist()
+ return data
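A quick sketch of the new op on a made-up SER-style sample; with the clip applied to the jittered coordinates as above, each value moves by at most `max_scale` pixels and stays inside the 0-1000 normalized range:

```python
import random
import numpy as np
from ppocr.data.imaug.vqa.augment import DistortBBox

random.seed(0)
np.random.seed(0)
op = DistortBBox(prob=1.0, max_scale=2)  # always fires, +/-2 px jitter
data = {'bbox': [[10, 20, 110, 60], [15, 30, 90, 55]]}
print(op(data)['bbox'])
```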
diff --git a/ppocr/data/pubtab_dataset.py b/ppocr/data/pubtab_dataset.py
index 105f28db420631e5b6b2e527b5a6536e03d18f7d..642d3eb1961cbf0e829e6fb122f38c6af99df1c5 100644
--- a/ppocr/data/pubtab_dataset.py
+++ b/ppocr/data/pubtab_dataset.py
@@ -120,7 +120,8 @@ class PubTabDataSet(Dataset):
import traceback
err = traceback.format_exc()
self.logger.error(
- "When parsing line {}, error happened with msg: {}".format(err))
+ "When parsing line {}, error happened with msg: {}".format(
+ data_line, err))
outs = None
if outs is None:
rnd_idx = np.random.randint(self.__len__(
diff --git a/ppocr/data/simple_dataset.py b/ppocr/data/simple_dataset.py
index b5da9b8898423facf888839f941dff01caa03643..402f1e38fed9e32722e2dd160f10f779028807a3 100644
--- a/ppocr/data/simple_dataset.py
+++ b/ppocr/data/simple_dataset.py
@@ -33,7 +33,7 @@ class SimpleDataSet(Dataset):
self.delimiter = dataset_config.get('delimiter', '\t')
label_file_list = dataset_config.pop('label_file_list')
data_source_num = len(label_file_list)
- ratio_list = dataset_config.get("ratio_list", [1.0])
+ ratio_list = dataset_config.get("ratio_list", 1.0)
if isinstance(ratio_list, (float, int)):
ratio_list = [float(ratio_list)] * int(data_source_num)
diff --git a/ppocr/losses/__init__.py b/ppocr/losses/__init__.py
index 0f208007b193888b12919547a02c7ea074f01c90..62e0544ea94daaaff7d019e6a48e65a2d508aca0 100755
--- a/ppocr/losses/__init__.py
+++ b/ppocr/losses/__init__.py
@@ -30,7 +30,7 @@ from .det_fce_loss import FCELoss
from .rec_ctc_loss import CTCLoss
from .rec_att_loss import AttentionLoss
from .rec_srn_loss import SRNLoss
-from .rec_nrtr_loss import NRTRLoss
+from .rec_ce_loss import CELoss
from .rec_sar_loss import SARLoss
from .rec_aster_loss import AsterLoss
from .rec_pren_loss import PRENLoss
@@ -60,7 +60,7 @@ def build_loss(config):
support_dict = [
'DBLoss', 'PSELoss', 'EASTLoss', 'SASTLoss', 'FCELoss', 'CTCLoss',
'ClsLoss', 'AttentionLoss', 'SRNLoss', 'PGLoss', 'CombinedLoss',
- 'NRTRLoss', 'TableAttentionLoss', 'SARLoss', 'AsterLoss', 'SDMGRLoss',
+ 'CELoss', 'TableAttentionLoss', 'SARLoss', 'AsterLoss', 'SDMGRLoss',
'VQASerTokenLayoutLMLoss', 'LossFromOutput', 'PRENLoss', 'MultiLoss',
'TableMasterLoss'
]
diff --git a/ppocr/losses/rec_aster_loss.py b/ppocr/losses/rec_aster_loss.py
index fbb99d29a638540b02649a8912051339c08b22dd..52605e46db35339cc22f7f1e6642456bfaf02f11 100644
--- a/ppocr/losses/rec_aster_loss.py
+++ b/ppocr/losses/rec_aster_loss.py
@@ -27,12 +27,12 @@ class CosineEmbeddingLoss(nn.Layer):
self.epsilon = 1e-12
def forward(self, x1, x2, target):
-        similarity = paddle.fluid.layers.reduce_sum(
-            x1 * x2, dim=-1) / (paddle.norm(
+        similarity = paddle.sum(
+            x1 * x2, axis=-1) / (paddle.norm(
x1, axis=-1) * paddle.norm(
x2, axis=-1) + self.epsilon)
one_list = paddle.full_like(target, fill_value=1)
- out = paddle.fluid.layers.reduce_mean(
+ out = paddle.mean(
paddle.where(
paddle.equal(target, one_list), 1. - similarity,
paddle.maximum(
diff --git a/ppocr/losses/rec_ce_loss.py b/ppocr/losses/rec_ce_loss.py
new file mode 100644
index 0000000000000000000000000000000000000000..614384de863c15b106aef831f8e938b89dadc246
--- /dev/null
+++ b/ppocr/losses/rec_ce_loss.py
@@ -0,0 +1,66 @@
+import paddle
+from paddle import nn
+import paddle.nn.functional as F
+
+
+class CELoss(nn.Layer):
+ def __init__(self,
+ smoothing=False,
+ with_all=False,
+ ignore_index=-1,
+ **kwargs):
+ super(CELoss, self).__init__()
+ if ignore_index >= 0:
+ self.loss_func = nn.CrossEntropyLoss(
+ reduction='mean', ignore_index=ignore_index)
+ else:
+ self.loss_func = nn.CrossEntropyLoss(reduction='mean')
+ self.smoothing = smoothing
+ self.with_all = with_all
+
+ def forward(self, pred, batch):
+
+ if isinstance(pred, dict): # for ABINet
+ loss = {}
+ loss_sum = []
+ for name, logits in pred.items():
+ if isinstance(logits, list):
+ logit_num = len(logits)
+ all_tgt = paddle.concat([batch[1]] * logit_num, 0)
+ all_logits = paddle.concat(logits, 0)
+                    flt_logits = all_logits.reshape([-1, all_logits.shape[2]])
+                    flt_tgt = all_tgt.reshape([-1])
+                else:
+                    flt_logits = logits.reshape([-1, logits.shape[2]])
+                    flt_tgt = batch[1].reshape([-1])
+                loss[name + '_loss'] = self.loss_func(flt_logits, flt_tgt)
+ loss_sum.append(loss[name + '_loss'])
+ loss['loss'] = sum(loss_sum)
+ return loss
+ else:
+ if self.with_all: # for ViTSTR
+ tgt = batch[1]
+ pred = pred.reshape([-1, pred.shape[2]])
+ tgt = tgt.reshape([-1])
+ loss = self.loss_func(pred, tgt)
+ return {'loss': loss}
+ else: # for NRTR
+ max_len = batch[2].max()
+ tgt = batch[1][:, 1:2 + max_len]
+ pred = pred.reshape([-1, pred.shape[2]])
+ tgt = tgt.reshape([-1])
+ if self.smoothing:
+ eps = 0.1
+ n_class = pred.shape[1]
+ one_hot = F.one_hot(tgt, pred.shape[1])
+ one_hot = one_hot * (1 - eps) + (1 - one_hot) * eps / (
+ n_class - 1)
+ log_prb = F.log_softmax(pred, axis=1)
+ non_pad_mask = paddle.not_equal(
+ tgt, paddle.zeros(
+ tgt.shape, dtype=tgt.dtype))
+ loss = -(one_hot * log_prb).sum(axis=1)
+ loss = loss.masked_select(non_pad_mask).mean()
+ else:
+ loss = self.loss_func(pred, tgt)
+ return {'loss': loss}
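The merged loss keeps three entry points: a dict of (possibly list-valued) logits for ABINet, `with_all=True` for ViTSTR, and the default NRTR path that slices the `<s>`-prefixed target by the batch's maximum text length. A minimal sketch of the NRTR-style call with made-up shapes (2 samples, 13 decoding steps, 38 classes):

```python
import paddle
from ppocr.losses.rec_ce_loss import CELoss

loss_fn = CELoss(smoothing=True)                  # NRTR-style branch
pred = paddle.randn([2, 13, 38])                  # (B, L, num_classes)
label = paddle.randint(1, 38, [2, 14])            # <s>-prefixed, padded targets
length = paddle.to_tensor([12, 12], dtype='int64')
print(loss_fn(pred, [None, label, length]))       # -> {'loss': Tensor}
```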
diff --git a/ppocr/losses/rec_nrtr_loss.py b/ppocr/losses/rec_nrtr_loss.py
deleted file mode 100644
index 200a6d0486dbf6f76dd674eb58f641b31a70f31c..0000000000000000000000000000000000000000
--- a/ppocr/losses/rec_nrtr_loss.py
+++ /dev/null
@@ -1,30 +0,0 @@
-import paddle
-from paddle import nn
-import paddle.nn.functional as F
-
-
-class NRTRLoss(nn.Layer):
- def __init__(self, smoothing=True, **kwargs):
- super(NRTRLoss, self).__init__()
- self.loss_func = nn.CrossEntropyLoss(reduction='mean', ignore_index=0)
- self.smoothing = smoothing
-
- def forward(self, pred, batch):
- pred = pred.reshape([-1, pred.shape[2]])
- max_len = batch[2].max()
- tgt = batch[1][:, 1:2 + max_len]
- tgt = tgt.reshape([-1])
- if self.smoothing:
- eps = 0.1
- n_class = pred.shape[1]
- one_hot = F.one_hot(tgt, pred.shape[1])
- one_hot = one_hot * (1 - eps) + (1 - one_hot) * eps / (n_class - 1)
- log_prb = F.log_softmax(pred, axis=1)
- non_pad_mask = paddle.not_equal(
- tgt, paddle.zeros(
- tgt.shape, dtype=tgt.dtype))
- loss = -(one_hot * log_prb).sum(axis=1)
- loss = loss.masked_select(non_pad_mask).mean()
- else:
- loss = self.loss_func(pred, tgt)
- return {'loss': loss}
diff --git a/ppocr/losses/table_att_loss.py b/ppocr/losses/table_att_loss.py
index 4bdccad3998c00bfc2b0ef12bec2983d2953fdb3..3496c9072553d839017eaa017fe47dfb66fb9d3b 100644
--- a/ppocr/losses/table_att_loss.py
+++ b/ppocr/losses/table_att_loss.py
@@ -19,7 +19,6 @@ from __future__ import print_function
import paddle
from paddle import nn
from paddle.nn import functional as F
-from paddle import fluid
class TableAttentionLoss(nn.Layer):
@@ -42,13 +41,13 @@ class TableAttentionLoss(nn.Layer):
:param bbox:[[x1,y1,x2,y2], [x1,y1,x2,y2],,,]
:return: loss
'''
- ix1 = fluid.layers.elementwise_max(preds[:, 0], bbox[:, 0])
- iy1 = fluid.layers.elementwise_max(preds[:, 1], bbox[:, 1])
- ix2 = fluid.layers.elementwise_min(preds[:, 2], bbox[:, 2])
- iy2 = fluid.layers.elementwise_min(preds[:, 3], bbox[:, 3])
+ ix1 = paddle.maximum(preds[:, 0], bbox[:, 0])
+ iy1 = paddle.maximum(preds[:, 1], bbox[:, 1])
+ ix2 = paddle.minimum(preds[:, 2], bbox[:, 2])
+ iy2 = paddle.minimum(preds[:, 3], bbox[:, 3])
- iw = fluid.layers.clip(ix2 - ix1 + 1e-3, 0., 1e10)
- ih = fluid.layers.clip(iy2 - iy1 + 1e-3, 0., 1e10)
+ iw = paddle.clip(ix2 - ix1 + 1e-3, 0., 1e10)
+ ih = paddle.clip(iy2 - iy1 + 1e-3, 0., 1e10)
# overlap
inters = iw * ih
@@ -62,12 +61,12 @@ class TableAttentionLoss(nn.Layer):
# ious
ious = inters / uni
- ex1 = fluid.layers.elementwise_min(preds[:, 0], bbox[:, 0])
- ey1 = fluid.layers.elementwise_min(preds[:, 1], bbox[:, 1])
- ex2 = fluid.layers.elementwise_max(preds[:, 2], bbox[:, 2])
- ey2 = fluid.layers.elementwise_max(preds[:, 3], bbox[:, 3])
- ew = fluid.layers.clip(ex2 - ex1 + 1e-3, 0., 1e10)
- eh = fluid.layers.clip(ey2 - ey1 + 1e-3, 0., 1e10)
+ ex1 = paddle.minimum(preds[:, 0], bbox[:, 0])
+ ey1 = paddle.minimum(preds[:, 1], bbox[:, 1])
+ ex2 = paddle.maximum(preds[:, 2], bbox[:, 2])
+ ey2 = paddle.maximum(preds[:, 3], bbox[:, 3])
+ ew = paddle.clip(ex2 - ex1 + 1e-3, 0., 1e10)
+ eh = paddle.clip(ey2 - ey1 + 1e-3, 0., 1e10)
# enclose erea
enclose = ew * eh + eps
diff --git a/ppocr/losses/vqa_token_layoutlm_loss.py b/ppocr/losses/vqa_token_layoutlm_loss.py
index 244893d97d0e422c5ca270bdece689e13aba2b07..f9cd4634731a26dd990d6ffac3d8defc8cdf7e97 100755
--- a/ppocr/losses/vqa_token_layoutlm_loss.py
+++ b/ppocr/losses/vqa_token_layoutlm_loss.py
@@ -27,8 +27,8 @@ class VQASerTokenLayoutLMLoss(nn.Layer):
self.ignore_index = self.loss_class.ignore_index
def forward(self, predicts, batch):
- labels = batch[1]
- attention_mask = batch[4]
+ labels = batch[5]
+ attention_mask = batch[2]
if attention_mask is not None:
active_loss = attention_mask.reshape([-1, ]) == 1
active_outputs = predicts.reshape(
diff --git a/ppocr/metrics/table_metric.py b/ppocr/metrics/table_metric.py
index 26f577a03aeba0977eda2866a9046715f03a1f63..fb0075f7cbecad7d58679c5338390e7bf6d99a08 100644
--- a/ppocr/metrics/table_metric.py
+++ b/ppocr/metrics/table_metric.py
@@ -63,7 +63,7 @@ class TableMetric(object):
def __init__(self,
main_indicator='acc',
compute_bbox_metric=False,
- point_num=4,
+ point_num=2,
**kwargs):
"""
diff --git a/ppocr/modeling/architectures/__init__.py b/ppocr/modeling/architectures/__init__.py
index 3f47f64a0f01c2267c7ff4aecb3815915b24dadb..1c955ef3abe9c38e816616cc9b5399c6832aa5f1 100755
--- a/ppocr/modeling/architectures/__init__.py
+++ b/ppocr/modeling/architectures/__init__.py
@@ -40,11 +40,29 @@ def apply_to_static(model, config, logger):
return model
assert "image_shape" in config[
"Global"], "image_shape must be assigned for static training mode..."
- supported_list = ["DB"]
- assert config["Architecture"][
- "algorithm"] in supported_list, f"algorithms that supports static training must in in {supported_list} but got {config['Architecture']['algorithm']}"
+ supported_list = ["DB", "SVTR"]
+ if config["Architecture"]["algorithm"] in ["Distillation"]:
+ algo = list(config["Architecture"]["Models"].values())[0]["algorithm"]
+ else:
+ algo = config["Architecture"]["algorithm"]
+    assert algo in supported_list, f"algorithms that support static training must be in {supported_list} but got {algo}"
+
+ specs = [
+ InputSpec(
+ [None] + config["Global"]["image_shape"], dtype='float32')
+ ]
+
+ if algo == "SVTR":
+ specs.append([
+ InputSpec(
+ [None, config["Global"]["max_text_length"]],
+ dtype='int64'), InputSpec(
+ [None, config["Global"]["max_text_length"]], dtype='int64'),
+ InputSpec(
+ [None], dtype='int64'), InputSpec(
+ [None], dtype='float64')
+ ])
- specs = [InputSpec([None] + config["Global"]["image_shape"])]
model = to_static(model, input_spec=specs)
logger.info("Successfully to apply @to_static with specs: {}".format(specs))
return model
diff --git a/ppocr/modeling/backbones/__init__.py b/ppocr/modeling/backbones/__init__.py
index 2b5fd9142eaa06947439a9c0b9a64ebf28c420f6..f4094d796b1f14c955e5962936e86bd6b3f5ec78 100755
--- a/ppocr/modeling/backbones/__init__.py
+++ b/ppocr/modeling/backbones/__init__.py
@@ -18,12 +18,13 @@ __all__ = ["build_backbone"]
def build_backbone(config, model_type):
if model_type == "det" or model_type == "table":
from .det_mobilenet_v3 import MobileNetV3
- from .det_resnet_vd import ResNet
+ from .det_resnet import ResNet
+ from .det_resnet_vd import ResNet_vd
from .det_resnet_vd_sast import ResNet_SAST
- from .table_master_resnet import TableResNetExtra
- support_dict = [
- "MobileNetV3", "ResNet", "ResNet_SAST", "TableResNetExtra"
- ]
+ support_dict = ["MobileNetV3", "ResNet", "ResNet_vd", "ResNet_SAST"]
+ if model_type == "table":
+ from .table_master_resnet import TableResNetExtra
+ support_dict.append('TableResNetExtra')
elif model_type == "rec" or model_type == "cls":
from .rec_mobilenet_v3 import MobileNetV3
from .rec_resnet_vd import ResNet
@@ -31,35 +32,37 @@ def build_backbone(config, model_type):
from .rec_mv1_enhance import MobileNetV1Enhance
from .rec_nrtr_mtb import MTB
from .rec_resnet_31 import ResNet31
+ from .rec_resnet_45 import ResNet45
from .rec_resnet_aster import ResNet_ASTER
from .rec_micronet import MicroNet
from .rec_efficientb3_pren import EfficientNetb3_PREN
from .rec_svtrnet import SVTRNet
+ from .rec_vitstr import ViTSTR
support_dict = [
'MobileNetV1Enhance', 'MobileNetV3', 'ResNet', 'ResNetFPN', 'MTB',
- "ResNet31", "ResNet_ASTER", 'MicroNet', 'EfficientNetb3_PREN',
- 'SVTRNet'
+ 'ResNet31', 'ResNet45', 'ResNet_ASTER', 'MicroNet',
+ 'EfficientNetb3_PREN', 'SVTRNet', 'ViTSTR'
]
- elif model_type == "e2e":
+ elif model_type == 'e2e':
from .e2e_resnet_vd_pg import ResNet
support_dict = ['ResNet']
elif model_type == 'kie':
from .kie_unet_sdmgr import Kie_backbone
support_dict = ['Kie_backbone']
- elif model_type == "table":
+ elif model_type == 'table':
from .table_resnet_vd import ResNet
from .table_mobilenet_v3 import MobileNetV3
- support_dict = ["ResNet", "MobileNetV3"]
+ support_dict = ['ResNet', 'MobileNetV3']
elif model_type == 'vqa':
from .vqa_layoutlm import LayoutLMForSer, LayoutLMv2ForSer, LayoutLMv2ForRe, LayoutXLMForSer, LayoutXLMForRe
support_dict = [
- "LayoutLMForSer", "LayoutLMv2ForSer", 'LayoutLMv2ForRe',
- "LayoutXLMForSer", 'LayoutXLMForRe'
+ 'LayoutLMForSer', 'LayoutLMv2ForSer', 'LayoutLMv2ForRe',
+ 'LayoutXLMForSer', 'LayoutXLMForRe'
]
else:
raise NotImplementedError
- module_name = config.pop("name")
+ module_name = config.pop('name')
assert module_name in support_dict, Exception(
"when model typs is {}, backbone only support {}".format(model_type,
support_dict))
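With the rename, detection configs that used to resolve ResNet to the vd variant must now say so explicitly. A hypothetical call, assuming the builder forwards the remaining config keys to the class constructor:

backbone_vd = build_backbone({'name': 'ResNet_vd', 'layers': 50}, model_type='det')
backbone_plain = build_backbone({'name': 'ResNet', 'layers': 50}, model_type='det')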
diff --git a/ppocr/modeling/backbones/det_resnet.py b/ppocr/modeling/backbones/det_resnet.py
new file mode 100644
index 0000000000000000000000000000000000000000..87eef11cf0e33c24c0f539c8074b21f589345282
--- /dev/null
+++ b/ppocr/modeling/backbones/det_resnet.py
@@ -0,0 +1,236 @@
+# copyright (c) 2022 PaddlePaddle Authors. All Rights Reserve.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import numpy as np
+import paddle
+from paddle import ParamAttr
+import paddle.nn as nn
+import paddle.nn.functional as F
+from paddle.nn import Conv2D, BatchNorm, Linear, Dropout
+from paddle.nn import AdaptiveAvgPool2D, MaxPool2D, AvgPool2D
+from paddle.nn.initializer import Uniform
+
+import math
+
+from paddle.vision.ops import DeformConv2D
+from paddle.regularizer import L2Decay
+from paddle.nn.initializer import Normal, Constant, XavierUniform
+from .det_resnet_vd import DeformableConvV2, ConvBNLayer
+
+
+class BottleneckBlock(nn.Layer):
+ def __init__(self,
+ num_channels,
+ num_filters,
+ stride,
+ shortcut=True,
+ is_dcn=False):
+ super(BottleneckBlock, self).__init__()
+
+ self.conv0 = ConvBNLayer(
+ in_channels=num_channels,
+ out_channels=num_filters,
+ kernel_size=1,
+ act="relu", )
+ self.conv1 = ConvBNLayer(
+ in_channels=num_filters,
+ out_channels=num_filters,
+ kernel_size=3,
+ stride=stride,
+ act="relu",
+ is_dcn=is_dcn,
+ dcn_groups=1, )
+ self.conv2 = ConvBNLayer(
+ in_channels=num_filters,
+ out_channels=num_filters * 4,
+ kernel_size=1,
+ act=None, )
+
+ if not shortcut:
+ self.short = ConvBNLayer(
+ in_channels=num_channels,
+ out_channels=num_filters * 4,
+ kernel_size=1,
+ stride=stride, )
+
+ self.shortcut = shortcut
+
+ self._num_channels_out = num_filters * 4
+
+ def forward(self, inputs):
+ y = self.conv0(inputs)
+ conv1 = self.conv1(y)
+ conv2 = self.conv2(conv1)
+
+ if self.shortcut:
+ short = inputs
+ else:
+ short = self.short(inputs)
+
+ y = paddle.add(x=short, y=conv2)
+ y = F.relu(y)
+ return y
+
+
+class BasicBlock(nn.Layer):
+ def __init__(self,
+ num_channels,
+ num_filters,
+ stride,
+ shortcut=True,
+ name=None):
+ super(BasicBlock, self).__init__()
+ self.stride = stride
+ self.conv0 = ConvBNLayer(
+ in_channels=num_channels,
+ out_channels=num_filters,
+ kernel_size=3,
+ stride=stride,
+ act="relu")
+ self.conv1 = ConvBNLayer(
+ in_channels=num_filters,
+ out_channels=num_filters,
+ kernel_size=3,
+ act=None)
+
+ if not shortcut:
+ self.short = ConvBNLayer(
+ in_channels=num_channels,
+ out_channels=num_filters,
+ kernel_size=1,
+ stride=stride)
+
+ self.shortcut = shortcut
+
+ def forward(self, inputs):
+ y = self.conv0(inputs)
+ conv1 = self.conv1(y)
+
+ if self.shortcut:
+ short = inputs
+ else:
+ short = self.short(inputs)
+ y = paddle.add(x=short, y=conv1)
+ y = F.relu(y)
+ return y
+
+
+class ResNet(nn.Layer):
+ def __init__(self,
+ in_channels=3,
+ layers=50,
+ out_indices=None,
+ dcn_stage=None):
+ super(ResNet, self).__init__()
+
+ self.layers = layers
+ self.input_image_channel = in_channels
+
+ supported_layers = [18, 34, 50, 101, 152]
+ assert layers in supported_layers, \
+ "supported layers are {} but input layer is {}".format(
+ supported_layers, layers)
+
+ if layers == 18:
+ depth = [2, 2, 2, 2]
+ elif layers == 34 or layers == 50:
+ depth = [3, 4, 6, 3]
+ elif layers == 101:
+ depth = [3, 4, 23, 3]
+ elif layers == 152:
+ depth = [3, 8, 36, 3]
+ num_channels = [64, 256, 512,
+ 1024] if layers >= 50 else [64, 64, 128, 256]
+ num_filters = [64, 128, 256, 512]
+
+ self.dcn_stage = dcn_stage if dcn_stage is not None else [
+ False, False, False, False
+ ]
+ self.out_indices = out_indices if out_indices is not None else [
+ 0, 1, 2, 3
+ ]
+
+ self.conv = ConvBNLayer(
+ in_channels=self.input_image_channel,
+ out_channels=64,
+ kernel_size=7,
+ stride=2,
+ act="relu", )
+ self.pool2d_max = MaxPool2D(
+ kernel_size=3,
+ stride=2,
+ padding=1, )
+
+ self.stages = []
+ self.out_channels = []
+ if layers >= 50:
+ for block in range(len(depth)):
+ shortcut = False
+ block_list = []
+ is_dcn = self.dcn_stage[block]
+ for i in range(depth[block]):
+ if layers in [101, 152] and block == 2:
+ if i == 0:
+ conv_name = "res" + str(block + 2) + "a"
+ else:
+ conv_name = "res" + str(block + 2) + "b" + str(i)
+ else:
+ conv_name = "res" + str(block + 2) + chr(97 + i)
+ bottleneck_block = self.add_sublayer(
+ conv_name,
+ BottleneckBlock(
+ num_channels=num_channels[block]
+ if i == 0 else num_filters[block] * 4,
+ num_filters=num_filters[block],
+ stride=2 if i == 0 and block != 0 else 1,
+ shortcut=shortcut,
+ is_dcn=is_dcn))
+ block_list.append(bottleneck_block)
+ shortcut = True
+ if block in self.out_indices:
+ self.out_channels.append(num_filters[block] * 4)
+ self.stages.append(nn.Sequential(*block_list))
+ else:
+ for block in range(len(depth)):
+ shortcut = False
+ block_list = []
+ for i in range(depth[block]):
+ conv_name = "res" + str(block + 2) + chr(97 + i)
+ basic_block = self.add_sublayer(
+ conv_name,
+ BasicBlock(
+ num_channels=num_channels[block]
+ if i == 0 else num_filters[block],
+ num_filters=num_filters[block],
+ stride=2 if i == 0 and block != 0 else 1,
+ shortcut=shortcut))
+ block_list.append(basic_block)
+ shortcut = True
+ if block in self.out_indices:
+ self.out_channels.append(num_filters[block])
+ self.stages.append(nn.Sequential(*block_list))
+
+ def forward(self, inputs):
+ y = self.conv(inputs)
+ y = self.pool2d_max(y)
+ out = []
+ for i, block in enumerate(self.stages):
+ y = block(y)
+ if i in self.out_indices:
+ out.append(y)
+ return out
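A quick shape check for the new backbone (input size illustrative): the 7x7 stride-2 stem plus the stride-2 max-pool give stride 4 at the first stage, so the four stages emit features at strides 4/8/16/32:

import paddle

model = ResNet(in_channels=3, layers=50)
feats = model(paddle.randn([1, 3, 640, 640]))
print([f.shape for f in feats])
# [[1, 256, 160, 160], [1, 512, 80, 80], [1, 1024, 40, 40], [1, 2048, 20, 20]]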
diff --git a/ppocr/modeling/backbones/det_resnet_vd.py b/ppocr/modeling/backbones/det_resnet_vd.py
index 8c955a4af377374f21e7c09f0d10952f2fe1ceed..a421da0ab440e9b87c1c7efc7d2448f8f76ad205 100644
--- a/ppocr/modeling/backbones/det_resnet_vd.py
+++ b/ppocr/modeling/backbones/det_resnet_vd.py
@@ -25,7 +25,7 @@ from paddle.vision.ops import DeformConv2D
from paddle.regularizer import L2Decay
from paddle.nn.initializer import Normal, Constant, XavierUniform
-__all__ = ["ResNet"]
+__all__ = ["ResNet_vd", "ConvBNLayer", "DeformableConvV2"]
class DeformableConvV2(nn.Layer):
@@ -104,6 +104,7 @@ class ConvBNLayer(nn.Layer):
kernel_size,
stride=1,
groups=1,
+ dcn_groups=1,
is_vd_mode=False,
act=None,
is_dcn=False):
@@ -128,7 +129,7 @@ class ConvBNLayer(nn.Layer):
kernel_size=kernel_size,
stride=stride,
padding=(kernel_size - 1) // 2,
- groups=2, #groups,
+ groups=dcn_groups,
bias_attr=False)
self._batch_norm = nn.BatchNorm(out_channels, act=act)
@@ -162,7 +163,8 @@ class BottleneckBlock(nn.Layer):
kernel_size=3,
stride=stride,
act='relu',
- is_dcn=is_dcn)
+ is_dcn=is_dcn,
+ dcn_groups=2)
self.conv2 = ConvBNLayer(
in_channels=out_channels,
out_channels=out_channels * 4,
@@ -238,14 +240,14 @@ class BasicBlock(nn.Layer):
return y
-class ResNet(nn.Layer):
+class ResNet_vd(nn.Layer):
def __init__(self,
in_channels=3,
layers=50,
dcn_stage=None,
out_indices=None,
**kwargs):
- super(ResNet, self).__init__()
+ super(ResNet_vd, self).__init__()
self.layers = layers
supported_layers = [18, 34, 50, 101, 152, 200]
@@ -321,7 +323,6 @@ class ResNet(nn.Layer):
for block in range(len(depth)):
block_list = []
shortcut = False
- # is_dcn = self.dcn_stage[block]
for i in range(depth[block]):
basic_block = self.add_sublayer(
'bb_%d_%d' % (block, i),
diff --git a/ppocr/modeling/backbones/kie_unet_sdmgr.py b/ppocr/modeling/backbones/kie_unet_sdmgr.py
index 545e4e7511e58c3d8220e9ec0be35474deba8806..4b1bd8030060b26acb9e60bd671a5b23d936347b 100644
--- a/ppocr/modeling/backbones/kie_unet_sdmgr.py
+++ b/ppocr/modeling/backbones/kie_unet_sdmgr.py
@@ -175,12 +175,7 @@ class Kie_backbone(nn.Layer):
img, relations, texts, gt_bboxes, tag, img_size)
x = self.img_feat(img)
boxes, rois_num = self.bbox2roi(gt_bboxes)
- feats = paddle.fluid.layers.roi_align(
- x,
- boxes,
- spatial_scale=1.0,
- pooled_height=7,
- pooled_width=7,
- rois_num=rois_num)
+ feats = paddle.vision.ops.roi_align(
+ x, boxes, spatial_scale=1.0, output_size=7, boxes_num=rois_num)
feats = self.maxpool(feats).squeeze(-1).squeeze(-1)
return [relations, texts, feats]
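The hunk above is a fluid-to-2.x API migration: pooled_height/pooled_width collapse into output_size, and rois_num becomes boxes_num. A self-contained sketch of the new call with illustrative shapes:

import paddle

x = paddle.randn([1, 256, 64, 64])                # feature map
boxes = paddle.to_tensor([[4., 4., 28., 28.]])    # one RoI in xyxy format
boxes_num = paddle.to_tensor([1], dtype='int32')  # RoIs per image
feats = paddle.vision.ops.roi_align(
    x, boxes, boxes_num=boxes_num, output_size=7, spatial_scale=1.0)
print(feats.shape)  # [1, 256, 7, 7]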
diff --git a/ppocr/modeling/backbones/rec_resnet_45.py b/ppocr/modeling/backbones/rec_resnet_45.py
new file mode 100644
index 0000000000000000000000000000000000000000..9093d0bc99b78806d36662dec36b6cfbdd4ae493
--- /dev/null
+++ b/ppocr/modeling/backbones/rec_resnet_45.py
@@ -0,0 +1,147 @@
+# copyright (c) 2021 PaddlePaddle Authors. All Rights Reserve.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""
+This code is refer from:
+https://github.com/FangShancheng/ABINet/tree/main/modules
+"""
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import paddle
+from paddle import ParamAttr
+from paddle.nn.initializer import KaimingNormal
+import paddle.nn as nn
+import paddle.nn.functional as F
+import numpy as np
+import math
+
+__all__ = ["ResNet45"]
+
+
+def conv1x1(in_planes, out_planes, stride=1):
+ return nn.Conv2D(
+ in_planes,
+ out_planes,
+ kernel_size=1,
+ stride=1,
+ weight_attr=ParamAttr(initializer=KaimingNormal()),
+ bias_attr=False)
+
+
+def conv3x3(in_channel, out_channel, stride=1):
+ return nn.Conv2D(
+ in_channel,
+ out_channel,
+ kernel_size=3,
+ stride=stride,
+ padding=1,
+ weight_attr=ParamAttr(initializer=KaimingNormal()),
+ bias_attr=False)
+
+
+class BasicBlock(nn.Layer):
+ expansion = 1
+
+ def __init__(self, in_channels, channels, stride=1, downsample=None):
+ super().__init__()
+ self.conv1 = conv1x1(in_channels, channels)
+ self.bn1 = nn.BatchNorm2D(channels)
+ self.relu = nn.ReLU()
+ self.conv2 = conv3x3(channels, channels, stride)
+ self.bn2 = nn.BatchNorm2D(channels)
+ self.downsample = downsample
+ self.stride = stride
+
+ def forward(self, x):
+ residual = x
+
+ out = self.conv1(x)
+ out = self.bn1(out)
+ out = self.relu(out)
+
+ out = self.conv2(out)
+ out = self.bn2(out)
+
+ if self.downsample is not None:
+ residual = self.downsample(x)
+ out += residual
+ out = self.relu(out)
+
+ return out
+
+
+class ResNet45(nn.Layer):
+ def __init__(self, block=BasicBlock, layers=[3, 4, 6, 6, 3], in_channels=3):
+ self.inplanes = 32
+ super(ResNet45, self).__init__()
+ self.conv1 = nn.Conv2D(
+ in_channels,
+ 32,
+ kernel_size=3,
+ stride=1,
+ padding=1,
+ weight_attr=ParamAttr(initializer=KaimingNormal()),
+ bias_attr=False)
+ self.bn1 = nn.BatchNorm2D(32)
+ self.relu = nn.ReLU()
+
+ self.layer1 = self._make_layer(block, 32, layers[0], stride=2)
+ self.layer2 = self._make_layer(block, 64, layers[1], stride=1)
+ self.layer3 = self._make_layer(block, 128, layers[2], stride=2)
+ self.layer4 = self._make_layer(block, 256, layers[3], stride=1)
+ self.layer5 = self._make_layer(block, 512, layers[4], stride=1)
+ self.out_channels = 512
+
+ def _make_layer(self, block, planes, blocks, stride=1):
+ downsample = None
+ if stride != 1 or self.inplanes != planes * block.expansion:
+ downsample = nn.Sequential(
+ nn.Conv2D(
+ self.inplanes,
+ planes * block.expansion,
+ kernel_size=1,
+ stride=stride,
+ weight_attr=ParamAttr(initializer=KaimingNormal()),
+ bias_attr=False),
+ nn.BatchNorm2D(planes * block.expansion), )
+
+ layers = []
+ layers.append(block(self.inplanes, planes, stride, downsample))
+ self.inplanes = planes * block.expansion
+ for i in range(1, blocks):
+ layers.append(block(self.inplanes, planes))
+
+ return nn.Sequential(*layers)
+
+ def forward(self, x):
+ x = self.conv1(x)
+ x = self.bn1(x)
+ x = self.relu(x)
+ x = self.layer1(x)
+ x = self.layer2(x)
+ x = self.layer3(x)
+ x = self.layer4(x)
+ x = self.layer5(x)
+ return x
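An illustrative shape check: with the stride pattern [2, 1, 2, 1, 1] after the stride-1 stem, ResNet45 downsamples by 4x in each dimension, matching the 8x32 attention map the ABINet head expects for a 32x128 crop:

import paddle

net = ResNet45(in_channels=3)
y = net(paddle.randn([1, 3, 32, 128]))
print(y.shape)  # [1, 512, 8, 32]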
diff --git a/ppocr/modeling/backbones/rec_resnet_fpn.py b/ppocr/modeling/backbones/rec_resnet_fpn.py
index a7e876a2bd52a0ea70479c2009a291e4e2f8ce1f..79efd6e41e231ecad99aa4d01a8226a8550bd1ef 100644
--- a/ppocr/modeling/backbones/rec_resnet_fpn.py
+++ b/ppocr/modeling/backbones/rec_resnet_fpn.py
@@ -18,7 +18,6 @@ from __future__ import print_function
from paddle import nn, ParamAttr
from paddle.nn import functional as F
-import paddle.fluid as fluid
import paddle
import numpy as np
diff --git a/ppocr/modeling/backbones/rec_svtrnet.py b/ppocr/modeling/backbones/rec_svtrnet.py
index c57bf46345d6e08f23b9258358f77f2285366314..c2c07f4476929d49237c8e9a10713f881f5f556b 100644
--- a/ppocr/modeling/backbones/rec_svtrnet.py
+++ b/ppocr/modeling/backbones/rec_svtrnet.py
@@ -147,7 +147,7 @@ class Attention(nn.Layer):
dim,
num_heads=8,
mixer='Global',
- HW=[8, 25],
+ HW=None,
local_k=[7, 11],
qkv_bias=False,
qk_scale=None,
@@ -210,7 +210,7 @@ class Block(nn.Layer):
num_heads,
mixer='Global',
local_mixer=[7, 11],
- HW=[8, 25],
+ HW=None,
mlp_ratio=4.,
qkv_bias=False,
qk_scale=None,
@@ -274,7 +274,9 @@ class PatchEmbed(nn.Layer):
img_size=[32, 100],
in_channels=3,
embed_dim=768,
- sub_num=2):
+ sub_num=2,
+ patch_size=[4, 4],
+ mode='pope'):
super().__init__()
num_patches = (img_size[1] // (2 ** sub_num)) * \
(img_size[0] // (2 ** sub_num))
@@ -282,50 +284,56 @@ class PatchEmbed(nn.Layer):
self.num_patches = num_patches
self.embed_dim = embed_dim
self.norm = None
- if sub_num == 2:
- self.proj = nn.Sequential(
- ConvBNLayer(
- in_channels=in_channels,
- out_channels=embed_dim // 2,
- kernel_size=3,
- stride=2,
- padding=1,
- act=nn.GELU,
- bias_attr=None),
- ConvBNLayer(
- in_channels=embed_dim // 2,
- out_channels=embed_dim,
- kernel_size=3,
- stride=2,
- padding=1,
- act=nn.GELU,
- bias_attr=None))
- if sub_num == 3:
- self.proj = nn.Sequential(
- ConvBNLayer(
- in_channels=in_channels,
- out_channels=embed_dim // 4,
- kernel_size=3,
- stride=2,
- padding=1,
- act=nn.GELU,
- bias_attr=None),
- ConvBNLayer(
- in_channels=embed_dim // 4,
- out_channels=embed_dim // 2,
- kernel_size=3,
- stride=2,
- padding=1,
- act=nn.GELU,
- bias_attr=None),
- ConvBNLayer(
- in_channels=embed_dim // 2,
- out_channels=embed_dim,
- kernel_size=3,
- stride=2,
- padding=1,
- act=nn.GELU,
- bias_attr=None))
+ if mode == 'pope':
+ if sub_num == 2:
+ self.proj = nn.Sequential(
+ ConvBNLayer(
+ in_channels=in_channels,
+ out_channels=embed_dim // 2,
+ kernel_size=3,
+ stride=2,
+ padding=1,
+ act=nn.GELU,
+ bias_attr=None),
+ ConvBNLayer(
+ in_channels=embed_dim // 2,
+ out_channels=embed_dim,
+ kernel_size=3,
+ stride=2,
+ padding=1,
+ act=nn.GELU,
+ bias_attr=None))
+ if sub_num == 3:
+ self.proj = nn.Sequential(
+ ConvBNLayer(
+ in_channels=in_channels,
+ out_channels=embed_dim // 4,
+ kernel_size=3,
+ stride=2,
+ padding=1,
+ act=nn.GELU,
+ bias_attr=None),
+ ConvBNLayer(
+ in_channels=embed_dim // 4,
+ out_channels=embed_dim // 2,
+ kernel_size=3,
+ stride=2,
+ padding=1,
+ act=nn.GELU,
+ bias_attr=None),
+ ConvBNLayer(
+ in_channels=embed_dim // 2,
+ out_channels=embed_dim,
+ kernel_size=3,
+ stride=2,
+ padding=1,
+ act=nn.GELU,
+ bias_attr=None))
+ elif mode == 'linear':
+ self.proj = nn.Conv2D(
+ in_channels, embed_dim, kernel_size=patch_size, stride=patch_size)
+ self.num_patches = img_size[0] // patch_size[0] * img_size[
+ 1] // patch_size[1]
def forward(self, x):
B, C, H, W = x.shape
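The new 'linear' branch is plain ViT-style patchify, so the patch count is just the patch-grid size. For ViTSTR's defaults (224x224 input, 16x16 patches):

img_size, patch_size = [224, 224], [16, 16]
num_patches = img_size[0] // patch_size[0] * img_size[1] // patch_size[1]
print(num_patches)  # 14 * 14 = 196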
diff --git a/ppocr/modeling/backbones/rec_vitstr.py b/ppocr/modeling/backbones/rec_vitstr.py
new file mode 100644
index 0000000000000000000000000000000000000000..d5d7d5148a1120e6f97a321b4135c6780c0c5db2
--- /dev/null
+++ b/ppocr/modeling/backbones/rec_vitstr.py
@@ -0,0 +1,120 @@
+# copyright (c) 2021 PaddlePaddle Authors. All Rights Reserve.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""
+This code is refer from:
+https://github.com/roatienza/deep-text-recognition-benchmark/blob/master/modules/vitstr.py
+"""
+
+import numpy as np
+import paddle
+import paddle.nn as nn
+from ppocr.modeling.backbones.rec_svtrnet import Block, PatchEmbed, zeros_, trunc_normal_, ones_
+
+scale_dim_heads = {'tiny': [192, 3], 'small': [384, 6], 'base': [768, 12]}
+
+
+class ViTSTR(nn.Layer):
+ def __init__(self,
+ img_size=[224, 224],
+ in_channels=1,
+ scale='tiny',
+ seqlen=27,
+ patch_size=[16, 16],
+ embed_dim=None,
+ depth=12,
+ num_heads=None,
+ mlp_ratio=4,
+ qkv_bias=True,
+ qk_scale=None,
+ drop_path_rate=0.,
+ drop_rate=0.,
+ attn_drop_rate=0.,
+ norm_layer='nn.LayerNorm',
+ act_layer='nn.GELU',
+ epsilon=1e-6,
+ out_channels=None,
+ **kwargs):
+ super().__init__()
+ self.seqlen = seqlen
+ embed_dim = embed_dim if embed_dim is not None else scale_dim_heads[
+ scale][0]
+ num_heads = num_heads if num_heads is not None else scale_dim_heads[
+ scale][1]
+ out_channels = out_channels if out_channels is not None else embed_dim
+ self.patch_embed = PatchEmbed(
+ img_size=img_size,
+ in_channels=in_channels,
+ embed_dim=embed_dim,
+ patch_size=patch_size,
+ mode='linear')
+ num_patches = self.patch_embed.num_patches
+
+ self.pos_embed = self.create_parameter(
+ shape=[1, num_patches + 1, embed_dim], default_initializer=zeros_)
+ self.add_parameter("pos_embed", self.pos_embed)
+ self.cls_token = self.create_parameter(
+ shape=[1, 1, embed_dim], default_initializer=zeros_)
+ self.add_parameter("cls_token", self.cls_token)
+
+ self.pos_drop = nn.Dropout(p=drop_rate)
+
+ dpr = np.linspace(0, drop_path_rate, depth)
+ self.blocks = nn.LayerList([
+ Block(
+ dim=embed_dim,
+ num_heads=num_heads,
+ mlp_ratio=mlp_ratio,
+ qkv_bias=qkv_bias,
+ qk_scale=qk_scale,
+ drop=drop_rate,
+ attn_drop=attn_drop_rate,
+ drop_path=dpr[i],
+ norm_layer=norm_layer,
+ act_layer=eval(act_layer),
+ epsilon=epsilon,
+ prenorm=False) for i in range(depth)
+ ])
+ self.norm = eval(norm_layer)(embed_dim, epsilon=epsilon)
+
+ self.out_channels = out_channels
+
+ trunc_normal_(self.pos_embed)
+ trunc_normal_(self.cls_token)
+ self.apply(self._init_weights)
+
+ def _init_weights(self, m):
+ if isinstance(m, nn.Linear):
+ trunc_normal_(m.weight)
+ if isinstance(m, nn.Linear) and m.bias is not None:
+ zeros_(m.bias)
+ elif isinstance(m, nn.LayerNorm):
+ zeros_(m.bias)
+ ones_(m.weight)
+
+ def forward_features(self, x):
+ B = x.shape[0]
+ x = self.patch_embed(x)
+ cls_tokens = paddle.tile(self.cls_token, repeat_times=[B, 1, 1])
+ x = paddle.concat((cls_tokens, x), axis=1)
+ x = x + self.pos_embed
+ x = self.pos_drop(x)
+ for blk in self.blocks:
+ x = blk(x)
+ x = self.norm(x)
+ return x
+
+ def forward(self, x):
+ x = self.forward_features(x)
+ x = x[:, :self.seqlen]
+ return x.transpose([0, 2, 1]).unsqueeze(2)
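A sketch of the backbone's I/O contract under its defaults (grayscale 224x224 input, 'tiny' scale, i.e. embed_dim=192): the output keeps the [CLS] token plus the first seqlen-1 patch tokens, reshaped into the usual (B, C, 1, W) recognition layout:

import paddle

model = ViTSTR(scale='tiny', seqlen=27)
out = model(paddle.randn([1, 1, 224, 224]))
print(out.shape)  # [1, 192, 1, 27]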
diff --git a/ppocr/modeling/backbones/vqa_layoutlm.py b/ppocr/modeling/backbones/vqa_layoutlm.py
index ede5b7a35af65fac351277cefccd89b251f5cdb7..2fd1b1b2a78a98dba1930378f4a06783aadd8834 100644
--- a/ppocr/modeling/backbones/vqa_layoutlm.py
+++ b/ppocr/modeling/backbones/vqa_layoutlm.py
@@ -74,9 +74,9 @@ class LayoutLMForSer(NLPBaseModel):
def forward(self, x):
x = self.model(
input_ids=x[0],
- bbox=x[2],
- attention_mask=x[4],
- token_type_ids=x[5],
+ bbox=x[1],
+ attention_mask=x[2],
+ token_type_ids=x[3],
position_ids=None,
output_hidden_states=False)
return x
@@ -96,13 +96,15 @@ class LayoutLMv2ForSer(NLPBaseModel):
def forward(self, x):
x = self.model(
input_ids=x[0],
- bbox=x[2],
- image=x[3],
- attention_mask=x[4],
- token_type_ids=x[5],
+ bbox=x[1],
+ attention_mask=x[2],
+ token_type_ids=x[3],
+ image=x[4],
position_ids=None,
head_mask=None,
labels=None)
+ if not self.training:
+ return x
return x[0]
@@ -120,13 +122,15 @@ class LayoutXLMForSer(NLPBaseModel):
def forward(self, x):
x = self.model(
input_ids=x[0],
- bbox=x[2],
- image=x[3],
- attention_mask=x[4],
- token_type_ids=x[5],
+ bbox=x[1],
+ attention_mask=x[2],
+ token_type_ids=x[3],
+ image=x[4],
position_ids=None,
head_mask=None,
labels=None)
+ if not self.training:
+ return x
return x[0]
@@ -140,12 +144,12 @@ class LayoutLMv2ForRe(NLPBaseModel):
x = self.model(
input_ids=x[0],
bbox=x[1],
- labels=None,
- image=x[2],
- attention_mask=x[3],
- token_type_ids=x[4],
+ attention_mask=x[2],
+ token_type_ids=x[3],
+ image=x[4],
position_ids=None,
head_mask=None,
+ labels=None,
entities=x[5],
relations=x[6])
return x
@@ -161,12 +165,12 @@ class LayoutXLMForRe(NLPBaseModel):
x = self.model(
input_ids=x[0],
bbox=x[1],
- labels=None,
- image=x[2],
- attention_mask=x[3],
- token_type_ids=x[4],
+ attention_mask=x[2],
+ token_type_ids=x[3],
+ image=x[4],
position_ids=None,
head_mask=None,
+ labels=None,
entities=x[5],
relations=x[6])
return x
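After these changes every LayoutLM-family head reads a single batch layout instead of three per-architecture orderings; the assumed ordering, matching the reordered VQA transforms, is:

# x[0] input_ids       x[1] bbox    x[2] attention_mask
# x[3] token_type_ids  x[4] image   x[5] entities / labels
# x[6] relations (RE heads only)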
diff --git a/ppocr/modeling/heads/__init__.py b/ppocr/modeling/heads/__init__.py
index da09e25c0e14ffaeb240e394ed9bb0c137afa5fd..fcd146efbc378faeebd42534a994836789974c32 100755
--- a/ppocr/modeling/heads/__init__.py
+++ b/ppocr/modeling/heads/__init__.py
@@ -33,6 +33,7 @@ def build_head(config):
from .rec_aster_head import AsterHead
from .rec_pren_head import PRENHead
from .rec_multi_head import MultiHead
+ from .rec_abinet_head import ABINetHead
# cls head
from .cls_head import ClsHead
@@ -47,7 +48,7 @@ def build_head(config):
'DBHead', 'PSEHead', 'FCEHead', 'EASTHead', 'SASTHead', 'CTCHead',
'ClsHead', 'AttentionHead', 'SRNHead', 'PGHead', 'Transformer',
'TableAttentionHead', 'SARHead', 'AsterHead', 'SDMGRHead', 'PRENHead',
- 'MultiHead', 'TableMasterHead'
+ 'MultiHead', 'ABINetHead', 'TableMasterHead'
]
#table head
diff --git a/ppocr/modeling/heads/multiheadAttention.py b/ppocr/modeling/heads/multiheadAttention.py
deleted file mode 100755
index 900865ba1a8d80a108b3247ce1aff91c242860f2..0000000000000000000000000000000000000000
--- a/ppocr/modeling/heads/multiheadAttention.py
+++ /dev/null
@@ -1,163 +0,0 @@
-# copyright (c) 2021 PaddlePaddle Authors. All Rights Reserve.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-import paddle
-from paddle import nn
-import paddle.nn.functional as F
-from paddle.nn import Linear
-from paddle.nn.initializer import XavierUniform as xavier_uniform_
-from paddle.nn.initializer import Constant as constant_
-from paddle.nn.initializer import XavierNormal as xavier_normal_
-
-zeros_ = constant_(value=0.)
-ones_ = constant_(value=1.)
-
-
-class MultiheadAttention(nn.Layer):
- """Allows the model to jointly attend to information
- from different representation subspaces.
- See reference: Attention Is All You Need
-
- .. math::
- \text{MultiHead}(Q, K, V) = \text{Concat}(head_1,\dots,head_h)W^O
- \text{where} head_i = \text{Attention}(QW_i^Q, KW_i^K, VW_i^V)
-
- Args:
- embed_dim: total dimension of the model
- num_heads: parallel attention layers, or heads
-
- """
-
- def __init__(self,
- embed_dim,
- num_heads,
- dropout=0.,
- bias=True,
- add_bias_kv=False,
- add_zero_attn=False):
- super(MultiheadAttention, self).__init__()
- self.embed_dim = embed_dim
- self.num_heads = num_heads
- self.dropout = dropout
- self.head_dim = embed_dim // num_heads
- assert self.head_dim * num_heads == self.embed_dim, "embed_dim must be divisible by num_heads"
- self.scaling = self.head_dim**-0.5
- self.out_proj = Linear(embed_dim, embed_dim, bias_attr=bias)
- self._reset_parameters()
- self.conv1 = paddle.nn.Conv2D(
- in_channels=embed_dim, out_channels=embed_dim, kernel_size=(1, 1))
- self.conv2 = paddle.nn.Conv2D(
- in_channels=embed_dim, out_channels=embed_dim, kernel_size=(1, 1))
- self.conv3 = paddle.nn.Conv2D(
- in_channels=embed_dim, out_channels=embed_dim, kernel_size=(1, 1))
-
- def _reset_parameters(self):
- xavier_uniform_(self.out_proj.weight)
-
- def forward(self,
- query,
- key,
- value,
- key_padding_mask=None,
- incremental_state=None,
- attn_mask=None):
- """
- Inputs of forward function
- query: [target length, batch size, embed dim]
- key: [sequence length, batch size, embed dim]
- value: [sequence length, batch size, embed dim]
- key_padding_mask: if True, mask padding based on batch size
- incremental_state: if provided, previous time steps are cashed
- need_weights: output attn_output_weights
- static_kv: key and value are static
-
- Outputs of forward function
- attn_output: [target length, batch size, embed dim]
- attn_output_weights: [batch size, target length, sequence length]
- """
- q_shape = paddle.shape(query)
- src_shape = paddle.shape(key)
- q = self._in_proj_q(query)
- k = self._in_proj_k(key)
- v = self._in_proj_v(value)
- q *= self.scaling
- q = paddle.transpose(
- paddle.reshape(
- q, [q_shape[0], q_shape[1], self.num_heads, self.head_dim]),
- [1, 2, 0, 3])
- k = paddle.transpose(
- paddle.reshape(
- k, [src_shape[0], q_shape[1], self.num_heads, self.head_dim]),
- [1, 2, 0, 3])
- v = paddle.transpose(
- paddle.reshape(
- v, [src_shape[0], q_shape[1], self.num_heads, self.head_dim]),
- [1, 2, 0, 3])
- if key_padding_mask is not None:
- assert key_padding_mask.shape[0] == q_shape[1]
- assert key_padding_mask.shape[1] == src_shape[0]
- attn_output_weights = paddle.matmul(q,
- paddle.transpose(k, [0, 1, 3, 2]))
- if attn_mask is not None:
- attn_mask = paddle.unsqueeze(paddle.unsqueeze(attn_mask, 0), 0)
- attn_output_weights += attn_mask
- if key_padding_mask is not None:
- attn_output_weights = paddle.reshape(
- attn_output_weights,
- [q_shape[1], self.num_heads, q_shape[0], src_shape[0]])
- key = paddle.unsqueeze(paddle.unsqueeze(key_padding_mask, 1), 2)
- key = paddle.cast(key, 'float32')
- y = paddle.full(
- shape=paddle.shape(key), dtype='float32', fill_value='-inf')
- y = paddle.where(key == 0., key, y)
- attn_output_weights += y
- attn_output_weights = F.softmax(
- attn_output_weights.astype('float32'),
- axis=-1,
- dtype=paddle.float32 if attn_output_weights.dtype == paddle.float16
- else attn_output_weights.dtype)
- attn_output_weights = F.dropout(
- attn_output_weights, p=self.dropout, training=self.training)
-
- attn_output = paddle.matmul(attn_output_weights, v)
- attn_output = paddle.reshape(
- paddle.transpose(attn_output, [2, 0, 1, 3]),
- [q_shape[0], q_shape[1], self.embed_dim])
- attn_output = self.out_proj(attn_output)
-
- return attn_output
-
- def _in_proj_q(self, query):
- query = paddle.transpose(query, [1, 2, 0])
- query = paddle.unsqueeze(query, axis=2)
- res = self.conv1(query)
- res = paddle.squeeze(res, axis=2)
- res = paddle.transpose(res, [2, 0, 1])
- return res
-
- def _in_proj_k(self, key):
- key = paddle.transpose(key, [1, 2, 0])
- key = paddle.unsqueeze(key, axis=2)
- res = self.conv2(key)
- res = paddle.squeeze(res, axis=2)
- res = paddle.transpose(res, [2, 0, 1])
- return res
-
- def _in_proj_v(self, value):
- value = paddle.transpose(value, [1, 2, 0]) #(1, 2, 0)
- value = paddle.unsqueeze(value, axis=2)
- res = self.conv3(value)
- res = paddle.squeeze(res, axis=2)
- res = paddle.transpose(res, [2, 0, 1])
- return res
diff --git a/ppocr/modeling/heads/rec_abinet_head.py b/ppocr/modeling/heads/rec_abinet_head.py
new file mode 100644
index 0000000000000000000000000000000000000000..a0f60f1be1727e85380eedb7d311ce9445f88b8e
--- /dev/null
+++ b/ppocr/modeling/heads/rec_abinet_head.py
@@ -0,0 +1,296 @@
+# copyright (c) 2021 PaddlePaddle Authors. All Rights Reserve.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""
+This code is refer from:
+https://github.com/FangShancheng/ABINet/tree/main/modules
+"""
+
+import math
+import paddle
+from paddle import nn
+import paddle.nn.functional as F
+from paddle.nn import LayerList
+from ppocr.modeling.heads.rec_nrtr_head import TransformerBlock, PositionalEncoding
+
+
+class BCNLanguage(nn.Layer):
+ def __init__(self,
+ d_model=512,
+ nhead=8,
+ num_layers=4,
+ dim_feedforward=2048,
+ dropout=0.,
+ max_length=25,
+ detach=True,
+ num_classes=37):
+ super().__init__()
+
+ self.d_model = d_model
+ self.detach = detach
+ self.max_length = max_length + 1 # additional stop token
+ self.proj = nn.Linear(num_classes, d_model, bias_attr=False)
+ self.token_encoder = PositionalEncoding(
+ dropout=0.1, dim=d_model, max_len=self.max_length)
+ self.pos_encoder = PositionalEncoding(
+ dropout=0, dim=d_model, max_len=self.max_length)
+
+ self.decoder = nn.LayerList([
+ TransformerBlock(
+ d_model=d_model,
+ nhead=nhead,
+ dim_feedforward=dim_feedforward,
+ attention_dropout_rate=dropout,
+ residual_dropout_rate=dropout,
+ with_self_attn=False,
+ with_cross_attn=True) for i in range(num_layers)
+ ])
+
+ self.cls = nn.Linear(d_model, num_classes)
+
+ def forward(self, tokens, lengths):
+ """
+ Args:
+ tokens: (B, N, C) where B is batch size, N is length and C is the number of classes
+ lengths: (B,)
+ """
+ if self.detach: tokens = tokens.detach()
+ embed = self.proj(tokens) # (B, N, C)
+ embed = self.token_encoder(embed) # (B, N, C)
+ padding_mask = _get_mask(lengths, self.max_length)
+ zeros = paddle.zeros_like(embed) # (B, N, C)
+ query = self.pos_encoder(zeros)
+ for decoder_layer in self.decoder:
+ query = decoder_layer(query, embed, cross_mask=padding_mask)
+ output = query # (B, N, C)
+
+ logits = self.cls(output) # (B, N, C)
+
+ return output, logits
+
+
+def encoder_layer(in_c, out_c, k=3, s=2, p=1):
+ return nn.Sequential(
+ nn.Conv2D(in_c, out_c, k, s, p), nn.BatchNorm2D(out_c), nn.ReLU())
+
+
+def decoder_layer(in_c,
+ out_c,
+ k=3,
+ s=1,
+ p=1,
+ mode='nearest',
+ scale_factor=None,
+ size=None):
+ align_corners = False if mode == 'nearest' else True
+ return nn.Sequential(
+ nn.Upsample(
+ size=size,
+ scale_factor=scale_factor,
+ mode=mode,
+ align_corners=align_corners),
+ nn.Conv2D(in_c, out_c, k, s, p),
+ nn.BatchNorm2D(out_c),
+ nn.ReLU())
+
+
+class PositionAttention(nn.Layer):
+ def __init__(self,
+ max_length,
+ in_channels=512,
+ num_channels=64,
+ h=8,
+ w=32,
+ mode='nearest',
+ **kwargs):
+ super().__init__()
+ self.max_length = max_length
+ self.k_encoder = nn.Sequential(
+ encoder_layer(
+ in_channels, num_channels, s=(1, 2)),
+ encoder_layer(
+ num_channels, num_channels, s=(2, 2)),
+ encoder_layer(
+ num_channels, num_channels, s=(2, 2)),
+ encoder_layer(
+ num_channels, num_channels, s=(2, 2)))
+ self.k_decoder = nn.Sequential(
+ decoder_layer(
+ num_channels, num_channels, scale_factor=2, mode=mode),
+ decoder_layer(
+ num_channels, num_channels, scale_factor=2, mode=mode),
+ decoder_layer(
+ num_channels, num_channels, scale_factor=2, mode=mode),
+ decoder_layer(
+ num_channels, in_channels, size=(h, w), mode=mode))
+
+ self.pos_encoder = PositionalEncoding(
+ dropout=0, dim=in_channels, max_len=max_length)
+ self.project = nn.Linear(in_channels, in_channels)
+
+ def forward(self, x):
+ B, C, H, W = x.shape
+ k, v = x, x
+
+ # calculate key vector
+ features = []
+ for i in range(0, len(self.k_encoder)):
+ k = self.k_encoder[i](k)
+ features.append(k)
+ for i in range(0, len(self.k_decoder) - 1):
+ k = self.k_decoder[i](k)
+ k = k + features[len(self.k_decoder) - 2 - i]
+ k = self.k_decoder[-1](k)
+
+ # calculate query vector
+ # TODO q=f(q,k)
+ zeros = paddle.zeros(
+ (B, self.max_length, C), dtype=x.dtype) # (B, N, C)
+ q = self.pos_encoder(zeros) # (B, N, C)
+ q = self.project(q) # (B, N, C)
+
+ # calculate attention
+ attn_scores = q @ k.flatten(2) # (B, N, (H*W))
+ attn_scores = attn_scores / (C**0.5)
+ attn_scores = F.softmax(attn_scores, axis=-1)
+
+ v = v.flatten(2).transpose([0, 2, 1]) # (B, (H*W), C)
+ attn_vecs = attn_scores @ v # (B, N, C)
+
+ return attn_vecs, attn_scores.reshape([0, self.max_length, H, W])
+
+
+class ABINetHead(nn.Layer):
+ def __init__(self,
+ in_channels,
+ out_channels,
+ d_model=512,
+ nhead=8,
+ num_layers=3,
+ dim_feedforward=2048,
+ dropout=0.1,
+ max_length=25,
+ use_lang=False,
+ iter_size=1):
+ super().__init__()
+ self.max_length = max_length + 1
+ self.pos_encoder = PositionalEncoding(
+ dropout=0.1, dim=d_model, max_len=8 * 32)
+ self.encoder = nn.LayerList([
+ TransformerBlock(
+ d_model=d_model,
+ nhead=nhead,
+ dim_feedforward=dim_feedforward,
+ attention_dropout_rate=dropout,
+ residual_dropout_rate=dropout,
+ with_self_attn=True,
+ with_cross_attn=False) for i in range(num_layers)
+ ])
+ self.decoder = PositionAttention(
+ max_length=max_length + 1, # additional stop token
+ mode='nearest', )
+ self.out_channels = out_channels
+ self.cls = nn.Linear(d_model, self.out_channels)
+ self.use_lang = use_lang
+ if use_lang:
+ self.iter_size = iter_size
+ self.language = BCNLanguage(
+ d_model=d_model,
+ nhead=nhead,
+ num_layers=4,
+ dim_feedforward=dim_feedforward,
+ dropout=dropout,
+ max_length=max_length,
+ num_classes=self.out_channels)
+ # alignment
+ self.w_att_align = nn.Linear(2 * d_model, d_model)
+ self.cls_align = nn.Linear(d_model, self.out_channels)
+
+ def forward(self, x, targets=None):
+ x = x.transpose([0, 2, 3, 1])
+ _, H, W, C = x.shape
+ feature = x.flatten(1, 2)
+ feature = self.pos_encoder(feature)
+ for encoder_layer in self.encoder:
+ feature = encoder_layer(feature)
+ feature = feature.reshape([0, H, W, C]).transpose([0, 3, 1, 2])
+ v_feature, attn_scores = self.decoder(
+ feature) # (B, N, C), (B, C, H, W)
+ vis_logits = self.cls(v_feature) # (B, N, C)
+ logits = vis_logits
+ vis_lengths = _get_length(vis_logits)
+ if self.use_lang:
+ align_logits = vis_logits
+ align_lengths = vis_lengths
+ all_l_res, all_a_res = [], []
+ for i in range(self.iter_size):
+ tokens = F.softmax(align_logits, axis=-1)
+ lengths = align_lengths
+ lengths = paddle.clip(
+ lengths, 2, self.max_length) # TODO: move to language model
+ l_feature, l_logits = self.language(tokens, lengths)
+
+ # alignment
+ all_l_res.append(l_logits)
+ fuse = paddle.concat((l_feature, v_feature), -1)
+ f_att = F.sigmoid(self.w_att_align(fuse))
+ output = f_att * v_feature + (1 - f_att) * l_feature
+ align_logits = self.cls_align(output) # (B, N, C)
+
+ align_lengths = _get_length(align_logits)
+ all_a_res.append(align_logits)
+ if self.training:
+ return {
+ 'align': all_a_res,
+ 'lang': all_l_res,
+ 'vision': vis_logits
+ }
+ else:
+ logits = align_logits
+ if self.training:
+ return logits
+ else:
+ return F.softmax(logits, -1)
+
+
+def _get_length(logit):
+ """ Greed decoder to obtain length from logit"""
+ out = (logit.argmax(-1) == 0)
+ abn = out.any(-1)
+ out_int = out.cast('int32')
+ out = (out_int.cumsum(-1) == 1) & out
+ out = out.cast('int32')
+ out = out.argmax(-1)
+ out = out + 1
+ out = paddle.where(abn, out, paddle.to_tensor(logit.shape[1]))
+ return out
+
+
+def _get_mask(length, max_length):
+ """Generate a square mask for the sequence. The masked positions are filled with float('-inf').
+ Unmasked positions are filled with float(0.0).
+ """
+ length = length.unsqueeze(-1)
+ B = paddle.shape(length)[0]
+ grid = paddle.arange(0, max_length).unsqueeze(0).tile([B, 1])
+ zero_mask = paddle.zeros([B, max_length], dtype='float32')
+ inf_mask = paddle.full([B, max_length], '-inf', dtype='float32')
+ diag_mask = paddle.diag(
+ paddle.full(
+ [max_length], '-inf', dtype=paddle.float32),
+ offset=0,
+ name=None)
+ mask = paddle.where(grid >= length, inf_mask, zero_mask)
+ mask = mask.unsqueeze(1) + diag_mask
+ return mask.unsqueeze(1)
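An illustrative run of _get_length, which greedily locates the first end-of-sequence prediction (class index 0) and returns the length including that position; sequences that never predict EOS fall back to the full sequence length:

import paddle

logit = paddle.zeros([1, 5, 4])
logit[0, :, 1] = 1.  # every step predicts class 1 ...
logit[0, 2, 0] = 5.  # ... except step 2, which predicts EOS (class 0)
print(_get_length(logit))  # [3]: steps 0..2, EOS position included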
diff --git a/ppocr/modeling/heads/rec_nrtr_head.py b/ppocr/modeling/heads/rec_nrtr_head.py
index 38ba0c917840ea7d1e2a3c2bf0da32c2c35f2b40..bf9ef56145e6edfb15bd30235b4a62588396ba96 100644
--- a/ppocr/modeling/heads/rec_nrtr_head.py
+++ b/ppocr/modeling/heads/rec_nrtr_head.py
@@ -14,20 +14,15 @@
import math
import paddle
-import copy
from paddle import nn
import paddle.nn.functional as F
from paddle.nn import LayerList
-from paddle.nn.initializer import XavierNormal as xavier_uniform_
-from paddle.nn import Dropout, Linear, LayerNorm, Conv2D
+from paddle.nn import Dropout, Linear, LayerNorm
import numpy as np
-from ppocr.modeling.heads.multiheadAttention import MultiheadAttention
-from paddle.nn.initializer import Constant as constant_
+from ppocr.modeling.backbones.rec_svtrnet import Mlp, zeros_, ones_
from paddle.nn.initializer import XavierNormal as xavier_normal_
-zeros_ = constant_(value=0.)
-ones_ = constant_(value=1.)
-
class Transformer(nn.Layer):
"""A transformer model. User is able to modify the attributes as needed. The architechture
@@ -45,7 +40,6 @@ class Transformer(nn.Layer):
dropout: the dropout value (default=0.1).
custom_encoder: custom encoder (default=None).
custom_decoder: custom decoder (default=None).
-
"""
def __init__(self,
@@ -54,45 +48,49 @@ class Transformer(nn.Layer):
num_encoder_layers=6,
beam_size=0,
num_decoder_layers=6,
+ max_len=25,
dim_feedforward=1024,
attention_dropout_rate=0.0,
residual_dropout_rate=0.1,
- custom_encoder=None,
- custom_decoder=None,
in_channels=0,
out_channels=0,
scale_embedding=True):
super(Transformer, self).__init__()
self.out_channels = out_channels + 1
+ self.max_len = max_len
self.embedding = Embeddings(
d_model=d_model,
vocab=self.out_channels,
padding_idx=0,
scale_embedding=scale_embedding)
self.positional_encoding = PositionalEncoding(
- dropout=residual_dropout_rate,
- dim=d_model, )
- if custom_encoder is not None:
- self.encoder = custom_encoder
- else:
- if num_encoder_layers > 0:
- encoder_layer = TransformerEncoderLayer(
- d_model, nhead, dim_feedforward, attention_dropout_rate,
- residual_dropout_rate)
- self.encoder = TransformerEncoder(encoder_layer,
- num_encoder_layers)
- else:
- self.encoder = None
-
- if custom_decoder is not None:
- self.decoder = custom_decoder
+ dropout=residual_dropout_rate, dim=d_model)
+
+ if num_encoder_layers > 0:
+ self.encoder = nn.LayerList([
+ TransformerBlock(
+ d_model,
+ nhead,
+ dim_feedforward,
+ attention_dropout_rate,
+ residual_dropout_rate,
+ with_self_attn=True,
+ with_cross_attn=False) for i in range(num_encoder_layers)
+ ])
else:
- decoder_layer = TransformerDecoderLayer(
- d_model, nhead, dim_feedforward, attention_dropout_rate,
- residual_dropout_rate)
- self.decoder = TransformerDecoder(decoder_layer, num_decoder_layers)
+ self.encoder = None
+
+ self.decoder = nn.LayerList([
+ TransformerBlock(
+ d_model,
+ nhead,
+ dim_feedforward,
+ attention_dropout_rate,
+ residual_dropout_rate,
+ with_self_attn=True,
+ with_cross_attn=True) for i in range(num_decoder_layers)
+ ])
- self._reset_parameters()
self.beam_size = beam_size
self.d_model = d_model
self.nhead = nhead
@@ -105,7 +103,7 @@ class Transformer(nn.Layer):
def _init_weights(self, m):
- if isinstance(m, nn.Conv2D):
+ if isinstance(m, nn.Linear):
xavier_normal_(m.weight)
if m.bias is not None:
zeros_(m.bias)
@@ -113,24 +111,20 @@ class Transformer(nn.Layer):
def forward_train(self, src, tgt):
tgt = tgt[:, :-1]
- tgt_key_padding_mask = self.generate_padding_mask(tgt)
- tgt = self.embedding(tgt).transpose([1, 0, 2])
+ tgt = self.embedding(tgt)
tgt = self.positional_encoding(tgt)
- tgt_mask = self.generate_square_subsequent_mask(tgt.shape[0])
+ tgt_mask = self.generate_square_subsequent_mask(tgt.shape[1])
if self.encoder is not None:
- src = self.positional_encoding(src.transpose([1, 0, 2]))
- memory = self.encoder(src)
+ src = self.positional_encoding(src)
+ for encoder_layer in self.encoder:
+ src = encoder_layer(src)
+ memory = src # B N C
else:
- memory = src.squeeze(2).transpose([2, 0, 1])
- output = self.decoder(
- tgt,
- memory,
- tgt_mask=tgt_mask,
- memory_mask=None,
- tgt_key_padding_mask=tgt_key_padding_mask,
- memory_key_padding_mask=None)
- output = output.transpose([1, 0, 2])
+ memory = src # B N C
+ for decoder_layer in self.decoder:
+ tgt = decoder_layer(tgt, memory, self_mask=tgt_mask)
+ output = tgt
logit = self.tgt_word_prj(output)
return logit
@@ -140,8 +134,8 @@ class Transformer(nn.Layer):
src: the sequence to the encoder (required).
tgt: the sequence to the decoder (required).
Shape:
- - src: :math:`(S, N, E)`.
- - tgt: :math:`(T, N, E)`.
+ - src: :math:`(B, sN, C)`.
+ - tgt: :math:`(B, tN, C)`.
Examples:
>>> output = transformer_model(src, tgt)
"""
@@ -157,36 +151,35 @@ class Transformer(nn.Layer):
return self.forward_test(src)
def forward_test(self, src):
bs = paddle.shape(src)[0]
if self.encoder is not None:
- src = self.positional_encoding(paddle.transpose(src, [1, 0, 2]))
- memory = self.encoder(src)
+ src = self.positional_encoding(src)
+ for encoder_layer in self.encoder:
+ src = encoder_layer(src)
+ memory = src # B N C
else:
- memory = paddle.transpose(paddle.squeeze(src, 2), [2, 0, 1])
+ memory = src
dec_seq = paddle.full((bs, 1), 2, dtype=paddle.int64)
dec_prob = paddle.full((bs, 1), 1., dtype=paddle.float32)
- for len_dec_seq in range(1, 25):
- dec_seq_embed = paddle.transpose(self.embedding(dec_seq), [1, 0, 2])
+ for len_dec_seq in range(1, self.max_len):
+ dec_seq_embed = self.embedding(dec_seq)
dec_seq_embed = self.positional_encoding(dec_seq_embed)
tgt_mask = self.generate_square_subsequent_mask(
- paddle.shape(dec_seq_embed)[0])
- output = self.decoder(
- dec_seq_embed,
- memory,
- tgt_mask=tgt_mask,
- memory_mask=None,
- tgt_key_padding_mask=None,
- memory_key_padding_mask=None)
- dec_output = paddle.transpose(output, [1, 0, 2])
+ paddle.shape(dec_seq_embed)[1])
+ tgt = dec_seq_embed
+ for decoder_layer in self.decoder:
+ tgt = decoder_layer(tgt, memory, self_mask=tgt_mask)
+ dec_output = tgt
dec_output = dec_output[:, -1, :]
- word_prob = F.softmax(self.tgt_word_prj(dec_output), axis=1)
- preds_idx = paddle.argmax(word_prob, axis=1)
+ word_prob = F.softmax(self.tgt_word_prj(dec_output), axis=-1)
+ preds_idx = paddle.argmax(word_prob, axis=-1)
if paddle.equal_all(
preds_idx,
paddle.full(
paddle.shape(preds_idx), 3, dtype='int64')):
break
- preds_prob = paddle.max(word_prob, axis=1)
+ preds_prob = paddle.max(word_prob, axis=-1)
dec_seq = paddle.concat(
[dec_seq, paddle.reshape(preds_idx, [-1, 1])], axis=1)
dec_prob = paddle.concat(
@@ -194,10 +187,10 @@ class Transformer(nn.Layer):
return [dec_seq, dec_prob]
def forward_beam(self, images):
- ''' Translation work in one batch '''
+ """ Translation work in one batch """
def get_inst_idx_to_tensor_position_map(inst_idx_list):
- ''' Indicate the position of an instance in a tensor. '''
+ """ Indicate the position of an instance in a tensor. """
return {
inst_idx: tensor_position
for tensor_position, inst_idx in enumerate(inst_idx_list)
@@ -205,7 +198,7 @@ class Transformer(nn.Layer):
def collect_active_part(beamed_tensor, curr_active_inst_idx,
n_prev_active_inst, n_bm):
- ''' Collect tensor parts associated to active instances. '''
+ """ Collect tensor parts associated to active instances. """
beamed_tensor_shape = paddle.shape(beamed_tensor)
n_curr_active_inst = len(curr_active_inst_idx)
@@ -237,9 +230,8 @@ class Transformer(nn.Layer):
return active_src_enc, active_inst_idx_to_position_map
def beam_decode_step(inst_dec_beams, len_dec_seq, enc_output,
- inst_idx_to_position_map, n_bm,
- memory_key_padding_mask):
- ''' Decode and update beam status, and then return active beam idx '''
+ inst_idx_to_position_map, n_bm):
+ """ Decode and update beam status, and then return active beam idx """
def prepare_beam_dec_seq(inst_dec_beams, len_dec_seq):
dec_partial_seq = [
@@ -249,19 +241,15 @@ class Transformer(nn.Layer):
dec_partial_seq = dec_partial_seq.reshape([-1, len_dec_seq])
return dec_partial_seq
- def predict_word(dec_seq, enc_output, n_active_inst, n_bm,
- memory_key_padding_mask):
- dec_seq = paddle.transpose(self.embedding(dec_seq), [1, 0, 2])
+ def predict_word(dec_seq, enc_output, n_active_inst, n_bm):
+ dec_seq = self.embedding(dec_seq)
dec_seq = self.positional_encoding(dec_seq)
tgt_mask = self.generate_square_subsequent_mask(
- paddle.shape(dec_seq)[0])
- dec_output = self.decoder(
- dec_seq,
- enc_output,
- tgt_mask=tgt_mask,
- tgt_key_padding_mask=None,
- memory_key_padding_mask=memory_key_padding_mask, )
- dec_output = paddle.transpose(dec_output, [1, 0, 2])
+ paddle.shape(dec_seq)[1])
+ tgt = dec_seq
+ for decoder_layer in self.decoder:
+ tgt = decoder_layer(tgt, enc_output, self_mask=tgt_mask)
+ dec_output = tgt
dec_output = dec_output[:,
-1, :] # Pick the last step: (bh * bm) * d_h
word_prob = F.softmax(self.tgt_word_prj(dec_output), axis=1)
@@ -281,8 +269,7 @@ class Transformer(nn.Layer):
n_active_inst = len(inst_idx_to_position_map)
dec_seq = prepare_beam_dec_seq(inst_dec_beams, len_dec_seq)
- word_prob = predict_word(dec_seq, enc_output, n_active_inst, n_bm,
- None)
+ word_prob = predict_word(dec_seq, enc_output, n_active_inst, n_bm)
# Update the beam with predicted word prob information and collect incomplete instances
active_inst_idx_list = collect_active_inst_idx_list(
inst_dec_beams, word_prob, inst_idx_to_position_map)
@@ -303,10 +290,10 @@ class Transformer(nn.Layer):
with paddle.no_grad():
#-- Encode
if self.encoder is not None:
- src = self.positional_encoding(images.transpose([1, 0, 2]))
+ src = self.positional_encoding(images)
src_enc = self.encoder(src)
else:
- src_enc = images.squeeze(2).transpose([0, 2, 1])
+ src_enc = images
n_bm = self.beam_size
src_shape = paddle.shape(src_enc)
@@ -317,11 +304,11 @@ class Transformer(nn.Layer):
inst_idx_to_position_map = get_inst_idx_to_tensor_position_map(
active_inst_idx_list)
# Decode
- for len_dec_seq in range(1, 25):
+ for len_dec_seq in range(1, self.max_len):
src_enc_copy = src_enc.clone()
active_inst_idx_list = beam_decode_step(
inst_dec_beams, len_dec_seq, src_enc_copy,
- inst_idx_to_position_map, n_bm, None)
+ inst_idx_to_position_map, n_bm)
if not active_inst_idx_list:
break # all instances have finished their path to
src_enc, inst_idx_to_position_map = collate_active_info(
@@ -354,261 +341,124 @@ class Transformer(nn.Layer):
shape=[sz, sz], dtype='float32', fill_value='-inf'),
diagonal=1)
mask = mask + mask_inf
- return mask
-
- def generate_padding_mask(self, x):
- padding_mask = paddle.equal(x, paddle.to_tensor(0, dtype=x.dtype))
- return padding_mask
+ return mask.unsqueeze([0, 1])
- def _reset_parameters(self):
- """Initiate parameters in the transformer model."""
- for p in self.parameters():
- if p.dim() > 1:
- xavier_uniform_(p)
+class MultiheadAttention(nn.Layer):
+ """Allows the model to jointly attend to information
+ from different representation subspaces.
+ See reference: Attention Is All You Need
+ .. math::
+ \text{MultiHead}(Q, K, V) = \text{Concat}(head_1,\dots,head_h)W^O
+ \text{where} head_i = \text{Attention}(QW_i^Q, KW_i^K, VW_i^V)
-class TransformerEncoder(nn.Layer):
- """TransformerEncoder is a stack of N encoder layers
Args:
- encoder_layer: an instance of the TransformerEncoderLayer() class (required).
- num_layers: the number of sub-encoder-layers in the encoder (required).
- norm: the layer normalization component (optional).
- """
+ embed_dim: total dimension of the model
+ num_heads: parallel attention layers, or heads
- def __init__(self, encoder_layer, num_layers):
- super(TransformerEncoder, self).__init__()
- self.layers = _get_clones(encoder_layer, num_layers)
- self.num_layers = num_layers
+ """
- def forward(self, src):
- """Pass the input through the endocder layers in turn.
- Args:
- src: the sequnce to the encoder (required).
- mask: the mask for the src sequence (optional).
- src_key_padding_mask: the mask for the src keys per batch (optional).
- """
- output = src
+ def __init__(self, embed_dim, num_heads, dropout=0., self_attn=False):
+ super(MultiheadAttention, self).__init__()
+ self.embed_dim = embed_dim
+ self.num_heads = num_heads
+ self.head_dim = embed_dim // num_heads
+ assert self.head_dim * num_heads == self.embed_dim, "embed_dim must be divisible by num_heads"
+ self.scale = self.head_dim**-0.5
+ self.self_attn = self_attn
+ if self_attn:
+ self.qkv = nn.Linear(embed_dim, embed_dim * 3)
+ else:
+ self.q = nn.Linear(embed_dim, embed_dim)
+ self.kv = nn.Linear(embed_dim, embed_dim * 2)
+ self.attn_drop = nn.Dropout(dropout)
+ self.out_proj = nn.Linear(embed_dim, embed_dim)
- for i in range(self.num_layers):
- output = self.layers[i](output,
- src_mask=None,
- src_key_padding_mask=None)
+ def forward(self, query, key=None, attn_mask=None):
- return output
+ qN = query.shape[1]
+ if self.self_attn:
+ qkv = self.qkv(query).reshape(
+ (0, qN, 3, self.num_heads, self.head_dim)).transpose(
+ (2, 0, 3, 1, 4))
+ q, k, v = qkv[0], qkv[1], qkv[2]
+ else:
+ kN = key.shape[1]
+ q = self.q(query).reshape(
+ [0, qN, self.num_heads, self.head_dim]).transpose([0, 2, 1, 3])
+ kv = self.kv(key).reshape(
+ (0, kN, 2, self.num_heads, self.head_dim)).transpose(
+ (2, 0, 3, 1, 4))
+ k, v = kv[0], kv[1]
-class TransformerDecoder(nn.Layer):
- """TransformerDecoder is a stack of N decoder layers
+ attn = (q.matmul(k.transpose((0, 1, 3, 2)))) * self.scale
- Args:
- decoder_layer: an instance of the TransformerDecoderLayer() class (required).
- num_layers: the number of sub-decoder-layers in the decoder (required).
- norm: the layer normalization component (optional).
+ if attn_mask is not None:
+ attn += attn_mask
- """
+ attn = F.softmax(attn, axis=-1)
+ attn = self.attn_drop(attn)
- def __init__(self, decoder_layer, num_layers):
- super(TransformerDecoder, self).__init__()
- self.layers = _get_clones(decoder_layer, num_layers)
- self.num_layers = num_layers
+ x = (attn.matmul(v)).transpose((0, 2, 1, 3)).reshape(
+ (0, qN, self.embed_dim))
+ x = self.out_proj(x)
- def forward(self,
- tgt,
- memory,
- tgt_mask=None,
- memory_mask=None,
- tgt_key_padding_mask=None,
- memory_key_padding_mask=None):
- """Pass the inputs (and mask) through the decoder layer in turn.
+ return x
- Args:
- tgt: the sequence to the decoder (required).
- memory: the sequnce from the last layer of the encoder (required).
- tgt_mask: the mask for the tgt sequence (optional).
- memory_mask: the mask for the memory sequence (optional).
- tgt_key_padding_mask: the mask for the tgt keys per batch (optional).
- memory_key_padding_mask: the mask for the memory keys per batch (optional).
- """
- output = tgt
- for i in range(self.num_layers):
- output = self.layers[i](
- output,
- memory,
- tgt_mask=tgt_mask,
- memory_mask=memory_mask,
- tgt_key_padding_mask=tgt_key_padding_mask,
- memory_key_padding_mask=memory_key_padding_mask)
-
- return output
-
-
-class TransformerEncoderLayer(nn.Layer):
- """TransformerEncoderLayer is made up of self-attn and feedforward network.
- This standard encoder layer is based on the paper "Attention Is All You Need".
- Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez,
- Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in
- Neural Information Processing Systems, pages 6000-6010. Users may modify or implement
- in a different way during application.
-
- Args:
- d_model: the number of expected features in the input (required).
- nhead: the number of heads in the multiheadattention models (required).
- dim_feedforward: the dimension of the feedforward network model (default=2048).
- dropout: the dropout value (default=0.1).
-
- """
+class TransformerBlock(nn.Layer):
def __init__(self,
d_model,
nhead,
dim_feedforward=2048,
attention_dropout_rate=0.0,
- residual_dropout_rate=0.1):
- super(TransformerEncoderLayer, self).__init__()
- self.self_attn = MultiheadAttention(
- d_model, nhead, dropout=attention_dropout_rate)
-
- self.conv1 = Conv2D(
- in_channels=d_model,
- out_channels=dim_feedforward,
- kernel_size=(1, 1))
- self.conv2 = Conv2D(
- in_channels=dim_feedforward,
- out_channels=d_model,
- kernel_size=(1, 1))
-
- self.norm1 = LayerNorm(d_model)
- self.norm2 = LayerNorm(d_model)
- self.dropout1 = Dropout(residual_dropout_rate)
- self.dropout2 = Dropout(residual_dropout_rate)
-
- def forward(self, src, src_mask=None, src_key_padding_mask=None):
- """Pass the input through the endocder layer.
- Args:
- src: the sequnce to the encoder layer (required).
- src_mask: the mask for the src sequence (optional).
- src_key_padding_mask: the mask for the src keys per batch (optional).
- """
- src2 = self.self_attn(
- src,
- src,
- src,
- attn_mask=src_mask,
- key_padding_mask=src_key_padding_mask)
- src = src + self.dropout1(src2)
- src = self.norm1(src)
-
- src = paddle.transpose(src, [1, 2, 0])
- src = paddle.unsqueeze(src, 2)
- src2 = self.conv2(F.relu(self.conv1(src)))
- src2 = paddle.squeeze(src2, 2)
- src2 = paddle.transpose(src2, [2, 0, 1])
- src = paddle.squeeze(src, 2)
- src = paddle.transpose(src, [2, 0, 1])
-
- src = src + self.dropout2(src2)
- src = self.norm2(src)
- return src
-
-
-class TransformerDecoderLayer(nn.Layer):
- """TransformerDecoderLayer is made up of self-attn, multi-head-attn and feedforward network.
- This standard decoder layer is based on the paper "Attention Is All You Need".
- Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez,
- Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in
- Neural Information Processing Systems, pages 6000-6010. Users may modify or implement
- in a different way during application.
-
- Args:
- d_model: the number of expected features in the input (required).
- nhead: the number of heads in the multiheadattention models (required).
- dim_feedforward: the dimension of the feedforward network model (default=2048).
- dropout: the dropout value (default=0.1).
-
- """
+ residual_dropout_rate=0.1,
+ with_self_attn=True,
+ with_cross_attn=False,
+ epsilon=1e-5):
+ super(TransformerBlock, self).__init__()
+ self.with_self_attn = with_self_attn
+ if with_self_attn:
+ self.self_attn = MultiheadAttention(
+ d_model,
+ nhead,
+ dropout=attention_dropout_rate,
+ self_attn=with_self_attn)
+ self.norm1 = LayerNorm(d_model, epsilon=epsilon)
+ self.dropout1 = Dropout(residual_dropout_rate)
+ self.with_cross_attn = with_cross_attn
+ if with_cross_attn:
+ self.cross_attn = MultiheadAttention(  # cross-attention over the encoder memory when used as a decoder block
+ d_model,
+ nhead,
+ dropout=attention_dropout_rate)
+ self.norm2 = LayerNorm(d_model, epsilon=epsilon)
+ self.dropout2 = Dropout(residual_dropout_rate)
+
+ self.mlp = Mlp(in_features=d_model,
+ hidden_features=dim_feedforward,
+ act_layer=nn.ReLU,
+ drop=residual_dropout_rate)
+
+ self.norm3 = LayerNorm(d_model, epsilon=epsilon)
- def __init__(self,
- d_model,
- nhead,
- dim_feedforward=2048,
- attention_dropout_rate=0.0,
- residual_dropout_rate=0.1):
- super(TransformerDecoderLayer, self).__init__()
- self.self_attn = MultiheadAttention(
- d_model, nhead, dropout=attention_dropout_rate)
- self.multihead_attn = MultiheadAttention(
- d_model, nhead, dropout=attention_dropout_rate)
-
- self.conv1 = Conv2D(
- in_channels=d_model,
- out_channels=dim_feedforward,
- kernel_size=(1, 1))
- self.conv2 = Conv2D(
- in_channels=dim_feedforward,
- out_channels=d_model,
- kernel_size=(1, 1))
-
- self.norm1 = LayerNorm(d_model)
- self.norm2 = LayerNorm(d_model)
- self.norm3 = LayerNorm(d_model)
- self.dropout1 = Dropout(residual_dropout_rate)
- self.dropout2 = Dropout(residual_dropout_rate)
self.dropout3 = Dropout(residual_dropout_rate)
- def forward(self,
- tgt,
- memory,
- tgt_mask=None,
- memory_mask=None,
- tgt_key_padding_mask=None,
- memory_key_padding_mask=None):
- """Pass the inputs (and mask) through the decoder layer.
+ def forward(self, tgt, memory=None, self_mask=None, cross_mask=None):
+ if self.with_self_attn:
+ tgt1 = self.self_attn(tgt, attn_mask=self_mask)
+ tgt = self.norm1(tgt + self.dropout1(tgt1))
- Args:
- tgt: the sequence to the decoder layer (required).
- memory: the sequence from the last layer of the encoder (required).
- tgt_mask: the mask for the tgt sequence (optional).
- memory_mask: the mask for the memory sequence (optional).
- tgt_key_padding_mask: the mask for the tgt keys per batch (optional).
- memory_key_padding_mask: the mask for the memory keys per batch (optional).
-
- """
- tgt2 = self.self_attn(
- tgt,
- tgt,
- tgt,
- attn_mask=tgt_mask,
- key_padding_mask=tgt_key_padding_mask)
- tgt = tgt + self.dropout1(tgt2)
- tgt = self.norm1(tgt)
- tgt2 = self.multihead_attn(
- tgt,
- memory,
- memory,
- attn_mask=memory_mask,
- key_padding_mask=memory_key_padding_mask)
- tgt = tgt + self.dropout2(tgt2)
- tgt = self.norm2(tgt)
-
- # default
- tgt = paddle.transpose(tgt, [1, 2, 0])
- tgt = paddle.unsqueeze(tgt, 2)
- tgt2 = self.conv2(F.relu(self.conv1(tgt)))
- tgt2 = paddle.squeeze(tgt2, 2)
- tgt2 = paddle.transpose(tgt2, [2, 0, 1])
- tgt = paddle.squeeze(tgt, 2)
- tgt = paddle.transpose(tgt, [2, 0, 1])
-
- tgt = tgt + self.dropout3(tgt2)
- tgt = self.norm3(tgt)
+ if self.with_cross_attn:
+ tgt2 = self.cross_attn(tgt, key=memory, attn_mask=cross_mask)
+ tgt = self.norm2(tgt + self.dropout2(tgt2))
+ tgt = self.norm3(tgt + self.dropout3(self.mlp(tgt)))
return tgt
-def _get_clones(module, N):
- return LayerList([copy.deepcopy(module) for i in range(N)])
-
-
class PositionalEncoding(nn.Layer):
"""Inject some information about the relative or absolute position of the tokens
in the sequence. The positional encodings have the same dimension as
@@ -651,8 +501,9 @@ class PositionalEncoding(nn.Layer):
Examples:
>>> output = pos_encoder(x)
"""
+ x = x.transpose([1, 0, 2])
x = x + self.pe[:paddle.shape(x)[0], :]
- return self.dropout(x)
+ return self.dropout(x).transpose([1, 0, 2])
class PositionalEncoding_2d(nn.Layer):
@@ -725,7 +576,7 @@ class PositionalEncoding_2d(nn.Layer):
class Embeddings(nn.Layer):
- def __init__(self, d_model, vocab, padding_idx, scale_embedding):
+ def __init__(self, d_model, vocab, padding_idx=None, scale_embedding=True):
super(Embeddings, self).__init__()
self.embedding = nn.Embedding(vocab, d_model, padding_idx=padding_idx)
w0 = np.random.normal(0.0, d_model**-0.5,
@@ -742,7 +593,7 @@ class Embeddings(nn.Layer):
class Beam():
- ''' Beam search '''
+ """ Beam search """
def __init__(self, size, device=False):
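The refactor above replaces the separate `TransformerEncoderLayer`/`TransformerDecoderLayer` (and their 1x1-conv feed-forward) with a single post-norm `TransformerBlock`: optional self-attention, optional cross-attention over the encoder memory, then an `Mlp`, each followed by dropout, a residual add, and LayerNorm; `PositionalEncoding` now also transposes only around the addition, so the module takes batch-first input. A minimal NumPy sketch of that residual-and-norm ordering (identity stand-ins for the attention and MLP sublayers, dropout omitted):

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def block(tgt, memory=None, self_attn=None, cross_attn=None, mlp=None):
    # Post-norm ordering used by TransformerBlock:
    if self_attn is not None:
        tgt = layer_norm(tgt + self_attn(tgt))           # norm1
    if cross_attn is not None:
        tgt = layer_norm(tgt + cross_attn(tgt, memory))  # norm2
    return layer_norm(tgt + mlp(tgt))                    # norm3

# Identity stand-ins just to exercise the control flow.
x = np.random.rand(2, 5, 8).astype("float32")
out = block(x, self_attn=lambda t: t, mlp=lambda t: t)
print(out.shape)  # (2, 5, 8)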
diff --git a/ppocr/modeling/heads/rec_sar_head.py b/ppocr/modeling/heads/rec_sar_head.py
index 0e6b34404b61b44bebcbc7d67ddfd0a95382c39b..5e64cae85afafc555f2519ed6dd3f05eafff7ea2 100644
--- a/ppocr/modeling/heads/rec_sar_head.py
+++ b/ppocr/modeling/heads/rec_sar_head.py
@@ -83,7 +83,7 @@ class SAREncoder(nn.Layer):
def forward(self, feat, img_metas=None):
if img_metas is not None:
- assert len(img_metas[0]) == feat.shape[0]
+ assert len(img_metas[0]) == paddle.shape(feat)[0]
valid_ratios = None
if img_metas is not None and self.mask:
@@ -98,9 +98,10 @@ class SAREncoder(nn.Layer):
if valid_ratios is not None:
valid_hf = []
- T = holistic_feat.shape[1]
- for i in range(len(valid_ratios)):
- valid_step = min(T, math.ceil(T * valid_ratios[i])) - 1
+ T = paddle.shape(holistic_feat)[1]
+ for i in range(paddle.shape(valid_ratios)[0]):
+ valid_step = paddle.minimum(
+ T, paddle.ceil(valid_ratios[i] * T).astype('int32')) - 1
valid_hf.append(holistic_feat[i, valid_step, :])
valid_hf = paddle.stack(valid_hf, axis=0)
else:
@@ -247,13 +248,14 @@ class ParallelSARDecoder(BaseDecoder):
# bsz * (seq_len + 1) * h * w * attn_size
attn_weight = self.conv1x1_2(attn_weight)
# bsz * (seq_len + 1) * h * w * 1
- bsz, T, h, w, c = attn_weight.shape
+ bsz, T, h, w, c = paddle.shape(attn_weight)
assert c == 1
if valid_ratios is not None:
# cal mask of attention weight
- for i in range(len(valid_ratios)):
- valid_width = min(w, math.ceil(w * valid_ratios[i]))
+ for i in range(paddle.shape(valid_ratios)[0]):
+ valid_width = paddle.minimum(
+ w, paddle.ceil(valid_ratios[i] * w).astype("int32"))
if valid_width < w:
attn_weight[i, :, :, valid_width:, :] = float('-inf')
@@ -288,7 +290,7 @@ class ParallelSARDecoder(BaseDecoder):
img_metas: [label, valid_ratio]
'''
if img_metas is not None:
- assert len(img_metas[0]) == feat.shape[0]
+ assert paddle.shape(img_metas[0])[0] == paddle.shape(feat)[0]
valid_ratios = None
if img_metas is not None and self.mask:
@@ -302,7 +304,6 @@ class ParallelSARDecoder(BaseDecoder):
# bsz * (seq_len + 1) * C
out_dec = self._2d_attention(
in_dec, feat, out_enc, valid_ratios=valid_ratios)
- # bsz * (seq_len + 1) * num_classes
return out_dec[:, 1:, :] # bsz * seq_len * num_classes
@@ -395,7 +396,6 @@ class SARHead(nn.Layer):
if self.training:
label = targets[0] # label
- label = paddle.to_tensor(label, dtype='int64')
final_out = self.decoder(
feat, holistic_feat, label, img_metas=targets)
else:
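The SAR head hunks above consistently swap `x.shape` for `paddle.shape(x)`. The distinction matters once the model is exported to a static graph: `x.shape` is a Python list captured at trace time (dynamic dimensions become placeholders), while `paddle.shape(x)` is a tensor evaluated at run time, so valid-ratio masking keeps working for variable batch sizes and widths. A small sketch, assuming a working Paddle install:

```python
import paddle

x = paddle.rand([4, 10, 32])
print(x.shape[0])          # 4, a plain Python int captured when tracing
print(paddle.shape(x)[0])  # an int32 Tensor holding 4, resolved at run time
```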
diff --git a/ppocr/modeling/heads/rec_srn_head.py b/ppocr/modeling/heads/rec_srn_head.py
index 8d59e4711a043afd9234f430a62c9876c0a8f6f4..1070d8cd648eb686c0a2e66df092b7dc6de29c42 100644
--- a/ppocr/modeling/heads/rec_srn_head.py
+++ b/ppocr/modeling/heads/rec_srn_head.py
@@ -20,13 +20,11 @@ import math
import paddle
from paddle import nn, ParamAttr
from paddle.nn import functional as F
-import paddle.fluid as fluid
import numpy as np
from .self_attention import WrapEncoderForFeature
from .self_attention import WrapEncoder
from paddle.static import Program
from ppocr.modeling.backbones.rec_resnet_fpn import ResNetFPN
-import paddle.fluid.framework as framework
from collections import OrderedDict
gradient_clip = 10
diff --git a/ppocr/modeling/heads/self_attention.py b/ppocr/modeling/heads/self_attention.py
index 6c27fdbe434166e9277cc8d695bce2743cbd8ec6..6e4c65e3931ae74a0fde2a16694a69fdfa69b5ed 100644
--- a/ppocr/modeling/heads/self_attention.py
+++ b/ppocr/modeling/heads/self_attention.py
@@ -22,7 +22,6 @@ import paddle
from paddle import ParamAttr, nn
from paddle import nn, ParamAttr
from paddle.nn import functional as F
-import paddle.fluid as fluid
import numpy as np
gradient_clip = 10
@@ -288,10 +287,10 @@ class PrePostProcessLayer(nn.Layer):
"layer_norm_%d" % len(self.sublayers()),
paddle.nn.LayerNorm(
normalized_shape=d_model,
- weight_attr=fluid.ParamAttr(
- initializer=fluid.initializer.Constant(1.)),
- bias_attr=fluid.ParamAttr(
- initializer=fluid.initializer.Constant(0.)))))
+ weight_attr=paddle.ParamAttr(
+ initializer=paddle.nn.initializer.Constant(1.)),
+ bias_attr=paddle.ParamAttr(
+ initializer=paddle.nn.initializer.Constant(0.)))))
elif cmd == "d": # add dropout
self.functors.append(lambda x: F.dropout(
x, p=dropout_rate, mode="downscale_in_infer")
@@ -324,7 +323,7 @@ class PrepareEncoder(nn.Layer):
def forward(self, src_word, src_pos):
src_word_emb = src_word
- src_word_emb = fluid.layers.cast(src_word_emb, 'float32')
+ src_word_emb = paddle.cast(src_word_emb, 'float32')
src_word_emb = paddle.scale(x=src_word_emb, scale=self.src_emb_dim**0.5)
src_pos = paddle.squeeze(src_pos, axis=-1)
src_pos_enc = self.emb(src_pos)
@@ -367,7 +366,7 @@ class PrepareDecoder(nn.Layer):
self.dropout_rate = dropout_rate
def forward(self, src_word, src_pos):
- src_word = fluid.layers.cast(src_word, 'int64')
+ src_word = paddle.cast(src_word, 'int64')
src_word = paddle.squeeze(src_word, axis=-1)
src_word_emb = self.emb0(src_word)
src_word_emb = paddle.scale(x=src_word_emb, scale=self.src_emb_dim**0.5)
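The `rec_srn_head.py` and `self_attention.py` hunks are a straight `paddle.fluid` to Paddle 2.x API migration. The equivalences used above, as a runnable sketch:

```python
import paddle

x = paddle.to_tensor([[1, 2], [3, 4]])
y = paddle.cast(x, 'float32')  # replaces fluid.layers.cast(x, 'float32')

# ParamAttr and the constant initializer moved out of fluid as well:
attr_w = paddle.ParamAttr(initializer=paddle.nn.initializer.Constant(1.0))
attr_b = paddle.ParamAttr(initializer=paddle.nn.initializer.Constant(0.0))
norm = paddle.nn.LayerNorm(normalized_shape=4, weight_attr=attr_w, bias_attr=attr_b)
```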
diff --git a/ppocr/modeling/heads/table_master_head.py b/ppocr/modeling/heads/table_master_head.py
index acd1a9145fc0aeb6d374a8555cd09347624f0172..4da6e9b59f78db5bfe1557317e11204f97544aa6 100644
--- a/ppocr/modeling/heads/table_master_head.py
+++ b/ppocr/modeling/heads/table_master_head.py
@@ -32,7 +32,7 @@ class TableMasterHead(nn.Layer):
d_ff=2048,
dropout=0,
max_text_length=500,
- point_num=4,
+ point_num=2,
**kwargs):
super(TableMasterHead, self).__init__()
hidden_size = in_channels[-1]
@@ -45,7 +45,7 @@ class TableMasterHead(nn.Layer):
self.cls_fc = nn.Linear(hidden_size, out_channels)
self.bbox_fc = nn.Sequential(
# nn.Linear(hidden_size, hidden_size),
- nn.Linear(hidden_size, point_num),
+ nn.Linear(hidden_size, point_num * 2),
nn.Sigmoid())
self.norm = nn.LayerNorm(hidden_size)
self.embedding = Embeddings(d_model=hidden_size, vocab=out_channels)
@@ -100,7 +100,7 @@ class TableMasterHead(nn.Layer):
output = paddle.zeros(
[input.shape[0], self.max_text_length + 1, self.out_channels])
bbox_output = paddle.zeros(
- [input.shape[0], self.max_text_length + 1, self.point_num])
+ [input.shape[0], self.max_text_length + 1, self.point_num * 2])
max_text_length = paddle.to_tensor(self.max_text_length)
for i in range(max_text_length + 1):
target_mask = self.make_mask(input)
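The `point_num` change above redefines the parameter as a number of points rather than a number of regressed values: the box head now outputs `point_num * 2` sigmoid-normalized coordinates per decoding step, so the default `point_num=2` (two `(x, y)` points per cell box) keeps the output width at 4. A shape check with illustrative numbers:

```python
# Illustrative numbers only; mirrors the zeros allocation in the forward pass.
batch_size, max_text_length, point_num = 8, 500, 2
bbox_output_shape = [batch_size, max_text_length + 1, point_num * 2]
print(bbox_output_shape)  # [8, 501, 4]
```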
diff --git a/ppocr/modeling/necks/db_fpn.py b/ppocr/modeling/necks/db_fpn.py
index 93ed2dbfd1fac9bf2d163c54d23a20e16b537981..8c3f52a331db5daafab2a38c0a441edd44eb141d 100644
--- a/ppocr/modeling/necks/db_fpn.py
+++ b/ppocr/modeling/necks/db_fpn.py
@@ -105,9 +105,10 @@ class DSConv(nn.Layer):
class DBFPN(nn.Layer):
- def __init__(self, in_channels, out_channels, **kwargs):
+ def __init__(self, in_channels, out_channels, use_asf=False, **kwargs):
super(DBFPN, self).__init__()
self.out_channels = out_channels
+ self.use_asf = use_asf
weight_attr = paddle.nn.initializer.KaimingUniform()
self.in2_conv = nn.Conv2D(
@@ -163,6 +164,9 @@ class DBFPN(nn.Layer):
weight_attr=ParamAttr(initializer=weight_attr),
bias_attr=False)
+ if self.use_asf is True:
+ self.asf = ASFBlock(self.out_channels, self.out_channels // 4)
+
def forward(self, x):
c2, c3, c4, c5 = x
@@ -187,6 +191,10 @@ class DBFPN(nn.Layer):
p3 = F.upsample(p3, scale_factor=2, mode="nearest", align_mode=1)
fuse = paddle.concat([p5, p4, p3, p2], axis=1)
+
+ if self.use_asf is True:
+ fuse = self.asf(fuse, [p5, p4, p3, p2])
+
return fuse
@@ -356,3 +364,64 @@ class LKPAN(nn.Layer):
fuse = paddle.concat([p5, p4, p3, p2], axis=1)
return fuse
+
+
+class ASFBlock(nn.Layer):
+ """
+ This code is adapted from:
+ https://github.com/MhLiao/DB/blob/master/decoders/feature_attention.py
+ """
+
+ def __init__(self, in_channels, inter_channels, out_features_num=4):
+ """
+ Adaptive Scale Fusion (ASF) block of DBNet++
+ Args:
+ in_channels: the number of channels in the input data
+ inter_channels: the number of middle channels
+ out_features_num: the number of fused stages
+ """
+ super(ASFBlock, self).__init__()
+ weight_attr = paddle.nn.initializer.KaimingUniform()
+ self.in_channels = in_channels
+ self.inter_channels = inter_channels
+ self.out_features_num = out_features_num
+ self.conv = nn.Conv2D(in_channels, inter_channels, 3, padding=1)
+
+ self.spatial_scale = nn.Sequential(
+ #Nx1xHxW
+ nn.Conv2D(
+ in_channels=1,
+ out_channels=1,
+ kernel_size=3,
+ bias_attr=False,
+ padding=1,
+ weight_attr=ParamAttr(initializer=weight_attr)),
+ nn.ReLU(),
+ nn.Conv2D(
+ in_channels=1,
+ out_channels=1,
+ kernel_size=1,
+ bias_attr=False,
+ weight_attr=ParamAttr(initializer=weight_attr)),
+ nn.Sigmoid())
+
+ self.channel_scale = nn.Sequential(
+ nn.Conv2D(
+ in_channels=inter_channels,
+ out_channels=out_features_num,
+ kernel_size=1,
+ bias_attr=False,
+ weight_attr=ParamAttr(initializer=weight_attr)),
+ nn.Sigmoid())
+
+ def forward(self, fuse_features, features_list):
+ fuse_features = self.conv(fuse_features)
+ spatial_x = paddle.mean(fuse_features, axis=1, keepdim=True)
+ attention_scores = self.spatial_scale(spatial_x) + fuse_features
+ attention_scores = self.channel_scale(attention_scores)
+ assert len(features_list) == self.out_features_num
+
+ out_list = []
+ for i in range(self.out_features_num):
+ out_list.append(attention_scores[:, i:i + 1] * features_list[i])
+ return paddle.concat(out_list, axis=1)
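`ASFBlock.forward` computes one attention score map per fused stage and reweights each stage before re-concatenating, which is what the `use_asf` branch in `DBFPN.forward` plugs in for DBNet++. A NumPy sketch of the reweighting, with a random stand-in for the learned spatial/channel attention:

```python
import numpy as np

# Hypothetical sizes: four FPN stages, each already reduced to C channels.
N, C, H, W = 1, 8, 16, 16
features = [np.random.rand(N, C, H, W).astype("float32") for _ in range(4)]
fuse = np.concatenate(features, axis=1)  # N x 4C x H x W, as in DBFPN.forward

# Stand-in for the learned attention: one sigmoid score map per stage.
scores = np.random.rand(N, 4, H, W).astype("float32")

# Scale each stage by its score map, then re-concatenate.
out = np.concatenate(
    [scores[:, i:i + 1] * features[i] for i in range(4)], axis=1)
assert out.shape == fuse.shape
```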
diff --git a/ppocr/optimizer/learning_rate.py b/ppocr/optimizer/learning_rate.py
index d96ab51896884428d88a70c5a6a1e4ab59252c55..7d45109b4857871f52764c64d6d32e5322fc7c57 100644
--- a/ppocr/optimizer/learning_rate.py
+++ b/ppocr/optimizer/learning_rate.py
@@ -310,6 +310,41 @@ class Const(object):
return learning_rate
+class DecayLearningRate(object):
+ """
+ DecayLearningRate learning rate decay
+ new_lr = (lr - end_lr) * (1 - step/decay_steps)**power + end_lr
+ Args:
+ learning_rate(float): initial learning rate
+ step_each_epoch(int): steps each epoch
+ epochs(int): total training epochs
+ factor(float): power of the polynomial; must be greater than 0.0 for the learning rate to decay. Default: 0.9
+ end_lr(float): The minimum final learning rate. Default: 0.0.
+ """
+
+ def __init__(self,
+ learning_rate,
+ step_each_epoch,
+ epochs,
+ factor=0.9,
+ end_lr=0,
+ **kwargs):
+ super(DecayLearningRate, self).__init__()
+ self.learning_rate = learning_rate
+ self.epochs = epochs + 1
+ self.factor = factor
+ self.end_lr = end_lr
+ self.decay_steps = step_each_epoch * epochs
+
+ def __call__(self):
+ learning_rate = lr.PolynomialDecay(
+ learning_rate=self.learning_rate,
+ decay_steps=self.decay_steps,
+ power=self.factor,
+ end_lr=self.end_lr)
+ return learning_rate
+
+
class MultiStepDecay(object):
"""
Piecewise learning rate decay
@@ -350,4 +385,4 @@ class MultiStepDecay(object):
start_lr=0.0,
end_lr=self.learning_rate,
last_epoch=self.last_epoch)
- return learning_rate
\ No newline at end of file
+ return learning_rate
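`DecayLearningRate` wraps `paddle.optimizer.lr.PolynomialDecay`, with `factor` mapped to `power`. The schedule from the docstring can be checked by hand; a plain-Python sketch with illustrative numbers:

```python
def poly_lr(step, lr=0.007, end_lr=0.0, decay_steps=1000, power=0.9):
    frac = min(step, decay_steps) / decay_steps
    return (lr - end_lr) * (1 - frac) ** power + end_lr

print(poly_lr(0))     # 0.007
print(poly_lr(500))   # ~0.00375  (0.007 * 0.5**0.9)
print(poly_lr(1000))  # 0.0
```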
diff --git a/ppocr/postprocess/__init__.py b/ppocr/postprocess/__init__.py
index 4a08f1531f7aa4f521c360e59d74ab62ff2911ba..26a23f1ea476c81a092fcbdd11ff79e4e38ec2e8 100644
--- a/ppocr/postprocess/__init__.py
+++ b/ppocr/postprocess/__init__.py
@@ -26,13 +26,13 @@ from .east_postprocess import EASTPostProcess
from .sast_postprocess import SASTPostProcess
from .fce_postprocess import FCEPostProcess
from .rec_postprocess import CTCLabelDecode, AttnLabelDecode, SRNLabelDecode, \
- DistillationCTCLabelDecode, NRTRLabelDecode, SARLabelDecode, \
- SEEDLabelDecode, PRENLabelDecode
-from .table_postprocess import TableMasterLabelDecode, TableLabelDecode
+ DistillationCTCLabelDecode, TableLabelDecode, NRTRLabelDecode, SARLabelDecode, \
+ SEEDLabelDecode, PRENLabelDecode, ViTSTRLabelDecode, ABINetLabelDecode
from .cls_postprocess import ClsPostProcess
from .pg_postprocess import PGPostProcess
from .vqa_token_ser_layoutlm_postprocess import VQASerTokenLayoutLMPostProcess
from .vqa_token_re_layoutlm_postprocess import VQAReTokenLayoutLMPostProcess
+from .table_postprocess import TableMasterLabelDecode, TableLabelDecode
def build_post_process(config, global_config=None):
@@ -43,7 +43,8 @@ def build_post_process(config, global_config=None):
'DistillationDBPostProcess', 'NRTRLabelDecode', 'SARLabelDecode',
'SEEDLabelDecode', 'VQASerTokenLayoutLMPostProcess',
'VQAReTokenLayoutLMPostProcess', 'PRENLabelDecode',
- 'DistillationSARLabelDecode', 'TableMasterLabelDecode'
+ 'DistillationSARLabelDecode', 'ViTSTRLabelDecode', 'ABINetLabelDecode',
+ 'TableMasterLabelDecode'
]
if config['name'] == 'PSEPostProcess':
diff --git a/ppocr/postprocess/rec_postprocess.py b/ppocr/postprocess/rec_postprocess.py
index 0d01b342106dc04fa44bc8f9fb74f56b1b67ff8a..cc7c2cb379cc476943152507569f0b0066189c46 100644
--- a/ppocr/postprocess/rec_postprocess.py
+++ b/ppocr/postprocess/rec_postprocess.py
@@ -140,70 +140,6 @@ class DistillationCTCLabelDecode(CTCLabelDecode):
return output
-class NRTRLabelDecode(BaseRecLabelDecode):
- """ Convert between text-label and text-index """
-
- def __init__(self, character_dict_path=None, use_space_char=True, **kwargs):
- super(NRTRLabelDecode, self).__init__(character_dict_path,
- use_space_char)
-
- def __call__(self, preds, label=None, *args, **kwargs):
-
- if len(preds) == 2:
- preds_id = preds[0]
- preds_prob = preds[1]
- if isinstance(preds_id, paddle.Tensor):
- preds_id = preds_id.numpy()
- if isinstance(preds_prob, paddle.Tensor):
- preds_prob = preds_prob.numpy()
- if preds_id[0][0] == 2:
- preds_idx = preds_id[:, 1:]
- preds_prob = preds_prob[:, 1:]
- else:
- preds_idx = preds_id
- text = self.decode(preds_idx, preds_prob, is_remove_duplicate=False)
- if label is None:
- return text
- label = self.decode(label[:, 1:])
- else:
- if isinstance(preds, paddle.Tensor):
- preds = preds.numpy()
- preds_idx = preds.argmax(axis=2)
- preds_prob = preds.max(axis=2)
- text = self.decode(preds_idx, preds_prob, is_remove_duplicate=False)
- if label is None:
- return text
- label = self.decode(label[:, 1:])
- return text, label
-
- def add_special_char(self, dict_character):
- dict_character = ['blank', '<unk>', '<s>', '</s>'] + dict_character
- return dict_character
-
- def decode(self, text_index, text_prob=None, is_remove_duplicate=False):
- """ convert text-index into text-label. """
- result_list = []
- batch_size = len(text_index)
- for batch_idx in range(batch_size):
- char_list = []
- conf_list = []
- for idx in range(len(text_index[batch_idx])):
- if text_index[batch_idx][idx] == 3: # end
- break
- try:
- char_list.append(self.character[int(text_index[batch_idx][
- idx])])
- except:
- continue
- if text_prob is not None:
- conf_list.append(text_prob[batch_idx][idx])
- else:
- conf_list.append(1)
- text = ''.join(char_list)
- result_list.append((text.lower(), np.mean(conf_list).tolist()))
- return result_list
-
-
class AttnLabelDecode(BaseRecLabelDecode):
""" Convert between text-label and text-index """
@@ -612,3 +548,122 @@ class PRENLabelDecode(BaseRecLabelDecode):
return text
label = self.decode(label)
return text, label
+
+
+class NRTRLabelDecode(BaseRecLabelDecode):
+ """ Convert between text-label and text-index """
+
+ def __init__(self, character_dict_path=None, use_space_char=True, **kwargs):
+ super(NRTRLabelDecode, self).__init__(character_dict_path,
+ use_space_char)
+
+ def __call__(self, preds, label=None, *args, **kwargs):
+
+ if len(preds) == 2:
+ preds_id = preds[0]
+ preds_prob = preds[1]
+ if isinstance(preds_id, paddle.Tensor):
+ preds_id = preds_id.numpy()
+ if isinstance(preds_prob, paddle.Tensor):
+ preds_prob = preds_prob.numpy()
+ if preds_id[0][0] == 2:
+ preds_idx = preds_id[:, 1:]
+ preds_prob = preds_prob[:, 1:]
+ else:
+ preds_idx = preds_id
+ text = self.decode(preds_idx, preds_prob, is_remove_duplicate=False)
+ if label is None:
+ return text
+ label = self.decode(label[:, 1:])
+ else:
+ if isinstance(preds, paddle.Tensor):
+ preds = preds.numpy()
+ preds_idx = preds.argmax(axis=2)
+ preds_prob = preds.max(axis=2)
+ text = self.decode(preds_idx, preds_prob, is_remove_duplicate=False)
+ if label is None:
+ return text
+ label = self.decode(label[:, 1:])
+ return text, label
+
+ def add_special_char(self, dict_character):
+ dict_character = ['blank', '<unk>', '<s>', '</s>'] + dict_character
+ return dict_character
+
+ def decode(self, text_index, text_prob=None, is_remove_duplicate=False):
+ """ convert text-index into text-label. """
+ result_list = []
+ batch_size = len(text_index)
+ for batch_idx in range(batch_size):
+ char_list = []
+ conf_list = []
+ for idx in range(len(text_index[batch_idx])):
+ try:
+ char_idx = self.character[int(text_index[batch_idx][idx])]
+ except:
+ continue
+ if char_idx == '</s>': # end
+ break
+ char_list.append(char_idx)
+ if text_prob is not None:
+ conf_list.append(text_prob[batch_idx][idx])
+ else:
+ conf_list.append(1)
+ text = ''.join(char_list)
+ result_list.append((text.lower(), np.mean(conf_list).tolist()))
+ return result_list
+
+
+class ViTSTRLabelDecode(NRTRLabelDecode):
+ """ Convert between text-label and text-index """
+
+ def __init__(self, character_dict_path=None, use_space_char=False,
+ **kwargs):
+ super(ViTSTRLabelDecode, self).__init__(character_dict_path,
+ use_space_char)
+
+ def __call__(self, preds, label=None, *args, **kwargs):
+ if isinstance(preds, paddle.Tensor):
+ preds = preds[:, 1:].numpy()
+ else:
+ preds = preds[:, 1:]
+ preds_idx = preds.argmax(axis=2)
+ preds_prob = preds.max(axis=2)
+ text = self.decode(preds_idx, preds_prob, is_remove_duplicate=False)
+ if label is None:
+ return text
+ label = self.decode(label[:, 1:])
+ return text, label
+
+ def add_special_char(self, dict_character):
+ dict_character = ['<s>', '</s>'] + dict_character
+ return dict_character
+
+
+class ABINetLabelDecode(NRTRLabelDecode):
+ """ Convert between text-label and text-index """
+
+ def __init__(self, character_dict_path=None, use_space_char=False,
+ **kwargs):
+ super(ABINetLabelDecode, self).__init__(character_dict_path,
+ use_space_char)
+
+ def __call__(self, preds, label=None, *args, **kwargs):
+ if isinstance(preds, dict):
+ preds = preds['align'][-1].numpy()
+ elif isinstance(preds, paddle.Tensor):
+ preds = preds.numpy()
+ else:
+ preds = preds
+
+ preds_idx = preds.argmax(axis=2)
+ preds_prob = preds.max(axis=2)
+ text = self.decode(preds_idx, preds_prob, is_remove_duplicate=False)
+ if label is None:
+ return text
+ label = self.decode(label)
+ return text, label
+
+ def add_special_char(self, dict_character):
+ dict_character = ['</s>'] + dict_character
+ return dict_character
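With the NRTR special tokens `['blank', '<unk>', '<s>', '</s>']` prepended to the charset, index 2 is the start token stripped in `__call__` (`preds_id[0][0] == 2`) and `'</s>'` is the stop condition in `decode`. A hand-run of the decode loop on made-up indices:

```python
character = ['blank', '<unk>', '<s>', '</s>', 'a', 'b', 'c']  # toy charset
pred_ids = [4, 5, 6, 3, 0]  # "abc", then </s>, then padding

chars = []
for idx in pred_ids:
    ch = character[idx]
    if ch == '</s>':  # end token: stop decoding
        break
    chars.append(ch)
print(''.join(chars))  # abc
```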
diff --git a/ppocr/postprocess/vqa_token_ser_layoutlm_postprocess.py b/ppocr/postprocess/vqa_token_ser_layoutlm_postprocess.py
index 782cdea6c58c69e0d728787e0e21e200c9e13790..8a6669f71f5ae6a7a16931e565b43355de5928d9 100644
--- a/ppocr/postprocess/vqa_token_ser_layoutlm_postprocess.py
+++ b/ppocr/postprocess/vqa_token_ser_layoutlm_postprocess.py
@@ -41,11 +41,13 @@ class VQASerTokenLayoutLMPostProcess(object):
self.id2label_map_for_show[val] = key
def __call__(self, preds, batch=None, *args, **kwargs):
+ if isinstance(preds, tuple):
+ preds = preds[0]
if isinstance(preds, paddle.Tensor):
preds = preds.numpy()
if batch is not None:
- return self._metric(preds, batch[1])
+ return self._metric(preds, batch[5])
else:
return self._infer(preds, **kwargs)
@@ -63,11 +65,11 @@ class VQASerTokenLayoutLMPostProcess(object):
j]])
return decode_out_list, label_decode_out_list
- def _infer(self, preds, attention_masks, segment_offset_ids, ocr_infos):
+ def _infer(self, preds, segment_offset_ids, ocr_infos):
results = []
- for pred, attention_mask, segment_offset_id, ocr_info in zip(
- preds, attention_masks, segment_offset_ids, ocr_infos):
+ for pred, segment_offset_id, ocr_info in zip(preds, segment_offset_ids,
+ ocr_infos):
pred = np.argmax(pred, axis=1)
pred = [self.id2label_map[idx] for idx in pred]
diff --git a/ppocr/utils/save_load.py b/ppocr/utils/save_load.py
index b09f1db6e938e8eb99148d69efce016f1cbe8628..3647111fddaa848a75873ab689559c63dd6d4814 100644
--- a/ppocr/utils/save_load.py
+++ b/ppocr/utils/save_load.py
@@ -177,9 +177,9 @@ def save_model(model,
model.backbone.model.save_pretrained(model_prefix)
metric_prefix = os.path.join(model_prefix, 'metric')
# save metric and config
+ with open(metric_prefix + '.states', 'wb') as f:
+ pickle.dump(kwargs, f, protocol=2)
if is_best:
- with open(metric_prefix + '.states', 'wb') as f:
- pickle.dump(kwargs, f, protocol=2)
logger.info('save best model is to {}'.format(model_prefix))
else:
logger.info("save model in {}".format(model_prefix))
diff --git a/ppocr/utils/utility.py b/ppocr/utils/utility.py
index 4a25ff8b2fa182faaf4f4ce8909c9ec2e9b55ccc..b881fcab20bc5ca076a0002bd72349768c7d881a 100755
--- a/ppocr/utils/utility.py
+++ b/ppocr/utils/utility.py
@@ -91,18 +91,19 @@ def check_and_read_gif(img_path):
def load_vqa_bio_label_maps(label_map_path):
with open(label_map_path, "r", encoding='utf-8') as fin:
lines = fin.readlines()
- lines = [line.strip() for line in lines]
- if "O" not in lines:
- lines.insert(0, "O")
- labels = []
- for line in lines:
- if line == "O":
- labels.append("O")
- else:
- labels.append("B-" + line)
- labels.append("I-" + line)
- label2id_map = {label: idx for idx, label in enumerate(labels)}
- id2label_map = {idx: label for idx, label in enumerate(labels)}
+ old_lines = [line.strip() for line in lines]
+ lines = ["O"]
+ for line in old_lines:
+ # "O" has already been in lines
+ if line.upper() in ["OTHER", "OTHERS", "IGNORE"]:
+ continue
+ lines.append(line)
+ labels = ["O"]
+ for line in lines[1:]:
+ labels.append("B-" + line)
+ labels.append("I-" + line)
+ label2id_map = {label.upper(): idx for idx, label in enumerate(labels)}
+ id2label_map = {idx: label.upper() for idx, label in enumerate(labels)}
return label2id_map, id2label_map
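The rewritten `load_vqa_bio_label_maps` folds any `OTHER`/`OTHERS`/`IGNORE` class into `O` and expands every remaining class into `B-`/`I-` tags, uppercasing the keys. The mapping it produces, reproduced on a toy class list:

```python
raw_classes = ["HEADER", "QUESTION", "ANSWER", "OTHER"]  # "OTHER" folds into "O"
lines = ["O"] + [c for c in raw_classes
                 if c.upper() not in ["OTHER", "OTHERS", "IGNORE"]]
labels = ["O"]
for name in lines[1:]:
    labels += ["B-" + name, "I-" + name]
label2id_map = {label.upper(): idx for idx, label in enumerate(labels)}
print(label2id_map)
# {'O': 0, 'B-HEADER': 1, 'I-HEADER': 2, 'B-QUESTION': 3, ...}
```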
diff --git a/ppocr/utils/visual.py b/ppocr/utils/visual.py
index 7a8c1674a74f89299de59f7cd120b4577a7499d8..235eb572a3975b4446ae2f2c9ad9c8558d5c5ad8 100644
--- a/ppocr/utils/visual.py
+++ b/ppocr/utils/visual.py
@@ -19,7 +19,7 @@ from PIL import Image, ImageDraw, ImageFont
def draw_ser_results(image,
ocr_results,
font_path="doc/fonts/simfang.ttf",
- font_size=18):
+ font_size=14):
np.random.seed(2021)
color = (np.random.permutation(range(255)),
np.random.permutation(range(255)),
@@ -40,9 +40,15 @@ def draw_ser_results(image,
if ocr_info["pred_id"] not in color_map:
continue
color = color_map[ocr_info["pred_id"]]
- text = "{}: {}".format(ocr_info["pred"], ocr_info["text"])
+ text = "{}: {}".format(ocr_info["pred"], ocr_info["transcription"])
- draw_box_txt(ocr_info["bbox"], text, draw, font, font_size, color)
+ if "bbox" in ocr_info:
+ # draw with ocr engine
+ bbox = ocr_info["bbox"]
+ else:
+ # draw with ocr groundtruth
+ bbox = trans_poly_to_bbox(ocr_info["points"])
+ draw_box_txt(bbox, text, draw, font, font_size, color)
img_new = Image.blend(image, img_new, 0.5)
return np.array(img_new)
@@ -62,6 +68,14 @@ def draw_box_txt(bbox, text, draw, font, font_size, color):
draw.text((bbox[0][0] + 1, start_y), text, fill=(255, 255, 255), font=font)
+def trans_poly_to_bbox(poly):
+ x1 = np.min([p[0] for p in poly])
+ x2 = np.max([p[0] for p in poly])
+ y1 = np.min([p[1] for p in poly])
+ y2 = np.max([p[1] for p in poly])
+ return [x1, y1, x2, y2]
+
+
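The new `trans_poly_to_bbox` helper is what lets `draw_ser_results` accept ground-truth polygons (`points`) as well as OCR-engine boxes (`bbox`): it takes the axis-aligned envelope of the polygon. For example:

```python
poly = [[10, 20], [110, 22], [108, 60], [9, 58]]  # a slightly skewed quad
x1, y1 = min(p[0] for p in poly), min(p[1] for p in poly)
x2, y2 = max(p[0] for p in poly), max(p[1] for p in poly)
print([x1, y1, x2, y2])  # [9, 20, 110, 60]
```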
def draw_re_results(image,
result,
font_path="doc/fonts/simfang.ttf",
@@ -80,10 +94,10 @@ def draw_re_results(image,
color_line = (0, 255, 0)
for ocr_info_head, ocr_info_tail in result:
- draw_box_txt(ocr_info_head["bbox"], ocr_info_head["text"], draw, font,
- font_size, color_head)
- draw_box_txt(ocr_info_tail["bbox"], ocr_info_tail["text"], draw, font,
- font_size, color_tail)
+ draw_box_txt(ocr_info_head["bbox"], ocr_info_head["transcription"],
+ draw, font, font_size, color_head)
+ draw_box_txt(ocr_info_tail["bbox"], ocr_info_tail["transcription"],
+ draw, font, font_size, color_tail)
center_head = (
(ocr_info_head['bbox'][0] + ocr_info_head['bbox'][2]) // 2,
diff --git a/ppstructure/docs/kie.md b/ppstructure/docs/kie.md
index 35498b33478d1010fd2548dfcb8586b4710723a1..315dd9f7bafa6b6160489eab330e8d278b2d119d 100644
--- a/ppstructure/docs/kie.md
+++ b/ppstructure/docs/kie.md
@@ -16,7 +16,7 @@ SDMGR是一个关键信息提取算法,将每个检测到的文本区域分类
训练和测试的数据采用wildreceipt数据集,通过如下指令下载数据集:
```
-wget https://paddleocr.bj.bcebos.com/dygraph_v2.1/kie/wildreceipt.tar && tar xf wildreceipt.tar
+wget https://paddleocr.bj.bcebos.com/ppstructure/dataset/wildreceipt.tar && tar xf wildreceipt.tar
```
执行预测:
diff --git a/ppstructure/docs/kie_en.md b/ppstructure/docs/kie_en.md
index 1fe38b0b399e9290526dafa5409673dc87026db7..7b3752223dd765e780d56d146c90bd0f892aac7b 100644
--- a/ppstructure/docs/kie_en.md
+++ b/ppstructure/docs/kie_en.md
@@ -15,7 +15,7 @@ This section provides a tutorial example on how to quickly use, train, and evalu
[Wildreceipt dataset](https://paperswithcode.com/dataset/wildreceipt) is used for this tutorial. It contains 1765 photos, with 25 classes, and 50000 text boxes, which can be downloaded by wget:
```shell
-wget https://paddleocr.bj.bcebos.com/dygraph_v2.1/kie/wildreceipt.tar && tar xf wildreceipt.tar
+wget https://paddleocr.bj.bcebos.com/ppstructure/dataset/wildreceipt.tar && tar xf wildreceipt.tar
```
Download the pretrained model and predict the result:
diff --git a/ppstructure/docs/models_list.md b/ppstructure/docs/models_list.md
index c7dab999ff6e370c56c5495e22e91f117b3d1275..dabce3a5149a88833d38a4395e31ac1f82306c4f 100644
--- a/ppstructure/docs/models_list.md
+++ b/ppstructure/docs/models_list.md
@@ -1,11 +1,11 @@
# PP-Structure 系列模型列表
-- [1. 版面分析模型](#1)
-- [2. OCR和表格识别模型](#2)
- - [2.1 OCR](#21)
- - [2.2 表格识别模型](#22)
-- [3. VQA模型](#3)
-- [4. KIE模型](#4)
+- [1. 版面分析模型](#1-版面分析模型)
+- [2. OCR和表格识别模型](#2-ocr和表格识别模型)
+ - [2.1 OCR](#21-ocr)
+ - [2.2 表格识别模型](#22-表格识别模型)
+- [3. VQA模型](#3-vqa模型)
+- [4. KIE模型](#4-kie模型)
@@ -42,11 +42,11 @@
|模型名称|模型简介|推理模型大小|下载地址|
| --- | --- | --- | --- |
-|ser_LayoutXLM_xfun_zh|基于LayoutXLM在xfun中文数据集上训练的SER模型|1.4G|[推理模型 coming soon]() / [训练模型](https://paddleocr.bj.bcebos.com/pplayout/re_LayoutXLM_xfun_zh.tar) |
-|re_LayoutXLM_xfun_zh|基于LayoutXLM在xfun中文数据集上训练的RE模型|1.4G|[推理模型 coming soon]() / [训练模型](https://paddleocr.bj.bcebos.com/pplayout/ser_LayoutXLM_xfun_zh.tar) |
-|ser_LayoutLMv2_xfun_zh|基于LayoutLMv2在xfun中文数据集上训练的SER模型|778M|[推理模型 coming soon]() / [训练模型](https://paddleocr.bj.bcebos.com/pplayout/ser_LayoutLMv2_xfun_zh.tar) |
+|ser_LayoutXLM_xfun_zh|基于LayoutXLM在xfun中文数据集上训练的SER模型|1.4G|[推理模型](https://paddleocr.bj.bcebos.com/pplayout/ser_LayoutXLM_xfun_zh_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/pplayout/ser_LayoutXLM_xfun_zh.tar) |
+|re_LayoutXLM_xfun_zh|基于LayoutXLM在xfun中文数据集上训练的RE模型|1.4G|[推理模型 coming soon]() / [训练模型](https://paddleocr.bj.bcebos.com/pplayout/re_LayoutXLM_xfun_zh.tar) |
+|ser_LayoutLMv2_xfun_zh|基于LayoutLMv2在xfun中文数据集上训练的SER模型|778M|[推理模型](https://paddleocr.bj.bcebos.com/pplayout/ser_LayoutLMv2_xfun_zh_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/pplayout/ser_LayoutLMv2_xfun_zh.tar) |
|re_LayoutLMv2_xfun_zh|基于LayoutLMv2在xfun中文数据集上训练的RE模型|765M|[推理模型 coming soon]() / [训练模型](https://paddleocr.bj.bcebos.com/pplayout/re_LayoutLMv2_xfun_zh.tar) |
-|ser_LayoutLM_xfun_zh|基于LayoutLM在xfun中文数据集上训练的SER模型|430M|[推理模型 coming soon]() / [训练模型](https://paddleocr.bj.bcebos.com/pplayout/ser_LayoutLM_xfun_zh.tar) |
+|ser_LayoutLM_xfun_zh|基于LayoutLM在xfun中文数据集上训练的SER模型|430M|[推理模型](https://paddleocr.bj.bcebos.com/pplayout/ser_LayoutLM_xfun_zh_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/pplayout/ser_LayoutLM_xfun_zh.tar) |
## 4. KIE模型
diff --git a/ppstructure/docs/models_list_en.md b/ppstructure/docs/models_list_en.md
index b92c10c241df72c85649b64f915b4266cd3fe410..e133a0bb2a9b017207b5e92ea444aba4633a7457 100644
--- a/ppstructure/docs/models_list_en.md
+++ b/ppstructure/docs/models_list_en.md
@@ -1,11 +1,11 @@
# PP-Structure Model list
-- [1. Layout Analysis](#1)
-- [2. OCR and Table Recognition](#2)
- - [2.1 OCR](#21)
- - [2.2 Table Recognition](#22)
-- [3. VQA](#3)
-- [4. KIE](#4)
+- [1. Layout Analysis](#1-layout-analysis)
+- [2. OCR and Table Recognition](#2-ocr-and-table-recognition)
+ - [2.1 OCR](#21-ocr)
+ - [2.2 Table Recognition](#22-table-recognition)
+- [3. VQA](#3-vqa)
+- [4. KIE](#4-kie)
@@ -42,11 +42,11 @@ If you need to use other OCR models, you can download the model in [PP-OCR model
|model| description |inference model size|download|
| --- |----------------------------------------------------------------| --- | --- |
-|ser_LayoutXLM_xfun_zh| SER model trained on xfun Chinese dataset based on LayoutXLM |1.4G|[inference model coming soon]() / [trained model](https://paddleocr.bj.bcebos.com/pplayout/re_LayoutXLM_xfun_zh.tar) |
-|re_LayoutXLM_xfun_zh| Re model trained on xfun Chinese dataset based on LayoutXLM |1.4G|[inference model coming soon]() / [trained model](https://paddleocr.bj.bcebos.com/pplayout/ser_LayoutXLM_xfun_zh.tar) |
-|ser_LayoutLMv2_xfun_zh| SER model trained on xfun Chinese dataset based on LayoutXLMv2 |778M|[inference model coming soon]() / [trained model](https://paddleocr.bj.bcebos.com/pplayout/ser_LayoutLMv2_xfun_zh.tar) |
+|ser_LayoutXLM_xfun_zh| SER model trained on xfun Chinese dataset based on LayoutXLM |1.4G|[inference model](https://paddleocr.bj.bcebos.com/pplayout/ser_LayoutXLM_xfun_zh_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/pplayout/ser_LayoutXLM_xfun_zh.tar) |
+|re_LayoutXLM_xfun_zh| Re model trained on xfun Chinese dataset based on LayoutXLM |1.4G|[inference model coming soon]() / [trained model](https://paddleocr.bj.bcebos.com/pplayout/re_LayoutXLM_xfun_zh.tar) |
+|ser_LayoutLMv2_xfun_zh| SER model trained on xfun Chinese dataset based on LayoutXLMv2 |778M|[inference model](https://paddleocr.bj.bcebos.com/pplayout/ser_LayoutLMv2_xfun_zh_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/pplayout/ser_LayoutLMv2_xfun_zh.tar) |
|re_LayoutLMv2_xfun_zh| Re model trained on xfun Chinese dataset based on LayoutXLMv2 |765M|[inference model coming soon]() / [trained model](https://paddleocr.bj.bcebos.com/pplayout/re_LayoutLMv2_xfun_zh.tar) |
-|ser_LayoutLM_xfun_zh| SER model trained on xfun Chinese dataset based on LayoutLM |430M|[inference model coming soon]() / [trained model](https://paddleocr.bj.bcebos.com/pplayout/ser_LayoutLM_xfun_zh.tar) |
+|ser_LayoutLM_xfun_zh| SER model trained on xfun Chinese dataset based on LayoutLM |430M|[inference model](https://paddleocr.bj.bcebos.com/pplayout/ser_LayoutLM_xfun_zh_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/pplayout/ser_LayoutLM_xfun_zh.tar) |
## 4. KIE
diff --git a/ppstructure/table/README.md b/ppstructure/table/README.md
index d21ef4aa3813b4ff49dc0580be35c5e2e0483c8f..b6804c6f09b4ee3d17cd2b81e6cc6642c1c1be9a 100644
--- a/ppstructure/table/README.md
+++ b/ppstructure/table/README.md
@@ -18,7 +18,7 @@ The table recognition mainly contains three models
The table recognition flow chart is as follows
-
+
1. The coordinates of single-line text are detected by the DB model and then sent to the recognition model to get the recognition result.
2. The table structure and cell coordinates are predicted by the RARE model.
diff --git a/ppstructure/table/predict_table.py b/ppstructure/table/predict_table.py
index 402d6c24189d044e2ee6d359edef8624d4aae145..aa05459589208dde66a6710322593d091af41325 100644
--- a/ppstructure/table/predict_table.py
+++ b/ppstructure/table/predict_table.py
@@ -28,6 +28,7 @@ import numpy as np
import time
import tools.infer.predict_rec as predict_rec
import tools.infer.predict_det as predict_det
+import tools.infer.utility as utility
from ppocr.utils.utility import get_image_file_list, check_and_read_gif
from ppocr.utils.logging import get_logger
from ppstructure.table.matcher import distance, compute_iou
@@ -59,11 +60,37 @@ class TableSystem(object):
self.text_recognizer = predict_rec.TextRecognizer(
args) if text_recognizer is None else text_recognizer
self.table_structurer = predict_strture.TableStructurer(args)
+ self.benchmark = args.benchmark
+ self.predictor, self.input_tensor, self.output_tensors, self.config = utility.create_predictor(
+ args, 'table', logger)
+ if args.benchmark:
+ import auto_log
+ pid = os.getpid()
+ gpu_id = utility.get_infer_gpuid()
+ self.autolog = auto_log.AutoLogger(
+ model_name="table",
+ model_precision=args.precision,
+ batch_size=1,
+ data_shape="dynamic",
+ save_path=None, #args.save_log_path,
+ inference_config=self.config,
+ pids=pid,
+ process_name=None,
+ gpu_ids=gpu_id if args.use_gpu else None,
+ time_keys=[
+ 'preprocess_time', 'inference_time', 'postprocess_time'
+ ],
+ warmup=0,
+ logger=logger)
def __call__(self, img, return_ocr_result_in_table=False):
result = dict()
ori_im = img.copy()
+ if self.benchmark:
+ self.autolog.times.start()
structure_res, elapse = self.table_structurer(copy.deepcopy(img))
+ if self.benchmark:
+ self.autolog.times.stamp()
dt_boxes, elapse = self.text_detector(copy.deepcopy(img))
dt_boxes = sorted_boxes(dt_boxes)
if return_ocr_result_in_table:
@@ -77,13 +104,11 @@ class TableSystem(object):
box = [x_min, y_min, x_max, y_max]
r_boxes.append(box)
dt_boxes = np.array(r_boxes)
-
logger.debug("dt_boxes num : {}, elapse : {}".format(
len(dt_boxes), elapse))
if dt_boxes is None:
return None, None
img_crop_list = []
-
for i in range(len(dt_boxes)):
det_box = dt_boxes[i]
x0, y0, x1, y1 = expand(2, det_box, ori_im.shape)
@@ -92,10 +117,14 @@ class TableSystem(object):
rec_res, elapse = self.text_recognizer(img_crop_list)
logger.debug("rec_res num : {}, elapse : {}".format(
len(rec_res), elapse))
+ if self.benchmark:
+ self.autolog.times.stamp()
if return_ocr_result_in_table:
result['rec_res'] = rec_res
pred_html, pred = self.rebuild_table(structure_res, dt_boxes, rec_res)
result['html'] = pred_html
+ if self.benchmark:
+ self.autolog.times.end(stamp=True)
return result
def rebuild_table(self, structure_res, dt_boxes, rec_res):
@@ -213,6 +242,8 @@ def main(args):
logger.info('excel saved to {}'.format(excel_path))
elapse = time.time() - starttime
logger.info("Predict time : {:.3f}s".format(elapse))
+ if args.benchmark:
+ text_sys.autolog.report()
if __name__ == "__main__":
diff --git a/ppstructure/utility.py b/ppstructure/utility.py
index 05452c23b53356991d1684f0ed0f63649447e915..af0616239b167ff9ca5f6e1222015d51338d6bab 100644
--- a/ppstructure/utility.py
+++ b/ppstructure/utility.py
@@ -41,6 +41,13 @@ def init_args():
type=ast.literal_eval,
default=None,
help='label map according to ppstructure/layout/README_ch.md')
+ # params for vqa
+ parser.add_argument("--vqa_algorithm", type=str, default='LayoutXLM')
+ parser.add_argument("--ser_model_dir", type=str)
+ parser.add_argument(
+ "--ser_dict_path",
+ type=str,
+ default="../train_data/XFUND/class_list_xfun.txt")
# params for inference
parser.add_argument(
"--mode",
diff --git a/ppstructure/vqa/README.md b/ppstructure/vqa/README.md
index e3a10671ddb6494eb15073e7ac007aa1e8e6a32a..05635265b5e5eff18429e2d595fc4195381299f5 100644
--- a/ppstructure/vqa/README.md
+++ b/ppstructure/vqa/README.md
@@ -1,19 +1,15 @@
English | [简体中文](README_ch.md)
-- [Document Visual Question Answering (Doc-VQA)](#Document-Visual-Question-Answering)
- - [1. Introduction](#1-Introduction)
- - [2. Performance](#2-performance)
- - [3. Effect demo](#3-Effect-demo)
- - [3.1 SER](#31-ser)
- - [3.2 RE](#32-re)
- - [4. Install](#4-Install)
- - [4.1 Installation dependencies](#41-Install-dependencies)
- - [4.2 Install PaddleOCR](#42-Install-PaddleOCR)
- - [5. Usage](#5-Usage)
- - [5.1 Data and Model Preparation](#51-Data-and-Model-Preparation)
- - [5.2 SER](#52-ser)
- - [5.3 RE](#53-re)
- - [6. Reference](#6-Reference-Links)
+- [1 Introduction](#1-introduction)
+- [2. Performance](#2-performance)
+- [3. Effect demo](#3-effect-demo)
+ - [3.1 SER](#31-ser)
+ - [3.2 RE](#32-re)
+- [4. Install](#4-install)
+ - [4.1 Install dependencies](#41-install-dependencies)
+  - [4.2 Install PaddleOCR](#42-install-paddleocr)
+- [5. Usage](#5-usage)
+  - [5.1 Data and Model Preparation](#51-data-and-model-preparation)
+  - [5.2 SER](#52-ser)
+  - [5.3 RE](#53-re)
+- [6. Reference Links](#6-reference-links)
+- [License](#license)
# Document Visual Question Answering
@@ -125,13 +121,13 @@ If you want to experience the prediction process directly, you can download the
* Download the processed dataset
-The download address of the processed XFUND Chinese dataset: [https://paddleocr.bj.bcebos.com/dataset/XFUND.tar](https://paddleocr.bj.bcebos.com/dataset/XFUND.tar).
+The download address of the processed XFUND Chinese dataset: [link](https://paddleocr.bj.bcebos.com/ppstructure/dataset/XFUND.tar).
Download and unzip the dataset, and place the dataset in the current directory after unzipping.
```shell
-wget https://paddleocr.bj.bcebos.com/dataset/XFUND.tar
+wget https://paddleocr.bj.bcebos.com/ppstructure/dataset/XFUND.tar
````
* Convert the dataset
@@ -187,17 +183,17 @@ CUDA_VISIBLE_DEVICES=0 python3 tools/eval.py -c configs/vqa/ser/layoutxlm.yml -o
````
Finally, `precision`, `recall`, `hmean` and other indicators will be printed
-* Use `OCR engine + SER` tandem prediction
+* `OCR + SER` tandem prediction based on the training engine
-Use the following command to complete the series prediction of `OCR engine + SER`, taking the pretrained SER model as an example:
+Use the following command to run the `OCR + SER` tandem prediction with the training engine, taking the LayoutXLM-based SER model as an example:
```shell
-CUDA_VISIBLE_DEVICES=0 python3 tools/infer_vqa_token_ser.py -c configs/vqa/ser/layoutxlm.yml -o Architecture.Backbone.checkpoints=pretrain/ser_LayoutXLM_xfun_zh/Global.infer_img=doc/vqa/input/zh_val_42.jpg
+CUDA_VISIBLE_DEVICES=0 python3 tools/infer_vqa_token_ser.py -c configs/vqa/ser/layoutxlm.yml -o Architecture.Backbone.checkpoints=pretrain/ser_LayoutXLM_xfun_zh/ Global.infer_img=doc/vqa/input/zh_val_42.jpg
````
Finally, the prediction result visualization image and the prediction result text file will be saved in the directory configured by the `config.Global.save_res_path` field. The prediction result text file is named `infer_results.txt`.
-* End-to-end evaluation of `OCR engine + SER` prediction system
+* End-to-end evaluation of `OCR + SER` prediction system
First use the `tools/infer_vqa_token_ser.py` script to complete the prediction of the dataset, then use the following command to evaluate.
@@ -205,6 +201,24 @@ First use the `tools/infer_vqa_token_ser.py` script to complete the prediction o
export CUDA_VISIBLE_DEVICES=0
python3 tools/eval_with_label_end2end.py --gt_json_path XFUND/zh_val/xfun_normalize_val.json --pred_json_path output_res/infer_results.txt
````
+* Model export
+
+Use the following command to export the SER model, taking the LayoutXLM-based SER model as an example:
+
+```shell
+python3.7 tools/export_model.py -c configs/vqa/ser/layoutxlm.yml -o Architecture.Backbone.checkpoints=pretrain/ser_LayoutXLM_xfun_zh/ Global.save_inference_dir=output/ser/infer
+```
+The converted model will be stored in the directory specified by the `Global.save_inference_dir` field.
+
+* `OCR + SER` tandem prediction based on the prediction engine
+
+Use the following command to run the `OCR + SER` tandem prediction with the prediction engine, taking the LayoutXLM-based SER model as an example:
+
+```shell
+cd ppstructure
+CUDA_VISIBLE_DEVICES=0 python3.7 vqa/predict_vqa_token_ser.py --vqa_algorithm=LayoutXLM --ser_model_dir=../output/ser/infer --ser_dict_path=../train_data/XFUND/class_list_xfun.txt --image_dir=docs/vqa/input/zh_val_42.jpg --output=output
+```
+After the prediction is successful, the visualization images and results will be saved in the directory specified by the `output` field
### 5.3 RE
@@ -247,11 +261,19 @@ Finally, `precision`, `recall`, `hmean` and other indicators will be printed
Use the following command to complete the series prediction of `OCR engine + SER + RE`, taking the pretrained SER and RE models as an example:
```shell
export CUDA_VISIBLE_DEVICES=0
-python3 tools/infer_vqa_token_ser_re.py -c configs/vqa/re/layoutxlm.yml -o Architecture.Backbone.checkpoints=pretrain/re_LayoutXLM_xfun_zh/Global.infer_img=doc/vqa/input/zh_val_21.jpg -c_ser configs/vqa/ser/layoutxlm. yml -o_ser Architecture.Backbone.checkpoints=pretrain/ser_LayoutXLM_xfun_zh/
+python3 tools/infer_vqa_token_ser_re.py -c configs/vqa/re/layoutxlm.yml -o Architecture.Backbone.checkpoints=pretrain/re_LayoutXLM_xfun_zh/ Global.infer_img=ppstructure/docs/vqa/input/zh_val_21.jpg -c_ser configs/vqa/ser/layoutxlm.yml -o_ser Architecture.Backbone.checkpoints=pretrain/ser_LayoutXLM_xfun_zh/
````
Finally, the prediction result visualization image and the prediction result text file will be saved in the directory configured by the `config.Global.save_res_path` field. The prediction result text file is named `infer_results.txt`.
+* Model export
+
+coming soon
+
+* `OCR + SER + RE` tandem prediction based on the prediction engine
+
+coming soon
+
## 6. Reference Links
- LayoutXLM: Multimodal Pre-training for Multilingual Visually-rich Document Understanding, https://arxiv.org/pdf/2104.08836.pdf
diff --git a/ppstructure/vqa/README_ch.md b/ppstructure/vqa/README_ch.md
index ff513f8f7d603d66a372ce383883f3bcf97a7880..b421a82d3a1cbe39f5c740bea486ec26593ab20f 100644
--- a/ppstructure/vqa/README_ch.md
+++ b/ppstructure/vqa/README_ch.md
@@ -1,19 +1,19 @@
[English](README.md) | 简体中文
-- [文档视觉问答(DOC-VQA)](#文档视觉问答doc-vqa)
- - [1. 简介](#1-简介)
- - [2. 性能](#2-性能)
- - [3. 效果演示](#3-效果演示)
- - [3.1 SER](#31-ser)
- - [3.2 RE](#32-re)
- - [4. 安装](#4-安装)
- - [4.1 安装依赖](#41-安装依赖)
- - [4.2 安装PaddleOCR(包含 PP-OCR 和 VQA)](#42-安装paddleocr包含-pp-ocr-和-vqa)
- - [5. 使用](#5-使用)
- - [5.1 数据和预训练模型准备](#51-数据和预训练模型准备)
- - [5.2 SER](#52-ser)
- - [5.3 RE](#53-re)
- - [6. 参考链接](#6-参考链接)
+- [1. 简介](#1-简介)
+- [2. 性能](#2-性能)
+- [3. 效果演示](#3-效果演示)
+ - [3.1 SER](#31-ser)
+ - [3.2 RE](#32-re)
+- [4. 安装](#4-安装)
+ - [4.1 安装依赖](#41-安装依赖)
+ - [4.2 安装PaddleOCR(包含 PP-OCR 和 VQA)](#42-安装paddleocr包含-pp-ocr-和-vqa)
+- [5. 使用](#5-使用)
+ - [5.1 数据和预训练模型准备](#51-数据和预训练模型准备)
+ - [5.2 SER](#52-ser)
+ - [5.3 RE](#53-re)
+- [6. 参考链接](#6-参考链接)
+- [License](#license)
# 文档视觉问答(DOC-VQA)
@@ -52,7 +52,7 @@ PP-Structure 里的 DOC-VQA算法基于PaddleNLP自然语言处理算法库进
### 3.1 SER
- | 
+ | 
---|---
图中不同颜色的框表示不同的类别,对于XFUND数据集,有`QUESTION`, `ANSWER`, `HEADER` 3种类别
@@ -65,7 +65,7 @@ PP-Structure 里的 DOC-VQA算法基于PaddleNLP自然语言处理算法库进
### 3.2 RE
- | 
+ | 
---|---
@@ -122,13 +122,13 @@ python3 -m pip install -r ppstructure/vqa/requirements.txt
* 下载处理好的数据集
-处理好的XFUND中文数据集下载地址:[https://paddleocr.bj.bcebos.com/dataset/XFUND.tar](https://paddleocr.bj.bcebos.com/dataset/XFUND.tar)。
+处理好的XFUND中文数据集下载地址:[链接](https://paddleocr.bj.bcebos.com/ppstructure/dataset/XFUND.tar)。
下载并解压该数据集,解压后将数据集放置在当前目录下。
```shell
-wget https://paddleocr.bj.bcebos.com/dataset/XFUND.tar
+wget https://paddleocr.bj.bcebos.com/ppstructure/dataset/XFUND.tar
```
* 转换数据集
@@ -183,16 +183,16 @@ CUDA_VISIBLE_DEVICES=0 python3 tools/eval.py -c configs/vqa/ser/layoutxlm.yml -o
```
最终会打印出`precision`, `recall`, `hmean`等指标
-* 使用`OCR引擎 + SER`串联预测
+* 基于训练引擎的`OCR + SER`串联预测
-使用如下命令即可完成`OCR引擎 + SER`的串联预测, 以SER预训练模型为例:
+使用如下命令即可完成基于训练引擎的`OCR + SER`的串联预测, 以基于LayoutXLM的SER模型为例:
```shell
CUDA_VISIBLE_DEVICES=0 python3 tools/infer_vqa_token_ser.py -c configs/vqa/ser/layoutxlm.yml -o Architecture.Backbone.checkpoints=pretrain/ser_LayoutXLM_xfun_zh/ Global.infer_img=doc/vqa/input/zh_val_42.jpg
```
最终会在`config.Global.save_res_path`字段所配置的目录下保存预测结果可视化图像以及预测结果文本文件,预测结果文本文件名为`infer_results.txt`。
-* 对`OCR引擎 + SER`预测系统进行端到端评估
+* 对`OCR + SER`预测系统进行端到端评估
首先使用 `tools/infer_vqa_token_ser.py` 脚本完成数据集的预测,然后使用下面的命令进行评估。
@@ -200,6 +200,24 @@ CUDA_VISIBLE_DEVICES=0 python3 tools/infer_vqa_token_ser.py -c configs/vqa/ser/l
export CUDA_VISIBLE_DEVICES=0
python3 tools/eval_with_label_end2end.py --gt_json_path XFUND/zh_val/xfun_normalize_val.json --pred_json_path output_res/infer_results.txt
```
+* 模型导出
+
+使用如下命令即可完成SER模型的模型导出, 以基于LayoutXLM的SER模型为例:
+
+```shell
+python3.7 tools/export_model.py -c configs/vqa/ser/layoutxlm.yml -o Architecture.Backbone.checkpoints=pretrain/ser_LayoutXLM_xfun_zh/ Global.save_inference_dir=output/ser/infer
+```
+转换后的模型会存放在`Global.save_inference_dir`字段指定的目录下。
+
+* 基于预测引擎的`OCR + SER`串联预测
+
+使用如下命令即可完成基于预测引擎的`OCR + SER`的串联预测, 以基于LayoutXLM的SER模型为例:
+
+```shell
+cd ppstructure
+CUDA_VISIBLE_DEVICES=0 python3.7 vqa/predict_vqa_token_ser.py --vqa_algorithm=LayoutXLM --ser_model_dir=../output/ser/infer --ser_dict_path=../train_data/XFUND/class_list_xfun.txt --image_dir=docs/vqa/input/zh_val_42.jpg --output=output
+```
+预测成功后,可视化图片和结果会保存在`output`字段指定的目录下
### 5.3 RE
@@ -236,16 +254,24 @@ CUDA_VISIBLE_DEVICES=0 python3 tools/eval.py -c configs/vqa/re/layoutxlm.yml -o
```
最终会打印出`precision`, `recall`, `hmean`等指标
-* 使用`OCR引擎 + SER + RE`串联预测
+* 基于训练引擎的`OCR + SER + RE`串联预测
-使用如下命令即可完成`OCR引擎 + SER + RE`的串联预测, 以预训练SER和RE模型为例:
+使用如下命令即可完成基于训练引擎的`OCR + SER + RE`串联预测, 以基于LayoutXLM的SER和RE模型为例:
```shell
export CUDA_VISIBLE_DEVICES=0
-python3 tools/infer_vqa_token_ser_re.py -c configs/vqa/re/layoutxlm.yml -o Architecture.Backbone.checkpoints=pretrain/re_LayoutXLM_xfun_zh/ Global.infer_img=doc/vqa/input/zh_val_21.jpg -c_ser configs/vqa/ser/layoutxlm.yml -o_ser Architecture.Backbone.checkpoints=pretrain/ser_LayoutXLM_xfun_zh/
+python3 tools/infer_vqa_token_ser_re.py -c configs/vqa/re/layoutxlm.yml -o Architecture.Backbone.checkpoints=pretrain/re_LayoutXLM_xfun_zh/ Global.infer_img=ppstructure/docs/vqa/input/zh_val_21.jpg -c_ser configs/vqa/ser/layoutxlm.yml -o_ser Architecture.Backbone.checkpoints=pretrain/ser_LayoutXLM_xfun_zh/
```
最终会在`config.Global.save_res_path`字段所配置的目录下保存预测结果可视化图像以及预测结果文本文件,预测结果文本文件名为`infer_results.txt`。
+* 模型导出
+
+coming soon
+
+* 基于预测引擎的`OCR + SER + RE`串联预测
+
+coming soon
+
## 6. 参考链接
- LayoutXLM: Multimodal Pre-training for Multilingual Visually-rich Document Understanding, https://arxiv.org/pdf/2104.08836.pdf
diff --git a/ppstructure/vqa/labels/labels_ser.txt b/ppstructure/vqa/labels/labels_ser.txt
deleted file mode 100644
index 508e48112412f62538baf0c78bcf99ec8945196e..0000000000000000000000000000000000000000
--- a/ppstructure/vqa/labels/labels_ser.txt
+++ /dev/null
@@ -1,3 +0,0 @@
-QUESTION
-ANSWER
-HEADER
diff --git a/ppstructure/vqa/predict_vqa_token_ser.py b/ppstructure/vqa/predict_vqa_token_ser.py
new file mode 100644
index 0000000000000000000000000000000000000000..de0bbfe72d80d9a16de8b09657a98dc5285bb348
--- /dev/null
+++ b/ppstructure/vqa/predict_vqa_token_ser.py
@@ -0,0 +1,169 @@
+# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+import os
+import sys
+
+__dir__ = os.path.dirname(os.path.abspath(__file__))
+sys.path.append(__dir__)
+sys.path.insert(0, os.path.abspath(os.path.join(__dir__, '../..')))
+
+os.environ["FLAGS_allocator_strategy"] = 'auto_growth'
+
+import cv2
+import json
+import numpy as np
+import time
+
+import tools.infer.utility as utility
+from ppocr.data import create_operators, transform
+from ppocr.postprocess import build_post_process
+from ppocr.utils.logging import get_logger
+from ppocr.utils.visual import draw_ser_results
+from ppocr.utils.utility import get_image_file_list, check_and_read_gif
+from ppstructure.utility import parse_args
+
+from paddleocr import PaddleOCR
+
+logger = get_logger()
+
+
+class SerPredictor(object):
+ def __init__(self, args):
+ self.ocr_engine = PaddleOCR(use_angle_cls=False, show_log=False)
+
+ pre_process_list = [{
+ 'VQATokenLabelEncode': {
+ 'algorithm': args.vqa_algorithm,
+ 'class_path': args.ser_dict_path,
+ 'contains_re': False,
+ 'ocr_engine': self.ocr_engine
+ }
+ }, {
+ 'VQATokenPad': {
+ 'max_seq_len': 512,
+ 'return_attention_mask': True
+ }
+ }, {
+ 'VQASerTokenChunk': {
+ 'max_seq_len': 512,
+ 'return_attention_mask': True
+ }
+ }, {
+ 'Resize': {
+ 'size': [224, 224]
+ }
+ }, {
+ 'NormalizeImage': {
+ 'std': [58.395, 57.12, 57.375],
+ 'mean': [123.675, 116.28, 103.53],
+ 'scale': '1',
+ 'order': 'hwc'
+ }
+ }, {
+ 'ToCHWImage': None
+ }, {
+ 'KeepKeys': {
+ 'keep_keys': [
+ 'input_ids', 'bbox', 'attention_mask', 'token_type_ids',
+ 'image', 'labels', 'segment_offset_id', 'ocr_info',
+ 'entities'
+ ]
+ }
+ }]
+ postprocess_params = {
+ 'name': 'VQASerTokenLayoutLMPostProcess',
+ "class_path": args.ser_dict_path,
+ }
+
+ self.preprocess_op = create_operators(pre_process_list,
+ {'infer_mode': True})
+ self.postprocess_op = build_post_process(postprocess_params)
+ self.predictor, self.input_tensor, self.output_tensors, self.config = \
+ utility.create_predictor(args, 'ser', logger)
+
+ def __call__(self, img):
+ ori_im = img.copy()
+ data = {'image': img}
+ data = transform(data, self.preprocess_op)
+ img = data[0]
+ if img is None:
+ return None, 0
+ img = np.expand_dims(img, axis=0)
+ img = img.copy()
+ starttime = time.time()
+
+ for idx in range(len(self.input_tensor)):
+ expand_input = np.expand_dims(data[idx], axis=0)
+ self.input_tensor[idx].copy_from_cpu(expand_input)
+
+ self.predictor.run()
+
+ outputs = []
+ for output_tensor in self.output_tensors:
+ output = output_tensor.copy_to_cpu()
+ outputs.append(output)
+ preds = outputs[0]
+
+ post_result = self.postprocess_op(
+ preds, segment_offset_ids=[data[6]], ocr_infos=[data[7]])
+ elapse = time.time() - starttime
+ return post_result, elapse
+
+
+def main(args):
+ image_file_list = get_image_file_list(args.image_dir)
+ ser_predictor = SerPredictor(args)
+ count = 0
+ total_time = 0
+
+ os.makedirs(args.output, exist_ok=True)
+ with open(
+ os.path.join(args.output, 'infer.txt'), mode='w',
+ encoding='utf-8') as f_w:
+ for image_file in image_file_list:
+ img, flag = check_and_read_gif(image_file)
+ if not flag:
+ img = cv2.imread(image_file)
+ if img is None:
+ logger.info("error in loading image:{}".format(image_file))
+ continue
+ img = img[:, :, ::-1]  # BGR -> RGB, safe now that img is not None
+ ser_res, elapse = ser_predictor(img)
+ ser_res = ser_res[0]
+
+ res_str = '{}\t{}\n'.format(
+ image_file,
+ json.dumps(
+ {
+ "ocr_info": ser_res,
+ }, ensure_ascii=False))
+ f_w.write(res_str)
+
+ img_res = draw_ser_results(
+ image_file,
+ ser_res,
+ font_path="../doc/fonts/simfang.ttf", )
+
+ img_save_path = os.path.join(args.output,
+ os.path.basename(image_file))
+ cv2.imwrite(img_save_path, img_res)
+ logger.info("save vis result to {}".format(img_save_path))
+ if count > 0:
+ total_time += elapse
+ count += 1
+ logger.info("Predict time of {}: {}".format(image_file, elapse))
+
+
+if __name__ == "__main__":
+ main(parse_args())
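A detail worth noting in `SerPredictor.__call__` above: `data` is the flat list produced by the `KeepKeys` operator, so the postprocess call addresses `segment_offset_id` and `ocr_info` purely by position. A quick check of those indices:

```python
keep_keys = ['input_ids', 'bbox', 'attention_mask', 'token_type_ids',
             'image', 'labels', 'segment_offset_id', 'ocr_info', 'entities']
print(keep_keys.index('segment_offset_id'))  # 6 -> data[6]
print(keep_keys.index('ocr_info'))           # 7 -> data[7]
```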
diff --git a/ppstructure/vqa/requirements.txt b/ppstructure/vqa/requirements.txt
index 0042ec0baedcc3e7bbecb922d10b93c95219219d..fcd882274c4402ba2a1d34f20ee6e2befa157121 100644
--- a/ppstructure/vqa/requirements.txt
+++ b/ppstructure/vqa/requirements.txt
@@ -1,4 +1,7 @@
sentencepiece
yacs
seqeval
-paddlenlp>=2.2.1
\ No newline at end of file
+paddlenlp>=2.2.1
+pypandoc
+attrdict
+python_docx
\ No newline at end of file
diff --git a/ppstructure/vqa/tools/trans_xfun_data.py b/ppstructure/vqa/tools/trans_xfun_data.py
index 93ec98163c6cec96ec93399c1d41524200ddc499..11d221bea40367f091b3e09dde42e87f2217a617 100644
--- a/ppstructure/vqa/tools/trans_xfun_data.py
+++ b/ppstructure/vqa/tools/trans_xfun_data.py
@@ -21,26 +21,22 @@ def transfer_xfun_data(json_path=None, output_file=None):
json_info = json.loads(lines[0])
documents = json_info["documents"]
- label_info = {}
with open(output_file, "w", encoding='utf-8') as fout:
for idx, document in enumerate(documents):
+ label_info = []
img_info = document["img"]
document = document["document"]
image_path = img_info["fname"]
- label_info["height"] = img_info["height"]
- label_info["width"] = img_info["width"]
-
- label_info["ocr_info"] = []
-
for doc in document:
- label_info["ocr_info"].append({
- "text": doc["text"],
+ x1, y1, x2, y2 = doc["box"]
+ points = [[x1, y1], [x2, y1], [x2, y2], [x1, y2]]
+ label_info.append({
+ "transcription": doc["text"],
"label": doc["label"],
- "bbox": doc["box"],
+ "points": points,
"id": doc["id"],
- "linking": doc["linking"],
- "words": doc["words"]
+ "linking": doc["linking"]
})
fout.write(image_path + "\t" + json.dumps(
diff --git a/test_tipc/build_server.sh b/test_tipc/build_server.sh
new file mode 100644
index 0000000000000000000000000000000000000000..3173359785290ffa5c6f865efe96705e2b09fae1
--- /dev/null
+++ b/test_tipc/build_server.sh
@@ -0,0 +1,69 @@
+# Docker image to use:
+# registry.baidubce.com/paddlepaddle/paddle:latest-dev-cuda10.1-cudnn7-gcc82
+
+# Building the Serving server:
+
+# The client and app packages can use the release versions directly.
+
+# The server must be rebuilt from source because a custom OP has been added.
+
+apt-get update
+apt install -y libcurl4-openssl-dev libbz2-dev
+wget https://paddle-serving.bj.bcebos.com/others/centos_ssl.tar && \
+    tar xf centos_ssl.tar && rm -rf centos_ssl.tar && \
+    mv libcrypto.so.1.0.2k /usr/lib/libcrypto.so.1.0.2k && \
+    mv libssl.so.1.0.2k /usr/lib/libssl.so.1.0.2k && \
+    ln -sf /usr/lib/libcrypto.so.1.0.2k /usr/lib/libcrypto.so.10 && \
+    ln -sf /usr/lib/libssl.so.1.0.2k /usr/lib/libssl.so.10 && \
+    ln -sf /usr/lib/libcrypto.so.10 /usr/lib/libcrypto.so && \
+    ln -sf /usr/lib/libssl.so.10 /usr/lib/libssl.so
+
+# Install the Go dependencies
+rm -rf /usr/local/go
+wget -qO- https://paddle-ci.cdn.bcebos.com/go1.17.2.linux-amd64.tar.gz | tar -xz -C /usr/local
+export GOROOT=/usr/local/go
+export GOPATH=/root/gopath
+export PATH=$PATH:$GOPATH/bin:$GOROOT/bin
+go env -w GO111MODULE=on
+go env -w GOPROXY=https://goproxy.cn,direct
+go install github.com/grpc-ecosystem/grpc-gateway/protoc-gen-grpc-gateway@v1.15.2
+go install github.com/grpc-ecosystem/grpc-gateway/protoc-gen-swagger@v1.15.2
+go install github.com/golang/protobuf/protoc-gen-go@v1.4.3
+go install google.golang.org/grpc@v1.33.0
+go env -w GO111MODULE=auto
+
+# Download the OpenCV library
+wget https://paddle-qa.bj.bcebos.com/PaddleServing/opencv3.tar.gz && tar -xvf opencv3.tar.gz && rm -rf opencv3.tar.gz
+export OPENCV_DIR=$PWD/opencv3
+
+# clone Serving
+git clone https://github.com/PaddlePaddle/Serving.git -b develop --depth=1
+cd Serving
+export Serving_repo_path=$PWD
+git submodule update --init --recursive
+python -m pip install -r python/requirements.txt
+
+
+export PYTHON_INCLUDE_DIR=$(python -c "from distutils.sysconfig import get_python_inc; print(get_python_inc())")
+export PYTHON_LIBRARIES=$(python -c "import distutils.sysconfig as sysconfig; print(sysconfig.get_config_var('LIBDIR'))")
+export PYTHON_EXECUTABLE=`which python`
+
+export CUDA_PATH='/usr/local/cuda'
+export CUDNN_LIBRARY='/usr/local/cuda/lib64/'
+export CUDA_CUDART_LIBRARY='/usr/local/cuda/lib64/'
+export TENSORRT_LIBRARY_PATH='/usr/local/TensorRT6-cuda10.1-cudnn7/targets/x86_64-linux-gnu/'
+
+# Copy the custom OP code into the Serving repo
+cp -rf ../deploy/pdserving/general_detection_op.cpp ${Serving_repo_path}/core/general-server/op
+
+# Build the server and export SERVING_BIN
+mkdir server-build-gpu-opencv && cd server-build-gpu-opencv
+cmake -DPYTHON_INCLUDE_DIR=$PYTHON_INCLUDE_DIR \
+ -DPYTHON_LIBRARIES=$PYTHON_LIBRARIES \
+ -DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
+ -DCUDA_TOOLKIT_ROOT_DIR=${CUDA_PATH} \
+ -DCUDNN_LIBRARY=${CUDNN_LIBRARY} \
+ -DCUDA_CUDART_LIBRARY=${CUDA_CUDART_LIBRARY} \
+ -DTENSORRT_ROOT=${TENSORRT_LIBRARY_PATH} \
+ -DOPENCV_DIR=${OPENCV_DIR} \
+ -DWITH_OPENCV=ON \
+ -DSERVER=ON \
+ -DWITH_GPU=ON ..
+make -j32
+
+python -m pip install python/dist/paddle*
+export SERVING_BIN=$PWD/core/general-server/serving
+cd ../../
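After the build, the serving tests rely on `SERVING_BIN` pointing at the freshly compiled binary rather than the pip-installed one. A small Python sanity check along these lines (a hypothetical helper, not part of the patch) can catch a stale environment before a long TIPC run:

```python
import os

def check_serving_bin():
    """Fail fast if SERVING_BIN does not point at an executable binary."""
    serving_bin = os.environ.get("SERVING_BIN")
    if not serving_bin:
        raise EnvironmentError("SERVING_BIN is not set; run build_server.sh first")
    if not (os.path.isfile(serving_bin) and os.access(serving_bin, os.X_OK)):
        raise EnvironmentError(serving_bin + " is missing or not executable")
    return serving_bin

if __name__ == "__main__":
    print("using serving binary:", check_serving_bin())
```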
diff --git a/test_tipc/common_func.sh b/test_tipc/common_func.sh
index 85dfe217253bb1d4c8b92f17d26f138121a2a198..f7d8a1e04adee9d32332eda8cb5913bbaf168481 100644
--- a/test_tipc/common_func.sh
+++ b/test_tipc/common_func.sh
@@ -57,10 +57,11 @@ function status_check(){
last_status=$1 # the exit code
run_command=$2
run_log=$3
+ model_name=$4
if [ $last_status -eq 0 ]; then
- echo -e "\033[33m Run successfully with command - ${run_command}! \033[0m" | tee -a ${run_log}
+ echo -e "\033[33m Run successfully with command - ${model_name} - ${run_command}! \033[0m" | tee -a ${run_log}
else
- echo -e "\033[33m Run failed with command - ${run_command}! \033[0m" | tee -a ${run_log}
+ echo -e "\033[33m Run failed with command - ${model_name} - ${run_command}! \033[0m" | tee -a ${run_log}
fi
}
diff --git a/test_tipc/configs/ch_PP-OCRv2/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv2/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..a0c49a0812c5cb47c848d4b6e68e0b10f835c760
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv2/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,20 @@
+===========================cpp_infer_params===========================
+model_name:ch_PP-OCRv2
+use_opencv:True
+infer_model:./inference/ch_PP-OCRv2_det_infer/
+infer_quant:False
+inference:./deploy/cpp_infer/build/ppocr --rec_char_dict_path=./ppocr/utils/ppocr_keys_v1.txt --rec_img_h=32
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1
+--use_tensorrt:False
+--precision:fp32
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+--rec_model_dir:./inference/ch_PP-OCRv2_rec_infer/
+--benchmark:True
+--det:True
+--rec:True
+--cls:False
+--use_angle_cls:False
\ No newline at end of file
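These TIPC config files are line-oriented `key:value` pairs, where `|` separates the alternative values a test matrix should sweep (for example, `--use_gpu:True|False` produces one GPU and one CPU run). The actual consumers are the bash scripts under `test_tipc/`, which index fields by line number; the sketch below is only an illustration of the format:

```python
def parse_tipc_config(path):
    """Parse a TIPC config file into {key: [alternative values]}.
    Note: duplicate keys (e.g. repeated 'null') overwrite earlier ones here;
    the real bash scripts read fields positionally instead."""
    params = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("="):  # skip section banners
                continue
            key, _, value = line.partition(":")
            params[key] = value.split("|") if value else [""]
    return params

cfg = parse_tipc_config(
    "test_tipc/configs/ch_PP-OCRv2/"
    "model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt")
print(cfg["--use_gpu"])  # ['True', 'False']
```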
diff --git a/test_tipc/configs/ch_PP-OCRv2/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv2/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt
index fcac6e3984cf3fd45fec9f7b736f794289278b25..32b290a9ed0ca04032e7854d9739e1612a6a095f 100644
--- a/test_tipc/configs/ch_PP-OCRv2/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt
+++ b/test_tipc/configs/ch_PP-OCRv2/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt
@@ -6,10 +6,10 @@ infer_export:null
infer_quant:False
inference:tools/infer/predict_system.py
--use_gpu:False|True
---enable_mkldnn:False|True
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1
---use_tensorrt:False|True
+--use_tensorrt:False
--precision:fp32
--det_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/
diff --git a/test_tipc/configs/ch_PP-OCRv2/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt b/test_tipc/configs/ch_PP-OCRv2/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..24eb620eeb6fc0cf880bb88560805491d6e8e2d5
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv2/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt
@@ -0,0 +1,17 @@
+===========================paddle2onnx_params===========================
+model_name:ch_PP-OCRv2
+python:python3.7
+2onnx: paddle2onnx
+--det_model_dir:./inference/ch_PP-OCRv2_det_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_save_file:./inference/det_v2_onnx/model.onnx
+--rec_model_dir:./inference/ch_PP-OCRv2_rec_infer/
+--rec_save_file:./inference/rec_v2_onnx/model.onnx
+--opset_version:10
+--enable_onnx_checker:True
+inference:tools/infer/predict_system.py --rec_image_shape="3,32,320"
+--use_gpu:True|False
+--det_model_dir:
+--rec_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/00008790.jpg
\ No newline at end of file
diff --git a/test_tipc/configs/ch_PP-OCRv2/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv2/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..f0456b5c351d20222e331df6a5019a51b79b6d28
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv2/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,19 @@
+===========================serving_params===========================
+model_name:ch_PP-OCRv2
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:./inference/ch_PP-OCRv2_det_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:./deploy/pdserving/ppocr_det_v2_serving/
+--det_serving_client:./deploy/pdserving/ppocr_det_v2_client/
+--rec_dirname:./inference/ch_PP-OCRv2_rec_infer/
+--rec_serving_server:./deploy/pdserving/ppocr_rec_v2_serving/
+--rec_serving_client:./deploy/pdserving/ppocr_rec_v2_client/
+serving_dir:./deploy/pdserving
+web_service:-m paddle_serving_server.serve
+--op:GeneralDetectionOp GeneralInferOp
+--port:8181
+--gpu_id:"0"|null
+cpp_client:ocr_cpp_client.py
+--image_dir:../../doc/imgs/1.jpg
diff --git a/test_tipc/configs/ch_PP-OCRv2/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv2/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..4ad64db03c6bbd017644a947ccdc168fd721ad9f
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv2/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
@@ -0,0 +1,23 @@
+===========================serving_params===========================
+model_name:ch_PP-OCRv2
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:./inference/ch_PP-OCRv2_det_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:./deploy/pdserving/ppocr_det_v2_serving/
+--det_serving_client:./deploy/pdserving/ppocr_det_v2_client/
+--rec_dirname:./inference/ch_PP-OCRv2_rec_infer/
+--rec_serving_server:./deploy/pdserving/ppocr_rec_v2_serving/
+--rec_serving_client:./deploy/pdserving/ppocr_rec_v2_client/
+serving_dir:./deploy/pdserving
+web_service:web_service.py --config=config.yml --opt op.det.concurrency="1" op.rec.concurrency="1"
+op.det.local_service_conf.devices:gpu|null
+op.det.local_service_conf.use_mkldnn:False
+op.det.local_service_conf.thread_num:6
+op.det.local_service_conf.use_trt:False
+op.det.local_service_conf.precision:fp32
+op.det.local_service_conf.model_config:
+op.rec.local_service_conf.model_config:
+pipline:pipeline_http_client.py
+--image_dir:../../doc/imgs/1.jpg
diff --git a/test_tipc/configs/ch_PP-OCRv2_det/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv2_det/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..7eccbd725c25938aeabb32499e67536f45eed184
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv2_det/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,20 @@
+===========================cpp_infer_params===========================
+model_name:ch_PP-OCRv2_det
+use_opencv:True
+infer_model:./inference/ch_PP-OCRv2_det_infer/
+infer_quant:False
+inference:./deploy/cpp_infer/build/ppocr
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1
+--use_tensorrt:False
+--precision:fp32
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+null:null
+--benchmark:True
+--det:True
+--rec:False
+--cls:False
+--use_angle_cls:False
\ No newline at end of file
diff --git a/test_tipc/configs/ch_PP-OCRv2_det/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt b/test_tipc/configs/ch_PP-OCRv2_det/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..2e7906076faf4d8002f7746afd6f0cbb4adb8253
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv2_det/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt
@@ -0,0 +1,17 @@
+===========================paddle2onnx_params===========================
+model_name:ch_PP-OCRv2_det
+python:python3.7
+2onnx: paddle2onnx
+--det_model_dir:./inference/ch_PP-OCRv2_det_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_save_file:./inference/det_v2_onnx/model.onnx
+--rec_model_dir:
+--rec_save_file:
+--opset_version:10
+--enable_onnx_checker:True
+inference:tools/infer/predict_det.py
+--use_gpu:True|False
+--det_model_dir:
+--rec_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
\ No newline at end of file
diff --git a/test_tipc/configs/ch_PP-OCRv2_det/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv2_det/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..587a7d7ea6644f4eda200e7ed8f095d9428a1310
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv2_det/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
@@ -0,0 +1,23 @@
+===========================serving_params===========================
+model_name:ch_PP-OCRv2_det
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:./inference/ch_PP-OCRv2_det_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:./deploy/pdserving/ppocr_det_v2_serving/
+--det_serving_client:./deploy/pdserving/ppocr_det_v2_client/
+--rec_dirname:null
+--rec_serving_server:null
+--rec_serving_client:null
+serving_dir:./deploy/pdserving
+web_service:web_service_det.py --config=config.yml --opt op.det.concurrency="1"
+op.det.local_service_conf.devices:gpu|null
+op.det.local_service_conf.use_mkldnn:False
+op.det.local_service_conf.thread_num:6
+op.det.local_service_conf.use_trt:False
+op.det.local_service_conf.precision:fp32
+op.det.local_service_conf.model_config:
+op.rec.local_service_conf.model_config:
+pipline:pipeline_http_client.py
+--image_dir:../../doc/imgs/1.jpg
diff --git a/test_tipc/configs/ch_PP-OCRv2_det/train_infer_python.txt b/test_tipc/configs/ch_PP-OCRv2_det/train_infer_python.txt
index 797cf53a1ded756670709dd1a30c3ef25a9c0906..cab0cb0aa390c7cb1efa6e5d3bc636e9c974acba 100644
--- a/test_tipc/configs/ch_PP-OCRv2_det/train_infer_python.txt
+++ b/test_tipc/configs/ch_PP-OCRv2_det/train_infer_python.txt
@@ -1,10 +1,10 @@
===========================train_params===========================
-model_name:ch_PPOCRv2_det
+model_name:ch_PP-OCRv2_det
python:python3.7
gpu_list:0|0,1
Global.use_gpu:True|True
Global.auto_cast:fp32
-Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=500
+Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=50
Global.save_model_dir:./output/
Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4
Global.pretrained_model:null
@@ -39,11 +39,11 @@ infer_export:null
infer_quant:False
inference:tools/infer/predict_det.py
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1
---use_tensorrt:False|True
---precision:fp32|fp16|int8
+--use_tensorrt:False
+--precision:fp32
--det_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/
null:null
diff --git a/test_tipc/configs/ch_PP-OCRv2_det/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv2_det/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..91a6288eb0d4f3d2a8c968a65916295d25024c32
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv2_det/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt
@@ -0,0 +1,53 @@
+===========================train_params===========================
+model_name:ch_PP-OCRv2_det
+python:python3.7
+gpu_list:192.168.0.1,192.168.0.2;0,1
+Global.use_gpu:True
+Global.auto_cast:fp32
+Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=50
+Global.save_model_dir:./output/
+Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4
+Global.pretrained_model:null
+train_model_name:latest
+train_infer_img_dir:./train_data/icdar2015/text_localization/ch4_test_images/
+null:null
+##
+trainer:norm_train
+norm_train:tools/train.py -c configs/det/ch_PP-OCRv2/ch_PP-OCRv2_det_cml.yml -o
+pact_train:null
+fpgm_train:null
+distill_train:null
+null:null
+null:null
+##
+===========================eval_params===========================
+eval:null
+null:null
+##
+===========================infer_params===========================
+Global.save_inference_dir:./output/
+Global.checkpoints:
+norm_export:tools/export_model.py -c configs/det/ch_PP-OCRv2/ch_PP-OCRv2_det_cml.yml -o
+quant_export:null
+fpgm_export:
+distill_export:null
+export1:null
+export2:null
+inference_dir:Student
+infer_model:./inference/ch_PP-OCRv2_det_infer/
+infer_export:null
+infer_quant:False
+inference:tools/infer/predict_det.py
+--use_gpu:False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1
+--use_tensorrt:False
+--precision:fp32
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+null:null
+--benchmark:True
+null:null
+===========================infer_benchmark_params==========================
+random_infer_input:[{float32,[3,640,640]}];[{float32,[3,960,960]}]
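The `infer_benchmark_params` entry describes the random tensors fed to the model during benchmarking: each `;`-separated group is one input configuration of `{dtype,[shape]}` tensors. A rough sketch of how such a spec might be expanded (illustrative only; the real consumer lives in the TIPC benchmark tooling, which is not shown in this patch):

```python
import re
import numpy as np

def expand_random_infer_input(spec):
    """Expand a spec like '[{float32,[3,640,640]}];[{float32,[3,960,960]}]'
    into one list of random numpy tensors per ';'-separated group."""
    groups = []
    for group in spec.split(";"):
        tensors = []
        for dtype, shape in re.findall(r"\{(\w+),\[([\d,]+)\]\}", group):
            dims = [int(d) for d in shape.split(",")]
            tensors.append(np.random.random(dims).astype(dtype))
        groups.append(tensors)
    return groups

batches = expand_random_infer_input(
    "[{float32,[3,640,640]}];[{float32,[3,960,960]}]")
print([t.shape for g in batches for t in g])  # [(3, 640, 640), (3, 960, 960)]
```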
diff --git a/test_tipc/configs/ch_PP-OCRv2_det/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv2_det/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
index 033d40a80a3569f8bfd408cdb6df37e7ba5ecd0c..85b0ebcb9d747acdf520d4516722526d791b1151 100644
--- a/test_tipc/configs/ch_PP-OCRv2_det/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
+++ b/test_tipc/configs/ch_PP-OCRv2_det/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
@@ -1,10 +1,10 @@
===========================train_params===========================
-model_name:ch_PPOCRv2_det
+model_name:ch_PP-OCRv2_det
python:python3.7
gpu_list:0|0,1
Global.use_gpu:True|True
Global.auto_cast:amp
-Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=500
+Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=50
Global.save_model_dir:./output/
Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4
Global.pretrained_model:null
@@ -39,11 +39,11 @@ infer_export:null
infer_quant:False
inference:tools/infer/predict_det.py
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1
---use_tensorrt:False|True
---precision:fp32|fp16|int8
+--use_tensorrt:False
+--precision:fp32
--det_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/
null:null
diff --git a/test_tipc/configs/ch_PP-OCRv2_det_PACT/train_infer_python.txt b/test_tipc/configs/ch_PP-OCRv2_det/train_pact_infer_python.txt
similarity index 91%
rename from test_tipc/configs/ch_PP-OCRv2_det_PACT/train_infer_python.txt
rename to test_tipc/configs/ch_PP-OCRv2_det/train_pact_infer_python.txt
index 038fa850614d45dbefe076b866571cead57b8450..1a20f97fdbcd10881de6e94d0724740ba44edc5c 100644
--- a/test_tipc/configs/ch_PP-OCRv2_det_PACT/train_infer_python.txt
+++ b/test_tipc/configs/ch_PP-OCRv2_det/train_pact_infer_python.txt
@@ -1,10 +1,10 @@
===========================train_params===========================
-model_name:ch_PPOCRv2_det_PACT
+model_name:ch_PP-OCRv2_det_PACT
python:python3.7
gpu_list:0|0,1
Global.use_gpu:True|True
Global.auto_cast:fp32
-Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=500
+Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=50
Global.save_model_dir:./output/
Train.loader.batch_size_per_card:lite_train_lite_infer=1|whole_train_whole_infer=4
Global.pretrained_model:null
@@ -39,11 +39,11 @@ infer_export:null
infer_quant:False
inference:tools/infer/predict_det.py
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1
---use_tensorrt:False|True
---precision:fp32|fp16|int8
+--use_tensorrt:False
+--precision:fp32
--det_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/
null:null
diff --git a/test_tipc/configs/ch_PP-OCRv2_det_KL/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv2_det/train_ptq_infer_python.txt
similarity index 84%
rename from test_tipc/configs/ch_PP-OCRv2_det_KL/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt
rename to test_tipc/configs/ch_PP-OCRv2_det/train_ptq_infer_python.txt
index 1aad65b687992155133ed11533a14f642510361d..ccc9e5ced086c2c617359bafdc8772fe92eab8fa 100644
--- a/test_tipc/configs/ch_PP-OCRv2_det_KL/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt
+++ b/test_tipc/configs/ch_PP-OCRv2_det/train_ptq_infer_python.txt
@@ -1,5 +1,5 @@
===========================kl_quant_params===========================
-model_name:PPOCRv2_ocr_det_kl
+model_name:ch_PP-OCRv2_det_KL
python:python3.7
Global.pretrained_model:null
Global.save_inference_dir:null
@@ -8,10 +8,10 @@ infer_export:deploy/slim/quantization/quant_kl.py -c configs/det/ch_PP-OCRv2/ch_
infer_quant:True
inference:tools/infer/predict_det.py
--use_gpu:False|True
---enable_mkldnn:True
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1
---use_tensorrt:False|True
+--use_tensorrt:False
--precision:int8
--det_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/
diff --git a/test_tipc/configs/ch_PP-OCRv2_det_KL/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv2_det_KL/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..1975e099d7f8328f521d976d76b1070b365164fe
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv2_det_KL/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,20 @@
+===========================cpp_infer_params===========================
+model_name:ch_PP-OCRv2_det_KL
+use_opencv:True
+infer_model:./inference/ch_PP-OCRv2_det_klquant_infer
+infer_quant:False
+inference:./deploy/cpp_infer/build/ppocr
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1
+--use_tensorrt:False
+--precision:fp32
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+null:null
+--benchmark:True
+--det:True
+--rec:False
+--cls:False
+--use_angle_cls:False
\ No newline at end of file
diff --git a/test_tipc/configs/ch_PP-OCRv2_det_KL/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv2_det_KL/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..e306b0a92afdc720227704c2e526e7c3cfe98ae6
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv2_det_KL/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,19 @@
+===========================serving_params===========================
+model_name:ch_PP-OCRv2_det_KL
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:./inference/ch_PP-OCRv2_det_klquant_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:./deploy/pdserving/ppocr_det_v2_kl_serving/
+--det_serving_client:./deploy/pdserving/ppocr_det_v2_kl_client/
+--rec_dirname:./inference/ch_PP-OCRv2_rec_klquant_infer/
+--rec_serving_server:./deploy/pdserving/ppocr_rec_v2_kl_serving/
+--rec_serving_client:./deploy/pdserving/ppocr_rec_v2_kl_client/
+serving_dir:./deploy/pdserving
+web_service:-m paddle_serving_server.serve
+--op:GeneralDetectionOp GeneralInferOp
+--port:8181
+--gpu_id:"0"|null
+cpp_client:ocr_cpp_client.py
+--image_dir:../../doc/imgs/1.jpg
diff --git a/test_tipc/configs/ch_PP-OCRv2_det_KL/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv2_det_KL/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..2c96d2bfd88ce43f7701ee92a2c2bb8c909feed8
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv2_det_KL/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
@@ -0,0 +1,23 @@
+===========================serving_params===========================
+model_name:ch_PP-OCRv2_det_KL
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:./inference/ch_PP-OCRv2_det_klquant_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:./deploy/pdserving/ppocr_det_v2_kl_serving/
+--det_serving_client:./deploy/pdserving/ppocr_det_v2_kl_client/
+--rec_dirname:null
+--rec_serving_server:null
+--rec_serving_client:null
+serving_dir:./deploy/pdserving
+web_service:web_service_det.py --config=config.yml --opt op.det.concurrency="1"
+op.det.local_service_conf.devices:gpu|null
+op.det.local_service_conf.use_mkldnn:False
+op.det.local_service_conf.thread_num:6
+op.det.local_service_conf.use_trt:False
+op.det.local_service_conf.precision:fp32
+op.det.local_service_conf.model_config:
+op.rec.local_service_conf.model_config:
+pipline:pipeline_http_client.py
+--image_dir:../../doc/imgs/1.jpg
diff --git a/test_tipc/configs/ch_PP-OCRv2_det_PACT/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv2_det_PACT/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..43ef97d5063d347071501ab4d8874ee3a42e50d7
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv2_det_PACT/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,20 @@
+===========================cpp_infer_params===========================
+model_name:ch_PP-OCRv2_det_PACT
+use_opencv:True
+infer_model:./inference/ch_PP-OCRv2_det_pact_infer
+infer_quant:False
+inference:./deploy/cpp_infer/build/ppocr
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1
+--use_tensorrt:False
+--precision:fp32
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+null:null
+--benchmark:True
+--det:True
+--rec:False
+--cls:False
+--use_angle_cls:False
\ No newline at end of file
diff --git a/test_tipc/configs/ch_PP-OCRv2_det_PACT/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv2_det_PACT/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..b2d929b99c80d02c48795dab820da1836f7a2ebe
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv2_det_PACT/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,19 @@
+===========================serving_params===========================
+model_name:ch_PP-OCRv2_det_PACT
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:./inference/ch_PP-OCRv2_det_pact_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:./deploy/pdserving/ppocr_det_v2_pact_serving/
+--det_serving_client:./deploy/pdserving/ppocr_det_v2_pact_client/
+--rec_dirname:./inference/ch_PP-OCRv2_rec_pact_infer/
+--rec_serving_server:./deploy/pdserving/ppocr_rec_v2_pact_serving/
+--rec_serving_client:./deploy/pdserving/ppocr_rec_v2_pact_client/
+serving_dir:./deploy/pdserving
+web_service:-m paddle_serving_server.serve
+--op:GeneralDetectionOp GeneralInferOp
+--port:8181
+--gpu_id:"0"|null
+cpp_client:ocr_cpp_client.py
+--image_dir:../../doc/imgs/1.jpg
diff --git a/test_tipc/configs/ch_PP-OCRv2_det_PACT/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv2_det_PACT/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..d5d99ab56d0e17789d1b6b0c593ed42e5ffad320
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv2_det_PACT/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
@@ -0,0 +1,23 @@
+===========================serving_params===========================
+model_name:ch_PP-OCRv2_det_PACT
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:./inference/ch_PP-OCRv2_det_pact_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:./deploy/pdserving/ppocr_det_v2_pact_serving/
+--det_serving_client:./deploy/pdserving/ppocr_det_v2_pact_client/
+--rec_dirname:null
+--rec_serving_server:null
+--rec_serving_client:null
+serving_dir:./deploy/pdserving
+web_service:web_service_det.py --config=config.yml --opt op.det.concurrency="1"
+op.det.local_service_conf.devices:gpu|null
+op.det.local_service_conf.use_mkldnn:False
+op.det.local_service_conf.thread_num:6
+op.det.local_service_conf.use_trt:False
+op.det.local_service_conf.precision:fp32
+op.det.local_service_conf.model_config:
+op.rec.local_service_conf.model_config:
+pipline:pipeline_http_client.py
+--image_dir:../../doc/imgs/1.jpg
diff --git a/test_tipc/configs/ch_PP-OCRv2_rec/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv2_rec/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..b1bff00b09cedbe5abac50a471091fb2daedf8f3
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv2_rec/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,20 @@
+===========================cpp_infer_params===========================
+model_name:ch_PP-OCRv2_rec
+use_opencv:True
+infer_model:./inference/ch_PP-OCRv2_rec_infer/
+infer_quant:False
+inference:./deploy/cpp_infer/build/ppocr --rec_char_dict_path=./ppocr/utils/ppocr_keys_v1.txt --rec_img_h=32
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:6
+--use_tensorrt:False
+--precision:fp32
+--rec_model_dir:
+--image_dir:./inference/rec_inference/
+null:null
+--benchmark:True
+--det:False
+--rec:True
+--cls:False
+--use_angle_cls:False
\ No newline at end of file
diff --git a/test_tipc/configs/ch_PP-OCRv2_rec/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt b/test_tipc/configs/ch_PP-OCRv2_rec/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..e374a5d8216f54eef52e0765be30a1dcc391dc7b
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv2_rec/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt
@@ -0,0 +1,17 @@
+===========================paddle2onnx_params===========================
+model_name:ch_PP-OCRv2_rec
+python:python3.7
+2onnx: paddle2onnx
+--det_model_dir:
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_save_file:
+--rec_model_dir:./inference/ch_PP-OCRv2_rec_infer/
+--rec_save_file:./inference/rec_v2_onnx/model.onnx
+--opset_version:10
+--enable_onnx_checker:True
+inference:tools/infer/predict_rec.py --rec_image_shape="3,32,320"
+--use_gpu:True|False
+--det_model_dir:
+--rec_model_dir:
+--image_dir:./inference/rec_inference/
\ No newline at end of file
diff --git a/test_tipc/configs/ch_PP-OCRv2_rec/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv2_rec/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..e9e90d372e4a2fed645962ce6ad3b8112588f24a
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv2_rec/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
@@ -0,0 +1,23 @@
+===========================serving_params===========================
+model_name:ch_PP-OCRv2_rec
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:null
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:null
+--det_serving_client:null
+--rec_dirname:./inference/ch_PP-OCRv2_rec_infer/
+--rec_serving_server:./deploy/pdserving/ppocr_rec_v2_serving/
+--rec_serving_client:./deploy/pdserving/ppocr_rec_v2_client/
+serving_dir:./deploy/pdserving
+web_service:web_service_rec.py --config=config.yml --opt op.rec.concurrency="1"
+op.det.local_service_conf.devices:gpu|null
+op.det.local_service_conf.use_mkldnn:False
+op.det.local_service_conf.thread_num:6
+op.det.local_service_conf.use_trt:False
+op.det.local_service_conf.precision:fp32
+op.det.local_service_conf.model_config:
+op.rec.local_service_conf.model_config:
+pipline:pipeline_http_client.py --det=False
+--image_dir:../../inference/rec_inference
diff --git a/test_tipc/configs/ch_PP-OCRv2_rec/train_infer_python.txt b/test_tipc/configs/ch_PP-OCRv2_rec/train_infer_python.txt
index 188eb3ccc5f7aa2b3724dc1fb7132af090c22ffa..df42b342ba5fa3947a69c2bde5548975ca92d857 100644
--- a/test_tipc/configs/ch_PP-OCRv2_rec/train_infer_python.txt
+++ b/test_tipc/configs/ch_PP-OCRv2_rec/train_infer_python.txt
@@ -1,10 +1,10 @@
===========================train_params===========================
-model_name:PPOCRv2_ocr_rec
+model_name:ch_PP-OCRv2_rec
python:python3.7
gpu_list:0|0,1
Global.use_gpu:True|True
Global.auto_cast:fp32
-Global.epoch_num:lite_train_lite_infer=3|whole_train_whole_infer=300
+Global.epoch_num:lite_train_lite_infer=3|whole_train_whole_infer=50
Global.save_model_dir:./output/
Train.loader.batch_size_per_card:lite_train_lite_infer=16|whole_train_whole_infer=128
Global.pretrained_model:null
@@ -39,11 +39,11 @@ infer_export:null
infer_quant:False
inference:tools/infer/predict_rec.py
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1|6
---use_tensorrt:False|True
---precision:fp32|int8
+--use_tensorrt:False
+--precision:fp32
--rec_model_dir:
--image_dir:./inference/rec_inference
null:null
diff --git a/test_tipc/configs/ch_PP-OCRv2_rec/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv2_rec/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..5795bc27e686164578fc246e1fa467efdc52f71f
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv2_rec/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt
@@ -0,0 +1,53 @@
+===========================train_params===========================
+model_name:ch_PP-OCRv2_rec
+python:python3.7
+gpu_list:192.168.0.1,192.168.0.2;0,1
+Global.use_gpu:True
+Global.auto_cast:fp32
+Global.epoch_num:lite_train_lite_infer=3|whole_train_whole_infer=50
+Global.save_model_dir:./output/
+Train.loader.batch_size_per_card:lite_train_lite_infer=16|whole_train_whole_infer=128
+Global.pretrained_model:null
+train_model_name:latest
+train_infer_img_dir:./inference/rec_inference
+null:null
+##
+trainer:norm_train
+norm_train:tools/train.py -c test_tipc/configs/ch_PP-OCRv2_rec/ch_PP-OCRv2_rec_distillation.yml -o
+pact_train:null
+fpgm_train:null
+distill_train:null
+null:null
+null:null
+##
+===========================eval_params===========================
+eval:null
+null:null
+##
+===========================infer_params===========================
+Global.save_inference_dir:./output/
+Global.checkpoints:
+norm_export:tools/export_model.py -c test_tipc/configs/ch_PP-OCRv2_rec/ch_PP-OCRv2_rec_distillation.yml -o
+quant_export:
+fpgm_export:
+distill_export:null
+export1:null
+export2:null
+inference_dir:Student
+infer_model:./inference/ch_PP-OCRv2_rec_infer
+infer_export:null
+infer_quant:False
+inference:tools/infer/predict_rec.py
+--use_gpu:False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1|6
+--use_tensorrt:False
+--precision:fp32
+--rec_model_dir:
+--image_dir:./inference/rec_inference
+null:null
+--benchmark:True
+null:null
+===========================infer_benchmark_params==========================
+random_infer_input:[{float32,[3,32,320]}]
diff --git a/test_tipc/configs/ch_PP-OCRv2_rec/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv2_rec/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
index 7c438cb8a3b6907c9ca352e90605d8b4f6fb17fd..1b8800f5cf8f1f864a03dd6a0408aafdd2cec3af 100644
--- a/test_tipc/configs/ch_PP-OCRv2_rec/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
+++ b/test_tipc/configs/ch_PP-OCRv2_rec/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
@@ -1,10 +1,10 @@
===========================train_params===========================
-model_name:PPOCRv2_ocr_rec
+model_name:ch_PP-OCRv2_rec
python:python3.7
gpu_list:0|0,1
Global.use_gpu:True|True
Global.auto_cast:amp
-Global.epoch_num:lite_train_lite_infer=3|whole_train_whole_infer=300
+Global.epoch_num:lite_train_lite_infer=3|whole_train_whole_infer=50
Global.save_model_dir:./output/
Train.loader.batch_size_per_card:lite_train_lite_infer=16|whole_train_whole_infer=128
Global.pretrained_model:null
@@ -39,11 +39,11 @@ infer_export:null
infer_quant:False
inference:tools/infer/predict_rec.py
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1|6
---use_tensorrt:False|True
---precision:fp32|int8
+--use_tensorrt:False
+--precision:fp32
--rec_model_dir:
--image_dir:./inference/rec_inference
null:null
diff --git a/test_tipc/configs/ch_PP-OCRv2_rec_PACT/train_infer_python.txt b/test_tipc/configs/ch_PP-OCRv2_rec/train_pact_infer_python.txt
similarity index 88%
rename from test_tipc/configs/ch_PP-OCRv2_rec_PACT/train_infer_python.txt
rename to test_tipc/configs/ch_PP-OCRv2_rec/train_pact_infer_python.txt
index 98c125229d7f968cd3f650c3885ba4edb0de754c..0ac75eff07a5ed4c17d7fdbe554fd4b5c0f11aed 100644
--- a/test_tipc/configs/ch_PP-OCRv2_rec_PACT/train_infer_python.txt
+++ b/test_tipc/configs/ch_PP-OCRv2_rec/train_pact_infer_python.txt
@@ -1,10 +1,10 @@
===========================train_params===========================
-model_name:ch_PPOCRv2_rec_PACT
+model_name:ch_PP-OCRv2_rec_PACT
python:python3.7
gpu_list:0|0,1
Global.use_gpu:True|True
Global.auto_cast:fp32
-Global.epoch_num:lite_train_lite_infer=6|whole_train_whole_infer=300
+Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=50
Global.save_model_dir:./output/
Train.loader.batch_size_per_card:lite_train_lite_infer=16|whole_train_whole_infer=128
Global.pretrained_model:pretrain_models/ch_PP-OCRv2_rec_train/best_accuracy
@@ -39,11 +39,11 @@ infer_export:null
infer_quant:True
inference:tools/infer/predict_rec.py
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1|6
---use_tensorrt:False|True
---precision:fp32|int8
+--use_tensorrt:False
+--precision:fp32
--rec_model_dir:
--image_dir:./inference/rec_inference
null:null
diff --git a/test_tipc/configs/ch_PP-OCRv2_rec_KL/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv2_rec/train_ptq_infer_python.txt
similarity index 76%
rename from test_tipc/configs/ch_PP-OCRv2_rec_KL/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt
rename to test_tipc/configs/ch_PP-OCRv2_rec/train_ptq_infer_python.txt
index 083a3ae26e726e290ffde4095821cbf3c40f7178..c30e0858efda4df8f9912183c5c31b56413a5252 100644
--- a/test_tipc/configs/ch_PP-OCRv2_rec_KL/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt
+++ b/test_tipc/configs/ch_PP-OCRv2_rec/train_ptq_infer_python.txt
@@ -1,17 +1,17 @@
===========================kl_quant_params===========================
-model_name:PPOCRv2_ocr_rec_kl
+model_name:ch_PP-OCRv2_rec_KL
python:python3.7
Global.pretrained_model:null
Global.save_inference_dir:null
infer_model:./inference/ch_PP-OCRv2_rec_infer/
infer_export:deploy/slim/quantization/quant_kl.py -c test_tipc/configs/ch_PP-OCRv2_rec/ch_PP-OCRv2_rec_distillation.yml -o
infer_quant:True
-inference:tools/infer/predict_rec.py
+inference:tools/infer/predict_rec.py --rec_image_shape="3,32,320"
--use_gpu:False|True
---enable_mkldnn:False|True
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1|6
---use_tensorrt:True
+--use_tensorrt:False
--precision:int8
--rec_model_dir:
--image_dir:./inference/rec_inference
diff --git a/test_tipc/configs/ch_PP-OCRv2_rec_KL/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv2_rec_KL/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..95e4062d145b5c0fc390049dee322b0e85ecee98
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv2_rec_KL/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,20 @@
+===========================cpp_infer_params===========================
+model_name:ch_PP-OCRv2_rec_KL
+use_opencv:True
+infer_model:./inference/ch_PP-OCRv2_rec_klquant_infer
+infer_quant:False
+inference:./deploy/cpp_infer/build/ppocr --rec_char_dict_path=./ppocr/utils/ppocr_keys_v1.txt --rec_img_h=32
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:6
+--use_tensorrt:False
+--precision:fp32
+--rec_model_dir:
+--image_dir:./inference/rec_inference/
+null:null
+--benchmark:True
+--det:False
+--rec:True
+--cls:False
+--use_angle_cls:False
\ No newline at end of file
diff --git a/test_tipc/configs/ch_PP-OCRv2_rec_KL/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv2_rec_KL/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..34d4007b150385657a69ae93ef4e04bdf1ce149d
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv2_rec_KL/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,19 @@
+===========================serving_params===========================
+model_name:ch_PP-OCRv2_rec_KL
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:./inference/ch_PP-OCRv2_det_klquant_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:./deploy/pdserving/ppocr_det_v2_kl_serving/
+--det_serving_client:./deploy/pdserving/ppocr_det_v2_kl_client/
+--rec_dirname:./inference/ch_PP-OCRv2_rec_klquant_infer/
+--rec_serving_server:./deploy/pdserving/ppocr_rec_v2_kl_serving/
+--rec_serving_client:./deploy/pdserving/ppocr_rec_v2_kl_client/
+serving_dir:./deploy/pdserving
+web_service:-m paddle_serving_server.serve
+--op:GeneralDetectionOp GeneralInferOp
+--port:8181
+--gpu_id:"0"|null
+cpp_client:ocr_cpp_client.py
+--image_dir:../../doc/imgs/1.jpg
diff --git a/test_tipc/configs/ch_PP-OCRv2_rec_KL/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv2_rec_KL/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..3405f2b5876609bf6d97cf3d379613bf38c16cda
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv2_rec_KL/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
@@ -0,0 +1,23 @@
+===========================serving_params===========================
+model_name:ch_PP-OCRv2_rec_KL
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:null
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:null
+--det_serving_client:null
+--rec_dirname:./inference/ch_PP-OCRv2_rec_klquant_infer/
+--rec_serving_server:./deploy/pdserving/ppocr_rec_v2_kl_serving/
+--rec_serving_client:./deploy/pdserving/ppocr_rec_v2_kl_client/
+serving_dir:./deploy/pdserving
+web_service:web_service_rec.py --config=config.yml --opt op.rec.concurrency="1"
+op.det.local_service_conf.devices:gpu|null
+op.det.local_service_conf.use_mkldnn:False
+op.det.local_service_conf.thread_num:6
+op.det.local_service_conf.use_trt:False
+op.det.local_service_conf.precision:fp32
+op.det.local_service_conf.model_config:
+op.rec.local_service_conf.model_config:
+pipline:pipeline_http_client.py --det=False
+--image_dir:../../inference/rec_inference
diff --git a/test_tipc/configs/ch_PP-OCRv2_rec_PACT/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv2_rec_PACT/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..b807eadd3ff7f38ce670f0db9d28aaacd62e207a
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv2_rec_PACT/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,20 @@
+===========================cpp_infer_params===========================
+model_name:ch_PP-OCRv2_rec_PACT
+use_opencv:True
+infer_model:./inference/ch_PP-OCRv2_rec_pact_infer
+infer_quant:False
+inference:./deploy/cpp_infer/build/ppocr --rec_char_dict_path=./ppocr/utils/ppocr_keys_v1.txt --rec_img_h=32
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:6
+--use_tensorrt:False
+--precision:fp32
+--rec_model_dir:
+--image_dir:./inference/rec_inference/
+null:null
+--benchmark:True
+--det:False
+--rec:True
+--cls:False
+--use_angle_cls:False
\ No newline at end of file
diff --git a/test_tipc/configs/ch_PP-OCRv2_rec_PACT/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv2_rec_PACT/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..2a174b9e31b24ca002a639b1e8675cc6d5592bcc
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv2_rec_PACT/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,19 @@
+===========================serving_params===========================
+model_name:ch_PP-OCRv2_rec_PACT
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:./inference/ch_PP-OCRv2_det_pact_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:./deploy/pdserving/ppocr_det_v2_pact_serving/
+--det_serving_client:./deploy/pdserving/ppocr_det_v2_pact_client/
+--rec_dirname:./inference/ch_PP-OCRv2_rec_pact_infer/
+--rec_serving_server:./deploy/pdserving/ppocr_rec_v2_pact_serving/
+--rec_serving_client:./deploy/pdserving/ppocr_rec_v2_pact_client/
+serving_dir:./deploy/pdserving
+web_service:-m paddle_serving_server.serve
+--op:GeneralDetectionOp GeneralInferOp
+--port:8181
+--gpu_id:"0"|null
+cpp_client:ocr_cpp_client.py
+--image_dir:../../doc/imgs/1.jpg
diff --git a/test_tipc/configs/ch_PP-OCRv2_rec_PACT/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv2_rec_PACT/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..2b7ed81172473a7924d1af48937048bbee9d1953
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv2_rec_PACT/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
@@ -0,0 +1,23 @@
+===========================serving_params===========================
+model_name:ch_PP-OCRv2_rec_PACT
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:null
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:null
+--det_serving_client:null
+--rec_dirname:./inference/ch_PP-OCRv2_rec_pact_infer/
+--rec_serving_server:./deploy/pdserving/ppocr_rec_v2_pact_serving/
+--rec_serving_client:./deploy/pdserving/ppocr_rec_v2_pact_client/
+serving_dir:./deploy/pdserving
+web_service:web_service_rec.py --config=config.yml --opt op.rec.concurrency="1"
+op.det.local_service_conf.devices:gpu|null
+op.det.local_service_conf.use_mkldnn:False
+op.det.local_service_conf.thread_num:6
+op.det.local_service_conf.use_trt:False
+op.det.local_service_conf.precision:fp32
+op.det.local_service_conf.model_config:
+op.rec.local_service_conf.model_config:
+pipline:pipeline_http_client.py --det=False
+--image_dir:../../inference/rec_inference
diff --git a/test_tipc/configs/ch_PP-OCRv3/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv3/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..794af27d90edd54ff888ab167698d57a92eab7b6
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,20 @@
+===========================cpp_infer_params===========================
+model_name:ch_PP-OCRv3
+use_opencv:True
+infer_model:./inference/ch_PP-OCRv3_det_infer/
+infer_quant:False
+inference:./deploy/cpp_infer/build/ppocr --rec_img_h=48 --rec_char_dict_path=./ppocr/utils/ppocr_keys_v1.txt
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1
+--use_tensorrt:False
+--precision:fp32
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+--rec_model_dir:./inference/ch_PP-OCRv3_rec_infer/
+--benchmark:True
+--det:True
+--rec:True
+--cls:False
+--use_angle_cls:False
\ No newline at end of file
diff --git a/test_tipc/configs/ch_PP-OCRv3/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv3/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..afacdc140515297c75f8674b245b34fce4c9fad9
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt
@@ -0,0 +1,19 @@
+===========================ch_PP-OCRv3===========================
+model_name:ch_PP-OCRv3
+python:python3.7
+infer_model:./inference/ch_PP-OCRv3_det_infer/
+infer_export:null
+infer_quant:False
+inference:tools/infer/predict_system.py --rec_image_shape="3,48,320"
+--use_gpu:False|True
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1
+--use_tensorrt:False
+--precision:fp32
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+--rec_model_dir:./inference/ch_PP-OCRv3_rec_infer/
+--benchmark:True
+null:null
+null:null
diff --git a/test_tipc/configs/ch_PP-OCRv3/model_linux_gpu_normal_normal_lite_cpp_arm_cpu.txt b/test_tipc/configs/ch_PP-OCRv3/model_linux_gpu_normal_normal_lite_cpp_arm_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..133a78c84ff1f3d11765b52b45489f5ba43d63f2
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3/model_linux_gpu_normal_normal_lite_cpp_arm_cpu.txt
@@ -0,0 +1,13 @@
+===========================lite_params===========================
+inference:./ocr_db_crnn system
+runtime_device:ARM_CPU
+det_infer_model:ch_PP-OCRv3_det_infer|ch_PP-OCRv3_det_slim_quant_infer
+rec_infer_model:ch_PP-OCRv3_rec_infer|ch_PP-OCRv3_rec_slim_quant_infer
+cls_infer_model:ch_ppocr_mobile_v2.0_cls_infer|ch_ppocr_mobile_v2.0_cls_slim_infer
+--cpu_threads:1|4
+--det_batch_size:1
+--rec_batch_size:1
+--image_dir:./test_data/icdar2015_lite/text_localization/ch4_test_images/
+--config_dir:./config.txt
+--rec_dict_dir:./ppocr_keys_v1.txt
+--benchmark:True
diff --git a/test_tipc/configs/ch_PP-OCRv3/model_linux_gpu_normal_normal_lite_cpp_arm_gpu_opencl.txt b/test_tipc/configs/ch_PP-OCRv3/model_linux_gpu_normal_normal_lite_cpp_arm_gpu_opencl.txt
new file mode 100644
index 0000000000000000000000000000000000000000..86e49fc100522ec78d38f41ab121d936526b9b4f
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3/model_linux_gpu_normal_normal_lite_cpp_arm_gpu_opencl.txt
@@ -0,0 +1,13 @@
+===========================lite_params===========================
+inference:./ocr_db_crnn system
+runtime_device:ARM_GPU_OPENCL
+det_infer_model:ch_PP-OCRv3_det_infer|ch_PP-OCRv3_det_slim_quant_infer
+rec_infer_model:ch_PP-OCRv3_rec_infer|ch_PP-OCRv3_rec_slim_quant_infer
+cls_infer_model:ch_ppocr_mobile_v2.0_cls_infer|ch_ppocr_mobile_v2.0_cls_slim_infer
+--cpu_threads:1|4
+--det_batch_size:1
+--rec_batch_size:1
+--image_dir:./test_data/icdar2015_lite/text_localization/ch4_test_images/
+--config_dir:./config.txt
+--rec_dict_dir:./ppocr_keys_v1.txt
+--benchmark:True
diff --git a/test_tipc/configs/ch_PP-OCRv3/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt b/test_tipc/configs/ch_PP-OCRv3/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..bf2556ef17dcdbd66ff81abe6dbb10639511cde4
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt
@@ -0,0 +1,17 @@
+===========================paddle2onnx_params===========================
+model_name:ch_PP-OCRv3
+python:python3.7
+2onnx: paddle2onnx
+--det_model_dir:./inference/ch_PP-OCRv3_det_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_save_file:./inference/det_v3_onnx/model.onnx
+--rec_model_dir:./inference/ch_PP-OCRv3_rec_infer/
+--rec_save_file:./inference/rec_v3_onnx/model.onnx
+--opset_version:10
+--enable_onnx_checker:True
+inference:tools/infer/predict_system.py --rec_image_shape="3,48,320"
+--use_gpu:True|False
+--det_model_dir:
+--rec_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/00008790.jpg
\ No newline at end of file
diff --git a/test_tipc/configs/ch_PP-OCRv3/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv3/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..91c57bed1b9e9bbafc6438766b81781433a06aa2
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,19 @@
+===========================serving_params===========================
+model_name:ch_PP-OCRv3
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:./inference/ch_PP-OCRv3_det_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:./deploy/pdserving/ppocr_det_v3_serving/
+--det_serving_client:./deploy/pdserving/ppocr_det_v3_client/
+--rec_dirname:./inference/ch_PP-OCRv3_rec_infer/
+--rec_serving_server:./deploy/pdserving/ppocr_rec_v3_serving/
+--rec_serving_client:./deploy/pdserving/ppocr_rec_v3_client/
+serving_dir:./deploy/pdserving
+web_service:-m paddle_serving_server.serve
+--op:GeneralDetectionOp GeneralInferOp
+--port:8181
+--gpu_id:"0"|null
+cpp_client:ocr_cpp_client.py
+--image_dir:../../doc/imgs/1.jpg
diff --git a/test_tipc/configs/ch_PP-OCRv3/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv3/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..6f699ef5c02015d214ef08fb7047a6e9be84e24d
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
@@ -0,0 +1,23 @@
+===========================serving_params===========================
+model_name:ch_PP-OCRv3
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:./inference/ch_PP-OCRv3_det_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:./deploy/pdserving/ppocr_det_v3_serving/
+--det_serving_client:./deploy/pdserving/ppocr_det_v3_client/
+--rec_dirname:./inference/ch_PP-OCRv3_rec_infer/
+--rec_serving_server:./deploy/pdserving/ppocr_rec_v3_serving/
+--rec_serving_client:./deploy/pdserving/ppocr_rec_v3_client/
+serving_dir:./deploy/pdserving
+web_service:web_service.py --config=config.yml --opt op.det.concurrency="1" op.rec.concurrency="1"
+op.det.local_service_conf.devices:gpu|null
+op.det.local_service_conf.use_mkldnn:False
+op.det.local_service_conf.thread_num:6
+op.det.local_service_conf.use_trt:False
+op.det.local_service_conf.precision:fp32
+op.det.local_service_conf.model_config:
+op.rec.local_service_conf.model_config:
+pipline:pipeline_http_client.py
+--image_dir:../../doc/imgs/1.jpg
diff --git a/test_tipc/configs/ch_PP-OCRv3_det/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv3_det/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..aecd0dd437ea028c555ffedf4227b55843dd7f2e
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3_det/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,20 @@
+===========================cpp_infer_params===========================
+model_name:ch_PP-OCRv3_det
+use_opencv:True
+infer_model:./inference/ch_PP-OCRv3_det_infer/
+infer_quant:False
+inference:./deploy/cpp_infer/build/ppocr
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1
+--use_tensorrt:False
+--precision:fp32
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+null:null
+--benchmark:True
+--det:True
+--rec:False
+--cls:False
+--use_angle_cls:False
\ No newline at end of file
diff --git a/test_tipc/configs/ch_PP-OCRv3_det/model_linux_gpu_normal_normal_lite_cpp_arm_cpu.txt b/test_tipc/configs/ch_PP-OCRv3_det/model_linux_gpu_normal_normal_lite_cpp_arm_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..cbc101f93b5ba3822caf10ed6ecd55d32f827fce
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3_det/model_linux_gpu_normal_normal_lite_cpp_arm_cpu.txt
@@ -0,0 +1,13 @@
+===========================lite_params===========================
+inference:./ocr_db_crnn det
+runtime_device:ARM_CPU
+det_infer_model:ch_PP-OCRv3_det_infer|ch_PP-OCRv3_det_slim_quant_infer
+null:null
+null:null
+--cpu_threads:1|4
+--det_batch_size:1
+null:null
+--image_dir:./test_data/icdar2015_lite/text_localization/ch4_test_images/
+--config_dir:./config.txt
+null:null
+--benchmark:True
\ No newline at end of file
diff --git a/test_tipc/configs/ch_PP-OCRv3_det/model_linux_gpu_normal_normal_lite_cpp_arm_gpu_opencl.txt b/test_tipc/configs/ch_PP-OCRv3_det/model_linux_gpu_normal_normal_lite_cpp_arm_gpu_opencl.txt
new file mode 100644
index 0000000000000000000000000000000000000000..ba3f5b71e50c7a52ccd61450cec5e5b28935b98b
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3_det/model_linux_gpu_normal_normal_lite_cpp_arm_gpu_opencl.txt
@@ -0,0 +1,13 @@
+===========================lite_params===========================
+inference:./ocr_db_crnn det
+runtime_device:ARM_GPU_OPENCL
+det_infer_model:ch_PP-OCRv3_det_infer|ch_PP-OCRv3_det_slim_quant_infer
+null:null
+null:null
+--cpu_threads:1|4
+--det_batch_size:1
+null:null
+--image_dir:./test_data/icdar2015_lite/text_localization/ch4_test_images/
+--config_dir:./config.txt
+null:null
+--benchmark:True
diff --git a/test_tipc/configs/ch_PP-OCRv3_det/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt b/test_tipc/configs/ch_PP-OCRv3_det/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..a448713b1c91dbd190770056b2ee403f2ac56cc6
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3_det/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt
@@ -0,0 +1,17 @@
+===========================paddle2onnx_params===========================
+model_name:ch_PP-OCRv3_det
+python:python3.7
+2onnx: paddle2onnx
+--det_model_dir:./inference/ch_PP-OCRv3_det_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_save_file:./inference/det_v3_onnx/model.onnx
+--rec_model_dir:
+--rec_save_file:
+--opset_version:10
+--enable_onnx_checker:True
+inference:tools/infer/predict_det.py
+--use_gpu:True|False
+--det_model_dir:
+--rec_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
\ No newline at end of file
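
The paddle2onnx_params block maps onto a plain `paddle2onnx` CLI call. A hedged sketch, assuming the generic `--model_dir`/`--save_file` flag names of paddle2onnx (the `det_`-prefixed keys in the config supply their values for the detection model; the empty `rec_*` slots mean no rec conversion in this file):

```python
# Sketch: the Paddle -> ONNX export described by the params above.
import shlex
import subprocess

cmd = (
    "paddle2onnx"
    " --model_dir ./inference/ch_PP-OCRv3_det_infer/"
    " --model_filename inference.pdmodel"
    " --params_filename inference.pdiparams"
    " --save_file ./inference/det_v3_onnx/model.onnx"
    " --opset_version 10"
    " --enable_onnx_checker True"
)
subprocess.run(shlex.split(cmd), check=True)  # writes ./inference/det_v3_onnx/model.onnx
```

The `inference` entry then benchmarks the exported model through `tools/infer/predict_det.py`.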
diff --git a/test_tipc/configs/ch_PP-OCRv3_det/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv3_det/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..6e2ec22cbd51e686ab64ad832559de1e2442fc98
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3_det/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
@@ -0,0 +1,23 @@
+===========================serving_params===========================
+model_name:ch_PP-OCRv3_det
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:./inference/ch_PP-OCRv3_det_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:./deploy/pdserving/ppocr_det_v3_serving/
+--det_serving_client:./deploy/pdserving/ppocr_det_v3_client/
+--rec_dirname:null
+--rec_serving_server:null
+--rec_serving_client:null
+serving_dir:./deploy/pdserving
+web_service:web_service_det.py --config=config.yml --opt op.det.concurrency="1"
+op.det.local_service_conf.devices:gpu|null
+op.det.local_service_conf.use_mkldnn:False
+op.det.local_service_conf.thread_num:6
+op.det.local_service_conf.use_trt:False
+op.det.local_service_conf.precision:fp32
+op.det.local_service_conf.model_config:
+op.rec.local_service_conf.model_config:
+pipline:pipeline_http_client.py
+--image_dir:../../doc/imgs/1.jpg
diff --git a/test_tipc/configs/ch_PP-OCRv3_det/train_infer_python.txt b/test_tipc/configs/ch_PP-OCRv3_det/train_infer_python.txt
new file mode 100644
index 0000000000000000000000000000000000000000..a69e0ab81ec6963228e9ab2e39c5bb1d730b6323
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3_det/train_infer_python.txt
@@ -0,0 +1,53 @@
+===========================train_params===========================
+model_name:ch_PP-OCRv3_det
+python:python3.7
+gpu_list:0|0,1
+Global.use_gpu:True|True
+Global.auto_cast:fp32
+Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=50
+Global.save_model_dir:./output/
+Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4
+Global.pretrained_model:null
+train_model_name:latest
+train_infer_img_dir:./train_data/icdar2015/text_localization/ch4_test_images/
+null:null
+##
+trainer:norm_train
+norm_train:tools/train.py -c configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_cml.yml -o
+pact_train:null
+fpgm_train:null
+distill_train:null
+null:null
+null:null
+##
+===========================eval_params===========================
+eval:null
+null:null
+##
+===========================infer_params===========================
+Global.save_inference_dir:./output/
+Global.checkpoints:
+norm_export:tools/export_model.py -c configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_cml.yml -o
+quant_export:null
+fpgm_export:
+distill_export:null
+export1:null
+export2:null
+inference_dir:Student
+infer_model:./inference/ch_PP-OCRv3_det_infer/
+infer_export:null
+infer_quant:False
+inference:tools/infer/predict_det.py
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1
+--use_tensorrt:False
+--precision:fp32
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+null:null
+--benchmark:True
+null:null
+===========================infer_benchmark_params==========================
+random_infer_input:[{float32,[3,640,640]}];[{float32,[3,960,960]}]
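
`random_infer_input` describes the synthetic tensors used for benchmark-only runs: each `{dtype,[C,H,W]}` group is one input setting, and `;` separates settings (here the two DB input scales, 640 and 960). A small, hypothetical parser for that syntax:

```python
# Sketch: parse random_infer_input entries such as
# "[{float32,[3,640,640]}];[{float32,[3,960,960]}]" into dummy batches.
import re
import numpy as np

def parse_random_infer_input(spec: str):
    configs = []
    for group in spec.split(";"):
        m = re.match(r"\[\{(\w+),\[([\d,]+)\]\}\]", group)
        if m:
            dtype = m.group(1)
            shape = [int(d) for d in m.group(2).split(",")]
            configs.append((dtype, shape))
    return configs

for dtype, shape in parse_random_infer_input(
    "[{float32,[3,640,640]}];[{float32,[3,960,960]}]"
):
    dummy = np.random.rand(1, *shape).astype(dtype)  # batch size 1
    print(dtype, dummy.shape)
```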
diff --git a/test_tipc/configs/ch_PP-OCRv3_det/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv3_det/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..7e987125a6681629a592d43f05c2ecfe51dac3f1
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3_det/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt
@@ -0,0 +1,53 @@
+===========================train_params===========================
+model_name:ch_PP-OCRv3_det
+python:python3.7
+gpu_list:192.168.0.1,192.168.0.2;0,1
+Global.use_gpu:True
+Global.auto_cast:fp32
+Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=50
+Global.save_model_dir:./output/
+Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4
+Global.pretrained_model:null
+train_model_name:latest
+train_infer_img_dir:./train_data/icdar2015/text_localization/ch4_test_images/
+null:null
+##
+trainer:norm_train
+norm_train:tools/train.py -c configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_cml.yml -o
+pact_train:null
+fpgm_train:null
+distill_train:null
+null:null
+null:null
+##
+===========================eval_params===========================
+eval:null
+null:null
+##
+===========================infer_params===========================
+Global.save_inference_dir:./output/
+Global.checkpoints:
+norm_export:tools/export_model.py -c configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_cml.yml -o
+quant_export:null
+fpgm_export:
+distill_export:null
+export1:null
+export2:null
+inference_dir:Student
+infer_model:./inference/ch_PP-OCRv3_det_infer/
+infer_export:null
+infer_quant:False
+inference:tools/infer/predict_det.py
+--use_gpu:False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1
+--use_tensorrt:False
+--precision:fp32
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+null:null
+--benchmark:True
+null:null
+===========================infer_benchmark_params==========================
+random_infer_input:[{float32,[3,640,640]}];[{float32,[3,960,960]}]
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_det_PACT/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv3_det/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
similarity index 65%
rename from test_tipc/configs/ch_ppocr_mobile_v2.0_det_PACT/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
rename to test_tipc/configs/ch_PP-OCRv3_det/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
index 1f9bec12ada6894fcffbe697ae4da2f0df95cc62..fe72cfb4e9ec50d24b1b115eca129a0bc9534b9c 100644
--- a/test_tipc/configs/ch_ppocr_mobile_v2.0_det_PACT/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
+++ b/test_tipc/configs/ch_PP-OCRv3_det/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
@@ -1,10 +1,10 @@
===========================train_params===========================
-model_name:ch_ppocr_mobile_v2.0_det_PACT
+model_name:ch_PP-OCRv3_det
python:python3.7
gpu_list:0|0,1
Global.use_gpu:True|True
Global.auto_cast:amp
-Global.epoch_num:lite_train_lite_infer=20|whole_train_whole_infer=300
+Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=50
Global.save_model_dir:./output/
Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4
Global.pretrained_model:null
@@ -12,9 +12,9 @@ train_model_name:latest
train_infer_img_dir:./train_data/icdar2015/text_localization/ch4_test_images/
null:null
##
-trainer:pact_train
-norm_train:null
-pact_train:deploy/slim/quantization/quant.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o
+trainer:norm_train
+norm_train:tools/train.py -c configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_cml.yml -o
+pact_train:null
fpgm_train:null
distill_train:null
null:null
@@ -27,23 +27,23 @@ null:null
===========================infer_params===========================
Global.save_inference_dir:./output/
Global.checkpoints:
-norm_export:null
-quant_export:deploy/slim/quantization/export_model.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o
-fpgm_export:null
+norm_export:tools/export_model.py -c configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_cml.yml -o
+quant_export:null
+fpgm_export:
distill_export:null
export1:null
export2:null
-inference_dir:null
-train_model:./inference/ch_ppocr_mobile_v2.0_det_prune_infer/
+inference_dir:Student
+infer_model:./inference/ch_PP-OCRv3_det_infer/
infer_export:null
infer_quant:False
inference:tools/infer/predict_det.py
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1
---use_tensorrt:False|True
---precision:fp32|fp16|int8
+--use_tensorrt:False
+--precision:fp32
--det_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/
null:null
diff --git a/test_tipc/configs/ch_PP-OCRv2_det_PACT/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv3_det/train_pact_infer_python.txt
similarity index 78%
rename from test_tipc/configs/ch_PP-OCRv2_det_PACT/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
rename to test_tipc/configs/ch_PP-OCRv3_det/train_pact_infer_python.txt
index d922a4a5dad67da81e3c9cf7bed48a0431a88b84..b536e69b05878fac3c2672a063ed7869d7a784fe 100644
--- a/test_tipc/configs/ch_PP-OCRv2_det_PACT/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
+++ b/test_tipc/configs/ch_PP-OCRv3_det/train_pact_infer_python.txt
@@ -1,12 +1,12 @@
===========================train_params===========================
-model_name:ch_PPOCRv2_det_PACT
+model_name:ch_PP-OCRv3_det_PACT
python:python3.7
gpu_list:0|0,1
Global.use_gpu:True|True
-Global.auto_cast:amp
-Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=500
+Global.auto_cast:fp32
+Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=50
Global.save_model_dir:./output/
-Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4
+Train.loader.batch_size_per_card:lite_train_lite_infer=1|whole_train_whole_infer=4
Global.pretrained_model:null
train_model_name:latest
train_infer_img_dir:./train_data/icdar2015/text_localization/ch4_test_images/
@@ -14,7 +14,7 @@ null:null
##
trainer:pact_train
norm_train:null
-pact_train:deploy/slim/quantization/quant.py -c configs/det/ch_PP-OCRv2/ch_PP-OCRv2_det_cml.yml -o
+pact_train:deploy/slim/quantization/quant.py -c configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_cml.yml -o
fpgm_train:null
distill_train:null
null:null
@@ -28,22 +28,22 @@ null:null
Global.save_inference_dir:./output/
Global.checkpoints:
norm_export:null
-quant_export:deploy/slim/quantization/export_model.py -c configs/det/ch_PP-OCRv2/ch_PP-OCRv2_det_cml.yml -o
+quant_export:deploy/slim/quantization/export_model.py -c configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_cml.yml -o
fpgm_export:
distill_export:null
export1:null
export2:null
inference_dir:Student
-infer_model:./inference/ch_PP-OCRv2_det_infer/
+infer_model:./inference/ch_PP-OCRv3_det_infer/
infer_export:null
infer_quant:False
inference:tools/infer/predict_det.py
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1
---use_tensorrt:False|True
---precision:fp32|fp16|int8
+--use_tensorrt:False
+--precision:fp32
--det_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/
null:null
diff --git a/test_tipc/configs/ch_PP-OCRv3_det/train_ptq_infer_python.txt b/test_tipc/configs/ch_PP-OCRv3_det/train_ptq_infer_python.txt
new file mode 100644
index 0000000000000000000000000000000000000000..c27e08a64075e15e4fbd8d4ffab7001752365417
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3_det/train_ptq_infer_python.txt
@@ -0,0 +1,21 @@
+===========================kl_quant_params===========================
+model_name:ch_PP-OCRv3_det_KL
+python:python3.7
+Global.pretrained_model:null
+Global.save_inference_dir:null
+infer_model:./inference/ch_PP-OCRv3_det_infer/
+infer_export:deploy/slim/quantization/quant_kl.py -c configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_cml.yml -o
+infer_quant:True
+inference:tools/infer/predict_det.py
+--use_gpu:False|True
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1
+--use_tensorrt:False
+--precision:int8
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+null:null
+--benchmark:True
+null:null
+null:null
diff --git a/test_tipc/configs/ch_PP-OCRv3_det_KL/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv3_det_KL/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..a34ffe22ae02c698fcc757c913d3fcadd572df2f
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3_det_KL/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,20 @@
+===========================cpp_infer_params===========================
+model_name:ch_PP-OCRv3_det_KL
+use_opencv:True
+infer_model:./inference/ch_PP-OCRv3_det_klquant_infer
+infer_quant:False
+inference:./deploy/cpp_infer/build/ppocr
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1
+--use_tensorrt:False
+--precision:fp32
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+null:null
+--benchmark:True
+--det:True
+--rec:False
+--cls:False
+--use_angle_cls:False
\ No newline at end of file
diff --git a/test_tipc/configs/ch_PP-OCRv3_det_KL/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv3_det_KL/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..22b429760ce6e83293ddaf074898002a3b0c8995
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3_det_KL/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,19 @@
+===========================serving_params===========================
+model_name:ch_PP-OCRv3_det_KL
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:./inference/ch_PP-OCRv3_det_klquant_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:./deploy/pdserving/ppocr_det_v3_kl_serving/
+--det_serving_client:./deploy/pdserving/ppocr_det_v3_kl_client/
+--rec_dirname:./inference/ch_PP-OCRv3_rec_klquant_infer/
+--rec_serving_server:./deploy/pdserving/ppocr_rec_v3_kl_serving/
+--rec_serving_client:./deploy/pdserving/ppocr_rec_v3_kl_client/
+serving_dir:./deploy/pdserving
+web_service:-m paddle_serving_server.serve
+--op:GeneralDetectionOp GeneralInferOp
+--port:8181
+--gpu_id:"0"|null
+cpp_client:ocr_cpp_client.py
+--image_dir:../../doc/imgs/1.jpg
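
Unlike the Python pipeline configs, this C++ serving variant starts the server directly through `paddle_serving_server.serve`, chaining `GeneralDetectionOp` into `GeneralInferOp`; the `"0"|null` alternatives on `--gpu_id` select a GPU run and a CPU run. A sketch of the two launch lines, assuming the converted det/rec server directories are passed via the standard `--model` flag:

```python
# Sketch: launch lines for the C++ serving test described above.
def serve_cmd(gpu_id=None):
    cmd = (
        "python3.7 -m paddle_serving_server.serve"
        " --model ppocr_det_v3_kl_serving ppocr_rec_v3_kl_serving"
        " --op GeneralDetectionOp GeneralInferOp"
        " --port 8181"
    )
    if gpu_id is not None:
        cmd += f" --gpu_id {gpu_id}"  # the "0" alternative; null omits the flag
    return cmd

print(serve_cmd("0"))  # GPU variant
print(serve_cmd())     # CPU variant
```

`ocr_cpp_client.py` then sends `../../doc/imgs/1.jpg` against port 8181.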
diff --git a/test_tipc/configs/ch_PP-OCRv3_det_KL/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv3_det_KL/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..23dbc49a388b3ff9bfe65a775bf28cf70e0d9a0b
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3_det_KL/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
@@ -0,0 +1,23 @@
+===========================serving_params===========================
+model_name:ch_PP-OCRv3_det_KL
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:./inference/ch_PP-OCRv3_det_klquant_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:./deploy/pdserving/ppocr_det_v3_kl_serving/
+--det_serving_client:./deploy/pdserving/ppocr_det_v3_kl_client/
+--rec_dirname:null
+--rec_serving_server:null
+--rec_serving_client:null
+serving_dir:./deploy/pdserving
+web_service:web_service_det.py --config=config.yml --opt op.det.concurrency="1"
+op.det.local_service_conf.devices:gpu|null
+op.det.local_service_conf.use_mkldnn:False
+op.det.local_service_conf.thread_num:6
+op.det.local_service_conf.use_trt:False
+op.det.local_service_conf.precision:fp32
+op.det.local_service_conf.model_config:
+op.rec.local_service_conf.model_config:
+pipline:pipeline_http_client.py
+--image_dir:../../doc/imgs/1.jpg
diff --git a/test_tipc/configs/ch_PP-OCRv3_det_PACT/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv3_det_PACT/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..3198b87552bc8c5e29ad7dd6e158c5b592aa82d5
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3_det_PACT/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,20 @@
+===========================cpp_infer_params===========================
+model_name:ch_PP-OCRv3_det_PACT
+use_opencv:True
+infer_model:./inference/ch_PP-OCRv3_det_pact_infer
+infer_quant:False
+inference:./deploy/cpp_infer/build/ppocr
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1
+--use_tensorrt:False
+--precision:fp32
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+null:null
+--benchmark:True
+--det:True
+--rec:False
+--cls:False
+--use_angle_cls:False
\ No newline at end of file
diff --git a/test_tipc/configs/ch_PP-OCRv3_det_PACT/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv3_det_PACT/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..7d300f4456b4761e885d46c1a4f8ece916d0c2ea
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3_det_PACT/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,19 @@
+===========================serving_params===========================
+model_name:ch_PP-OCRv3_det_PACT
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:./inference/ch_PP-OCRv3_det_pact_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:./deploy/pdserving/ppocr_det_v3_pact_serving/
+--det_serving_client:./deploy/pdserving/ppocr_det_v3_pact_client/
+--rec_dirname:./inference/ch_PP-OCRv3_rec_pact_infer/
+--rec_serving_server:./deploy/pdserving/ppocr_rec_v3_pact_serving/
+--rec_serving_client:./deploy/pdserving/ppocr_rec_v3_pact_client/
+serving_dir:./deploy/pdserving
+web_service:-m paddle_serving_server.serve
+--op:GeneralDetectionOp GeneralInferOp
+--port:8181
+--gpu_id:"0"|null
+cpp_client:ocr_cpp_client.py
+--image_dir:../../doc/imgs/1.jpg
diff --git a/test_tipc/configs/ch_PP-OCRv3_det_PACT/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv3_det_PACT/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..4546644cbca46a8d11069f4149c3bae8683630c2
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3_det_PACT/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
@@ -0,0 +1,23 @@
+===========================serving_params===========================
+model_name:ch_PP-OCRv3_det_PACT
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:./inference/ch_PP-OCRv3_det_pact_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:./deploy/pdserving/ppocr_det_v3_pact_serving/
+--det_serving_client:./deploy/pdserving/ppocr_det_v3_pact_client/
+--rec_dirname:null
+--rec_serving_server:null
+--rec_serving_client:null
+serving_dir:./deploy/pdserving
+web_service:web_service_det.py --config=config.yml --opt op.det.concurrency="1"
+op.det.local_service_conf.devices:gpu|null
+op.det.local_service_conf.use_mkldnn:False
+op.det.local_service_conf.thread_num:6
+op.det.local_service_conf.use_trt:False
+op.det.local_service_conf.precision:fp32
+op.det.local_service_conf.model_config:
+op.rec.local_service_conf.model_config:
+pipline:pipeline_http_client.py
+--image_dir:../../doc/imgs/1.jpg
diff --git a/test_tipc/configs/ch_PP-OCRv3_rec/ch_PP-OCRv3_rec_distillation.yml b/test_tipc/configs/ch_PP-OCRv3_rec/ch_PP-OCRv3_rec_distillation.yml
new file mode 100644
index 0000000000000000000000000000000000000000..f704a1dfb5dc2335c353a495dfbc0ce42cf35bf4
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3_rec/ch_PP-OCRv3_rec_distillation.yml
@@ -0,0 +1,205 @@
+Global:
+ debug: false
+ use_gpu: true
+ epoch_num: 800
+ log_smooth_window: 20
+ print_batch_step: 10
+ save_model_dir: ./output/rec_ppocr_v3_distillation
+ save_epoch_step: 3
+ eval_batch_step: [0, 2000]
+ cal_metric_during_train: true
+ pretrained_model:
+ checkpoints:
+ save_inference_dir:
+ use_visualdl: false
+ infer_img: doc/imgs_words/ch/word_1.jpg
+ character_dict_path: ppocr/utils/ppocr_keys_v1.txt
+ max_text_length: &max_text_length 25
+ infer_mode: false
+ use_space_char: true
+ distributed: true
+ save_res_path: ./output/rec/predicts_ppocrv3_distillation.txt
+
+
+Optimizer:
+ name: Adam
+ beta1: 0.9
+ beta2: 0.999
+ lr:
+ name: Piecewise
+    decay_epochs: [700, 800]
+    values: [0.0005, 0.00005]
+ warmup_epoch: 5
+ regularizer:
+ name: L2
+ factor: 3.0e-05
+
+
+Architecture:
+ model_type: &model_type "rec"
+ name: DistillationModel
+ algorithm: Distillation
+ Models:
+ Teacher:
+ pretrained:
+ freeze_params: false
+ return_all_feats: true
+ model_type: *model_type
+ algorithm: SVTR
+ Transform:
+ Backbone:
+ name: MobileNetV1Enhance
+ scale: 0.5
+ last_conv_stride: [1, 2]
+ last_pool_type: avg
+ Head:
+ name: MultiHead
+ head_list:
+ - CTCHead:
+ Neck:
+ name: svtr
+ dims: 64
+ depth: 2
+ hidden_dims: 120
+ use_guide: True
+ Head:
+ fc_decay: 0.00001
+ - SARHead:
+ enc_dim: 512
+ max_text_length: *max_text_length
+ Student:
+ pretrained:
+ freeze_params: false
+ return_all_feats: true
+ model_type: *model_type
+ algorithm: SVTR
+ Transform:
+ Backbone:
+ name: MobileNetV1Enhance
+ scale: 0.5
+ last_conv_stride: [1, 2]
+ last_pool_type: avg
+ Head:
+ name: MultiHead
+ head_list:
+ - CTCHead:
+ Neck:
+ name: svtr
+ dims: 64
+ depth: 2
+ hidden_dims: 120
+ use_guide: True
+ Head:
+ fc_decay: 0.00001
+ - SARHead:
+ enc_dim: 512
+ max_text_length: *max_text_length
+Loss:
+ name: CombinedLoss
+ loss_config_list:
+ - DistillationDMLLoss:
+ weight: 1.0
+ act: "softmax"
+ use_log: true
+ model_name_pairs:
+ - ["Student", "Teacher"]
+ key: head_out
+ multi_head: True
+ dis_head: ctc
+ name: dml_ctc
+ - DistillationDMLLoss:
+ weight: 0.5
+ act: "softmax"
+ use_log: true
+ model_name_pairs:
+ - ["Student", "Teacher"]
+ key: head_out
+ multi_head: True
+ dis_head: sar
+ name: dml_sar
+ - DistillationDistanceLoss:
+ weight: 1.0
+ mode: "l2"
+ model_name_pairs:
+ - ["Student", "Teacher"]
+ key: backbone_out
+ - DistillationCTCLoss:
+ weight: 1.0
+ model_name_list: ["Student", "Teacher"]
+ key: head_out
+ multi_head: True
+ - DistillationSARLoss:
+ weight: 1.0
+ model_name_list: ["Student", "Teacher"]
+ key: head_out
+ multi_head: True
+
+PostProcess:
+ name: DistillationCTCLabelDecode
+ model_name: ["Student", "Teacher"]
+ key: head_out
+ multi_head: True
+
+Metric:
+ name: DistillationMetric
+ base_metric_name: RecMetric
+ main_indicator: acc
+ key: "Student"
+ ignore_space: True
+
+Train:
+ dataset:
+ name: SimpleDataSet
+ data_dir: ./train_data/ic15_data/
+ ext_op_transform_idx: 1
+ label_file_list:
+ - ./train_data/ic15_data/rec_gt_train_lite.txt
+ transforms:
+ - DecodeImage:
+ img_mode: BGR
+ channel_first: false
+ - RecConAug:
+ prob: 0.5
+ ext_data_num: 2
+ image_shape: [48, 320, 3]
+ - RecAug:
+ - MultiLabelEncode:
+ - RecResizeImg:
+ image_shape: [3, 48, 320]
+ - KeepKeys:
+ keep_keys:
+ - image
+ - label_ctc
+ - label_sar
+ - length
+ - valid_ratio
+ loader:
+ shuffle: true
+ batch_size_per_card: 128
+ drop_last: true
+ num_workers: 4
+Eval:
+ dataset:
+ name: SimpleDataSet
+ data_dir: ./train_data/ic15_data
+ label_file_list:
+ - ./train_data/ic15_data/rec_gt_test_lite.txt
+ transforms:
+ - DecodeImage:
+ img_mode: BGR
+ channel_first: false
+ - MultiLabelEncode:
+ - RecResizeImg:
+ image_shape: [3, 48, 320]
+ - KeepKeys:
+ keep_keys:
+ - image
+ - label_ctc
+ - label_sar
+ - length
+ - valid_ratio
+ loader:
+ shuffle: false
+ drop_last: false
+ batch_size_per_card: 128
+ num_workers: 4
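
This distillation YAML is the config that the train/export entries in the surrounding `train_params` files feed to `tools/train.py` and `tools/export_model.py`; the trailing `-o` in those entries appends dotted-key overrides (for example `Global.epoch_num=1` for the lite run) on top of the values above. A minimal sketch of that override semantics, assuming straightforward dotted-key assignment into the loaded YAML:

```python
# Sketch: apply '-o Global.epoch_num=1' style overrides to the YAML config.
import yaml

def apply_override(cfg: dict, dotted_key: str, value):
    node = cfg
    keys = dotted_key.split(".")
    for k in keys[:-1]:
        node = node.setdefault(k, {})
    node[keys[-1]] = value

with open("test_tipc/configs/ch_PP-OCRv3_rec/ch_PP-OCRv3_rec_distillation.yml") as f:
    cfg = yaml.safe_load(f)

apply_override(cfg, "Global.epoch_num", 1)                  # lite_train_lite_infer
apply_override(cfg, "Train.loader.batch_size_per_card", 16)
print(cfg["Global"]["epoch_num"])  # 1, overriding the 800 in the file
```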
diff --git a/test_tipc/configs/ch_PP-OCRv3_rec/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv3_rec/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..9d6ca2cf5ec6413cb675389218cbb3b82770ba73
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3_rec/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,20 @@
+===========================cpp_infer_params===========================
+model_name:ch_PP-OCRv3_rec
+use_opencv:True
+infer_model:./inference/ch_PP-OCRv3_rec_infer/
+infer_quant:False
+inference:./deploy/cpp_infer/build/ppocr --rec_img_h=48 --rec_char_dict_path=./ppocr/utils/ppocr_keys_v1.txt
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:6
+--use_tensorrt:False
+--precision:fp32
+--rec_model_dir:
+--image_dir:./inference/rec_inference/
+null:null
+--benchmark:True
+--det:False
+--rec:True
+--cls:False
+--use_angle_cls:False
\ No newline at end of file
diff --git a/test_tipc/configs/ch_PP-OCRv3_rec/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt b/test_tipc/configs/ch_PP-OCRv3_rec/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..9114c0acfd8ea41952cbc301fcd53d639de050ef
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3_rec/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt
@@ -0,0 +1,17 @@
+===========================paddle2onnx_params===========================
+model_name:ch_PP-OCRv3_rec
+python:python3.7
+2onnx: paddle2onnx
+--det_model_dir:
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_save_file:
+--rec_model_dir:./inference/ch_PP-OCRv3_rec_infer/
+--rec_save_file:./inference/rec_v3_onnx/model.onnx
+--opset_version:10
+--enable_onnx_checker:True
+inference:tools/infer/predict_rec.py --rec_image_shape="3,48,320"
+--use_gpu:True|False
+--det_model_dir:
+--rec_model_dir:
+--image_dir:./inference/rec_inference/
\ No newline at end of file
diff --git a/test_tipc/configs/ch_PP-OCRv3_rec/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv3_rec/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..f01db2e9504ffd400609f0f6556d92ea33ee49ad
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3_rec/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
@@ -0,0 +1,23 @@
+===========================serving_params===========================
+model_name:ch_PP-OCRv3_rec
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:null
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:null
+--det_serving_client:null
+--rec_dirname:./inference/ch_PP-OCRv3_rec_infer/
+--rec_serving_server:./deploy/pdserving/ppocr_rec_v3_serving/
+--rec_serving_client:./deploy/pdserving/ppocr_rec_v3_client/
+serving_dir:./deploy/pdserving
+web_service:web_service_rec.py --config=config.yml --opt op.rec.concurrency="1"
+op.det.local_service_conf.devices:gpu|null
+op.det.local_service_conf.use_mkldnn:False
+op.det.local_service_conf.thread_num:6
+op.det.local_service_conf.use_trt:False
+op.det.local_service_conf.precision:fp32
+op.det.local_service_conf.model_config:
+op.rec.local_service_conf.model_config:
+pipline:pipeline_http_client.py --det=False
+--image_dir:../../inference/rec_inference
diff --git a/test_tipc/configs/ch_PP-OCRv3_rec/train_infer_python.txt b/test_tipc/configs/ch_PP-OCRv3_rec/train_infer_python.txt
new file mode 100644
index 0000000000000000000000000000000000000000..1feb9d49fce69d92ef141c3a942f858fc68cfaab
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3_rec/train_infer_python.txt
@@ -0,0 +1,53 @@
+===========================train_params===========================
+model_name:ch_PP-OCRv3_rec
+python:python3.7
+gpu_list:0|0,1
+Global.use_gpu:True|True
+Global.auto_cast:fp32
+Global.epoch_num:lite_train_lite_infer=3|whole_train_whole_infer=50
+Global.save_model_dir:./output/
+Train.loader.batch_size_per_card:lite_train_lite_infer=16|whole_train_whole_infer=128
+Global.pretrained_model:null
+train_model_name:latest
+train_infer_img_dir:./inference/rec_inference
+null:null
+##
+trainer:norm_train
+norm_train:tools/train.py -c test_tipc/configs/ch_PP-OCRv3_rec/ch_PP-OCRv3_rec_distillation.yml -o
+pact_train:null
+fpgm_train:null
+distill_train:null
+null:null
+null:null
+##
+===========================eval_params===========================
+eval:null
+null:null
+##
+===========================infer_params===========================
+Global.save_inference_dir:./output/
+Global.checkpoints:
+norm_export:tools/export_model.py -c test_tipc/configs/ch_PP-OCRv3_rec/ch_PP-OCRv3_rec_distillation.yml -o
+quant_export:
+fpgm_export:
+distill_export:null
+export1:null
+export2:null
+inference_dir:Student
+infer_model:./inference/ch_PP-OCRv3_rec_infer
+infer_export:null
+infer_quant:False
+inference:tools/infer/predict_rec.py --rec_image_shape="3,48,320"
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1|6
+--use_tensorrt:False
+--precision:fp32
+--rec_model_dir:
+--image_dir:./inference/rec_inference
+null:null
+--benchmark:True
+null:null
+===========================infer_benchmark_params==========================
+random_infer_input:[{float32,[3,48,320]}]
diff --git a/test_tipc/configs/ch_PP-OCRv3_rec/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv3_rec/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..7fcc8b4418c65b0f98624d92bd3896518f2ed465
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3_rec/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt
@@ -0,0 +1,53 @@
+===========================train_params===========================
+model_name:ch_PP-OCRv3_rec
+python:python3.7
+gpu_list:192.168.0.1,192.168.0.2;0,1
+Global.use_gpu:True
+Global.auto_cast:fp32
+Global.epoch_num:lite_train_lite_infer=3|whole_train_whole_infer=50
+Global.save_model_dir:./output/
+Train.loader.batch_size_per_card:lite_train_lite_infer=16|whole_train_whole_infer=64
+Global.pretrained_model:null
+train_model_name:latest
+train_infer_img_dir:./inference/rec_inference
+null:null
+##
+trainer:norm_train
+norm_train:tools/train.py -c test_tipc/configs/ch_PP-OCRv3_rec/ch_PP-OCRv3_rec_distillation.yml -o
+pact_train:null
+fpgm_train:null
+distill_train:null
+null:null
+null:null
+##
+===========================eval_params===========================
+eval:null
+null:null
+##
+===========================infer_params===========================
+Global.save_inference_dir:./output/
+Global.checkpoints:
+norm_export:tools/export_model.py -c test_tipc/configs/ch_PP-OCRv3_rec/ch_PP-OCRv3_rec_distillation.yml -o
+quant_export:
+fpgm_export:
+distill_export:null
+export1:null
+export2:null
+inference_dir:Student
+infer_model:./inference/ch_PP-OCRv3_rec_infer
+infer_export:null
+infer_quant:False
+inference:tools/infer/predict_rec.py --rec_image_shape="3,48,320"
+--use_gpu:False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1|6
+--use_tensorrt:False
+--precision:fp32
+--rec_model_dir:
+--image_dir:./inference/rec_inference
+null:null
+--benchmark:True
+null:null
+===========================infer_benchmark_params==========================
+random_infer_input:[{float32,[3,48,320]}]
diff --git a/test_tipc/configs/ch_PP-OCRv3_rec/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv3_rec/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..99e3c4247716bdf139440e22ea80c8542b2d9830
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3_rec/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
@@ -0,0 +1,53 @@
+===========================train_params===========================
+model_name:ch_PP-OCRv3_rec
+python:python3.7
+gpu_list:0|0,1
+Global.use_gpu:True|True
+Global.auto_cast:amp
+Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=50
+Global.save_model_dir:./output/
+Train.loader.batch_size_per_card:lite_train_lite_infer=16|whole_train_whole_infer=128
+Global.pretrained_model:null
+train_model_name:latest
+train_infer_img_dir:./inference/rec_inference
+null:null
+##
+trainer:norm_train
+norm_train:tools/train.py -c test_tipc/configs/ch_PP-OCRv3_rec/ch_PP-OCRv3_rec_distillation.yml -o
+pact_train:null
+fpgm_train:null
+distill_train:null
+null:null
+null:null
+##
+===========================eval_params===========================
+eval:null
+null:null
+##
+===========================infer_params===========================
+Global.save_inference_dir:./output/
+Global.checkpoints:
+norm_export:tools/export_model.py -c test_tipc/configs/ch_PP-OCRv3_rec/ch_PP-OCRv3_rec_distillation.yml -o
+quant_export:
+fpgm_export:
+distill_export:null
+export1:null
+export2:null
+inference_dir:Student
+infer_model:./inference/ch_PP-OCRv3_rec_infer
+infer_export:null
+infer_quant:False
+inference:tools/infer/predict_rec.py --rec_image_shape="3,48,320"
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:6
+--use_tensorrt:False
+--precision:fp32
+--rec_model_dir:
+--image_dir:./inference/rec_inference
+null:null
+--benchmark:True
+null:null
+===========================infer_benchmark_params==========================
+random_infer_input:[{float32,[3,48,320]}]
diff --git a/test_tipc/configs/ch_PP-OCRv2_rec_PACT/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv3_rec/train_pact_infer_python.txt
similarity index 66%
rename from test_tipc/configs/ch_PP-OCRv2_rec_PACT/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
rename to test_tipc/configs/ch_PP-OCRv3_rec/train_pact_infer_python.txt
index e22d8a564b008206611469048b424b528dd379bd..24469a91cfa0fa7d26ff24dba13c9c7e78a5ca10 100644
--- a/test_tipc/configs/ch_PP-OCRv2_rec_PACT/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
+++ b/test_tipc/configs/ch_PP-OCRv3_rec/train_pact_infer_python.txt
@@ -1,20 +1,20 @@
===========================train_params===========================
-model_name:ch_PPOCRv2_rec_PACT
+model_name:ch_PP-OCRv3_rec_PACT
python:python3.7
gpu_list:0|0,1
Global.use_gpu:True|True
-Global.auto_cast:amp
-Global.epoch_num:lite_train_lite_infer=3|whole_train_whole_infer=300
+Global.auto_cast:fp32
+Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=50
Global.save_model_dir:./output/
Train.loader.batch_size_per_card:lite_train_lite_infer=16|whole_train_whole_infer=128
-Global.pretrained_model:null
+Global.pretrained_model:pretrain_models/ch_PP-OCRv3_rec_train/best_accuracy
train_model_name:latest
train_infer_img_dir:./inference/rec_inference
null:null
##
trainer:pact_train
norm_train:null
-pact_train:deploy/slim/quantization/quant.py -c test_tipc/configs/ch_PP-OCRv2_rec/ch_PP-OCRv2_rec_distillation.yml -o
+pact_train:deploy/slim/quantization/quant.py -c test_tipc/configs/ch_PP-OCRv3_rec/ch_PP-OCRv3_rec_distillation.yml -o
fpgm_train:null
distill_train:null
null:null
@@ -28,26 +28,26 @@ null:null
Global.save_inference_dir:./output/
Global.checkpoints:
norm_export:null
-quant_export:deploy/slim/quantization/export_model.py -c test_tipc/configs/ch_PP-OCRv2_rec/ch_PP-OCRv2_rec_distillation.yml -o
+quant_export:deploy/slim/quantization/export_model.py -c test_tipc/configs/ch_PP-OCRv3_rec/ch_PP-OCRv3_rec_distillation.yml -o
fpgm_export: null
distill_export:null
export1:null
export2:null
inference_dir:Student
-infer_model:./inference/ch_PP-OCRv2_rec_slim_quant_infer
+infer_model:./inference/ch_PP-OCRv3_rec_slim_infer
infer_export:null
infer_quant:True
-inference:tools/infer/predict_rec.py
+inference:tools/infer/predict_rec.py --rec_image_shape="3,48,320"
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1|6
---use_tensorrt:False|True
---precision:fp32|int8
+--use_tensorrt:False
+--precision:fp32
--rec_model_dir:
--image_dir:./inference/rec_inference
null:null
--benchmark:True
null:null
===========================infer_benchmark_params==========================
-random_infer_input:[{float32,[3,32,320]}]
+random_infer_input:[{float32,[3,48,320]}]
diff --git a/test_tipc/configs/ch_PP-OCRv3_rec/train_ptq_infer_python.txt b/test_tipc/configs/ch_PP-OCRv3_rec/train_ptq_infer_python.txt
new file mode 100644
index 0000000000000000000000000000000000000000..d1a8c7c00137661bb1b5cced46f7616877f0b0a2
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3_rec/train_ptq_infer_python.txt
@@ -0,0 +1,21 @@
+===========================kl_quant_params===========================
+model_name:ch_PP-OCRv3_rec_KL
+python:python3.7
+Global.pretrained_model:
+Global.save_inference_dir:null
+infer_model:./inference/ch_PP-OCRv3_rec_infer/
+infer_export:deploy/slim/quantization/quant_kl.py -c test_tipc/configs/ch_PP-OCRv3_rec/ch_PP-OCRv3_rec_distillation.yml -o
+infer_quant:True
+inference:tools/infer/predict_rec.py --rec_image_shape="3,48,320"
+--use_gpu:False|True
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1|6
+--use_tensorrt:False
+--precision:int8
+--rec_model_dir:
+--image_dir:./inference/rec_inference
+null:null
+--benchmark:True
+null:null
+null:null
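
The kl_quant_params flow is two offline steps rather than a training run: `infer_export` calls `quant_kl.py` to produce a KL post-training-quantized model, and the `inference` entry then benchmarks it at int8 precision. A sketch of the two commands, with an illustrative output directory (the real driver substitutes its own paths):

```python
# Sketch: the export-then-benchmark flow described by kl_quant_params.
steps = [
    # 1) offline KL quantization of the exported rec model
    "python3.7 deploy/slim/quantization/quant_kl.py"
    " -c test_tipc/configs/ch_PP-OCRv3_rec/ch_PP-OCRv3_rec_distillation.yml -o"
    " Global.save_inference_dir=./output/rec_v3_kl/",  # hypothetical output dir
    # 2) int8 inference benchmark on the quantized model
    "python3.7 tools/infer/predict_rec.py --rec_image_shape=3,48,320"
    " --precision=int8 --rec_model_dir=./output/rec_v3_kl/"
    " --image_dir=./inference/rec_inference --benchmark=True",
]
for cmd in steps:
    print(cmd)
```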
diff --git a/test_tipc/configs/ch_PP-OCRv3_rec_KL/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv3_rec_KL/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..f1a308fcc8831f4d1ab66dffc69ee3deb9ac0bc3
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3_rec_KL/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,20 @@
+===========================cpp_infer_params===========================
+model_name:ch_PP-OCRv3_rec_KL
+use_opencv:True
+infer_model:./inference/ch_PP-OCRv3_rec_klquant_infer
+infer_quant:False
+inference:./deploy/cpp_infer/build/ppocr --rec_img_h=48 --rec_char_dict_path=./ppocr/utils/ppocr_keys_v1.txt
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:6
+--use_tensorrt:False
+--precision:fp32
+--rec_model_dir:
+--image_dir:./inference/rec_inference/
+null:null
+--benchmark:True
+--det:False
+--rec:True
+--cls:False
+--use_angle_cls:False
\ No newline at end of file
diff --git a/test_tipc/configs/ch_PP-OCRv3_rec_KL/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv3_rec_KL/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..fa6f04e48c99f732624f60cba8df67783969ced5
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3_rec_KL/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,19 @@
+===========================serving_params===========================
+model_name:ch_PP-OCRv3_rec_KL
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:./inference/ch_PP-OCRv3_det_klquant_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:./deploy/pdserving/ppocr_det_v3_kl_serving/
+--det_serving_client:./deploy/pdserving/ppocr_det_v3_kl_client/
+--rec_dirname:./inference/ch_PP-OCRv3_rec_klquant_infer/
+--rec_serving_server:./deploy/pdserving/ppocr_rec_v3_kl_serving/
+--rec_serving_client:./deploy/pdserving/ppocr_rec_v3_kl_client/
+serving_dir:./deploy/pdserving
+web_service:-m paddle_serving_server.serve
+--op:GeneralDetectionOp GeneralInferOp
+--port:8181
+--gpu_id:"0"|null
+cpp_client:ocr_cpp_client.py
+--image_dir:../../doc/imgs/1.jpg
diff --git a/test_tipc/configs/ch_PP-OCRv3_rec_KL/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv3_rec_KL/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..68586af0d97c658c07f0ce65b8ac5af44e909592
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3_rec_KL/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
@@ -0,0 +1,23 @@
+===========================serving_params===========================
+model_name:ch_PP-OCRv3_rec_KL
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:null
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:null
+--det_serving_client:null
+--rec_dirname:./inference/ch_PP-OCRv3_rec_klquant_infer/
+--rec_serving_server:./deploy/pdserving/ppocr_rec_v3_kl_serving/
+--rec_serving_client:./deploy/pdserving/ppocr_rec_v3_kl_client/
+serving_dir:./deploy/pdserving
+web_service:web_service_rec.py --config=config.yml --opt op.rec.concurrency="1"
+op.det.local_service_conf.devices:gpu|null
+op.det.local_service_conf.use_mkldnn:False
+op.det.local_service_conf.thread_num:6
+op.det.local_service_conf.use_trt:False
+op.det.local_service_conf.precision:fp32
+op.det.local_service_conf.model_config:
+op.rec.local_service_conf.model_config:
+pipline:pipeline_http_client.py --det=False
+--image_dir:../../inference/rec_inference
diff --git a/test_tipc/configs/ch_PP-OCRv3_rec_PACT/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv3_rec_PACT/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..8fc1132ca6016d78846527345528db2e003d5ad1
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3_rec_PACT/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,20 @@
+===========================cpp_infer_params===========================
+model_name:ch_PP-OCRv3_rec_PACT
+use_opencv:True
+infer_model:./inference/ch_PP-OCRv3_rec_pact_infer
+infer_quant:False
+inference:./deploy/cpp_infer/build/ppocr --rec_img_h=48 --rec_char_dict_path=./ppocr/utils/ppocr_keys_v1.txt
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:6
+--use_tensorrt:False
+--precision:fp32
+--rec_model_dir:
+--image_dir:./inference/rec_inference/
+null:null
+--benchmark:True
+--det:False
+--rec:True
+--cls:False
+--use_angle_cls:False
\ No newline at end of file
diff --git a/test_tipc/configs/ch_PP-OCRv3_rec_PACT/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv3_rec_PACT/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..008df50d6d32c701ec7ce3814b2f0490916f2d7a
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3_rec_PACT/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,19 @@
+===========================serving_params===========================
+model_name:ch_PP-OCRv3_rec_PACT
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:./inference/ch_PP-OCRv3_det_pact_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:./deploy/pdserving/ppocr_det_v3_pact_serving/
+--det_serving_client:./deploy/pdserving/ppocr_det_v3_pact_client/
+--rec_dirname:./inference/ch_PP-OCRv3_rec_pact_infer/
+--rec_serving_server:./deploy/pdserving/ppocr_rec_v3_pact_serving/
+--rec_serving_client:./deploy/pdserving/ppocr_rec_v3_pact_client/
+serving_dir:./deploy/pdserving
+web_service:-m paddle_serving_server.serve
+--op:GeneralDetectionOp GeneralInferOp
+--port:8181
+--gpu_id:"0"|null
+cpp_client:ocr_cpp_client.py
+--image_dir:../../doc/imgs/1.jpg
diff --git a/test_tipc/configs/ch_PP-OCRv3_rec_PACT/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv3_rec_PACT/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..826f586f0c6759f1c13a45891e9ba9c0688554e8
--- /dev/null
+++ b/test_tipc/configs/ch_PP-OCRv3_rec_PACT/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
@@ -0,0 +1,23 @@
+===========================serving_params===========================
+model_name:ch_PP-OCRv3_rec_PACT
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:null
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:null
+--det_serving_client:null
+--rec_dirname:./inference/ch_PP-OCRv3_rec_pact_infer/
+--rec_serving_server:./deploy/pdserving/ppocr_rec_v3_pact_serving/
+--rec_serving_client:./deploy/pdserving/ppocr_rec_v3_pact_client/
+serving_dir:./deploy/pdserving
+web_service:web_service_rec.py --config=config.yml --opt op.rec.concurrency="1"
+op.det.local_service_conf.devices:gpu|null
+op.det.local_service_conf.use_mkldnn:False
+op.det.local_service_conf.thread_num:6
+op.det.local_service_conf.use_trt:False
+op.det.local_service_conf.precision:fp32
+op.det.local_service_conf.model_config:
+op.rec.local_service_conf.model_config:
+pipline:pipeline_http_client.py --det=False
+--image_dir:../../inference/rec_inference
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_PACT/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_PP-OCRv3_rec_PACT/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
similarity index 54%
rename from test_tipc/configs/ch_ppocr_mobile_v2.0_rec_PACT/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
rename to test_tipc/configs/ch_PP-OCRv3_rec_PACT/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
index abed3cfba9b3f8c0ed626dbfcbda8621d8787001..c93b307debae630fccf41f29348c0b34761c1bf6 100644
--- a/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_PACT/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
+++ b/test_tipc/configs/ch_PP-OCRv3_rec_PACT/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
@@ -1,20 +1,20 @@
===========================train_params===========================
-model_name:ch_ppocr_mobile_v2.0_rec_PACT
+model_name:ch_PP-OCRv3_rec_PACT
python:python3.7
-gpu_list:0
+gpu_list:0|0,1
Global.use_gpu:True|True
Global.auto_cast:amp
-Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=300
+Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=50
Global.save_model_dir:./output/
-Train.loader.batch_size_per_card:lite_train_lite_infer=128|whole_train_whole_infer=128
-Global.checkpoints:null
+Train.loader.batch_size_per_card:lite_train_lite_infer=16|whole_train_whole_infer=128
+Global.pretrained_model:pretrain_models/ch_PP-OCRv3_rec_train/best_accuracy
train_model_name:latest
-train_infer_img_dir:./train_data/ic15_data/test/word_1.png
+train_infer_img_dir:./inference/rec_inference
null:null
##
trainer:pact_train
norm_train:null
-pact_train:deploy/slim/quantization/quant.py -c test_tipc/configs/ch_ppocr_mobile_v2.0_rec_PACT/rec_chinese_lite_train_v2.0.yml -o
+pact_train:deploy/slim/quantization/quant.py -c test_tipc/configs/ch_PP-OCRv3_rec/ch_PP-OCRv3_rec_distillation.yml -o
fpgm_train:null
distill_train:null
null:null
@@ -28,26 +28,26 @@ null:null
Global.save_inference_dir:./output/
Global.checkpoints:
norm_export:null
-quant_export:deploy/slim/quantization/export_model.py -c test_tipc/configs/ch_ppocr_mobile_v2.0_rec_PACT/rec_chinese_lite_train_v2.0.yml -o
-fpgm_export:null
+quant_export:deploy/slim/quantization/export_model.py -c test_tipc/configs/ch_PP-OCRv3_rec/ch_PP-OCRv3_rec_distillation.yml -o
+fpgm_export: null
distill_export:null
export1:null
export2:null
-inference_dir:null
-infer_model:./inference/ch_ppocr_mobile_v2.0_rec_slim_infer/
+inference_dir:Student
+infer_model:./inference/ch_PP-OCRv3_rec_slim_quant_infer
infer_export:null
-infer_quant:False
-inference:tools/infer/predict_rec.py --rec_char_dict_path=./ppocr/utils/ppocr_keys_v1.txt --rec_image_shape="3,32,100"
+infer_quant:True
+inference:tools/infer/predict_rec.py --rec_image_shape="3,48,320"
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
---rec_batch_num:1|6
---use_tensorrt:False|True
---precision:fp32|int8
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:6
+--use_tensorrt:False
+--precision:fp32
--rec_model_dir:
--image_dir:./inference/rec_inference
---save_log_path:./test/output/
+null:null
--benchmark:True
null:null
===========================infer_benchmark_params==========================
-random_infer_input:[{float32,[3,32,320]}]
+random_infer_input:[{float32,[3,48,320]}]
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..b42ab9db362b0ba56d795096fdc58a645b425480
--- /dev/null
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,20 @@
+===========================cpp_infer_params===========================
+model_name:ch_ppocr_mobile_v2.0
+use_opencv:True
+infer_model:./inference/ch_ppocr_mobile_v2.0_det_infer/
+infer_quant:False
+inference:./deploy/cpp_infer/build/ppocr --rec_char_dict_path=./ppocr/utils/ppocr_keys_v1.txt --rec_img_h=32
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1
+--use_tensorrt:False
+--precision:fp32
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+--rec_model_dir:./inference/ch_ppocr_mobile_v2.0_rec_infer/
+--benchmark:True
+--det:True
+--rec:True
+--cls:False
+--use_angle_cls:False
\ No newline at end of file
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt
index 4a46f0cf09dcf2bb812910f0cf322dda0749b87c..becad991eab2535b2df7862d0d25707ef37f08f8 100644
--- a/test_tipc/configs/ch_ppocr_mobile_v2.0/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt
@@ -6,10 +6,10 @@ infer_export:null
infer_quant:False
inference:tools/infer/predict_system.py
--use_gpu:False|True
---enable_mkldnn:False|True
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1
---use_tensorrt:False|True
+--use_tensorrt:False
--precision:fp32
--det_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..17c2fbbae2e182c4a7631cb18908180d8c019b4f
--- /dev/null
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt
@@ -0,0 +1,17 @@
+===========================paddle2onnx_params===========================
+model_name:ch_ppocr_mobile_v2.0
+python:python3.7
+2onnx: paddle2onnx
+--det_model_dir:./inference/ch_ppocr_mobile_v2.0_det_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_save_file:./inference/det_mobile_onnx/model.onnx
+--rec_model_dir:./inference/ch_ppocr_mobile_v2.0_rec_infer/
+--rec_save_file:./inference/rec_mobile_onnx/model.onnx
+--opset_version:10
+--enable_onnx_checker:True
+inference:tools/infer/predict_system.py --rec_image_shape="3,32,320"
+--use_gpu:True|False
+--det_model_dir:
+--rec_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
\ No newline at end of file
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..d18e9f11fdd2ff605cdd8f6c1bcf51ca780eb766
--- /dev/null
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,19 @@
+===========================serving_params===========================
+model_name:ch_ppocr_mobile_v2.0
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:./inference/ch_ppocr_mobile_v2.0_det_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:./deploy/pdserving/ppocr_det_mobile_serving/
+--det_serving_client:./deploy/pdserving/ppocr_det_mobile_client/
+--rec_dirname:./inference/ch_ppocr_mobile_v2.0_rec_infer/
+--rec_serving_server:./deploy/pdserving/ppocr_rec_mobile_serving/
+--rec_serving_client:./deploy/pdserving/ppocr_rec_mobile_client/
+serving_dir:./deploy/pdserving
+web_service:-m paddle_serving_server.serve
+--op:GeneralDetectionOp GeneralInferOp
+--port:8181
+--gpu_id:"0"|null
+cpp_client:ocr_cpp_client.py
+--image_dir:../../doc/imgs/1.jpg
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..842c9340176d696c1e43e59491bdcab817f9256e
--- /dev/null
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
@@ -0,0 +1,23 @@
+===========================serving_params===========================
+model_name:ch_ppocr_mobile_v2.0
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:./inference/ch_ppocr_mobile_v2.0_det_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:./deploy/pdserving/ppocr_det_mobile_serving/
+--det_serving_client:./deploy/pdserving/ppocr_det_mobile_client/
+--rec_dirname:./inference/ch_ppocr_mobile_v2.0_rec_infer/
+--rec_serving_server:./deploy/pdserving/ppocr_rec_mobile_serving/
+--rec_serving_client:./deploy/pdserving/ppocr_rec_mobile_client/
+serving_dir:./deploy/pdserving
+web_service:web_service.py --config=config.yml --opt op.det.concurrency="1" op.rec.concurrency="1"
+op.det.local_service_conf.devices:gpu|null
+op.det.local_service_conf.use_mkldnn:False
+op.det.local_service_conf.thread_num:6
+op.det.local_service_conf.use_trt:False
+op.det.local_service_conf.precision:fp32
+op.det.local_service_conf.model_config:
+op.rec.local_service_conf.model_config:
+pipline:pipeline_http_client.py
+--image_dir:../../doc/imgs/1.jpg
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_det/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_det/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
index d0ae17ccb55f40ddf65de936ca3cfc06bdd19475..1d1c2ae283b2103c2e7282186ab1f53bec05cda3 100644
--- a/test_tipc/configs/ch_ppocr_mobile_v2.0_det/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_det/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
@@ -1,15 +1,15 @@
===========================cpp_infer_params===========================
-model_name:ocr_det
+model_name:ch_ppocr_mobile_v2.0_det
use_opencv:True
infer_model:./inference/ch_ppocr_mobile_v2.0_det_infer/
infer_quant:False
inference:./deploy/cpp_infer/build/ppocr
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1
---use_tensorrt:False|True
---precision:fp32|fp16
+--use_tensorrt:False
+--precision:fp32
--det_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/
null:null
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_det/model_linux_gpu_normal_normal_infer_python_jetson.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_det/model_linux_gpu_normal_normal_infer_python_jetson.txt
index 7d3f60bd42aad18c045aeee70fc60d2c17a2af13..24bb8746ab7793dbcb4af99102a007aca8b8e16b 100644
--- a/test_tipc/configs/ch_ppocr_mobile_v2.0_det/model_linux_gpu_normal_normal_infer_python_jetson.txt
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_det/model_linux_gpu_normal_normal_infer_python_jetson.txt
@@ -1,5 +1,5 @@
===========================infer_params===========================
-model_name:ocr_det
+model_name:ch_ppocr_mobile_v2.0_det
python:python
infer_model:./inference/ch_ppocr_mobile_v2.0_det_infer
infer_export:null
@@ -7,10 +7,10 @@ infer_quant:False
inference:tools/infer/predict_det.py
--use_gpu:True|False
--enable_mkldnn:False
---cpu_threads:1|6
+--cpu_threads:6
--rec_batch_num:1
---use_tensorrt:False|True
---precision:fp16|fp32
+--use_tensorrt:False
+--precision:fp32
--det_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/
null:null
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_det/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_det/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt
index 160bcdbd88661c3d795eb2faf6b93965598c3e22..00473d1062615834a42e350a727f50233efd831f 100644
--- a/test_tipc/configs/ch_ppocr_mobile_v2.0_det/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_det/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt
@@ -1,14 +1,17 @@
===========================paddle2onnx_params===========================
-model_name:ocr_det_mobile
+model_name:ch_ppocr_mobile_v2.0_det
python:python3.7
2onnx: paddle2onnx
---model_dir:./inference/ch_ppocr_mobile_v2.0_det_infer/
+--det_model_dir:./inference/ch_ppocr_mobile_v2.0_det_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
---save_file:./inference/det_mobile_onnx/model.onnx
+--det_save_file:./inference/det_mobile_onnx/model.onnx
+--rec_model_dir:
+--rec_save_file:
--opset_version:10
--enable_onnx_checker:True
inference:tools/infer/predict_det.py
--use_gpu:True|False
--det_model_dir:
+--rec_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/
\ No newline at end of file
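
The renamed keys (`--det_model_dir`, `--det_save_file`, plus empty `--rec_*` slots) let one config describe both detector and recognizer exports; the driver presumably strips the `det_`/`rec_` prefix before invoking `paddle2onnx`. With the prefixes removed, the det-side conversion corresponds to a command like this (a sketch assembled from the values above, not a verbatim line from the test scripts):

```python
# Sketch: the det-side paddle2onnx export described by the config above.
# Flag names are standard paddle2onnx CLI options; the det_ prefix in the
# TIPC file is (by assumption) stripped by the test driver.
import subprocess

subprocess.run([
    "paddle2onnx",
    "--model_dir", "./inference/ch_ppocr_mobile_v2.0_det_infer/",
    "--model_filename", "inference.pdmodel",
    "--params_filename", "inference.pdiparams",
    "--save_file", "./inference/det_mobile_onnx/model.onnx",
    "--opset_version", "10",
    "--enable_onnx_checker", "True",
], check=True)
```
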
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_det/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_det/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
index 2326c9d2a7a785bf5f94124476fb3c21f91ceed2..c9dd5ad920d58f60ce36a7b489073279f23ba1b7 100644
--- a/test_tipc/configs/ch_ppocr_mobile_v2.0_det/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_det/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
@@ -1,18 +1,23 @@
===========================serving_params===========================
-model_name:ocr_det_mobile
-python:python3.7|cpp
+model_name:ch_ppocr_mobile_v2.0_det
+python:python3.7
trans_model:-m paddle_serving_client.convert
---dirname:./inference/ch_ppocr_mobile_v2.0_det_infer/
+--det_dirname:./inference/ch_ppocr_mobile_v2.0_det_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
---serving_server:./deploy/pdserving/ppocr_det_mobile_2.0_serving/
---serving_client:./deploy/pdserving/ppocr_det_mobile_2.0_client/
+--det_serving_server:./deploy/pdserving/ppocr_det_mobile_serving/
+--det_serving_client:./deploy/pdserving/ppocr_det_mobile_client/
+--rec_dirname:null
+--rec_serving_server:null
+--rec_serving_client:null
serving_dir:./deploy/pdserving
web_service:web_service_det.py --config=config.yml --opt op.det.concurrency="1"
-op.det.local_service_conf.devices:"0"|null
-op.det.local_service_conf.use_mkldnn:True|False
-op.det.local_service_conf.thread_num:1|6
-op.det.local_service_conf.use_trt:False|True
-op.det.local_service_conf.precision:fp32|fp16|int8
-pipline:pipeline_rpc_client.py|pipeline_http_client.py
---image_dir:../../doc/imgs
+op.det.local_service_conf.devices:gpu|null
+op.det.local_service_conf.use_mkldnn:False
+op.det.local_service_conf.thread_num:6
+op.det.local_service_conf.use_trt:False
+op.det.local_service_conf.precision:fp32
+op.det.local_service_conf.model_config:
+op.rec.local_service_conf.model_config:
+pipline:pipeline_http_client.py
+--image_dir:../../doc/imgs/1.jpg
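
Operationally, the serving config above maps to a two-step check: start the detection web service from `deploy/pdserving`, then probe it once with the HTTP pipeline client. Roughly (paths taken from the config; the sleep is an arbitrary grace period, not part of the driver):

```python
# Sketch of the python-serving smoke test implied by the config above:
# launch web_service_det.py, then send one image through the HTTP client.
import subprocess, time

serving_dir = "./deploy/pdserving"
server = subprocess.Popen(
    ["python3.7", "web_service_det.py", "--config=config.yml",
     "--opt", "op.det.concurrency=1"],  # shell would strip the quotes in "1"
    cwd=serving_dir)
time.sleep(10)  # arbitrary grace period for the service to come up
subprocess.run(
    ["python3.7", "pipeline_http_client.py",
     "--image_dir=../../doc/imgs/1.jpg"],
    cwd=serving_dir, check=True)
server.terminate()
```
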
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_infer_python.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_infer_python.txt
index 269693a86e9e371d52865f48d7fbaccce5d72393..789ed4d23d9c1fa3997daceee0627218aecd4c73 100644
--- a/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_infer_python.txt
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_infer_python.txt
@@ -4,7 +4,7 @@ python:python3.7
gpu_list:0|0,1
Global.use_gpu:True|True
Global.auto_cast:null
-Global.epoch_num:lite_train_lite_infer=100|whole_train_whole_infer=300
+Global.epoch_num:lite_train_lite_infer=100|whole_train_whole_infer=50
Global.save_model_dir:./output/
Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4
Global.pretrained_model:null
@@ -39,11 +39,11 @@ infer_export:tools/export_model.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2
infer_quant:False
inference:tools/infer/predict_det.py
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1
---use_tensorrt:False|True
---precision:fp32|fp16|int8
+--use_tensorrt:False
+--precision:fp32
--det_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/
null:null
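
The `Global.epoch_num` entries are overrides appended to the trainer with `-o`, so `whole_train_whole_infer=50` shortens the full-train leg from 300 to 50 epochs without touching the YAML. The resulting command looks roughly like this (values substituted from the config above):

```python
# Sketch: how the whole_train_whole_infer leg of the config above becomes
# a tools/train.py call, with key=value overrides listed after -o.
import subprocess

subprocess.run([
    "python3.7", "tools/train.py",
    "-c", "configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml",
    "-o",
    "Global.use_gpu=True",
    "Global.epoch_num=50",            # was 300 before this change
    "Global.save_model_dir=./output/",
    "Train.loader.batch_size_per_card=4",
    "Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained",
], check=True)
```
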
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_linux_gpu_fleet_amp_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_linux_gpu_fleet_amp_infer_python_linux_gpu_cpu.txt
deleted file mode 100644
index bfb71b781039493b12875ed4e99c8cd004a2e295..0000000000000000000000000000000000000000
--- a/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_linux_gpu_fleet_amp_infer_python_linux_gpu_cpu.txt
+++ /dev/null
@@ -1,51 +0,0 @@
-===========================train_params===========================
-model_name:ocr_det
-python:python3.7
-gpu_list:xx.xx.xx.xx,yy.yy.yy.yy;0,1
-Global.use_gpu:True
-Global.auto_cast:fp32|amp
-Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=300
-Global.save_model_dir:./output/
-Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4
-Global.pretrained_model:null
-train_model_name:latest
-train_infer_img_dir:./train_data/icdar2015/text_localization/ch4_test_images/
-null:null
-##
-trainer:norm_train|pact_train|fpgm_train
-norm_train:tools/train.py -c test_tipc/configs/ppocr_det_mobile/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained
-pact_train:deploy/slim/quantization/quant.py -c test_tipc/configs/ppocr_det_mobile/det_mv3_db.yml -o
-fpgm_train:deploy/slim/prune/sensitivity_anal.py -c test_tipc/configs/ppocr_det_mobile/det_mv3_db.yml -o Global.pretrained_model=./pretrain_models/det_mv3_db_v2.0_train/best_accuracy
-distill_train:null
-null:null
-null:null
-##
-===========================eval_params===========================
-eval:null
-null:null
-##
-===========================infer_params===========================
-Global.save_inference_dir:./output/
-Global.pretrained_model:
-norm_export:tools/export_model.py -c test_tipc/configs/ppocr_det_mobile/det_mv3_db.yml -o
-quant_export:deploy/slim/quantization/export_model.py -c test_tipc/configs/ppocr_det_mobile/det_mv3_db.yml -o
-fpgm_export:deploy/slim/prune/export_prune_model.py -c test_tipc/configs/ppocr_det_mobile/det_mv3_db.yml -o
-distill_export:null
-export1:null
-export2:null
-inference_dir:null
-train_model:./inference/ch_ppocr_mobile_v2.0_det_train/best_accuracy
-infer_export:tools/export_model.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o
-infer_quant:False
-inference:tools/infer/predict_det.py
---use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
---rec_batch_num:1
---use_tensorrt:False|True
---precision:fp32|fp16|int8
---det_model_dir:
---image_dir:./inference/ch_det_data_50/all-sum-510/
-null:null
---benchmark:True
-null:null
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..5271f78bb778f9e419da7f9bbbb6b4a6fafb305b
--- /dev/null
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt
@@ -0,0 +1,53 @@
+===========================train_params===========================
+model_name:ch_ppocr_mobile_v2.0_det
+python:python3.7
+gpu_list:192.168.0.1,192.168.0.2;0,1
+Global.use_gpu:True
+Global.auto_cast:null
+Global.epoch_num:lite_train_lite_infer=100|whole_train_whole_infer=50
+Global.save_model_dir:./output/
+Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4
+Global.pretrained_model:null
+train_model_name:latest
+train_infer_img_dir:./train_data/icdar2015/text_localization/ch4_test_images/
+null:null
+##
+trainer:norm_train
+norm_train:tools/train.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o Global.pretrained_model=./pretrain_models/MobileNetV3_large_x0_5_pretrained
+pact_train:null
+fpgm_train:null
+distill_train:null
+null:null
+null:null
+##
+===========================eval_params===========================
+eval:null
+null:null
+##
+===========================infer_params===========================
+Global.save_inference_dir:./output/
+Global.checkpoints:
+norm_export:tools/export_model.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o
+quant_export:null
+fpgm_export:null
+distill_export:null
+export1:null
+export2:null
+inference_dir:null
+train_model:./inference/ch_ppocr_mobile_v2.0_det_train/best_accuracy
+infer_export:tools/export_model.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml -o
+infer_quant:False
+inference:tools/infer/predict_det.py
+--use_gpu:False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1
+--use_tensorrt:False
+--precision:fp32
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+null:null
+--benchmark:True
+null:null
+===========================infer_benchmark_params==========================
+random_infer_input:[{float32,[3,640,640]}];[{float32,[3,960,960]}]
\ No newline at end of file
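
The new `infer_benchmark_params` footer declares synthetic inputs: a dtype plus CHW shape, with `;` separating alternative shapes. A sketch of materialising that spec into random tensors for a latency check (the tiny parser is illustrative, not the test_tipc implementation; the batch dimension of 1 is assumed):

```python
# Sketch: turning random_infer_input specs such as
# [{float32,[3,640,640]}];[{float32,[3,960,960]}] into numpy arrays.
import re
import numpy as np

spec = "[{float32,[3,640,640]}];[{float32,[3,960,960]}]"
for dtype, shape in re.findall(r"\{(\w+),\[([\d,]+)\]\}", spec):
    dims = [int(d) for d in shape.split(",")]
    x = np.random.rand(1, *dims).astype(dtype)  # batch dim of 1 assumed
    print(x.dtype, x.shape)  # float32 (1, 3, 640, 640), then (1, 3, 960, 960)
```
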
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
index 593e7ec7ed42af9b65c520852ff6372f89890170..6b3352f741a56124eead2f71c03c783e5c81a70d 100644
--- a/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
@@ -4,7 +4,7 @@ python:python3.7
gpu_list:0|0,1
Global.use_gpu:True|True
Global.auto_cast:amp
-Global.epoch_num:lite_train_lite_infer=100|whole_train_whole_infer=300
+Global.epoch_num:lite_train_lite_infer=100|whole_train_whole_infer=50
Global.save_model_dir:./output/
Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4
Global.pretrained_model:null
@@ -39,11 +39,11 @@ infer_export:tools/export_model.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2
infer_quant:False
inference:tools/infer/predict_det.py
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1
---use_tensorrt:False|True
---precision:fp32|fp16|int8
+--use_tensorrt:False
+--precision:fp32
--det_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/
null:null
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_mac_cpu_normal_normal_infer_python_mac_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_mac_cpu_normal_normal_infer_python_mac_cpu.txt
index 014dad5fc9d87c08a0725f57127f8bf2cb248be3..3f321a1903cdd1076e181c4bb901f1c9dc6d7f58 100644
--- a/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_mac_cpu_normal_normal_infer_python_mac_cpu.txt
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_mac_cpu_normal_normal_infer_python_mac_cpu.txt
@@ -4,7 +4,7 @@ python:python
gpu_list:-1
Global.use_gpu:False
Global.auto_cast:null
-Global.epoch_num:lite_train_lite_infer=5|whole_train_whole_infer=300
+Global.epoch_num:lite_train_lite_infer=5|whole_train_whole_infer=50
Global.save_model_dir:./output/
Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4
Global.pretrained_model:null
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_det_PACT/train_infer_python.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_pact_infer_python.txt
similarity index 89%
rename from test_tipc/configs/ch_ppocr_mobile_v2.0_det_PACT/train_infer_python.txt
rename to test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_pact_infer_python.txt
index 9d2855d8240a7c42295e6e2439d121504d307b09..04c8d0e194b687f58da1c449a6a0d8d9c1acd25e 100644
--- a/test_tipc/configs/ch_ppocr_mobile_v2.0_det_PACT/train_infer_python.txt
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_pact_infer_python.txt
@@ -4,7 +4,7 @@ python:python3.7
gpu_list:0|0,1
Global.use_gpu:True|True
Global.auto_cast:null
-Global.epoch_num:lite_train_lite_infer=20|whole_train_whole_infer=300
+Global.epoch_num:lite_train_lite_infer=2|whole_train_whole_infer=50
Global.save_model_dir:./output/
Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4
Global.pretrained_model:null
@@ -39,11 +39,11 @@ infer_export:null
infer_quant:False
inference:tools/infer/predict_det.py
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1
---use_tensorrt:False|True
---precision:fp32|fp16|int8
+--use_tensorrt:False
+--precision:fp32
--det_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/
null:null
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_det_KL/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_ptq_infer_python.txt
similarity index 89%
rename from test_tipc/configs/ch_ppocr_mobile_v2.0_det_KL/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt
rename to test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_ptq_infer_python.txt
index 1039dcad06d63bb1fc1a47b7cc4760cd8d75ed63..2bdec848833b6cf3799370b0337fa00f185a94d5 100644
--- a/test_tipc/configs/ch_ppocr_mobile_v2.0_det_KL/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_ptq_infer_python.txt
@@ -8,10 +8,10 @@ infer_export:deploy/slim/quantization/quant_kl.py -c configs/det/ch_ppocr_v2.0/c
infer_quant:True
inference:tools/infer/predict_det.py
--use_gpu:False|True
---enable_mkldnn:True
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1
---use_tensorrt:False|True
+--use_tensorrt:False
--precision:int8
--det_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_windows_gpu_normal_normal_infer_python_windows_cpu_gpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_windows_gpu_normal_normal_infer_python_windows_cpu_gpu.txt
index 6a63b39d976c0e9693deec097c37eb0ff212d8af..a3f6933a64bd64c8f39a90f72d139ad9fada55bb 100644
--- a/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_windows_gpu_normal_normal_infer_python_windows_cpu_gpu.txt
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_det/train_windows_gpu_normal_normal_infer_python_windows_cpu_gpu.txt
@@ -4,7 +4,7 @@ python:python
gpu_list:0
Global.use_gpu:True
Global.auto_cast:fp32|amp
-Global.epoch_num:lite_train_lite_infer=5|whole_train_whole_infer=300
+Global.epoch_num:lite_train_lite_infer=5|whole_train_whole_infer=50
Global.save_model_dir:./output/
Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4
Global.pretrained_model:null
@@ -39,10 +39,10 @@ infer_export:tools/export_model.py -c configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2
infer_quant:False
inference:tools/infer/predict_det.py
--use_gpu:True|False
---enable_mkldnn:True|False
+--enable_mkldnn:False
--cpu_threads:1|6
--rec_batch_num:1
---use_tensorrt:False|True
+--use_tensorrt:False
--precision:fp32|fp16|int8
--det_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_det_FPGM/train_infer_python.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_det_FPGM/train_infer_python.txt
index 47ccf2e69e75bc8c215be8d1837e5248d1b4b513..dae3f8053a0264611b5baca0f45839f3550fe6a4 100644
--- a/test_tipc/configs/ch_ppocr_mobile_v2.0_det_FPGM/train_infer_python.txt
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_det_FPGM/train_infer_python.txt
@@ -4,7 +4,7 @@ python:python3.7
gpu_list:0|0,1
Global.use_gpu:True|True
Global.auto_cast:null
-Global.epoch_num:lite_train_lite_infer=5|whole_train_whole_infer=300
+Global.epoch_num:lite_train_lite_infer=5|whole_train_whole_infer=50
Global.save_model_dir:./output/
Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4
Global.pretrained_model:null
@@ -39,11 +39,11 @@ infer_export:null
infer_quant:False
inference:tools/infer/predict_det.py
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1
---use_tensorrt:False|True
---precision:fp32|fp16|int8
+--use_tensorrt:False
+--precision:fp32
--det_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/
null:null
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_det_FPGM/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_det_FPGM/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
index 5a95f026850b750bfadb85e0955f7426e5e73cb6..150a8a0315b83e8c62765a4aa66429cfd0590928 100644
--- a/test_tipc/configs/ch_ppocr_mobile_v2.0_det_FPGM/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_det_FPGM/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
@@ -4,7 +4,7 @@ python:python3.7
gpu_list:0|0,1
Global.use_gpu:True|True
Global.auto_cast:amp
-Global.epoch_num:lite_train_lite_infer=5|whole_train_whole_infer=300
+Global.epoch_num:lite_train_lite_infer=5|whole_train_whole_infer=50
Global.save_model_dir:./output/
Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4
Global.pretrained_model:null
@@ -39,11 +39,11 @@ infer_export:null
infer_quant:False
inference:tools/infer/predict_det.py
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1
---use_tensorrt:False|True
---precision:fp32|fp16|int8
+--use_tensorrt:False
+--precision:fp32
--det_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/
null:null
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_det_KL/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_det_KL/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..eb2fd0a001ab506f241bed7eac75d96cf4b5d5cb
--- /dev/null
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_det_KL/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,20 @@
+===========================cpp_infer_params===========================
+model_name:ch_ppocr_mobile_v2.0_det_KL
+use_opencv:True
+infer_model:./inference/ch_ppocr_mobile_v2.0_det_klquant_infer
+infer_quant:False
+inference:./deploy/cpp_infer/build/ppocr
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1
+--use_tensorrt:False
+--precision:fp32
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+null:null
+--benchmark:True
+--det:True
+--rec:False
+--cls:False
+--use_angle_cls:False
\ No newline at end of file
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_det_KL/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_det_KL/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..d9de1cc19a729485845601fe929bf57d74002641
--- /dev/null
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_det_KL/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,19 @@
+===========================serving_params===========================
+model_name:ch_ppocr_mobile_v2.0_det_KL
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:./inference/ch_ppocr_mobile_v2.0_det_klquant_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:./deploy/pdserving/ppocr_det_mobile_kl_serving/
+--det_serving_client:./deploy/pdserving/ppocr_det_mobile_kl_client/
+--rec_dirname:./inference/ch_ppocr_mobile_v2.0_rec_klquant_infer/
+--rec_serving_server:./deploy/pdserving/ppocr_rec_mobile_kl_serving/
+--rec_serving_client:./deploy/pdserving/ppocr_rec_mobile_kl_client/
+serving_dir:./deploy/pdserving
+web_service:-m paddle_serving_server.serve
+--op:GeneralDetectionOp GeneralInferOp
+--port:8181
+--gpu_id:"0"|null
+cpp_client:ocr_cpp_client.py
+--image_dir:../../doc/imgs/1.jpg
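
For the C++ serving path, `web_service` is not a script but flags for the stock server module: the converted det and rec models are mounted as `GeneralDetectionOp` and `GeneralInferOp`. Spelled out from the values above (folder names come from the `trans_model` targets; the client argument form is partly assumed):

```python
# Sketch: the C++ pipeline-serving launch encoded above, plus the
# ocr_cpp_client.py probe. The "null" gpu_id leg omits --gpu_id (CPU).
import subprocess, time

subprocess.Popen([
    "python3.7", "-m", "paddle_serving_server.serve",
    "--model", "ppocr_det_mobile_kl_serving", "ppocr_rec_mobile_kl_serving",
    "--op", "GeneralDetectionOp", "GeneralInferOp",
    "--port", "8181",
    "--gpu_id", "0",
], cwd="./deploy/pdserving")
time.sleep(10)  # arbitrary grace period for the server to come up

subprocess.run([
    "python3.7", "ocr_cpp_client.py",
    "ppocr_det_mobile_kl_client", "ppocr_rec_mobile_kl_client",
    "--image_dir=../../doc/imgs/1.jpg",  # assumed pass-through from the config
], cwd="./deploy/pdserving", check=True)
```
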
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_det_KL/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_det_KL/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..049ec784581bddcce066bc049b66f6f0ceff9eed
--- /dev/null
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_det_KL/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
@@ -0,0 +1,23 @@
+===========================serving_params===========================
+model_name:ch_ppocr_mobile_v2.0_det_KL
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:./inference/ch_ppocr_mobile_v2.0_det_klquant_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:./deploy/pdserving/ppocr_det_mobile_kl_serving/
+--det_serving_client:./deploy/pdserving/ppocr_det_mobile_kl_client/
+--rec_dirname:null
+--rec_serving_server:null
+--rec_serving_client:null
+serving_dir:./deploy/pdserving
+web_service:web_service_det.py --config=config.yml --opt op.det.concurrency="1"
+op.det.local_service_conf.devices:gpu|null
+op.det.local_service_conf.use_mkldnn:False
+op.det.local_service_conf.thread_num:6
+op.det.local_service_conf.use_trt:False
+op.det.local_service_conf.precision:fp32
+op.det.local_service_conf.model_config:
+op.rec.local_service_conf.model_config:
+pipline:pipeline_http_client.py
+--image_dir:../../doc/imgs/1.jpg
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_det_PACT/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_det_PACT/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..17723f41ab762f5316cba59c08ec719aa54f03b1
--- /dev/null
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_det_PACT/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,20 @@
+===========================cpp_infer_params===========================
+model_name:ch_ppocr_mobile_v2.0_det_PACT
+use_opencv:True
+infer_model:./inference/ch_ppocr_mobile_v2.0_det_pact_infer
+infer_quant:False
+inference:./deploy/cpp_infer/build/ppocr
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1
+--use_tensorrt:False
+--precision:fp32
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+null:null
+--benchmark:True
+--det:True
+--rec:False
+--cls:False
+--use_angle_cls:False
\ No newline at end of file
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_det_PACT/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_det_PACT/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..1a49a10f9b9d4e32916dd35bae3380e2ca5bebb9
--- /dev/null
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_det_PACT/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,19 @@
+===========================serving_params===========================
+model_name:ch_ppocr_mobile_v2.0_det_PACT
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:./inference/ch_ppocr_mobile_v2.0_det_pact_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:./deploy/pdserving/ppocr_det_mobile_pact_serving/
+--det_serving_client:./deploy/pdserving/ppocr_det_mobile_pact_client/
+--rec_dirname:./inference/ch_ppocr_mobile_v2.0_rec_pact_infer/
+--rec_serving_server:./deploy/pdserving/ppocr_rec_mobile_pact_serving/
+--rec_serving_client:./deploy/pdserving/ppocr_rec_mobile_pact_client/
+serving_dir:./deploy/pdserving
+web_service:-m paddle_serving_server.serve
+--op:GeneralDetectionOp GeneralInferOp
+--port:8181
+--gpu_id:"0"|null
+cpp_client:ocr_cpp_client.py
+--image_dir:../../doc/imgs/1.jpg
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_det_PACT/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_det_PACT/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..909d738919bed78d6db04e238818cd4fbbb75e5f
--- /dev/null
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_det_PACT/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
@@ -0,0 +1,23 @@
+===========================serving_params===========================
+model_name:ch_ppocr_mobile_v2.0_det_PACT
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:./inference/ch_ppocr_mobile_v2.0_det_pact_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:./deploy/pdserving/ppocr_det_mobile_pact_serving/
+--det_serving_client:./deploy/pdserving/ppocr_det_mobile_pact_client/
+--rec_dirname:null
+--rec_serving_server:null
+--rec_serving_client:null
+serving_dir:./deploy/pdserving
+web_service:web_service_det.py --config=config.yml --opt op.det.concurrency="1"
+op.det.local_service_conf.devices:gpu|null
+op.det.local_service_conf.use_mkldnn:False
+op.det.local_service_conf.thread_num:6
+op.det.local_service_conf.use_trt:False
+op.det.local_service_conf.precision:fp32
+op.det.local_service_conf.model_config:
+op.rec.local_service_conf.model_config:
+pipline:pipeline_http_client.py
+--image_dir:../../doc/imgs/1.jpg
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_rec/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..480fb16cddfc4c2f4784cc8fa88512f063f7b2ae
--- /dev/null
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,20 @@
+===========================cpp_infer_params===========================
+model_name:ch_ppocr_mobile_v2.0_rec
+use_opencv:True
+infer_model:./inference/ch_ppocr_mobile_v2.0_rec_infer/
+infer_quant:False
+inference:./deploy/cpp_infer/build/ppocr --rec_char_dict_path=./ppocr/utils/ppocr_keys_v1.txt --rec_img_h=32
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:6
+--use_tensorrt:False
+--precision:fp32
+--rec_model_dir:
+--image_dir:./inference/rec_inference/
+null:null
+--benchmark:True
+--det:False
+--rec:True
+--cls:False
+--use_angle_cls:False
\ No newline at end of file
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_rec/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt
index f29b303879f555eaa9a392633aed6e0095f05cfb..5bab0c9e4c77edba302f6b536306816b09df9224 100644
--- a/test_tipc/configs/ch_ppocr_mobile_v2.0_rec/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt
@@ -1,14 +1,17 @@
===========================paddle2onnx_params===========================
-model_name:ocr_rec_mobile
+model_name:ch_ppocr_mobile_v2.0_rec
python:python3.7
2onnx: paddle2onnx
---model_dir:./inference/ch_ppocr_mobile_v2.0_rec_infer/
+--det_model_dir:
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
---save_file:./inference/rec_mobile_onnx/model.onnx
+--det_save_file:
+--rec_model_dir:./inference/ch_ppocr_mobile_v2.0_rec_infer/
+--rec_save_file:./inference/rec_mobile_onnx/model.onnx
--opset_version:10
--enable_onnx_checker:True
-inference:tools/infer/predict_rec.py
+inference:tools/infer/predict_rec.py --rec_image_shape="3,32,320"
--use_gpu:True|False
+--det_model_dir:
--rec_model_dir:
---image_dir:./inference/rec_inference
\ No newline at end of file
+--image_dir:./inference/rec_inference/
\ No newline at end of file
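
On the recognizer side the same prefix scheme applies, and inference now pins `--rec_image_shape="3,32,320"` so the v2.0 rec model (32-pixel-high inputs) is not fed a different default. The check roughly expands to the following sketch; pointing `--rec_model_dir` at the ONNX folder is an assumption, since the driver fills that empty slot at runtime:

```python
# Sketch: rec-side ONNX export plus the predict_rec.py probe from the
# config above.
import subprocess

subprocess.run([
    "paddle2onnx",
    "--model_dir", "./inference/ch_ppocr_mobile_v2.0_rec_infer/",
    "--model_filename", "inference.pdmodel",
    "--params_filename", "inference.pdiparams",
    "--save_file", "./inference/rec_mobile_onnx/model.onnx",
    "--opset_version", "10",
    "--enable_onnx_checker", "True",
], check=True)

subprocess.run([
    "python3.7", "tools/infer/predict_rec.py",
    "--rec_image_shape=3,32,320",
    "--use_gpu=False",                               # one leg of True|False
    "--rec_model_dir=./inference/rec_mobile_onnx/",  # assumed ONNX target
    "--image_dir=./inference/rec_inference/",
], check=True)
```
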
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_rec/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
index f890eff469ba82b87d2d83000add24cc9d380c49..c0c5291cc480f9f34aa5dcded3eafce7feac89e3 100644
--- a/test_tipc/configs/ch_ppocr_mobile_v2.0_rec/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
@@ -1,18 +1,23 @@
===========================serving_params===========================
-model_name:ocr_rec_mobile
-python:python3.7|cpp
+model_name:ch_ppocr_mobile_v2.0_rec
+python:python3.7
trans_model:-m paddle_serving_client.convert
---dirname:./inference/ch_ppocr_mobile_v2.0_rec_infer/
+--det_dirname:null
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
---serving_server:./deploy/pdserving/ppocr_rec_mobile_2.0_serving/
---serving_client:./deploy/pdserving/ppocr_rec_mobile_2.0_client/
+--det_serving_server:null
+--det_serving_client:null
+--rec_dirname:./inference/ch_ppocr_mobile_v2.0_rec_infer/
+--rec_serving_server:./deploy/pdserving/ppocr_rec_mobile_serving/
+--rec_serving_client:./deploy/pdserving/ppocr_rec_mobile_client/
serving_dir:./deploy/pdserving
-web_service:web_service_rec.py --config=config.yml --opt op.rec.concurrency=1
-op.rec.local_service_conf.devices:"0"|null
-op.rec.local_service_conf.use_mkldnn:True|False
-op.rec.local_service_conf.thread_num:1|6
-op.rec.local_service_conf.use_trt:False|True
-op.rec.local_service_conf.precision:fp32|fp16|int8
-pipline:pipeline_rpc_client.py|pipeline_http_client.py
---image_dir:../../doc/imgs_words_en
\ No newline at end of file
+web_service:web_service_rec.py --config=config.yml --opt op.rec.concurrency="1"
+op.det.local_service_conf.devices:gpu|null
+op.det.local_service_conf.use_mkldnn:False
+op.det.local_service_conf.thread_num:6
+op.det.local_service_conf.use_trt:False
+op.det.local_service_conf.precision:fp32
+op.det.local_service_conf.model_config:
+op.rec.local_service_conf.model_config:
+pipline:pipeline_http_client.py --det=False
+--image_dir:../../inference/rec_inference
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_rec/train_infer_python.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec/train_infer_python.txt
index 5086f80d7bad4fb359f152cc1dc7195017aa31c3..f02b93926cac8116844142fe3ecb03959abb0530 100644
--- a/test_tipc/configs/ch_ppocr_mobile_v2.0_rec/train_infer_python.txt
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec/train_infer_python.txt
@@ -4,7 +4,7 @@ python:python3.7
gpu_list:0|0,1
Global.use_gpu:True|True
Global.auto_cast:null
-Global.epoch_num:lite_train_lite_infer=2|whole_train_whole_infer=300
+Global.epoch_num:lite_train_lite_infer=2|whole_train_whole_infer=50
Global.save_model_dir:./output/
Train.loader.batch_size_per_card:lite_train_lite_infer=128|whole_train_whole_infer=128
Global.pretrained_model:null
@@ -39,11 +39,11 @@ infer_export:tools/export_model.py -c configs/rec/rec_icdar15_train.yml -o
infer_quant:False
inference:tools/infer/predict_rec.py
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1|6
---use_tensorrt:True|False
---precision:fp32|int8
+--use_tensorrt:False
+--precision:fp32
--rec_model_dir:
--image_dir:./inference/rec_inference
--save_log_path:./test/output/
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_rec/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..631118c0a9ab98c10129f12ec1c1cf2bbac46115
--- /dev/null
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt
@@ -0,0 +1,53 @@
+===========================train_params===========================
+model_name:ch_ppocr_mobile_v2.0_rec
+python:python3.7
+gpu_list:192.168.0.1,192.168.0.2;0,1
+Global.use_gpu:True
+Global.auto_cast:null
+Global.epoch_num:lite_train_lite_infer=2|whole_train_whole_infer=50
+Global.save_model_dir:./output/
+Train.loader.batch_size_per_card:lite_train_lite_infer=128|whole_train_whole_infer=128
+Global.pretrained_model:null
+train_model_name:latest
+train_infer_img_dir:./inference/rec_inference
+null:null
+##
+trainer:norm_train
+norm_train:tools/train.py -c configs/rec/rec_icdar15_train.yml -o
+pact_train:null
+fpgm_train:null
+distill_train:null
+null:null
+null:null
+##
+===========================eval_params===========================
+eval:tools/eval.py -c configs/rec/rec_icdar15_train.yml -o
+null:null
+##
+===========================infer_params===========================
+Global.save_inference_dir:./output/
+Global.checkpoints:
+norm_export:tools/export_model.py -c configs/rec/rec_icdar15_train.yml -o
+quant_export:null
+fpgm_export:null
+distill_export:null
+export1:null
+export2:null
+##
+train_model:./inference/ch_ppocr_mobile_v2.0_rec_train/best_accuracy
+infer_export:tools/export_model.py -c configs/rec/rec_icdar15_train.yml -o
+infer_quant:False
+inference:tools/infer/predict_rec.py
+--use_gpu:False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1|6
+--use_tensorrt:False
+--precision:fp32
+--rec_model_dir:
+--image_dir:./inference/rec_inference
+--save_log_path:./test/output/
+--benchmark:True
+null:null
+===========================infer_benchmark_params==========================
+random_infer_input:[{float32,[3,32,100]}]
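
`gpu_list:192.168.0.1,192.168.0.2;0,1` encodes multi-machine training: node IPs before the `;`, per-node GPU ids after it. Under TIPC this typically becomes a `paddle.distributed.launch` invocation, roughly as below (`--ips`/`--gpus` are standard Paddle launcher flags; the mapping from `gpu_list` is an assumption about the driver):

```python
# Sketch: the two-machine fleet run described by the config above,
# launched on each participating node.
import subprocess

subprocess.run([
    "python3.7", "-m", "paddle.distributed.launch",
    "--ips=192.168.0.1,192.168.0.2",  # nodes, from gpu_list before ';'
    "--gpus=0,1",                     # per-node devices, after ';'
    "tools/train.py",
    "-c", "configs/rec/rec_icdar15_train.yml",
    "-o", "Global.use_gpu=True", "Global.epoch_num=2",  # lite_train leg
], check=True)
```
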
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_rec/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
index 30fb939bff646adf301191f88a9a499acf9c61de..bd9c4a8df2565af73b6db24636b6dd132dac0cc2 100644
--- a/test_tipc/configs/ch_ppocr_mobile_v2.0_rec/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
@@ -4,7 +4,7 @@ python:python3.7
gpu_list:0|0,1
Global.use_gpu:True|True
Global.auto_cast:amp
-Global.epoch_num:lite_train_lite_infer=2|whole_train_whole_infer=300
+Global.epoch_num:lite_train_lite_infer=2|whole_train_whole_infer=50
Global.save_model_dir:./output/
Train.loader.batch_size_per_card:lite_train_lite_infer=128|whole_train_whole_infer=128
Global.pretrained_model:null
@@ -39,11 +39,11 @@ infer_export:tools/export_model.py -c configs/rec/rec_icdar15_train.yml -o
infer_quant:False
inference:tools/infer/predict_rec.py
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1|6
---use_tensorrt:True|False
---precision:fp32|int8
+--use_tensorrt:False
+--precision:fp32
--rec_model_dir:
--image_dir:./inference/rec_inference
--save_log_path:./test/output/
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_PACT/train_infer_python.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec/train_pact_infer_python.txt
similarity index 94%
rename from test_tipc/configs/ch_ppocr_mobile_v2.0_rec_PACT/train_infer_python.txt
rename to test_tipc/configs/ch_ppocr_mobile_v2.0_rec/train_pact_infer_python.txt
index 94909ec340c1bbc582dd60aa947f1905580b8966..77472fbdfb21c81bb713df175a135ccf9e652f25 100644
--- a/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_PACT/train_infer_python.txt
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec/train_pact_infer_python.txt
@@ -4,7 +4,7 @@ python:python3.7
gpu_list:0
Global.use_gpu:True|True
Global.auto_cast:null
-Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=300
+Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=50
Global.save_model_dir:./output/
Train.loader.batch_size_per_card:lite_train_lite_infer=128|whole_train_whole_infer=128
Global.checkpoints:null
@@ -39,11 +39,11 @@ infer_export:null
infer_quant:False
inference:tools/infer/predict_rec.py --rec_char_dict_path=./ppocr/utils/ppocr_keys_v1.txt --rec_image_shape="3,32,100"
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1|6
---use_tensorrt:False|True
---precision:fp32|int8
+--use_tensorrt:False
+--precision:fp32
--rec_model_dir:
--image_dir:./inference/rec_inference
--save_log_path:./test/output/
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_KL/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec/train_ptq_infer_python.txt
similarity index 81%
rename from test_tipc/configs/ch_ppocr_mobile_v2.0_rec_KL/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt
rename to test_tipc/configs/ch_ppocr_mobile_v2.0_rec/train_ptq_infer_python.txt
index 4b77994f3f68c196b4d6e7a16eb44ec5fdef0d9e..f63fe4c2bb6a17353ecb008d83e2bee9d38aec23 100644
--- a/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_KL/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec/train_ptq_infer_python.txt
@@ -6,12 +6,12 @@ Global.save_inference_dir:null
infer_model:./inference/ch_ppocr_mobile_v2.0_rec_infer/
infer_export:deploy/slim/quantization/quant_kl.py -c test_tipc/configs/ch_ppocr_mobile_v2.0_rec_KL/rec_chinese_lite_train_v2.0.yml -o
infer_quant:True
-inference:tools/infer/predict_rec.py
+inference:tools/infer/predict_rec.py --rec_image_shape="3,32,320"
--use_gpu:False|True
---enable_mkldnn:True
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1
---use_tensorrt:False|True
+--use_tensorrt:False
--precision:int8
--rec_model_dir:
--image_dir:./inference/rec_inference
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_FPGM/train_infer_python.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_FPGM/train_infer_python.txt
index 77494ac347a73f61d18c070075db476a093c3f62..89daceeb5f4a991699a490b51358d33240e74913 100644
--- a/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_FPGM/train_infer_python.txt
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_FPGM/train_infer_python.txt
@@ -4,7 +4,7 @@ python:python3.7
gpu_list:0
Global.use_gpu:True|True
Global.auto_cast:null
-Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=300
+Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=50
Global.save_model_dir:./output/
Train.loader.batch_size_per_card:lite_train_lite_infer=128|whole_train_whole_infer=128
Global.pretrained_model:null
@@ -39,11 +39,11 @@ infer_export:null
infer_quant:False
inference:tools/infer/predict_rec.py
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1
---use_tensorrt:False|True
---precision:fp32|int8
+--use_tensorrt:False
+--precision:fp32
--rec_model_dir:
--image_dir:./inference/rec_inference
null:null
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_FPGM/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_FPGM/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
index fda9cf4ddec6d3ab64045a4a7fdbb62183212021..7abc3e9340fe49a2b0bf0efd5e3c370817cd4e9d 100644
--- a/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_FPGM/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_FPGM/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
@@ -4,7 +4,7 @@ python:python3.7
gpu_list:0
Global.use_gpu:True|True
Global.auto_cast:amp
-Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=300
+Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=50
Global.save_model_dir:./output/
Train.loader.batch_size_per_card:lite_train_lite_infer=128|whole_train_whole_infer=128
Global.pretrained_model:null
@@ -39,11 +39,11 @@ infer_export:null
infer_quant:False
inference:tools/infer/predict_rec.py
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1
---use_tensorrt:False|True
---precision:fp32|int8
+--use_tensorrt:False
+--precision:fp32
--rec_model_dir:
--image_dir:./inference/rec_inference
null:null
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_KL/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_KL/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..adf06257a772cfd16d4109497c6e6ef7c3f8af8b
--- /dev/null
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_KL/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,20 @@
+===========================cpp_infer_params===========================
+model_name:ch_ppocr_mobile_v2.0_rec_KL
+use_opencv:True
+infer_model:./inference/ch_ppocr_mobile_v2.0_rec_klquant_infer
+infer_quant:False
+inference:./deploy/cpp_infer/build/ppocr --rec_char_dict_path=./ppocr/utils/ppocr_keys_v1.txt --rec_img_h=32
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:6
+--use_tensorrt:False
+--precision:fp32
+--rec_model_dir:
+--image_dir:./inference/rec_inference/
+null:null
+--benchmark:True
+--det:False
+--rec:True
+--cls:False
+--use_angle_cls:False
\ No newline at end of file
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_KL/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_KL/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..ab518de55ae6b157b26bf332ec3b0afcab71f97a
--- /dev/null
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_KL/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,19 @@
+===========================serving_params===========================
+model_name:ch_ppocr_mobile_v2.0_rec_KL
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:./inference/ch_ppocr_mobile_v2.0_det_klquant_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:./deploy/pdserving/ppocr_det_mobile_kl_serving/
+--det_serving_client:./deploy/pdserving/ppocr_det_mobile_kl_client/
+--rec_dirname:./inference/ch_ppocr_mobile_v2.0_rec_klquant_infer/
+--rec_serving_server:./deploy/pdserving/ppocr_rec_mobile_kl_serving/
+--rec_serving_client:./deploy/pdserving/ppocr_rec_mobile_kl_client/
+serving_dir:./deploy/pdserving
+web_service:-m paddle_serving_server.serve
+--op:GeneralDetectionOp GeneralInferOp
+--port:8181
+--gpu_id:"0"|null
+cpp_client:ocr_cpp_client.py
+--image_dir:../../doc/imgs/1.jpg
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_KL/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_KL/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..948e3dceb3ef7e3f2199be5e417cfc5fc763d975
--- /dev/null
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_KL/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
@@ -0,0 +1,23 @@
+===========================serving_params===========================
+model_name:ch_ppocr_mobile_v2.0_rec_KL
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:null
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:null
+--det_serving_client:null
+--rec_dirname:./inference/ch_ppocr_mobile_v2.0_rec_klquant_infer/
+--rec_serving_server:./deploy/pdserving/ppocr_rec_mobile_kl_serving/
+--rec_serving_client:./deploy/pdserving/ppocr_rec_mobile_kl_client/
+serving_dir:./deploy/pdserving
+web_service:web_service_rec.py --config=config.yml --opt op.rec.concurrency="1"
+op.det.local_service_conf.devices:gpu|null
+op.det.local_service_conf.use_mkldnn:False
+op.det.local_service_conf.thread_num:6
+op.det.local_service_conf.use_trt:False
+op.det.local_service_conf.precision:fp32
+op.det.local_service_conf.model_config:
+op.rec.local_service_conf.model_config:
+pipline:pipeline_http_client.py --det=False
+--image_dir:../../inference/rec_inference
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_PACT/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_PACT/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..ba2df90f75d2c70e043c45ea19d681aabc2b6fb2
--- /dev/null
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_PACT/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,20 @@
+===========================cpp_infer_params===========================
+model_name:ch_ppocr_mobile_v2.0_rec_PACT
+use_opencv:True
+infer_model:./inference/ch_ppocr_mobile_v2.0_rec_pact_infer
+infer_quant:False
+inference:./deploy/cpp_infer/build/ppocr --rec_char_dict_path=./ppocr/utils/ppocr_keys_v1.txt --rec_img_h=32
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:6
+--use_tensorrt:False
+--precision:fp32
+--rec_model_dir:
+--image_dir:./inference/rec_inference/
+null:null
+--benchmark:True
+--det:False
+--rec:True
+--cls:False
+--use_angle_cls:False
\ No newline at end of file
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_PACT/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_PACT/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..229f70cf353318bf9ccc81f4e5be79dbc096de25
--- /dev/null
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_PACT/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,19 @@
+===========================serving_params===========================
+model_name:ch_ppocr_mobile_v2.0_rec_PACT
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:./inference/ch_ppocr_mobile_v2.0_det_pact_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:./deploy/pdserving/ppocr_det_mobile_pact_serving/
+--det_serving_client:./deploy/pdserving/ppocr_det_mobile_pact_client/
+--rec_dirname:./inference/ch_ppocr_mobile_v2.0_rec_pact_infer/
+--rec_serving_server:./deploy/pdserving/ppocr_rec_mobile_pact_serving/
+--rec_serving_client:./deploy/pdserving/ppocr_rec_mobile_pact_client/
+serving_dir:./deploy/pdserving
+web_service:-m paddle_serving_server.serve
+--op:GeneralDetectionOp GeneralInferOp
+--port:8181
+--gpu_id:"0"|null
+cpp_client:ocr_cpp_client.py
+--image_dir:../../doc/imgs/1.jpg
diff --git a/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_PACT/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_PACT/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..f123f365432ab68f2484cc11dd9ef94c8a60ea8e
--- /dev/null
+++ b/test_tipc/configs/ch_ppocr_mobile_v2.0_rec_PACT/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
@@ -0,0 +1,23 @@
+===========================serving_params===========================
+model_name:ch_ppocr_mobile_v2.0_rec_PACT
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:null
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:null
+--det_serving_client:null
+--rec_dirname:./inference/ch_ppocr_mobile_v2.0_rec_pact_infer/
+--rec_serving_server:./deploy/pdserving/ppocr_rec_mobile_pact_serving/
+--rec_serving_client:./deploy/pdserving/ppocr_rec_mobile_pact_client/
+serving_dir:./deploy/pdserving
+web_service:web_service_rec.py --config=config.yml --opt op.rec.concurrency="1"
+op.det.local_service_conf.devices:gpu|null
+op.det.local_service_conf.use_mkldnn:False
+op.det.local_service_conf.thread_num:6
+op.det.local_service_conf.use_trt:False
+op.det.local_service_conf.precision:fp32
+op.det.local_service_conf.model_config:
+op.rec.local_service_conf.model_config:
+pipline:pipeline_http_client.py --det=False
+--image_dir:../../inference/rec_inference
diff --git a/test_tipc/configs/ch_ppocr_server_v2.0/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_server_v2.0/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..7c980b2baeef7161a93dea360089b333f2003a31
--- /dev/null
+++ b/test_tipc/configs/ch_ppocr_server_v2.0/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,20 @@
+===========================cpp_infer_params===========================
+model_name:ch_ppocr_server_v2.0
+use_opencv:True
+infer_model:./inference/ch_ppocr_server_v2.0_det_infer/
+infer_quant:False
+inference:./deploy/cpp_infer/build/ppocr --rec_char_dict_path=./ppocr/utils/ppocr_keys_v1.txt --rec_img_h=32
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1
+--use_tensorrt:False
+--precision:fp32
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+--rec_model_dir:./inference/ch_ppocr_server_v2.0_rec_infer/
+--benchmark:True
+--det:True
+--rec:True
+--cls:False
+--use_angle_cls:False
\ No newline at end of file
diff --git a/test_tipc/configs/ch_ppocr_server_v2.0/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_server_v2.0/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt
index 92d7031e884d10df3a5c98bf675d64d63b3cb335..b20596f7a1db6da04307a7e527ef596477d237d3 100644
--- a/test_tipc/configs/ch_ppocr_server_v2.0/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt
+++ b/test_tipc/configs/ch_ppocr_server_v2.0/model_linux_gpu_normal_normal_infer_python_linux_gpu_cpu.txt
@@ -6,8 +6,8 @@ infer_export:null
infer_quant:True
inference:tools/infer/predict_system.py
--use_gpu:False|True
---enable_mkldnn:False|True
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1
--use_tensorrt:False
--precision:fp32
diff --git a/test_tipc/configs/ch_ppocr_server_v2.0/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt b/test_tipc/configs/ch_ppocr_server_v2.0/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..e478896a54957481a3ce4c485ac02cd7979233dc
--- /dev/null
+++ b/test_tipc/configs/ch_ppocr_server_v2.0/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt
@@ -0,0 +1,17 @@
+===========================paddle2onnx_params===========================
+model_name:ch_ppocr_server_v2.0
+python:python3.7
+2onnx: paddle2onnx
+--det_model_dir:./inference/ch_ppocr_server_v2.0_det_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_save_file:./inference/det_server_onnx/model.onnx
+--rec_model_dir:./inference/ch_ppocr_server_v2.0_rec_infer/
+--rec_save_file:./inference/rec_server_onnx/model.onnx
+--opset_version:10
+--enable_onnx_checker:True
+inference:tools/infer/predict_system.py --rec_image_shape="3,32,320"
+--use_gpu:True|False
+--det_model_dir:
+--rec_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/00008790.jpg
\ No newline at end of file
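
The new server-scale paddle2onnx config exercises det and rec together through `predict_system.py`. Filled in, the end-to-end check looks about like this (model dirs are the export targets above; wiring the `*_model_dir` flags to the ONNX output folders is an assumption, as the driver fills those empty slots at runtime):

```python
# Sketch: end-to-end system inference over the exported server ONNX
# models, per the paddle2onnx config above.
import subprocess

subprocess.run([
    "python3.7", "tools/infer/predict_system.py",
    "--rec_image_shape=3,32,320",
    "--use_gpu=False",                               # one leg of True|False
    "--det_model_dir=./inference/det_server_onnx/",  # assumed ONNX target
    "--rec_model_dir=./inference/rec_server_onnx/",  # assumed ONNX target
    "--image_dir=./inference/ch_det_data_50/all-sum-510/00008790.jpg",
], check=True)
```
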
diff --git a/test_tipc/configs/ch_ppocr_server_v2.0/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_server_v2.0/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..bbfec44dbab08dcfb932a922797448e541ea385b
--- /dev/null
+++ b/test_tipc/configs/ch_ppocr_server_v2.0/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,19 @@
+===========================serving_params===========================
+model_name:ch_ppocr_server_v2.0
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:./inference/ch_ppocr_server_v2.0_det_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:./deploy/pdserving/ppocr_det_server_serving/
+--det_serving_client:./deploy/pdserving/ppocr_det_server_client/
+--rec_dirname:./inference/ch_ppocr_server_v2.0_rec_infer/
+--rec_serving_server:./deploy/pdserving/ppocr_rec_server_serving/
+--rec_serving_client:./deploy/pdserving/ppocr_rec_server_client/
+serving_dir:./deploy/pdserving
+web_service:-m paddle_serving_server.serve
+--op:GeneralDetectionOp GeneralInferOp
+--port:8181
+--gpu_id:"0"|null
+cpp_client:ocr_cpp_client.py
+--image_dir:../../doc/imgs/1.jpg
diff --git a/test_tipc/configs/ch_ppocr_server_v2.0/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_server_v2.0/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..8853e709d40a0fba6bedd7ce582425e39b9076ed
--- /dev/null
+++ b/test_tipc/configs/ch_ppocr_server_v2.0/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
@@ -0,0 +1,23 @@
+===========================serving_params===========================
+model_name:ch_ppocr_server_v2.0
+python:python3.7
+trans_model:-m paddle_serving_client.convert
+--det_dirname:./inference/ch_ppocr_server_v2.0_det_infer/
+--model_filename:inference.pdmodel
+--params_filename:inference.pdiparams
+--det_serving_server:./deploy/pdserving/ppocr_det_server_serving/
+--det_serving_client:./deploy/pdserving/ppocr_det_server_client/
+--rec_dirname:./inference/ch_ppocr_server_v2.0_rec_infer/
+--rec_serving_server:./deploy/pdserving/ppocr_rec_server_serving/
+--rec_serving_client:./deploy/pdserving/ppocr_rec_server_client/
+serving_dir:./deploy/pdserving
+web_service:web_service.py --config=config.yml --opt op.det.concurrency="1" op.rec.concurrency="1"
+op.det.local_service_conf.devices:gpu|null
+op.det.local_service_conf.use_mkldnn:False
+op.det.local_service_conf.thread_num:6
+op.det.local_service_conf.use_trt:False
+op.det.local_service_conf.precision:fp32
+op.det.local_service_conf.model_config:
+op.rec.local_service_conf.model_config:
+pipline:pipeline_http_client.py
+--image_dir:../../doc/imgs/1.jpg
diff --git a/test_tipc/configs/ch_ppocr_server_v2.0_det/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_server_v2.0_det/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..69ae939e2b6cab5e07bc4e401a83c66324754223
--- /dev/null
+++ b/test_tipc/configs/ch_ppocr_server_v2.0_det/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,20 @@
+===========================cpp_infer_params===========================
+model_name:ch_ppocr_server_v2.0_det
+use_opencv:True
+infer_model:./inference/ch_ppocr_server_v2.0_det_infer/
+infer_quant:False
+inference:./deploy/cpp_infer/build/ppocr
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1
+--use_tensorrt:False
+--precision:fp32
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+null:null
+--benchmark:True
+--det:True
+--rec:False
+--cls:False
+--use_angle_cls:False
\ No newline at end of file
diff --git a/test_tipc/configs/ch_ppocr_server_v2.0_det/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt b/test_tipc/configs/ch_ppocr_server_v2.0_det/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt
index 40fdc11241f3ac966ff01d4c51173f990cc594c5..c8bebf54f2ed2627cce9a22013d1566eb7a7b6ef 100644
--- a/test_tipc/configs/ch_ppocr_server_v2.0_det/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt
+++ b/test_tipc/configs/ch_ppocr_server_v2.0_det/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt
@@ -1,14 +1,17 @@
===========================paddle2onnx_params===========================
-model_name:ocr_det_server
+model_name:ch_ppocr_server_v2.0_det
python:python3.7
2onnx: paddle2onnx
---model_dir:./inference/ch_ppocr_server_v2.0_det_infer/
+--det_model_dir:./inference/ch_ppocr_server_v2.0_det_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
---save_file:./inference/det_server_onnx/model.onnx
+--det_save_file:./inference/det_server_onnx/model.onnx
+--rec_model_dir:
+--rec_save_file:
--opset_version:10
--enable_onnx_checker:True
inference:tools/infer/predict_det.py
--use_gpu:True|False
--det_model_dir:
---image_dir:./inference/det_inference
\ No newline at end of file
+--rec_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/00008790.jpg
\ No newline at end of file
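The paddle2onnx_params keys carry a det_/rec_ prefix so that one file can describe both the detection and the recognition export; the assumption below is that the driver maps them back onto paddle2onnx's generic --model_dir/--save_file flags, building one command per non-empty model entry. A hedged sketch of the export command the detection entries above describe:

```python
# Sketch of the paddle2onnx invocation implied by the config above
# (assumption: --det_model_dir/--det_save_file map onto the generic
# --model_dir/--save_file flags of the paddle2onnx CLI).
import subprocess

cmd = [
    "paddle2onnx",
    "--model_dir", "./inference/ch_ppocr_server_v2.0_det_infer/",
    "--model_filename", "inference.pdmodel",
    "--params_filename", "inference.pdiparams",
    "--save_file", "./inference/det_server_onnx/model.onnx",
    "--opset_version", "10",
    "--enable_onnx_checker", "True",
]
subprocess.run(cmd, check=True)  # exports the detection model to ONNX
```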
diff --git a/test_tipc/configs/ch_ppocr_server_v2.0_det/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_server_v2.0_det/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
index ec5464604697e15bdd4e0f7282d23a8e09f4a0b5..018dd1a227064479ebd60570113b122b035e7704 100644
--- a/test_tipc/configs/ch_ppocr_server_v2.0_det/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
+++ b/test_tipc/configs/ch_ppocr_server_v2.0_det/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
@@ -1,18 +1,23 @@
===========================serving_params===========================
-model_name:ocr_det_server
-python:python3.7|cpp
+model_name:ch_ppocr_server_v2.0_det
+python:python3.7
trans_model:-m paddle_serving_client.convert
---dirname:./inference/ch_ppocr_server_v2.0_det_infer/
+--det_dirname:./inference/ch_ppocr_server_v2.0_det_infer/
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
---serving_server:./deploy/pdserving/ppocr_det_server_2.0_serving/
---serving_client:./deploy/pdserving/ppocr_det_server_2.0_client/
+--det_serving_server:./deploy/pdserving/ppocr_det_server_serving/
+--det_serving_client:./deploy/pdserving/ppocr_det_server_client/
+--rec_dirname:null
+--rec_serving_server:null
+--rec_serving_client:null
serving_dir:./deploy/pdserving
-web_service:web_service_det.py --config=config.yml --opt op.det.concurrency=1
-op.det.local_service_conf.devices:"0"|null
-op.det.local_service_conf.use_mkldnn:True|False
-op.det.local_service_conf.thread_num:1|6
-op.det.local_service_conf.use_trt:False|True
-op.det.local_service_conf.precision:fp32|fp16|int8
-pipline:pipeline_rpc_client.py|pipeline_http_client.py
---image_dir:../../doc/imgs
\ No newline at end of file
+web_service:web_service_det.py --config=config.yml --opt op.det.concurrency="1"
+op.det.local_service_conf.devices:gpu|null
+op.det.local_service_conf.use_mkldnn:False
+op.det.local_service_conf.thread_num:6
+op.det.local_service_conf.use_trt:False
+op.det.local_service_conf.precision:fp32
+op.det.local_service_conf.model_config:
+op.rec.local_service_conf.model_config:
+pipline:pipeline_http_client.py
+--image_dir:../../doc/imgs/1.jpg
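Here web_service_det.py starts the Paddle Serving pipeline server described by config.yml, and the `pipline` entry names the client script that posts the test image to it. A sketch of the request that client sends, assuming the default endpoint and port from deploy/pdserving/config.yml:

```python
# Sketch of the HTTP request pipeline_http_client.py issues against the
# pipeline server (endpoint and port are assumptions taken from the
# default deploy/pdserving/config.yml, not guaranteed by this config).
import base64
import json
import requests

with open("../../doc/imgs/1.jpg", "rb") as f:
    image = base64.b64encode(f.read()).decode("utf-8")

payload = {"key": ["image"], "value": [image]}
resp = requests.post("http://127.0.0.1:9998/ocr/prediction", json=payload)
print(json.loads(resp.text))  # detection boxes (plus text when rec is enabled)
```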
diff --git a/test_tipc/configs/ch_ppocr_server_v2.0_det/train_infer_python.txt b/test_tipc/configs/ch_ppocr_server_v2.0_det/train_infer_python.txt
index 52489fe5298dbeba31ff0ff5abe03c0c49b46e0a..c16ca150029d03052396ca28a6396520e63b3f84 100644
--- a/test_tipc/configs/ch_ppocr_server_v2.0_det/train_infer_python.txt
+++ b/test_tipc/configs/ch_ppocr_server_v2.0_det/train_infer_python.txt
@@ -4,7 +4,7 @@ python:python3.7
gpu_list:0|0,1
Global.use_gpu:True|True
Global.auto_cast:null
-Global.epoch_num:lite_train_lite_infer=2|whole_train_whole_infer=300
+Global.epoch_num:lite_train_lite_infer=2|whole_train_whole_infer=50
Global.save_model_dir:./output/
Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_lite_infer=4
Global.pretrained_model:null
@@ -39,11 +39,11 @@ infer_export:tools/export_model.py -c configs/det/ch_ppocr_v2.0/ch_det_res18_db_
infer_quant:False
inference:tools/infer/predict_det.py
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1
---use_tensorrt:False|True
---precision:fp32|fp16|int8
+--use_tensorrt:False
+--precision:fp32
--det_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/
--save_log_path:null
diff --git a/test_tipc/configs/ch_ppocr_server_v2.0_det/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_server_v2.0_det/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..12388d967755c54a46efdb915ef047896dddaef7
--- /dev/null
+++ b/test_tipc/configs/ch_ppocr_server_v2.0_det/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt
@@ -0,0 +1,53 @@
+===========================train_params===========================
+model_name:ch_ppocr_server_v2.0_det
+python:python3.7
+gpu_list:192.168.0.1,192.168.0.2;0,1
+Global.use_gpu:True
+Global.auto_cast:null
+Global.epoch_num:lite_train_lite_infer=2|whole_train_whole_infer=50
+Global.save_model_dir:./output/
+Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_lite_infer=4
+Global.pretrained_model:null
+train_model_name:latest
+train_infer_img_dir:./train_data/icdar2015/text_localization/ch4_test_images/
+null:null
+##
+trainer:norm_train
+norm_train:tools/train.py -c test_tipc/configs/ch_ppocr_server_v2.0_det/det_r50_vd_db.yml -o
+quant_train:null
+fpgm_train:null
+distill_train:null
+null:null
+null:null
+##
+===========================eval_params===========================
+eval:tools/eval.py -c test_tipc/configs/ch_ppocr_server_v2.0_det/det_r50_vd_db.yml -o
+null:null
+##
+===========================infer_params===========================
+Global.save_inference_dir:./output/
+Global.checkpoints:
+norm_export:tools/export_model.py -c test_tipc/configs/ch_ppocr_server_v2.0_det/det_r50_vd_db.yml -o
+quant_export:null
+fpgm_export:null
+distill_export:null
+export1:null
+export2:null
+##
+train_model:./inference/ch_ppocr_server_v2.0_det_train/best_accuracy
+infer_export:tools/export_model.py -c configs/det/ch_ppocr_v2.0/ch_det_res18_db_v2.0.yml -o
+infer_quant:False
+inference:tools/infer/predict_det.py
+--use_gpu:False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1
+--use_tensorrt:False
+--precision:fp32
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+--save_log_path:null
+--benchmark:True
+null:null
+===========================infer_benchmark_params==========================
+random_infer_input:[{float32,[3,640,640]}];[{float32,[3,960,960]}]
\ No newline at end of file
diff --git a/test_tipc/configs/ch_ppocr_server_v2.0_det/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_server_v2.0_det/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
index 3e3764e8c6f62c72ffb8ceb268c8ceee660d02de..93ed14cb600229e744167f26573cba406880db8e 100644
--- a/test_tipc/configs/ch_ppocr_server_v2.0_det/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
+++ b/test_tipc/configs/ch_ppocr_server_v2.0_det/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
@@ -4,7 +4,7 @@ python:python3.7
gpu_list:0|0,1
Global.use_gpu:True|True
Global.auto_cast:amp
-Global.epoch_num:lite_train_lite_infer=2|whole_train_whole_infer=300
+Global.epoch_num:lite_train_lite_infer=2|whole_train_whole_infer=50
Global.save_model_dir:./output/
Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_lite_infer=4
Global.pretrained_model:null
@@ -39,11 +39,11 @@ infer_export:tools/export_model.py -c configs/det/ch_ppocr_v2.0/ch_det_res18_db_
infer_quant:False
inference:tools/infer/predict_det.py
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1
---use_tensorrt:False|True
---precision:fp32|fp16|int8
+--use_tensorrt:False
+--precision:fp32
--det_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/
--save_log_path:null
diff --git a/test_tipc/configs/ch_ppocr_server_v2.0_rec/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_server_v2.0_rec/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..cbec272cce544e332fd908d4946321a15543fcae
--- /dev/null
+++ b/test_tipc/configs/ch_ppocr_server_v2.0_rec/model_linux_gpu_normal_normal_infer_cpp_linux_gpu_cpu.txt
@@ -0,0 +1,20 @@
+===========================cpp_infer_params===========================
+model_name:ch_ppocr_server_v2.0_rec
+use_opencv:True
+infer_model:./inference/ch_ppocr_server_v2.0_rec_infer/
+infer_quant:False
+inference:./deploy/cpp_infer/build/ppocr --rec_char_dict_path=./ppocr/utils/ppocr_keys_v1.txt --rec_img_h=32
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:6
+--use_tensorrt:False
+--precision:fp32
+--rec_model_dir:
+--image_dir:./inference/rec_inference/
+null:null
+--benchmark:True
+--det:False
+--rec:True
+--cls:False
+--use_angle_cls:False
\ No newline at end of file
diff --git a/test_tipc/configs/ch_ppocr_server_v2.0_rec/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt b/test_tipc/configs/ch_ppocr_server_v2.0_rec/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt
index 05542332e94eab38d8a433a727e04bf0be15f423..462f6090d987ac2c58656136e896e71bcdc3bee1 100644
--- a/test_tipc/configs/ch_ppocr_server_v2.0_rec/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt
+++ b/test_tipc/configs/ch_ppocr_server_v2.0_rec/model_linux_gpu_normal_normal_paddle2onnx_python_linux_cpu.txt
@@ -1,14 +1,17 @@
===========================paddle2onnx_params===========================
-model_name:ocr_rec_server
+model_name:ch_ppocr_server_v2.0_rec
python:python3.7
2onnx: paddle2onnx
---model_dir:./inference/ch_ppocr_server_v2.0_rec_infer/
+--det_model_dir:
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
---save_file:./inference/rec_server_onnx/model.onnx
+--det_save_file:
+--rec_model_dir:./inference/ch_ppocr_server_v2.0_rec_infer/
+--rec_save_file:./inference/rec_server_onnx/model.onnx
--opset_version:10
--enable_onnx_checker:True
-inference:tools/infer/predict_rec.py
+inference:tools/infer/predict_rec.py --rec_image_shape="3,32,320"
--use_gpu:True|False
+--det_model_dir:
--rec_model_dir:
---image_dir:./inference/rec_inference
\ No newline at end of file
+--image_dir:./inference/rec_inference/
\ No newline at end of file
diff --git a/test_tipc/configs/ch_ppocr_server_v2.0_rec/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_server_v2.0_rec/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
index d72abc6054d5f2eccf35f305076b7062fdf49848..7f456320b687549fbcd6d4f0be7a1b4a2969684a 100644
--- a/test_tipc/configs/ch_ppocr_server_v2.0_rec/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
+++ b/test_tipc/configs/ch_ppocr_server_v2.0_rec/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
@@ -1,18 +1,23 @@
===========================serving_params===========================
-model_name:ocr_rec_server
-python:python3.7|cpp
+model_name:ch_ppocr_server_v2.0_rec
+python:python3.7
trans_model:-m paddle_serving_client.convert
---dirname:./inference/ch_ppocr_server_v2.0_rec_infer/
+--det_dirname:null
--model_filename:inference.pdmodel
--params_filename:inference.pdiparams
---serving_server:./deploy/pdserving/ppocr_rec_server_2.0_serving/
---serving_client:./deploy/pdserving/ppocr_rec_server_2.0_client/
+--det_serving_server:null
+--det_serving_client:null
+--rec_dirname:./inference/ch_ppocr_server_v2.0_rec_infer/
+--rec_serving_server:./deploy/pdserving/ppocr_rec_server_serving/
+--rec_serving_client:./deploy/pdserving/ppocr_rec_server_client/
serving_dir:./deploy/pdserving
-web_service:web_service_rec.py --config=config.yml --opt op.rec.concurrency=1
-op.rec.local_service_conf.devices:"0"|null
-op.rec.local_service_conf.use_mkldnn:True|False
-op.rec.local_service_conf.thread_num:1|6
-op.rec.local_service_conf.use_trt:False|True
-op.rec.local_service_conf.precision:fp32|fp16|int8
-pipline:pipeline_rpc_client.py|pipeline_http_client.py
---image_dir:../../doc/imgs_words_en
\ No newline at end of file
+web_service:web_service_rec.py --config=config.yml --opt op.rec.concurrency="1"
+op.det.local_service_conf.devices:gpu|null
+op.det.local_service_conf.use_mkldnn:False
+op.det.local_service_conf.thread_num:6
+op.det.local_service_conf.use_trt:False
+op.det.local_service_conf.precision:fp32
+op.det.local_service_conf.model_config:
+op.rec.local_service_conf.model_config:
+pipline:pipeline_http_client.py --det=False
+--image_dir:../../inference/rec_inference
diff --git a/test_tipc/configs/ch_ppocr_server_v2.0_rec/train_infer_python.txt b/test_tipc/configs/ch_ppocr_server_v2.0_rec/train_infer_python.txt
index 78a046c503686762688ce08097d68479f1023879..64c0cf455cdd058d1840a9ad1f86954293d2e219 100644
--- a/test_tipc/configs/ch_ppocr_server_v2.0_rec/train_infer_python.txt
+++ b/test_tipc/configs/ch_ppocr_server_v2.0_rec/train_infer_python.txt
@@ -39,11 +39,11 @@ infer_export:tools/export_model.py -c test_tipc/configs/ch_ppocr_server_v2.0_rec
infer_quant:False
inference:tools/infer/predict_rec.py
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1|6
---use_tensorrt:True|False
---precision:fp32|int8
+--use_tensorrt:False
+--precision:fp32
--rec_model_dir:
--image_dir:./inference/rec_inference
--save_log_path:./test/output/
diff --git a/test_tipc/configs/ch_ppocr_server_v2.0_rec/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_server_v2.0_rec/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..9884ab247b80de4ca700bf084cea4faa89c86396
--- /dev/null
+++ b/test_tipc/configs/ch_ppocr_server_v2.0_rec/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt
@@ -0,0 +1,53 @@
+===========================train_params===========================
+model_name:ch_ppocr_server_v2.0_rec
+python:python3.7
+gpu_list:192.168.0.1,192.168.0.2;0,1
+Global.use_gpu:True
+Global.auto_cast:null
+Global.epoch_num:lite_train_lite_infer=5|whole_train_whole_infer=100
+Global.save_model_dir:./output/
+Train.loader.batch_size_per_card:lite_train_lite_infer=128|whole_train_whole_infer=128
+Global.pretrained_model:null
+train_model_name:latest
+train_infer_img_dir:./inference/rec_inference
+null:null
+##
+trainer:norm_train
+norm_train:tools/train.py -c test_tipc/configs/ch_ppocr_server_v2.0_rec/rec_icdar15_train.yml -o
+pact_train:null
+fpgm_train:null
+distill_train:null
+null:null
+null:null
+##
+===========================eval_params===========================
+eval:tools/eval.py -c test_tipc/configs/ch_ppocr_server_v2.0_rec/rec_icdar15_train.yml -o
+null:null
+##
+===========================infer_params===========================
+Global.save_inference_dir:./output/
+Global.checkpoints:
+norm_export:tools/export_model.py -c test_tipc/configs/ch_ppocr_server_v2.0_rec/rec_icdar15_train.yml -o
+quant_export:null
+fpgm_export:null
+distill_export:null
+export1:null
+export2:null
+##
+train_model:./inference/ch_ppocr_server_v2.0_rec_train/best_accuracy
+infer_export:tools/export_model.py -c test_tipc/configs/ch_ppocr_server_v2.0_rec/rec_icdar15_train.yml -o
+infer_quant:False
+inference:tools/infer/predict_rec.py
+--use_gpu:False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1|6
+--use_tensorrt:False
+--precision:fp32
+--rec_model_dir:
+--image_dir:./inference/rec_inference
+--save_log_path:./test/output/
+--benchmark:True
+null:null
+===========================infer_benchmark_params==========================
+random_infer_input:[{float32,[3,32,100]}]
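The infer_benchmark_params line encodes a dtype and a CHW shape for the synthetic input used in benchmark-only inference; `;` separates several size variants, as in the detection configs above. A small illustrative decoder (not the TIPC tooling itself):

```python
# Sketch: decode random_infer_input specs such as
# [{float32,[3,640,640]}];[{float32,[3,960,960]}] and build random batches.
import re
import numpy as np

spec = "[{float32,[3,640,640]}];[{float32,[3,960,960]}]"  # ';' separates variants
for group in spec.split(";"):
    dtype, shape = re.match(r"\[\{(\w+),\[([\d,]+)\]\}\]", group).groups()
    shape = [int(s) for s in shape.split(",")]
    batch = np.random.rand(1, *shape).astype(dtype)  # prepend a batch dimension
    print(batch.shape, batch.dtype)  # (1, 3, 640, 640) float32, then 960x960
```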
diff --git a/test_tipc/configs/ch_ppocr_server_v2.0_rec/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/ch_ppocr_server_v2.0_rec/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
index 78c15047fb522127075591cc9687392af77a300a..63ddaa4a8b2dcb19823034ee85af14b248b109b2 100644
--- a/test_tipc/configs/ch_ppocr_server_v2.0_rec/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
+++ b/test_tipc/configs/ch_ppocr_server_v2.0_rec/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
@@ -39,11 +39,11 @@ infer_export:tools/export_model.py -c test_tipc/configs/ch_ppocr_server_v2.0_rec
infer_quant:False
inference:tools/infer/predict_rec.py
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1|6
---use_tensorrt:True|False
---precision:fp32|int8
+--use_tensorrt:False
+--precision:fp32
--rec_model_dir:
--image_dir:./inference/rec_inference
--save_log_path:./test/output/
diff --git a/test_tipc/configs/det_mv3_db_v2_0/train_infer_python.txt b/test_tipc/configs/det_mv3_db_v2_0/train_infer_python.txt
index fab8f50d5451f90183d02e30c6529d63af42fe7f..ab3aa59b601db58b48cf18de79f77710611e2596 100644
--- a/test_tipc/configs/det_mv3_db_v2_0/train_infer_python.txt
+++ b/test_tipc/configs/det_mv3_db_v2_0/train_infer_python.txt
@@ -39,11 +39,11 @@ infer_export:tools/export_model.py -c configs/det/det_mv3_db.yml -o
infer_quant:False
inference:tools/infer/predict_det.py
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1
---use_tensorrt:False|True
---precision:fp32|fp16|int8
+--use_tensorrt:False
+--precision:fp32
--det_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/
null:null
diff --git a/test_tipc/configs/det_mv3_east_v2.0/train_infer_python.txt b/test_tipc/configs/det_mv3_east_v2.0/train_infer_python.txt
index 5634297973bafbdad6c168e369d15520db09aba3..1ec1597a4d50ba1c41cfb076fa7431f170e183bf 100644
--- a/test_tipc/configs/det_mv3_east_v2.0/train_infer_python.txt
+++ b/test_tipc/configs/det_mv3_east_v2.0/train_infer_python.txt
@@ -39,11 +39,11 @@ infer_export:tools/export_model.py -c test_tipc/configs/det_mv3_east_v2.0/det_mv
infer_quant:False
inference:tools/infer/predict_det.py
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1
--use_tensorrt:False
---precision:fp32|fp16|int8
+--precision:fp32
--det_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/
--save_log_path:null
diff --git a/test_tipc/configs/det_mv3_pse_v2.0/train_infer_python.txt b/test_tipc/configs/det_mv3_pse_v2.0/train_infer_python.txt
index 661adc4a324e0d51846d05d52a4cf1862661c095..daeec69f84a766e1d6cd2f8906772c27f5f8d048 100644
--- a/test_tipc/configs/det_mv3_pse_v2.0/train_infer_python.txt
+++ b/test_tipc/configs/det_mv3_pse_v2.0/train_infer_python.txt
@@ -39,11 +39,11 @@ infer_export:tools/export_model.py -c test_tipc/configs/det_mv3_pse_v2.0/det_mv3
infer_quant:False
inference:tools/infer/predict_det.py
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1
--use_tensorrt:False
---precision:fp32|fp16
+--precision:fp32
--det_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/
--save_log_path:null
diff --git a/test_tipc/configs/det_r18_vd_db_v2_0/train_infer_python.txt b/test_tipc/configs/det_r18_vd_db_v2_0/train_infer_python.txt
index 77023ef2c18ba1a189b8b066773370c8bb060d87..33e4dbf2337f3799328516119a213bc0f14af9fe 100644
--- a/test_tipc/configs/det_r18_vd_db_v2_0/train_infer_python.txt
+++ b/test_tipc/configs/det_r18_vd_db_v2_0/train_infer_python.txt
@@ -39,11 +39,11 @@ infer_export:null
infer_quant:False
inference:tools/infer/predict_det.py
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1
---use_tensorrt:False|True
---precision:fp32|fp16|int8
+--use_tensorrt:False
+--precision:fp32
--det_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/
--save_log_path:null
diff --git a/test_tipc/configs/det_r50_db++/train_infer_python.txt b/test_tipc/configs/det_r50_db++/train_infer_python.txt
new file mode 100644
index 0000000000000000000000000000000000000000..bcf393a52b0e073267aa7423960179d8b5eba4bd
--- /dev/null
+++ b/test_tipc/configs/det_r50_db++/train_infer_python.txt
@@ -0,0 +1,59 @@
+===========================train_params===========================
+model_name:det_r50_db++
+python:python3.7
+gpu_list:0|0,1
+Global.use_gpu:True|True
+Global.auto_cast:null
+Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=300
+Global.save_model_dir:./output/
+Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4
+Global.pretrained_model:null
+train_model_name:latest
+train_infer_img_dir:./train_data/icdar2015/text_localization/ch4_test_images/
+null:null
+##
+trainer:norm_train
+norm_train:tools/train.py -c configs/det/det_r50_db++_ic15.yml -o Global.pretrained_model=./pretrain_models/ResNet50_dcn_asf_synthtext_pretrained
+pact_train:null
+fpgm_train:null
+distill_train:null
+null:null
+null:null
+##
+===========================eval_params===========================
+eval:null
+null:null
+##
+===========================infer_params===========================
+Global.save_inference_dir:./output/
+Global.checkpoints:
+norm_export:tools/export_model.py -c configs/det/det_r50_db++_ic15.yml -o
+quant_export:null
+fpgm_export:null
+distill_export:null
+export1:null
+export2:null
+inference_dir:null
+train_model:./inference/det_r50_db++_train/best_accuracy
+infer_export:tools/export_model.py -c configs/det/det_r50_db++_ic15.yml -o
+infer_quant:False
+inference:tools/infer/predict_det.py --det_algorithm="DB++"
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1
+--use_tensorrt:False
+--precision:fp32
+--det_model_dir:
+--image_dir:./inference/ch_det_data_50/all-sum-510/
+null:null
+--benchmark:True
+null:null
+===========================infer_benchmark_params==========================
+random_infer_input:[{float32,[3,640,640]}];[{float32,[3,960,960]}]
+===========================train_benchmark_params==========================
+batch_size:8|16
+fp_items:fp32|fp16
+epoch:2
+--profiler_options:batch_range=[10,20];state=GPU;tracer_option=Default;profile_path=model.profile
+flags:FLAGS_eager_delete_tensor_gb=0.0;FLAGS_fraction_of_gpu_memory_to_use=0.98;FLAGS_conv_workspace_size_limit=4096
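The trailing train_benchmark_params block drives throughput benchmarking: batch_size and fp_items are combined pairwise, epoch caps the run length, and flags lists Paddle FLAGS_* variables to set before launch. A sketch, assuming the driver simply exports the flags into the environment:

```python
# Sketch: apply the "flags:" line as environment variables and enumerate
# the benchmark grid (illustrative; the real driver is a shell script
# under test_tipc/).
import os
import itertools

flags = ("FLAGS_eager_delete_tensor_gb=0.0;"
         "FLAGS_fraction_of_gpu_memory_to_use=0.98;"
         "FLAGS_conv_workspace_size_limit=4096")
for item in flags.split(";"):
    name, value = item.split("=", 1)
    os.environ[name] = value  # Paddle reads FLAGS_* from the environment

# batch_size:8|16 x fp_items:fp32|fp16 -> four benchmark configurations
for bs, fp in itertools.product([8, 16], ["fp32", "fp16"]):
    print(f"benchmark run: batch_size={bs}, precision={fp}, epoch=2")
```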
diff --git a/test_tipc/configs/det_r50_db_v2.0/train_infer_python.txt b/test_tipc/configs/det_r50_db_v2.0/train_infer_python.txt
index 3fd875711e03f1c31db0948e68f573ba7e113b51..151f2769cc2d97d6a3546f338383dd811aa06ace 100644
--- a/test_tipc/configs/det_r50_db_v2.0/train_infer_python.txt
+++ b/test_tipc/configs/det_r50_db_v2.0/train_infer_python.txt
@@ -39,11 +39,11 @@ infer_export:tools/export_model.py -c configs/det/det_r50_vd_db.yml -o
infer_quant:False
inference:tools/infer/predict_det.py
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1
---use_tensorrt:False|True
---precision:fp32|fp16|int8
+--use_tensorrt:False
+--precision:fp32
--det_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/
--save_log_path:null
diff --git a/test_tipc/configs/det_r50_vd_east_v2_0/train_infer_python.txt b/test_tipc/configs/det_r50_vd_east_v2_0/train_infer_python.txt
index c1748c5d2fca9690926f6645205084fb9a858185..8477a4fa74f7a0617104aa83617fc6f61b8234b3 100644
--- a/test_tipc/configs/det_r50_vd_east_v2_0/train_infer_python.txt
+++ b/test_tipc/configs/det_r50_vd_east_v2_0/train_infer_python.txt
@@ -39,11 +39,11 @@ infer_export:tools/export_model.py -c test_tipc/configs/det_r50_vd_east_v2_0/det
infer_quant:False
inference:tools/infer/predict_det.py
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1
--use_tensorrt:False
---precision:fp32|fp16|int8
+--precision:fp32
--det_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/
--save_log_path:null
diff --git a/test_tipc/configs/det_r50_vd_pse_v2_0/train_infer_python.txt b/test_tipc/configs/det_r50_vd_pse_v2_0/train_infer_python.txt
index 55ebcd3547b2e92c86e1c0007e0a1bcb9758cced..62da89fe1c8e3a7c2b7586eae6b2589f94237a2e 100644
--- a/test_tipc/configs/det_r50_vd_pse_v2_0/train_infer_python.txt
+++ b/test_tipc/configs/det_r50_vd_pse_v2_0/train_infer_python.txt
@@ -39,11 +39,11 @@ infer_export:tools/export_model.py -c test_tipc/configs/det_r50_vd_pse_v2_0/det_
infer_quant:False
inference:tools/infer/predict_det.py
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1
--use_tensorrt:False
---precision:fp32|fp16|int8
+--precision:fp32
--det_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/
--save_log_path:null
diff --git a/test_tipc/configs/det_r50_vd_sast_icdar15_v2.0/train_infer_python.txt b/test_tipc/configs/det_r50_vd_sast_icdar15_v2.0/train_infer_python.txt
index 16f37ace6fbc469cdd6fd2928e26978508a0841f..b70ef46b4afb3a39f3bbd3d6274f0135a0646a37 100644
--- a/test_tipc/configs/det_r50_vd_sast_icdar15_v2.0/train_infer_python.txt
+++ b/test_tipc/configs/det_r50_vd_sast_icdar15_v2.0/train_infer_python.txt
@@ -4,16 +4,16 @@ python:python3.7
gpu_list:0|0,1
Global.use_gpu:True|True
Global.auto_cast:null
-Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=5000
+Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=500
Global.save_model_dir:./output/
Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4
-Global.pretrained_model:null
+Global.pretrained_model:./pretrain_models/det_r50_vd_sast_icdar15_v2.0_train/best_accuracy
train_model_name:latest
train_infer_img_dir:./train_data/icdar2015/text_localization/ch4_test_images/
null:null
##
trainer:norm_train
-norm_train:tools/train.py -c test_tipc/configs/det_r50_vd_sast_icdar15_v2.0/det_r50_vd_sast_icdar2015.yml -o Global.pretrained_model=./pretrain_models/ResNet50_vd_ssld_pretrained
+norm_train:tools/train.py -c test_tipc/configs/det_r50_vd_sast_icdar15_v2.0/det_r50_vd_sast_icdar2015.yml -o
pact_train:null
fpgm_train:null
distill_train:null
@@ -39,13 +39,13 @@ infer_export:tools/export_model.py -c test_tipc/configs/det_r50_vd_sast_icdar15_
infer_quant:False
inference:tools/infer/predict_det.py
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1
--use_tensorrt:False
---precision:fp32|int8
+--precision:fp32
--det_model_dir:
---image_dir:./inference/ch_det_data_50/all-sum-510/
+--image_dir:./inference/ch_det_data_50/all-sum-510/00008790.jpg
null:null
--benchmark:True
--det_algorithm:SAST
diff --git a/test_tipc/configs/det_r50_vd_sast_totaltext_v2.0/train_infer_python.txt b/test_tipc/configs/det_r50_vd_sast_totaltext_v2.0/train_infer_python.txt
index 5e4c5666b7b90b754c77153612661aa5e01f4cb2..7be5af7ddee0ed0f688980f5d5dca5a99c9705a0 100644
--- a/test_tipc/configs/det_r50_vd_sast_totaltext_v2.0/train_infer_python.txt
+++ b/test_tipc/configs/det_r50_vd_sast_totaltext_v2.0/train_infer_python.txt
@@ -39,11 +39,11 @@ infer_export:tools/export_model.py -c test_tipc/configs/det_r50_vd_sast_totaltex
infer_quant:False
inference:tools/infer/predict_det.py
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1
--use_tensorrt:False
---precision:fp32|int8
+--precision:fp32
--det_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/
null:null
diff --git a/test_tipc/configs/en_server_pgnetA/train_infer_python.txt b/test_tipc/configs/en_server_pgnetA/train_infer_python.txt
index 8a1509baab46d4c52c56b3afbaf23350adc86584..a9dd4e676bf607d654b40f1f5ff21ec735bc9d40 100644
--- a/test_tipc/configs/en_server_pgnetA/train_infer_python.txt
+++ b/test_tipc/configs/en_server_pgnetA/train_infer_python.txt
@@ -39,11 +39,11 @@ infer_export:tools/export_model.py -c configs/e2e/e2e_r50_vd_pg.yml -o
infer_quant:False
inference:tools/infer/predict_e2e.py
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1
--use_tensorrt:False
---precision:fp32|fp16|int8
+--precision:fp32
--e2e_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/
null:null
diff --git a/test_tipc/configs/en_table_structure/table_mv3.yml b/test_tipc/configs/en_table_structure/table_mv3.yml
new file mode 100755
index 0000000000000000000000000000000000000000..adf326bd02aeff4683c8f37a704125b4e426efa9
--- /dev/null
+++ b/test_tipc/configs/en_table_structure/table_mv3.yml
@@ -0,0 +1,117 @@
+Global:
+ use_gpu: true
+ epoch_num: 10
+ log_smooth_window: 20
+ print_batch_step: 5
+ save_model_dir: ./output/table_mv3/
+ save_epoch_step: 3
+ # evaluation is run every 400 iterations after the 0th iteration
+ eval_batch_step: [0, 400]
+ cal_metric_during_train: True
+ pretrained_model:
+ checkpoints:
+ save_inference_dir:
+ use_visualdl: False
+ infer_img: doc/table/table.jpg
+ # for data or label process
+ character_dict_path: ppocr/utils/dict/table_structure_dict.txt
+ character_type: en
+ max_text_length: 100
+ max_elem_length: 800
+ max_cell_num: 500
+ infer_mode: False
+ process_total_num: 0
+ process_cut_num: 0
+
+Optimizer:
+ name: Adam
+ beta1: 0.9
+ beta2: 0.999
+ clip_norm: 5.0
+ lr:
+ learning_rate: 0.001
+ regularizer:
+ name: 'L2'
+ factor: 0.00000
+
+Architecture:
+ model_type: table
+ algorithm: TableAttn
+ Backbone:
+ name: MobileNetV3
+ scale: 1.0
+ model_name: large
+ Head:
+ name: TableAttentionHead
+ hidden_size: 256
+ l2_decay: 0.00001
+ loc_type: 2
+ max_text_length: 100
+ max_elem_length: 800
+ max_cell_num: 500
+
+Loss:
+ name: TableAttentionLoss
+ structure_weight: 100.0
+ loc_weight: 10000.0
+
+PostProcess:
+ name: TableLabelDecode
+
+Metric:
+ name: TableMetric
+ main_indicator: acc
+
+Train:
+ dataset:
+ name: PubTabDataSet
+ data_dir: ./train_data/pubtabnet/train
+ label_file_path: ./train_data/pubtabnet/train.jsonl
+ transforms:
+ - DecodeImage: # load image
+ img_mode: BGR
+ channel_first: False
+ - ResizeTableImage:
+ max_len: 488
+ - TableLabelEncode:
+ - NormalizeImage:
+ scale: 1./255.
+ mean: [0.485, 0.456, 0.406]
+ std: [0.229, 0.224, 0.225]
+ order: 'hwc'
+ - PaddingTableImage:
+ - ToCHWImage:
+ - KeepKeys:
+ keep_keys: ['image', 'structure', 'bbox_list', 'sp_tokens', 'bbox_list_mask']
+ loader:
+ shuffle: True
+ batch_size_per_card: 32
+ drop_last: True
+ num_workers: 1
+
+Eval:
+ dataset:
+ name: PubTabDataSet
+ data_dir: ./train_data/pubtabnet/test/
+ label_file_path: ./train_data/pubtabnet/test.jsonl
+ transforms:
+ - DecodeImage: # load image
+ img_mode: BGR
+ channel_first: False
+ - ResizeTableImage:
+ max_len: 488
+ - TableLabelEncode:
+ - NormalizeImage:
+ scale: 1./255.
+ mean: [0.485, 0.456, 0.406]
+ std: [0.229, 0.224, 0.225]
+ order: 'hwc'
+ - PaddingTableImage:
+ - ToCHWImage:
+ - KeepKeys:
+ keep_keys: ['image', 'structure', 'bbox_list', 'sp_tokens', 'bbox_list_mask']
+ loader:
+ shuffle: False
+ drop_last: False
+ batch_size_per_card: 16
+ num_workers: 1
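In this pipeline, ResizeTableImage scales the longer image side to max_len (488) and PaddingTableImage pads the result onto a square canvas, which is why the matching TIPC configs below benchmark with a [3,488,488] input. A simplified re-implementation for illustration (not ppocr's own transform code):

```python
# Simplified sketch of ResizeTableImage(max_len=488) + PaddingTableImage.
import numpy as np
import cv2

def resize_and_pad_table(img: np.ndarray, max_len: int = 488) -> np.ndarray:
    h, w = img.shape[:2]
    ratio = max_len / max(h, w)              # longer side becomes max_len
    img = cv2.resize(img, (int(w * ratio), int(h * ratio)))
    padded = np.zeros((max_len, max_len, 3), dtype=img.dtype)
    padded[:img.shape[0], :img.shape[1]] = img  # image sits top-left, rest is pad
    return padded

table = resize_and_pad_table(cv2.imread("doc/table/table.jpg"))
print(table.shape)  # (488, 488, 3); ToCHWImage later transposes to CHW
```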
diff --git a/test_tipc/configs/en_table_structure/train_infer_python.txt b/test_tipc/configs/en_table_structure/train_infer_python.txt
new file mode 100644
index 0000000000000000000000000000000000000000..d9f3b30e16c75281a929130d877b947a23c16190
--- /dev/null
+++ b/test_tipc/configs/en_table_structure/train_infer_python.txt
@@ -0,0 +1,53 @@
+===========================train_params===========================
+model_name:en_table_structure
+python:python3.7
+gpu_list:0|0,1
+Global.use_gpu:True|True
+Global.auto_cast:fp32
+Global.epoch_num:lite_train_lite_infer=3|whole_train_whole_infer=50
+Global.save_model_dir:./output/
+Train.loader.batch_size_per_card:lite_train_lite_infer=16|whole_train_whole_infer=128
+Global.pretrained_model:./pretrain_models/en_ppocr_mobile_v2.0_table_structure_train/best_accuracy
+train_model_name:latest
+train_infer_img_dir:./ppstructure/docs/table/table.jpg
+null:null
+##
+trainer:norm_train
+norm_train:tools/train.py -c test_tipc/configs/en_table_structure/table_mv3.yml -o
+pact_train:null
+fpgm_train:null
+distill_train:null
+null:null
+null:null
+##
+===========================eval_params===========================
+eval:null
+null:null
+##
+===========================infer_params===========================
+Global.save_inference_dir:./output/
+Global.checkpoints:
+norm_export:tools/export_model.py -c test_tipc/configs/en_table_structure/table_mv3.yml -o
+quant_export:
+fpgm_export:
+distill_export:null
+export1:null
+export2:null
+##
+infer_model:./inference/en_ppocr_mobile_v2.0_table_structure_infer
+infer_export:null
+infer_quant:False
+inference:ppstructure/table/predict_table.py --det_model_dir=./inference/en_ppocr_mobile_v2.0_table_det_infer --rec_model_dir=./inference/en_ppocr_mobile_v2.0_table_rec_infer --rec_char_dict_path=./ppocr/utils/dict/table_dict.txt --table_char_dict_path=./ppocr/utils/dict/table_structure_dict.txt --image_dir=./ppstructure/docs/table/table.jpg --det_limit_side_len=736 --det_limit_type=min --output ./output/table
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1
+--use_tensorrt:False
+--precision:fp32
+--table_model_dir:
+--image_dir:./ppstructure/docs/table/table.jpg
+null:null
+--benchmark:False
+null:null
+===========================infer_benchmark_params==========================
+random_infer_input:[{float32,[3,488,488]}]
diff --git a/test_tipc/configs/en_table_structure/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/en_table_structure/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..41d236c3765fbf6a711c6739d8dee4f41a147039
--- /dev/null
+++ b/test_tipc/configs/en_table_structure/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt
@@ -0,0 +1,53 @@
+===========================train_params===========================
+model_name:en_table_structure
+python:python3.7
+gpu_list:192.168.0.1,192.168.0.2;0,1
+Global.use_gpu:True
+Global.auto_cast:fp32
+Global.epoch_num:lite_train_lite_infer=3|whole_train_whole_infer=50
+Global.save_model_dir:./output/
+Train.loader.batch_size_per_card:lite_train_lite_infer=16|whole_train_whole_infer=128
+Global.pretrained_model:./pretrain_models/en_ppocr_mobile_v2.0_table_structure_train/best_accuracy
+train_model_name:latest
+train_infer_img_dir:./ppstructure/docs/table/table.jpg
+null:null
+##
+trainer:norm_train
+norm_train:tools/train.py -c test_tipc/configs/en_table_structure/table_mv3.yml -o
+pact_train:null
+fpgm_train:null
+distill_train:null
+null:null
+null:null
+##
+===========================eval_params===========================
+eval:null
+null:null
+##
+===========================infer_params===========================
+Global.save_inference_dir:./output/
+Global.checkpoints:
+norm_export:tools/export_model.py -c test_tipc/configs/en_table_structure/table_mv3.yml -o
+quant_export:
+fpgm_export:
+distill_export:null
+export1:null
+export2:null
+##
+infer_model:./inference/en_ppocr_mobile_v2.0_table_structure_infer
+infer_export:null
+infer_quant:False
+inference:ppstructure/table/predict_table.py --det_model_dir=./inference/en_ppocr_mobile_v2.0_table_det_infer --rec_model_dir=./inference/en_ppocr_mobile_v2.0_table_rec_infer --rec_char_dict_path=./ppocr/utils/dict/table_dict.txt --table_char_dict_path=./ppocr/utils/dict/table_structure_dict.txt --image_dir=./ppstructure/docs/table/table.jpg --det_limit_side_len=736 --det_limit_type=min --output ./output/table
+--use_gpu:False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1
+--use_tensorrt:False
+--precision:fp32
+--table_model_dir:
+--image_dir:./ppstructure/docs/table/table.jpg
+null:null
+--benchmark:False
+null:null
+===========================infer_benchmark_params==========================
+random_infer_input:[{float32,[3,488,488]}]
diff --git a/test_tipc/configs/en_table_structure/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt b/test_tipc/configs/en_table_structure/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
new file mode 100644
index 0000000000000000000000000000000000000000..31ac1ed53f2adc9810bc4fd2cf4f874d89d49606
--- /dev/null
+++ b/test_tipc/configs/en_table_structure/train_linux_gpu_normal_amp_infer_python_linux_gpu_cpu.txt
@@ -0,0 +1,53 @@
+===========================train_params===========================
+model_name:en_table_structure
+python:python3.7
+gpu_list:0|0,1
+Global.use_gpu:True|True
+Global.auto_cast:amp
+Global.epoch_num:lite_train_lite_infer=3|whole_train_whole_infer=50
+Global.save_model_dir:./output/
+Train.loader.batch_size_per_card:lite_train_lite_infer=16|whole_train_whole_infer=128
+Global.pretrained_model:./pretrain_models/en_ppocr_mobile_v2.0_table_structure_train/best_accuracy
+train_model_name:latest
+train_infer_img_dir:./ppstructure/docs/table/table.jpg
+null:null
+##
+trainer:norm_train
+norm_train:tools/train.py -c test_tipc/configs/en_table_structure/table_mv3.yml -o
+pact_train:null
+fpgm_train:null
+distill_train:null
+null:null
+null:null
+##
+===========================eval_params===========================
+eval:null
+null:null
+##
+===========================infer_params===========================
+Global.save_inference_dir:./output/
+Global.checkpoints:
+norm_export:tools/export_model.py -c test_tipc/configs/en_table_structure/table_mv3.yml -o
+quant_export:
+fpgm_export:
+distill_export:null
+export1:null
+export2:null
+##
+infer_model:./inference/en_ppocr_mobile_v2.0_table_structure_infer
+infer_export:null
+infer_quant:False
+inference:ppstructure/table/predict_table.py --det_model_dir=./inference/en_ppocr_mobile_v2.0_table_det_infer --rec_model_dir=./inference/en_ppocr_mobile_v2.0_table_rec_infer --rec_char_dict_path=./ppocr/utils/dict/table_dict.txt --table_char_dict_path=./ppocr/utils/dict/table_structure_dict.txt --image_dir=./ppstructure/docs/table/table.jpg --det_limit_side_len=736 --det_limit_type=min --output ./output/table
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1
+--use_tensorrt:False
+--precision:fp32
+--table_model_dir:
+--image_dir:./ppstructure/docs/table/table.jpg
+null:null
+--benchmark:False
+null:null
+===========================infer_benchmark_params==========================
+random_infer_input:[{float32,[3,488,488]}]
diff --git a/test_tipc/configs/en_table_structure/train_pact_infer_python.txt b/test_tipc/configs/en_table_structure/train_pact_infer_python.txt
new file mode 100644
index 0000000000000000000000000000000000000000..f62e8b68bc6c1af06a65a8dfb438d5d63576e123
--- /dev/null
+++ b/test_tipc/configs/en_table_structure/train_pact_infer_python.txt
@@ -0,0 +1,53 @@
+===========================train_params===========================
+model_name:en_table_structure_PACT
+python:python3.7
+gpu_list:0|0,1
+Global.use_gpu:True|True
+Global.auto_cast:fp32
+Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=50
+Global.save_model_dir:./output/
+Train.loader.batch_size_per_card:lite_train_lite_infer=16|whole_train_whole_infer=128
+Global.pretrained_model:./pretrain_models/en_ppocr_mobile_v2.0_table_structure_train/best_accuracy
+train_model_name:latest
+train_infer_img_dir:./ppstructure/docs/table/table.jpg
+null:null
+##
+trainer:pact_train
+norm_train:null
+pact_train:deploy/slim/quantization/quant.py -c test_tipc/configs/en_table_structure/table_mv3.yml -o
+fpgm_train:null
+distill_train:null
+null:null
+null:null
+##
+===========================eval_params===========================
+eval:null
+null:null
+##
+===========================infer_params===========================
+Global.save_inference_dir:./output/
+Global.checkpoints:
+norm_export:null
+quant_export:deploy/slim/quantization/export_model.py -c test_tipc/configs/en_table_structure/table_mv3.yml -o
+fpgm_export:
+distill_export:null
+export1:null
+export2:null
+##
+infer_model:./inference/en_ppocr_mobile_v2.0_table_structure_infer
+infer_export:null
+infer_quant:True
+inference:ppstructure/table/predict_table.py --det_model_dir=./inference/en_ppocr_mobile_v2.0_table_det_infer --rec_model_dir=./inference/en_ppocr_mobile_v2.0_table_rec_infer --rec_char_dict_path=./ppocr/utils/dict/table_dict.txt --table_char_dict_path=./ppocr/utils/dict/table_structure_dict.txt --image_dir=./ppstructure/docs/table/table.jpg --det_limit_side_len=736 --det_limit_type=min --output ./output/table
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1
+--use_tensorrt:False
+--precision:fp32
+--table_model_dir:
+--image_dir:./ppstructure/docs/table/table.jpg
+null:null
+--benchmark:False
+null:null
+===========================infer_benchmark_params==========================
+random_infer_input:[{float32,[3,488,488]}]
diff --git a/test_tipc/configs/en_table_structure/train_ptq_infer_python.txt b/test_tipc/configs/en_table_structure/train_ptq_infer_python.txt
new file mode 100644
index 0000000000000000000000000000000000000000..e8f7bbaa50417b97f79596634677fff0a95cb47f
--- /dev/null
+++ b/test_tipc/configs/en_table_structure/train_ptq_infer_python.txt
@@ -0,0 +1,21 @@
+===========================train_params===========================
+model_name:en_table_structure_KL
+python:python3.7
+Global.pretrained_model:
+Global.save_inference_dir:null
+infer_model:./inference/en_ppocr_mobile_v2.0_table_structure_infer/
+infer_export:deploy/slim/quantization/quant_kl.py -c test_tipc/configs/en_table_structure/table_mv3.yml -o
+infer_quant:True
+inference:ppstructure/table/predict_table.py --det_model_dir=./inference/en_ppocr_mobile_v2.0_table_det_infer --rec_model_dir=./inference/en_ppocr_mobile_v2.0_table_rec_infer --rec_char_dict_path=./ppocr/utils/dict/table_dict.txt --table_char_dict_path=./ppocr/utils/dict/table_structure_dict.txt --image_dir=./ppstructure/docs/table/table.jpg --det_limit_side_len=736 --det_limit_type=min --output ./output/table
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1
+--use_tensorrt:False
+--precision:int8
+--table_model_dir:
+--image_dir:./ppstructure/docs/table/table.jpg
+null:null
+--benchmark:False
+null:null
+null:null
diff --git a/test_tipc/configs/rec_mtb_nrtr/rec_mtb_nrtr.yml b/test_tipc/configs/rec_mtb_nrtr/rec_mtb_nrtr.yml
index 15119bb2a9de02c19684d21ad5a1859db94895ce..8118d587248b7e4797e3a75c897e7b0a3d71b364 100644
--- a/test_tipc/configs/rec_mtb_nrtr/rec_mtb_nrtr.yml
+++ b/test_tipc/configs/rec_mtb_nrtr/rec_mtb_nrtr.yml
@@ -49,7 +49,7 @@ Architecture:
Loss:
- name: NRTRLoss
+ name: CELoss
smoothing: True
PostProcess:
@@ -69,7 +69,7 @@ Train:
img_mode: BGR
channel_first: False
- NRTRLabelEncode: # Class handling label
- - NRTRRecResizeImg:
+ - GrayRecResizeImg:
image_shape: [100, 32]
resize_type: PIL # PIL or OpenCV
- KeepKeys:
@@ -90,7 +90,7 @@ Eval:
img_mode: BGR
channel_first: False
- NRTRLabelEncode: # Class handling label
- - NRTRRecResizeImg:
+ - GrayRecResizeImg:
image_shape: [100, 32]
resize_type: PIL # PIL or OpenCV
- KeepKeys:
@@ -99,5 +99,5 @@ Eval:
shuffle: False
drop_last: False
batch_size_per_card: 256
- num_workers: 1
+ num_workers: 4
use_shared_memory: False
diff --git a/test_tipc/configs/rec_mtb_nrtr/train_infer_python.txt b/test_tipc/configs/rec_mtb_nrtr/train_infer_python.txt
index de6de5a0caa36fb3ff89d8dbf5c7ff8b7965ca7f..fed8ba26753bb770e062f751a9ba1e8e35fc6843 100644
--- a/test_tipc/configs/rec_mtb_nrtr/train_infer_python.txt
+++ b/test_tipc/configs/rec_mtb_nrtr/train_infer_python.txt
@@ -39,11 +39,11 @@ infer_export:tools/export_model.py -c test_tipc/configs/rec_mtb_nrtr/rec_mtb_nrt
infer_quant:False
inference:tools/infer/predict_rec.py --rec_char_dict_path=./ppocr/utils/EN_symbol_dict.txt --rec_image_shape="1,32,100" --rec_algorithm="NRTR"
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1|6
---use_tensorrt:True|False
---precision:fp32|int8
+--use_tensorrt:False
+--precision:fp32
--rec_model_dir:
--image_dir:./inference/rec_inference
--save_log_path:./test/output/
diff --git a/test_tipc/configs/rec_mv3_none_bilstm_ctc_v2.0/train_infer_python.txt b/test_tipc/configs/rec_mv3_none_bilstm_ctc_v2.0/train_infer_python.txt
index e67dd1509054b34bfac6a36eaaca16fa31c0f1a0..39bf9227902480ffe4ed37d454c21d6a163c41bd 100644
--- a/test_tipc/configs/rec_mv3_none_bilstm_ctc_v2.0/train_infer_python.txt
+++ b/test_tipc/configs/rec_mv3_none_bilstm_ctc_v2.0/train_infer_python.txt
@@ -39,11 +39,11 @@ infer_export:tools/export_model.py -c test_tipc/configs/rec_mv3_none_bilstm_ctc_
infer_quant:False
inference:tools/infer/predict_rec.py --rec_char_dict_path=./ppocr/utils/ic15_dict.txt --rec_image_shape="3,32,100"
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1|6
---use_tensorrt:True|False
---precision:fp32|int8
+--use_tensorrt:False
+--precision:fp32
--rec_model_dir:
--image_dir:./inference/rec_inference
--save_log_path:./test/output/
diff --git a/test_tipc/configs/rec_mv3_none_none_ctc_v2.0/train_infer_python.txt b/test_tipc/configs/rec_mv3_none_none_ctc_v2.0/train_infer_python.txt
index aa3e88d284fe557c109cb8d794e2caecbec7a7ee..593de3ff20aa9890e7d9a02a9e5ca5b130e5a266 100644
--- a/test_tipc/configs/rec_mv3_none_none_ctc_v2.0/train_infer_python.txt
+++ b/test_tipc/configs/rec_mv3_none_none_ctc_v2.0/train_infer_python.txt
@@ -39,11 +39,11 @@ infer_export:tools/export_model.py -c test_tipc/configs/rec_mv3_none_none_ctc_v2
infer_quant:False
inference:tools/infer/predict_rec.py --rec_char_dict_path=./ppocr/utils/ic15_dict.txt --rec_image_shape="3,32,100"
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1|6
---use_tensorrt:True|False
---precision:fp32|int8
+--use_tensorrt:False
+--precision:fp32
--rec_model_dir:
--image_dir:./inference/rec_inference
--save_log_path:./test/output/
diff --git a/test_tipc/configs/rec_mv3_tps_bilstm_att_v2.0/train_infer_python.txt b/test_tipc/configs/rec_mv3_tps_bilstm_att_v2.0/train_infer_python.txt
index c22767c60fa8294aa244536b4c04135f7f7ade02..1b2d9abb0f00467ce92c4f51f97c283bc3e85c5e 100644
--- a/test_tipc/configs/rec_mv3_tps_bilstm_att_v2.0/train_infer_python.txt
+++ b/test_tipc/configs/rec_mv3_tps_bilstm_att_v2.0/train_infer_python.txt
@@ -39,11 +39,11 @@ infer_export:tools/export_model.py -c test_tipc/configs/rec_mv3_tps_bilstm_att_v
infer_quant:False
inference:tools/infer/predict_rec.py --rec_char_dict_path=./ppocr/utils/ic15_dict.txt --rec_image_shape="3,32,100" --rec_algorithm="RARE" --min_subgraph_size=5
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1|6
---use_tensorrt:True|False
---precision:fp32|int8
+--use_tensorrt:False
+--precision:fp32
--rec_model_dir:
--image_dir:./inference/rec_inference
--save_log_path:./test/output/
diff --git a/test_tipc/configs/rec_mv3_tps_bilstm_ctc_v2.0/train_infer_python.txt b/test_tipc/configs/rec_mv3_tps_bilstm_ctc_v2.0/train_infer_python.txt
index 7a3096eb1e3a94bf3967a80d49b622603ae06ff8..1367c7abd4c9ca5b0c6f1eb291dd2af8d9fa4de4 100644
--- a/test_tipc/configs/rec_mv3_tps_bilstm_ctc_v2.0/train_infer_python.txt
+++ b/test_tipc/configs/rec_mv3_tps_bilstm_ctc_v2.0/train_infer_python.txt
@@ -39,11 +39,11 @@ infer_export:tools/export_model.py -c test_tipc/configs/rec_mv3_tps_bilstm_ctc_v
infer_quant:False
inference:tools/infer/predict_rec.py --rec_char_dict_path=./ppocr/utils/ic15_dict.txt --rec_image_shape="3,32,100" --rec_algorithm="StarNet"
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1|6
---use_tensorrt:True|False
---precision:fp32|int8
+--use_tensorrt:False
+--precision:fp32
--rec_model_dir:
--image_dir:./inference/rec_inference
--save_log_path:./test/output/
diff --git a/test_tipc/configs/rec_r31_sar/train_infer_python.txt b/test_tipc/configs/rec_r31_sar/train_infer_python.txt
index 1a32a3d507d8923a8b51be726c7624ea2049ae14..03ec54abb65ac41d3b5ad4f6e2fdcf7abb34c344 100644
--- a/test_tipc/configs/rec_r31_sar/train_infer_python.txt
+++ b/test_tipc/configs/rec_r31_sar/train_infer_python.txt
@@ -38,12 +38,12 @@ train_model:./inference/rec_r31_sar_train/best_accuracy
infer_export:tools/export_model.py -c test_tipc/configs/rec_r31_sar/rec_r31_sar.yml -o
infer_quant:False
inference:tools/infer/predict_rec.py --rec_char_dict_path=./ppocr/utils/dict90.txt --rec_image_shape="3,48,48,160" --rec_algorithm="SAR"
---use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--use_gpu:True
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1|6
---use_tensorrt:True|False
---precision:fp32|int8
+--use_tensorrt:False
+--precision:fp32
--rec_model_dir:
--image_dir:./inference/rec_inference
--save_log_path:./test/output/
diff --git a/test_tipc/configs/rec_r34_vd_none_bilstm_ctc_v2.0/train_infer_python.txt b/test_tipc/configs/rec_r34_vd_none_bilstm_ctc_v2.0/train_infer_python.txt
index 02cea56fbe922bb94cceb77c079371f180cac618..46aa3d719051a4f124583f88709026569d95c1c7 100644
--- a/test_tipc/configs/rec_r34_vd_none_bilstm_ctc_v2.0/train_infer_python.txt
+++ b/test_tipc/configs/rec_r34_vd_none_bilstm_ctc_v2.0/train_infer_python.txt
@@ -39,11 +39,11 @@ infer_export:tools/export_model.py -c test_tipc/configs/rec_r34_vd_none_bilstm_c
infer_quant:False
inference:tools/infer/predict_rec.py --rec_char_dict_path=./ppocr/utils/ic15_dict.txt --rec_image_shape="3,32,100"
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1|6
---use_tensorrt:True|False
---precision:fp32|int8
+--use_tensorrt:False
+--precision:fp32
--rec_model_dir:
--image_dir:./inference/rec_inference
--save_log_path:./test/output/
diff --git a/test_tipc/configs/rec_r34_vd_none_none_ctc_v2.0/train_infer_python.txt b/test_tipc/configs/rec_r34_vd_none_none_ctc_v2.0/train_infer_python.txt
index 5e7c1d34314cfc8aab1c97d5f6e74b0dd75f496a..3e066d7b72a6a707322b3aabe41ca6d698496433 100644
--- a/test_tipc/configs/rec_r34_vd_none_none_ctc_v2.0/train_infer_python.txt
+++ b/test_tipc/configs/rec_r34_vd_none_none_ctc_v2.0/train_infer_python.txt
@@ -39,11 +39,11 @@ infer_export:tools/export_model.py -c test_tipc/configs/rec_r34_vd_none_none_ctc
infer_quant:False
inference:tools/infer/predict_rec.py --rec_char_dict_path=./ppocr/utils/ic15_dict.txt --rec_image_shape="3,32,100"
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1|6
---use_tensorrt:True|False
---precision:fp32|int8
+--use_tensorrt:False
+--precision:fp32
--rec_model_dir:
--image_dir:./inference/rec_inference
--save_log_path:./test/output/
diff --git a/test_tipc/configs/rec_r34_vd_tps_bilstm_att_v2.0/train_infer_python.txt b/test_tipc/configs/rec_r34_vd_tps_bilstm_att_v2.0/train_infer_python.txt
index 55e937881bec1852fade4f99d81a319b8b2c5b67..1e4f46633efbf36fc78ed2beb7ed883d1483b3b0 100644
--- a/test_tipc/configs/rec_r34_vd_tps_bilstm_att_v2.0/train_infer_python.txt
+++ b/test_tipc/configs/rec_r34_vd_tps_bilstm_att_v2.0/train_infer_python.txt
@@ -39,11 +39,11 @@ infer_export:tools/export_model.py -c test_tipc/configs/rec_r34_vd_tps_bilstm_at
infer_quant:False
inference:tools/infer/predict_rec.py --rec_char_dict_path=./ppocr/utils/ic15_dict.txt --rec_image_shape="3,32,100" --rec_algorithm="RARE" --min_subgraph_size=5
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1|6
---use_tensorrt:True|False
---precision:fp32|int8
+--use_tensorrt:False
+--precision:fp32
--rec_model_dir:
--image_dir:./inference/rec_inference
--save_log_path:./test/output/
diff --git a/test_tipc/configs/rec_r34_vd_tps_bilstm_ctc_v2.0/train_infer_python.txt b/test_tipc/configs/rec_r34_vd_tps_bilstm_ctc_v2.0/train_infer_python.txt
index 5b5ba0fd01c02b3b16d147edaf93495aeeaab7bf..9e795b66453039696ed5eedb92fba5e25150413c 100644
--- a/test_tipc/configs/rec_r34_vd_tps_bilstm_ctc_v2.0/train_infer_python.txt
+++ b/test_tipc/configs/rec_r34_vd_tps_bilstm_ctc_v2.0/train_infer_python.txt
@@ -39,11 +39,11 @@ infer_export:tools/export_model.py -c test_tipc/configs/rec_r34_vd_tps_bilstm_ct
infer_quant:False
inference:tools/infer/predict_rec.py --rec_char_dict_path=./ppocr/utils/ic15_dict.txt --rec_image_shape="3,32,100" --rec_algorithm="StarNet"
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1|6
---use_tensorrt:True|False
---precision:fp32|int8
+--use_tensorrt:False
+--precision:fp32
--rec_model_dir:
--image_dir:./inference/rec_inference
--save_log_path:./test/output/
diff --git a/test_tipc/configs/rec_r45_abinet/rec_r45_abinet.yml b/test_tipc/configs/rec_r45_abinet/rec_r45_abinet.yml
new file mode 100644
index 0000000000000000000000000000000000000000..5b5890e7728b9a1cb629744bd5d56488657c73f3
--- /dev/null
+++ b/test_tipc/configs/rec_r45_abinet/rec_r45_abinet.yml
@@ -0,0 +1,106 @@
+Global:
+ use_gpu: True
+ epoch_num: 10
+ log_smooth_window: 20
+ print_batch_step: 10
+ save_model_dir: ./output/rec/r45_abinet/
+ save_epoch_step: 1
+ # evaluation is run every 2000 iterations
+ eval_batch_step: [0, 2000]
+ cal_metric_during_train: True
+ pretrained_model:
+ checkpoints:
+ save_inference_dir:
+ use_visualdl: False
+ infer_img: doc/imgs_words_en/word_10.png
+ # for data or label process
+ character_dict_path:
+ character_type: en
+ max_text_length: 25
+ infer_mode: False
+ use_space_char: False
+ save_res_path: ./output/rec/predicts_abinet.txt
+
+Optimizer:
+ name: Adam
+ beta1: 0.9
+ beta2: 0.99
+ clip_norm: 20.0
+ lr:
+ name: Piecewise
+ decay_epochs: [6]
+ values: [0.0001, 0.00001]
+ regularizer:
+ name: 'L2'
+ factor: 0.
+
+Architecture:
+ model_type: rec
+ algorithm: ABINet
+ in_channels: 3
+ Transform:
+ Backbone:
+ name: ResNet45
+
+ Head:
+ name: ABINetHead
+ use_lang: True
+ iter_size: 3
+
+
+Loss:
+ name: CELoss
+ ignore_index: &ignore_index 100 # Must be greater than the number of character classes
+
+PostProcess:
+ name: ABINetLabelDecode
+
+Metric:
+ name: RecMetric
+ main_indicator: acc
+
+Train:
+ dataset:
+ name: SimpleDataSet
+ data_dir: ./train_data/ic15_data/
+ label_file_list: ["./train_data/ic15_data/rec_gt_train.txt"]
+ transforms:
+ - DecodeImage: # load image
+ img_mode: RGB
+ channel_first: False
+ - ABINetRecAug:
+ - ABINetLabelEncode: # Class handling label
+ ignore_index: *ignore_index
+ - ABINetRecResizeImg:
+ image_shape: [3, 32, 128]
+ padding: False
+ - KeepKeys:
+ keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
+ loader:
+ shuffle: True
+ batch_size_per_card: 96
+ drop_last: True
+ num_workers: 4
+
+Eval:
+ dataset:
+ name: SimpleDataSet
+ data_dir: ./train_data/ic15_data
+ label_file_list: ["./train_data/ic15_data/rec_gt_test.txt"]
+ transforms:
+ - DecodeImage: # load image
+ img_mode: RGB
+ channel_first: False
+ - ABINetLabelEncode: # Class handling label
+ ignore_index: *ignore_index
+ - ABINetRecResizeImg:
+ image_shape: [3, 32, 128]
+ padding: False
+ - KeepKeys:
+ keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
+ loader:
+ shuffle: False
+ drop_last: False
+ batch_size_per_card: 256
+ num_workers: 4
+ use_shared_memory: False
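The `&ignore_index`/`*ignore_index` YAML anchor keeps CELoss and ABINetLabelEncode on the same padding id; choosing 100, above the number of character classes, guarantees padded label positions can never collide with a real class and are excluded from the loss. An illustrative check with Paddle's cross-entropy (the class count of 37 here is an arbitrary example, not ABINet's actual dictionary size):

```python
# Sketch: labels equal to ignore_index contribute nothing to the loss.
import paddle

IGNORE = 100                       # same value the YAML anchor pins
logits = paddle.randn([2, 5, 37])  # batch=2, seq_len=5, 37 example classes
labels = paddle.to_tensor([[3, 9, 12, IGNORE, IGNORE],
                           [7, 1, IGNORE, IGNORE, IGNORE]])
loss_fn = paddle.nn.CrossEntropyLoss(ignore_index=IGNORE)
loss = loss_fn(logits.reshape([-1, 37]), labels.reshape([-1]))
print(float(loss))  # averaged only over the non-padded positions
```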
diff --git a/test_tipc/configs/rec_r45_abinet/train_infer_python.txt b/test_tipc/configs/rec_r45_abinet/train_infer_python.txt
new file mode 100644
index 0000000000000000000000000000000000000000..ecab1bcbbde11fc6d14357b6715033704c2c3316
--- /dev/null
+++ b/test_tipc/configs/rec_r45_abinet/train_infer_python.txt
@@ -0,0 +1,53 @@
+===========================train_params===========================
+model_name:rec_abinet
+python:python3.7
+gpu_list:0|0,1
+Global.use_gpu:True|True
+Global.auto_cast:null
+Global.epoch_num:lite_train_lite_infer=2|whole_train_whole_infer=300
+Global.save_model_dir:./output/
+Train.loader.batch_size_per_card:lite_train_lite_infer=16|whole_train_whole_infer=64
+Global.pretrained_model:null
+train_model_name:latest
+train_infer_img_dir:./inference/rec_inference
+null:null
+##
+trainer:norm_train
+norm_train:tools/train.py -c test_tipc/configs/rec_r45_abinet/rec_r45_abinet.yml -o
+pact_train:null
+fpgm_train:null
+distill_train:null
+null:null
+null:null
+##
+===========================eval_params===========================
+eval:tools/eval.py -c test_tipc/configs/rec_r45_abinet/rec_r45_abinet.yml -o
+null:null
+##
+===========================infer_params===========================
+Global.save_inference_dir:./output/
+Global.checkpoints:
+norm_export:tools/export_model.py -c test_tipc/configs/rec_r45_abinet/rec_r45_abinet.yml -o
+quant_export:null
+fpgm_export:null
+distill_export:null
+export1:null
+export2:null
+##
+train_model:./inference/rec_r45_abinet_train/best_accuracy
+infer_export:tools/export_model.py -c test_tipc/configs/rec_r45_abinet/rec_r45_abinet.yml -o
+infer_quant:False
+inference:tools/infer/predict_rec.py --rec_char_dict_path=./ppocr/utils/ic15_dict.txt --rec_image_shape="3,32,128" --rec_algorithm="ABINet"
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1|6
+--use_tensorrt:False
+--precision:fp32
+--rec_model_dir:
+--image_dir:./inference/rec_inference
+--save_log_path:./test/output/
+--benchmark:True
+null:null
+===========================infer_benchmark_params==========================
+random_infer_input:[{float32,[3,32,128]}]
diff --git a/test_tipc/configs/rec_r50_fpn_vd_none_srn/train_infer_python.txt b/test_tipc/configs/rec_r50_fpn_vd_none_srn/train_infer_python.txt
index 4877512b689ec87b7b2cd0258a2fac706968322b..b5a5286010a5830dc23031b3e0885247fb6ae53f 100644
--- a/test_tipc/configs/rec_r50_fpn_vd_none_srn/train_infer_python.txt
+++ b/test_tipc/configs/rec_r50_fpn_vd_none_srn/train_infer_python.txt
@@ -39,11 +39,11 @@ infer_export:tools/export_model.py -c test_tipc/configs/rec_r50_fpn_vd_none_srn/
infer_quant:False
inference:tools/infer/predict_rec.py --rec_char_dict_path=./ppocr/utils/ic15_dict.txt --rec_image_shape="1,64,256" --rec_algorithm="SRN" --use_space_char=False --min_subgraph_size=3
--use_gpu:True|False
---enable_mkldnn:True|False
---cpu_threads:1|6
+--enable_mkldnn:False
+--cpu_threads:6
--rec_batch_num:1|6
---use_tensorrt:True|False
---precision:fp32|int8
+--use_tensorrt:False
+--precision:fp32
--rec_model_dir:
--image_dir:./inference/rec_inference
--save_log_path:./test/output/
diff --git a/test_tipc/configs/rec_svtrnet/rec_svtrnet.yml b/test_tipc/configs/rec_svtrnet/rec_svtrnet.yml
new file mode 100644
index 0000000000000000000000000000000000000000..140b17e0e79f9895167e9c51d86ced173e44a541
--- /dev/null
+++ b/test_tipc/configs/rec_svtrnet/rec_svtrnet.yml
@@ -0,0 +1,117 @@
+Global:
+ use_gpu: True
+ epoch_num: 20
+ log_smooth_window: 20
+ print_batch_step: 10
+ save_model_dir: ./output/rec/svtr/
+ save_epoch_step: 1
+ # evaluation is run every 2000 iterations after the 0th iteration
+ eval_batch_step: [0, 2000]
+ cal_metric_during_train: True
+ pretrained_model:
+ checkpoints:
+ save_inference_dir:
+ use_visualdl: False
+ infer_img: doc/imgs_words_en/word_10.png
+ # for data or label process
+ character_dict_path:
+ character_type: en
+ max_text_length: 25
+ infer_mode: False
+ use_space_char: False
+ save_res_path: ./output/rec/predicts_svtr_tiny.txt
+
+
+Optimizer:
+ name: AdamW
+ beta1: 0.9
+ beta2: 0.99
+ epsilon: 8.e-8
+ weight_decay: 0.05
+ no_weight_decay_name: norm pos_embed
+ one_dim_param_no_weight_decay: true
+ lr:
+ name: Cosine
+ learning_rate: 0.0005
+ warmup_epoch: 2
+
+Architecture:
+ model_type: rec
+ algorithm: SVTR
+ Transform:
+ name: STN_ON
+ tps_inputsize: [32, 64]
+ tps_outputsize: [32, 100]
+ num_control_points: 20
+ tps_margins: [0.05,0.05]
+ stn_activation: none
+ Backbone:
+ name: SVTRNet
+ img_size: [32, 100]
+ out_char_num: 25
+ out_channels: 192
+ patch_merging: 'Conv'
+ embed_dim: [64, 128, 256]
+ depth: [3, 6, 3]
+ num_heads: [2, 4, 8]
+ mixer: ['Local','Local','Local','Local','Local','Local','Global','Global','Global','Global','Global','Global']
+ local_mixer: [[7, 11], [7, 11], [7, 11]]
+ last_stage: True
+ prenorm: false
+ Neck:
+ name: SequenceEncoder
+ encoder_type: reshape
+ Head:
+ name: CTCHead
+
+Loss:
+ name: CTCLoss
+
+PostProcess:
+ name: CTCLabelDecode
+
+Metric:
+ name: RecMetric
+ main_indicator: acc
+
+Train:
+ dataset:
+ name: SimpleDataSet
+ data_dir: ./train_data/ic15_data/
+ label_file_list: ["./train_data/ic15_data/rec_gt_train.txt"]
+ transforms:
+ - DecodeImage: # load image
+ img_mode: BGR
+ channel_first: False
+ - CTCLabelEncode: # Class handling label
+ - SVTRRecResizeImg:
+ image_shape: [3, 64, 256]
+ padding: False
+ - KeepKeys:
+ keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
+ loader:
+ shuffle: True
+ batch_size_per_card: 512
+ drop_last: True
+ num_workers: 4
+
+Eval:
+ dataset:
+ name: SimpleDataSet
+ data_dir: ./train_data/ic15_data
+ label_file_list: ["./train_data/ic15_data/rec_gt_test.txt"]
+ transforms:
+ - DecodeImage: # load image
+ img_mode: BGR
+ channel_first: False
+ - CTCLabelEncode: # Class handling label
+ - SVTRRecResizeImg:
+ image_shape: [3, 64, 256]
+ padding: False
+ - KeepKeys:
+ keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
+ loader:
+ shuffle: False
+ drop_last: False
+ batch_size_per_card: 256
+ num_workers: 2
diff --git a/test_tipc/configs/rec_svtrnet/train_infer_python.txt b/test_tipc/configs/rec_svtrnet/train_infer_python.txt
new file mode 100644
index 0000000000000000000000000000000000000000..a7e4a24063b2e248f2ab92d5efd257a2837c0a34
--- /dev/null
+++ b/test_tipc/configs/rec_svtrnet/train_infer_python.txt
@@ -0,0 +1,53 @@
+===========================train_params===========================
+model_name:rec_svtrnet
+python:python3.7
+gpu_list:0|0,1
+Global.use_gpu:True|True
+Global.auto_cast:null
+Global.epoch_num:lite_train_lite_infer=2|whole_train_whole_infer=300
+Global.save_model_dir:./output/
+Train.loader.batch_size_per_card:lite_train_lite_infer=16|whole_train_whole_infer=64
+Global.pretrained_model:null
+train_model_name:latest
+train_infer_img_dir:./inference/rec_inference
+null:null
+##
+trainer:norm_train
+norm_train:tools/train.py -c test_tipc/configs/rec_svtrnet/rec_svtrnet.yml -o
+pact_train:null
+fpgm_train:null
+distill_train:null
+null:null
+null:null
+##
+===========================eval_params===========================
+eval:tools/eval.py -c test_tipc/configs/rec_svtrnet/rec_svtrnet.yml -o
+null:null
+##
+===========================infer_params===========================
+Global.save_inference_dir:./output/
+Global.checkpoints:
+norm_export:tools/export_model.py -c test_tipc/configs/rec_svtrnet/rec_svtrnet.yml -o
+quant_export:null
+fpgm_export:null
+distill_export:null
+export1:null
+export2:null
+##
+train_model:./inference/rec_svtrnet_train/best_accuracy
+infer_export:tools/export_model.py -c test_tipc/configs/rec_svtrnet/rec_svtrnet.yml -o
+infer_quant:False
+inference:tools/infer/predict_rec.py --rec_char_dict_path=./ppocr/utils/ic15_dict.txt --rec_image_shape="3,64,256" --rec_algorithm="SVTR"
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1|6
+--use_tensorrt:False
+--precision:fp32
+--rec_model_dir:
+--image_dir:./inference/rec_inference
+--save_log_path:./test/output/
+--benchmark:True
+null:null
+===========================infer_benchmark_params==========================
+random_infer_input:[{float32,[3,64,256]}]
diff --git a/test_tipc/configs/rec_vitstr_none_ce/rec_vitstr_none_ce.yml b/test_tipc/configs/rec_vitstr_none_ce/rec_vitstr_none_ce.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a0aed488755f7cb6fed18a5747e9b7f62f57da86
--- /dev/null
+++ b/test_tipc/configs/rec_vitstr_none_ce/rec_vitstr_none_ce.yml
@@ -0,0 +1,104 @@
+Global:
+ use_gpu: True
+ epoch_num: 20
+ log_smooth_window: 20
+ print_batch_step: 10
+ save_model_dir: ./output/rec/vitstr_none_ce/
+ save_epoch_step: 1
+ # evaluation is run every 2000 iterations after the 0th iteration
+ eval_batch_step: [0, 2000]
+ cal_metric_during_train: True
+ pretrained_model:
+ checkpoints:
+ save_inference_dir:
+ use_visualdl: False
+ infer_img: doc/imgs_words_en/word_10.png
+ # for data or label process
+ character_dict_path: ppocr/utils/EN_symbol_dict.txt
+ max_text_length: 25
+ infer_mode: False
+ use_space_char: False
+ save_res_path: ./output/rec/predicts_vitstr.txt
+
+
+Optimizer:
+ name: Adadelta
+ epsilon: 1.e-8
+ rho: 0.95
+ clip_norm: 5.0
+ lr:
+ learning_rate: 1.0
+
+Architecture:
+ model_type: rec
+ algorithm: ViTSTR
+ in_channels: 1
+ Transform:
+ Backbone:
+ name: ViTSTR
+ Neck:
+ name: SequenceEncoder
+ encoder_type: reshape
+ Head:
+ name: CTCHead
+
+Loss:
+ name: CELoss
+ smoothing: False
+ with_all: True
+ ignore_index: &ignore_index 0 # Must be zero or greater than the number of character classes
+
+PostProcess:
+ name: ViTSTRLabelDecode
+
+Metric:
+ name: RecMetric
+ main_indicator: acc
+
+Train:
+ dataset:
+ name: SimpleDataSet
+ data_dir: ./train_data/ic15_data/
+ label_file_list: ["./train_data/ic15_data/rec_gt_train.txt"]
+ transforms:
+ - DecodeImage: # load image
+ img_mode: BGR
+ channel_first: False
+ - ViTSTRLabelEncode: # Class handling label
+ ignore_index: *ignore_index
+ - GrayRecResizeImg:
+ image_shape: [224, 224] # W H
+ resize_type: PIL # PIL or OpenCV
+ inter_type: 'Image.BICUBIC'
+ scale: false
+ - KeepKeys:
+ keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
+ loader:
+ shuffle: True
+ batch_size_per_card: 48
+ drop_last: True
+ num_workers: 8
+
+Eval:
+ dataset:
+ name: SimpleDataSet
+ data_dir: ./train_data/ic15_data
+ label_file_list: ["./train_data/ic15_data/rec_gt_test.txt"]
+ transforms:
+ - DecodeImage: # load image
+ img_mode: BGR
+ channel_first: False
+ - ViTSTRLabelEncode: # Class handling label
+ ignore_index: *ignore_index
+ - GrayRecResizeImg:
+ image_shape: [224, 224] # W H
+ resize_type: PIL # PIL or OpenCV
+ inter_type: 'Image.BICUBIC'
+ scale: false
+ - KeepKeys:
+ keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
+ loader:
+ shuffle: False
+ drop_last: False
+ batch_size_per_card: 256
+ num_workers: 2
diff --git a/test_tipc/configs/rec_vitstr_none_ce/train_infer_python.txt b/test_tipc/configs/rec_vitstr_none_ce/train_infer_python.txt
new file mode 100644
index 0000000000000000000000000000000000000000..04c5742ea2ddaf01e782d8b39c21bcbcfa0a7ce7
--- /dev/null
+++ b/test_tipc/configs/rec_vitstr_none_ce/train_infer_python.txt
@@ -0,0 +1,53 @@
+===========================train_params===========================
+model_name:rec_vitstr
+python:python3.7
+gpu_list:0|0,1
+Global.use_gpu:True|True
+Global.auto_cast:null
+Global.epoch_num:lite_train_lite_infer=2|whole_train_whole_infer=300
+Global.save_model_dir:./output/
+Train.loader.batch_size_per_card:lite_train_lite_infer=16|whole_train_whole_infer=64
+Global.pretrained_model:null
+train_model_name:latest
+train_infer_img_dir:./inference/rec_inference
+null:null
+##
+trainer:norm_train
+norm_train:tools/train.py -c test_tipc/configs/rec_vitstr_none_ce/rec_vitstr_none_ce.yml -o
+pact_train:null
+fpgm_train:null
+distill_train:null
+null:null
+null:null
+##
+===========================eval_params===========================
+eval:tools/eval.py -c test_tipc/configs/rec_vitstr_none_ce/rec_vitstr_none_ce.yml -o
+null:null
+##
+===========================infer_params===========================
+Global.save_inference_dir:./output/
+Global.checkpoints:
+norm_export:tools/export_model.py -c test_tipc/configs/rec_vitstr_none_ce/rec_vitstr_none_ce.yml -o
+quant_export:null
+fpgm_export:null
+distill_export:null
+export1:null
+export2:null
+##
+train_model:./inference/rec_vitstr_none_ce_train/best_accuracy
+infer_export:tools/export_model.py -c test_tipc/configs/rec_vitstr_none_ce/rec_vitstr_none_ce.yml -o
+infer_quant:False
+inference:tools/infer/predict_rec.py --rec_char_dict_path=./ppocr/utils/EN_symbol_dict.txt --rec_image_shape="1,224,224" --rec_algorithm="ViTSTR"
+--use_gpu:True|False
+--enable_mkldnn:False
+--cpu_threads:6
+--rec_batch_num:1|6
+--use_tensorrt:False
+--precision:fp32
+--rec_model_dir:
+--image_dir:./inference/rec_inference
+--save_log_path:./test/output/
+--benchmark:True
+null:null
+===========================infer_benchmark_params==========================
+random_infer_input:[{float32,[1,224,224]}]
diff --git a/test_tipc/docs/jeston_test_train_inference_python.md b/test_tipc/docs/jeston_test_train_inference_python.md
index 9e9d15fb674ca04558b1f8cb616dc4e44934dbb9..b25175ed0071dd3728ae22c7588ca20535af0505 100644
--- a/test_tipc/docs/jeston_test_train_inference_python.md
+++ b/test_tipc/docs/jeston_test_train_inference_python.md
@@ -115,4 +115,4 @@ ValueError: The results of python_infer_gpu_usetrt_True_precision_fp32_batchsize
## 3. More Tutorials
This document is for functional testing. For more complete tutorials on training and prediction, please refer to:
[Model Training](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/doc/doc_ch/training.md)
-[Python Inference Engine Prediction](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/doc/doc_ch/inference.md)
+[Python Inference Engine Prediction](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/doc/doc_ch/inference_ppocr.md)
diff --git a/test_tipc/docs/mac_test_train_inference_python.md b/test_tipc/docs/mac_test_train_inference_python.md
index ea6e0218b12b342347677ea512d3afd89053261a..c37291a8fc9b239564adce8f556565f51f2a9475 100644
--- a/test_tipc/docs/mac_test_train_inference_python.md
+++ b/test_tipc/docs/mac_test_train_inference_python.md
@@ -152,4 +152,4 @@ ValueError: The results of python_infer_cpu_usemkldnn_False_threads_1_batchsize_
## 3. More Tutorials
This document is for functional testing. For more complete tutorials on training and prediction, please refer to:
[Model Training](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/doc/doc_ch/training.md)
-[Python Inference Engine Prediction](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/doc/doc_ch/inference.md)
+[Python Inference Engine Prediction](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/doc/doc_ch/inference_ppocr.md)
diff --git a/test_tipc/docs/test_serving.md b/test_tipc/docs/test_serving.md
index 8600eff3b95b1fec7d0519c1d26cf2a3232b59e5..71f01c0d5ff47004d70baa17b404c10714a6fb64 100644
--- a/test_tipc/docs/test_serving.md
+++ b/test_tipc/docs/test_serving.md
@@ -1,6 +1,6 @@
# PaddleServing Prediction Functional Testing
-The main program for PaddleServing prediction functional testing is `test_serving.sh`, which tests PaddleServing-based deployment.
+The main programs for PaddleServing prediction functional testing are `test_serving_infer_python.sh` and `test_serving_infer_cpp.sh`, which test PaddleServing-based deployment.
## 1. Summary of Test Conclusions
@@ -17,13 +17,23 @@ The main program for PaddleServing prediction functional testing is `test_servi
For environment setup, please refer to the [documentation](./install.md) to configure the TIPC runtime environment.
### 2.1 Functional Testing
-First run `prepare.sh` to prepare the data and models, then run `test_serving.sh` for testing; log files with the `serving_infer_*.log` suffix are finally generated under the ```test_tipc/output``` directory.
+**python serving**
+First run `prepare.sh` to prepare the data and models, then run `test_serving_infer_python.sh` for testing; log files with the `serving_infer_python*.log` suffix are finally generated under the ```test_tipc/output``` directory.
```shell
bash test_tipc/prepare.sh ./test_tipc/configs/ch_ppocr_mobile_v2.0_det/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt "serving_infer"
# Usage:
-bash test_tipc/test_serving.sh ./test_tipc/configs/ch_ppocr_mobile_v2.0_det/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt
+bash test_tipc/test_serving_infer_python.sh ./test_tipc/configs/ch_ppocr_mobile_v2.0_det/model_linux_gpu_normal_normal_serving_python_linux_gpu_cpu.txt "serving_infer"
+```
+**cpp serving**
+First run `prepare.sh` to prepare the data and models, then run `test_serving_infer_cpp.sh` for testing; log files with the `serving_infer_cpp*.log` suffix are finally generated under the ```test_tipc/output``` directory.
+
+```shell
+bash test_tipc/prepare.sh ./test_tipc/configs/ch_ppocr_mobile_v2.0/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt "serving_infer"
+
+# Usage:
+bash test_tipc/test_serving_infer_cpp.sh ./test_tipc/configs/ch_ppocr_mobile_v2.0/model_linux_gpu_normal_normal_serving_cpp_linux_gpu_cpu.txt "serving_infer"
```
#### Run Results
diff --git a/test_tipc/docs/test_train_fleet_inference_python.md b/test_tipc/docs/test_train_fleet_inference_python.md
new file mode 100644
index 0000000000000000000000000000000000000000..9fddb5d1634b452f1906a83bca4157dbaec47c81
--- /dev/null
+++ b/test_tipc/docs/test_train_fleet_inference_python.md
@@ -0,0 +1,107 @@
+# Linux GPU/CPU Multi-Node Multi-GPU Training and Inference Testing
+
+The main program for Linux GPU/CPU multi-node multi-GPU training and inference testing is `test_train_inference_python.sh`, which covers basic Python-based functions such as model training, evaluation, and inference.
+
+## 1. Summary of Test Conclusions
+
+- Training:
+
+| Algorithm | Model | Multi-Node Multi-GPU |
+| :----: | :----: | :----: |
+| PP-OCRv3 | ch_PP-OCRv3_rec | distributed training |
+
+
+- Inference:
+
+| Algorithm | Model | device_CPU | device_GPU | batchsize |
+| :----: | :----: | :----: | :----: | :----: |
+| PP-OCRv3 | ch_PP-OCRv3_rec | supported | - | 1/6 |
+
+
+## 2. Test Procedure
+
+For environment setup, please refer to the [documentation](./install.md) to configure the TIPC runtime environment.
+
+### 2.1 Functional Testing
+
+#### 2.1.1 Modify the Configuration File
+
+First, modify the `ip` setting in the configuration file: assuming the `ip` addresses of the two machines are `192.168.0.1` and `192.168.0.2`, the `gpu_list` field of the corresponding configuration file needs to be changed to `gpu_list:192.168.0.1,192.168.0.2;0,1`. The `ip` address can be looked up with the `ifconfig` command.
+
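+For example, with the two machines above, the modified line in the configuration file reads:
+
+```
+gpu_list:192.168.0.1,192.168.0.2;0,1
+```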
+
+#### 2.1.2 Prepare the Data
+
+Run `prepare.sh` to prepare the data and models. Taking the configuration file `test_tipc/configs/ch_PP-OCRv3_rec/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt` as an example, the data preparation command is shown below.
+
+```shell
+bash test_tipc/prepare.sh test_tipc/configs/ch_PP-OCRv3_rec/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt lite_train_lite_infer
+```
+
+**Note:** Since this is multi-node training, the above command needs to be run on every node to prepare the data.
+
+#### 2.1.3 Modify the Starting Port and Start Testing
+
+On the nodes, use the command below to set the starting port for distributed training (otherwise later runs will hang because no usable port can be found). A value between `10000` and `20000` is generally recommended.
+
+```shell
+export FLAGS_START_PORT=17000
+```
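+
+Before starting the test, it can help to confirm that the chosen port is actually free on every node; a minimal sketch, assuming the `ss` utility from iproute2 is available:
+
+```shell
+# Hypothetical check: list TCP listeners on the chosen starting port; no output means the port is free
+ss -lnt | grep ":${FLAGS_START_PORT} "
+```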
+
+Taking the configuration file `test_tipc/configs/ch_PP-OCRv3_rec/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt` as an example, the test is run as shown below.
+
+```shell
+bash test_tipc/test_train_inference_python.sh test_tipc/configs/ch_PP-OCRv3_rec/train_linux_gpu_fleet_normal_infer_python_linux_gpu_cpu.txt lite_train_lite_infer
+```
+
+**Note:** Since this is multi-node training, the above command needs to be run on every node for testing.
+
+
+#### 2.1.4 Output
+
+Output like the following indicates the command ran successfully.
+
+```bash
+ Run successfully with command - ch_PP-OCRv3_rec - python3.7 -m paddle.distributed.launch --ips=192.168.0.1,192.168.0.2 --gpus=0,1 tools/train.py -c test_tipc/configs/ch_PP-OCRv3_rec/ch_PP-OCRv3_rec_distillation.yml -o Global.use_gpu=True Global.save_model_dir=./test_tipc/output/ch_PP-OCRv3_rec/lite_train_lite_infer/norm_train_gpus_0,1_autocast_fp32_nodes_2 Global.epoch_num=3 Global.auto_cast=fp32 Train.loader.batch_size_per_card=16 !
+ ......
+ Run successfully with command - ch_PP-OCRv3_rec - python3.7 tools/infer/predict_rec.py --rec_image_shape="3,48,320" --use_gpu=False --enable_mkldnn=False --cpu_threads=6 --rec_model_dir=./test_tipc/output/ch_PP-OCRv3_rec/lite_train_lite_infer/norm_train_gpus_0,1_autocast_fp32_nodes_2/Student --rec_batch_num=1 --image_dir=./inference/rec_inference --benchmark=True --precision=fp32 > ./test_tipc/output/ch_PP-OCRv3_rec/lite_train_lite_infer/python_infer_cpu_usemkldnn_False_threads_6_precision_fp32_batchsize_1.log 2>&1 !
+```
+
+When the benchmark parameter is enabled, detailed test data can be obtained, including runtime environment information (OS version, CUDA version, CUDNN version, driver version), Paddle version information, parameter settings (runtime device, number of threads, whether memory optimization is enabled, etc.), model information (model name, precision), data information (batchsize, whether shapes are dynamic, etc.), and performance information (CPU/GPU usage, total runtime, preprocessing time, inference time, postprocessing time), as shown below:
+
+```
+[2022/06/02 22:53:35] ppocr INFO:
+
+[2022/06/02 22:53:35] ppocr INFO: ---------------------- Env info ----------------------
+[2022/06/02 22:53:35] ppocr INFO: OS_version: Ubuntu 16.04
+[2022/06/02 22:53:35] ppocr INFO: CUDA_version: 10.1.243
+[2022/06/02 22:53:35] ppocr INFO: CUDNN_version: 7.6.5
+[2022/06/02 22:53:35] ppocr INFO: drivier_version: 460.32.03
+[2022/06/02 22:53:35] ppocr INFO: ---------------------- Paddle info ----------------------
+[2022/06/02 22:53:35] ppocr INFO: paddle_version: 2.3.0-rc0
+[2022/06/02 22:53:35] ppocr INFO: paddle_commit: 5d4980c052583fec022812d9c29460aff7cdc18b
+[2022/06/02 22:53:35] ppocr INFO: log_api_version: 1.0
+[2022/06/02 22:53:35] ppocr INFO: ----------------------- Conf info -----------------------
+[2022/06/02 22:53:35] ppocr INFO: runtime_device: cpu
+[2022/06/02 22:53:35] ppocr INFO: ir_optim: True
+[2022/06/02 22:53:35] ppocr INFO: enable_memory_optim: True
+[2022/06/02 22:53:35] ppocr INFO: enable_tensorrt: False
+[2022/06/02 22:53:35] ppocr INFO: enable_mkldnn: False
+[2022/06/02 22:53:35] ppocr INFO: cpu_math_library_num_threads: 6
+[2022/06/02 22:53:35] ppocr INFO: ----------------------- Model info ----------------------
+[2022/06/02 22:53:35] ppocr INFO: model_name: rec
+[2022/06/02 22:53:35] ppocr INFO: precision: fp32
+[2022/06/02 22:53:35] ppocr INFO: ----------------------- Data info -----------------------
+[2022/06/02 22:53:35] ppocr INFO: batch_size: 1
+[2022/06/02 22:53:35] ppocr INFO: input_shape: dynamic
+[2022/06/02 22:53:35] ppocr INFO: data_num: 6
+[2022/06/02 22:53:35] ppocr INFO: ----------------------- Perf info -----------------------
+[2022/06/02 22:53:35] ppocr INFO: cpu_rss(MB): 288.957, gpu_rss(MB): None, gpu_util: None%
+[2022/06/02 22:53:35] ppocr INFO: total time spent(s): 0.4824
+[2022/06/02 22:53:35] ppocr INFO: preprocess_time(ms): 0.1136, inference_time(ms): 79.5877, postprocess_time(ms): 0.6945
+```
+
+This information can be found in the run log. Taking `ch_PP-OCRv3_rec` above as an example, the log is located at `./test_tipc/output/ch_PP-OCRv3_rec/lite_train_lite_infer/results_python.log`.
+
+If a run fails, the failure log and the corresponding command are also printed to the terminal; the cause of the failure can be analyzed from that command.
+
+**Note:** In distributed training the model is saved only on the node where `trainer_id=0`, so model export and inference on the other nodes will report errors; this is expected behavior.
diff --git a/test_tipc/docs/test_train_inference_python.md b/test_tipc/docs/test_train_inference_python.md
index fa969cbe1b9b6fd524efdaad5002afcfcf40e119..99de9400797493f429f8176a9b6b374a76df4872 100644
--- a/test_tipc/docs/test_train_inference_python.md
+++ b/test_tipc/docs/test_train_inference_python.md
@@ -153,4 +153,4 @@ python3.7 test_tipc/compare_results.py --gt_file=./test_tipc/results/python_*.tx
## 3. More Tutorials
This document is for functional testing. For more complete tutorials on training and prediction, please refer to:
[Model Training](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/doc/doc_ch/training.md)
-[Python Inference Engine Prediction](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/doc/doc_ch/inference.md)
+[Python Inference Engine Prediction](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/doc/doc_ch/inference_ppocr.md)
diff --git a/test_tipc/docs/win_test_train_inference_python.md b/test_tipc/docs/win_test_train_inference_python.md
index 95585af0380be410c230a799b7d61e607de5f654..6e3ce93bb3123133075b9d65c64850a87de5f828 100644
--- a/test_tipc/docs/win_test_train_inference_python.md
+++ b/test_tipc/docs/win_test_train_inference_python.md
@@ -156,4 +156,4 @@ ValueError: The results of python_infer_cpu_usemkldnn_False_threads_1_batchsize_
## 3. More Tutorials
This document is for functional testing. For more complete tutorials on training and prediction, please refer to:
[Model Training](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/doc/doc_ch/training.md)
-[Python Inference Engine Prediction](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/doc/doc_ch/inference.md)
+[Python Inference Engine Prediction](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/doc/doc_ch/inference_ppocr.md)
diff --git a/test_tipc/prepare.sh b/test_tipc/prepare.sh
index 6a8983009e527b8a59b41c1d9b950e8e3f349ef2..32df8e786568395d02d50d7ae238630fae2e5fcc 100644
--- a/test_tipc/prepare.sh
+++ b/test_tipc/prepare.sh
@@ -44,30 +44,52 @@ if [ ${MODE} = "lite_train_lite_infer" ];then
# pretrain lite train data
wget -nc -P ./pretrain_models/ https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/MobileNetV3_large_x0_5_pretrained.pdparams --no-check-certificate
wget -nc -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_mv3_db_v2.0_train.tar --no-check-certificate
- if [[ ${model_name} =~ "PPOCRv2_det" ]];then
+ if [[ ${model_name} =~ "ch_PP-OCRv2_det" ]];then
wget -nc -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_distill_train.tar --no-check-certificate
cd ./pretrain_models/ && tar xf ch_PP-OCRv2_det_distill_train.tar && cd ../
fi
+ if [[ ${model_name} =~ "ch_PP-OCRv3_det" ]];then
+ wget -nc -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_distill_train.tar --no-check-certificate
+ cd ./pretrain_models/ && tar xf ch_PP-OCRv3_det_distill_train.tar && cd ../
+ fi
+ if [ ${model_name} == "en_table_structure" ];then
+ wget -nc -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/dygraph_v2.1/table/en_ppocr_mobile_v2.0_table_structure_train.tar --no-check-certificate
+ cd ./pretrain_models/ && tar xf en_ppocr_mobile_v2.0_table_structure_train.tar && cd ../
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_det_infer.tar --no-check-certificate
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_rec_infer.tar --no-check-certificate
+ cd ./inference/ && tar xf en_ppocr_mobile_v2.0_table_det_infer.tar && tar xf en_ppocr_mobile_v2.0_table_rec_infer.tar && cd ../
+ fi
+ if [[ ${model_name} =~ "det_r50_db++" ]];then
+ wget -nc -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/ResNet50_dcn_asf_synthtext_pretrained.pdparams --no-check-certificate
+ fi
cd ./pretrain_models/ && tar xf det_mv3_db_v2.0_train.tar && cd ../
rm -rf ./train_data/icdar2015
rm -rf ./train_data/ic15_data
+ rm -rf ./train_data/pubtabnet
wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/icdar2015_lite.tar --no-check-certificate
wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ic15_data.tar --no-check-certificate
+ wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dataset/pubtabnet.tar --no-check-certificate
wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/rec_inference.tar --no-check-certificate
wget -nc -P ./deploy/slim/prune https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/sen.pickle --no-check-certificate
- cd ./train_data/ && tar xf icdar2015_lite.tar && tar xf ic15_data.tar
+ cd ./train_data/ && tar xf icdar2015_lite.tar && tar xf ic15_data.tar && tar xf pubtabnet.tar
ln -s ./icdar2015_lite ./icdar2015
+ wget -nc -P ./ic15_data/ https://paddleocr.bj.bcebos.com/dataset/rec_gt_train_lite.txt --no-check-certificate
+ wget -nc -P ./ic15_data/ https://paddleocr.bj.bcebos.com/dataset/rec_gt_test_lite.txt --no-check-certificate
cd ../
cd ./inference && tar xf rec_inference.tar && cd ../
- if [ ${model_name} == "ch_PPOCRv2_det" ] || [ ${model_name} == "ch_PPOCRv2_det_PACT" ]; then
+ if [ ${model_name} == "ch_PP-OCRv2_det" ] || [ ${model_name} == "ch_PP-OCRv2_det_PACT" ]; then
wget -nc -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_train.tar --no-check-certificate
cd ./pretrain_models/ && tar xf ch_ppocr_server_v2.0_det_train.tar && cd ../
fi
- if [ ${model_name} == "ch_PPOCRv2_rec" ] || [ ${model_name} == "ch_PPOCRv2_rec_PACT" ]; then
+ if [ ${model_name} == "ch_PP-OCRv2_rec" ] || [ ${model_name} == "ch_PP-OCRv2_rec_PACT" ]; then
wget -nc -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_train.tar --no-check-certificate
cd ./pretrain_models/ && tar xf ch_PP-OCRv2_rec_train.tar && cd ../
fi
+ if [ ${model_name} == "ch_PP-OCRv3_rec" ] || [ ${model_name} == "ch_PP-OCRv3_rec_PACT" ]; then
+ wget -nc -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_train.tar --no-check-certificate
+ cd ./pretrain_models/ && tar xf ch_PP-OCRv3_rec_train.tar && cd ../
+ fi
if [ ${model_name} == "det_r18_db_v2_0" ]; then
wget -nc -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/pretrained/ResNet18_vd_pretrained.pdparams --no-check-certificate
fi
@@ -79,8 +101,10 @@ if [ ${MODE} = "lite_train_lite_infer" ];then
fi
if [ ${model_name} == "det_r50_vd_sast_icdar15_v2.0" ] || [ ${model_name} == "det_r50_vd_sast_totaltext_v2.0" ]; then
wget -nc -P ./pretrain_models/ https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/ResNet50_vd_ssld_pretrained.pdparams --no-check-certificate
+ wget -nc -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_sast_icdar15_v2.0_train.tar --no-check-certificate
wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/total_text_lite.tar --no-check-certificate
cd ./train_data && tar xf total_text_lite.tar && ln -s total_text_lite total_text && cd ../
+ cd ./pretrain_models && tar xf det_r50_vd_sast_icdar15_v2.0_train.tar && cd ../
fi
if [ ${model_name} == "det_mv3_db_v2_0" ]; then
wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_mv3_db_v2.0_train.tar --no-check-certificate
@@ -104,13 +128,22 @@ elif [ ${MODE} = "whole_train_whole_infer" ];then
wget -nc -P ./pretrain_models/ https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/MobileNetV3_large_x0_5_pretrained.pdparams --no-check-certificate
rm -rf ./train_data/icdar2015
rm -rf ./train_data/ic15_data
+ rm -rf ./train_data/pubtabnet
wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/icdar2015.tar --no-check-certificate
wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ic15_data.tar --no-check-certificate
- cd ./train_data/ && tar xf icdar2015.tar && tar xf ic15_data.tar && cd ../
- if [ ${model_name} == "ch_PPOCRv2_det" ]; then
+ wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dataset/pubtabnet.tar --no-check-certificate
+ cd ./train_data/ && tar xf icdar2015.tar && tar xf ic15_data.tar && tar xf pubtabnet.tar
+ wget -nc -P ./ic15_data/ https://paddleocr.bj.bcebos.com/dataset/rec_gt_train_lite.txt --no-check-certificate
+ wget -nc -P ./ic15_data/ https://paddleocr.bj.bcebos.com/dataset/rec_gt_test_lite.txt --no-check-certificate
+ cd ../
+ if [ ${model_name} == "ch_PP-OCRv2_det" ]; then
wget -nc -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_distill_train.tar --no-check-certificate
cd ./pretrain_models/ && tar xf ch_PP-OCRv2_det_distill_train.tar && cd ../
fi
+ if [ ${model_name} == "ch_PP-OCRv3_det" ]; then
+ wget -nc -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_distill_train.tar --no-check-certificate
+ cd ./pretrain_models/ && tar xf ch_PP-OCRv3_det_distill_train.tar && cd ../
+ fi
if [ ${model_name} == "en_server_pgnetA" ]; then
wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/total_text_lite.tar --no-check-certificate
wget -nc -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/pgnet/en_server_pgnetA.tar --no-check-certificate
@@ -122,19 +155,41 @@ elif [ ${MODE} = "whole_train_whole_infer" ];then
wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/total_text_lite.tar --no-check-certificate
cd ./train_data && tar xf total_text.tar && ln -s total_text_lite total_text && cd ../
fi
+ if [[ ${model_name} =~ "en_table_structure" ]];then
+ wget -nc -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/dygraph_v2.1/table/en_ppocr_mobile_v2.0_table_structure_train.tar --no-check-certificate
+ cd ./pretrain_models/ && tar xf en_ppocr_mobile_v2.0_table_structure_train.tar && cd ../
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_det_infer.tar --no-check-certificate
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_rec_infer.tar --no-check-certificate
+ cd ./inference/ && tar xf en_ppocr_mobile_v2.0_table_det_infer.tar && tar xf en_ppocr_mobile_v2.0_table_rec_infer.tar && cd ../
+ fi
elif [ ${MODE} = "lite_train_whole_infer" ];then
wget -nc -P ./pretrain_models/ https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/MobileNetV3_large_x0_5_pretrained.pdparams --no-check-certificate
rm -rf ./train_data/icdar2015
rm -rf ./train_data/ic15_data
+ rm -rf ./train_data/pubtabnet
wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/icdar2015_infer.tar --no-check-certificate
wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ic15_data.tar --no-check-certificate
- cd ./train_data/ && tar xf icdar2015_infer.tar && tar xf ic15_data.tar
+ wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dataset/pubtabnet.tar --no-check-certificate
+ cd ./train_data/ && tar xf icdar2015_infer.tar && tar xf ic15_data.tar && tar xf pubtabnet.tar
ln -s ./icdar2015_infer ./icdar2015
+ wget -nc -P ./ic15_data/ https://paddleocr.bj.bcebos.com/dataset/rec_gt_train_lite.txt --no-check-certificate
+ wget -nc -P ./ic15_data/ https://paddleocr.bj.bcebos.com/dataset/rec_gt_test_lite.txt --no-check-certificate
cd ../
- if [ ${model_name} == "ch_PPOCRv2_det" ]; then
+ if [ ${model_name} == "ch_PP-OCRv2_det" ]; then
wget -nc -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_distill_train.tar --no-check-certificate
cd ./pretrain_models/ && tar xf ch_PP-OCRv2_det_distill_train.tar && cd ../
fi
+ if [ ${model_name} == "ch_PP-OCRv3_det" ]; then
+ wget -nc -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_distill_train.tar --no-check-certificate
+ cd ./pretrain_models/ && tar xf ch_PP-OCRv3_det_distill_train.tar && cd ../
+ fi
+ if [[ ${model_name} =~ "en_table_structure" ]];then
+ wget -nc -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/dygraph_v2.1/table/en_ppocr_mobile_v2.0_table_structure_train.tar --no-check-certificate
+ cd ./pretrain_models/ && tar xf en_ppocr_mobile_v2.0_table_structure_train.tar && cd ../
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_det_infer.tar --no-check-certificate
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_rec_infer.tar --no-check-certificate
+ cd ./inference/ && tar xf en_ppocr_mobile_v2.0_table_det_infer.tar && tar xf en_ppocr_mobile_v2.0_table_rec_infer.tar && cd ../
+ fi
elif [ ${MODE} = "whole_infer" ];then
wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate
wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/rec_inference.tar --no-check-certificate
@@ -169,17 +224,42 @@ elif [ ${MODE} = "whole_infer" ];then
wget -nc -P ./inference https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar --no-check-certificate
cd ./inference && tar xf ${eval_model_name}.tar && cd ../
fi
- if [[ ${model_name} =~ "ch_PPOCRv2_det" ]]; then
+ if [[ ${model_name} =~ "ch_PP-OCRv2" ]]; then
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar --no-check-certificate
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_PP-OCRv2_det_infer.tar && tar xf ch_PP-OCRv2_rec_infer.tar && tar xf ch_det_data_50.tar && cd ../
+ fi
+ if [[ ${model_name} =~ "ch_PP-OCRv3" ]]; then
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar --no-check-certificate
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_PP-OCRv3_det_infer.tar && tar xf ch_PP-OCRv3_rec_infer.tar && tar xf ch_det_data_50.tar && cd ../
+ fi
+ if [[ ${model_name} =~ "ch_PP-OCRv2_det" ]]; then
eval_model_name="ch_PP-OCRv2_det_infer"
wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar --no-check-certificate
cd ./inference && tar xf ${eval_model_name}.tar && tar xf ch_det_data_50.tar && cd ../
fi
- if [[ ${model_name} =~ "PPOCRv2_ocr_rec" ]]; then
+ if [[ ${model_name} =~ "ch_PP-OCRv3_det" ]]; then
+ eval_model_name="ch_PP-OCRv3_det_infer"
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar --no-check-certificate
+ cd ./inference && tar xf ${eval_model_name}.tar && tar xf ch_det_data_50.tar && cd ../
+ fi
+ if [[ ${model_name} =~ "ch_PP-OCRv2_rec" ]]; then
eval_model_name="ch_PP-OCRv2_rec_infer"
wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar --no-check-certificate
wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_slim_quant_infer.tar --no-check-certificate
cd ./inference && tar xf ${eval_model_name}.tar && tar xf ch_PP-OCRv2_rec_slim_quant_infer.tar && cd ../
fi
+ if [[ ${model_name} =~ "ch_PP-OCRv3_rec" ]]; then
+ eval_model_name="ch_PP-OCRv3_rec_infer"
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar --no-check-certificate
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_slim_infer.tar --no-check-certificate
+ cd ./inference && tar xf ${eval_model_name}.tar && tar xf ch_PP-OCRv3_rec_slim_infer.tar && cd ../
+ fi
+ if [[ ${model_name} == "ch_PP-OCRv3_rec_PACT" ]]; then
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_slim_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_PP-OCRv3_rec_slim_infer.tar && cd ../
+ fi
if [ ${model_name} == "en_server_pgnetA" ]; then
wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/pgnet/en_server_pgnetA.tar --no-check-certificate
cd ./inference && tar xf en_server_pgnetA.tar && tar xf ch_det_data_50.tar && cd ../
@@ -269,9 +349,15 @@ elif [ ${MODE} = "whole_infer" ];then
wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_east_v2.0_train.tar --no-check-certificate
cd ./inference/ && tar xf det_r50_vd_east_v2.0_train.tar && cd ../
fi
+ if [[ ${model_name} =~ "en_table_structure" ]];then
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_structure_infer.tar --no-check-certificate
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_det_infer.tar --no-check-certificate
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_rec_infer.tar --no-check-certificate
+ cd ./inference/ && tar xf en_ppocr_mobile_v2.0_table_structure_infer.tar && tar xf en_ppocr_mobile_v2.0_table_det_infer.tar && tar xf en_ppocr_mobile_v2.0_table_rec_infer.tar && cd ../
+ fi
fi
-if [ ${MODE} = "klquant_whole_infer" ]; then
+if [[ ${model_name} =~ "KL" ]]; then
wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/icdar2015_lite.tar --no-check-certificate
cd ./train_data/ && tar xf icdar2015_lite.tar && rm -rf ./icdar2015 && ln -s ./icdar2015_lite ./icdar2015 && cd ../
if [ ${model_name} = "ch_ppocr_mobile_v2.0_det_KL" ]; then
@@ -279,41 +365,151 @@ if [ ${MODE} = "klquant_whole_infer" ]; then
wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate
cd ./inference && tar xf ch_ppocr_mobile_v2.0_det_infer.tar && tar xf ch_det_data_50.tar && cd ../
fi
- if [ ${model_name} = "PPOCRv2_ocr_rec_kl" ]; then
+ if [ ${model_name} = "ch_PP-OCRv2_rec_KL" ]; then
wget -nc -P ./inference https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar --no-check-certificate
wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/rec_inference.tar --no-check-certificate
wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ic15_data.tar --no-check-certificate
cd ./train_data/ && tar xf ic15_data.tar && cd ../
cd ./inference && tar xf rec_inference.tar && tar xf ch_PP-OCRv2_rec_infer.tar && cd ../
fi
- if [ ${model_name} = "PPOCRv2_ocr_det_kl" ]; then
+ if [ ${model_name} = "ch_PP-OCRv3_rec_KL" ]; then
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar --no-check-certificate
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/rec_inference.tar --no-check-certificate
+ wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ic15_data.tar --no-check-certificate
+ cd ./train_data/ && tar xf ic15_data.tar
+ wget -nc -P ./ic15_data/ https://paddleocr.bj.bcebos.com/dataset/rec_gt_train_lite.txt --no-check-certificate
+ wget -nc -P ./ic15_data/ https://paddleocr.bj.bcebos.com/dataset/rec_gt_test_lite.txt --no-check-certificate
+ cd ../
+ cd ./inference && tar xf rec_inference.tar && tar xf ch_PP-OCRv3_rec_infer.tar && cd ../
+ fi
+ if [ ${model_name} = "ch_PP-OCRv2_det_KL" ]; then
wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate
wget -nc -P ./inference https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar --no-check-certificate
cd ./inference && tar xf ch_PP-OCRv2_det_infer.tar && tar xf ch_det_data_50.tar && cd ../
fi
+ if [ ${model_name} = "ch_PP-OCRv3_det_KL" ]; then
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_PP-OCRv3_det_infer.tar && tar xf ch_det_data_50.tar && cd ../
+ fi
if [ ${model_name} = "ch_ppocr_mobile_v2.0_rec_KL" ]; then
wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar --no-check-certificate
wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/rec_inference.tar --no-check-certificate
wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ic15_data.tar --no-check-certificate
cd ./train_data/ && tar xf ic15_data.tar && cd ../
cd ./inference && tar xf ch_ppocr_mobile_v2.0_rec_infer.tar && tar xf rec_inference.tar && cd ../
- fi
+ fi
+ if [ ${model_name} = "en_table_structure_KL" ];then
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_structure_infer.tar --no-check-certificate
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_det_infer.tar --no-check-certificate
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/table/en_ppocr_mobile_v2.0_table_rec_infer.tar --no-check-certificate
+ wget -nc -P ./train_data/ https://paddleocr.bj.bcebos.com/dataset/pubtabnet.tar --no-check-certificate
+ cd ./inference/ && tar xf en_ppocr_mobile_v2.0_table_structure_infer.tar && tar xf en_ppocr_mobile_v2.0_table_det_infer.tar && tar xf en_ppocr_mobile_v2.0_table_rec_infer.tar && cd ../
+ cd ./train_data/ && tar xf pubtabnet.tar && cd ../
+ fi
fi
if [ ${MODE} = "cpp_infer" ];then
- if [ ${model_name} = "ocr_det" ]; then
+ if [ ${model_name} = "ch_ppocr_mobile_v2.0_det" ]; then
wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate
wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar --no-check-certificate
cd ./inference && tar xf ch_ppocr_mobile_v2.0_det_infer.tar && tar xf ch_det_data_50.tar && cd ../
+ elif [ ${model_name} = "ch_ppocr_mobile_v2.0_det_KL" ]; then
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/tipc_fake_model/ch_ppocr_mobile_v2.0_det_klquant_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_ppocr_mobile_v2.0_det_klquant_infer.tar && tar xf ch_det_data_50.tar && cd ../
+ elif [ ${model_name} = "ch_ppocr_mobile_v2.0_det_PACT" ]; then
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/tipc_fake_model/ch_ppocr_mobile_v2.0_det_pact_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_ppocr_mobile_v2.0_det_pact_infer.tar && tar xf ch_det_data_50.tar && cd ../
elif [ ${model_name} = "ch_ppocr_mobile_v2.0_rec" ]; then
wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/rec_inference.tar --no-check-certificate
wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar --no-check-certificate
cd ./inference && tar xf ch_ppocr_mobile_v2.0_rec_infer.tar && tar xf rec_inference.tar && cd ../
- elif [ ${model_name} = "ocr_system" ]; then
+ elif [ ${model_name} = "ch_ppocr_mobile_v2.0_rec_KL" ]; then
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/rec_inference.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/tipc_fake_model/ch_ppocr_mobile_v2.0_rec_klquant_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_ppocr_mobile_v2.0_rec_klquant_infer.tar && tar xf rec_inference.tar && cd ../
+ elif [ ${model_name} = "ch_ppocr_mobile_v2.0_rec_PACT" ]; then
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/rec_inference.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/tipc_fake_model/ch_ppocr_mobile_v2.0_rec_pact_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_ppocr_mobile_v2.0_rec_pact_infer.tar && tar xf rec_inference.tar && cd ../
+ elif [ ${model_name} = "ch_ppocr_server_v2.0_det" ]; then
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_ppocr_server_v2.0_det_infer.tar && tar xf ch_det_data_50.tar && cd ../
+ elif [ ${model_name} = "ch_ppocr_server_v2.0_rec" ]; then
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/rec_inference.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_ppocr_server_v2.0_rec_infer.tar && tar xf rec_inference.tar && cd ../
+ elif [ ${model_name} = "ch_PP-OCRv2_det" ]; then
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_PP-OCRv2_det_infer.tar && tar xf ch_det_data_50.tar && cd ../
+ elif [ ${model_name} = "ch_PP-OCRv2_det_KL" ]; then
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/tipc_fake_model/ch_PP-OCRv2_det_klquant_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_PP-OCRv2_det_klquant_infer.tar && tar xf ch_det_data_50.tar && cd ../
+ elif [ ${model_name} = "ch_PP-OCRv2_det_PACT" ]; then
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/tipc_fake_model/ch_PP-OCRv2_det_pact_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_PP-OCRv2_det_pact_infer.tar && tar xf ch_det_data_50.tar && cd ../
+ elif [ ${model_name} = "ch_PP-OCRv2_rec" ]; then
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/rec_inference.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_PP-OCRv2_rec_infer.tar && tar xf rec_inference.tar && cd ../
+ elif [ ${model_name} = "ch_PP-OCRv2_rec_KL" ]; then
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/rec_inference.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/tipc_fake_model/ch_PP-OCRv2_rec_klquant_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_PP-OCRv2_rec_klquant_infer.tar && tar xf rec_inference.tar && cd ../
+ elif [ ${model_name} = "ch_PP-OCRv2_rec_PACT" ]; then
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/rec_inference.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/tipc_fake_model/ch_PP-OCRv2_rec_pact_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_PP-OCRv2_rec_pact_infer.tar && tar xf rec_inference.tar && cd ../
+ elif [ ${model_name} = "ch_PP-OCRv3_det" ]; then
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_PP-OCRv3_det_infer.tar && tar xf ch_det_data_50.tar && cd ../
+ elif [ ${model_name} = "ch_PP-OCRv3_det_KL" ]; then
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/tipc_fake_model/ch_PP-OCRv3_det_klquant_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_PP-OCRv3_det_klquant_infer.tar && tar xf ch_det_data_50.tar && cd ../
+ elif [ ${model_name} = "ch_PP-OCRv3_det_PACT" ]; then
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/tipc_fake_model/ch_PP-OCRv3_det_pact_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_PP-OCRv3_det_pact_infer.tar && tar xf ch_det_data_50.tar && cd ../
+ elif [ ${model_name} = "ch_PP-OCRv3_rec" ]; then
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/rec_inference.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_PP-OCRv3_rec_infer.tar && tar xf rec_inference.tar && cd ../
+ elif [ ${model_name} = "ch_PP-OCRv3_rec_KL" ]; then
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/rec_inference.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/tipc_fake_model/ch_PP-OCRv3_rec_klquant_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_PP-OCRv3_rec_klquant_infer.tar && tar xf rec_inference.tar && cd ../
+ elif [ ${model_name} = "ch_PP-OCRv3_rec_PACT" ]; then
+ wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/rec_inference.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/tipc_fake_model/ch_PP-OCRv3_rec_pact_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_PP-OCRv3_rec_pact_infer.tar && tar xf rec_inference.tar && cd ../
+ elif [ ${model_name} = "ch_ppocr_mobile_v2.0" ]; then
wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar --no-check-certificate
wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate
wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar --no-check-certificate
cd ./inference && tar xf ch_ppocr_mobile_v2.0_det_infer.tar && tar xf ch_ppocr_mobile_v2.0_rec_infer.tar && tar xf ch_det_data_50.tar && cd ../
+ elif [ ${model_name} = "ch_ppocr_server_v2.0" ]; then
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_ppocr_server_v2.0_det_infer.tar && tar xf ch_ppocr_server_v2.0_rec_infer.tar && tar xf ch_det_data_50.tar && cd ../
+ elif [ ${model_name} = "ch_PP-OCRv2" ]; then
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_PP-OCRv2_det_infer.tar && tar xf ch_PP-OCRv2_rec_infer.tar && tar xf ch_det_data_50.tar && cd ../
+ elif [ ${model_name} = "ch_PP-OCRv3" ]; then
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_PP-OCRv3_det_infer.tar && tar xf ch_PP-OCRv3_rec_infer.tar && tar xf ch_det_data_50.tar && cd ../
fi
fi
@@ -323,29 +519,84 @@ if [ ${MODE} = "serving_infer" ];then
IFS='|'
array=(${python_name_list})
python_name=${array[0]}
- ${python_name} -m pip install paddle-serving-server-gpu==0.8.3.post101
- ${python_name} -m pip install paddle_serving_client==0.8.3
- ${python_name} -m pip install paddle-serving-app==0.8.3
- wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar
- wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar
- wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar
- wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar
- cd ./inference && tar xf ch_ppocr_mobile_v2.0_det_infer.tar && tar xf ch_ppocr_mobile_v2.0_rec_infer.tar && tar xf ch_ppocr_server_v2.0_rec_infer.tar && tar xf ch_ppocr_server_v2.0_det_infer.tar && cd ../
+ ${python_name} -m pip install paddle-serving-server-gpu
+ ${python_name} -m pip install paddle_serving_client
+ ${python_name} -m pip install paddle-serving-app
+ # wget model
+ if [ ${model_name} == "ch_ppocr_mobile_v2.0_det_KL" ] || [ ${model_name} == "ch_ppocr_mobile_v2.0_rec_KL" ] ; then
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/tipc_fake_model/ch_ppocr_mobile_v2.0_det_klquant_infer.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/tipc_fake_model/ch_ppocr_mobile_v2.0_rec_klquant_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_ppocr_mobile_v2.0_det_klquant_infer.tar && tar xf ch_ppocr_mobile_v2.0_rec_klquant_infer.tar && cd ../
+ elif [ ${model_name} == "ch_PP-OCRv2_det_KL" ] || [ ${model_name} == "ch_PP-OCRv2_rec_KL" ] ; then
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/tipc_fake_model/ch_PP-OCRv2_det_klquant_infer.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/tipc_fake_model/ch_PP-OCRv2_rec_klquant_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_PP-OCRv2_det_klquant_infer.tar && tar xf ch_PP-OCRv2_rec_klquant_infer.tar && cd ../
+ elif [ ${model_name} == "ch_PP-OCRv3_det_KL" ] || [ ${model_name} == "ch_PP-OCRv3_rec_KL" ] ; then
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/tipc_fake_model/ch_PP-OCRv3_det_klquant_infer.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/tipc_fake_model/ch_PP-OCRv3_rec_klquant_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_PP-OCRv3_det_klquant_infer.tar && tar xf ch_PP-OCRv3_rec_klquant_infer.tar && cd ../
+ elif [ ${model_name} == "ch_ppocr_mobile_v2.0_det_PACT" ] || [ ${model_name} == "ch_ppocr_mobile_v2.0_rec_PACT" ] ; then
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/tipc_fake_model/ch_ppocr_mobile_v2.0_det_pact_infer.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/tipc_fake_model/ch_ppocr_mobile_v2.0_rec_pact_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_ppocr_mobile_v2.0_det_pact_infer.tar && tar xf ch_ppocr_mobile_v2.0_rec_pact_infer.tar && cd ../
+ elif [ ${model_name} == "ch_PP-OCRv2_det_PACT" ] || [ ${model_name} == "ch_PP-OCRv2_rec_PACT" ] ; then
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/tipc_fake_model/ch_PP-OCRv2_det_pact_infer.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/tipc_fake_model/ch_PP-OCRv2_rec_pact_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_PP-OCRv2_det_pact_infer.tar && tar xf ch_PP-OCRv2_rec_pact_infer.tar && cd ../
+ elif [ ${model_name} == "ch_PP-OCRv3_det_PACT" ] || [ ${model_name} == "ch_PP-OCRv3_rec_PACT" ] ; then
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/tipc_fake_model/ch_PP-OCRv3_det_pact_infer.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/tipc_fake_model/ch_PP-OCRv3_rec_pact_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_PP-OCRv3_det_pact_infer.tar && tar xf ch_PP-OCRv3_rec_pact_infer.tar && cd ../
+ elif [[ ${model_name} =~ "ch_ppocr_mobile_v2.0" ]]; then
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_ppocr_mobile_v2.0_det_infer.tar && tar xf ch_ppocr_mobile_v2.0_rec_infer.tar && cd ../
+ elif [[ ${model_name} =~ "ch_ppocr_server_v2.0" ]]; then
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_ppocr_server_v2.0_det_infer.tar && tar xf ch_ppocr_server_v2.0_rec_infer.tar && cd ../
+ elif [[ ${model_name} =~ "ch_PP-OCRv2" ]]; then
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_PP-OCRv2_det_infer.tar && tar xf ch_PP-OCRv2_rec_infer.tar && cd ../
+ elif [[ ${model_name} =~ "ch_PP-OCRv3" ]]; then
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_PP-OCRv3_det_infer.tar && tar xf ch_PP-OCRv3_rec_infer.tar && cd ../
+ fi
+ # wget data
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/rec_inference.tar --no-check-certificate
+ cd ./inference && tar xf ch_det_data_50.tar && tar xf rec_inference.tar && cd ../
fi
if [ ${MODE} = "paddle2onnx_infer" ];then
# prepare serving env
python_name=$(func_parser_value "${lines[2]}")
- ${python_name} -m pip install install paddle2onnx
- ${python_name} -m pip install onnxruntime==1.4.0
+ ${python_name} -m pip install paddle2onnx
+ ${python_name} -m pip install onnxruntime
# wget model
- wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar
- wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar
- wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar
- wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar
+ if [[ ${model_name} =~ "ch_ppocr_mobile_v2.0" ]]; then
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_ppocr_mobile_v2.0_det_infer.tar && tar xf ch_ppocr_mobile_v2.0_rec_infer.tar && cd ../
+ elif [[ ${model_name} =~ "ch_ppocr_server_v2.0" ]]; then
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_ppocr_server_v2.0_det_infer.tar && tar xf ch_ppocr_server_v2.0_rec_infer.tar && cd ../
+ elif [[ ${model_name} =~ "ch_PP-OCRv2" ]]; then
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_PP-OCRv2_det_infer.tar && tar xf ch_PP-OCRv2_rec_infer.tar && cd ../
+ elif [[ ${model_name} =~ "ch_PP-OCRv3" ]]; then
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar --no-check-certificate
+ wget -nc -P ./inference https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar --no-check-certificate
+ cd ./inference && tar xf ch_PP-OCRv3_det_infer.tar && tar xf ch_PP-OCRv3_rec_infer.tar && cd ../
+ fi
+
# wget data
wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/ch_det_data_50.tar
wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/test/rec_inference.tar
- cd ./inference && tar xf ch_ppocr_mobile_v2.0_det_infer.tar && tar xf ch_ppocr_mobile_v2.0_rec_infer.tar && tar xf ch_ppocr_server_v2.0_rec_infer.tar && tar xf ch_ppocr_server_v2.0_det_infer.tar && tar xf ch_det_data_50.tar && tar xf rec_inference.tar && cd ../
+ cd ./inference && tar xf ch_det_data_50.tar && tar xf rec_inference.tar && cd ../
fi
diff --git a/test_tipc/prepare_lite_cpp.sh b/test_tipc/prepare_lite_cpp.sh
index 94af43c88d60aad706009c8bdd1e519a7b7e9acc..9148cb5dd72e16790e10db1cb266e4169cd4fdab 100644
--- a/test_tipc/prepare_lite_cpp.sh
+++ b/test_tipc/prepare_lite_cpp.sh
@@ -51,6 +51,8 @@ for model in ${lite_model_list[*]}; do
inference_model_url=https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/${model}.tar
elif [[ $model =~ "v2.0" ]]; then
inference_model_url=https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/${model}.tar
+ elif [[ $model =~ "PP-OCRv3" ]]; then
+ inference_model_url=https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/${model}.tar
else
echo "Model is wrong, please check."
exit 3
diff --git a/test_tipc/readme.md b/test_tipc/readme.md
index 8110f0073be248259c7cdd002d209c150a52fb71..effb2f168b6cc91012bef3de120de9e98a21dbda 100644
--- a/test_tipc/readme.md
+++ b/test_tipc/readme.md
@@ -138,6 +138,7 @@ bash test_tipc/test_train_inference_python.sh ./test_tipc/configs/ch_ppocr_mobil
## 4. Start the tests
The functional tests cover training-related options such as mixed precision, pruning, and quantization, as well as inference-related parameter configurations such as mkldnn and TensorRT. Click the corresponding links below for more details and tutorials:
- [test_train_inference_python usage](docs/test_train_inference_python.md): tests basic Python-based model training, evaluation, and inference, including pruning, quantization, and distillation.
+- [test_train_fleet_inference_python usage](./docs/test_train_fleet_inference_python.md): tests basic Python-based multi-machine multi-GPU training and inference.
- [test_inference_cpp usage](docs/test_inference_cpp.md): tests C++-based model inference.
- [test_serving usage](docs/test_serving.md): tests service deployment based on Paddle Serving.
- [test_lite_arm_cpp usage](docs/test_lite_arm_cpp.md): tests C++ inference deployment on ARM CPUs based on Paddle-Lite.
diff --git a/test_tipc/test_inference_cpp.sh b/test_tipc/test_inference_cpp.sh
index 9885e3937255658d4aacc5835eba634b74ea12a0..c0c7c18a38a46b00c839757e303049135a508691 100644
--- a/test_tipc/test_inference_cpp.sh
+++ b/test_tipc/test_inference_cpp.sh
@@ -43,7 +43,7 @@ cpp_cls_value=$(func_parser_value "${lines[18]}")
cpp_use_angle_cls_key=$(func_parser_key "${lines[19]}")
cpp_use_angle_cls_value=$(func_parser_value "${lines[19]}")
-LOG_PATH="./test_tipc/output"
+LOG_PATH="./test_tipc/output/${model_name}/cpp_infer"
mkdir -p ${LOG_PATH}
status_log="${LOG_PATH}/results_cpp.log"
@@ -84,7 +84,7 @@ function func_cpp_inference(){
eval $command
last_status=${PIPESTATUS[0]}
eval "cat ${_save_log_path}"
- status_check $last_status "${command}" "${status_log}"
+ status_check $last_status "${command}" "${status_log}" "${model_name}"
done
done
done
@@ -117,7 +117,7 @@ function func_cpp_inference(){
eval $command
last_status=${PIPESTATUS[0]}
eval "cat ${_save_log_path}"
- status_check $last_status "${command}" "${status_log}"
+ status_check $last_status "${command}" "${status_log}" "${model_name}"
done
done
@@ -178,7 +178,23 @@ if [ ${use_opencv} = "True" ]; then
else
OPENCV_DIR=''
fi
-LIB_DIR=$(pwd)/Paddle/build/paddle_inference_install_dir/
+if [ -d "paddle_inference/" ] ;then
+ echo "################### download paddle inference skipped ###################"
+else
+ echo "################### download paddle inference ###################"
+ PADDLEInfer=$3
+ if [ "" = "$PADDLEInfer" ];then
+ wget -nc https://paddle-inference-lib.bj.bcebos.com/2.3.0/cxx_c/Linux/GPU/x86-64_gcc8.2_avx_mkl_cuda10.1_cudnn7.6.5_trt6.0.1.5/paddle_inference.tgz --no-check-certificate
+ else
+ wget -nc $PADDLEInfer --no-check-certificate
+ fi
+ tar zxf paddle_inference.tgz
+ if [ ! -d "paddle_inference" ]; then
+ ln -s paddle_inference_install_dir paddle_inference
+ fi
+ echo "################### download paddle inference finished ###################"
+fi
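+# LIB_DIR points at the prebuilt Paddle Inference library (downloaded above, or already present)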
+LIB_DIR=$(pwd)/paddle_inference/
CUDA_LIB_DIR=$(dirname `find /usr -name libcudart.so`)
CUDNN_LIB_DIR=$(dirname `find /usr -name libcudnn.so`)
@@ -205,11 +221,10 @@ echo "################### build PaddleOCR demo finished ###################"
# set cuda device
GPUID=$2
if [ ${#GPUID} -le 0 ];then
- env=" "
+ env="export CUDA_VISIBLE_DEVICES=0"
else
env="export CUDA_VISIBLE_DEVICES=${GPUID}"
fi
-set CUDA_VISIBLE_DEVICES
eval $env
diff --git a/test_tipc/test_inference_python.sh b/test_tipc/test_inference_python.sh
index 27276d55b95051e167432600308f42127d784ee6..2a31a468f0d54d1979e82c8f0da98cac6f4edcec 100644
--- a/test_tipc/test_inference_python.sh
+++ b/test_tipc/test_inference_python.sh
@@ -44,7 +44,7 @@ infer_value1=$(func_parser_value "${lines[17]}")
-LOG_PATH="./test_tipc/output"
+LOG_PATH="./test_tipc/output/${model_name}/${MODE}"
mkdir -p ${LOG_PATH}
status_log="${LOG_PATH}/results_python.log"
@@ -88,7 +88,7 @@ function func_inference(){
eval $command
last_status=${PIPESTATUS[0]}
eval "cat ${_save_log_path}"
- status_check $last_status "${command}" "${status_log}"
+ status_check $last_status "${command}" "${status_log}" "${model_name}"
done
done
done
@@ -113,13 +113,13 @@ function func_inference(){
set_tensorrt=$(func_set_params "${use_trt_key}" "${use_trt}")
set_precision=$(func_set_params "${precision_key}" "${precision}")
set_model_dir=$(func_set_params "${infer_model_key}" "${_model_dir}")
- set_infer_params0=$(func_set_params "${save_log_key}" "${save_log_value}")
+ set_infer_params0=$(func_set_params "${rec_model_key}" "${rec_model_value}")
set_infer_params1=$(func_set_params "${infer_key1}" "${infer_value1}")
command="${_python} ${_script} ${use_gpu_key}=${use_gpu} ${set_tensorrt} ${set_precision} ${set_model_dir} ${set_batchsize} ${set_infer_data} ${set_benchmark} ${set_infer_params1} ${set_infer_params0} > ${_save_log_path} 2>&1 "
eval $command
last_status=${PIPESTATUS[0]}
eval "cat ${_save_log_path}"
- status_check $last_status "${command}" "${status_log}"
+ status_check $last_status "${command}" "${status_log}" "${model_name}"
done
done
@@ -153,7 +153,7 @@ if [ ${MODE} = "whole_infer" ]; then
echo ${infer_run_exports[Count]}
eval $export_cmd
status_export=$?
- status_check $status_export "${export_cmd}" "${status_log}"
+ status_check $status_export "${export_cmd}" "${status_log}" "${model_name}"
else
save_infer_dir=${infer_model}
fi
diff --git a/test_tipc/test_paddle2onnx.sh b/test_tipc/test_paddle2onnx.sh
index 300c61770d2519fad0502147e2cee4a3e4f50ac9..356bc98041fffa8f0437c6419fc72c06d5e719f7 100644
--- a/test_tipc/test_paddle2onnx.sh
+++ b/test_tipc/test_paddle2onnx.sh
@@ -11,7 +11,7 @@ python=$(func_parser_value "${lines[2]}")
# parser params
-dataline=$(awk 'NR==1, NR==12{print}' $FILENAME)
+dataline=$(awk 'NR==1, NR==17{print}' $FILENAME)
IFS=$'\n'
lines=(${dataline})
@@ -19,29 +19,33 @@ lines=(${dataline})
model_name=$(func_parser_value "${lines[1]}")
python=$(func_parser_value "${lines[2]}")
padlle2onnx_cmd=$(func_parser_value "${lines[3]}")
-infer_model_dir_key=$(func_parser_key "${lines[4]}")
-infer_model_dir_value=$(func_parser_value "${lines[4]}")
+det_infer_model_dir_key=$(func_parser_key "${lines[4]}")
+det_infer_model_dir_value=$(func_parser_value "${lines[4]}")
model_filename_key=$(func_parser_key "${lines[5]}")
model_filename_value=$(func_parser_value "${lines[5]}")
params_filename_key=$(func_parser_key "${lines[6]}")
params_filename_value=$(func_parser_value "${lines[6]}")
-save_file_key=$(func_parser_key "${lines[7]}")
-save_file_value=$(func_parser_value "${lines[7]}")
-opset_version_key=$(func_parser_key "${lines[8]}")
-opset_version_value=$(func_parser_value "${lines[8]}")
-enable_onnx_checker_key=$(func_parser_key "${lines[9]}")
-enable_onnx_checker_value=$(func_parser_value "${lines[9]}")
+det_save_file_key=$(func_parser_key "${lines[7]}")
+det_save_file_value=$(func_parser_value "${lines[7]}")
+rec_infer_model_dir_key=$(func_parser_key "${lines[8]}")
+rec_infer_model_dir_value=$(func_parser_value "${lines[8]}")
+rec_save_file_key=$(func_parser_key "${lines[9]}")
+rec_save_file_value=$(func_parser_value "${lines[9]}")
+opset_version_key=$(func_parser_key "${lines[10]}")
+opset_version_value=$(func_parser_value "${lines[10]}")
+enable_onnx_checker_key=$(func_parser_key "${lines[11]}")
+enable_onnx_checker_value=$(func_parser_value "${lines[11]}")
# parser onnx inference
-inference_py=$(func_parser_value "${lines[10]}")
-use_gpu_key=$(func_parser_key "${lines[11]}")
-use_gpu_value=$(func_parser_value "${lines[11]}")
-det_model_key=$(func_parser_key "${lines[12]}")
-image_dir_key=$(func_parser_key "${lines[13]}")
-image_dir_value=$(func_parser_value "${lines[13]}")
+inference_py=$(func_parser_value "${lines[12]}")
+use_gpu_key=$(func_parser_key "${lines[13]}")
+use_gpu_list=$(func_parser_value "${lines[13]}")
+det_model_key=$(func_parser_key "${lines[14]}")
+rec_model_key=$(func_parser_key "${lines[15]}")
+image_dir_key=$(func_parser_key "${lines[16]}")
+image_dir_value=$(func_parser_value "${lines[16]}")
-
-LOG_PATH="./test_tipc/output"
-mkdir -p ./test_tipc/output
+LOG_PATH="./test_tipc/output/${model_name}/paddle2onnx"
+mkdir -p ${LOG_PATH}
status_log="${LOG_PATH}/results_paddle2onnx.log"
@@ -50,24 +54,103 @@ function func_paddle2onnx(){
_script=$1
# paddle2onnx
- _save_log_path="${LOG_PATH}/paddle2onnx_infer_cpu.log"
- set_dirname=$(func_set_params "${infer_model_dir_key}" "${infer_model_dir_value}")
- set_model_filename=$(func_set_params "${model_filename_key}" "${model_filename_value}")
- set_params_filename=$(func_set_params "${params_filename_key}" "${params_filename_value}")
- set_save_model=$(func_set_params "${save_file_key}" "${save_file_value}")
- set_opset_version=$(func_set_params "${opset_version_key}" "${opset_version_value}")
- set_enable_onnx_checker=$(func_set_params "${enable_onnx_checker_key}" "${enable_onnx_checker_value}")
- trans_model_cmd="${padlle2onnx_cmd} ${set_dirname} ${set_model_filename} ${set_params_filename} ${set_save_model} ${set_opset_version} ${set_enable_onnx_checker}"
- eval $trans_model_cmd
- last_status=${PIPESTATUS[0]}
- status_check $last_status "${trans_model_cmd}" "${status_log}"
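+ # system-level PP-OCR models convert both det and rec models; single-task "det"/"rec" models convert only their own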
+ if [ ${model_name} = "ch_PP-OCRv2" ] || [ ${model_name} = "ch_PP-OCRv3" ] || [ ${model_name} = "ch_ppocr_mobile_v2.0" ] || [ ${model_name} = "ch_ppocr_server_v2.0" ]; then
+ # trans det
+ set_dirname=$(func_set_params "--model_dir" "${det_infer_model_dir_value}")
+ set_model_filename=$(func_set_params "${model_filename_key}" "${model_filename_value}")
+ set_params_filename=$(func_set_params "${params_filename_key}" "${params_filename_value}")
+ set_save_model=$(func_set_params "--save_file" "${det_save_file_value}")
+ set_opset_version=$(func_set_params "${opset_version_key}" "${opset_version_value}")
+ set_enable_onnx_checker=$(func_set_params "${enable_onnx_checker_key}" "${enable_onnx_checker_value}")
+ trans_det_log="${LOG_PATH}/trans_model_det.log"
+ trans_model_cmd="${padlle2onnx_cmd} ${set_dirname} ${set_model_filename} ${set_params_filename} ${set_save_model} ${set_opset_version} ${set_enable_onnx_checker} > ${trans_det_log} 2>&1 "
+ eval $trans_model_cmd
+ last_status=${PIPESTATUS[0]}
+ status_check $last_status "${trans_model_cmd}" "${status_log}" "${model_name}"
+ # trans rec
+ set_dirname=$(func_set_params "--model_dir" "${rec_infer_model_dir_value}")
+ set_model_filename=$(func_set_params "${model_filename_key}" "${model_filename_value}")
+ set_params_filename=$(func_set_params "${params_filename_key}" "${params_filename_value}")
+ set_save_model=$(func_set_params "--save_file" "${rec_save_file_value}")
+ set_opset_version=$(func_set_params "${opset_version_key}" "${opset_version_value}")
+ set_enable_onnx_checker=$(func_set_params "${enable_onnx_checker_key}" "${enable_onnx_checker_value}")
+ trans_rec_log="${LOG_PATH}/trans_model_rec.log"
+ trans_model_cmd="${padlle2onnx_cmd} ${set_dirname} ${set_model_filename} ${set_params_filename} ${set_save_model} ${set_opset_version} ${set_enable_onnx_checker} > ${trans_rec_log} 2>&1 "
+ eval $trans_model_cmd
+ last_status=${PIPESTATUS[0]}
+ status_check $last_status "${trans_model_cmd}" "${status_log}" "${model_name}"
+ elif [[ ${model_name} =~ "det" ]]; then
+ # trans det
+ set_dirname=$(func_set_params "--model_dir" "${det_infer_model_dir_value}")
+ set_model_filename=$(func_set_params "${model_filename_key}" "${model_filename_value}")
+ set_params_filename=$(func_set_params "${params_filename_key}" "${params_filename_value}")
+ set_save_model=$(func_set_params "--save_file" "${det_save_file_value}")
+ set_opset_version=$(func_set_params "${opset_version_key}" "${opset_version_value}")
+ set_enable_onnx_checker=$(func_set_params "${enable_onnx_checker_key}" "${enable_onnx_checker_value}")
+ trans_det_log="${LOG_PATH}/trans_model_det.log"
+ trans_model_cmd="${padlle2onnx_cmd} ${set_dirname} ${set_model_filename} ${set_params_filename} ${set_save_model} ${set_opset_version} ${set_enable_onnx_checker} > ${trans_det_log} 2>&1 "
+ eval $trans_model_cmd
+ last_status=${PIPESTATUS[0]}
+ status_check $last_status "${trans_model_cmd}" "${status_log}" "${model_name}"
+ elif [[ ${model_name} =~ "rec" ]]; then
+ # trans rec
+ set_dirname=$(func_set_params "--model_dir" "${rec_infer_model_dir_value}")
+ set_model_filename=$(func_set_params "${model_filename_key}" "${model_filename_value}")
+ set_params_filename=$(func_set_params "${params_filename_key}" "${params_filename_value}")
+ set_save_model=$(func_set_params "--save_file" "${rec_save_file_value}")
+ set_opset_version=$(func_set_params "${opset_version_key}" "${opset_version_value}")
+ set_enable_onnx_checker=$(func_set_params "${enable_onnx_checker_key}" "${enable_onnx_checker_value}")
+ trans_rec_log="${LOG_PATH}/trans_model_rec.log"
+ trans_model_cmd="${padlle2onnx_cmd} ${set_dirname} ${set_model_filename} ${set_params_filename} ${set_save_model} ${set_opset_version} ${set_enable_onnx_checker} > ${trans_rec_log} 2>&1 "
+ eval $trans_model_cmd
+ last_status=${PIPESTATUS[0]}
+ status_check $last_status "${trans_model_cmd}" "${status_log}" "${model_name}"
+ fi
+
# python inference
- set_gpu=$(func_set_params "${use_gpu_key}" "${use_gpu_value}")
- set_model_dir=$(func_set_params "${det_model_key}" "${save_file_value}")
- set_img_dir=$(func_set_params "${image_dir_key}" "${image_dir_value}")
- infer_model_cmd="${python} ${inference_py} ${set_gpu} ${set_img_dir} ${set_model_dir} --use_onnx=True > ${_save_log_path} 2>&1 "
- eval $infer_model_cmd
- status_check $last_status "${infer_model_cmd}" "${status_log}"
+ for use_gpu in ${use_gpu_list[*]}; do
+ if [ ${use_gpu} = "False" ] || [ ${use_gpu} = "cpu" ]; then
+ _save_log_path="${LOG_PATH}/paddle2onnx_infer_cpu.log"
+ set_gpu=$(func_set_params "${use_gpu_key}" "${use_gpu}")
+ set_img_dir=$(func_set_params "${image_dir_key}" "${image_dir_value}")
+ if [ ${model_name} = "ch_PP-OCRv2" ] || [ ${model_name} = "ch_PP-OCRv3" ] || [ ${model_name} = "ch_ppocr_mobile_v2.0" ] || [ ${model_name} = "ch_ppocr_server_v2.0" ]; then
+ set_det_model_dir=$(func_set_params "${det_model_key}" "${det_save_file_value}")
+ set_rec_model_dir=$(func_set_params "${rec_model_key}" "${rec_save_file_value}")
+ infer_model_cmd="${python} ${inference_py} ${set_gpu} ${set_img_dir} ${set_det_model_dir} ${set_rec_model_dir} --use_onnx=True > ${_save_log_path} 2>&1 "
+ elif [[ ${model_name} =~ "det" ]]; then
+ set_det_model_dir=$(func_set_params "${det_model_key}" "${det_save_file_value}")
+ infer_model_cmd="${python} ${inference_py} ${set_gpu} ${set_img_dir} ${set_det_model_dir} --use_onnx=True > ${_save_log_path} 2>&1 "
+ elif [[ ${model_name} =~ "rec" ]]; then
+ set_rec_model_dir=$(func_set_params "${rec_model_key}" "${rec_save_file_value}")
+ infer_model_cmd="${python} ${inference_py} ${set_gpu} ${set_img_dir} ${set_rec_model_dir} --use_onnx=True > ${_save_log_path} 2>&1 "
+ fi
+ eval $infer_model_cmd
+ last_status=${PIPESTATUS[0]}
+ eval "cat ${_save_log_path}"
+ status_check $last_status "${infer_model_cmd}" "${status_log}" "${model_name}"
+ elif [ ${use_gpu} = "True" ] || [ ${use_gpu} = "gpu" ]; then
+ _save_log_path="${LOG_PATH}/paddle2onnx_infer_gpu.log"
+ set_gpu=$(func_set_params "${use_gpu_key}" "${use_gpu}")
+ set_img_dir=$(func_set_params "${image_dir_key}" "${image_dir_value}")
+ if [ ${model_name} = "ch_PP-OCRv2" ] || [ ${model_name} = "ch_PP-OCRv3" ] || [ ${model_name} = "ch_ppocr_mobile_v2.0" ] || [ ${model_name} = "ch_ppocr_server_v2.0" ]; then
+ set_det_model_dir=$(func_set_params "${det_model_key}" "${det_save_file_value}")
+ set_rec_model_dir=$(func_set_params "${rec_model_key}" "${rec_save_file_value}")
+ infer_model_cmd="${python} ${inference_py} ${set_gpu} ${set_img_dir} ${set_det_model_dir} ${set_rec_model_dir} --use_onnx=True > ${_save_log_path} 2>&1 "
+ elif [[ ${model_name} =~ "det" ]]; then
+ set_det_model_dir=$(func_set_params "${det_model_key}" "${det_save_file_value}")
+ infer_model_cmd="${python} ${inference_py} ${set_gpu} ${set_img_dir} ${set_det_model_dir} --use_onnx=True > ${_save_log_path} 2>&1 "
+ elif [[ ${model_name} =~ "rec" ]]; then
+ set_rec_model_dir=$(func_set_params "${rec_model_key}" "${rec_save_file_value}")
+ infer_model_cmd="${python} ${inference_py} ${set_gpu} ${set_img_dir} ${set_rec_model_dir} --use_onnx=True > ${_save_log_path} 2>&1 "
+ fi
+ eval $infer_model_cmd
+ last_status=${PIPESTATUS[0]}
+ eval "cat ${_save_log_path}"
+ status_check $last_status "${infer_model_cmd}" "${status_log}" "${model_name}"
+ else
+ echo "Does not support hardware other than CPU and GPU Currently!"
+ fi
+ done
}
diff --git a/test_tipc/test_ptq_inference_python.sh b/test_tipc/test_ptq_inference_python.sh
new file mode 100644
index 0000000000000000000000000000000000000000..288e6098966be4aaf2953d627e7890963100cb6e
--- /dev/null
+++ b/test_tipc/test_ptq_inference_python.sh
@@ -0,0 +1,158 @@
+#!/bin/bash
+source test_tipc/common_func.sh
+
+FILENAME=$1
+# MODE must be one of ['whole_infer']
+MODE=$2
+
+IFS=$'\n'
+# parser klquant_infer params
+
+dataline=$(awk 'NR==1, NR==20{print}' $FILENAME)
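+# the parser below reads config entries up to lines[19], i.e. the first 20 lines of the config file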
+lines=(${dataline})
+model_name=$(func_parser_value "${lines[1]}")
+python=$(func_parser_value "${lines[2]}")
+export_weight=$(func_parser_key "${lines[3]}")
+save_infer_key=$(func_parser_key "${lines[4]}")
+# parser inference model
+infer_model_dir_list=$(func_parser_value "${lines[5]}")
+infer_export_list=$(func_parser_value "${lines[6]}")
+infer_is_quant=$(func_parser_value "${lines[7]}")
+# parser inference
+inference_py=$(func_parser_value "${lines[8]}")
+use_gpu_key=$(func_parser_key "${lines[9]}")
+use_gpu_list=$(func_parser_value "${lines[9]}")
+use_mkldnn_key=$(func_parser_key "${lines[10]}")
+use_mkldnn_list=$(func_parser_value "${lines[10]}")
+cpu_threads_key=$(func_parser_key "${lines[11]}")
+cpu_threads_list=$(func_parser_value "${lines[11]}")
+batch_size_key=$(func_parser_key "${lines[12]}")
+batch_size_list=$(func_parser_value "${lines[12]}")
+use_trt_key=$(func_parser_key "${lines[13]}")
+use_trt_list=$(func_parser_value "${lines[13]}")
+precision_key=$(func_parser_key "${lines[14]}")
+precision_list=$(func_parser_value "${lines[14]}")
+infer_model_key=$(func_parser_key "${lines[15]}")
+image_dir_key=$(func_parser_key "${lines[16]}")
+infer_img_dir=$(func_parser_value "${lines[16]}")
+save_log_key=$(func_parser_key "${lines[17]}")
+save_log_value=$(func_parser_value "${lines[17]}")
+benchmark_key=$(func_parser_key "${lines[18]}")
+benchmark_value=$(func_parser_value "${lines[18]}")
+infer_key1=$(func_parser_key "${lines[19]}")
+infer_value1=$(func_parser_value "${lines[19]}")
+
+
+LOG_PATH="./test_tipc/output/${model_name}/${MODE}"
+mkdir -p ${LOG_PATH}
+status_log="${LOG_PATH}/results_python.log"
+
+
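+# sweep the CPU (mkldnn x threads x batch_size x precision) and GPU (TensorRT x precision x batch_size) inference configurations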
+function func_inference(){
+ IFS='|'
+ _python=$1
+ _script=$2
+ _model_dir=$3
+ _log_path=$4
+ _img_dir=$5
+ _flag_quant=$6
+ # inference
+ for use_gpu in ${use_gpu_list[*]}; do
+ if [ ${use_gpu} = "False" ] || [ ${use_gpu} = "cpu" ]; then
+ for use_mkldnn in ${use_mkldnn_list[*]}; do
+ for threads in ${cpu_threads_list[*]}; do
+ for batch_size in ${batch_size_list[*]}; do
+ for precision in ${precision_list[*]}; do
+ if [ ${use_mkldnn} = "False" ] && [ ${precision} = "fp16" ]; then
+ continue
+ fi # skip when fp16 is enabled but mkldnn is disabled
+ if [ ${_flag_quant} = "True" ] && [ ${precision} != "int8" ]; then
+ continue
+ fi # skip quantized-model inference when precision is not int8
+ set_precision=$(func_set_params "${precision_key}" "${precision}")
+
+ _save_log_path="${_log_path}/python_infer_cpu_usemkldnn_${use_mkldnn}_threads_${threads}_precision_${precision}_batchsize_${batch_size}.log"
+ set_infer_data=$(func_set_params "${image_dir_key}" "${_img_dir}")
+ set_benchmark=$(func_set_params "${benchmark_key}" "${benchmark_value}")
+ set_batchsize=$(func_set_params "${batch_size_key}" "${batch_size}")
+ set_mkldnn=$(func_set_params "${use_mkldnn_key}" "${use_mkldnn}")
+ set_cpu_threads=$(func_set_params "${cpu_threads_key}" "${threads}")
+ set_model_dir=$(func_set_params "${infer_model_key}" "${_model_dir}")
+ set_infer_params0=$(func_set_params "${save_log_key}" "${save_log_value}")
+ set_infer_params1=$(func_set_params "${infer_key1}" "${infer_value1}")
+ command="${_python} ${_script} ${use_gpu_key}=${use_gpu} ${set_mkldnn} ${set_cpu_threads} ${set_model_dir} ${set_batchsize} ${set_infer_params0} ${set_infer_data} ${set_benchmark} ${set_precision} ${set_infer_params1} > ${_save_log_path} 2>&1 "
+ eval $command
+ last_status=${PIPESTATUS[0]}
+ eval "cat ${_save_log_path}"
+ status_check $last_status "${command}" "${status_log}" "${model_name}"
+ done
+ done
+ done
+ done
+ elif [ ${use_gpu} = "True" ] || [ ${use_gpu} = "gpu" ]; then
+ for use_trt in ${use_trt_list[*]}; do
+ for precision in ${precision_list[*]}; do
+ if [ ${_flag_quant} = "True" ] && [ ${precision} != "int8" ]; then
+ continue
+ fi # skip when quant model inference but precision is not int8
+ for batch_size in ${batch_size_list[*]}; do
+ _save_log_path="${_log_path}/python_infer_gpu_usetrt_${use_trt}_precision_${precision}_batchsize_${batch_size}.log"
+ set_infer_data=$(func_set_params "${image_dir_key}" "${_img_dir}")
+ set_benchmark=$(func_set_params "${benchmark_key}" "${benchmark_value}")
+ set_batchsize=$(func_set_params "${batch_size_key}" "${batch_size}")
+ set_tensorrt=$(func_set_params "${use_trt_key}" "${use_trt}")
+ set_precision=$(func_set_params "${precision_key}" "${precision}")
+ set_model_dir=$(func_set_params "${infer_model_key}" "${_model_dir}")
+ set_infer_params0=$(func_set_params "${save_log_key}" "${save_log_value}")
+ set_infer_params1=$(func_set_params "${infer_key1}" "${infer_value1}")
+ command="${_python} ${_script} ${use_gpu_key}=${use_gpu} ${set_tensorrt} ${set_precision} ${set_model_dir} ${set_batchsize} ${set_infer_data} ${set_benchmark} ${set_infer_params1} ${set_infer_params0} > ${_save_log_path} 2>&1 "
+ eval $command
+ last_status=${PIPESTATUS[0]}
+ eval "cat ${_save_log_path}"
+ status_check $last_status "${command}" "${status_log}" "${model_name}"
+
+ done
+ done
+ done
+ else
+ echo "Does not support hardware other than CPU and GPU Currently!"
+ fi
+ done
+}
+
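+# whole_infer: export each listed model with KL quantization (suffix "_klquant"), then run int8 inference on the result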
+if [ ${MODE} = "whole_infer" ]; then
+ GPUID=$3
+ if [ ${#GPUID} -le 0 ];then
+ env=" "
+ else
+ env="export CUDA_VISIBLE_DEVICES=${GPUID}"
+ fi
+ # set CUDA_VISIBLE_DEVICES
+ eval $env
+ export Count=0
+ IFS="|"
+ infer_run_exports=(${infer_export_list})
+ infer_quant_flag=(${infer_is_quant})
+ for infer_model in ${infer_model_dir_list[*]}; do
+ # run export
+ if [ ${infer_run_exports[Count]} != "null" ];then
+ save_infer_dir="${infer_model}_klquant"
+ set_export_weight=$(func_set_params "${export_weight}" "${infer_model}")
+ set_save_infer_key=$(func_set_params "${save_infer_key}" "${save_infer_dir}")
+ export_log_path="${LOG_PATH}_export_${Count}.log"
+ export_cmd="${python} ${infer_run_exports[Count]} ${set_export_weight} ${set_save_infer_key} > ${export_log_path} 2>&1 "
+ echo ${infer_run_exports[Count]}
+ echo $export_cmd
+ eval $export_cmd
+ status_export=$?
+ status_check $status_export "${export_cmd}" "${status_log}" "${model_name}"
+ else
+ save_infer_dir=${infer_model}
+ fi
+ #run inference
+ is_quant="True"
+ func_inference "${python}" "${inference_py}" "${save_infer_dir}" "${LOG_PATH}" "${infer_img_dir}" ${is_quant}
+ Count=$(($Count + 1))
+ done
+fi
+
diff --git a/test_tipc/test_serving.sh b/test_tipc/test_serving.sh
deleted file mode 100644
index 260b252f4144b66d42902112708f2e45fa0b7ac1..0000000000000000000000000000000000000000
--- a/test_tipc/test_serving.sh
+++ /dev/null
@@ -1,175 +0,0 @@
-#!/bin/bash
-source test_tipc/common_func.sh
-
-FILENAME=$1
-dataline=$(awk 'NR==1, NR==18{print}' $FILENAME)
-
-# parser params
-IFS=$'\n'
-lines=(${dataline})
-
-# parser serving
-model_name=$(func_parser_value "${lines[1]}")
-python_list=$(func_parser_value "${lines[2]}")
-trans_model_py=$(func_parser_value "${lines[3]}")
-infer_model_dir_key=$(func_parser_key "${lines[4]}")
-infer_model_dir_value=$(func_parser_value "${lines[4]}")
-model_filename_key=$(func_parser_key "${lines[5]}")
-model_filename_value=$(func_parser_value "${lines[5]}")
-params_filename_key=$(func_parser_key "${lines[6]}")
-params_filename_value=$(func_parser_value "${lines[6]}")
-serving_server_key=$(func_parser_key "${lines[7]}")
-serving_server_value=$(func_parser_value "${lines[7]}")
-serving_client_key=$(func_parser_key "${lines[8]}")
-serving_client_value=$(func_parser_value "${lines[8]}")
-serving_dir_value=$(func_parser_value "${lines[9]}")
-web_service_py=$(func_parser_value "${lines[10]}")
-web_use_gpu_key=$(func_parser_key "${lines[11]}")
-web_use_gpu_list=$(func_parser_value "${lines[11]}")
-web_use_mkldnn_key=$(func_parser_key "${lines[12]}")
-web_use_mkldnn_list=$(func_parser_value "${lines[12]}")
-web_cpu_threads_key=$(func_parser_key "${lines[13]}")
-web_cpu_threads_list=$(func_parser_value "${lines[13]}")
-web_use_trt_key=$(func_parser_key "${lines[14]}")
-web_use_trt_list=$(func_parser_value "${lines[14]}")
-web_precision_key=$(func_parser_key "${lines[15]}")
-web_precision_list=$(func_parser_value "${lines[15]}")
-pipeline_py=$(func_parser_value "${lines[16]}")
-image_dir_key=$(func_parser_key "${lines[17]}")
-image_dir_value=$(func_parser_value "${lines[17]}")
-
-LOG_PATH="../../test_tipc/output"
-mkdir -p ./test_tipc/output
-status_log="${LOG_PATH}/results_serving.log"
-
-function func_serving(){
- IFS='|'
- _python=$1
- _script=$2
- _model_dir=$3
- # pdserving
- set_dirname=$(func_set_params "${infer_model_dir_key}" "${infer_model_dir_value}")
- set_model_filename=$(func_set_params "${model_filename_key}" "${model_filename_value}")
- set_params_filename=$(func_set_params "${params_filename_key}" "${params_filename_value}")
- set_serving_server=$(func_set_params "${serving_server_key}" "${serving_server_value}")
- set_serving_client=$(func_set_params "${serving_client_key}" "${serving_client_value}")
- set_image_dir=$(func_set_params "${image_dir_key}" "${image_dir_value}")
- python_list=(${python_list})
- trans_model_cmd="${python_list[0]} ${trans_model_py} ${set_dirname} ${set_model_filename} ${set_params_filename} ${set_serving_server} ${set_serving_client}"
- eval $trans_model_cmd
- cd ${serving_dir_value}
- unset https_proxy
- unset http_proxy
- for python in ${python_list[*]}; do
- if [ ${python} = "cpp" ]; then
- for use_gpu in ${web_use_gpu_list[*]}; do
- if [ ${use_gpu} = "null" ]; then
- web_service_cpp_cmd="${python_list[0]} -m paddle_serving_server.serve --model ppocr_det_mobile_2.0_serving/ ppocr_rec_mobile_2.0_serving/ --port 9293"
- eval $web_service_cpp_cmd
- last_status=${PIPESTATUS[0]}
- status_check $last_status "${web_service_cpp_cmd}" "${status_log}"
- sleep 2s
- _save_log_path="${LOG_PATH}/server_infer_cpp_cpu_pipeline_usemkldnn_False_threads_4_batchsize_1.log"
- pipeline_cmd="${python_list[0]} ocr_cpp_client.py ppocr_det_mobile_2.0_client/ ppocr_rec_mobile_2.0_client/"
- eval $pipeline_cmd
- last_status=${PIPESTATUS[0]}
- status_check $last_status "${pipeline_cmd}" "${status_log}"
- sleep 2s
- ps ux | grep -E 'web_service|pipeline' | awk '{print $2}' | xargs kill -s 9
- else
- web_service_cpp_cmd="${python_list[0]} -m paddle_serving_server.serve --model ppocr_det_mobile_2.0_serving/ ppocr_rec_mobile_2.0_serving/ --port 9293 --gpu_id=0"
- eval $web_service_cpp_cmd
- sleep 2s
- _save_log_path="${LOG_PATH}/server_infer_cpp_cpu_pipeline_usemkldnn_False_threads_4_batchsize_1.log"
- pipeline_cmd="${python_list[0]} ocr_cpp_client.py ppocr_det_mobile_2.0_client/ ppocr_rec_mobile_2.0_client/"
- eval $pipeline_cmd
- last_status=${PIPESTATUS[0]}
- status_check $last_status "${pipeline_cmd}" "${status_log}"
- sleep 2s
- ps ux | grep -E 'web_service|pipeline' | awk '{print $2}' | xargs kill -s 9
- fi
- done
- else
- # python serving
- for use_gpu in ${web_use_gpu_list[*]}; do
- if [ ${use_gpu} = "null" ]; then
- for use_mkldnn in ${web_use_mkldnn_list[*]}; do
- for threads in ${web_cpu_threads_list[*]}; do
- set_cpu_threads=$(func_set_params "${web_cpu_threads_key}" "${threads}")
- web_service_cmd="${python} ${web_service_py} ${web_use_gpu_key}="" ${web_use_mkldnn_key}=${use_mkldnn} ${set_cpu_threads} &"
- eval $web_service_cmd
- last_status=${PIPESTATUS[0]}
- status_check $last_status "${web_service_cmd}" "${status_log}"
- sleep 2s
- for pipeline in ${pipeline_py[*]}; do
- _save_log_path="${LOG_PATH}/server_infer_cpu_${pipeline%_client*}_usemkldnn_${use_mkldnn}_threads_${threads}_batchsize_1.log"
- pipeline_cmd="${python} ${pipeline} ${set_image_dir} > ${_save_log_path} 2>&1 "
- eval $pipeline_cmd
- last_status=${PIPESTATUS[0]}
- eval "cat ${_save_log_path}"
- status_check $last_status "${pipeline_cmd}" "${status_log}"
- sleep 2s
- done
- ps ux | grep -E 'web_service|pipeline' | awk '{print $2}' | xargs kill -s 9
- done
- done
- elif [ ${use_gpu} = "0" ]; then
- for use_trt in ${web_use_trt_list[*]}; do
- for precision in ${web_precision_list[*]}; do
- if [[ ${_flag_quant} = "False" ]] && [[ ${precision} =~ "int8" ]]; then
- continue
- fi
- if [[ ${precision} =~ "fp16" || ${precision} =~ "int8" ]] && [ ${use_trt} = "False" ]; then
- continue
- fi
- if [[ ${use_trt} = "False" || ${precision} =~ "int8" ]] && [[ ${_flag_quant} = "True" ]]; then
- continue
- fi
- set_tensorrt=$(func_set_params "${web_use_trt_key}" "${use_trt}")
- if [ ${use_trt} = True ]; then
- device_type=2
- fi
- set_precision=$(func_set_params "${web_precision_key}" "${precision}")
- web_service_cmd="${python} ${web_service_py} ${web_use_gpu_key}=${use_gpu} ${set_tensorrt} ${set_precision} & "
- eval $web_service_cmd
- last_status=${PIPESTATUS[0]}
- status_check $last_status "${web_service_cmd}" "${status_log}"
-
- sleep 2s
- for pipeline in ${pipeline_py[*]}; do
- _save_log_path="${LOG_PATH}/server_infer_gpu_${pipeline%_client*}_usetrt_${use_trt}_precision_${precision}_batchsize_1.log"
- pipeline_cmd="${python} ${pipeline} ${set_image_dir}> ${_save_log_path} 2>&1"
- eval $pipeline_cmd
- last_status=${PIPESTATUS[0]}
- eval "cat ${_save_log_path}"
- status_check $last_status "${pipeline_cmd}" "${status_log}"
- sleep 2s
- done
- ps ux | grep -E 'web_service|pipeline' | awk '{print $2}' | xargs kill -s 9
- done
- done
- else
- echo "Does not support hardware other than CPU and GPU Currently!"
- fi
- done
- fi
- done
-}
-
-
-#set cuda device
-GPUID=$2
-if [ ${#GPUID} -le 0 ];then
- env="export CUDA_VISIBLE_DEVICES=0"
-else
- env="export CUDA_VISIBLE_DEVICES=${GPUID}"
-fi
-eval $env
-echo $env
-
-
-echo "################### run test ###################"
-
-export Count=0
-IFS="|"
-func_serving "${web_service_cmd}"
diff --git a/test_tipc/test_serving_infer_cpp.sh b/test_tipc/test_serving_infer_cpp.sh
new file mode 100644
index 0000000000000000000000000000000000000000..0be6a45adf3105f088a96336dddfbe9ac612f19b
--- /dev/null
+++ b/test_tipc/test_serving_infer_cpp.sh
@@ -0,0 +1,139 @@
+#!/bin/bash
+source test_tipc/common_func.sh
+
+function func_parser_model_config(){
+ strs=$1
+ IFS="/"
+ array=(${strs})
+ tmp=${array[-1]}
+ echo ${tmp}
+}
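+# usage sketch: func_parser_model_config "some/dir/model_serving" echoes "model_serving", i.e. the last "/"-separated path component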
+
+FILENAME=$1
+dataline=$(awk 'NR==1, NR==19{print}' $FILENAME)
+MODE=$2
+
+# parser params
+IFS=$'\n'
+lines=(${dataline})
+
+# parser serving
+model_name=$(func_parser_value "${lines[1]}")
+python_list=$(func_parser_value "${lines[2]}")
+trans_model_py=$(func_parser_value "${lines[3]}")
+det_infer_model_dir_key=$(func_parser_key "${lines[4]}")
+det_infer_model_dir_value=$(func_parser_value "${lines[4]}")
+model_filename_key=$(func_parser_key "${lines[5]}")
+model_filename_value=$(func_parser_value "${lines[5]}")
+params_filename_key=$(func_parser_key "${lines[6]}")
+params_filename_value=$(func_parser_value "${lines[6]}")
+det_serving_server_key=$(func_parser_key "${lines[7]}")
+det_serving_server_value=$(func_parser_value "${lines[7]}")
+det_serving_client_key=$(func_parser_key "${lines[8]}")
+det_serving_client_value=$(func_parser_value "${lines[8]}")
+rec_infer_model_dir_key=$(func_parser_key "${lines[9]}")
+rec_infer_model_dir_value=$(func_parser_value "${lines[9]}")
+rec_serving_server_key=$(func_parser_key "${lines[10]}")
+rec_serving_server_value=$(func_parser_value "${lines[10]}")
+rec_serving_client_key=$(func_parser_key "${lines[11]}")
+rec_serving_client_value=$(func_parser_value "${lines[11]}")
+det_server_value=$(func_parser_model_config "${lines[7]}")
+det_client_value=$(func_parser_model_config "${lines[8]}")
+rec_server_value=$(func_parser_model_config "${lines[10]}")
+rec_client_value=$(func_parser_model_config "${lines[11]}")
+serving_dir_value=$(func_parser_value "${lines[12]}")
+web_service_py=$(func_parser_value "${lines[13]}")
+op_key=$(func_parser_key "${lines[14]}")
+op_value=$(func_parser_value "${lines[14]}")
+port_key=$(func_parser_key "${lines[15]}")
+port_value=$(func_parser_value "${lines[15]}")
+gpu_key=$(func_parser_key "${lines[16]}")
+gpu_value=$(func_parser_value "${lines[16]}")
+cpp_client_py=$(func_parser_value "${lines[17]}")
+image_dir_key=$(func_parser_key "${lines[18]}")
+image_dir_value=$(func_parser_value "${lines[18]}")
+
+LOG_PATH="$(pwd)/test_tipc/output/${model_name}/${MODE}/cpp"
+mkdir -p ${LOG_PATH}
+status_log="${LOG_PATH}/results_cpp_serving.log"
+
+function func_serving(){
+ IFS='|'
+ _python=$1
+ _script=$2
+ _model_dir=$3
+ # pdserving
+ set_model_filename=$(func_set_params "${model_filename_key}" "${model_filename_value}")
+ set_params_filename=$(func_set_params "${params_filename_key}" "${params_filename_value}")
+ # trans det
+ set_dirname=$(func_set_params "--dirname" "${det_infer_model_dir_value}")
+ set_serving_server=$(func_set_params "--serving_server" "${det_serving_server_value}")
+ set_serving_client=$(func_set_params "--serving_client" "${det_serving_client_value}")
+ python_list=(${python_list})
+ trans_det_log="${LOG_PATH}/cpp_trans_model_det.log"
+ trans_model_cmd="${python_list[0]} ${trans_model_py} ${set_dirname} ${set_model_filename} ${set_params_filename} ${set_serving_server} ${set_serving_client} > ${trans_det_log} 2>&1 "
+ eval $trans_model_cmd
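+ # overwrite the generated det client config with the prototxt shipped in deploy/pdserving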
+ cp "deploy/pdserving/serving_client_conf.prototxt" ${det_serving_client_value}
+ # trans rec
+ set_dirname=$(func_set_params "--dirname" "${rec_infer_model_dir_value}")
+ set_serving_server=$(func_set_params "--serving_server" "${rec_serving_server_value}")
+ set_serving_client=$(func_set_params "--serving_client" "${rec_serving_client_value}")
+ python_list=(${python_list})
+ trans_rec_log="${LOG_PATH}/cpp_trans_model_rec.log"
+ trans_model_cmd="${python_list[0]} ${trans_model_py} ${set_dirname} ${set_model_filename} ${set_params_filename} ${set_serving_server} ${set_serving_client} > ${trans_rec_log} 2>&1 "
+ eval $trans_model_cmd
+ last_status=${PIPESTATUS[0]}
+ status_check $last_status "${trans_model_cmd}" "${status_log}" "${model_name}"
+ set_image_dir=$(func_set_params "${image_dir_key}" "${image_dir_value}")
+ python_list=(${python_list})
+ cd ${serving_dir_value}
+
+ # cpp serving
+ for gpu_id in ${gpu_value[*]}; do
+ if [ ${gpu_id} = "null" ]; then
+ server_log_path="${LOG_PATH}/cpp_server_cpu.log"
+ web_service_cpp_cmd="nohup ${python_list[0]} ${web_service_py} --model ${det_server_value} ${rec_server_value} ${op_key} ${op_value} ${port_key} ${port_value} > ${server_log_path} 2>&1 &"
+ eval $web_service_cpp_cmd
+ last_status=${PIPESTATUS[0]}
+ status_check $last_status "${web_service_cpp_cmd}" "${status_log}" "${model_name}"
+ sleep 5s
+ _save_log_path="${LOG_PATH}/cpp_client_cpu.log"
+ cpp_client_cmd="${python_list[0]} ${cpp_client_py} ${det_client_value} ${rec_client_value} > ${_save_log_path} 2>&1"
+ eval $cpp_client_cmd
+ last_status=${PIPESTATUS[0]}
+ eval "cat ${_save_log_path}"
+ status_check $last_status "${cpp_client_cmd}" "${status_log}" "${model_name}"
+ ps ux | grep -i ${port_value} | awk '{print $2}' | xargs kill -s 9
+ else
+ server_log_path="${LOG_PATH}/cpp_server_gpu.log"
+ web_service_cpp_cmd="nohup ${python_list[0]} ${web_service_py} --model ${det_server_value} ${rec_server_value} ${op_key} ${op_value} ${port_key} ${port_value} ${gpu_key} ${gpu_id} > ${server_log_path} 2>&1 &"
+ eval $web_service_cpp_cmd
+ sleep 5s
+ _save_log_path="${LOG_PATH}/cpp_client_gpu.log"
+ cpp_client_cmd="${python_list[0]} ${cpp_client_py} ${det_client_value} ${rec_client_value} > ${_save_log_path} 2>&1"
+ eval $cpp_client_cmd
+ last_status=${PIPESTATUS[0]}
+ eval "cat ${_save_log_path}"
+ status_check $last_status "${cpp_client_cmd}" "${status_log}" "${model_name}"
+ ps ux | grep -i ${port_value} | awk '{print $2}' | xargs kill -s 9
+ fi
+ done
+}
+
+
+#set cuda device
+GPUID=$3
+if [ ${#GPUID} -le 0 ];then
+ env="export CUDA_VISIBLE_DEVICES=0"
+else
+ env="export CUDA_VISIBLE_DEVICES=${GPUID}"
+fi
+eval $env
+echo $env
+
+
+echo "################### run test ###################"
+
+export Count=0
+IFS="|"
+func_serving "${web_service_cpp_cmd}"
diff --git a/test_tipc/test_serving_infer_python.sh b/test_tipc/test_serving_infer_python.sh
new file mode 100644
index 0000000000000000000000000000000000000000..4ccccc06e23ce086e7dac1f3446aae9130605444
--- /dev/null
+++ b/test_tipc/test_serving_infer_python.sh
@@ -0,0 +1,229 @@
+#!/bin/bash
+source test_tipc/common_func.sh
+
+function func_parser_model_config(){
+ strs=$1
+ IFS="/"
+ array=(${strs})
+ tmp=${array[-1]}
+ echo ${tmp}
+}
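+# returns the last "/"-separated component of its argument (basename-like)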
+
+FILENAME=$1
+dataline=$(awk 'NR==1, NR==23{print}' $FILENAME)
+MODE=$2
+
+# parser params
+IFS=$'\n'
+lines=(${dataline})
+
+# parser serving
+model_name=$(func_parser_value "${lines[1]}")
+python_list=$(func_parser_value "${lines[2]}")
+trans_model_py=$(func_parser_value "${lines[3]}")
+det_infer_model_dir_key=$(func_parser_key "${lines[4]}")
+det_infer_model_dir_value=$(func_parser_value "${lines[4]}")
+model_filename_key=$(func_parser_key "${lines[5]}")
+model_filename_value=$(func_parser_value "${lines[5]}")
+params_filename_key=$(func_parser_key "${lines[6]}")
+params_filename_value=$(func_parser_value "${lines[6]}")
+det_serving_server_key=$(func_parser_key "${lines[7]}")
+det_serving_server_value=$(func_parser_value "${lines[7]}")
+det_serving_client_key=$(func_parser_key "${lines[8]}")
+det_serving_client_value=$(func_parser_value "${lines[8]}")
+rec_infer_model_dir_key=$(func_parser_key "${lines[9]}")
+rec_infer_model_dir_value=$(func_parser_value "${lines[9]}")
+rec_serving_server_key=$(func_parser_key "${lines[10]}")
+rec_serving_server_value=$(func_parser_value "${lines[10]}")
+rec_serving_client_key=$(func_parser_key "${lines[11]}")
+rec_serving_client_value=$(func_parser_value "${lines[11]}")
+serving_dir_value=$(func_parser_value "${lines[12]}")
+web_service_py=$(func_parser_value "${lines[13]}")
+web_use_gpu_key=$(func_parser_key "${lines[14]}")
+web_use_gpu_list=$(func_parser_value "${lines[14]}")
+web_use_mkldnn_key=$(func_parser_key "${lines[15]}")
+web_use_mkldnn_list=$(func_parser_value "${lines[15]}")
+web_cpu_threads_key=$(func_parser_key "${lines[16]}")
+web_cpu_threads_list=$(func_parser_value "${lines[16]}")
+web_use_trt_key=$(func_parser_key "${lines[17]}")
+web_use_trt_list=$(func_parser_value "${lines[17]}")
+web_precision_key=$(func_parser_key "${lines[18]}")
+web_precision_list=$(func_parser_value "${lines[18]}")
+det_server_key=$(func_parser_key "${lines[19]}")
+det_server_value=$(func_parser_model_config "${lines[7]}")
+det_client_value=$(func_parser_model_config "${lines[8]}")
+rec_server_key=$(func_parser_key "${lines[20]}")
+rec_server_value=$(func_parser_model_config "${lines[10]}")
+rec_client_value=$(func_parser_model_config "${lines[11]}")
+pipeline_py=$(func_parser_value "${lines[21]}")
+image_dir_key=$(func_parser_key "${lines[22]}")
+image_dir_value=$(func_parser_value "${lines[22]}")
+
+LOG_PATH="$(pwd)/test_tipc/output/${model_name}/${MODE}/python"
+mkdir -p ${LOG_PATH}
+status_log="${LOG_PATH}/results_python_serving.log"
+
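+# convert the det/rec inference models to Serving format, start the web service, then run each pipeline client against it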
+function func_serving(){
+ IFS='|'
+ _python=$1
+ _script=$2
+ _model_dir=$3
+ # pdserving
+ set_model_filename=$(func_set_params "${model_filename_key}" "${model_filename_value}")
+ set_params_filename=$(func_set_params "${params_filename_key}" "${params_filename_value}")
+ if [ ${model_name} = "ch_PP-OCRv2" ] || [ ${model_name} = "ch_PP-OCRv3" ] || [ ${model_name} = "ch_ppocr_mobile_v2.0" ] || [ ${model_name} = "ch_ppocr_server_v2.0" ]; then
+ # trans det
+ set_dirname=$(func_set_params "--dirname" "${det_infer_model_dir_value}")
+ set_serving_server=$(func_set_params "--serving_server" "${det_serving_server_value}")
+ set_serving_client=$(func_set_params "--serving_client" "${det_serving_client_value}")
+ python_list=(${python_list})
+ trans_det_log="${LOG_PATH}/python_trans_model_det.log"
+ trans_model_cmd="${python_list[0]} ${trans_model_py} ${set_dirname} ${set_model_filename} ${set_params_filename} ${set_serving_server} ${set_serving_client} > ${trans_det_log} 2>&1 "
+ eval $trans_model_cmd
+ # trans rec
+ set_dirname=$(func_set_params "--dirname" "${rec_infer_model_dir_value}")
+ set_serving_server=$(func_set_params "--serving_server" "${rec_serving_server_value}")
+ set_serving_client=$(func_set_params "--serving_client" "${rec_serving_client_value}")
+ python_list=(${python_list})
+ trans_rec_log="${LOG_PATH}/python_trans_model_rec.log"
+ trans_model_cmd="${python_list[0]} ${trans_model_py} ${set_dirname} ${set_model_filename} ${set_params_filename} ${set_serving_server} ${set_serving_client} > ${trans_rec_log} 2>&1 "
+ eval $trans_model_cmd
+ elif [[ ${model_name} =~ "det" ]]; then
+ # trans det
+ set_dirname=$(func_set_params "--dirname" "${det_infer_model_dir_value}")
+ set_serving_server=$(func_set_params "--serving_server" "${det_serving_server_value}")
+ set_serving_client=$(func_set_params "--serving_client" "${det_serving_client_value}")
+ python_list=(${python_list})
+ trans_det_log="${LOG_PATH}/python_trans_model_det.log"
+ trans_model_cmd="${python_list[0]} ${trans_model_py} ${set_dirname} ${set_model_filename} ${set_params_filename} ${set_serving_server} ${set_serving_client} > ${trans_det_log} 2>&1 "
+ eval $trans_model_cmd
+ elif [[ ${model_name} =~ "rec" ]]; then
+ # trans rec
+ set_dirname=$(func_set_params "--dirname" "${rec_infer_model_dir_value}")
+ set_serving_server=$(func_set_params "--serving_server" "${rec_serving_server_value}")
+ set_serving_client=$(func_set_params "--serving_client" "${rec_serving_client_value}")
+ python_list=(${python_list})
+ trans_rec_log="${LOG_PATH}/python_trans_model_rec.log"
+ trans_model_cmd="${python_list[0]} ${trans_model_py} ${set_dirname} ${set_model_filename} ${set_params_filename} ${set_serving_server} ${set_serving_client} > ${trans_rec_log} 2>&1 "
+ eval $trans_model_cmd
+ fi
+ set_image_dir=$(func_set_params "${image_dir_key}" "${image_dir_value}")
+ python_list=(${python_list})
+
+ cd ${serving_dir_value}
+ python=${python_list[0]}
+
+ # python serving
+ for use_gpu in ${web_use_gpu_list[*]}; do
+ if [ ${use_gpu} = "null" ]; then
+ for use_mkldnn in ${web_use_mkldnn_list[*]}; do
+ for threads in ${web_cpu_threads_list[*]}; do
+ set_cpu_threads=$(func_set_params "${web_cpu_threads_key}" "${threads}")
+ server_log_path="${LOG_PATH}/python_server_cpu_usemkldnn_${use_mkldnn}_threads_${threads}.log"
+ if [ ${model_name} = "ch_PP-OCRv2" ] || [ ${model_name} = "ch_PP-OCRv3" ] || [ ${model_name} = "ch_ppocr_mobile_v2.0" ] || [ ${model_name} = "ch_ppocr_server_v2.0" ]; then
+ set_det_model_config=$(func_set_params "${det_server_key}" "${det_server_value}")
+ set_rec_model_config=$(func_set_params "${rec_server_key}" "${rec_server_value}")
+ web_service_cmd="nohup ${python} ${web_service_py} ${web_use_gpu_key}="" ${web_use_mkldnn_key}=${use_mkldnn} ${set_cpu_threads} ${set_det_model_config} ${set_rec_model_config} > ${server_log_path} 2>&1 &"
+ eval $web_service_cmd
+ last_status=${PIPESTATUS[0]}
+ status_check $last_status "${web_service_cmd}" "${status_log}" "${model_name}"
+ elif [[ ${model_name} =~ "det" ]]; then
+ set_det_model_config=$(func_set_params "${det_server_key}" "${det_server_value}")
+ web_service_cmd="nohup ${python} ${web_service_py} ${web_use_gpu_key}="" ${web_use_mkldnn_key}=${use_mkldnn} ${set_cpu_threads} ${set_det_model_config} > ${server_log_path} 2>&1 &"
+ eval $web_service_cmd
+ last_status=${PIPESTATUS[0]}
+ status_check $last_status "${web_service_cmd}" "${status_log}" "${model_name}"
+ elif [[ ${model_name} =~ "rec" ]]; then
+ set_rec_model_config=$(func_set_params "${rec_server_key}" "${rec_server_value}")
+ web_service_cmd="nohup ${python} ${web_service_py} ${web_use_gpu_key}="" ${web_use_mkldnn_key}=${use_mkldnn} ${set_cpu_threads} ${set_rec_model_config} > ${server_log_path} 2>&1 &"
+ eval $web_service_cmd
+ last_status=${PIPESTATUS[0]}
+ status_check $last_status "${web_service_cmd}" "${status_log}" "${model_name}"
+ fi
+ sleep 2s
+ for pipeline in ${pipeline_py[*]}; do
+ _save_log_path="${LOG_PATH}/python_client_cpu_${pipeline%_client*}_usemkldnn_${use_mkldnn}_threads_${threads}_batchsize_1.log"
+ pipeline_cmd="${python} ${pipeline} ${set_image_dir} > ${_save_log_path} 2>&1 "
+ eval $pipeline_cmd
+ last_status=${PIPESTATUS[0]}
+ eval "cat ${_save_log_path}"
+ status_check $last_status "${pipeline_cmd}" "${status_log}" "${model_name}"
+ sleep 2s
+ done
+ ps ux | grep -E 'web_service' | awk '{print $2}' | xargs kill -s 9
+ done
+ done
+ elif [ ${use_gpu} = "gpu" ]; then
+ for use_trt in ${web_use_trt_list[*]}; do
+ for precision in ${web_precision_list[*]}; do
+ server_log_path="${LOG_PATH}/python_server_gpu_usetrt_${use_trt}_precision_${precision}.log"
+ if [[ ${_flag_quant} = "False" ]] && [[ ${precision} =~ "int8" ]]; then
+ continue
+ fi
+ if [[ ${precision} =~ "fp16" || ${precision} =~ "int8" ]] && [ ${use_trt} = "False" ]; then
+ continue
+ fi
+ if [[ ${use_trt} = "False" || ${precision} =~ "int8" ]] && [[ ${_flag_quant} = "True" ]]; then
+ continue
+ fi
+ set_tensorrt=$(func_set_params "${web_use_trt_key}" "${use_trt}")
+ if [ ${use_trt} = True ]; then
+ device_type=2
+ fi
+ set_precision=$(func_set_params "${web_precision_key}" "${precision}")
+ if [ ${model_name} = "ch_PP-OCRv2" ] || [ ${model_name} = "ch_PP-OCRv3" ] || [ ${model_name} = "ch_ppocr_mobile_v2.0" ] || [ ${model_name} = "ch_ppocr_server_v2.0" ]; then
+ set_det_model_config=$(func_set_params "${det_server_key}" "${det_server_value}")
+ set_rec_model_config=$(func_set_params "${rec_server_key}" "${rec_server_value}")
+ web_service_cmd="nohup ${python} ${web_service_py} ${set_tensorrt} ${set_precision} ${set_det_model_config} ${set_rec_model_config} > ${server_log_path} 2>&1 &"
+ eval $web_service_cmd
+ last_status=${PIPESTATUS[0]}
+ status_check $last_status "${web_service_cmd}" "${status_log}" "${model_name}"
+ elif [[ ${model_name} =~ "det" ]]; then
+ set_det_model_config=$(func_set_params "${det_server_key}" "${det_server_value}")
+ web_service_cmd="nohup ${python} ${web_service_py} ${set_tensorrt} ${set_precision} ${set_det_model_config} > ${server_log_path} 2>&1 &"
+ eval $web_service_cmd
+ last_status=${PIPESTATUS[0]}
+ status_check $last_status "${web_service_cmd}" "${status_log}" "${model_name}"
+ elif [[ ${model_name} =~ "rec" ]]; then
+ set_rec_model_config=$(func_set_params "${rec_server_key}" "${rec_server_value}")
+ web_service_cmd="nohup ${python} ${web_service_py} ${set_tensorrt} ${set_precision} ${set_rec_model_config} > ${server_log_path} 2>&1 &"
+ eval $web_service_cmd
+ last_status=${PIPESTATUS[0]}
+ status_check $last_status "${web_service_cmd}" "${status_log}" "${model_name}"
+ fi
+ sleep 2s
+ for pipeline in ${pipeline_py[*]}; do
+ _save_log_path="${LOG_PATH}/python_client_gpu_${pipeline%_client*}_usetrt_${use_trt}_precision_${precision}_batchsize_1.log"
+ pipeline_cmd="${python} ${pipeline} ${set_image_dir}> ${_save_log_path} 2>&1"
+ eval $pipeline_cmd
+ last_status=${PIPESTATUS[0]}
+ eval "cat ${_save_log_path}"
+ status_check $last_status "${pipeline_cmd}" "${status_log}" "${model_name}"
+ sleep 2s
+ done
+ ps ux | grep -E 'web_service' | awk '{print $2}' | xargs kill -s 9
+ done
+ done
+ else
+ echo "Does not support hardware other than CPU and GPU Currently!"
+ fi
+ done
+}
+
+
+#set cuda device
+GPUID=$3
+if [ ${#GPUID} -le 0 ];then
+ env="export CUDA_VISIBLE_DEVICES=0"
+else
+ env="export CUDA_VISIBLE_DEVICES=${GPUID}"
+fi
+eval $env
+echo $env
+
+
+echo "################### run test ###################"
+
+export Count=0
+IFS="|"
+func_serving "${web_service_cmd}"
diff --git a/test_tipc/test_train_inference_python.sh b/test_tipc/test_train_inference_python.sh
index fe98cb00f6cc428995d7f91db55895e0f1cd9bfd..907efcec9008f89740971bb6d4253bafb44938c4 100644
--- a/test_tipc/test_train_inference_python.sh
+++ b/test_tipc/test_train_inference_python.sh
@@ -2,7 +2,7 @@
source test_tipc/common_func.sh
FILENAME=$1
-# MODE be one of ['lite_train_lite_infer' 'lite_train_whole_infer' 'whole_train_whole_infer', 'whole_infer', 'klquant_whole_infer']
+# MODE must be one of ['lite_train_lite_infer', 'lite_train_whole_infer', 'whole_train_whole_infer', 'whole_infer']
MODE=$2
dataline=$(awk 'NR==1, NR==51{print}' $FILENAME)
@@ -88,44 +88,7 @@ benchmark_value=$(func_parser_value "${lines[49]}")
infer_key1=$(func_parser_key "${lines[50]}")
infer_value1=$(func_parser_value "${lines[50]}")
-# parser klquant_infer
-if [ ${MODE} = "klquant_whole_infer" ]; then
- dataline=$(awk 'NR==1, NR==17{print}' $FILENAME)
- lines=(${dataline})
- model_name=$(func_parser_value "${lines[1]}")
- python=$(func_parser_value "${lines[2]}")
- export_weight=$(func_parser_key "${lines[3]}")
- save_infer_key=$(func_parser_key "${lines[4]}")
- # parser inference model
- infer_model_dir_list=$(func_parser_value "${lines[5]}")
- infer_export_list=$(func_parser_value "${lines[6]}")
- infer_is_quant=$(func_parser_value "${lines[7]}")
- # parser inference
- inference_py=$(func_parser_value "${lines[8]}")
- use_gpu_key=$(func_parser_key "${lines[9]}")
- use_gpu_list=$(func_parser_value "${lines[9]}")
- use_mkldnn_key=$(func_parser_key "${lines[10]}")
- use_mkldnn_list=$(func_parser_value "${lines[10]}")
- cpu_threads_key=$(func_parser_key "${lines[11]}")
- cpu_threads_list=$(func_parser_value "${lines[11]}")
- batch_size_key=$(func_parser_key "${lines[12]}")
- batch_size_list=$(func_parser_value "${lines[12]}")
- use_trt_key=$(func_parser_key "${lines[13]}")
- use_trt_list=$(func_parser_value "${lines[13]}")
- precision_key=$(func_parser_key "${lines[14]}")
- precision_list=$(func_parser_value "${lines[14]}")
- infer_model_key=$(func_parser_key "${lines[15]}")
- image_dir_key=$(func_parser_key "${lines[16]}")
- infer_img_dir=$(func_parser_value "${lines[16]}")
- save_log_key=$(func_parser_key "${lines[17]}")
- save_log_value=$(func_parser_value "${lines[17]}")
- benchmark_key=$(func_parser_key "${lines[18]}")
- benchmark_value=$(func_parser_value "${lines[18]}")
- infer_key1=$(func_parser_key "${lines[19]}")
- infer_value1=$(func_parser_value "${lines[19]}")
-fi
-
-LOG_PATH="./test_tipc/output/${model_name}"
+LOG_PATH="./test_tipc/output/${model_name}/${MODE}"
mkdir -p ${LOG_PATH}
status_log="${LOG_PATH}/results_python.log"
@@ -142,9 +105,9 @@ function func_inference(){
for use_gpu in ${use_gpu_list[*]}; do
if [ ${use_gpu} = "False" ] || [ ${use_gpu} = "cpu" ]; then
for use_mkldnn in ${use_mkldnn_list[*]}; do
- if [ ${use_mkldnn} = "False" ] && [ ${_flag_quant} = "True" ]; then
- continue
- fi
+ # if [ ${use_mkldnn} = "False" ] && [ ${_flag_quant} = "True" ]; then
+ # continue
+ # fi
for threads in ${cpu_threads_list[*]}; do
for batch_size in ${batch_size_list[*]}; do
for precision in ${precision_list[*]}; do
@@ -169,7 +132,7 @@ function func_inference(){
eval $command
last_status=${PIPESTATUS[0]}
eval "cat ${_save_log_path}"
- status_check $last_status "${command}" "${status_log}"
+ status_check $last_status "${command}" "${status_log}" "${model_name}"
done
done
done
@@ -200,7 +163,7 @@ function func_inference(){
eval $command
last_status=${PIPESTATUS[0]}
eval "cat ${_save_log_path}"
- status_check $last_status "${command}" "${status_log}"
+ status_check $last_status "${command}" "${status_log}" "${model_name}"
done
done
@@ -211,7 +174,7 @@ function func_inference(){
done
}
-if [ ${MODE} = "whole_infer" ] || [ ${MODE} = "klquant_whole_infer" ]; then
+if [ ${MODE} = "whole_infer" ]; then
GPUID=$3
if [ ${#GPUID} -le 0 ];then
env=" "
@@ -226,29 +189,22 @@ if [ ${MODE} = "whole_infer" ] || [ ${MODE} = "klquant_whole_infer" ]; then
infer_quant_flag=(${infer_is_quant})
for infer_model in ${infer_model_dir_list[*]}; do
# run export
- if [ ${infer_run_exports[Count]} != "null" ];then
- if [ ${MODE} = "klquant_whole_infer" ]; then
- save_infer_dir="${infer_model}_klquant"
- fi
- if [ ${MODE} = "whole_infer" ]; then
- save_infer_dir="${infer_model}"
- fi
+ if [ ${infer_run_exports[Count]} != "null" ];then
+ save_infer_dir="${infer_model}"
set_export_weight=$(func_set_params "${export_weight}" "${infer_model}")
set_save_infer_key=$(func_set_params "${save_infer_key}" "${save_infer_dir}")
- export_cmd="${python} ${infer_run_exports[Count]} ${set_export_weight} ${set_save_infer_key}"
+ export_log_path="${LOG_PATH}_export_${Count}.log"
+ export_cmd="${python} ${infer_run_exports[Count]} ${set_export_weight} ${set_save_infer_key} > ${export_log_path} 2>&1 "
echo ${infer_run_exports[Count]}
echo $export_cmd
eval $export_cmd
status_export=$?
- status_check $status_export "${export_cmd}" "${status_log}"
+ status_check $status_export "${export_cmd}" "${status_log}" "${model_name}"
else
save_infer_dir=${infer_model}
fi
#run inference
is_quant=${infer_quant_flag[Count]}
- if [ ${MODE} = "klquant_whole_infer" ]; then
- is_quant="True"
- fi
func_inference "${python}" "${inference_py}" "${save_infer_dir}" "${LOG_PATH}" "${infer_img_dir}" ${is_quant}
Count=$(($Count + 1))
done
@@ -315,7 +271,9 @@ else
set_batchsize=$(func_set_params "${train_batch_key}" "${train_batch_value}")
set_train_params1=$(func_set_params "${train_param_key1}" "${train_param_value1}")
set_use_gpu=$(func_set_params "${train_use_gpu_key}" "${train_use_gpu}")
- if [ ${#ips} -le 26 ];then
+            # a run is treated as multi-machine only when the ips string is longer than 15 characters
+            # 15 is the length of the shortest possible two-host ips list: 0.0.0.0,0.0.0.0
+ if [ ${#ips} -le 15 ];then
save_log="${LOG_PATH}/${trainer}_gpus_${gpu}_autocast_${autocast}"
nodes=1
else
@@ -330,14 +288,15 @@ else
set_save_model=$(func_set_params "${save_model_key}" "${save_log}")
if [ ${#gpu} -le 2 ];then # train with cpu or single gpu
cmd="${python} ${run_train} ${set_use_gpu} ${set_save_model} ${set_epoch} ${set_pretrain} ${set_autocast} ${set_batchsize} ${set_train_params1} ${set_amp_config} "
- elif [ ${#ips} -le 26 ];then # train with multi-gpu
+ elif [ ${#ips} -le 15 ];then # train with multi-gpu
cmd="${python} -m paddle.distributed.launch --gpus=${gpu} ${run_train} ${set_use_gpu} ${set_save_model} ${set_epoch} ${set_pretrain} ${set_autocast} ${set_batchsize} ${set_train_params1} ${set_amp_config}"
else # train with multi-machine
cmd="${python} -m paddle.distributed.launch --ips=${ips} --gpus=${gpu} ${run_train} ${set_use_gpu} ${set_save_model} ${set_pretrain} ${set_epoch} ${set_autocast} ${set_batchsize} ${set_train_params1} ${set_amp_config}"
fi
# run train
eval $cmd
- status_check $? "${cmd}" "${status_log}"
+ eval "cat ${save_log}/train.log >> ${save_log}.log"
+ status_check $? "${cmd}" "${status_log}" "${model_name}"
set_eval_pretrain=$(func_set_params "${pretrain_model_key}" "${save_log}/${train_model_name}")
@@ -345,19 +304,21 @@ else
if [ ${eval_py} != "null" ]; then
eval ${env}
set_eval_params1=$(func_set_params "${eval_key1}" "${eval_value1}")
- eval_cmd="${python} ${eval_py} ${set_eval_pretrain} ${set_use_gpu} ${set_eval_params1}"
+ eval_log_path="${LOG_PATH}/${trainer}_gpus_${gpu}_autocast_${autocast}_nodes_${nodes}_eval.log"
+ eval_cmd="${python} ${eval_py} ${set_eval_pretrain} ${set_use_gpu} ${set_eval_params1} > ${eval_log_path} 2>&1 "
eval $eval_cmd
- status_check $? "${eval_cmd}" "${status_log}"
+ status_check $? "${eval_cmd}" "${status_log}" "${model_name}"
fi
# run export model
if [ ${run_export} != "null" ]; then
# run export model
save_infer_path="${save_log}"
+ export_log_path="${LOG_PATH}/${trainer}_gpus_${gpu}_autocast_${autocast}_nodes_${nodes}_export.log"
set_export_weight=$(func_set_params "${export_weight}" "${save_log}/${train_model_name}")
set_save_infer_key=$(func_set_params "${save_infer_key}" "${save_infer_path}")
- export_cmd="${python} ${run_export} ${set_export_weight} ${set_save_infer_key}"
+ export_cmd="${python} ${run_export} ${set_export_weight} ${set_save_infer_key} > ${export_log_path} 2>&1 "
eval $export_cmd
- status_check $? "${export_cmd}" "${status_log}"
+ status_check $? "${export_cmd}" "${status_log}" "${model_name}"
#run inference
eval $env
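
For readers skimming the shell changes above: the `${#ips} -le 15` test is only a string-length heuristic for detecting multi-machine runs. A minimal Python sketch of the same rule (the function name is mine, not the script's):

```
# "0.0.0.0,0.0.0.0" is the shortest possible two-host list (15 chars),
# so anything longer than 15 characters is treated as multi-machine.
def is_multi_machine(ips: str) -> bool:
    return len(ips) > 15

assert not is_multi_machine("")                       # single machine
assert is_multi_machine("192.168.0.1,192.168.0.2")    # two hosts
```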
diff --git a/tools/end2end/convert_ppocr_label.py b/tools/end2end/convert_ppocr_label.py
index 8084cac785125f23885399931f98531326b6fb20..c64b9ed168113879182262b609bea692fcb73165 100644
--- a/tools/end2end/convert_ppocr_label.py
+++ b/tools/end2end/convert_ppocr_label.py
@@ -85,10 +85,16 @@ def convert_label(label_dir, mode="gt", save_dir="./save_results/"):
print("The convert label saved in {}".format(save_dir))
-if __name__ == "__main__":
+def parse_args():
+ import argparse
+ parser = argparse.ArgumentParser(description="args")
+ parser.add_argument("--label_path", type=str, required=True)
+ parser.add_argument("--save_folder", type=str, required=True)
+    parser.add_argument("--mode", type=str, default="gt")  # "gt" or "pred"
+ args = parser.parse_args()
+ return args
- ppocr_label_gt = "/paddle/Datasets/chinese/test_set/Label_refine_310_V2.txt"
- convert_label(ppocr_label_gt, "gt", "./save_gt_310_V2/")
- ppocr_label_gt = "./infer_results/ch_PPOCRV2_infer.txt"
- convert_label(ppocr_label_gt_en, "pred", "./save_PPOCRV2_infer/")
+if __name__ == "__main__":
+ args = parse_args()
+ convert_label(args.label_path, args.mode, args.save_folder)
diff --git a/tools/end2end/readme.md b/tools/end2end/readme.md
index 69da06dcdabc92c0b6f1831341e592e674ea7473..636ee764aef35a551f729cce20c16357ddf4f495 100644
--- a/tools/end2end/readme.md
+++ b/tools/end2end/readme.md
@@ -23,19 +23,13 @@ all-sum-510/00224225.jpg [{"transcription": "超赞", "points": [[8.0, 48
**Step 2:**
Convert the data saved in Step 1 into the data format required for end-to-end evaluation:
-Modify the code in `tools/convert_ppocr_label.py`: in the convert_label function, set the input label path, the mode, and the save path, to convert both the GT labels of the test data and the labels of the prediction results.
-```
-ppocr_label_gt = "gt_label.txt"
-convert_label(ppocr_label_gt, "gt", "./save_gt_label/")
+Use `tools/end2end/convert_ppocr_label.py` to convert both the GT labels of the test data and the labels of the prediction results, passing the input label path, the mode, and the save folder on the command line:
-ppocr_label_gt = "./ch_PP-OCRv2_results/system_results.txt"
-convert_label(ppocr_label_gt_en, "pred", "./save_PPOCRV2_infer/")
```
+python3 tools/end2end/convert_ppocr_label.py --mode=gt --label_path=path/to/label_txt --save_folder=save_gt_label
-Run `convert_ppocr_label.py`:
-```
-python3 tools/convert_ppocr_label.py
+python3 tools/end2end/convert_ppocr_label.py --mode=pred --label_path=path/to/pred_txt --save_folder=save_PPOCRV2_infer
```
The following results are obtained:
diff --git a/tools/export_model.py b/tools/export_model.py
index fbb2201e39906660ac200350751f684091117f30..afecbff8cbb834a5aa5ef3ea1448cf04fbd8c3bb 100755
--- a/tools/export_model.py
+++ b/tools/export_model.py
@@ -17,7 +17,7 @@ import sys
__dir__ = os.path.dirname(os.path.abspath(__file__))
sys.path.append(__dir__)
-sys.path.append(os.path.abspath(os.path.join(__dir__, "..")))
+sys.path.insert(0, os.path.abspath(os.path.join(__dir__, "..")))
import argparse
@@ -31,7 +31,12 @@ from ppocr.utils.logging import get_logger
from tools.program import load_config, merge_config, ArgsParser
-def export_single_model(model, arch_config, save_path, logger, quanter=None):
+def export_single_model(model,
+ arch_config,
+ save_path,
+ logger,
+ input_shape=None,
+ quanter=None):
if arch_config["algorithm"] == "SRN":
max_text_length = arch_config["Head"]["max_text_length"]
other_shape = [
@@ -64,7 +69,7 @@ def export_single_model(model, arch_config, save_path, logger, quanter=None):
else:
other_shape = [
paddle.static.InputSpec(
- shape=[None, 3, 64, 256], dtype="float32"),
+ shape=[None] + input_shape, dtype="float32"),
]
model = to_static(model, input_spec=other_shape)
elif arch_config["algorithm"] == "PREN":
@@ -73,10 +78,45 @@ def export_single_model(model, arch_config, save_path, logger, quanter=None):
shape=[None, 3, 64, 512], dtype="float32"),
]
model = to_static(model, input_spec=other_shape)
+ elif arch_config["algorithm"] == "ViTSTR":
+ other_shape = [
+ paddle.static.InputSpec(
+ shape=[None, 1, 224, 224], dtype="float32"),
+ ]
+ model = to_static(model, input_spec=other_shape)
+ elif arch_config["algorithm"] == "ABINet":
+ other_shape = [
+ paddle.static.InputSpec(
+ shape=[None, 3, 32, 128], dtype="float32"),
+ ]
+ model = to_static(model, input_spec=other_shape)
+ elif arch_config["algorithm"] == "NRTR":
+ other_shape = [
+ paddle.static.InputSpec(
+ shape=[None, 1, 32, 100], dtype="float32"),
+ ]
+ model = to_static(model, input_spec=other_shape)
+ elif arch_config["algorithm"] in ["LayoutLM", "LayoutLMv2", "LayoutXLM"]:
+ input_spec = [
+ paddle.static.InputSpec(
+ shape=[None, 512], dtype="int64"), # input_ids
+ paddle.static.InputSpec(
+ shape=[None, 512, 4], dtype="int64"), # bbox
+ paddle.static.InputSpec(
+ shape=[None, 512], dtype="int64"), # attention_mask
+ paddle.static.InputSpec(
+ shape=[None, 512], dtype="int64"), # token_type_ids
+ paddle.static.InputSpec(
+ shape=[None, 3, 224, 224], dtype="int64"), # image
+ ]
+ if arch_config["algorithm"] == "LayoutLM":
+ input_spec.pop(4)
+ model = to_static(model, input_spec=[input_spec])
else:
infer_shape = [3, -1, -1]
if arch_config["model_type"] == "rec":
- infer_shape = [3, 32, -1] # for rec model, H must be 32
+            infer_shape = [3, 48, -1]  # for rec model, H must be 48
if "Transform" in arch_config and arch_config[
"Transform"] is not None and arch_config["Transform"][
"name"] == "TPS":
@@ -84,8 +124,6 @@ def export_single_model(model, arch_config, save_path, logger, quanter=None):
"When there is tps in the network, variable length input is not supported, and the input size needs to be the same as during training"
)
infer_shape[-1] = 100
- if arch_config["algorithm"] == "NRTR":
- infer_shape = [1, 32, 100]
elif arch_config["model_type"] == "table":
infer_shape = [3, 488, 488]
if arch_config["algorithm"] == "TableMaster":
@@ -152,13 +190,20 @@ def main():
config["Architecture"]["Head"]["out_channels"] = char_num
model = build_model(config["Architecture"])
- load_model(config, model)
+ load_model(config, model, model_type=config['Architecture']["model_type"])
model.eval()
save_path = config["Global"]["save_inference_dir"]
arch_config = config["Architecture"]
+ if arch_config["algorithm"] == "SVTR" and arch_config["Head"][
+ "name"] != 'MultiHead':
+ input_shape = config["Eval"]["dataset"]["transforms"][-2][
+ 'SVTRRecResizeImg']['image_shape']
+ else:
+ input_shape = None
+
if arch_config["algorithm"] in ["Distillation", ]: # distillation model
archs = list(arch_config["Models"].values())
for idx, name in enumerate(model.model_name_list):
@@ -167,7 +212,8 @@ def main():
sub_model_save_path, logger)
else:
save_path = os.path.join(save_path, "inference")
- export_single_model(model, arch_config, save_path, logger)
+ export_single_model(
+ model, arch_config, save_path, logger, input_shape=input_shape)
if __name__ == "__main__":
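
The `export_single_model` changes above all follow the same pattern: pick a `paddle.static.InputSpec` for the algorithm, then trace the model with `to_static` and save it. A self-contained sketch of that pattern, using a hypothetical `TinyRecModel` in place of a real recognizer:

```
import paddle
from paddle.jit import to_static

class TinyRecModel(paddle.nn.Layer):  # hypothetical stand-in model
    def __init__(self):
        super().__init__()
        self.conv = paddle.nn.Conv2D(3, 8, 3, padding=1)

    def forward(self, x):
        return self.conv(x)

model = TinyRecModel()
model.eval()
# [None, 3, 48, -1]: batch and width stay dynamic, height is fixed at 48
spec = [paddle.static.InputSpec(shape=[None, 3, 48, -1], dtype="float32")]
static_model = to_static(model, input_spec=spec)
paddle.jit.save(static_model, "./tiny_rec_inference/inference")
```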
diff --git a/tools/infer/predict_det.py b/tools/infer/predict_det.py
index 5f2675d667c2aab8186886a60d8d447f43419954..394a48948b1f284bd405532769b76eeb298668bd 100755
--- a/tools/infer/predict_det.py
+++ b/tools/infer/predict_det.py
@@ -67,6 +67,23 @@ class TextDetector(object):
postprocess_params["unclip_ratio"] = args.det_db_unclip_ratio
postprocess_params["use_dilation"] = args.use_dilation
postprocess_params["score_mode"] = args.det_db_score_mode
+ elif self.det_algorithm == "DB++":
+ postprocess_params['name'] = 'DBPostProcess'
+ postprocess_params["thresh"] = args.det_db_thresh
+ postprocess_params["box_thresh"] = args.det_db_box_thresh
+ postprocess_params["max_candidates"] = 1000
+ postprocess_params["unclip_ratio"] = args.det_db_unclip_ratio
+ postprocess_params["use_dilation"] = args.use_dilation
+ postprocess_params["score_mode"] = args.det_db_score_mode
+ pre_process_list[1] = {
+ 'NormalizeImage': {
+ 'std': [1.0, 1.0, 1.0],
+ 'mean':
+ [0.48109378172549, 0.45752457890196, 0.40787054090196],
+ 'scale': '1./255.',
+ 'order': 'hwc'
+ }
+ }
elif self.det_algorithm == "EAST":
postprocess_params['name'] = 'EASTPostProcess'
postprocess_params["score_thresh"] = args.det_east_score_thresh
@@ -154,9 +171,10 @@ class TextDetector(object):
s = pts.sum(axis=1)
rect[0] = pts[np.argmin(s)]
rect[2] = pts[np.argmax(s)]
- diff = np.diff(pts, axis=1)
- rect[1] = pts[np.argmin(diff)]
- rect[3] = pts[np.argmax(diff)]
+ tmp = np.delete(pts, (np.argmin(s), np.argmax(s)), axis=0)
+ diff = np.diff(np.array(tmp), axis=1)
+ rect[1] = tmp[np.argmin(diff)]
+ rect[3] = tmp[np.argmax(diff)]
return rect
def clip_det_res(self, points, img_height, img_width):
@@ -230,7 +248,7 @@ class TextDetector(object):
preds['f_score'] = outputs[1]
preds['f_tco'] = outputs[2]
preds['f_tvo'] = outputs[3]
- elif self.det_algorithm in ['DB', 'PSE']:
+ elif self.det_algorithm in ['DB', 'PSE', 'DB++']:
preds['maps'] = outputs[0]
elif self.det_algorithm == 'FCE':
for i, output in enumerate(outputs):
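
The `order_points_clockwise` fix above guards against the case where the naive `argmin`/`argmax` over `y - x` picks a point that was already chosen as a sum-extreme; deleting the two sum-extremes first guarantees four distinct corners. A standalone numpy check of the fixed logic:

```
import numpy as np

def order_points_clockwise(pts):
    # re-implementation of the fixed logic above, for illustration only
    rect = np.zeros((4, 2), dtype="float32")
    s = pts.sum(axis=1)
    rect[0] = pts[np.argmin(s)]     # top-left: smallest x + y
    rect[2] = pts[np.argmax(s)]     # bottom-right: largest x + y
    tmp = np.delete(pts, (np.argmin(s), np.argmax(s)), axis=0)
    diff = np.diff(np.array(tmp), axis=1)
    rect[1] = tmp[np.argmin(diff)]  # top-right: smallest y - x
    rect[3] = tmp[np.argmax(diff)]  # bottom-left: largest y - x
    return rect

box = np.array([[10, 10], [90, 12], [92, 40], [8, 38]], dtype="float32")
print(order_points_clockwise(box))  # TL, TR, BR, BL
```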
diff --git a/tools/infer/predict_rec.py b/tools/infer/predict_rec.py
index 3664ef2caf4b888d6a3918202256c99cc54c5eb1..a95f55596647acc0eaca9616b5630917d7ebdf3a 100755
--- a/tools/infer/predict_rec.py
+++ b/tools/infer/predict_rec.py
@@ -69,6 +69,18 @@ class TextRecognizer(object):
"character_dict_path": args.rec_char_dict_path,
"use_space_char": args.use_space_char
}
+ elif self.rec_algorithm == 'ViTSTR':
+ postprocess_params = {
+ 'name': 'ViTSTRLabelDecode',
+ "character_dict_path": args.rec_char_dict_path,
+ "use_space_char": args.use_space_char
+ }
+ elif self.rec_algorithm == 'ABINet':
+ postprocess_params = {
+ 'name': 'ABINetLabelDecode',
+ "character_dict_path": args.rec_char_dict_path,
+ "use_space_char": args.use_space_char
+ }
self.postprocess_op = build_post_process(postprocess_params)
self.predictor, self.input_tensor, self.output_tensors, self.config = \
utility.create_predictor(args, 'rec', logger)
@@ -96,15 +108,22 @@ class TextRecognizer(object):
def resize_norm_img(self, img, max_wh_ratio):
imgC, imgH, imgW = self.rec_image_shape
- if self.rec_algorithm == 'NRTR':
+ if self.rec_algorithm == 'NRTR' or self.rec_algorithm == 'ViTSTR':
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# return padding_im
image_pil = Image.fromarray(np.uint8(img))
- img = image_pil.resize([100, 32], Image.ANTIALIAS)
+ if self.rec_algorithm == 'ViTSTR':
+ img = image_pil.resize([imgW, imgH], Image.BICUBIC)
+ else:
+ img = image_pil.resize([imgW, imgH], Image.ANTIALIAS)
img = np.array(img)
norm_img = np.expand_dims(img, -1)
norm_img = norm_img.transpose((2, 0, 1))
- return norm_img.astype(np.float32) / 128. - 1.
+ if self.rec_algorithm == 'ViTSTR':
+ norm_img = norm_img.astype(np.float32) / 255.
+ else:
+ norm_img = norm_img.astype(np.float32) / 128. - 1.
+ return norm_img
assert imgC == img.shape[2]
imgW = int((imgH * max_wh_ratio))
@@ -132,17 +151,6 @@ class TextRecognizer(object):
padding_im[:, :, 0:resized_w] = resized_image
return padding_im
- def resize_norm_img_svtr(self, img, image_shape):
-
- imgC, imgH, imgW = image_shape
- resized_image = cv2.resize(
- img, (imgW, imgH), interpolation=cv2.INTER_LINEAR)
- resized_image = resized_image.astype('float32')
- resized_image = resized_image.transpose((2, 0, 1)) / 255
- resized_image -= 0.5
- resized_image /= 0.5
- return resized_image
-
def resize_norm_img_srn(self, img, image_shape):
imgC, imgH, imgW = image_shape
@@ -250,6 +258,35 @@ class TextRecognizer(object):
return padding_im, resize_shape, pad_shape, valid_ratio
+ def resize_norm_img_svtr(self, img, image_shape):
+
+ imgC, imgH, imgW = image_shape
+ resized_image = cv2.resize(
+ img, (imgW, imgH), interpolation=cv2.INTER_LINEAR)
+ resized_image = resized_image.astype('float32')
+ resized_image = resized_image.transpose((2, 0, 1)) / 255
+ resized_image -= 0.5
+ resized_image /= 0.5
+ return resized_image
+
+ def resize_norm_img_abinet(self, img, image_shape):
+
+ imgC, imgH, imgW = image_shape
+
+ resized_image = cv2.resize(
+ img, (imgW, imgH), interpolation=cv2.INTER_LINEAR)
+ resized_image = resized_image.astype('float32')
+ resized_image = resized_image / 255.
+
+ mean = np.array([0.485, 0.456, 0.406])
+ std = np.array([0.229, 0.224, 0.225])
+ resized_image = (
+ resized_image - mean[None, None, ...]) / std[None, None, ...]
+ resized_image = resized_image.transpose((2, 0, 1))
+ resized_image = resized_image.astype('float32')
+
+ return resized_image
+
def __call__(self, img_list):
img_num = len(img_list)
# Calculate the aspect ratio of all text bars
@@ -300,6 +337,11 @@ class TextRecognizer(object):
self.rec_image_shape)
norm_img = norm_img[np.newaxis, :]
norm_img_batch.append(norm_img)
+ elif self.rec_algorithm == "ABINet":
+ norm_img = self.resize_norm_img_abinet(
+ img_list[indices[ino]], self.rec_image_shape)
+ norm_img = norm_img[np.newaxis, :]
+ norm_img_batch.append(norm_img)
else:
norm_img = self.resize_norm_img(img_list[indices[ino]],
max_wh_ratio)
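
The new `resize_norm_img_abinet` differs from the other resize helpers in that it applies ImageNet mean/std normalization after the 0-1 scaling. A standalone numpy sketch of the same arithmetic (a random array stands in for a real 32x128 crop, so `cv2.resize` is skipped):

```
import numpy as np

img = np.random.randint(0, 256, (32, 128, 3)).astype("float32") / 255.0
mean = np.array([0.485, 0.456, 0.406])  # ImageNet channel statistics
std = np.array([0.229, 0.224, 0.225])
img = (img - mean[None, None, ...]) / std[None, None, ...]
chw = img.transpose((2, 0, 1)).astype("float32")  # HWC -> CHW
print(chw.shape)  # (3, 32, 128)
```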
diff --git a/tools/infer/utility.py b/tools/infer/utility.py
index 48b16db4a0f2c2c901509d691088d3dc4381fabd..7eb77dec74bf283936e1143edcb5b5dfc28365bd 100644
--- a/tools/infer/utility.py
+++ b/tools/infer/utility.py
@@ -34,6 +34,7 @@ def init_args():
parser = argparse.ArgumentParser()
# params for prediction engine
parser.add_argument("--use_gpu", type=str2bool, default=True)
+ parser.add_argument("--use_xpu", type=str2bool, default=False)
parser.add_argument("--ir_optim", type=str2bool, default=True)
parser.add_argument("--use_tensorrt", type=str2bool, default=False)
parser.add_argument("--min_subgraph_size", type=int, default=15)
@@ -152,6 +153,8 @@ def create_predictor(args, mode, logger):
model_dir = args.rec_model_dir
elif mode == 'table':
model_dir = args.table_model_dir
+ elif mode == 'ser':
+ model_dir = args.ser_model_dir
else:
model_dir = args.e2e_model_dir
@@ -201,7 +204,8 @@ def create_predictor(args, mode, logger):
workspace_size=1 << 30,
precision_mode=precision,
max_batch_size=args.max_batch_size,
- min_subgraph_size=args.min_subgraph_size)
+ min_subgraph_size=args.min_subgraph_size,
+ use_calib_mode=False)
# skip the minmum trt subgraph
use_dynamic_shape = True
if mode == "det":
@@ -286,6 +290,8 @@ def create_predictor(args, mode, logger):
config.set_trt_dynamic_shape_info(
min_input_shape, max_input_shape, opt_input_shape)
+ elif args.use_xpu:
+ config.enable_xpu(10 * 1024 * 1024)
else:
config.disable_gpu()
if hasattr(args, "cpu_threads"):
@@ -312,8 +318,13 @@ def create_predictor(args, mode, logger):
# create predictor
predictor = inference.create_predictor(config)
input_names = predictor.get_input_names()
- for name in input_names:
- input_tensor = predictor.get_input_handle(name)
+ if mode in ['ser', 're']:
+ input_tensor = []
+ for name in input_names:
+ input_tensor.append(predictor.get_input_handle(name))
+ else:
+ for name in input_names:
+ input_tensor = predictor.get_input_handle(name)
output_tensors = get_output_tensors(args, mode, predictor)
return predictor, input_tensor, output_tensors, config
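
The `utility.py` hunks add an XPU branch alongside the existing GPU/CPU ones. A hedged sketch of how the resulting `paddle.inference.Config` is assembled (the model paths are placeholders, so the final `create_predictor` call is omitted):

```
from paddle import inference

config = inference.Config("model.pdmodel", "model.pdiparams")  # placeholders
use_gpu, use_xpu = False, True
if use_gpu:
    config.enable_use_gpu(500, 0)        # memory pool in MB, device id
elif use_xpu:
    config.enable_xpu(10 * 1024 * 1024)  # L3 workspace size in bytes
else:
    config.disable_gpu()
```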
diff --git a/tools/infer_kie.py b/tools/infer_kie.py
index 0cb0b8702cbd7ea74a7b7fcff69122731578a1bd..346e2e0aeeee695ab49577b6b13dcc058150df1a 100755
--- a/tools/infer_kie.py
+++ b/tools/infer_kie.py
@@ -39,13 +39,12 @@ import time
def read_class_list(filepath):
- dict = {}
+ ret = {}
with open(filepath, "r") as f:
lines = f.readlines()
- for line in lines:
- key, value = line.split(" ")
- dict[key] = value.rstrip()
- return dict
+ for idx, line in enumerate(lines):
+ ret[idx] = line.strip("\n")
+ return ret
def draw_kie_result(batch, node, idx_to_cls, count):
@@ -71,7 +70,7 @@ def draw_kie_result(batch, node, idx_to_cls, count):
x_min = int(min([point[0] for point in new_box]))
y_min = int(min([point[1] for point in new_box]))
- pred_label = str(node_pred_label[i])
+ pred_label = node_pred_label[i]
if pred_label in idx_to_cls:
pred_label = idx_to_cls[pred_label]
pred_score = '{:.2f}'.format(node_pred_score[i])
@@ -109,8 +108,7 @@ def main():
save_res_path = config['Global']['save_res_path']
class_path = config['Global']['class_path']
idx_to_cls = read_class_list(class_path)
- if not os.path.exists(os.path.dirname(save_res_path)):
- os.makedirs(os.path.dirname(save_res_path))
+ os.makedirs(os.path.dirname(save_res_path), exist_ok=True)
model.eval()
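
The rewritten `read_class_list` changes the file contract: instead of `token value` pairs, the class file is now one class name per line, and the mapping key is the integer line index, matching the integer labels the model predicts. A quick sketch with illustrative class names:

```
from io import StringIO

def read_class_list(f):
    return {idx: line.strip("\n") for idx, line in enumerate(f.readlines())}

fake_file = StringIO("Ignore\nOthers\nHeader\nQuestion\nAnswer\n")
print(read_class_list(fake_file))  # {0: 'Ignore', 1: 'Others', ...}
```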
diff --git a/tools/infer_rec.py b/tools/infer_rec.py
index 193e24a4de12392130d16b86a3407db74602e1f4..a08fa25b467482da4a2996912ad2cc8cc7c398da 100755
--- a/tools/infer_rec.py
+++ b/tools/infer_rec.py
@@ -157,7 +157,7 @@ def main():
if info is not None:
logger.info("\t result: {}".format(info))
- fout.write(file + "\t" + info)
+ fout.write(file + "\t" + info + "\n")
logger.info("success!")
diff --git a/tools/infer_vqa_token_ser.py b/tools/infer_vqa_token_ser.py
index 83ed72b392e627c161903c3945f57be0abfabc2b..0173a554cace31e20ab47dbe36d132a4dbb2127b 100755
--- a/tools/infer_vqa_token_ser.py
+++ b/tools/infer_vqa_token_ser.py
@@ -44,6 +44,7 @@ def to_tensor(data):
from collections import defaultdict
data_dict = defaultdict(list)
to_tensor_idxs = []
+
for idx, v in enumerate(data):
if isinstance(v, (np.ndarray, paddle.Tensor, numbers.Number)):
if idx not in to_tensor_idxs:
@@ -57,6 +58,7 @@ def to_tensor(data):
class SerPredictor(object):
def __init__(self, config):
global_config = config['Global']
+ self.algorithm = config['Architecture']["algorithm"]
# build post process
self.post_process_class = build_post_process(config['PostProcess'],
@@ -70,7 +72,10 @@ class SerPredictor(object):
from paddleocr import PaddleOCR
- self.ocr_engine = PaddleOCR(use_angle_cls=False, show_log=False)
+ self.ocr_engine = PaddleOCR(
+ use_angle_cls=False,
+ show_log=False,
+ use_gpu=global_config['use_gpu'])
# create data ops
transforms = []
@@ -80,29 +85,30 @@ class SerPredictor(object):
op[op_name]['ocr_engine'] = self.ocr_engine
elif op_name == 'KeepKeys':
op[op_name]['keep_keys'] = [
- 'input_ids', 'labels', 'bbox', 'image', 'attention_mask',
- 'token_type_ids', 'segment_offset_id', 'ocr_info',
+ 'input_ids', 'bbox', 'attention_mask', 'token_type_ids',
+ 'image', 'labels', 'segment_offset_id', 'ocr_info',
'entities'
]
transforms.append(op)
- global_config['infer_mode'] = True
+ if config["Global"].get("infer_mode", None) is None:
+ global_config['infer_mode'] = True
self.ops = create_operators(config['Eval']['dataset']['transforms'],
global_config)
self.model.eval()
- def __call__(self, img_path):
- with open(img_path, 'rb') as f:
+ def __call__(self, data):
+ with open(data["img_path"], 'rb') as f:
img = f.read()
- data = {'image': img}
+ data["image"] = img
batch = transform(data, self.ops)
batch = to_tensor(batch)
preds = self.model(batch)
+ if self.algorithm in ['LayoutLMv2', 'LayoutXLM']:
+ preds = preds[0]
+
post_result = self.post_process_class(
- preds,
- attention_masks=batch[4],
- segment_offset_ids=batch[6],
- ocr_infos=batch[7])
+ preds, segment_offset_ids=batch[6], ocr_infos=batch[7])
return post_result, batch
@@ -112,20 +118,33 @@ if __name__ == '__main__':
ser_engine = SerPredictor(config)
- infer_imgs = get_image_file_list(config['Global']['infer_img'])
+ if config["Global"].get("infer_mode", None) is False:
+ data_dir = config['Eval']['dataset']['data_dir']
+ with open(config['Global']['infer_img'], "rb") as f:
+ infer_imgs = f.readlines()
+ else:
+ infer_imgs = get_image_file_list(config['Global']['infer_img'])
+
with open(
os.path.join(config['Global']['save_res_path'],
"infer_results.txt"),
"w",
encoding='utf-8') as fout:
- for idx, img_path in enumerate(infer_imgs):
+ for idx, info in enumerate(infer_imgs):
+ if config["Global"].get("infer_mode", None) is False:
+ data_line = info.decode('utf-8')
+ substr = data_line.strip("\n").split("\t")
+ img_path = os.path.join(data_dir, substr[0])
+ data = {'img_path': img_path, 'label': substr[1]}
+ else:
+ img_path = info
+ data = {'img_path': img_path}
+
save_img_path = os.path.join(
config['Global']['save_res_path'],
os.path.splitext(os.path.basename(img_path))[0] + "_ser.jpg")
- logger.info("process: [{}/{}], save result to {}".format(
- idx, len(infer_imgs), save_img_path))
- result, _ = ser_engine(img_path)
+ result, _ = ser_engine(data)
result = result[0]
fout.write(img_path + "\t" + json.dumps(
{
@@ -133,3 +152,6 @@ if __name__ == '__main__':
}, ensure_ascii=False) + "\n")
img_res = draw_ser_results(img_path, result)
cv2.imwrite(save_img_path, img_res)
+
+ logger.info("process: [{}/{}], save result to {}".format(
+ idx, len(infer_imgs), save_img_path))
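
The SER inference script now supports two input modes: with `infer_mode=False` the `infer_img` value is an annotation file of `path\tlabel` lines resolved against `data_dir`, while the default mode takes image paths directly. A simplified sketch of that branching (directory walking via `get_image_file_list` is reduced to a single path):

```
import os

def iter_inputs(infer_img, infer_mode=True, data_dir=""):
    if infer_mode is False:
        with open(infer_img, "rb") as f:
            for line in f:
                path, label = line.decode("utf-8").strip("\n").split("\t", 1)
                yield {"img_path": os.path.join(data_dir, path),
                       "label": label}
    else:
        yield {"img_path": infer_img}
```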
diff --git a/tools/infer_vqa_token_ser_re.py b/tools/infer_vqa_token_ser_re.py
index 6210f7f3c24227c9d366b08ce93ccfe4df849ce1..20ab1fe176c3be75f7a7b01a8d77df6419c58c75 100755
--- a/tools/infer_vqa_token_ser_re.py
+++ b/tools/infer_vqa_token_ser_re.py
@@ -38,7 +38,7 @@ from ppocr.utils.save_load import load_model
from ppocr.utils.visual import draw_re_results
from ppocr.utils.logging import get_logger
from ppocr.utils.utility import get_image_file_list, load_vqa_bio_label_maps, print_dict
-from tools.program import ArgsParser, load_config, merge_config, check_gpu
+from tools.program import ArgsParser, load_config, merge_config
from tools.infer_vqa_token_ser import SerPredictor
@@ -107,7 +107,7 @@ def make_input(ser_inputs, ser_results):
# remove ocr_info segment_offset_id and label in ser input
ser_inputs.pop(7)
ser_inputs.pop(6)
- ser_inputs.pop(1)
+ ser_inputs.pop(5)
return ser_inputs, entity_idx_dict_batch
@@ -131,9 +131,7 @@ class SerRePredictor(object):
self.model.eval()
def __call__(self, img_path):
- ser_results, ser_inputs = self.ser_engine(img_path)
- paddle.save(ser_inputs, 'ser_inputs.npy')
- paddle.save(ser_results, 'ser_results.npy')
+ ser_results, ser_inputs = self.ser_engine({'img_path': img_path})
re_input, entity_idx_dict_batch = make_input(ser_inputs, ser_results)
preds = self.model(re_input)
post_result = self.post_process_class(
@@ -155,7 +153,6 @@ def preprocess():
# check if set use_gpu=True in paddlepaddle cpu version
use_gpu = config['Global']['use_gpu']
- check_gpu(use_gpu)
device = 'gpu:{}'.format(dist.ParallelEnv().dev_id) if use_gpu else 'cpu'
device = paddle.set_device(device)
@@ -185,9 +182,7 @@ if __name__ == '__main__':
for idx, img_path in enumerate(infer_imgs):
save_img_path = os.path.join(
config['Global']['save_res_path'],
- os.path.splitext(os.path.basename(img_path))[0] + "_ser.jpg")
- logger.info("process: [{}/{}], save result to {}".format(
- idx, len(infer_imgs), save_img_path))
+ os.path.splitext(os.path.basename(img_path))[0] + "_ser_re.jpg")
result = ser_re_engine(img_path)
result = result[0]
@@ -197,3 +192,6 @@ if __name__ == '__main__':
}, ensure_ascii=False) + "\n")
img_res = draw_re_results(img_path, result)
cv2.imwrite(save_img_path, img_res)
+
+ logger.info("process: [{}/{}], save result to {}".format(
+ idx, len(infer_imgs), save_img_path))
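
The `pop(1)` to `pop(5)` change in `make_input` follows directly from the reordered `KeepKeys` list earlier in this patch: `labels` now sits at index 5 of the SER batch. Popping from the highest index down removes the three unwanted entries without shifting the ones still to be removed:

```
batch = ["input_ids", "bbox", "attention_mask", "token_type_ids",
         "image", "labels", "segment_offset_id", "ocr_info", "entities"]
for idx in (7, 6, 5):  # ocr_info, segment_offset_id, labels
    batch.pop(idx)
print(batch)  # ['input_ids', 'bbox', 'attention_mask', 'token_type_ids', 'image', 'entities']
```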
diff --git a/tools/program.py b/tools/program.py
index 17079cb86e7762663a76951379fb8d7804b19f9e..1d83b46216ad62d59e7123c1b2d590d2a1aae5ac 100755
--- a/tools/program.py
+++ b/tools/program.py
@@ -112,20 +112,25 @@ def merge_config(config, opts):
return config
-def check_gpu(use_gpu):
+def check_device(use_gpu, use_xpu=False):
"""
    Log an error and exit when use_gpu or use_xpu is set to true but
    paddle is not compiled with the corresponding device support.
"""
- err = "Config use_gpu cannot be set as true while you are " \
- "using paddlepaddle cpu version ! \nPlease try: \n" \
- "\t1. Install paddlepaddle-gpu to run model on GPU \n" \
- "\t2. Set use_gpu as false in config file to run " \
+    err = "Config {} cannot be set as true while your paddle " \
+          "is not compiled with {}! \nPlease try: \n" \
+          "\t1. Install paddlepaddle to run model on {} \n" \
+          "\t2. Set {} to false in the config file to run " \
"model on CPU"
try:
+ if use_gpu and use_xpu:
+            print("use_xpu and use_gpu cannot both be true.")
if use_gpu and not paddle.is_compiled_with_cuda():
- print(err)
+ print(err.format("use_gpu", "cuda", "gpu", "use_gpu"))
+ sys.exit(1)
+ if use_xpu and not paddle.device.is_compiled_with_xpu():
+ print(err.format("use_xpu", "xpu", "xpu", "use_xpu"))
sys.exit(1)
except Exception as e:
pass
@@ -250,6 +255,8 @@ def train(config,
with paddle.amp.auto_cast():
if model_type == 'table' or extra_input:
preds = model(images, data=batch[1:])
+ elif model_type in ["kie", 'vqa']:
+ preds = model(batch)
else:
preds = model(images)
else:
@@ -559,7 +566,7 @@ def preprocess(is_train=False):
# check if set use_gpu=True in paddlepaddle cpu version
use_gpu = config['Global']['use_gpu']
- check_gpu(use_gpu)
+ use_xpu = config['Global'].get('use_xpu', False)
-    # check if set use_xpu=True in paddlepaddle cpu/gpu version
-    use_xpu = False
@@ -571,15 +578,17 @@ def preprocess(is_train=False):
assert alg in [
'EAST', 'DB', 'SAST', 'Rosetta', 'CRNN', 'STARNet', 'RARE', 'SRN',
'CLS', 'PGNet', 'Distillation', 'NRTR', 'TableAttn', 'SAR', 'PSE',
- 'SEED', 'SDMGR', 'LayoutXLM', 'LayoutLM', 'PREN', 'FCE', 'SVTR',
- 'TableMaster'
+ 'SEED', 'SDMGR', 'LayoutXLM', 'LayoutLM', 'LayoutLMv2', 'PREN', 'FCE',
+ 'SVTR', 'ViTSTR', 'ABINet', 'DB++', 'TableMaster'
]
- device = 'cpu'
- if use_gpu:
- device = 'gpu:{}'.format(dist.ParallelEnv().dev_id)
if use_xpu:
- device = 'xpu'
+ device = 'xpu:{0}'.format(os.getenv('FLAGS_selected_xpus', 0))
+ else:
+ device = 'gpu:{}'.format(dist.ParallelEnv()
+ .dev_id) if use_gpu else 'cpu'
+ check_device(use_gpu, use_xpu)
+
device = paddle.set_device(device)
config['Global']['distributed'] = dist.get_world_size() != 1
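
The device-selection logic in `preprocess` now resolves to one of three device strings before `paddle.set_device`. A minimal sketch of the same decision outside of paddle (`pick_device` is my name for it, not the module's):

```
import os

def pick_device(use_gpu, use_xpu, dev_id=0):
    if use_xpu:
        return "xpu:{0}".format(os.getenv("FLAGS_selected_xpus", 0))
    return "gpu:{}".format(dev_id) if use_gpu else "cpu"

print(pick_device(False, False))  # cpu
print(pick_device(True, False))   # gpu:0
print(pick_device(False, True))   # xpu:0 unless FLAGS_selected_xpus is set
```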