Unverified · Commit ef6a8f25 authored by Z zhouzj, committed by GitHub

add ppocrv3_rec (#1575)

* add ppocrv3_rec

* fix eval.py
Parent 22ec8eae
......@@ -21,12 +21,14 @@
| Model | Strategy | Metric | GPU latency (ms) | ARM CPU latency (ms) | Config | Inference model |
|:------:|:------:|:------:|:------:|:------:|:------:|:------:|
| Chinese PPOCRV3-det | Baseline | 84.57 | - | - | - | [Model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar) |
| Chinese PPOCRV3-det | Quantization + distillation | 83.4 | - | - | [Config](./configs/ppocrv3_det_qat_dist.yaml) | [Model](https://bj.bcebos.com/v1/paddle-slim-models/act/PPOCRV3_det_QAT.tar) |
| Chinese PPOCRV3-det | Quantization + distillation | 85.01 | - | - | [Config](./configs/ppocrv3_det_qat_dist.yaml) | [Model](https://bj.bcebos.com/v1/paddle-slim-models/act/OCR/PPOCRV3_det_QAT.tar) |
| Chinese PPOCRV3-rec | Baseline | 76.48 | - | - | - | [Model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar) |
| Chinese PPOCRV3-rec | Quantization + distillation | 73.23 | - | - | [Config](./configs/ppocrv3_rec_qat_dist.yaml) | [Model](https://bj.bcebos.com/v1/paddle-slim-models/act/OCR/PPOCRV3_rec_QAT.tar) |
> The evaluation metric for PPOCRV3-det is hmean; for PPOCRV3-rec it is accuracy.
## 3. Auto Compression Workflow
#### 3.1 Prepare the Environment
### 3.1 Prepare the Environment
- python >= 3.6
- PaddlePaddle >= 2.3 (install from the [Paddle website](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/pip/linux-pip.html))
......@@ -45,15 +47,21 @@ pip install paddlepaddle-gpu
pip install paddleslim
```
#### 3.2 Prepare the Dataset
Download PaddleOCR:
```shell
git clone -b release/2.6 https://github.com/PaddlePaddle/PaddleOCR.git
```
> [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR.git) is cloned only so that its Dataloader components and accuracy-evaluation modules can be used directly; it is not used for building the model network. The paddleocr package installed via `pip install paddleocr` contains only inference code and lacks the dataset-reading and accuracy-evaluation parts, so the PaddleOCR repository must be downloaded.
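As a rough illustration, the scripts in this example reuse the cloned repository through a relative import path (mirroring the imports visible in `eval.py` below; `../PaddleOCR` assumes the clone sits next to this example directory):

```python
import sys

# Make the cloned PaddleOCR repo importable; adjust the path if it lives elsewhere.
sys.path.append('../PaddleOCR')

from ppocr.data import build_dataloader            # dataset reading / Dataloader components
from ppocr.postprocess import build_post_process   # decoding of raw predictions
from ppocr.metrics import build_metric             # accuracy / hmean evaluation
```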
### 3.2 Prepare the Dataset
For public datasets, see [OCR datasets](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_ch/dataset/ocr_datasets.md).
Note: when using a different dataset, update the data paths and the data-processing settings under `dataset` in the config file.
> Note: when using a different dataset, update the data paths and the data-processing settings under `dataset` in the config file.
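For example, switching to a custom dataset mainly means editing the entries under `Train.dataset` and `Eval.dataset`, as in this sketch based on the recognition config added below (the `your_dataset` paths are placeholders):

```yaml
Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/your_dataset/            # placeholder: image root of your training set
    label_file_list:
      - ./train_data/your_dataset/train_label.txt   # placeholder: your training label file
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/your_dataset/
    label_file_list:
      - ./train_data/your_dataset/test_label.txt
```

The `transforms` pipeline may also need adjusting if the image format or text length of the new dataset differs.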
#### 3.3 Prepare the Inference Model
### 3.3 Prepare the Inference Model
An inference model consists of two files, `model.pdmodel` and `model.pdiparams`: the file with the `pdmodel` suffix is the model file and the file with the `pdiparams` suffix is the weights file.
Note: files named `__model__` and `__params__` correspond to the `model.pdmodel` and `model.pdiparams` files respectively.
> Note: files named `__model__` and `__params__` correspond to the `model.pdmodel` and `model.pdiparams` files respectively.
Inference models can be obtained directly from the [PaddleOCR model zoo](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_ch/models_list.md); see the example below for obtaining the Chinese PPOCRV3 detection model:
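For instance, the baseline detection model listed in the table above can be downloaded and unpacked roughly as follows (a sketch; the URL is the baseline model link from the results table):

```shell
wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar
tar -xf ch_PP-OCRv3_det_infer.tar
```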
......@@ -94,13 +102,13 @@ python eval.py --config_path='./configs/ppocrv3_det_qat_dist.yaml'
```
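Assuming the recognition config added in this PR (`configs/ppocrv3_rec_qat_dist.yaml`, shown below), the quantized recognition model can presumably be evaluated the same way:

```shell
python eval.py --config_path='./configs/ppocrv3_rec_qat_dist.yaml'
```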
## 4. Inference Deployment
#### 4.1 Python Inference
### 4.1 Python Inference
Environment setup: to use the TensorRT inference engine, install a Paddle build compiled with ```WITH_TRT=ON```; it can be downloaded from the [Python inference library](https://paddleinference.paddlepaddle.org.cn/master/user_guides/download_lib.html#python).
For inference with the Python inference engine, see [Inference based on the Python inference engine](https://github.com/PaddlePaddle/PaddleOCR/blob/9cdab61d909eb595af849db885c257ca8c74cb57/doc/doc_ch/inference_ppocr.md).
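As a minimal sketch of loading the exported model with Paddle Inference and a TensorRT INT8 engine (file names follow the `model.pdmodel` / `model.pdiparams` convention above; PP-OCR pre- and post-processing are omitted, and the exact settings are in the linked PaddleOCR guide):

```python
import numpy as np
from paddle.inference import Config, PrecisionType, create_predictor

# Placeholder paths to the exported (quantized) inference model.
config = Config('model.pdmodel', 'model.pdiparams')
config.enable_use_gpu(200, 0)  # 200 MB initial GPU memory pool, GPU id 0
# Build an INT8 TensorRT engine for supported subgraphs; remaining ops run on Paddle.
config.enable_tensorrt_engine(
    workspace_size=1 << 30,
    max_batch_size=1,
    min_subgraph_size=3,
    precision_mode=PrecisionType.Int8,
    use_static=False,
    use_calib_mode=False)

predictor = create_predictor(config)
input_handle = predictor.get_input_handle(predictor.get_input_names()[0])
# Dummy input shaped like the rec model's [3, 48, 320] resized image.
input_handle.copy_from_cpu(np.random.rand(1, 3, 48, 320).astype('float32'))
predictor.run()
output = predictor.get_output_handle(predictor.get_output_names()[0]).copy_to_cpu()
```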
#### 4.2 Paddle Lite On-Device Deployment
### 4.2 Paddle Lite On-Device Deployment
For Paddle Lite on-device deployment, see:
- [Paddle Lite deployment](https://github.com/PaddlePaddle/PaddleOCR/tree/9cdab61d909eb595af849db885c257ca8c74cb57/deploy/lite)
......
Global:
model_dir: ch_PP-OCRv3_rec_infer
model_filename: inference.pdmodel
params_filename: inference.pdiparams
model_type: rec
algorithm: SVTR
character_dict_path: ./ppocr_keys_v1.txt
max_text_length: &max_text_length 25
use_space_char: true
Distillation:
alpha: [1.0, 1.0]
loss: ['skd', 'l2']
node:
- ['linear_43.tmp_1']
- ['linear_43.tmp_1']
Quantization:
use_pact: true
activation_bits: 8
is_full_quantize: false
onnx_format: True
activation_quantize_type: moving_average_abs_max
weight_quantize_type: channel_wise_abs_max
not_quant_pattern:
- skip_quant
quantize_op_types:
- conv2d
- depthwise_conv2d
weight_bits: 8
TrainConfig:
epochs: 10
eval_iter: 2000
logging_iter: 100
learning_rate:
type: CosineAnnealingDecay
learning_rate: 0.00005
optimizer_builder:
optimizer:
type: Adam
weight_decay: 5.0e-05
PostProcess:
name: CTCLabelDecode
Metric:
name: RecMetric
main_indicator: acc
ignore_space: False
Train:
dataset:
name: SimpleDataSet
data_dir: ./train_data/icdar2015/text_localization/
ext_op_transform_idx: 1
label_file_list:
- ./train_data/icdar2015/text_localization/train_icdar2015_label.txt
transforms:
- DecodeImage:
img_mode: BGR
channel_first: false
- RecConAug:
prob: 0.5
ext_data_num: 2
image_shape: [48, 320, 3]
max_text_length: *max_text_length
- RecAug:
- MultiLabelEncode:
- RecResizeImg:
image_shape: [3, 48, 320]
- KeepKeys:
keep_keys:
- image
- label_ctc
- label_sar
- length
- valid_ratio
loader:
shuffle: true
batch_size_per_card: 64
drop_last: true
num_workers: 0
Eval:
dataset:
name: SimpleDataSet
data_dir: ./train_data/icdar2015/text_localization/
label_file_list:
- ./train_data/icdar2015/text_localization/test_icdar2015_label.txt
transforms:
- DecodeImage:
img_mode: BGR
channel_first: false
- MultiLabelEncode:
- RecResizeImg:
image_shape: [3, 48, 320]
- KeepKeys:
keep_keys:
- image
- label_ctc
- label_sar
- length
- valid_ratio
loader:
shuffle: false
drop_last: false
batch_size_per_card: 64
num_workers: 0
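The README above launches evaluation via `eval.py`; if this example follows the usual PaddleSlim ACT demo layout, compression with this recognition config would be started roughly as below (the `run.py` entry point and its `--save_dir` flag are assumptions, not confirmed by this diff):

```shell
# Hypothetical launch command; run.py and --save_dir are assumed from the
# typical PaddleSlim auto-compression example layout.
python run.py \
    --config_path='./configs/ppocrv3_rec_qat_dist.yaml' \
    --save_dir='./save_quant_ppocrv3_rec/'
```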
......@@ -12,8 +12,6 @@
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import sys
import logging
import numpy as np
import argparse
......@@ -21,11 +19,10 @@ from tqdm import tqdm
import paddle
from paddleslim.common import load_config as load_slim_config
from paddleslim.common import get_logger
from paddleslim.auto_compression import AutoCompression
import sys
sys.path.append('../PaddleOCR')
from ppocr.data import build_dataloader
from ppocr.modeling.architectures import build_model
from ppocr.losses import build_loss
from ppocr.optimizer import build_optimizer
from ppocr.postprocess import build_post_process
from ppocr.metrics import build_metric
......@@ -37,7 +34,7 @@ def argsparser():
parser.add_argument(
'--config_path',
type=str,
default='./image_classification/configs/eval.yaml',
default='./configs/ppocrv3_det_qat_dist.yaml',
help="path of compression strategy config.")
parser.add_argument(
'--model_dir',
......@@ -47,20 +44,6 @@ def argsparser():
return parser
extra_input_models = [
"SRN", "NRTR", "SAR", "SEED", "SVTR", "VisionLAN", "RobustScanner"
]
def sample_generator(loader):
def __reader__():
for indx, data in enumerate(loader):
images = np.array(data[0])
yield images
return __reader__
def eval():
devices = paddle.device.get_device().split(':')[0]
places = paddle.device._convert_to_place(devices)
......@@ -70,15 +53,13 @@ def eval():
exe,
model_filename=global_config["model_filename"],
params_filename=global_config["params_filename"])
print('Loaded model from: {}'.format(global_config["model_dir"]))
logger.info('Loaded model from: {}'.format(global_config["model_dir"]))
val_loader = build_dataloader(all_config, 'Eval', devices, logger)
post_process_class = build_post_process(all_config['PostProcess'],
global_config)
eval_class = build_metric(all_config['Metric'])
model_type = global_config['model_type']
extra_input = True if global_config[
'algorithm'] in extra_input_models else False
with tqdm(
total=len(val_loader),
......@@ -86,27 +67,20 @@ def eval():
ncols=80) as t:
for batch_id, batch in enumerate(val_loader):
images = batch[0]
if extra_input:
preds = exe.run(
val_program,
feed={feed_target_names[0]: images,
'data': batch[1:]},
fetch_list=fetch_targets)
else:
preds = exe.run(val_program,
feed={feed_target_names[0]: images},
fetch_list=fetch_targets)
preds, = exe.run(val_program,
feed={feed_target_names[0]: images},
fetch_list=fetch_targets)
batch_numpy = []
for item in batch:
batch_numpy.append(np.array(item))
if model_type == 'det':
preds_map = {'maps': preds[0]}
preds_map = {'maps': preds}
post_result = post_process_class(preds_map, batch_numpy[1])
eval_class(post_result, batch_numpy)
elif model_type == 'rec':
post_result = post_process_class(preds[0], batch_numpy[1])
post_result = post_process_class(preds, batch_numpy[1])
eval_class(post_result, batch_numpy)
t.update()
......
This diff has been collapsed.
......@@ -12,8 +12,6 @@
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import sys
import logging
from tqdm import tqdm
import numpy as np
......@@ -22,13 +20,14 @@ import paddle
from paddleslim.common import load_config as load_slim_config
from paddleslim.common import get_logger
from paddleslim.auto_compression import AutoCompression
from paddleslim.common.dataloader import get_feed_vars
import sys
sys.path.append('../PaddleOCR')
from ppocr.data import build_dataloader
from ppocr.modeling.architectures import build_model
from ppocr.losses import build_loss
from ppocr.optimizer import build_optimizer
from ppocr.postprocess import build_post_process
from ppocr.metrics import build_metric
from paddleslim.common.dataloader import get_feed_vars
logger = get_logger(__name__, level=logging.INFO)
......@@ -76,20 +75,20 @@ def eval_function(exe, compiled_test_program, test_feed_names, test_fetch_list):
ncols=80) as t:
for batch_id, batch in enumerate(val_loader):
images = batch[0]
preds = exe.run(compiled_test_program,
feed={test_feed_names[0]: images},
fetch_list=test_fetch_list)
preds, = exe.run(compiled_test_program,
feed={test_feed_names[0]: images},
fetch_list=test_fetch_list)
batch_numpy = []
for item in batch:
batch_numpy.append(np.array(item))
if model_type == 'det':
preds_map = {'maps': preds[0]}
preds_map = {'maps': preds}
post_result = post_process_class(preds_map, batch_numpy[1])
eval_class(post_result, batch_numpy)
elif model_type == 'rec':
post_result = post_process_class(preds[0], batch_numpy[1])
post_result = post_process_class(preds, batch_numpy[1])
eval_class(post_result, batch_numpy)
t.update()
metric = eval_class.get_metric()
......