Unverified · Commit ef6a8f25 authored by Z zhouzj, committed by GitHub

add ppocrv3_rec (#1575)

* add ppocrv3_rec

* fix eval.py
Parent 22ec8eae
......@@ -21,12 +21,14 @@
| Model | Strategy | Metric | GPU latency (ms) | ARM CPU latency (ms) | Config | Inference model |
|:------:|:------:|:------:|:------:|:------:|:------:|:------:|
| Chinese PPOCRV3-det | Baseline | 84.57 | - | - | - | [Model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar) |
| Chinese PPOCRV3-det | Quantization + distillation | 83.4 | - | - | [Config](./configs/ppocrv3_det_qat_dist.yaml) | [Model](https://bj.bcebos.com/v1/paddle-slim-models/act/PPOCRV3_det_QAT.tar) |
| Chinese PPOCRV3-det | Quantization + distillation | 85.01 | - | - | [Config](./configs/ppocrv3_det_qat_dist.yaml) | [Model](https://bj.bcebos.com/v1/paddle-slim-models/act/OCR/PPOCRV3_det_QAT.tar) |
| Chinese PPOCRV3-rec | Baseline | 76.48 | - | - | - | [Model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar) |
| Chinese PPOCRV3-rec | Quantization + distillation | 73.23 | - | - | [Config](./configs/ppocrv3_rec_qat_dist.yaml) | [Model](https://bj.bcebos.com/v1/paddle-slim-models/act/OCR/PPOCRV3_rec_QAT.tar) |
> The evaluation metric for PPOCRV3-det is hmean; for PPOCRV3-rec it is accuracy.
## 3. Auto Compression Workflow
#### 3.1 Prepare the Environment
### 3.1 Prepare the Environment
- python >= 3.6
- PaddlePaddle >= 2.3 (install from the [Paddle website](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/pip/linux-pip.html))
......@@ -45,15 +47,21 @@ pip install paddlepaddle-gpu
pip install paddleslim
```
#### 3.2 Prepare the Dataset
Download PaddleOCR:
```shell
git clone -b release/2.6 https://github.com/PaddlePaddle/PaddleOCR.git
```
> [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR.git) is cloned only so that its Dataloader components and accuracy-evaluation modules can be used directly; it is not used for building the model network. The paddleocr package installed via `pip install paddleocr` contains only inference code and lacks the dataset-reading and accuracy-evaluation parts, so the PaddleOCR repository must be downloaded.
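As a rough illustration, the scripts in this example reuse the cloned repository through a relative import path (mirroring the imports visible in `eval.py` below; `../PaddleOCR` assumes the clone sits next to this example directory):

```python
import sys

# Make the cloned PaddleOCR repo importable; adjust the path if it lives elsewhere.
sys.path.append('../PaddleOCR')

from ppocr.data import build_dataloader            # dataset reading / Dataloader components
from ppocr.postprocess import build_post_process   # decoding of raw predictions
from ppocr.metrics import build_metric             # accuracy / hmean evaluation
```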
### 3.2 Prepare the Dataset
For public datasets, see [OCR datasets](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_ch/dataset/ocr_datasets.md).
Note: when using a different dataset, update the data paths and the data-processing settings under `dataset` in the config file.
> Note: when using a different dataset, update the data paths and the data-processing settings under `dataset` in the config file.
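For example, switching to a custom dataset mainly means editing the entries under `Train.dataset` and `Eval.dataset`, as in this sketch based on the recognition config added below (the `your_dataset` paths are placeholders):

```yaml
Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/your_dataset/            # placeholder: image root of your training set
    label_file_list:
      - ./train_data/your_dataset/train_label.txt   # placeholder: your training label file
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/your_dataset/
    label_file_list:
      - ./train_data/your_dataset/test_label.txt
```

The `transforms` pipeline may also need adjusting if the image format or text length of the new dataset differs.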
#### 3.3 Prepare the Inference Model
### 3.3 Prepare the Inference Model
An inference model consists of two files, `model.pdmodel` and `model.pdiparams`: the file with the `pdmodel` suffix is the model file and the file with the `pdiparams` suffix is the weights file.
Note: files named `__model__` and `__params__` correspond to the `model.pdmodel` and `model.pdiparams` files respectively.
> Note: files named `__model__` and `__params__` correspond to the `model.pdmodel` and `model.pdiparams` files respectively.
Inference models can be obtained directly from the [PaddleOCR model zoo](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_ch/models_list.md); see the example below for obtaining the Chinese PPOCRV3 detection model:
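For instance, the baseline detection model listed in the table above can be downloaded and unpacked roughly as follows (a sketch; the URL is the baseline model link from the results table):

```shell
wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar
tar -xf ch_PP-OCRv3_det_infer.tar
```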
......@@ -94,13 +102,13 @@ python eval.py --config_path='./configs/ppocrv3_det_qat_dist.yaml'
```
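Assuming the recognition config added in this PR (`configs/ppocrv3_rec_qat_dist.yaml`, shown below), the quantized recognition model can presumably be evaluated the same way:

```shell
python eval.py --config_path='./configs/ppocrv3_rec_qat_dist.yaml'
```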
## 4. Inference Deployment
#### 4.1 Python Inference
### 4.1 Python Inference
Environment setup: to use the TensorRT inference engine, install a Paddle build compiled with ```WITH_TRT=ON```; it can be downloaded from the [Python inference library](https://paddleinference.paddlepaddle.org.cn/master/user_guides/download_lib.html#python).
For inference with the Python inference engine, see [Inference based on the Python inference engine](https://github.com/PaddlePaddle/PaddleOCR/blob/9cdab61d909eb595af849db885c257ca8c74cb57/doc/doc_ch/inference_ppocr.md).
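As a minimal sketch of loading the exported model with Paddle Inference and a TensorRT INT8 engine (file names follow the `model.pdmodel` / `model.pdiparams` convention above; PP-OCR pre- and post-processing are omitted, and the exact settings are in the linked PaddleOCR guide):

```python
import numpy as np
from paddle.inference import Config, PrecisionType, create_predictor

# Placeholder paths to the exported (quantized) inference model.
config = Config('model.pdmodel', 'model.pdiparams')
config.enable_use_gpu(200, 0)  # 200 MB initial GPU memory pool, GPU id 0
# Build an INT8 TensorRT engine for supported subgraphs; remaining ops run on Paddle.
config.enable_tensorrt_engine(
    workspace_size=1 << 30,
    max_batch_size=1,
    min_subgraph_size=3,
    precision_mode=PrecisionType.Int8,
    use_static=False,
    use_calib_mode=False)

predictor = create_predictor(config)
input_handle = predictor.get_input_handle(predictor.get_input_names()[0])
# Dummy input shaped like the rec model's [3, 48, 320] resized image.
input_handle.copy_from_cpu(np.random.rand(1, 3, 48, 320).astype('float32'))
predictor.run()
output = predictor.get_output_handle(predictor.get_output_names()[0]).copy_to_cpu()
```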
#### 4.2 Paddle Lite On-Device Deployment
### 4.2 Paddle Lite On-Device Deployment
For Paddle Lite on-device deployment, see:
- [Paddle Lite deployment](https://github.com/PaddlePaddle/PaddleOCR/tree/9cdab61d909eb595af849db885c257ca8c74cb57/deploy/lite)
......
Global:
model_dir: ch_PP-OCRv3_rec_infer
model_filename: inference.pdmodel
params_filename: inference.pdiparams
model_type: rec
algorithm: SVTR
character_dict_path: ./ppocr_keys_v1.txt
max_text_length: &max_text_length 25
use_space_char: true
Distillation:
alpha: [1.0, 1.0]
loss: ['skd', 'l2']
node:
- ['linear_43.tmp_1']
- ['linear_43.tmp_1']
Quantization:
use_pact: true
activation_bits: 8
is_full_quantize: false
onnx_format: True
activation_quantize_type: moving_average_abs_max
weight_quantize_type: channel_wise_abs_max
not_quant_pattern:
- skip_quant
quantize_op_types:
- conv2d
- depthwise_conv2d
weight_bits: 8
TrainConfig:
epochs: 10
eval_iter: 2000
logging_iter: 100
learning_rate:
type: CosineAnnealingDecay
learning_rate: 0.00005
optimizer_builder:
optimizer:
type: Adam
weight_decay: 5.0e-05
PostProcess:
name: CTCLabelDecode
Metric:
name: RecMetric
main_indicator: acc
ignore_space: False
Train:
dataset:
name: SimpleDataSet
data_dir: ./train_data/icdar2015/text_localization/
ext_op_transform_idx: 1
label_file_list:
- ./train_data/icdar2015/text_localization/train_icdar2015_label.txt
transforms:
- DecodeImage:
img_mode: BGR
channel_first: false
- RecConAug:
prob: 0.5
ext_data_num: 2
image_shape: [48, 320, 3]
max_text_length: *max_text_length
- RecAug:
- MultiLabelEncode:
- RecResizeImg:
image_shape: [3, 48, 320]
- KeepKeys:
keep_keys:
- image
- label_ctc
- label_sar
- length
- valid_ratio
loader:
shuffle: true
batch_size_per_card: 64
drop_last: true
num_workers: 0
Eval:
dataset:
name: SimpleDataSet
data_dir: ./train_data/icdar2015/text_localization/
label_file_list:
- ./train_data/icdar2015/text_localization/test_icdar2015_label.txt
transforms:
- DecodeImage:
img_mode: BGR
channel_first: false
- MultiLabelEncode:
- RecResizeImg:
image_shape: [3, 48, 320]
- KeepKeys:
keep_keys:
- image
- label_ctc
- label_sar
- length
- valid_ratio
loader:
shuffle: false
drop_last: false
batch_size_per_card: 64
num_workers: 0
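The README above launches evaluation via `eval.py`; if this example follows the usual PaddleSlim ACT demo layout, compression with this recognition config would be started roughly as below (the `run.py` entry point and its `--save_dir` flag are assumptions, not confirmed by this diff):

```shell
# Hypothetical launch command; run.py and --save_dir are assumed from the
# typical PaddleSlim auto-compression example layout.
python run.py \
    --config_path='./configs/ppocrv3_rec_qat_dist.yaml' \
    --save_dir='./save_quant_ppocrv3_rec/'
```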
......@@ -12,8 +12,6 @@
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import sys
import logging
import numpy as np
import argparse
......@@ -21,11 +19,10 @@ from tqdm import tqdm
import paddle
from paddleslim.common import load_config as load_slim_config
from paddleslim.common import get_logger
from paddleslim.auto_compression import AutoCompression
import sys
sys.path.append('../PaddleOCR')
from ppocr.data import build_dataloader
from ppocr.modeling.architectures import build_model
from ppocr.losses import build_loss
from ppocr.optimizer import build_optimizer
from ppocr.postprocess import build_post_process
from ppocr.metrics import build_metric
......@@ -37,7 +34,7 @@ def argsparser():
parser.add_argument(
'--config_path',
type=str,
default='./image_classification/configs/eval.yaml',
default='./configs/ppocrv3_det_qat_dist.yaml',
help="path of compression strategy config.")
parser.add_argument(
'--model_dir',
......@@ -47,20 +44,6 @@ def argsparser():
return parser
extra_input_models = [
"SRN", "NRTR", "SAR", "SEED", "SVTR", "VisionLAN", "RobustScanner"
]
def sample_generator(loader):
def __reader__():
for indx, data in enumerate(loader):
images = np.array(data[0])
yield images
return __reader__
def eval():
devices = paddle.device.get_device().split(':')[0]
places = paddle.device._convert_to_place(devices)
......@@ -70,15 +53,13 @@ def eval():
exe,
model_filename=global_config["model_filename"],
params_filename=global_config["params_filename"])
print('Loaded model from: {}'.format(global_config["model_dir"]))
logger.info('Loaded model from: {}'.format(global_config["model_dir"]))
val_loader = build_dataloader(all_config, 'Eval', devices, logger)
post_process_class = build_post_process(all_config['PostProcess'],
global_config)
eval_class = build_metric(all_config['Metric'])
model_type = global_config['model_type']
extra_input = True if global_config[
'algorithm'] in extra_input_models else False
with tqdm(
total=len(val_loader),
......@@ -86,27 +67,20 @@ def eval():
ncols=80) as t:
for batch_id, batch in enumerate(val_loader):
images = batch[0]
if extra_input:
preds = exe.run(
val_program,
feed={feed_target_names[0]: images,
'data': batch[1:]},
fetch_list=fetch_targets)
else:
preds = exe.run(val_program,
feed={feed_target_names[0]: images},
fetch_list=fetch_targets)
preds, = exe.run(val_program,
feed={feed_target_names[0]: images},
fetch_list=fetch_targets)
batch_numpy = []
for item in batch:
batch_numpy.append(np.array(item))
if model_type == 'det':
preds_map = {'maps': preds[0]}
preds_map = {'maps': preds}
post_result = post_process_class(preds_map, batch_numpy[1])
eval_class(post_result, batch_numpy)
elif model_type == 'rec':
post_result = post_process_class(preds[0], batch_numpy[1])
post_result = post_process_class(preds, batch_numpy[1])
eval_class(post_result, batch_numpy)
t.update()
......
This diff has been collapsed.
......@@ -12,8 +12,6 @@
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import sys
import logging
from tqdm import tqdm
import numpy as np
......@@ -22,13 +20,14 @@ import paddle
from paddleslim.common import load_config as load_slim_config
from paddleslim.common import get_logger
from paddleslim.auto_compression import AutoCompression
from paddleslim.common.dataloader import get_feed_vars
import sys
sys.path.append('../PaddleOCR')
from ppocr.data import build_dataloader
from ppocr.modeling.architectures import build_model
from ppocr.losses import build_loss
from ppocr.optimizer import build_optimizer
from ppocr.postprocess import build_post_process
from ppocr.metrics import build_metric
from paddleslim.common.dataloader import get_feed_vars
logger = get_logger(__name__, level=logging.INFO)
......@@ -76,20 +75,20 @@ def eval_function(exe, compiled_test_program, test_feed_names, test_fetch_list):
ncols=80) as t:
for batch_id, batch in enumerate(val_loader):
images = batch[0]
preds = exe.run(compiled_test_program,
feed={test_feed_names[0]: images},
fetch_list=test_fetch_list)
preds, = exe.run(compiled_test_program,
feed={test_feed_names[0]: images},
fetch_list=test_fetch_list)
batch_numpy = []
for item in batch:
batch_numpy.append(np.array(item))
if model_type == 'det':
preds_map = {'maps': preds[0]}
preds_map = {'maps': preds}
post_result = post_process_class(preds_map, batch_numpy[1])
eval_class(post_result, batch_numpy)
elif model_type == 'rec':
post_result = post_process_class(preds[0], batch_numpy[1])
post_result = post_process_class(preds, batch_numpy[1])
eval_class(post_result, batch_numpy)
t.update()
metric = eval_class.get_metric()
......