Commit 2605b1c0, authored by qq_25193841

Merge remote-tracking branch 'origin/release/2.6' into release2.6

...@@ -103,7 +103,7 @@ python PPOCRLabel.py --kie True # [KIE mode] for [detection + recognition + keyw ...@@ -103,7 +103,7 @@ python PPOCRLabel.py --kie True # [KIE mode] for [detection + recognition + keyw
``` ```
#### 1.2.3 Build and Install the Whl Package Locally #### 1.2.3 Build and Install the Whl Package Locally
Compile and install a new whl package, where 1.0.2 is the version number, you can specify the new version in 'setup.py'. Compile and install a new whl package, where 2.1.2 is the version number, you can specify the new version in 'setup.py'.
```bash ```bash
cd ./PPOCRLabel cd ./PPOCRLabel
python3 setup.py bdist_wheel python3 setup.py bdist_wheel
...@@ -101,7 +101,7 @@ python PPOCRLabel.py --lang ch ...@@ -101,7 +101,7 @@ python PPOCRLabel.py --lang ch
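#### 1.2.3 Build and Install the Whl Package Locally #### 1.2.3 Build and Install the Whl Package Locally
Build and install a new whl package, where 1.0.2 is the version number; a new version can be specified in `setup.py`. Build and install a new whl package, where 2.1.2 is the version number; a new version can be specified in `setup.py`.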
```bash ```bash
cd ./PPOCRLabel cd ./PPOCRLabel
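# Intelligent Operations: General Chinese Table Recognition
- [1. Background](#1-背景介绍)
- [2. Chinese Table Recognition](#2-中文表格识别)
- [2.1 Environment Setup](#21-环境准备)
- [2.2 Preparing the Dataset](#22-准备数据集)
- [2.2.1 Splitting into Training and Test Sets](#221-划分训练测试集)
- [2.2.2 Inspecting the Dataset](#222-查看数据集)
- [2.3 Training](#23-训练)
- [2.4 Evaluation](#24-验证)
- [2.5 Inference with the Training Engine](#25-训练引擎推理)
- [2.6 Model Export](#26-模型导出)
- [2.7 Inference with the Prediction Engine](#27-预测引擎推理)
- [2.8 Table Recognition](#28-表格识别)
- [3. Table Attribute Recognition](#3-表格属性识别)
- [3.1 Code, Environment, and Data Preparation](#31-代码环境数据准备)
- [3.1.1 Code](#311-代码准备)
- [3.1.2 Environment](#312-环境准备)
- [3.1.3 Data](#313-数据准备)
- [3.2 Training Table Attribute Recognition](#32-表格属性识别训练)
- [3.3 Inference and Deployment for Table Attribute Recognition](#33-表格属性识别推理和部署)
- [3.3.1 Model Conversion](#331-模型转换)
- [3.3.2 Model Inference](#332-模型推理)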
## 1. Background
AI Studio project for this tutorial: https://aistudio.baidu.com/aistudio/projectdetail/4588067
## 2. Chinese Table Recognition
### 2.1 Environment Setup
# Download the PaddleOCR code
! git clone -b dygraph https://gitee.com/paddlepaddle/PaddleOCR
# Install the PaddleOCR dependencies
! pip install -r PaddleOCR/requirements.txt --force-reinstall
! pip install protobuf==3.19
### 2.2 Preparing the Dataset
! cd data/data165849 && tar -xf table_gen_dataset.tar && cd -
! wc -l data/data165849/table_gen_dataset/gt.txt
#### 2.2.1 Splitting into Training and Test Sets
The code below splits the dataset into a training set (90%) and a test set (10%).
import random

with open('/home/aistudio/data/data165849/table_gen_dataset/gt.txt') as f:
    lines = f.readlines()
# Shuffle so that the split does not depend on the annotation order
random.shuffle(lines)
train_len = int(len(lines) * 0.9)
train_list = lines[:train_len]
val_list = lines[train_len:]

# Save the results
with open('/home/aistudio/train.txt', 'w', encoding='utf-8') as f:
    f.writelines(train_list)
with open('/home/aistudio/val.txt', 'w', encoding='utf-8') as f:
    f.writelines(val_list)
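The split above can be wrapped in a small helper so that it is repeatable. The following is a minimal sketch, not part of the original notebook: the function name `split_lines`, the `seed` parameter, and the in-memory sample lines are assumptions made for illustration.

```python
import random


def split_lines(lines, train_ratio=0.9, seed=0):
    """Shuffle a copy of `lines` and split it into (train, val) lists."""
    shuffled = list(lines)                 # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed gives a repeatable split
    train_len = int(len(shuffled) * train_ratio)
    return shuffled[:train_len], shuffled[train_len:]


# Hypothetical annotation lines standing in for gt.txt
sample = [f'{{"filename": "img/{i}.jpg"}}\n' for i in range(10)]
train_list, val_list = split_lines(sample)
print(len(train_list), len(val_list))
```

Using a private `random.Random(seed)` instance keeps the split stable across runs without touching the global random state.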
#### 2.2.2 Inspecting the Dataset
import cv2
import os, json
import numpy as np
from matplotlib import pyplot as plt
%matplotlib inline

def parse_line(data_dir, line):
    # Each annotation line is a JSON object describing one table image
    data_line = line.strip("\n")
    info = json.loads(data_line)
    file_name = info['filename']
    cells = info['html']['cells'].copy()
    structure = info['html']['structure']['tokens'].copy()

    img_path = os.path.join(data_dir, file_name)
    if not os.path.exists(img_path):
        return None
    data = {
        'img_path': img_path,
        'cells': cells,
        'structure': structure,
        'file_name': file_name
    }
    return data

def draw_bbox(img_path, points, color=(255, 0, 0), thickness=2):
    # Accept either an image path or an already loaded image
    if isinstance(img_path, str):
        img_path = cv2.imread(img_path)
    img_path = img_path.copy()
    for point in points:
        cv2.polylines(img_path, [point.astype(int)], True, color, thickness)
    return img_path

def rebuild_html(data):
    html_code = data['structure']
    cells = data['cells']
    to_insert = [i for i, tag in enumerate(html_code) if tag in ('<td>', '>')]

    # Insert from the end so that earlier indices stay valid
    for i, cell in zip(to_insert[::-1], cells[::-1]):
        if cell['tokens']:
            text = ''.join(cell['tokens'])
            # skip empty text
            sp_char_list = ['<b>', '</b>', '\u2028', ' ', '<i>', '</i>']
            text_remove_style = skip_char(text, sp_char_list)
            if len(text_remove_style) == 0:
                continue
            html_code.insert(i + 1, text)

    html_code = ''.join(html_code)
    return html_code

def skip_char(text, sp_char_list):
    """
    skip empty cell
    @param text: text in cell
    @param sp_char_list: style char and special code
    @return: text with the style chars and special codes removed
    """
    for sp_char in sp_char_list:
        text = text.replace(sp_char, '')
    return text

save_dir = '/home/aistudio/vis'
os.makedirs(save_dir, exist_ok=True)
image_dir = '/home/aistudio/data/data165849/'
html_str = '<table border="1">'

# Parse the annotation and rebuild the HTML table
data = parse_line(image_dir, val_list[0])

img = cv2.imread(data['img_path'])
img_name = ''.join(os.path.basename(data['file_name']).split('.')[:-1])
img_save_name = os.path.join(save_dir, img_name)
boxes = [np.array(x['bbox']) for x in data['cells']]
show_img = draw_bbox(data['img_path'], boxes)
cv2.imwrite(img_save_name + '_show.jpg', show_img)

html = rebuild_html(data)
html_str += html
html_str += '</table>'

# Display the annotated HTML string
from IPython.core.display import display, HTML
display(HTML(html_str))
# Display the cell coordinates
plt.imshow(show_img)
plt.show()
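Before visualizing many samples, it can help to confirm that an annotation is internally consistent. The following is a minimal sketch, assuming the annotation layout read by `parse_line` above (`html.structure.tokens` and `html.cells`) and assuming that a bare `'>'` token closes a `<td` tag with attributes; the helper `count_cell_slots` and the sample annotation are hypothetical.

```python
import json


def count_cell_slots(structure_tokens):
    """Count positions where cell text is inserted: after '<td>' or after the
    '>' that closes a '<td' tag with attributes (the rule used by rebuild_html)."""
    return sum(1 for tag in structure_tokens if tag in ('<td>', '>'))


# Hypothetical one-row annotation in the same layout as gt.txt
line = json.dumps({
    "filename": "img/demo.jpg",
    "html": {
        "structure": {"tokens": ["<tr>", "<td>", "</td>",
                                 "<td", ' colspan="2"', ">", "</td>", "</tr>"]},
        "cells": [{"tokens": ["a"], "bbox": [0, 0, 10, 10]},
                  {"tokens": ["b"], "bbox": [10, 0, 30, 10]}],
    },
})

info = json.loads(line)
slots = count_cell_slots(info["html"]["structure"]["tokens"])
cells = len(info["html"]["cells"])
print(slots, cells, slots == cells)
```

If the two counts differ, `zip` in `rebuild_html` silently drops the surplus entries, so a mismatch is worth flagging early.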
### 2.3 Training
|Algorithm|Acc|[TEDS(Tree-Edit-Distance-based Similarity)](https://github.com/ibm-aur-nlp/PubTabNet/tree/master/src)|Speed|
| --- | --- | --- | ---|
| EDD<sup>[2]</sup> |x| 88.3% |x|
| TableRec-RARE(ours) | 71.73%| 93.88% |779ms|
| SLANet(ours) | 76.31%| 95.89%|766ms|
# Enter the PaddleOCR working directory
# Download the English pre-trained model
! wget -nc -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/en_ppstructure_mobile_v2.0_SLANet_train.tar --no-check-certificate
! cd ./pretrain_models/ && tar xf en_ppstructure_mobile_v2.0_SLANet_train.tar && cd ../
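|Parameter|Value|Description|
| --- | --- | --- |
|Optimizer.lr.name|Const|Learning rate scheduler |
|Optimizer.lr.learning_rate|0.0005|Learning rate, set to 0.05 times the previous value |
|Train.dataset.data_dir|/home/aistudio/data/data165849|Directory containing the training images |
|Train.dataset.label_file_list|/home/aistudio/data/data165849/table_gen_dataset/train.txt|Annotation file for the training set |
|Train.loader.batch_size_per_card|32|Batch size per card during training |
|Train.loader.num_workers|1|Number of worker processes for loading training data; must be set to 1 on AI Studio |
|Eval.dataset.data_dir|/home/aistudio/data/data165849|Directory containing the test images |
|Eval.dataset.label_file_list|/home/aistudio/data/data165849/table_gen_dataset/val.txt|Annotation file for the test set |
|Eval.loader.batch_size_per_card|32|Batch size per card during evaluation |
|Eval.loader.num_workers|1|Number of worker processes for loading test data; must be set to 1 on AI Studio |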
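Each row in the table names a nested YAML key using dotted notation. As an illustration of how such keys map onto a nested configuration, here is a minimal sketch using plain dictionaries; `apply_overrides` is a hypothetical helper, not a PaddleOCR function, and the starting values in `config` are placeholders.

```python
def apply_overrides(config, overrides):
    """Set dotted keys such as 'Train.loader.num_workers' in a nested dict."""
    for dotted_key, value in overrides.items():
        node = config
        *parents, leaf = dotted_key.split('.')
        for name in parents:
            node = node.setdefault(name, {})  # create missing levels on the way
        node[leaf] = value
    return config


# Placeholder starting values; only the overrides come from the table above
config = {"Optimizer": {"lr": {"name": "Cosine", "learning_rate": 0.001}}}
overrides = {
    "Optimizer.lr.name": "Const",
    "Optimizer.lr.learning_rate": 0.0005,
    "Train.loader.batch_size_per_card": 32,
    "Train.loader.num_workers": 1,
}
apply_overrides(config, overrides)
print(config["Optimizer"]["lr"], config["Train"]["loader"])
```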
The modified configuration is stored in `/home/aistudio/SLANet_ch.yml`.
import os
! python3 tools/train.py -c /home/aistudio/SLANet_ch.yml
The best accuracy, 97.49%, is reached after about 7 epochs.
### 2.4 Evaluation
! python3 tools/eval.py -c /home/aistudio/SLANet_ch.yml -o Global.checkpoints=/home/aistudio/PaddleOCR/output/SLANet_ch/best_accuracy.pdparams
### 2.5 Inference with the Training Engine
import os;os.chdir('/home/aistudio/PaddleOCR')
! python3 tools/infer_table.py -c /home/aistudio/SLANet_ch.yml -o Global.checkpoints=/home/aistudio/PaddleOCR/output/SLANet_ch/best_accuracy.pdparams Global.infer_img=/home/aistudio/data/data165849/table_gen_dataset/img/no_border_18298_G7XZH93DDCMATGJQ8RW2.jpg
import cv2
from matplotlib import pyplot as plt
%matplotlib inline
# Show the original image
show_img = cv2.imread('/home/aistudio/data/data165849/table_gen_dataset/img/no_border_18298_G7XZH93DDCMATGJQ8RW2.jpg')
# Show the predicted cells
show_img = cv2.imread('/home/aistudio/PaddleOCR/output/infer/no_border_18298_G7XZH93DDCMATGJQ8RW2.jpg')
### 2.6 Model Export
! python3 tools/export_model.py -c /home/aistudio/SLANet_ch.yml -o Global.checkpoints=/home/aistudio/PaddleOCR/output/SLANet_ch/best_accuracy.pdparams Global.save_inference_dir=/home/aistudio/SLANet_ch/infer
### 2.7 Inference with the Prediction Engine
! python3 table/predict_structure.py \
--table_model_dir=/home/aistudio/SLANet_ch/infer \
--table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt \
--image_dir=/home/aistudio/data/data165849/table_gen_dataset/img/no_border_18298_G7XZH93DDCMATGJQ8RW2.jpg
# Show the original image
show_img = cv2.imread('/home/aistudio/data/data165849/table_gen_dataset/img/no_border_18298_G7XZH93DDCMATGJQ8RW2.jpg')
# Show the predicted cells
show_img = cv2.imread('/home/aistudio/PaddleOCR/output/inference/no_border_18298_G7XZH93DDCMATGJQ8RW2.jpg')
### 2.8 Table Recognition
# Download and extract the PP-OCRv3 text detection and recognition models
! wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_slim_infer.tar --no-check-certificate
! wget -nc -P ./inference/ https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_slim_infer.tar --no-check-certificate
! cd ./inference/ && tar xf ch_PP-OCRv3_det_slim_infer.tar && tar xf ch_PP-OCRv3_rec_slim_infer.tar && cd ../
import os;os.chdir('/home/aistudio/PaddleOCR/ppstructure')
! python3 table/predict_table.py \
--det_model_dir=inference/ch_PP-OCRv3_det_slim_infer \
--rec_model_dir=inference/ch_PP-OCRv3_rec_slim_infer \
--table_model_dir=/home/aistudio/SLANet_ch/infer \
--rec_char_dict_path=../ppocr/utils/ppocr_keys_v1.txt \
--table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt \
--image_dir=/home/aistudio/data/data165849/table_gen_dataset/img/no_border_18298_G7XZH93DDCMATGJQ8RW2.jpg
# Show the original image
show_img = cv2.imread('/home/aistudio/data/data165849/table_gen_dataset/img/no_border_18298_G7XZH93DDCMATGJQ8RW2.jpg')
# Show the prediction result
from IPython.core.display import display, HTML
display(HTML('<html><body><table><tr><td colspan="5">alleadersh</td><td rowspan="2">不贰过,推</td><td rowspan="2">从自己参与浙江数</td><td rowspan="2">。另一方</td></tr><tr><td>AnSha</td><td>自己越</td><td>共商共建工作协商</td><td>w.east </td><td>抓好改革试点任务</td></tr><tr><td>Edime</td><td>ImisesElec</td><td>怀天下”。</td><td></td><td>22.26 </td><td>31.61</td><td>4.30 </td><td>794.94</td></tr><tr><td rowspan="2">ip</td><td> Profundi</td><td>:2019年12月1</td><td>Horspro</td><td>444.48</td><td>2.41 </td><td>87</td><td>679.98</td></tr><tr><td> iehaiTrain</td><td>组长蒋蕊</td><td>Toafterdec</td><td>203.43</td><td>23.54 </td><td>4</td><td>4266.62</td></tr><tr><td>Tyint </td><td> roudlyRol</td><td>谢您的好意,我知道</td><td>ErChows</td><td></td><td>48.90</td><td>1031</td><td>6</td></tr><tr><td>NaFlint</td><td></td><td>一辈的</td><td>aterreclam</td><td>7823.86</td><td>9829.23</td><td>7.96 </td><td> 3068</td></tr><tr><td>家上下游企业,5</td><td>Tr</td><td>景象。当地球上的我们</td><td>Urelaw</td><td>799.62</td><td>354.96</td><td>12.98</td><td>33 </td></tr><tr><td>赛事(</td><td> uestCh</td><td>复制的业务模式并</td><td>Listicjust</td><td>9.23</td><td></td><td>92</td><td>53.22</td></tr><tr><td> Ca</td><td> Iskole</td><td>扶贫"之名引导</td><td> Papua </td><td>7191.90</td><td>1.65</td><td>3.62</td><td>48</td></tr><tr><td rowspan="2">避讳</td><td>ir</td><td>但由于</td><td>Fficeof</td><td>0.22</td><td>6.37</td><td>7.17</td><td>3397.75</td></tr><tr><td>ndaTurk</td><td>百处遗址</td><td>gMa</td><td>1288.34</td><td>2053.66</td><td>2.29</td><td>885.45</td></tr></table></body></html>'))
## 3. Table Attribute Recognition
### 3.1 Code, Environment, and Data Preparation
#### 3.1.1 Code
! git clone -b develop https://gitee.com/paddlepaddle/PaddleClas
#### 3.1.2 Environment
! pip install -r PaddleClas/requirements.txt --force-reinstall
! pip install protobuf==3.20.0
#### 3.1.3 Data
%cd PaddleClas/dataset
!wget https://paddleclas.bj.bcebos.com/data/PULC/table_attribute.tar
!tar -xf table_attribute.tar
%cd ../PaddleClas/dataset
%cd ../
### 3.2 Training Table Attribute Recognition
!python tools/train.py -c ./ppcls/configs/PULC/table_attribute/PPLCNet_x1_0.yaml -o Global.device=cpu -o Global.epochs=10
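### 3.3 Inference and Deployment for Table Attribute Recognition
#### 3.3.1 Model Conversion
!python tools/export_model.py -c ppcls/configs/PULC/table_attribute/PPLCNet_x1_0.yaml -o Global.pretrained_model=output/PPLCNet_x1_0/best_model
#### 3.3.2 Model Inference
Install the paddleclas package required for inference. At this point it has to be installed from the develop whl package: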
!pip install https://paddleclas.bj.bcebos.com/whl/paddleclas-0.0.0-py3-none-any.whl
%cd deploy/
!python python/predict_cls.py -c configs/PULC/table_attribute/inference_table_attribute.yaml -o Global.inference_model_dir="../inference" -o Global.infer_imgs="../dataset/table_attribute/Table_val/val_9.jpg"
!python python/predict_cls.py -c configs/PULC/table_attribute/inference_table_attribute.yaml -o Global.inference_model_dir="../inference" -o Global.infer_imgs="../dataset/table_attribute/Table_val/val_3253.jpg"
val_9.jpg: {'attributes': ['Scanned', 'Little', 'Black-and-White', 'Clear', 'Without-Obstacles', 'Horizontal'], 'output': [1, 1, 1, 1, 1, 1]}
val_3253.jpg: {'attributes': ['Photo', 'Little', 'Black-and-White', 'Blurry', 'Without-Obstacles', 'Tilted'], 'output': [0, 1, 1, 0, 1, 0]}
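The two result lines above pair a list of attribute names with a list of binary flags. A minimal sketch of turning one such result into a lookup table; the variable names are illustrative, and the dictionary literal simply copies the `val_3253.jpg` line above.

```python
# Result copied from the val_3253.jpg line above
result = {
    'attributes': ['Photo', 'Little', 'Black-and-White', 'Blurry',
                   'Without-Obstacles', 'Tilted'],
    'output': [0, 1, 1, 0, 1, 0],
}

# Pair each predicted attribute name with its binary flag
flags = dict(zip(result['attributes'], result['output']))
for name, flag in flags.items():
    print(f'{name}: {flag}')
```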
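# Intelligent Financial Verification: Key Information Extraction from Scanned Contracts
1. Extract the text of the scanned document with PaddleOCR
2. Extract custom information with PaddleNLP
Open the [AI Studio project](https://aistudio.baidu.com/aistudio/projectdetail/4545772)
## 1. Project Background
- Contract comparison: during contract review, quickly locate the regions that changed between contract versions; for example, when stamped contracts are archived, reliably identify differences between the signed paper contract and its electronic version.
- Compliance checks: legal staff review contracts for completeness, consistency between amounts written in words and in figures, consistency of the contracting parties, balance between the rights and obligations of both parties, and so on.
- Risk identification: contract review can surface fact-based and calculation-based risk points, such as an unclear delivery location, inconsistent total contract prices, or missing key clauses.
## 2. Solution
### 2.1 Extracting Text from Scanned Contracts
#### 2.1.1 Environment Setup
python -m pip install paddleocr
#### 2.1.2 Testing the Results
<img src=https://ai-studio-static-online.cdn.bcebos.com/46258d0dc9dc40bab3ea0e70434e4a905646df8a647f4c49921e217de5142def width=300>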
from paddleocr import PaddleOCR, draw_ocr
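# PaddleOCR currently supports 80 languages, including Chinese and English, English, French, German, Korean, and Japanese; switch between them with the lang parameter
ocr = PaddleOCR(use_angle_cls=False, lang="ch") # need to run only once to download and load model into memory
img_path = "./test_img/hetong2.jpg"
result = ocr.ocr(img_path, cls=False)
for line in result:
    print(line)
# Visualize the results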
from PIL import Image
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='./simfang.ttf')
im_show = Image.fromarray(im_show)
#### 2.1.3 Image Preprocessing
import cv2
import numpy as np
import matplotlib.pyplot as plt
image=cv2.imread("./test_img/hetong2.jpg",cv2.IMREAD_COLOR) #timg.jpeg
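
The next step reads `./red_channel.jpg`, which suggests that preprocessing keeps only the red channel so that red stamps fade into the paper background. A minimal sketch of that idea, assuming OpenCV's BGR channel order; it uses a small synthetic NumPy array instead of the contract image, so the pixel values are illustrative.

```python
import numpy as np

# Synthetic 1x3 BGR image: white paper, a red stamp pixel, black text
image = np.array([[[255, 255, 255],   # white background
                   [0, 0, 255],       # pure red (B=0, G=0, R=255)
                   [0, 0, 0]]],       # black text
                 dtype=np.uint8)

# OpenCV stores channels as B, G, R, so index 2 is the red channel
red_channel = image[:, :, 2]

# In the red channel the stamp is as bright as the paper; the text stays dark
print(red_channel.tolist())
```

With a real image loaded by `cv2.imread`, `cv2.split(image)` returns the same three channels, and the red one can be written out with `cv2.imwrite`.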
#### 2.1.4 Extracting the Contract Text
import numpy as np
import cv2
img_path = './red_channel.jpg'
result = ocr.ocr(img_path, cls=False)
# Visualize the results
from PIL import Image
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='./simfang.ttf')
im_show = Image.fromarray(im_show)
vis = np.array(im_show)
txts = [line[1][0] for line in result]
all_context = "\n".join(txts)
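### 2.2 Extracting Key Contract Information
#### 2.2.1 Environment Setup
pip install --upgrade pip
pip install --upgrade paddlenlp
#### 2.2.2 Extracting Key Contract Information
PaddleNLP uses Taskflow to manage prediction for tasks across many scenarios through a single interface. The `information_extraction` task is trained on a large number of labeled samples and can generally be used as-is in common scenarios by changing only the keywords. For contract information extraction, for example, we redefine the keywords to extract: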
from paddlenlp import Taskflow
schema = ["甲方","乙方","总价"]
ie = Taskflow('information_extraction', schema=schema)
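
Once the extractor has been applied to the OCR text, its output usually needs a little post-processing. A minimal sketch, assuming the result is a list with one dictionary per input text whose values are lists of spans carrying `text` and `probability` fields; `best_spans` and the sample data are hypothetical and not real model output.

```python
def best_spans(ie_result):
    """Keep the highest-probability span text for each schema key."""
    best = {}
    for doc in ie_result:                      # one dict per input text
        for key, spans in doc.items():
            top = max(spans, key=lambda s: s['probability'])
            best[key] = top['text']
    return best


# Hand-written sample in the assumed result layout; not real model output
sample_result = [{
    '甲方': [{'text': 'Company A', 'start': 3, 'end': 12, 'probability': 0.98}],
    '总价': [{'text': '1000', 'start': 40, 'end': 44, 'probability': 0.61},
             {'text': '1200', 'start': 60, 'end': 64, 'probability': 0.87}],
}]
print(best_spans(sample_result))
```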
## 3. Result Optimization
### 3.1 Tuning OCR Post-processing
<img src="https://ai-studio-static-online.cdn.bcebos.com/fe350481be0241c58736d487d1bf06c2e65911bf01254a79944be629c4c10091" height="300" width="300">
img_path = "./test_img/hetong3.jpg"
# Prediction results
result = ocr.ocr(img_path, cls=False)
# Visualize the results
from PIL import Image
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='./simfang.ttf')
im_show = Image.fromarray(im_show)
- Enable `use_dilation=True` to dilate the segmentation regions
- Lower the `det_db_box_thresh` threshold
# Re-instantiate PaddleOCR
ocr = PaddleOCR(use_angle_cls=False, lang="ch", det_db_box_thresh=0.3, use_dilation=True)
# Predict and visualize
img_path = "./test_img/hetong3.jpg"
# Prediction results
result = ocr.ocr(img_path, cls=False)
# Visualize the results
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='./simfang.ttf')
im_show = Image.fromarray(im_show)
txts = [line[1][0] for line in result]
context = "\n".join(txts)
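### 3.2 Tuning Key Information Extraction
UIE is trained on a large number of labeled samples, which yields a high-accuracy model that works out of the box. In some scenarios, however, certain entities may fail to be extracted. The following methods are commonly used to improve the results:
- Modify the schema
- Add regular-expression rules
- Fine-tune the model on a small number of labeled samples
With schema = ["总金额"], extraction is inaccurate because the keyword differs too much from the wording in the source text. Change it to schema = ["总价"] and try again: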
from paddlenlp import Taskflow
# schema = ["总金额"]
schema = ["总价"]
ie = Taskflow('information_extraction', schema=schema)
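
For the regular-expression option listed above, a minimal sketch: it pulls the amount that follows a price keyword out of OCR text. The pattern, the keyword list, and the sample text are assumptions for illustration and would need adapting to real contracts.

```python
import re

# Keyword, optional separator, optional currency sign, then a number
PRICE_PATTERN = re.compile(
    r'(?:总价|总金额)\s*[::]?\s*[¥¥]?\s*(\d+(?:,\d{3})*(?:\.\d+)?)')


def find_total_price(text):
    """Return the first amount after a price keyword, or None if absent."""
    match = PRICE_PATTERN.search(text)
    return match.group(1) if match else None


sample_text = "甲方:某公司\n乙方:某单位\n合同总价:12,500.00元"
print(find_total_price(sample_text))
```

A rule like this can serve as a fallback when the schema-based extractor returns nothing for a field.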
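UIE is modeled mainly through `Prompt`s, and fine-tuning on a small number of samples with `Prompt`s is very effective. For detailed data annotation and model fine-tuning steps, refer to the project: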
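## Summary
Key information extraction from scanned contracts can be implemented with PaddleOCR combined with PaddleNLP. Both tools offer the following advantages:
* Easy to use: the whl package installs with one command, and inference takes three lines of code
* Leading results: strong model performance covers almost all application scenarios
* Low tuning cost: the OCR model can be adapted to slightly skewed scanned text by adjusting post-processing parameters, and the UIE model can be fine-tuned with very few labeled samples.
## Exercise
Try to extract the [甲方, 乙方] (Party A, Party B) keywords from the scanned contract `test_img/homework.png` yourself:
<img src=https://ai-studio-static-online.cdn.bcebos.com/50a49a3c9f8348bfa04e8c8b97d3cce0d0dd6b14040f43939268d120688ef7ca width=300 height=400>
<img src=https://ai-studio-static-online.cdn.bcebos.com/606538b59ea845cb99943b1dec6efe724e78f75c1e9c49228c7bf7da9f8837f5 width=300 height=300>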
# Paddle2ONNX Model Conversion and Prediction # Paddle2ONNX model transformation and prediction
This section describes how to convert a PaddleOCR model into an ONNX model and run prediction with the ONNXRuntime engine. This chapter describes how the PaddleOCR model is converted into an ONNX model and predicted based on the ONNXRuntime engine.
## 1. Environment Preparation ## 1. Environment preparation
The PaddleOCR and Paddle2ONNX model conversion environment and the ONNXRuntime prediction environment need to be prepared Need to prepare PaddleOCR, Paddle2ONNX model conversion environment, and ONNXRuntime prediction environment
### PaddleOCR ### PaddleOCR
Clone the PaddleOCR repository, use the release/2.4 branch, and install it. Because the PaddleOCR repository is large and git clone is slow, this tutorial has already downloaded it Clone the PaddleOCR repository, use the release/2.6 branch, and install it.
``` ```
git clone -b release/2.4 https://github.com/PaddlePaddle/PaddleOCR.git git clone -b release/2.6 https://github.com/PaddlePaddle/PaddleOCR.git
cd PaddleOCR && python3.7 setup.py install cd PaddleOCR && python3.7 setup.py install
``` ```
### Paddle2ONNX ### Paddle2ONNX
Paddle2ONNX converts models from the PaddlePaddle format to the ONNX format. Operator export is currently stable for ONNX Opset 9~11, and some Paddle operators also support conversion to lower ONNX Opset versions. Paddle2ONNX supports converting the PaddlePaddle model format to the ONNX model format. The operator currently supports exporting ONNX Opset 9~11 stably, and some Paddle operators support lower ONNX Opset conversion.
For more details, see [Paddle2ONNX](https://github.com/PaddlePaddle/Paddle2ONNX/blob/develop/README_zh.md) For more details, please refer to [Paddle2ONNX](https://github.com/PaddlePaddle/Paddle2ONNX/blob/develop/README_en.md)
- Install Paddle2ONNX
- install Paddle2ONNX
``` ```
python3.7 -m pip install paddle2onnx python3.7 -m pip install paddle2onnx
``` ```
- Install ONNXRuntime - install ONNXRuntime
``` ```
# Version 1.9.0 is recommended; change the version number to suit your environment # It is recommended to install version 1.9.0, and the version number can be changed according to the environment
python3.7 -m pip install onnxruntime==1.9.0 python3.7 -m pip install onnxruntime==1.9.0
``` ```
## 2. 模型转换 ## 2. Model conversion
- Paddle model download - Paddle model download
There are two ways to obtain a Paddle static-graph model: download a prediction model provided by PaddleOCR from [model_list](../../doc/doc_ch/models_list.md); There are two ways to obtain the Paddle model: Download the prediction model provided by PaddleOCR in [model_list](../../doc/doc_en/models_list_en.md);
or follow the [model export instructions](../../doc/doc_ch/inference.md#训练模型转inference模型) to convert trained weights into an inference_model. Refer to [Model Export Instructions](../../doc/doc_en/inference_en.md#1-convert-training-model-to-inference-model) to convert the trained weights to inference_model.
Take the ppocr Chinese detection, recognition, and classification models as an example: Take the PP-OCRv3 detection, recognition, and classification model as an example:
``` ```
wget -nc -P ./inference https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar wget -nc -P ./inference https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_det_infer.tar
cd ./inference && tar xf ch_PP-OCRv2_det_infer.tar && cd .. cd ./inference && tar xf en_PP-OCRv3_det_infer.tar && cd ..
wget -nc -P ./inference https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar wget -nc -P ./inference https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_rec_infer.tar
cd ./inference && tar xf ch_PP-OCRv2_rec_infer.tar && cd .. cd ./inference && tar xf en_PP-OCRv3_rec_infer.tar && cd ..
wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar wget -nc -P ./inference https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar
cd ./inference && tar xf ch_ppocr_mobile_v2.0_cls_infer.tar && cd .. cd ./inference && tar xf ch_ppocr_mobile_v2.0_cls_infer.tar && cd ..
``` ```
- Model conversion - convert model
Use Paddle2ONNX to convert the Paddle static-graph model to the ONNX model format: Convert Paddle inference model to ONNX model format using Paddle2ONNX:
``` ```
paddle2onnx --model_dir ./inference/ch_PP-OCRv2_det_infer \ paddle2onnx --model_dir ./inference/en_PP-OCRv3_det_infer \
--model_filename inference.pdmodel \ --model_filename inference.pdmodel \
--params_filename inference.pdiparams \ --params_filename inference.pdiparams \
--save_file ./inference/det_onnx/model.onnx \ --save_file ./inference/det_onnx/model.onnx \
...@@ -65,7 +66,7 @@ paddle2onnx --model_dir ./inference/ch_PP-OCRv2_det_infer \ ...@@ -65,7 +66,7 @@ paddle2onnx --model_dir ./inference/ch_PP-OCRv2_det_infer \
--input_shape_dict="{'x':[-1,3,-1,-1]}" \ --input_shape_dict="{'x':[-1,3,-1,-1]}" \
--enable_onnx_checker True --enable_onnx_checker True
paddle2onnx --model_dir ./inference/ch_PP-OCRv2_rec_infer \ paddle2onnx --model_dir ./inference/en_PP-OCRv3_rec_infer \
--model_filename inference.pdmodel \ --model_filename inference.pdmodel \
--params_filename inference.pdiparams \ --params_filename inference.pdiparams \
--save_file ./inference/rec_onnx/model.onnx \ --save_file ./inference/rec_onnx/model.onnx \
...@@ -81,136 +82,89 @@ paddle2onnx --model_dir ./inference/ch_ppocr_mobile_v2.0_cls_infer \ ...@@ -81,136 +82,89 @@ paddle2onnx --model_dir ./inference/ch_ppocr_mobile_v2.0_cls_infer \
--input_shape_dict="{'x':[-1,3,-1,-1]}" \ --input_shape_dict="{'x':[-1,3,-1,-1]}" \
--enable_onnx_checker True --enable_onnx_checker True
``` ```
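The three conversion commands differ only in the model directory and the output path, so they can be generated from one template. A minimal sketch that builds the argument lists using only the flags visible in the diff above (lines hidden by collapsed hunks are omitted); the helper name `paddle2onnx_command` is hypothetical.

```python
import shlex


def paddle2onnx_command(model_dir, save_file):
    """Build the paddle2onnx argument list using only the flags shown above."""
    return [
        "paddle2onnx",
        "--model_dir", model_dir,
        "--model_filename", "inference.pdmodel",
        "--params_filename", "inference.pdiparams",
        "--save_file", save_file,
        # Dynamic batch, height, and width; only the channel count is fixed
        "--input_shape_dict={'x':[-1,3,-1,-1]}",
        "--enable_onnx_checker", "True",
    ]


models = {
    "det": "./inference/en_PP-OCRv3_det_infer",
    "rec": "./inference/en_PP-OCRv3_rec_infer",
    "cls": "./inference/ch_ppocr_mobile_v2.0_cls_infer",
}
commands = {
    name: paddle2onnx_command(model_dir, f"./inference/{name}_onnx/model.onnx")
    for name, model_dir in models.items()
}
for cmd in commands.values():
    print(shlex.join(cmd))   # pass `cmd` to subprocess.run to execute it
```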
After execution, the ONNX model will be saved in `./inference/det_onnx/`, `./inference/rec_onnx/`, `./inference/cls_onnx/` paths respectively
After execution, the ONNX models are saved under `./inference/det_onnx/`, `./inference/rec_onnx/`, and `./inference/cls_onnx/` respectively * Note: For the OCR model, the conversion process must use dynamic shapes, that is, add the option --input_shape_dict="{'x': [-1, 3, -1, -1]}"; otherwise the prediction results may differ slightly from those obtained by predicting directly with Paddle.
In addition, the following models do not currently support conversion to ONNX models:
* Note: For OCR models, conversion must use dynamic shapes, that is, add the option --input_shape_dict="{'x': [-1, 3, -1, -1]}"; otherwise the prediction results may differ slightly from those obtained by predicting directly with Paddle. NRTR, SAR, RARE, SRN
In addition, the following models do not currently support conversion to ONNX models:
## 3. Inference and Prediction ## 3. prediction
Taking the Chinese OCR model as an example, run the following command to predict with ONNXRuntime: Take the English OCR model as an example, use **ONNXRuntime** to predict and execute the following commands:
``` ```
python3.7 tools/infer/predict_system.py --use_gpu=False --use_onnx=True \ python3.7 tools/infer/predict_system.py --use_gpu=False --use_onnx=True \
--det_model_dir=./inference/det_onnx/model.onnx \ --det_model_dir=./inference/det_onnx/model.onnx \
--rec_model_dir=./inference/rec_onnx/model.onnx \ --rec_model_dir=./inference/rec_onnx/model.onnx \
--cls_model_dir=./inference/cls_onnx/model.onnx \ --cls_model_dir=./inference/cls_onnx/model.onnx \
--image_dir=./deploy/lite/imgs/lite_demo.png --image_dir=doc/imgs_en/img_12.jpg
``` ```
Taking the Chinese OCR model as an example, run the following command to predict with Paddle Inference: Taking the English OCR model as an example, use **Paddle Inference** to predict and execute the following commands:
``` ```
python3.7 tools/infer/predict_system.py --use_gpu=False \ python3.7 tools/infer/predict_system.py --use_gpu=False \
--cls_model_dir=./inference/ch_ppocr_mobile_v2.0_cls_infer \ --cls_model_dir=./inference/ch_ppocr_mobile_v2.0_cls_infer \
--rec_model_dir=./inference/ch_PP-OCRv2_rec_infer \ --rec_model_dir=./inference/en_PP-OCRv3_rec_infer \
--det_model_dir=./inference/ch_PP-OCRv2_det_infer \ --det_model_dir=./inference/en_PP-OCRv3_det_infer \
--image_dir=./deploy/lite/imgs/lite_demo.png --image_dir=doc/imgs_en/img_12.jpg
``` ```
After the command runs, the recognition results are printed in the terminal and the visualization is saved under `./inference_results/`. After executing the command, the predicted identification information will be printed out in the terminal, and the visualization results will be saved under `./inference_results/`.
ONNXRuntime result ONNXRuntime result
<div align="center"> <div align="center">
<img src="./images/lite_demo_onnx.png" width="800"> <img src="../../doc/imgs_results/multi_lang/img_12.jpg" width="800">
</div> </div>
Paddle Inference result Paddle Inference result
<div align="center"> <div align="center">
<img src="./images/lite_demo_paddle.png" width="800"> <img src="../../doc/imgs_results/multi_lang/img_12.jpg" width="800">
</div> </div>
Terminal output when predicting with ONNXRuntime: Using ONNXRuntime to predict, terminal output:
``` ```
[2022/02/22 17:48:27] root DEBUG: dt_boxes num : 38, elapse : 0.043187856674194336 [2022/10/10 12:06:28] ppocr DEBUG: dt_boxes num : 11, elapse : 0.3568880558013916
[2022/02/22 17:48:27] root DEBUG: rec_res num : 38, elapse : 0.592170000076294 [2022/10/10 12:06:31] ppocr DEBUG: rec_res num : 11, elapse : 2.6445000171661377
[2022/02/22 17:48:27] root DEBUG: 0 Predict time of ./deploy/lite/imgs/lite_demo.png: 0.642s [2022/10/10 12:06:31] ppocr DEBUG: 0 Predict time of doc/imgs_en/img_12.jpg: 3.021s
[2022/02/22 17:48:27] root DEBUG: The, 0.984 [2022/10/10 12:06:31] ppocr DEBUG: ACKNOWLEDGEMENTS, 0.997
[2022/02/22 17:48:27] root DEBUG: visualized, 0.882 [2022/10/10 12:06:31] ppocr DEBUG: We would like to thank all the designers and, 0.976
[2022/02/22 17:48:27] root DEBUG: etect18片, 0.720 [2022/10/10 12:06:31] ppocr DEBUG: contributors who have been involved in the, 0.979
[2022/02/22 17:48:27] root DEBUG: image saved in./vis.jpg, 0.947 [2022/10/10 12:06:31] ppocr DEBUG: production of this book; their contributions, 0.989
[2022/02/22 17:48:27] root DEBUG: 纯臻营养护发素0.993604, 0.996 [2022/10/10 12:06:31] ppocr DEBUG: have been indispensable to its creation. We, 0.956
[2022/02/22 17:48:27] root DEBUG: 产品信息/参数, 0.922 [2022/10/10 12:06:31] ppocr DEBUG: would also like to express our gratitude to all, 0.991
[2022/02/22 17:48:27] root DEBUG: 0.992728, 0.914 [2022/10/10 12:06:31] ppocr DEBUG: the producers for their invaluable opinions, 0.978
[2022/02/22 17:48:27] root DEBUG: (45元/每公斤,100公斤起订), 0.926 [2022/10/10 12:06:31] ppocr DEBUG: and assistance throughout this project. And to, 0.988
[2022/02/22 17:48:27] root DEBUG: 0.97417, 0.977 [2022/10/10 12:06:31] ppocr DEBUG: the many others whose names are not credited, 0.958
[2022/02/22 17:48:27] root DEBUG: 每瓶22元,1000瓶起订)0.993976, 0.962 [2022/10/10 12:06:31] ppocr DEBUG: but have made specific input in this book, we, 0.970
[2022/02/22 17:48:27] root DEBUG: 【品牌】:代加工方式/0EMODM, 0.945 [2022/10/10 12:06:31] ppocr DEBUG: thank you for your continuous support., 0.998
[2022/02/22 17:48:27] root DEBUG: 0.985133, 0.980 [2022/10/10 12:06:31] ppocr DEBUG: The visualized image saved in ./inference_results/img_12.jpg
[2022/02/22 17:48:27] root DEBUG: 【品名】:纯臻营养护发素, 0.921 [2022/10/10 12:06:31] ppocr INFO: The predict total time is 3.2482550144195557
[2022/02/22 17:48:27] root DEBUG: 0.995007, 0.883 ```
[2022/02/22 17:48:27] root DEBUG: 【产品编号】:YM-X-30110.96899, 0.955
[2022/02/22 17:48:27] root DEBUG: 【净含量】:220ml, 0.943 Using Paddle Inference to predict, terminal output:
[2022/02/22 17:48:27] root DEBUG: Q.996577, 0.932
[2022/02/22 17:48:27] root DEBUG: 【适用人群】:适合所有肤质, 0.913 ```
[2022/02/22 17:48:27] root DEBUG: 0.995842, 0.969 [2022/10/10 12:06:28] ppocr DEBUG: dt_boxes num : 11, elapse : 0.3568880558013916
[2022/02/22 17:48:27] root DEBUG: 【主要成分】:鲸蜡硬脂醇、燕麦B-葡聚, 0.883 [2022/10/10 12:06:31] ppocr DEBUG: rec_res num : 11, elapse : 2.6445000171661377
[2022/02/22 17:48:27] root DEBUG: 0.961928, 0.964 [2022/10/10 12:06:31] ppocr DEBUG: 0 Predict time of doc/imgs_en/img_12.jpg: 3.021s
[2022/02/22 17:48:27] root DEBUG: 10, 0.812 [2022/10/10 12:06:31] ppocr DEBUG: ACKNOWLEDGEMENTS, 0.997
[2022/02/22 17:48:27] root DEBUG: 糖、椰油酰胺丙基甜菜碱、泛醒, 0.866 [2022/10/10 12:06:31] ppocr DEBUG: We would like to thank all the designers and, 0.976
[2022/02/22 17:48:27] root DEBUG: 0.925898, 0.943 [2022/10/10 12:06:31] ppocr DEBUG: contributors who have been involved in the, 0.979
[2022/02/22 17:48:27] root DEBUG: (成品包材), 0.974 [2022/10/10 12:06:31] ppocr DEBUG: production of this book; their contributions, 0.989
[2022/02/22 17:48:27] root DEBUG: 0.972573, 0.961 [2022/10/10 12:06:31] ppocr DEBUG: have been indispensable to its creation. We, 0.956
[2022/02/22 17:48:27] root DEBUG: 【主要功能】:可紧致头发磷层,从而达到, 0.936 [2022/10/10 12:06:31] ppocr DEBUG: would also like to express our gratitude to all, 0.991
[2022/02/22 17:48:27] root DEBUG: 0.994448, 0.952 [2022/10/10 12:06:31] ppocr DEBUG: the producers for their invaluable opinions, 0.978
[2022/02/22 17:48:27] root DEBUG: 13, 0.998 [2022/10/10 12:06:31] ppocr DEBUG: and assistance throughout this project. And to, 0.988
[2022/02/22 17:48:27] root DEBUG: 即时持久改善头发光泽的效果,给干燥的头, 0.994 [2022/10/10 12:06:31] ppocr DEBUG: the many others whose names are not credited, 0.958
[2022/02/22 17:48:27] root DEBUG: 0.990198, 0.975 [2022/10/10 12:06:31] ppocr DEBUG: but have made specific input in this book, we, 0.970
[2022/02/22 17:48:27] root DEBUG: 14, 0.977 [2022/10/10 12:06:31] ppocr DEBUG: thank you for your continuous support., 0.998
[2022/02/22 17:48:27] root DEBUG: 发足够的滋养, 0.991 [2022/10/10 12:06:31] ppocr DEBUG: The visualized image saved in ./inference_results/img_12.jpg
[2022/02/22 17:48:27] root DEBUG: 0.997668, 0.918 [2022/10/10 12:06:31] ppocr INFO: The predict total time is 3.2482550144195557
[2022/02/22 17:48:27] root DEBUG: 花费了0.457335秒, 0.901
[2022/02/22 17:48:27] root DEBUG: The visualized image saved in ./inference_results/lite_demo.png
[2022/02/22 17:48:27] root INFO: The predict total time is 0.7003889083862305
Terminal output when predicting with Paddle Inference:
[2022/02/22 17:47:25] root DEBUG: dt_boxes num : 38, elapse : 0.11791276931762695
[2022/02/22 17:47:27] root DEBUG: rec_res num : 38, elapse : 2.6206860542297363
[2022/02/22 17:47:27] root DEBUG: 0 Predict time of ./deploy/lite/imgs/lite_demo.png: 2.746s
[2022/02/22 17:47:27] root DEBUG: The, 0.984
[2022/02/22 17:47:27] root DEBUG: visualized, 0.882
[2022/02/22 17:47:27] root DEBUG: etect18片, 0.720
[2022/02/22 17:47:27] root DEBUG: image saved in./vis.jpg, 0.947
[2022/02/22 17:47:27] root DEBUG: 纯臻营养护发素0.993604, 0.996
[2022/02/22 17:47:27] root DEBUG: 产品信息/参数, 0.922
[2022/02/22 17:47:27] root DEBUG: 0.992728, 0.914
[2022/02/22 17:47:27] root DEBUG: (45元/每公斤,100公斤起订), 0.926
[2022/02/22 17:47:27] root DEBUG: 0.97417, 0.977
[2022/02/22 17:47:27] root DEBUG: 每瓶22元,1000瓶起订)0.993976, 0.962
[2022/02/22 17:47:27] root DEBUG: 【品牌】:代加工方式/0EMODM, 0.945
[2022/02/22 17:47:27] root DEBUG: 0.985133, 0.980
[2022/02/22 17:47:27] root DEBUG: 【品名】:纯臻营养护发素, 0.921
[2022/02/22 17:47:27] root DEBUG: 0.995007, 0.883
[2022/02/22 17:47:27] root DEBUG: 【产品编号】:YM-X-30110.96899, 0.955
[2022/02/22 17:47:27] root DEBUG: 【净含量】:220ml, 0.943
[2022/02/22 17:47:27] root DEBUG: Q.996577, 0.932
[2022/02/22 17:47:27] root DEBUG: 【适用人群】:适合所有肤质, 0.913
[2022/02/22 17:47:27] root DEBUG: 0.995842, 0.969
[2022/02/22 17:47:27] root DEBUG: 【主要成分】:鲸蜡硬脂醇、燕麦B-葡聚, 0.883
[2022/02/22 17:47:27] root DEBUG: 0.961928, 0.964
[2022/02/22 17:47:27] root DEBUG: 10, 0.812
[2022/02/22 17:47:27] root DEBUG: 糖、椰油酰胺丙基甜菜碱、泛醒, 0.866
[2022/02/22 17:47:27] root DEBUG: 0.925898, 0.943
[2022/02/22 17:47:27] root DEBUG: (成品包材), 0.974
[2022/02/22 17:47:27] root DEBUG: 0.972573, 0.961
[2022/02/22 17:47:27] root DEBUG: 【主要功能】:可紧致头发磷层,从而达到, 0.936
[2022/02/22 17:47:27] root DEBUG: 0.994448, 0.952
[2022/02/22 17:47:27] root DEBUG: 13, 0.998
[2022/02/22 17:47:27] root DEBUG: 即时持久改善头发光泽的效果,给干燥的头, 0.994
[2022/02/22 17:47:27] root DEBUG: 0.990198, 0.975
[2022/02/22 17:47:27] root DEBUG: 14, 0.977
[2022/02/22 17:47:27] root DEBUG: 发足够的滋养, 0.991
[2022/02/22 17:47:27] root DEBUG: 0.997668, 0.918
[2022/02/22 17:47:27] root DEBUG: 花费了0.457335秒, 0.901
[2022/02/22 17:47:27] root DEBUG: The visualized image saved in ./inference_results/lite_demo.png
[2022/02/22 17:47:27] root INFO: The predict total time is 2.8338775634765625
``` ```
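To check that the two backends agree, the recognized text and scores can be pulled out of both logs and compared. A minimal sketch, assuming the log line layout shown above (timestamp, logger name, `DEBUG:`, then `text, score`); the regular expression and the helper `parse_rec_results` are illustrative.

```python
import re

# Matches lines such as "[2022/10/10 12:06:31] ppocr DEBUG: ACKNOWLEDGEMENTS, 0.997"
REC_LINE = re.compile(r'^\[[^\]]+\] \w+ DEBUG: (?P<text>.*), (?P<score>\d\.\d+)$')


def parse_rec_results(log_text):
    """Return (text, score) pairs for the recognition lines in a log."""
    pairs = []
    for line in log_text.splitlines():
        match = REC_LINE.match(line.strip())
        if match:
            pairs.append((match.group('text'), float(match.group('score'))))
    return pairs


# Three lines copied from the ONNXRuntime output above
onnx_log = """\
[2022/10/10 12:06:28] ppocr DEBUG: dt_boxes num : 11, elapse : 0.3568880558013916
[2022/10/10 12:06:31] ppocr DEBUG: ACKNOWLEDGEMENTS, 0.997
[2022/10/10 12:06:31] ppocr DEBUG: have been indispensable to its creation. We, 0.956
"""
print(parse_rec_results(onnx_log))
```

Running the same parser over the Paddle Inference log and comparing the two lists shows whether the backends produce the same text and scores.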
...@@ -84,9 +84,9 @@ For English recognition model inference, you can execute the following commands, ...@@ -84,9 +84,9 @@ For English recognition model inference, you can execute the following commands,
``` ```
# download en model: # download en model:
wget https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_det_infer.tar wget https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_rec_infer.tar
tar xf en_PP-OCRv3_det_infer.tar tar xf en_PP-OCRv3_rec_infer.tar
python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words/en/word_1.png" --rec_model_dir="./en_PP-OCRv3_det_infer/" --rec_char_dict_path="ppocr/utils/en_dict.txt" python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words/en/word_1.png" --rec_model_dir="./en_PP-OCRv3_rec_infer/" --rec_char_dict_path="ppocr/utils/en_dict.txt"
``` ```
![](../imgs_words/en/word_1.png) ![](../imgs_words/en/word_1.png)
...@@ -52,6 +52,7 @@ def split_regions(axis): ...@@ -52,6 +52,7 @@ def split_regions(axis):
region = axis[min_axis:i] region = axis[min_axis:i]
min_axis = i min_axis = i
regions.append(region) regions.append(region)
regions.append(axis[min_axis:]) # added line
return regions return regions
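The added line matters because the loop only appends a region when it meets a gap, so the final run would otherwise be dropped. A minimal sketch of that behavior: the loop condition is an assumption (the diff hides it), and a plain list stands in for whatever array type the real function receives.

```python
def split_regions(axis):
    """Split a sorted list of indices into runs of consecutive values."""
    regions = []
    min_axis = 0
    for i in range(1, len(axis)):
        if axis[i] != axis[i - 1] + 1:   # a gap ends the current run
            regions.append(axis[min_axis:i])
            min_axis = i
    regions.append(axis[min_axis:])      # the added line: keep the last run
    return regions


print(split_regions([1, 2, 3, 7, 8]))
```

Without the final `append`, the call above would return only the first run and silently lose `[7, 8]`.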
...@@ -114,7 +114,7 @@ python3 table/eval_table.py \ ...@@ -114,7 +114,7 @@ python3 table/eval_table.py \
--det_model_dir=path/to/det_model_dir \ --det_model_dir=path/to/det_model_dir \
--rec_model_dir=path/to/rec_model_dir \ --rec_model_dir=path/to/rec_model_dir \
--table_model_dir=path/to/table_model_dir \ --table_model_dir=path/to/table_model_dir \
--image_dir=../doc/table/1.png \ --image_dir=docs/table/table.jpg \
--rec_char_dict_path=../ppocr/utils/dict/table_dict.txt \ --rec_char_dict_path=../ppocr/utils/dict/table_dict.txt \
--table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt \ --table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt \
--det_limit_side_len=736 \ --det_limit_side_len=736 \
...@@ -145,6 +145,7 @@ python3 table/eval_table.py \ ...@@ -145,6 +145,7 @@ python3 table/eval_table.py \
--table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt \ --table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt \
--det_limit_side_len=736 \ --det_limit_side_len=736 \
--det_limit_type=min \ --det_limit_type=min \
--rec_image_shape=3,32,320 \
--gt_path=path/to/gt.txt --gt_path=path/to/gt.txt
``` ```
...@@ -118,7 +118,7 @@ python3 table/eval_table.py \ ...@@ -118,7 +118,7 @@ python3 table/eval_table.py \
--det_model_dir=path/to/det_model_dir \ --det_model_dir=path/to/det_model_dir \
--rec_model_dir=path/to/rec_model_dir \ --rec_model_dir=path/to/rec_model_dir \
--table_model_dir=path/to/table_model_dir \ --table_model_dir=path/to/table_model_dir \
--image_dir=../doc/table/1.png \ --image_dir=docs/table/table.jpg \
--rec_char_dict_path=../ppocr/utils/dict/table_dict.txt \ --rec_char_dict_path=../ppocr/utils/dict/table_dict.txt \
--table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt \ --table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt \
--det_limit_side_len=736 \ --det_limit_side_len=736 \
...@@ -149,6 +149,7 @@ python3 table/eval_table.py \ ...@@ -149,6 +149,7 @@ python3 table/eval_table.py \
--table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt \ --table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt \
--det_limit_side_len=736 \ --det_limit_side_len=736 \
--det_limit_type=min \ --det_limit_type=min \
--rec_image_shape=3,32,320 \
--gt_path=path/to/gt.txt --gt_path=path/to/gt.txt
``` ```
...@@ -116,7 +116,10 @@ def sorted_boxes(dt_boxes): ...@@ -116,7 +116,10 @@ def sorted_boxes(dt_boxes):
sorted boxes(array) with shape [4, 2] sorted boxes(array) with shape [4, 2]
""" """
num_boxes = dt_boxes.shape[0] num_boxes = dt_boxes.shape[0]
sorted_boxes = sorted(dt_boxes, key=lambda x: (x[0][1], x[0][0])) if abs(num_boxes - 2) < 1e-4:
sorted_boxes = sorted(dt_boxes, key=lambda x: (x[1], x[0]))
else:
sorted_boxes = sorted(dt_boxes, key=lambda x: (x[0][1], x[0][0]))
_boxes = list(sorted_boxes) _boxes = list(sorted_boxes)
for i in range(num_boxes - 1): for i in range(num_boxes - 1):
...@@ -225,7 +225,8 @@ def create_predictor(args, mode, logger): ...@@ -225,7 +225,8 @@ def create_predictor(args, mode, logger):
use_calib_mode=False) use_calib_mode=False)
# collect shape # collect shape
trt_shape_f = os.path.join(model_dir, f"{mode}_trt_dynamic_shape.txt") trt_shape_f = os.path.join(model_dir,
if not os.path.exists(trt_shape_f): if not os.path.exists(trt_shape_f):
config.collect_shape_range_info(trt_shape_f) config.collect_shape_range_info(trt_shape_f)