Commit 3bf40c76, authored by 文幕地方

rm rec_char_type

Parent 0527b02a
......@@ -29,7 +29,6 @@ def read_params():
cfg.rec_model_dir = "./inference/ch_PP-OCRv2_rec_infer/"
cfg.rec_image_shape = "3, 32, 320"
cfg.rec_char_type = 'ch'
cfg.rec_batch_num = 30
cfg.max_text_length = 25
......
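Callers that still carry the removed fields in an older config may want to drop them before passing the config on. A minimal sketch, assuming a plain attribute-style config object; the `strip_removed_params` helper and the use of `SimpleNamespace` are illustrative assumptions, not part of PaddleOCR:

```python
from types import SimpleNamespace

# Parameters that this commit removes from the configs (taken from the diff).
REMOVED_PARAMS = ("rec_char_type", "table_char_type")


def strip_removed_params(cfg):
    """Return a copy of cfg without the removed parameters (hypothetical helper)."""
    kept = {k: v for k, v in vars(cfg).items() if k not in REMOVED_PARAMS}
    return SimpleNamespace(**kept)


# An old-style config mirroring the values shown in the hunk above.
old_cfg = SimpleNamespace(
    rec_model_dir="./inference/ch_PP-OCRv2_rec_infer/",
    rec_image_shape="3, 32, 320",
    rec_char_type="ch",
    rec_batch_num=30,
    max_text_length=25,
)

new_cfg = strip_removed_params(old_cfg)
print(sorted(vars(new_cfg)))
```

The original object is left untouched, so the old and new configs can be compared side by side during a migration.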
......@@ -47,7 +47,6 @@ def read_params():
cfg.rec_model_dir = "./inference/ch_PP-OCRv2_rec_infer/"
cfg.rec_image_shape = "3, 32, 320"
cfg.rec_char_type = 'ch'
cfg.rec_batch_num = 30
cfg.max_text_length = 25
......
......@@ -188,7 +188,7 @@ hub serving start -c deploy/hubserving/ocr_system/config.json
- **output**: path where visualization results are saved; defaults to `./hubserving_result`
Example request:
```python tools/test_hubserving.py --server_url=http://127.0.0.1:8868/predict/ocr_system --image_dir./doc/imgs/ --visualize=false```
```python tools/test_hubserving.py --server_url=http://127.0.0.1:8868/predict/ocr_system --image_dir=./doc/imgs/ --visualize=false```
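The missing `=` matters because a long option and its value must be separated by `=` or whitespace. A minimal sketch with a stand-in `argparse` parser (hypothetical; only the flag name is taken from the command above, and the real script's parser may differ) shows how the two forms are treated:

```python
import argparse

# Hypothetical stand-in parser; only the flag name comes from the command above.
parser = argparse.ArgumentParser()
parser.add_argument("--image_dir", type=str, default=None)

# Old form: without '=', the token is not recognised as --image_dir
# and ends up in the list of unknown arguments.
broken, broken_extra = parser.parse_known_args(["--image_dir./doc/imgs/"])

# Fixed form: '=' separates the flag from its value.
fixed, fixed_extra = parser.parse_known_args(["--image_dir=./doc/imgs/"])

print(broken.image_dir, broken_extra)
print(fixed.image_dir, fixed_extra)
```

`parse_known_args` is used here only so that both cases can be inspected; a parser that calls `parse_args` would instead exit with an "unrecognized arguments" error on the old form.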
## 4. Returned result format
The returned result is a list. Each item in the list is a dict that may contain up to three kinds of fields, described below:
......
......@@ -196,7 +196,7 @@ For example, if using the configuration file to start the text angle classificat
**Eg.**
```shell
python tools/test_hubserving.py --server_url=http://127.0.0.1:8868/predict/ocr_system --image_dir./doc/imgs/ --visualize=false`
python tools/test_hubserving.py --server_url=http://127.0.0.1:8868/predict/ocr_system --image_dir=./doc/imgs/ --visualize=false`
```
## 4. Returned result format
......
......@@ -25,7 +25,6 @@ def read_params():
# params for table structure model
cfg.table_max_len = 488
cfg.table_model_dir = './inference/en_ppocr_mobile_v2.0_table_structure_infer/'
cfg.table_char_type = 'en'
cfg.table_char_dict_path = './ppocr/utils/dict/table_structure_dict.txt'
cfg.show_log = False
return cfg
......@@ -3,12 +3,13 @@
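This document describes how to use the Python inference engine with the PP-OCR model library. It covers text detection, text recognition, the direction classifier, and running the three in series, with prediction on both CPU and GPU.
- [1. Text detection model inference](#文本检测模型推理)
- [2. Text recognition model inference](#文本识别模型推理)
- [2.1 Ultra-lightweight Chinese recognition model inference](#超轻量中文识别模型推理)
- [2.2 Multilingual model inference](#多语言模型的推理)
- [3. Direction classification model inference](#方向分类模型推理)
- [4. Text detection, direction classification, and text recognition in series](#文本检测、方向分类和文字识别串联推理)
- [Python inference for the PP-OCR model library](#基于python引擎的pp-ocr模型库推理)
- [1. Text detection model inference](#1-文本检测模型推理)
- [2. Text recognition model inference](#2-文本识别模型推理)
- [2.1 Ultra-lightweight Chinese recognition model inference](#21-超轻量中文识别模型推理)
- [2.2 Multilingual model inference](#22-多语言模型的推理)
- [3. Direction classification model inference](#3-方向分类模型推理)
- [4. Text detection, direction classification, and text recognition in series](#4-文本检测方向分类和文字识别串联推理)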
<a name="文本检测模型推理"></a>
......@@ -82,7 +83,7 @@ Predicts of ./doc/imgs_words/ch/word_4.jpg:('实力活力', 0.98458153)
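If you need to predict with a model for another language, you can find the inference model for that language at [this link](./models_list.md#%E5%A4%9A%E8%AF%AD%E8%A8%80%E8%AF%86%E5%88%AB%E6%A8%A1%E5%9E%8B). When predicting with an inference model, specify the dictionary path with `--rec_char_dict_path`. To get correct visualization results, also specify the font path with `--vis_font_path`; fonts for less common languages are provided by default under `doc/fonts/`. For example, Korean recognition: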
```
wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/korean_mobile_v2.0_rec_infer.tar
python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words/korean/1.jpg" --rec_model_dir="./your inference model" --rec_char_type="korean" --rec_char_dict_path="ppocr/utils/dict/korean_dict.txt" --vis_font_path="doc/fonts/korean.ttf"
python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words/korean/1.jpg" --rec_model_dir="./your inference model" --rec_char_dict_path="ppocr/utils/dict/korean_dict.txt" --vis_font_path="doc/fonts/korean.ttf"
```
![](../imgs_words/korean/1.jpg)
......
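# PaddleOCR Quick Start
- [1. Install the PaddleOCR whl package](#1)
- [2. Easy-to-use](#2)
- [2.1 Command-line usage](#21)
- [2.1.1 Chinese and English models](#211)
- [2.1.2 Multilingual models](#212)
- [2.1.3 Layout analysis](#213)
- [2.2 Python script usage](#22)
- [2.2.1 Chinese and English and multilingual usage](#221)
- [2.2.2 Layout analysis](#222)
- [PaddleOCR Quick Start](#paddleocr快速开始)
- [1. Install the PaddleOCR whl package](#1-安装paddleocr-whl包)
- [2. Easy-to-use](#2-便捷使用)
- [2.1 Command-line usage](#21-命令行使用)
- [2.1.1 Chinese and English models](#211-中英文模型)
- [2.1.2 Multilingual models](#212-多语言模型)
- [2.1.3 Layout analysis](#213-版面分析)
- [2.2 Python script usage](#22-python脚本使用)
- [2.2.1 Chinese and English and multilingual usage](#221-中英文与多语言使用)
- [2.2.2 Layout analysis](#222-版面分析)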
<a name="1"></a>
......@@ -165,8 +166,8 @@ paddleocr --image_dir=./table/1.png --type=structure
/output/table/1/
└─ res.txt
└─ [454, 360, 824, 658].xlsx table recognition result
└─ [16, 2, 828, 305].jpg cropped image region
└─ [17, 361, 404, 711].xlsx table recognition result
└─ [16, 2, 828, 305].jpg cropped image region
└─ [17, 361, 404, 711].xlsx table recognition result
```
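Each output file in the listing above is named with a bracketed list of four integers. A minimal sketch for recovering those numbers from a file name follows; treating them as the coordinates of the detected region is an assumption, and `parse_region_filename` is a hypothetical helper, not part of PaddleOCR:

```python
import ast
from pathlib import PurePosixPath


def parse_region_filename(name):
    """Split a name like '[16, 2, 828, 305].jpg' into (values, extension)."""
    path = PurePosixPath(name)
    # The stem is the bracketed list, e.g. '[16, 2, 828, 305]'.
    values = ast.literal_eval(path.stem)
    if not (isinstance(values, list) and len(values) == 4):
        raise ValueError(f"unexpected file name: {name!r}")
    return [int(v) for v in values], path.suffix


# File names copied from the listing above.
for name in ("[454, 360, 824, 658].xlsx", "[16, 2, 828, 305].jpg"):
    print(parse_region_filename(name))
```

`ast.literal_eval` is used instead of `eval` so that only literal lists are accepted from the file name.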
- **Parameter description**
......@@ -176,7 +177,7 @@ paddleocr --image_dir=./table/1.png --type=structure
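| output | Path where the Excel file and recognition results are saved | ./output/table |
| table_max_len | Size to which the long side of the image is resized for table structure model prediction | 488 |
| table_model_dir | Path to the table structure inference model | None |
| table_char_type | Path to the dictionary used by the table structure model | ../ppocr/utils/dict/table_structure_dict.txt |
| table_char_dict_path | Path to the dictionary used by the table structure model | ../ppocr/utils/dict/table_structure_dict.txt |
Most parameters are the same as in the paddleocr whl package; see the [whl package documentation](./whl.md)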
......
......@@ -2,19 +2,20 @@
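This document is an end-to-end guide to the PaddleOCR text recognition task, covering data preparation, model training, tuning, evaluation, and prediction, with detailed notes for each stage:
- [1 Data preparation](#数据准备)
- [1.1 Custom datasets](#自定义数据集)
- [1.2 Data download](#数据下载)
- [1.3 Dictionary](#字典)
- [1.4 Space support](#支持空格)
- [2 Start training](#启动训练)
- [2.1 Data augmentation](#数据增强)
- [2.2 General model training](#通用模型训练)
- [2.3 Multilingual model training](#多语言模型训练)
- [2.4 Knowledge distillation training](#知识蒸馏训练)
- [3 Evaluation](#评估)
- [4 Prediction](#预测)
- [5 Converting to an inference model for testing](#Inference)
- [Text Recognition](#文字识别)
- [1. Data preparation](#1-数据准备)
- [1.1 Custom datasets](#11-自定义数据集)
- [1.2 Data download](#12-数据下载)
- [1.3 Dictionary](#13-字典)
- [1.4 Adding the space class](#14-添加空格类别)
- [2. Start training](#2-启动训练)
- [2.1 Data augmentation](#21-数据增强)
- [2.2 General model training](#22-通用模型训练)
- [2.3 Multilingual model training](#23-多语言模型训练)
- [2.4 Knowledge distillation training](#24-知识蒸馏训练)
- [3 Evaluation](#3-评估)
- [4 Prediction](#4-预测)
- [5. Converting to an inference model for testing](#5-转inference模型测试)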
<a name="数据准备"></a>
......@@ -474,8 +475,8 @@ python3 tools/export_model.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_lite_trai
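- Custom model inference
If the text dictionary was modified during training, specify the dictionary path with `--rec_char_dict_path` when predicting with the inference model, and set `rec_char_type=ch`
If the text dictionary was modified during training, specify the dictionary path with `--rec_char_dict_path` when predicting with the inference model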
```
python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words_en/word_336.png" --rec_model_dir="./your inference model" --rec_image_shape="3, 32, 100" --rec_char_type="ch" --rec_char_dict_path="your text dict path"
python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words_en/word_336.png" --rec_model_dir="./your inference model" --rec_image_shape="3, 32, 100" --rec_char_dict_path="your text dict path"
```
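After this change the documented command passes only `--rec_char_dict_path`, so the dictionary file alone describes the character set. A minimal sketch of that idea follows, assuming one character per line and a CTC-style blank reserved at index 0; `load_char_dict` and `decode_indices` are hypothetical helpers for illustration, not PaddleOCR's API:

```python
import os
import tempfile


def load_char_dict(path):
    """Read one character per line; index 0 is reserved for a blank token (assumption)."""
    with open(path, encoding="utf-8") as f:
        chars = [line.rstrip("\r\n") for line in f if line.rstrip("\r\n")]
    return ["<blank>"] + chars


def decode_indices(indices, charset):
    """Greedy CTC-style decode: collapse repeated indices, then drop blanks."""
    out, prev = [], None
    for idx in indices:
        if idx != prev and idx != 0:
            out.append(charset[idx])
        prev = idx
    return "".join(out)


# Demo with a throwaway three-character dictionary.
with tempfile.TemporaryDirectory() as tmp:
    dict_path = os.path.join(tmp, "demo_dict.txt")
    with open(dict_path, "w", encoding="utf-8") as f:
        f.write("a\nb\nc\n")
    charset = load_char_dict(dict_path)

print(decode_indices([1, 1, 0, 2, 3, 3], charset))
```

Because the index-to-character mapping comes entirely from the file, swapping in a different dictionary changes the recognisable character set without any separate language flag.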
......@@ -98,7 +98,6 @@ def read_params():
cfg.rec_model_dir = "./ocr_rec_server/" # path to the recognition model
cfg.rec_image_shape = "3, 32, 320"
cfg.rec_char_type = 'ch'
cfg.rec_batch_num = 30
cfg.max_text_length = 25
......
......@@ -401,7 +401,6 @@ im_show.save('result.jpg')
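| rec_algorithm | Type of recognition algorithm used | CRNN |
| rec_model_dir | Folder containing the recognition model. There are two ways to pass it: 1. None: the built-in model is downloaded automatically to `~/.paddleocr/rec`; 2. the path to an inference model you converted yourself, which must contain the model and params files | None |
| rec_image_shape | Input image size for the recognition algorithm | "3,32,320" |
| rec_char_type | Character type for the recognition algorithm: Chinese and English (ch), English (en), French (french), German (german), Korean (korean), Japanese (japan) | ch |
| rec_batch_num | Number of images forwarded together during recognition | 30 |
| max_text_length | Maximum text length the recognition algorithm can recognize | 25 |
| rec_char_dict_path | Path to the recognition model dictionary; change it to your own dictionary path when rec_model_dir is passed using method 2 | ./ppocr/utils/ppocr_keys_v1.txt |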
......
......@@ -296,7 +296,7 @@ Predicts of ./doc/imgs_words_en/word_336.png:('super', 0.9999073)
- The image resolution used in training is different: the image resolution used in training the above model is [3,32,100], while during our Chinese model training, in order to ensure the recognition effect of long text, the image resolution used in training is [3, 32, 320]. The default shape parameter of the inference stage is the image resolution used in training phase, that is [3, 32, 320]. Therefore, when running inference of the above English model here, you need to set the shape of the recognition image through the parameter `rec_image_shape`.
- Character list: the experiment in the DTRB paper is only for 26 lowercase English characters and 10 numbers, a total of 36 characters. All upper and lower case characters are converted to lower case characters, and characters not in the above list are ignored and considered as spaces. Therefore, no characters dictionary file is used here, but a dictionary is generated by the below command. Therefore, the parameter `rec_char_type` needs to be set during inference, which is specified as "en" in English.
- Character list: the experiment in the DTRB paper is only for 26 lowercase English characters and 10 numbers, a total of 36 characters. All upper and lower case characters are converted to lower case characters, and characters not in the above list are ignored and considered as spaces. Therefore, no characters dictionary file is used here, but a dictionary is generated by the below command.
```
self.character_str = "0123456789abcdefghijklmnopqrstuvwxyz"
......@@ -320,7 +320,7 @@ python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words_en/word_336.png
<a name="USING_CUSTOM_CHARACTERS"></a>
### 3.4 Text Recognition Model Inference Using Custom Characters Dictionary
If the text dictionary is modified during training, when using the inference model to predict, you need to specify the dictionary path used by `--rec_char_dict_path`, and set `rec_char_type=ch`
If the text dictionary is modified during training, when using the inference model to predict, you need to specify the dictionary path used by `--rec_char_dict_path`
```
python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words_en/word_336.png" --rec_model_dir="./your inference model" --rec_image_shape="3, 32, 100" --rec_char_dict_path="your text dict path"
......
......@@ -4,12 +4,13 @@
This article introduces the use of the Python inference engine for the PP-OCR model library. The content is in order of text detection, text recognition, direction classifier and the prediction method of the three in series on the CPU and GPU.
- [Text Detection Model Inference](#DETECTION_MODEL_INFERENCE)
- [Text Recognition Model Inference](#RECOGNITION_MODEL_INFERENCE)
- [1. Lightweight Chinese Recognition Model Inference](#LIGHTWEIGHT_RECOGNITION)
- [2. Multilingual Model Inference](#MULTILINGUAL_MODEL_INFERENCE)
- [Angle Classification Model Inference](#ANGLE_CLASS_MODEL_INFERENCE)
- [Text Detection Angle Classification and Recognition Inference Concatenation](#CONCATENATION)
- [Python Inference for PP-OCR Model Zoo](#python-inference-for-pp-ocr-model-zoo)
- [Text Detection Model Inference](#text-detection-model-inference)
- [Text Recognition Model Inference](#text-recognition-model-inference)
- [1. Lightweight Chinese Recognition Model Inference](#1-lightweight-chinese-recognition-model-inference)
- [2. Multilingual Model Inference](#2-multilingual-model-inference)
- [Angle Classification Model Inference](#angle-classification-model-inference)
- [Text Detection Angle Classification and Recognition Inference Concatenation](#text-detection-angle-classification-and-recognition-inference-concatenation)
<a name="DETECTION_MODEL_INFERENCE"></a>
......@@ -82,7 +83,7 @@ You need to specify the visual font path through `--vis_font_path`. There are sm
```
wget wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/korean_mobile_v2.0_rec_infer.tar
python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words/korean/1.jpg" --rec_model_dir="./your inference model" --rec_char_type="korean" --rec_char_dict_path="ppocr/utils/dict/korean_dict.txt" --vis_font_path="doc/fonts/korean.ttf"
python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words/korean/1.jpg" --rec_model_dir="./your inference model" --rec_char_dict_path="ppocr/utils/dict/korean_dict.txt" --vis_font_path="doc/fonts/korean.ttf"
```
![](../imgs_words/korean/1.jpg)
......
# PaddleOCR Quick Start
+ [1. Install PaddleOCR Whl Package](#1-install-paddleocr-whl-package)
* [2. Easy-to-Use](#2-easy-to-use)
+ [2.1 Use by Command Line](#21-use-by-command-line)
- [2.1.1 English and Chinese Model](#211-english-and-chinese-model)
- [2.1.2 Multi-language Model](#212-multi-language-model)
- [2.1.3 Layout Analysis](#213-layoutAnalysis)
+ [2.2 Use by Code](#22-use-by-code)
- [2.2.1 Chinese & English Model and Multilingual Model](#221-chinese---english-model-and-multilingual-model)
- [2.2.2 Layout Analysis](#222-layoutAnalysis)
- [PaddleOCR Quick Start](#paddleocr-quick-start)
- [1. Install PaddleOCR Whl Package](#1-install-paddleocr-whl-package)
- [2. Easy-to-Use](#2-easy-to-use)
- [2.1 Use by Command Line](#21-use-by-command-line)
- [2.1.1 Chinese and English Model](#211-chinese-and-english-model)
- [2.1.2 Multi-language Model](#212-multi-language-model)
- [2.1.3 Layout Analysis](#213-layout-analysis)
- [2.2 Use by Code](#22-use-by-code)
- [2.2.1 Chinese & English Model and Multilingual Model](#221-chinese--english-model-and-multilingual-model)
- [2.2.2 Layout Analysis](#222-layout-analysis)
......@@ -171,7 +172,7 @@ paddleocr --image_dir=../doc/table/1.png --type=structure
| output | The path where excel and recognition results are saved | ./output/table |
| table_max_len | The long side of the image is resized in table structure model | 488 |
| table_model_dir | inference model path of table structure model | None |
| table_char_type | dict path of table structure model | ../ppocr/utils/dict/table_structure_dict.txt |
| table_char_dict_path | dict path of table structure model | ../ppocr/utils/dict/table_structure_dict.txt |
<a name="22-use-by-code"></a>
......
......@@ -470,8 +470,8 @@ inference/det_db/
- Text recognition model Inference using custom characters dictionary
If the text dictionary is modified during training, when using the inference model to predict, you need to specify the dictionary path used by `--rec_char_dict_path`, and set `rec_char_type=ch`
If the text dictionary is modified during training, when using the inference model to predict, you need to specify the dictionary path used by `--rec_char_dict_path`
```
python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words_en/word_336.png" --rec_model_dir="./your inference model" --rec_image_shape="3, 32, 100" --rec_char_type="ch" --rec_char_dict_path="your text dict path"
python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words_en/word_336.png" --rec_model_dir="./your inference model" --rec_image_shape="3, 32, 100" --rec_char_dict_path="your text dict path"
```
......@@ -348,7 +348,6 @@ im_show.save('result.jpg')
| rec_algorithm | Type of recognition algorithm selected | CRNN |
| rec_model_dir | the text recognition inference model folder. There are two ways to transfer parameters, 1. None: Automatically download the built-in model to `~/.paddleocr/rec`; 2. The path of the inference model converted by yourself, the model and params files must be included in the model path | None |
| rec_image_shape | image shape of recognition algorithm | "3,32,320" |
| rec_char_type | Character type of recognition algorithm, Chinese (ch) or English (en) | ch |
| rec_batch_num | When performing recognition, the batchsize of forward images | 30 |
| max_text_length | The maximum text length that the recognition algorithm can recognize | 25 |
| rec_char_dict_path | the alphabet path which needs to be modified to your own path when `rec_model_Name` use mode 2 | ./ppocr/utils/ppocr_keys_v1.txt |
......
......@@ -117,7 +117,7 @@ teds: 93.32
```python
cd PaddleOCR/ppstructure
python3 table/predict_table.py --det_model_dir=path/to/det_model_dir --rec_model_dir=path/to/rec_model_dir --table_model_dir=path/to/table_model_dir --image_dir=../doc/table/1.png --rec_char_dict_path=../ppocr/utils/dict/table_dict.txt --table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt --rec_char_type=EN --det_limit_side_len=736 --det_limit_type=min --output ../output/table
python3 table/predict_table.py --det_model_dir=path/to/det_model_dir --rec_model_dir=path/to/rec_model_dir --table_model_dir=path/to/table_model_dir --image_dir=../doc/table/1.png --rec_char_dict_path=../ppocr/utils/dict/table_dict.txt --table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt --det_limit_side_len=736 --det_limit_type=min --output ../output/table
```
After running, the excel sheet of each picture will be saved in the directory specified by the output field
......
......@@ -117,7 +117,7 @@ teds: 93.32
```python
cd PaddleOCR/ppstructure
python3 table/predict_table.py --det_model_dir=path/to/det_model_dir --rec_model_dir=path/to/rec_model_dir --table_model_dir=path/to/table_model_dir --image_dir=../doc/table/1.png --rec_char_dict_path=../ppocr/utils/dict/table_dict.txt --table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt --rec_char_type=EN --det_limit_side_len=736 --det_limit_type=min --output ../output/table
python3 table/predict_table.py --det_model_dir=path/to/det_model_dir --rec_model_dir=path/to/rec_model_dir --table_model_dir=path/to/table_model_dir --image_dir=../doc/table/1.png --rec_char_dict_path=../ppocr/utils/dict/table_dict.txt --table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt --det_limit_side_len=736 --det_limit_type=min --output ../output/table
```
Reference
......
......@@ -58,7 +58,6 @@ class TableStructurer(object):
}]
postprocess_params = {
'name': 'TableLabelDecode',
"character_type": args.table_char_type,
"character_dict_path": args.table_char_dict_path,
}
......@@ -104,7 +103,9 @@ class TableStructurer(object):
res_loc_final.append([left, top, right, bottom])
structure_str_list = structure_str_list[0][:-1]
structure_str_list = ['<html>', '<body>', '<table>'] + structure_str_list + ['</table>', '</body>', '</html>']
structure_str_list = [
'<html>', '<body>', '<table>'
] + structure_str_list + ['</table>', '</body>', '</html>']
elapse = time.time() - starttime
return (structure_str_list, res_loc_final), elapse
......
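The reformatted expression in the hunk above wraps the predicted structure tokens in HTML document tags. A minimal standalone sketch of that wrapping step; the `wrap_table_tokens` helper and the sample tokens are hypothetical and used only for illustration:

```python
def wrap_table_tokens(tokens):
    """Surround table structure tokens with html/body/table tags."""
    return [
        "<html>", "<body>", "<table>"
    ] + list(tokens) + ["</table>", "</body>", "</html>"]


# Sample structure tokens for a one-cell table (illustrative values).
cells = ["<tr>", "<td>", "</td>", "</tr>"]
wrapped = wrap_table_tokens(cells)
html = "".join(wrapped)
print(html)
```

Keeping the result as a list of tokens, as the hunk does, lets later steps insert cell contents before the tokens are joined into a string.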
......@@ -26,7 +26,6 @@ def init_args():
# params for table structure
parser.add_argument("--table_max_len", type=int, default=488)
parser.add_argument("--table_model_dir", type=str)
parser.add_argument("--table_char_type", type=str, default='en')
parser.add_argument(
"--table_char_dict_path",
type=str,
......