Unverified commit 4559f16e, authored by MissPenguin, committed by GitHub

Merge pull request #741 from WenmuZhou/whl

Add classification model to the whl package
...@@ -12,11 +12,44 @@ pip install paddleocr ...@@ -12,11 +12,44 @@ pip install paddleocr
Build and install locally Build and install locally
```bash ```bash
python setup.py bdist_wheel python setup.py bdist_wheel
pip install dist/paddleocr-0.0.3-py3-none-any.whl pip install dist/paddleocr-x.x.x-py3-none-any.whl # x.x.x is the paddleocr version number
``` ```
### 1. Use by code ### 1. Use by code
* Full pipeline: detection + recognition * Full pipeline: detection + classification + recognition
```python
from paddleocr import PaddleOCR, draw_ocr
ocr = PaddleOCR(use_angle_cls=True) # need to run only once to download and load model into memory
img_path = 'PaddleOCR/doc/imgs/11.jpg'
result = ocr.ocr(img_path, cls=True)
for line in result:
print(line)
# draw the result
from PIL import Image
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='/path/to/PaddleOCR/doc/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
```
The result is a list; each item contains the text box, the text, and the recognition confidence
```bash
[[[24.0, 36.0], [304.0, 34.0], [304.0, 72.0], [24.0, 74.0]], ['纯臻营养护发素', 0.964739]]
[[[24.0, 80.0], [172.0, 80.0], [172.0, 104.0], [24.0, 104.0]], ['产品信息/参数', 0.98069626]]
[[[24.0, 109.0], [333.0, 109.0], [333.0, 136.0], [24.0, 136.0]], ['(45元/每公斤,100公斤起订)', 0.9676722]]
......
```
Visualization of results
<div align="center">
<img src="../imgs_results/whl/11_det_rec.jpg" width="800">
</div>
* Detection + recognition
```python ```python
from paddleocr import PaddleOCR, draw_ocr from paddleocr import PaddleOCR, draw_ocr
ocr = PaddleOCR() # need to run only once to download and load model into memory ocr = PaddleOCR() # need to run only once to download and load model into memory
...@@ -48,12 +81,27 @@ im_show.save('result.jpg') ...@@ -48,12 +81,27 @@ im_show.save('result.jpg')
<img src="../imgs_results/whl/11_det_rec.jpg" width="800"> <img src="../imgs_results/whl/11_det_rec.jpg" width="800">
</div> </div>
* Classification + recognition
```python
from paddleocr import PaddleOCR
ocr = PaddleOCR(use_angle_cls=True) # need to run only once to download and load model into memory
img_path = 'PaddleOCR/doc/imgs_words/ch/word_1.jpg'
result = ocr.ocr(img_path, det=False, cls=True)
for line in result:
print(line)
```
The result is a list; each item contains only the recognized text and the recognition confidence
```bash
['韩国小馆', 0.9907421]
```
* Detection only * Detection only
```python ```python
from paddleocr import PaddleOCR, draw_ocr from paddleocr import PaddleOCR, draw_ocr
ocr = PaddleOCR() # need to run only once to download and load model into memory ocr = PaddleOCR() # need to run only once to download and load model into memory
img_path = 'PaddleOCR/doc/imgs/11.jpg' img_path = 'PaddleOCR/doc/imgs/11.jpg'
result = ocr.ocr(img_path,rec=False) result = ocr.ocr(img_path, rec=False)
for line in result: for line in result:
print(line) print(line)
...@@ -84,7 +132,7 @@ im_show.save('result.jpg') ...@@ -84,7 +132,7 @@ im_show.save('result.jpg')
from paddleocr import PaddleOCR from paddleocr import PaddleOCR
ocr = PaddleOCR() # need to run only once to download and load model into memory ocr = PaddleOCR() # need to run only once to download and load model into memory
img_path = 'PaddleOCR/doc/imgs_words/ch/word_1.jpg' img_path = 'PaddleOCR/doc/imgs_words/ch/word_1.jpg'
result = ocr.ocr(img_path,det=False) result = ocr.ocr(img_path, det=False)
for line in result: for line in result:
print(line) print(line)
``` ```
...@@ -93,6 +141,20 @@ for line in result: ...@@ -93,6 +141,20 @@ for line in result:
['韩国小馆', 0.9907421] ['韩国小馆', 0.9907421]
``` ```
* Classification only
```python
from paddleocr import PaddleOCR
ocr = PaddleOCR(use_angle_cls=True) # need to run only once to download and load model into memory
img_path = 'PaddleOCR/doc/imgs_words/ch/word_1.jpg'
result = ocr.ocr(img_path, det=False, rec=False, cls=True)
for line in result:
print(line)
```
The result is a list; each item contains only the classification result and the classification confidence
```bash
['0', 0.9999924]
```
### Use by command line ### Use by command line
Show help information Show help information
...@@ -100,7 +162,19 @@ for line in result: ...@@ -100,7 +162,19 @@ for line in result:
paddleocr -h paddleocr -h
``` ```
* Full pipeline: detection + recognition * Full pipeline: detection + classification + recognition
```bash
paddleocr --image_dir PaddleOCR/doc/imgs/11.jpg --use_angle_cls true --cls true
```
The result is a list; each item contains the text box, the text, and the recognition confidence
```bash
[[[24.0, 36.0], [304.0, 34.0], [304.0, 72.0], [24.0, 74.0]], ['纯臻营养护发素', 0.964739]]
[[[24.0, 80.0], [172.0, 80.0], [172.0, 104.0], [24.0, 104.0]], ['产品信息/参数', 0.98069626]]
[[[24.0, 109.0], [333.0, 109.0], [333.0, 136.0], [24.0, 136.0]], ['(45元/每公斤,100公斤起订)', 0.9676722]]
......
```
* Detection + recognition
```bash ```bash
paddleocr --image_dir PaddleOCR/doc/imgs/11.jpg paddleocr --image_dir PaddleOCR/doc/imgs/11.jpg
``` ```
...@@ -112,6 +186,16 @@ paddleocr --image_dir PaddleOCR/doc/imgs/11.jpg ...@@ -112,6 +186,16 @@ paddleocr --image_dir PaddleOCR/doc/imgs/11.jpg
...... ......
``` ```
* Classification + recognition
```bash
paddleocr --image_dir PaddleOCR/doc/imgs_words/ch/word_1.jpg --use_angle_cls true --cls true --det false
```
The result is a list; each item contains only the recognized text and the recognition confidence
```bash
['韩国小馆', 0.9907421]
```
* Detection only * Detection only
```bash ```bash
paddleocr --image_dir PaddleOCR/doc/imgs/11.jpg --rec false paddleocr --image_dir PaddleOCR/doc/imgs/11.jpg --rec false
...@@ -134,17 +218,27 @@ paddleocr --image_dir PaddleOCR/doc/imgs_words/ch/word_1.jpg --det false ...@@ -134,17 +218,27 @@ paddleocr --image_dir PaddleOCR/doc/imgs_words/ch/word_1.jpg --det false
['韩国小馆', 0.9907421] ['韩国小馆', 0.9907421]
``` ```
* Classification only
```bash
paddleocr --image_dir PaddleOCR/doc/imgs_words/ch/word_1.jpg --use_angle_cls true --cls true --det false --rec false
```
The result is a list; each item contains only the classification result and the classification confidence
```bash
['0', 0.9999924]
```
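## Custom models ## Custom models
When the built-in models cannot meet your needs, use a model you trained yourself. When the built-in models cannot meet your needs, use a model you trained yourself.
First, follow the first section of [inference.md](./inference.md) to convert the detection and recognition models into inference models, then use them as follows First, follow the first section of [inference.md](./inference.md) to convert the detection, classification and recognition models into inference models, then use them as follows
### Use by code ### Use by code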
```python ```python
from paddleocr import PaddleOCR, draw_ocr from paddleocr import PaddleOCR, draw_ocr
# The detection and recognition model paths must contain the model and params files # The model paths must contain the model and params files
ocr = PaddleOCR(det_model_dir='{your_det_model_dir}',rec_model_dir='{your_rec_model_dir}') ocr = PaddleOCR(det_model_dir='{your_det_model_dir}', rec_model_dir='{your_rec_model_dir}', rec_char_dict_path='{your_rec_char_dict_path}', cls_model_dir='{your_cls_model_dir}', use_angle_cls=True)
img_path = 'PaddleOCR/doc/imgs/11.jpg' img_path = 'PaddleOCR/doc/imgs/11.jpg'
result = ocr.ocr(img_path) result = ocr.ocr(img_path, cls=True)
for line in result: for line in result:
print(line) print(line)
...@@ -162,7 +256,7 @@ im_show.save('result.jpg') ...@@ -162,7 +256,7 @@ im_show.save('result.jpg')
### Use by command line ### Use by command line
```bash ```bash
paddleocr --image_dir PaddleOCR/doc/imgs/11.jpg --det_model_dir {your_det_model_dir} --rec_model_dir {your_rec_model_dir} paddleocr --image_dir PaddleOCR/doc/imgs/11.jpg --det_model_dir {your_det_model_dir} --rec_model_dir {your_rec_model_dir} --rec_char_dict_path {your_rec_char_dict_path} --cls_model_dir {your_cls_model_dir} --use_angle_cls true --cls true
``` ```
## Parameter description ## Parameter description
...@@ -182,13 +276,21 @@ paddleocr --image_dir PaddleOCR/doc/imgs/11.jpg --det_model_dir {your_det_model_ ...@@ -182,13 +276,21 @@ paddleocr --image_dir PaddleOCR/doc/imgs/11.jpg --det_model_dir {your_det_model_
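| det_east_cover_thresh | Threshold for EAST output boxes; predicted boxes below this value are discarded | 0.1 | | det_east_cover_thresh | Threshold for EAST output boxes; predicted boxes below this value are discarded | 0.1 |
| det_east_nms_thresh | NMS threshold for EAST output boxes | 0.2 | | det_east_nms_thresh | NMS threshold for EAST output boxes | 0.2 |
| rec_algorithm | Recognition algorithm to use | CRNN | | rec_algorithm | Recognition algorithm to use | CRNN |
| rec_model_dir | Folder containing the recognition model. There are two ways to pass it: 1. None: the built-in model is downloaded automatically to `~/.paddleocr/rec`; 2. the path of an inference model you converted yourself, which must contain the model and params files | None | | rec_model_dir | Folder containing the recognition model. There are two ways to pass it: 1. None: the built-in model is downloaded automatically to `~/.paddleocr/rec`; 2. the path of an inference model you converted yourself, which must contain the model and params files | None |
| rec_image_shape | Input image shape for the recognition algorithm | "3,32,320" | | rec_image_shape | Input image shape for the recognition algorithm | "3,32,320" |
| rec_char_type | Character type for the recognition algorithm, Chinese (ch) or English (en) | ch | | rec_char_type | Character type for the recognition algorithm, Chinese (ch) or English (en) | ch |
| rec_batch_num | Number of images forwarded per batch during recognition | 30 | | rec_batch_num | Number of images forwarded per batch during recognition | 30 |
| max_text_length | Maximum text length the recognition algorithm can recognize | 25 | | max_text_length | Maximum text length the recognition algorithm can recognize | 25 |
| rec_char_dict_path | Path to the recognition dictionary; when rec_model_dir is passed the second way, change it to your own dictionary path | ./ppocr/utils/ppocr_keys_v1.txt | | rec_char_dict_path | Path to the recognition dictionary; when rec_model_dir is passed the second way, change it to your own dictionary path | ./ppocr/utils/ppocr_keys_v1.txt |
| use_space_char | Whether to recognize spaces | TRUE | | use_space_char | Whether to recognize spaces | TRUE |
| use_angle_cls | Whether to load the classification model | FALSE |
| cls_model_dir | Folder containing the classification model. There are two ways to pass it: 1. None: the built-in model is downloaded automatically to `~/.paddleocr/cls`; 2. the path of an inference model you converted yourself, which must contain the model and params files | None |
| cls_image_shape | Input image shape for the classification algorithm | "3, 48, 192" |
| label_list | Label list of the classification algorithm | ['0', '180'] |
| cls_batch_num | Number of images forwarded per batch during classification | 30 |
| enable_mkldnn | Whether to enable mkldnn | FALSE | | enable_mkldnn | Whether to enable mkldnn | FALSE |
| use_zero_copy_run | Whether to run the forward pass via zero_copy_run | FALSE |
| lang | Model language; Chinese (ch) and English (en) are currently supported | ch |
| det | Whether to run detection during the forward pass | TRUE | | det | Whether to run detection during the forward pass | TRUE |
| rec | Whether to run recognition during the forward pass | TRUE | | rec | Whether to run recognition during the forward pass | TRUE |
| cls | Whether to run classification during the forward pass | FALSE |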
...@@ -10,14 +10,48 @@ pip install paddleocr ...@@ -10,14 +10,48 @@ pip install paddleocr
build own whl package and install build own whl package and install
```bash ```bash
python setup.py bdist_wheel python setup.py bdist_wheel
pip install dist/paddleocr-0.0.3-py3-none-any.whl pip install dist/paddleocr-x.x.x-py3-none-any.whl # x.x.x is the version of paddleocr
``` ```
### 1. Use by code ### 1. Use by code
* detection, classification and recognition
```python
from paddleocr import PaddleOCR,draw_ocr
ocr = PaddleOCR(use_angle_cls=True, lang='en') # need to run only once to download and load model into memory
img_path = 'PaddleOCR/doc/imgs_en/img_12.jpg'
result = ocr.ocr(img_path, cls=True)
for line in result:
print(line)
# draw result
from PIL import Image
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='/path/to/PaddleOCR/doc/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
```
Output will be a list, each item contains bounding box, text and recognition confidence
```bash
[[[442.0, 173.0], [1169.0, 173.0], [1169.0, 225.0], [442.0, 225.0]], ['ACKNOWLEDGEMENTS', 0.99283075]]
[[[393.0, 340.0], [1207.0, 342.0], [1207.0, 389.0], [393.0, 387.0]], ['We would like to thank all the designers and', 0.9357758]]
[[[399.0, 398.0], [1204.0, 398.0], [1204.0, 433.0], [399.0, 433.0]], ['contributors whohave been involved in the', 0.9592447]]
......
```
Visualization of results
<div align="center">
<img src="../imgs_results/whl/12_det_rec.jpg" width="800">
</div>
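The output format described above (a list whose items hold a bounding box and a text/confidence pair) can be unpacked without running any model. The sketch below is only an illustration: the `result` literal is copied from the sample output above, and the helper name `split_result` is hypothetical, not part of paddleocr.

```python
def split_result(result):
    """Split full-pipeline OCR output into parallel lists of boxes, texts and scores."""
    boxes = [line[0] for line in result]       # four [x, y] corner points per item
    texts = [line[1][0] for line in result]    # recognized string
    scores = [line[1][1] for line in result]   # recognition confidence
    return boxes, texts, scores


# Two items copied from the sample output in this document.
result = [
    [[[442.0, 173.0], [1169.0, 173.0], [1169.0, 225.0], [442.0, 225.0]],
     ['ACKNOWLEDGEMENTS', 0.99283075]],
    [[[393.0, 340.0], [1207.0, 342.0], [1207.0, 389.0], [393.0, 387.0]],
     ['We would like to thank all the designers and', 0.9357758]],
]

boxes, texts, scores = split_result(result)
# Keep only the lines whose confidence falls below an arbitrary example cutoff.
low_confidence = [t for t, s in zip(texts, scores) if s < 0.95]
print(texts)
print(low_confidence)
```

The three lists have the same shape as the `boxes`, `txts` and `scores` arguments passed to `draw_ocr` in the example above.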
* detection and recognition * detection and recognition
```python ```python
from paddleocr import PaddleOCR,draw_ocr from paddleocr import PaddleOCR,draw_ocr
ocr = PaddleOCR() # need to run only once to download and load model into memory ocr = PaddleOCR(lang='en') # need to run only once to download and load model into memory
img_path = 'PaddleOCR/doc/imgs_en/img_12.jpg' img_path = 'PaddleOCR/doc/imgs_en/img_12.jpg'
result = ocr.ocr(img_path) result = ocr.ocr(img_path)
for line in result: for line in result:
...@@ -48,6 +82,21 @@ Visualization of results ...@@ -48,6 +82,21 @@ Visualization of results
<img src="../imgs_results/whl/12_det_rec.jpg" width="800"> <img src="../imgs_results/whl/12_det_rec.jpg" width="800">
</div> </div>
* classification and recognition
```python
from paddleocr import PaddleOCR
ocr = PaddleOCR(use_angle_cls=True, lang='en') # need to run only once to load model into memory
img_path = 'PaddleOCR/doc/imgs_words_en/word_10.png'
result = ocr.ocr(img_path, det=False, cls=True)
for line in result:
print(line)
```
Output will be a list, each item contains recognition text and confidence
```bash
['PAIN', 0.990372]
```
* only detection * only detection
```python ```python
from paddleocr import PaddleOCR,draw_ocr from paddleocr import PaddleOCR,draw_ocr
...@@ -83,18 +132,33 @@ Visualization of results ...@@ -83,18 +132,33 @@ Visualization of results
* only recognition * only recognition
```python ```python
from paddleocr import PaddleOCR from paddleocr import PaddleOCR
ocr = PaddleOCR() # need to run only once to load model into memory ocr = PaddleOCR(lang='en') # need to run only once to load model into memory
img_path = 'PaddleOCR/doc/imgs_words_en/word_10.png' img_path = 'PaddleOCR/doc/imgs_words_en/word_10.png'
result = ocr.ocr(img_path,det=False) result = ocr.ocr(img_path, det=False, cls=False)
for line in result: for line in result:
print(line) print(line)
``` ```
Output will be a list, each item contains text and recognition confidence Output will be a list, each item contains recognition text and confidence
```bash ```bash
['PAIN', 0.990372] ['PAIN', 0.990372]
``` ```
* only classification
```python
from paddleocr import PaddleOCR
ocr = PaddleOCR(use_angle_cls=True) # need to run only once to load model into memory
img_path = 'PaddleOCR/doc/imgs_words_en/word_10.png'
result = ocr.ocr(img_path, det=False, rec=False, cls=True)
for line in result:
print(line)
```
Output will be a list, each item contains classification result and confidence
```bash
['0', 0.99999964]
```
### Use by command line ### Use by command line
show help information show help information
...@@ -102,9 +166,22 @@ show help information ...@@ -102,9 +166,22 @@ show help information
paddleocr -h paddleocr -h
``` ```
* detection, classification and recognition
```bash
paddleocr --image_dir PaddleOCR/doc/imgs_en/img_12.jpg --use_angle_cls true --cls true --lang en
```
Output will be a list, each item contains bounding box, text and recognition confidence
```bash
[[[442.0, 173.0], [1169.0, 173.0], [1169.0, 225.0], [442.0, 225.0]], ['ACKNOWLEDGEMENTS', 0.99283075]]
[[[393.0, 340.0], [1207.0, 342.0], [1207.0, 389.0], [393.0, 387.0]], ['We would like to thank all the designers and', 0.9357758]]
[[[399.0, 398.0], [1204.0, 398.0], [1204.0, 433.0], [399.0, 433.0]], ['contributors whohave been involved in the', 0.9592447]]
......
```
* detection and recognition * detection and recognition
```bash ```bash
paddleocr --image_dir PaddleOCR/doc/imgs_en/img_12.jpg paddleocr --image_dir PaddleOCR/doc/imgs_en/img_12.jpg --lang en
``` ```
Output will be a list, each item contains bounding box, text and recognition confidence Output will be a list, each item contains bounding box, text and recognition confidence
...@@ -115,6 +192,16 @@ Output will be a list, each item contains bounding box, text and recognition con ...@@ -115,6 +192,16 @@ Output will be a list, each item contains bounding box, text and recognition con
...... ......
``` ```
* classification and recognition
```bash
paddleocr --image_dir PaddleOCR/doc/imgs_words_en/word_10.png --use_angle_cls true --cls true --det false --lang en
```
Output will be a list, each item contains text and recognition confidence
```bash
['PAIN', 0.990372]
```
* only detection * only detection
```bash ```bash
paddleocr --image_dir PaddleOCR/doc/imgs_en/img_12.jpg --rec false paddleocr --image_dir PaddleOCR/doc/imgs_en/img_12.jpg --rec false
...@@ -130,7 +217,7 @@ Output will be a list, each item only contains bounding box ...@@ -130,7 +217,7 @@ Output will be a list, each item only contains bounding box
* only recognition * only recognition
```bash ```bash
paddleocr --image_dir PaddleOCR/doc/imgs_words_en/word_10.png --det false paddleocr --image_dir PaddleOCR/doc/imgs_words_en/word_10.png --det false --cls false --lang en
``` ```
Output will be a list, each item contains text and recognition confidence Output will be a list, each item contains text and recognition confidence
...@@ -138,6 +225,16 @@ Output will be a list, each item contains text and recognition confidence ...@@ -138,6 +225,16 @@ Output will be a list, each item contains text and recognition confidence
['PAIN', 0.990372] ['PAIN', 0.990372]
``` ```
* only classification
```bash
paddleocr --image_dir PaddleOCR/doc/imgs_words_en/word_10.png --use_angle_cls true --cls true --det false --rec false
```
Output will be a list, each item contains classification result and confidence
```bash
['0', 0.99999964]
```
## Use custom model ## Use custom model
When the built-in model cannot meet the needs, you need to use your own trained model. When the built-in model cannot meet the needs, you need to use your own trained model.
First, refer to the first section of [inference_en.md](./inference_en.md) to convert your det and rec model to inference model, and then use it as follows First, refer to the first section of [inference_en.md](./inference_en.md) to convert your det and rec model to inference model, and then use it as follows
...@@ -147,9 +244,9 @@ First, refer to the first section of [inference_en.md](./inference_en.md) to con ...@@ -147,9 +244,9 @@ First, refer to the first section of [inference_en.md](./inference_en.md) to con
```python ```python
from paddleocr import PaddleOCR,draw_ocr from paddleocr import PaddleOCR,draw_ocr
# The path of detection and recognition model must contain model and params files # The path of detection and recognition model must contain model and params files
ocr = PaddleOCR(det_model_dir='{your_det_model_dir}',rec_model_dir='{your_rec_model_dir}å') ocr = PaddleOCR(det_model_dir='{your_det_model_dir}', rec_model_dir='{your_rec_model_dir}', rec_char_dict_path='{your_rec_char_dict_path}', cls_model_dir='{your_cls_model_dir}', use_angle_cls=True)
img_path = 'PaddleOCR/doc/imgs_en/img_12.jpg' img_path = 'PaddleOCR/doc/imgs_en/img_12.jpg'
result = ocr.ocr(img_path) result = ocr.ocr(img_path, cls=True)
for line in result: for line in result:
print(line) print(line)
...@@ -167,7 +264,7 @@ im_show.save('result.jpg') ...@@ -167,7 +264,7 @@ im_show.save('result.jpg')
### Use by command line ### Use by command line
```bash ```bash
paddleocr --image_dir PaddleOCR/doc/imgs/11.jpg --det_model_dir {your_det_model_dir} --rec_model_dir {your_rec_model_dir} paddleocr --image_dir PaddleOCR/doc/imgs/11.jpg --det_model_dir {your_det_model_dir} --rec_model_dir {your_rec_model_dir} --rec_char_dict_path {your_rec_char_dict_path} --cls_model_dir {your_cls_model_dir} --use_angle_cls true --cls true
``` ```
## Parameter Description ## Parameter Description
...@@ -194,6 +291,14 @@ paddleocr --image_dir PaddleOCR/doc/imgs/11.jpg --det_model_dir {your_det_model_ ...@@ -194,6 +291,14 @@ paddleocr --image_dir PaddleOCR/doc/imgs/11.jpg --det_model_dir {your_det_model_
| max_text_length | The maximum text length that the recognition algorithm can recognize | 25 | | max_text_length | The maximum text length that the recognition algorithm can recognize | 25 |
| rec_char_dict_path | the alphabet path which needs to be modified to your own path when `rec_model_dir` use mode 2 | ./ppocr/utils/ppocr_keys_v1.txt | | rec_char_dict_path | the alphabet path which needs to be modified to your own path when `rec_model_dir` use mode 2 | ./ppocr/utils/ppocr_keys_v1.txt |
| use_space_char | Whether to recognize spaces | TRUE | | use_space_char | Whether to recognize spaces | TRUE |
| use_angle_cls | Whether to load classification model | FALSE |
| cls_model_dir | the classification inference model folder. There are two ways to transfer parameters, 1. None: Automatically download the built-in model to `~/.paddleocr/cls`; 2. The path of the inference model converted by yourself, the model and params files must be included in the model path | None |
| cls_image_shape | image shape of classification algorithm | "3,48,192" |
| label_list | label list of classification algorithm | ['0','180'] |
| cls_batch_num | When performing classification, the batchsize of forward images | 30 |
| enable_mkldnn | Whether to enable mkldnn | FALSE | | enable_mkldnn | Whether to enable mkldnn | FALSE |
| use_zero_copy_run | Whether to forward by zero_copy_run | FALSE |
| lang | The supported language; currently only Chinese (ch) and English (en) are supported | ch |
| det | Enable detection when `ppocr.ocr` func exec | TRUE | | det | Enable detection when `ppocr.ocr` func exec | TRUE |
| rec | Enable detction when `ppocr.ocr` func exec | TRUE | | rec | Enable recognition when `ppocr.ocr` func exec | TRUE |
| cls | Enable classification when `ppocr.ocr` func exec | FALSE |
...@@ -33,10 +33,23 @@ from ppocr.utils.utility import check_and_read_gif, get_image_file_list ...@@ -33,10 +33,23 @@ from ppocr.utils.utility import check_and_read_gif, get_image_file_list
__all__ = ['PaddleOCR'] __all__ = ['PaddleOCR']
model_params = { model_urls = {
'det': 'https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db_infer.tar', 'det':
'rec': 'https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_infer.tar',
'https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_enhance_infer.tar', 'rec': {
'ch': {
'url':
'https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_infer.tar',
'dict_path': './ppocr/utils/ppocr_keys_v1.txt'
},
'en': {
'url':
'https://paddleocr.bj.bcebos.com/20-09-22/mobile/en/en_ppocr_mobile_v1.1_rec_infer.tar',
'dict_path': './ppocr/utils/ic15_dict.txt'
}
},
'cls':
'https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_infer.tar'
} }
SUPPORT_DET_MODEL = ['DB'] SUPPORT_DET_MODEL = ['DB']
...@@ -120,16 +133,24 @@ def parse_args(): ...@@ -120,16 +133,24 @@ def parse_args():
parser.add_argument("--rec_char_type", type=str, default='ch') parser.add_argument("--rec_char_type", type=str, default='ch')
parser.add_argument("--rec_batch_num", type=int, default=30) parser.add_argument("--rec_batch_num", type=int, default=30)
parser.add_argument("--max_text_length", type=int, default=25) parser.add_argument("--max_text_length", type=int, default=25)
parser.add_argument( parser.add_argument("--rec_char_dict_path", type=str, default=None)
"--rec_char_dict_path",
type=str,
default="./ppocr/utils/ppocr_keys_v1.txt")
parser.add_argument("--use_space_char", type=bool, default=True) parser.add_argument("--use_space_char", type=bool, default=True)
# params for text classifier
parser.add_argument("--use_angle_cls", type=str2bool, default=False)
parser.add_argument("--cls_model_dir", type=str, default=None)
parser.add_argument("--cls_image_shape", type=str, default="3, 48, 192")
parser.add_argument("--label_list", type=list, default=['0', '180'])
parser.add_argument("--cls_batch_num", type=int, default=30)
parser.add_argument("--cls_thresh", type=float, default=0.9)
parser.add_argument("--enable_mkldnn", type=bool, default=False) parser.add_argument("--enable_mkldnn", type=bool, default=False)
parser.add_argument("--use_zero_copy_run", type=bool, default=False)
parser.add_argument("--lang", type=str, default='ch')
parser.add_argument("--det", type=str2bool, default=True) parser.add_argument("--det", type=str2bool, default=True)
parser.add_argument("--rec", type=str2bool, default=True) parser.add_argument("--rec", type=str2bool, default=True)
parser.add_argument("--use_zero_copy_run", type=bool, default=False) parser.add_argument("--cls", type=str2bool, default=False)
return parser.parse_args() return parser.parse_args()
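The parser in this hunk mixes `type=bool` and `type=str2bool`. With plain argparse, `type=bool` calls `bool()` on the raw string, so any non-empty value, including `false`, parses as `True`. The sketch below illustrates the difference. The `str2bool` body is an assumed minimal implementation, since the real helper is defined outside the lines shown in this diff.

```python
import argparse


def str2bool(value):
    """Assumed minimal implementation: 'true', 't' or '1' (any case) mean True."""
    return value.lower() in ("true", "t", "1")


parser = argparse.ArgumentParser()
# Same declaration styles as in the hunk above.
parser.add_argument("--use_space_char", type=bool, default=True)
parser.add_argument("--cls", type=str2bool, default=False)

args = parser.parse_args(["--use_space_char", "false", "--cls", "false"])
print(args.use_space_char)  # bool("false") is truthy because the string is non-empty
print(args.cls)             # str2bool("false") is False
```

This is why flags such as `--det false` and `--cls true` behave as documented, while a flag declared with `type=bool` cannot be switched off from the command line by passing `false`.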
...@@ -142,16 +163,29 @@ class PaddleOCR(predict_system.TextSystem): ...@@ -142,16 +163,29 @@ class PaddleOCR(predict_system.TextSystem):
""" """
postprocess_params = parse_args() postprocess_params = parse_args()
postprocess_params.__dict__.update(**kwargs) postprocess_params.__dict__.update(**kwargs)
self.use_angle_cls = postprocess_params.use_angle_cls
lang = postprocess_params.lang
assert lang in model_urls['rec'], 'param lang must in {}'.format(
model_urls['rec'].keys())
if postprocess_params.rec_char_dict_path is None:
postprocess_params.rec_char_dict_path = model_urls['rec'][lang][
'dict_path']
# init model dir # init model dir
if postprocess_params.det_model_dir is None: if postprocess_params.det_model_dir is None:
postprocess_params.det_model_dir = os.path.join(BASE_DIR, 'det') postprocess_params.det_model_dir = os.path.join(BASE_DIR, 'det')
if postprocess_params.rec_model_dir is None: if postprocess_params.rec_model_dir is None:
postprocess_params.rec_model_dir = os.path.join(BASE_DIR, 'rec') postprocess_params.rec_model_dir = os.path.join(
BASE_DIR, 'rec/{}'.format(lang))
if postprocess_params.cls_model_dir is None:
postprocess_params.cls_model_dir = os.path.join(BASE_DIR, 'cls')
print(postprocess_params) print(postprocess_params)
# download model # download model
maybe_download(postprocess_params.det_model_dir, model_params['det']) maybe_download(postprocess_params.det_model_dir, model_urls['det'])
maybe_download(postprocess_params.rec_model_dir, model_params['rec']) maybe_download(postprocess_params.rec_model_dir,
model_urls['rec'][lang]['url'])
if self.use_angle_cls:
maybe_download(postprocess_params.cls_model_dir, model_urls['cls'])
if postprocess_params.det_algorithm not in SUPPORT_DET_MODEL: if postprocess_params.det_algorithm not in SUPPORT_DET_MODEL:
logger.error('det_algorithm must in {}'.format(SUPPORT_DET_MODEL)) logger.error('det_algorithm must in {}'.format(SUPPORT_DET_MODEL))
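The language handling in this hunk can be read as a small default-resolution step: an explicit `rec_char_dict_path` or `rec_model_dir` wins, otherwise both are derived from `lang`. The following is a minimal standalone sketch of that idea, not the code from the commit. The function name `resolve_rec_defaults` is hypothetical, the dictionary is a trimmed copy of the `model_urls['rec']` entries shown in this diff, and it raises `ValueError` where the original uses `assert`.

```python
import os

# Trimmed copy of the recognition entries of model_urls from this diff.
MODEL_URLS_REC = {
    'ch': {'dict_path': './ppocr/utils/ppocr_keys_v1.txt'},
    'en': {'dict_path': './ppocr/utils/ic15_dict.txt'},
}


def resolve_rec_defaults(lang, base_dir, rec_char_dict_path=None, rec_model_dir=None):
    """Fill in the dictionary path and model directory from lang when not given."""
    if lang not in MODEL_URLS_REC:
        raise ValueError('param lang must be in {}'.format(sorted(MODEL_URLS_REC)))
    if rec_char_dict_path is None:
        rec_char_dict_path = MODEL_URLS_REC[lang]['dict_path']
    if rec_model_dir is None:
        # One sub-directory per language, so 'ch' and 'en' models do not collide.
        rec_model_dir = os.path.join(base_dir, 'rec', lang)
    return rec_char_dict_path, rec_model_dir


# '/tmp/paddleocr_home' is an arbitrary example directory.
print(resolve_rec_defaults('en', '/tmp/paddleocr_home'))
print(resolve_rec_defaults('ch', '/tmp/paddleocr_home', rec_char_dict_path='my_dict.txt'))
```

Changing the `rec_char_dict_path` default to `None` in the parser is what makes this possible: a non-`None` value can only have come from the caller.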
...@@ -166,7 +200,7 @@ class PaddleOCR(predict_system.TextSystem): ...@@ -166,7 +200,7 @@ class PaddleOCR(predict_system.TextSystem):
# init det_model and rec_model # init det_model and rec_model
super().__init__(postprocess_params) super().__init__(postprocess_params)
def ocr(self, img, det=True, rec=True): def ocr(self, img, det=True, rec=True, cls=False):
""" """
ocr with paddleocr ocr with paddleocr
args: args:
...@@ -175,6 +209,10 @@ class PaddleOCR(predict_system.TextSystem): ...@@ -175,6 +209,10 @@ class PaddleOCR(predict_system.TextSystem):
rec: use text recognition or not, if false, only det will be exec. default is True rec: use text recognition or not, if false, only det will be exec. default is True
""" """
assert isinstance(img, (np.ndarray, list, str)) assert isinstance(img, (np.ndarray, list, str))
if cls and not self.use_angle_cls:
print('cls should be false when use_angle_cls is false')
exit(-1)
self.use_angle_cls = cls
if isinstance(img, str): if isinstance(img, str):
image_file = img image_file = img
img, flag = check_and_read_gif(image_file) img, flag = check_and_read_gif(image_file)
...@@ -194,6 +232,10 @@ class PaddleOCR(predict_system.TextSystem): ...@@ -194,6 +232,10 @@ class PaddleOCR(predict_system.TextSystem):
else: else:
if not isinstance(img, list): if not isinstance(img, list):
img = [img] img = [img]
if self.use_angle_cls:
img, cls_res, elapse = self.text_classifier(img)
if not rec:
return cls_res
rec_res, elapse = self.text_recognizer(img) rec_res, elapse = self.text_recognizer(img)
return rec_res return rec_res
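The control flow added to the `det=False` branch can be summarized as: reject `cls=True` when no classifier was loaded, optionally classify, return the classification results if recognition is disabled, otherwise recognize. The sketch below restates that flow with stub callables so it runs without any model. `run_without_detection`, `fake_classify` and `fake_recognize` are hypothetical names, the stubs omit the elapsed-time value the real components return, and the sketch raises `ValueError` instead of calling `exit(-1)`.

```python
def run_without_detection(images, classify, recognize,
                          rec=True, cls=False, use_angle_cls=False):
    """Sketch of the det=False path: optional classification, then recognition."""
    if cls and not use_angle_cls:
        # The classifier is only loaded when use_angle_cls was set at construction.
        raise ValueError('cls should be false when use_angle_cls is false')
    if not isinstance(images, list):
        images = [images]
    if cls:
        images, cls_res = classify(images)
        if not rec:
            return cls_res
    return recognize(images)


def fake_classify(images):
    """Stub classifier: reports every image as upright with a fixed score."""
    return images, [['0', 0.99] for _ in images]


def fake_recognize(images):
    """Stub recognizer: returns a placeholder text and score per image."""
    return [['text', 0.9] for _ in images]


print(run_without_detection('img', fake_classify, fake_recognize))
print(run_without_detection(['a', 'b'], fake_classify, fake_recognize,
                            rec=False, cls=True, use_angle_cls=True))
```

As in the diff, passing `rec=False` without `cls=True` still falls through to recognition in this branch.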
...@@ -208,6 +250,9 @@ def main(): ...@@ -208,6 +250,9 @@ def main():
ocr_engine = PaddleOCR() ocr_engine = PaddleOCR()
for img_path in image_file_list: for img_path in image_file_list:
print(img_path) print(img_path)
result = ocr_engine.ocr(img_path, det=args.det, rec=args.rec) result = ocr_engine.ocr(img_path,
det=args.det,
rec=args.rec,
cls=args.cls)
for line in result: for line in result:
print(line) print(line)
...@@ -32,7 +32,7 @@ setup( ...@@ -32,7 +32,7 @@ setup(
package_dir={'paddleocr': ''}, package_dir={'paddleocr': ''},
include_package_data=True, include_package_data=True,
entry_points={"console_scripts": ["paddleocr= paddleocr.paddleocr:main"]}, entry_points={"console_scripts": ["paddleocr= paddleocr.paddleocr:main"]},
version='0.0.3', version='1.0.0',
install_requires=requirements, install_requires=requirements,
license='Apache License 2.0', license='Apache License 2.0',
description='Awesome OCR toolkits based on PaddlePaddle (8.6M ultra-lightweight pre-trained model, support training and deployment among server, mobile, embeded and IoT devices', description='Awesome OCR toolkits based on PaddlePaddle (8.6M ultra-lightweight pre-trained model, support training and deployment among server, mobile, embeded and IoT devices',
......
...@@ -39,6 +39,7 @@ class TextClassifier(object): ...@@ -39,6 +39,7 @@ class TextClassifier(object):
self.cls_batch_num = args.rec_batch_num self.cls_batch_num = args.rec_batch_num
self.label_list = args.label_list self.label_list = args.label_list
self.use_zero_copy_run = args.use_zero_copy_run self.use_zero_copy_run = args.use_zero_copy_run
self.cls_thresh = args.cls_thresh
def resize_norm_img(self, img): def resize_norm_img(self, img):
imgC, imgH, imgW = self.cls_image_shape imgC, imgH, imgW = self.cls_image_shape
...@@ -110,7 +111,7 @@ class TextClassifier(object): ...@@ -110,7 +111,7 @@ class TextClassifier(object):
score = prob_out[rno][label_idx] score = prob_out[rno][label_idx]
label = self.label_list[label_idx] label = self.label_list[label_idx]
cls_res[indices[beg_img_no + rno]] = [label, score] cls_res[indices[beg_img_no + rno]] = [label, score]
if '180' in label and score > 0.9999: if '180' in label and score > self.cls_thresh:
img_list[indices[beg_img_no + rno]] = cv2.rotate( img_list[indices[beg_img_no + rno]] = cv2.rotate(
img_list[indices[beg_img_no + rno]], 1) img_list[indices[beg_img_no + rno]], 1)
return img_list, cls_res, predict_time return img_list, cls_res, predict_time
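This hunk replaces the hard-coded `0.9999` cutoff with the configurable `cls_thresh` (default `0.9`): an image is flipped only when the predicted label contains `'180'` and the score exceeds the threshold. The sketch below shows the decision on a tiny list-of-rows image. It uses a pure-Python 180-degree rotation as a stand-in for `cv2.rotate(img, 1)` (where `1` corresponds to `cv2.ROTATE_180`), and the helper names are hypothetical.

```python
def rotate_180(img):
    """Rotate a row-major 2D image (list of rows) by 180 degrees."""
    return [row[::-1] for row in img[::-1]]


def apply_cls_result(img, label, score, cls_thresh=0.9):
    """Rotate only when the label is '180' and the score exceeds cls_thresh."""
    if '180' in label and score > cls_thresh:
        return rotate_180(img)
    return img


img = [[1, 2],
       [3, 4]]
print(apply_cls_result(img, '180', 0.95))  # above threshold: rotated
print(apply_cls_result(img, '180', 0.85))  # below threshold: unchanged
print(apply_cls_result(img, '0', 0.99))    # upright label: unchanged
```

With the old cutoff of `0.9999`, the first call would also have left the image unchanged, so lowering the default makes the classifier's prediction take effect more often.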
......
...@@ -78,6 +78,7 @@ def parse_args(): ...@@ -78,6 +78,7 @@ def parse_args():
parser.add_argument("--cls_image_shape", type=str, default="3, 48, 192") parser.add_argument("--cls_image_shape", type=str, default="3, 48, 192")
parser.add_argument("--label_list", type=list, default=['0', '180']) parser.add_argument("--label_list", type=list, default=['0', '180'])
parser.add_argument("--cls_batch_num", type=int, default=30) parser.add_argument("--cls_batch_num", type=int, default=30)
parser.add_argument("--cls_thresh", type=float, default=0.9)
parser.add_argument("--enable_mkldnn", type=str2bool, default=False) parser.add_argument("--enable_mkldnn", type=str2bool, default=False)
parser.add_argument("--use_zero_copy_run", type=str2bool, default=False) parser.add_argument("--use_zero_copy_run", type=str2bool, default=False)
......