Commit 86b1fc1b · Author: tink2123

short multi-doc

Parent 01441e4e
...@@ -11,7 +11,7 @@ PaddleOCR 旨在打造一套丰富、领先、且实用的OCR工具库,不仅 ...@@ -11,7 +11,7 @@ PaddleOCR 旨在打造一套丰富、领先、且实用的OCR工具库,不仅
其中英文模型支持,大小写字母和常见标点的检测识别,并优化了空格字符的识别: 其中英文模型支持,大小写字母和常见标点的检测识别,并优化了空格字符的识别:
<div align="center"> <div align="center">
<img src="../imgs_results/multi_lang/en_1.jpg" width="400" height="600"> <img src="../imgs_results/multi_lang/img_12.jpg" width="400" height="600">
</div> </div>
小语种模型覆盖了拉丁语系、阿拉伯语系、中文繁体、韩语、日语等等: 小语种模型覆盖了拉丁语系、阿拉伯语系、中文繁体、韩语、日语等等:
...@@ -19,6 +19,8 @@ PaddleOCR 旨在打造一套丰富、领先、且实用的OCR工具库,不仅 ...@@ -19,6 +19,8 @@ PaddleOCR 旨在打造一套丰富、领先、且实用的OCR工具库,不仅
<div align="center"> <div align="center">
<img src="../imgs_results/multi_lang/japan_2.jpg" width="600" height="300"> <img src="../imgs_results/multi_lang/japan_2.jpg" width="600" height="300">
<img src="../imgs_results/multi_lang/french_0.jpg" width="300" height="300"> <img src="../imgs_results/multi_lang/french_0.jpg" width="300" height="300">
<img src="../imgs_results/multi_lang/korean_0.jpg" width="400" height="300">
<img src="../imgs_results/multi_lang/arabic_0.jpg" width="400" height="300">
</div> </div>
...@@ -30,14 +32,9 @@ PaddleOCR 旨在打造一套丰富、领先、且实用的OCR工具库,不仅 ...@@ -30,14 +32,9 @@ PaddleOCR 旨在打造一套丰富、领先、且实用的OCR工具库,不仅
- [2 快速使用](#快速使用) - [2 快速使用](#快速使用)
- [2.1 命令行运行](#命令行运行) - [2.1 命令行运行](#命令行运行)
- [2.1.1 整图预测](#bash_检测+识别)
- [2.1.2 识别预测](#bash_识别)
- [2.1.3 检测预测](#bash_检测)
- [2.2 python 脚本运行](#python_脚本运行) - [2.2 python 脚本运行](#python_脚本运行)
- [2.2.1 整图预测](#python_检测+识别)
- [2.2.2 识别预测](#python_识别)
- [2.2.3 检测预测](#python_检测)
- [3 自定义训练](#自定义训练) - [3 自定义训练](#自定义训练)
- [4 预测部署](#预测部署)
- [4 支持语种及缩写](#语种缩写) - [4 支持语种及缩写](#语种缩写)
<a name="安装"></a> <a name="安装"></a>
...@@ -108,8 +105,6 @@ paddleocr --image_dir doc/imgs/japan_2.jpg --lang=japan ...@@ -108,8 +105,6 @@ paddleocr --image_dir doc/imgs/japan_2.jpg --lang=japan
paddleocr --image_dir doc/imgs_words/japan/1.jpg --det false --lang=japan paddleocr --image_dir doc/imgs_words/japan/1.jpg --det false --lang=japan
``` ```
![](https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/release/2.0/doc/imgs_words/japan/1.jpg)
结果是一个tuple,返回识别结果和识别置信度 结果是一个tuple,返回识别结果和识别置信度
```text ```text
...@@ -143,6 +138,9 @@ from paddleocr import PaddleOCR, draw_ocr ...@@ -143,6 +138,9 @@ from paddleocr import PaddleOCR, draw_ocr
# 同样也是通过修改 lang 参数切换语种 # 同样也是通过修改 lang 参数切换语种
ocr = PaddleOCR(lang="korean") # 首次执行会自动下载模型文件 ocr = PaddleOCR(lang="korean") # 首次执行会自动下载模型文件
# 可通过参数控制单独执行识别、检测
# result = ocr.ocr(img_path, det=False) 只执行识别
# result = ocr.ocr(img_path, rec=False) 只执行检测
img_path = 'doc/imgs/korean_1.jpg' img_path = 'doc/imgs/korean_1.jpg'
result = ocr.ocr(img_path) result = ocr.ocr(img_path)
# 打印检测框和识别结果 # 打印检测框和识别结果
...@@ -166,59 +164,7 @@ im_show.save('result.jpg') ...@@ -166,59 +164,7 @@ im_show.save('result.jpg')
<img src="https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/release/2.1/doc/imgs_results/korean.jpg" width="800"> <img src="https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/release/2.1/doc/imgs_results/korean.jpg" width="800">
</div> </div>
* 识别预测 ppocr 还支持方向分类, 更多使用方式请参考:[whl包使用说明](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.0/doc/doc_ch/whl.md)
```
from paddleocr import PaddleOCR
ocr = PaddleOCR(lang="german")
img_path = 'PaddleOCR/doc/imgs_words/german/1.jpg'
result = ocr.ocr(img_path, det=False, cls=True)
for line in result:
print(line)
```
![](../imgs_words/german/1.jpg)
结果是一个tuple,只包含识别结果和识别置信度
```
('leider auch jetzt', 0.97538936)
```
* 检测预测
```python
from paddleocr import PaddleOCR, draw_ocr
ocr = PaddleOCR() # need to run only once to download and load model into memory
img_path = 'PaddleOCR/doc/imgs_en/img_12.jpg'
result = ocr.ocr(img_path, rec=False)
for line in result:
print(line)
# 显示结果
from PIL import Image
image = Image.open(img_path).convert('RGB')
im_show = draw_ocr(image, result, txts=None, scores=None, font_path='/path/to/PaddleOCR/doc/fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
```
结果是一个list,每个item只包含文本框
```bash
[[26.0, 457.0], [137.0, 457.0], [137.0, 477.0], [26.0, 477.0]]
[[25.0, 425.0], [372.0, 425.0], [372.0, 448.0], [25.0, 448.0]]
[[128.0, 397.0], [273.0, 397.0], [273.0, 414.0], [128.0, 414.0]]
......
```
结果可视化 :
<div align="center">
<img src="https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/release/2.1/doc/imgs_results/whl/12_det.jpg" width="800">
</div>
ppocr 还支持方向分类, 更多使用方式请参考:[whl包使用说明](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.0/doc/doc_ch/whl.md)
<a name="自定义训练"></a> <a name="自定义训练"></a>
## 3 自定义训练 ## 3 自定义训练
...@@ -229,85 +175,59 @@ ppocr 支持使用自己的数据进行自定义训练或finetune, 其中识别 ...@@ -229,85 +175,59 @@ ppocr 支持使用自己的数据进行自定义训练或finetune, 其中识别
具体数据准备、训练过程可参考:[文本检测](../doc_ch/detection.md)[文本识别](../doc_ch/recognition.md),更多功能如预测部署、 具体数据准备、训练过程可参考:[文本检测](../doc_ch/detection.md)[文本识别](../doc_ch/recognition.md),更多功能如预测部署、
数据标注等功能可以阅读完整的[文档教程](../../README_ch.md) 数据标注等功能可以阅读完整的[文档教程](../../README_ch.md)
<a name="预测部署"></a>
## 4 预测部署
除了安装whl包进行快速预测,ppocr 也提供了多种预测部署方式,如有需求可阅读相关文档:
- [基于Python脚本预测引擎推理](./doc/doc_ch/inference.md)
- [基于C++预测引擎推理](./deploy/cpp_infer/readme.md)
- [服务化部署](./deploy/pdserving/README_CN.md)
- [端侧部署](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/deploy/lite/readme.md)
- [Benchmark](./doc/doc_ch/benchmark.md)
<a name="语种缩写"></a> <a name="语种缩写"></a>
## 4 支持语种及缩写 ## 5 支持语种及缩写
| 语种 | 描述 | 缩写 | | 语种 | 描述 | 缩写 | | 语种 | 描述 | 缩写 |
| --- | --- | --- | | --- | --- | --- | ---|--- | --- | --- |
|中文|chinese and english|ch| |中文|chinese and english|ch| |保加利亚文|Bulgarian |bg|
|英文|english|en| |英文|english|en| |乌克兰文|Ukranian|uk|
|法文|french|fr| |法文|french|fr| |白俄罗斯文|Belarusian|be|
|德文|german|german| |德文|german|german| |泰卢固文|Telugu |te|
|日文|japan|japan| |日文|japan|japan| |卡纳达文|Kannada |kn|
|韩文|korean|korean| |韩文|korean|korean| |泰米尔文|Tamil |ta|
|中文繁体|chinese traditional |ch_tra| |中文繁体|chinese traditional |ch_tra| |南非荷兰文 |Afrikaans |af|
|意大利文| Italian |it| |意大利文| Italian |it| |阿塞拜疆文 |Azerbaijani |az|
|西班牙文|Spanish |es| |西班牙文|Spanish |es| |波斯尼亚文|Bosnian|bs|
|葡萄牙文| Portuguese|pt| |葡萄牙文| Portuguese|pt| |捷克文|Czech|cs|
|俄罗斯文|Russia|ru| |俄罗斯文|Russia|ru| |威尔士文 |Welsh |cy|
|阿拉伯文|Arabic|ar| |阿拉伯文|Arabic|ar| |丹麦文 |Danish|da|
|印地文|Hindi|hi| |印地文|Hindi|hi| |爱沙尼亚文 |Estonian |et|
|维吾尔|Uyghur|ug| |维吾尔|Uyghur|ug| |爱尔兰文 |Irish |ga|
|波斯文|Persian|fa| |波斯文|Persian|fa| |克罗地亚文|Croatian |hr|
|乌尔都文|Urdu|ur| |乌尔都文|Urdu|ur| |匈牙利文|Hungarian |hu|
|塞尔维亚文(latin)| Serbian(latin) |rs_latin| |塞尔维亚文(latin)| Serbian(latin) |rs_latin| |印尼文|Indonesian|id|
|欧西坦文|Occitan |oc| |欧西坦文|Occitan |oc| |冰岛文 |Icelandic|is|
|马拉地文|Marathi|mr| |马拉地文|Marathi|mr| |库尔德文 |Kurdish|ku|
|尼泊尔文|Nepali|ne| |尼泊尔文|Nepali|ne| |立陶宛文|Lithuanian |lt|
|塞尔维亚文(cyrillic)|Serbian(cyrillic)|rs_cyrillic| |塞尔维亚文(cyrillic)|Serbian(cyrillic)|rs_cyrillic| |拉脱维亚文 |Latvian |lv|
|保加利亚文|Bulgarian |bg| |毛利文|Maori|mi| | 达尔瓦文|Dargwa |dar|
|乌克兰文|Ukranian|uk| |马来文 |Malay|ms| | 因古什文|Ingush |inh|
|白俄罗斯文|Belarusian|be| |马耳他文 |Maltese |mt| | 拉克文|Lak |lbe|
|泰卢固文|Telugu |te| |荷兰文 |Dutch |nl| | 莱兹甘文|Lezghian |lez|
|卡纳达文|Kannada |kn| |挪威文 |Norwegian |no| |塔巴萨兰文 |Tabassaran |tab|
|泰米尔文|Tamil |ta| |波兰文|Polish |pl| | 比尔哈文|Bihari |bh|
|南非荷兰文 |Afrikaans |af| | 罗马尼亚文|Romanian |ro| | 迈蒂利文|Maithili |mai|
|阿塞拜疆文 |Azerbaijani |az| | 斯洛伐克文|Slovak |sk| | 昂加文|Angika |ang|
|波斯尼亚文|Bosnian|bs| | 斯洛文尼亚文|Slovenian |sl| | 孟加拉文|Bhojpuri |bho|
|捷克文|Czech|cs| | 阿尔巴尼亚文|Albanian |sq| | 摩揭陀文 |Magahi |mah|
|威尔士文 |Welsh |cy| | 瑞典文|Swedish |sv| | 那格浦尔文|Nagpur |sck|
|丹麦文 |Danish|da| | 西瓦希里文|Swahili |sw| | 尼瓦尔文|Newari |new|
|爱沙尼亚文 |Estonian |et| | 塔加洛文|Tagalog |tl| | 保加利亚文 |Goan Konkani|gom|
|爱尔兰文 |Irish |ga| | 土耳其文|Turkish |tr| | 沙特阿拉伯文|Saudi Arabia|sa|
|克罗地亚文|Croatian |hr| | 乌兹别克文|Uzbek |uz| | 阿瓦尔文|Avar |ava|
|匈牙利文|Hungarian |hu| | 越南文|Vietnamese |vi| | 阿瓦尔文|Avar |ava|
|印尼文|Indonesian|id| | 蒙古文|Mongolian |mn| | 阿迪赫文|Adyghe |ady|
|冰岛文 |Icelandic|is|
|库尔德文 |Kurdish|ku|
|立陶宛文|Lithuanian |lt|
|拉脱维亚文 |Latvian |lv|
|毛利文|Maori|mi|
|马来文 |Malay|ms|
|马耳他文 |Maltese |mt|
|荷兰文 |Dutch |nl|
|挪威文 |Norwegian |no|
|波兰文|Polish |pl|
| 罗马尼亚文|Romanian |ro|
| 斯洛伐克文|Slovak |sk|
| 斯洛文尼亚文|Slovenian |sl|
| 阿尔巴尼亚文|Albanian |sq|
| 瑞典文|Swedish |sv|
| 西瓦希里文|Swahili |sw|
| 塔加洛文|Tagalog |tl|
| 土耳其文|Turkish |tr|
| 乌兹别克文|Uzbek |uz|
| 越南文|Vietnamese |vi|
| 蒙古文|Mongolian |mn|
| 阿巴扎文|Abaza |abq| | 阿巴扎文|Abaza |abq|
| 阿迪赫文|Adyghe |ady|
| 卡巴丹文|Kabardian |kbd|
| 阿瓦尔文|Avar |ava|
| 达尔瓦文|Dargwa |dar|
| 因古什文|Ingush |inh|
| 拉克文|Lak |lbe|
| 莱兹甘文|Lezghian |lez|
|塔巴萨兰文 |Tabassaran |tab|
| 比尔哈文|Bihari |bh|
| 迈蒂利文|Maithili |mai|
| 昂加文|Angika |ang|
| 孟加拉文|Bhojpuri |bho|
| 摩揭陀文 |Magahi |mah|
| 那格浦尔文|Nagpur |sck|
| 尼瓦尔文|Newari |new|
| 保加利亚文 |Goan Konkani|gom|
| 沙特阿拉伯文|Saudi Arabia|sa|
...@@ -102,14 +102,14 @@ python3 generate_multi_language_configs.py -l it \ ...@@ -102,14 +102,14 @@ python3 generate_multi_language_configs.py -l it \
| german_mobile_v2.0_rec |Lightweight model for German recognition|[rec_german_lite_train.yml](../../configs/rec/multi_language/rec_german_lite_train.yml)|2.65M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/german_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/german_mobile_v2.0_rec_train.tar) | | german_mobile_v2.0_rec |Lightweight model for German recognition|[rec_german_lite_train.yml](../../configs/rec/multi_language/rec_german_lite_train.yml)|2.65M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/german_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/german_mobile_v2.0_rec_train.tar) |
| korean_mobile_v2.0_rec |Lightweight model for Korean recognition|[rec_korean_lite_train.yml](../../configs/rec/multi_language/rec_korean_lite_train.yml)|3.9M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/korean_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/korean_mobile_v2.0_rec_train.tar) | | korean_mobile_v2.0_rec |Lightweight model for Korean recognition|[rec_korean_lite_train.yml](../../configs/rec/multi_language/rec_korean_lite_train.yml)|3.9M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/korean_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/korean_mobile_v2.0_rec_train.tar) |
| japan_mobile_v2.0_rec |Lightweight model for Japanese recognition|[rec_japan_lite_train.yml](../../configs/rec/multi_language/rec_japan_lite_train.yml)|4.23M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/japan_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/japan_mobile_v2.0_rec_train.tar) | | japan_mobile_v2.0_rec |Lightweight model for Japanese recognition|[rec_japan_lite_train.yml](../../configs/rec/multi_language/rec_japan_lite_train.yml)|4.23M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/japan_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/japan_mobile_v2.0_rec_train.tar) |
| chinese_cht_mobile_v2.0_rec |Lightweight model for chinese cht recognition|rec_chinese_cht_lite_train.yml|5.63M|[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/chinese_cht_mobile_v2.0_rec_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/chinese_cht_mobile_v2.0_rec_train.tar) | | chinese_cht_mobile_v2.0_rec |Lightweight model for chinese cht recognition|rec_chinese_cht_lite_train.yml|5.63M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/chinese_cht_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/chinese_cht_mobile_v2.0_rec_train.tar) |
| te_mobile_v2.0_rec |Lightweight model for Telugu recognition|rec_te_lite_train.yml|2.63M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/te_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/te_mobile_v2.0_rec_train.tar) | | te_mobile_v2.0_rec |Lightweight model for Telugu recognition|rec_te_lite_train.yml|2.63M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/te_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/te_mobile_v2.0_rec_train.tar) |
| ka_mobile_v2.0_rec |Lightweight model for Kannada recognition|rec_ka_lite_train.yml|2.63M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/ka_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/ka_mobile_v2.0_rec_train.tar) | | ka_mobile_v2.0_rec |Lightweight model for Kannada recognition|rec_ka_lite_train.yml|2.63M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/ka_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/ka_mobile_v2.0_rec_train.tar) |
| ta_mobile_v2.0_rec |Lightweight model for Tamil recognition|rec_ta_lite_train.yml|2.63M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/ta_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/ta_mobile_v2.0_rec_train.tar) | | ta_mobile_v2.0_rec |Lightweight model for Tamil recognition|rec_ta_lite_train.yml|2.63M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/ta_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/ta_mobile_v2.0_rec_train.tar) |
| latin_mobile_v2.0_rec | Lightweight model for latin recognition | [rec_latin_lite_train.yml](../../configs/rec/multi_language/rec_latin_lite_train.yml) |2.6M|[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/latin_ppocr_mobile_v2.0_rec_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/latin_ppocr_mobile_v2.0_rec_train.tar) | | latin_mobile_v2.0_rec | Lightweight model for latin recognition | [rec_latin_lite_train.yml](../../configs/rec/multi_language/rec_latin_lite_train.yml) |2.6M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/latin_ppocr_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/latin_ppocr_mobile_v2.0_rec_train.tar) |
| arabic_mobile_v2.0_rec | Lightweight model for arabic recognition | [rec_arabic_lite_train.yml](../../configs/rec/multi_language/rec_arabic_lite_train.yml) |2.6M|[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/arabic_ppocr_mobile_v2.0_rec_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/arabic_ppocr_mobile_v2.0_rec_train.tar) | | arabic_mobile_v2.0_rec | Lightweight model for arabic recognition | [rec_arabic_lite_train.yml](../../configs/rec/multi_language/rec_arabic_lite_train.yml) |2.6M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/arabic_ppocr_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/arabic_ppocr_mobile_v2.0_rec_train.tar) |
| cyrillic_mobile_v2.0_rec | Lightweight model for cyrillic recognition | [rec_cyrillic_lite_train.yml](../../configs/rec/multi_language/rec_cyrillic_lite_train.yml) |2.6M|[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/cyrillic_ppocr_mobile_v2.0_rec_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/cyrillic_ppocr_mobile_v2.0_rec_train.tar) | | cyrillic_mobile_v2.0_rec | Lightweight model for cyrillic recognition | [rec_cyrillic_lite_train.yml](../../configs/rec/multi_language/rec_cyrillic_lite_train.yml) |2.6M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/cyrillic_ppocr_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/cyrillic_ppocr_mobile_v2.0_rec_train.tar) |
| devanagari_mobile_v2.0_rec | Lightweight model for devanagari recognition | [rec_devanagari_lite_train.yml](../../configs/rec/multi_language/rec_devanagari_lite_train.yml) |2.6M|[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/devanagari_ppocr_mobile_v2.0_rec_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/devanagari_ppocr_mobile_v2.0_rec_train.tar) | | devanagari_mobile_v2.0_rec | Lightweight model for devanagari recognition | [rec_devanagari_lite_train.yml](../../configs/rec/multi_language/rec_devanagari_lite_train.yml) |2.6M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/devanagari_ppocr_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/devanagari_ppocr_mobile_v2.0_rec_train.tar) |
For more supported languages, please refer to : [Multi-language model](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/doc/doc_en/multi_languages_en.md#4-support-languages-and-abbreviations) For more supported languages, please refer to : [Multi-language model](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/doc/doc_en/multi_languages_en.md#4-support-languages-and-abbreviations)
......
...@@ -13,7 +13,7 @@ Among them, the English model supports the detection and recognition of uppercas ...@@ -13,7 +13,7 @@ Among them, the English model supports the detection and recognition of uppercas
letters and common punctuation, and the recognition of space characters is optimized: letters and common punctuation, and the recognition of space characters is optimized:
<div align="center"> <div align="center">
<img src="../imgs_results/multi_lang/en_1.jpg" width="400" height="600"> <img src="../imgs_results/multi_lang/img_12.jpg" width="400" height="600">
</div> </div>
The multilingual models cover Latin, Arabic, Traditional Chinese, Korean, Japanese, etc.: The multilingual models cover Latin, Arabic, Traditional Chinese, Korean, Japanese, etc.:
...@@ -21,6 +21,8 @@ The multilingual models cover Latin, Arabic, Traditional Chinese, Korean, Japane ...@@ -21,6 +21,8 @@ The multilingual models cover Latin, Arabic, Traditional Chinese, Korean, Japane
<div align="center"> <div align="center">
<img src="../imgs_results/multi_lang/japan_2.jpg" width="600" height="300"> <img src="../imgs_results/multi_lang/japan_2.jpg" width="600" height="300">
<img src="../imgs_results/multi_lang/french_0.jpg" width="300" height="300"> <img src="../imgs_results/multi_lang/french_0.jpg" width="300" height="300">
<img src="../imgs_results/multi_lang/korean_0.jpg" width="400" height="300">
<img src="../imgs_results/multi_lang/arabic_0.jpg" width="400" height="300">
</div> </div>
This document will briefly introduce how to use the multilingual model. This document will briefly introduce how to use the multilingual model.
...@@ -31,14 +33,9 @@ This document will briefly introduce how to use the multilingual model. ...@@ -31,14 +33,9 @@ This document will briefly introduce how to use the multilingual model.
- [2 Quick Use](#Quick_Use) - [2 Quick Use](#Quick_Use)
- [2.1 Command line operation](#Command_line_operation) - [2.1 Command line operation](#Command_line_operation)
- [2.1.1 Prediction of the whole image](#bash_detection+recognition)
- [2.1.2 Recognition](#bash_Recognition)
- [2.1.3 Detection](#bash_detection)
- [2.2 python script running](#python_Script_running) - [2.2 python script running](#python_Script_running)
- [2.2.1 Whole image prediction](#python_detection+recognition)
- [2.2.2 Recognition](#python_Recognition)
- [2.2.3 Detection](#python_detection)
- [3 Custom Training](#Custom_Training) - [3 Custom Training](#Custom_Training)
- [4 Inference and Deployment](#inference)
- [4 Supported languages and abbreviations](#language_abbreviations) - [4 Supported languages and abbreviations](#language_abbreviations)
<a name="Install"></a> <a name="Install"></a>
...@@ -143,6 +140,9 @@ from paddleocr import PaddleOCR, draw_ocr ...@@ -143,6 +140,9 @@ from paddleocr import PaddleOCR, draw_ocr
ocr = PaddleOCR(lang="korean") # The model file will be downloaded automatically when executed for the first time ocr = PaddleOCR(lang="korean") # The model file will be downloaded automatically when executed for the first time
img_path ='doc/imgs/korean_1.jpg' img_path ='doc/imgs/korean_1.jpg'
result = ocr.ocr(img_path) result = ocr.ocr(img_path)
# Recognition and detection can be performed separately through parameter control
# result = ocr.ocr(img_path, det=False) Only perform recognition
# result = ocr.ocr(img_path, rec=False) Only perform detection
# Print detection frame and recognition result # Print detection frame and recognition result
for line in result: for line in result:
print(line) print(line)
...@@ -162,54 +162,6 @@ Visualization of results: ...@@ -162,54 +162,6 @@ Visualization of results:
![](https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/release/2.1/doc/imgs_results/korean.jpg) ![](https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/release/2.1/doc/imgs_results/korean.jpg)
* Recognition
```
from paddleocr import PaddleOCR
ocr = PaddleOCR(lang="german")
img_path ='PaddleOCR/doc/imgs_words/german/1.jpg'
result = ocr.ocr(img_path, det=False, cls=True)
for line in result:
print(line)
```
![](https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/release/2.1/doc/imgs_words/german/1.jpg)
The result is a tuple, which only contains the recognition result and recognition confidence
```
('leider auch jetzt', 0.97538936)
```
* Detection
```python
from paddleocr import PaddleOCR, draw_ocr
ocr = PaddleOCR() # need to run only once to download and load model into memory
img_path ='PaddleOCR/doc/imgs_en/img_12.jpg'
result = ocr.ocr(img_path, rec=False)
for line in result:
print(line)
# show result
from PIL import Image
image = Image.open(img_path).convert('RGB')
im_show = draw_ocr(image, result, txts=None, scores=None, font_path='/path/to/PaddleOCR/doc/fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
```
The result is a list, each item contains only text boxes
```bash
[[26.0, 457.0], [137.0, 457.0], [137.0, 477.0], [26.0, 477.0]]
[[25.0, 425.0], [372.0, 425.0], [372.0, 448.0], [25.0, 448.0]]
[[128.0, 397.0], [273.0, 397.0], [273.0, 414.0], [128.0, 414.0]]
......
```
Visualization of results:
![](https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/release/2.1/doc/imgs_results/whl/12_det.jpg)
ppocr also supports direction classification. For more usage methods, please refer to: [whl package instructions](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.0/doc/doc_ch/whl.md). ppocr also supports direction classification. For more usage methods, please refer to: [whl package instructions](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.0/doc/doc_ch/whl.md).
<a name="Custom_training"></a> <a name="Custom_training"></a>
...@@ -221,85 +173,62 @@ Modify the training data path, dictionary and other parameters. ...@@ -221,85 +173,62 @@ Modify the training data path, dictionary and other parameters.
For specific data preparation and training process, please refer to: [Text Detection](../doc_en/detection_en.md), [Text Recognition](../doc_en/recognition_en.md), more functions such as predictive deployment, For specific data preparation and training process, please refer to: [Text Detection](../doc_en/detection_en.md), [Text Recognition](../doc_en/recognition_en.md), more functions such as predictive deployment,
For functions such as data annotation, you can read the complete [Document Tutorial](../../README.md). For functions such as data annotation, you can read the complete [Document Tutorial](../../README.md).
<a name="inference"></a>
## 4 Inference and Deployment
In addition to installing the whl package for quick forecasting,
ppocr also provides a variety of forecasting deployment methods.
If necessary, you can read related documents:
- [Python Inference](./doc/doc_en/inference_en.md)
- [C++ Inference](./deploy/cpp_infer/readme_en.md)
- [Serving](./deploy/pdserving/README.md)
- [Mobile](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/deploy/lite/readme_en.md)
- [Benchmark](./doc/doc_en/benchmark_en.md)
<a name="language_abbreviation"></a> <a name="language_abbreviation"></a>
## 4 Support languages and abbreviations ## 5 Support languages and abbreviations
| Language | Abbreviation | | Language | Abbreviation | | Language | Abbreviation |
| --- | --- | | --- | --- | | --- | --- |
|chinese and english|ch| |chinese and english|ch| |Arabic|ar|
|english|en| |english|en| |Hindi|hi|
|french|fr| |french|fr| |Uyghur|ug|
|german|german| |german|german| |Persian|fa|
|japan|japan| |japan|japan| |Urdu|ur|
|korean|korean| |korean|korean| | Serbian(latin) |rs_latin|
|chinese traditional |ch_tra| |chinese traditional |ch_tra| |Occitan |oc|
| Italian |it| | Italian |it| |Marathi|mr|
|Spanish |es| |Spanish |es| |Nepali|ne|
| Portuguese|pt| | Portuguese|pt| |Serbian(cyrillic)|rs_cyrillic|
|Russia|ru| |Russia|ru||Bulgarian |bg|
|Arabic|ar| |Ukranian|uk| |Estonian |et|
|Hindi|hi| |Belarusian|be| |Irish |ga|
|Uyghur|ug| |Telugu |te| |Croatian |hr|
|Persian|fa| |Kannada |kn| |Hungarian |hu|
|Urdu|ur| |Tamil |ta| |Indonesian|id|
| Serbian(latin) |rs_latin| |Afrikaans |af| |Icelandic|is|
|Occitan |oc| |Azerbaijani |az||Kurdish|ku|
|Marathi|mr| |Bosnian|bs| |Lithuanian |lt|
|Nepali|ne| |Czech|cs| |Latvian |lv|
|Serbian(cyrillic)|rs_cyrillic| |Welsh |cy| |Maori|mi|
|Bulgarian |bg| |Danish|da| |Malay|ms|
|Ukranian|uk| |Maltese |mt| |Adyghe |ady|
|Belarusian|be| |Dutch |nl| |Kabardian |kbd|
|Telugu |te| |Norwegian |no| |Avar |ava|
|Kannada |kn| |Polish |pl| |Dargwa |dar|
|Tamil |ta| |Romanian |ro| |Ingush |inh|
|Afrikaans |af| |Slovak |sk| |Lak |lbe|
|Azerbaijani |az| |Slovenian |sl| |Lezghian |lez|
|Bosnian|bs| |Albanian |sq| |Tabassaran |tab|
|Czech|cs| |Swedish |sv| |Bihari |bh|
|Welsh |cy| |Swahili |sw| |Maithili |mai|
|Danish|da| |Tagalog |tl| |Angika |ang|
|Estonian |et| |Turkish |tr| |Bhojpuri |bho|
|Irish |ga| |Uzbek |uz| |Magahi |mah|
|Croatian |hr| |Vietnamese |vi| |Nagpur |sck|
|Hungarian |hu| |Mongolian |mn| |Newari |new|
|Indonesian|id| |Abaza |abq| |Goan Konkani|gom|
|Icelandic|is|
|Kurdish|ku|
|Lithuanian |lt|
|Latvian |lv|
|Maori|mi|
|Malay|ms|
|Maltese |mt|
|Dutch |nl|
|Norwegian |no|
|Polish |pl|
|Romanian |ro|
|Slovak |sk|
|Slovenian |sl|
|Albanian |sq|
|Swedish |sv|
|Swahili |sw|
|Tagalog |tl|
|Turkish |tr|
|Uzbek |uz|
|Vietnamese |vi|
|Mongolian |mn|
|Abaza |abq|
|Adyghe |ady|
|Kabardian |kbd|
|Avar |ava|
|Dargwa |dar|
|Ingush |inh|
|Lak |lbe|
|Lezghian |lez|
|Tabassaran |tab|
|Bihari |bh|
|Maithili |mai|
|Angika |ang|
|Bhojpuri |bho|
|Magahi |mah|
|Nagpur |sck|
|Newari |new|
|Goan Konkani|gom|
|Saudi Arabia|sa| |Saudi Arabia|sa|
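
The abbreviation column is the value passed as the `lang` argument, as in `PaddleOCR(lang="korean")` earlier in this document. A minimal lookup sketch follows, assuming a hand-copied subset of the table; the names `LANG_ABBREVIATIONS` and `lang_code` are hypothetical and not part of the library.

```python
# Subset of the abbreviation table above, copied by hand; extend as needed.
LANG_ABBREVIATIONS = {
    "chinese and english": "ch",
    "english": "en",
    "french": "fr",
    "german": "german",
    "japan": "japan",
    "korean": "korean",
    "chinese traditional": "ch_tra",
    "italian": "it",
    "spanish": "es",
    "portuguese": "pt",
    "arabic": "ar",
    "hindi": "hi",
}


def lang_code(name):
    """Map a language name from the table to its abbreviation.

    Matching ignores case and surrounding whitespace. Raises ValueError
    for names that are not in the (partial) table.
    """
    key = name.strip().lower()
    if key not in LANG_ABBREVIATIONS:
        raise ValueError(f"no abbreviation known for {name!r}")
    return LANG_ABBREVIATIONS[key]
```

For example, `lang_code("Korean")` returns the string used in the Python example above, so a caller could write `PaddleOCR(lang=lang_code("Korean"))`.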