diff --git a/doc/doc_ch/multi_languages.md b/doc/doc_ch/multi_languages.md
index 741602e3c26725304c8a5e8300969fbea6ece4d0..306eba36e463cb4aef20a1d8ff895ecfcc77d0ef 100644
--- a/doc/doc_ch/multi_languages.md
+++ b/doc/doc_ch/multi_languages.md
@@ -11,7 +11,7 @@ PaddleOCR 旨在打造一套丰富、领先、且实用的OCR工具库,不仅
其中英文模型支持,大小写字母和常见标点的检测识别,并优化了空格字符的识别:
-
+
小语种模型覆盖了拉丁语系、阿拉伯语系、中文繁体、韩语、日语等等:
@@ -19,6 +19,8 @@ PaddleOCR 旨在打造一套丰富、领先、且实用的OCR工具库,不仅
@@ -30,14 +32,9 @@ PaddleOCR 旨在打造一套丰富、领先、且实用的OCR工具库,不仅
- [2 快速使用](#快速使用)
- [2.1 命令行运行](#命令行运行)
- - [2.1.1 整图预测](#bash_检测+识别)
- - [2.1.2 识别预测](#bash_识别)
- - [2.1.3 检测预测](#bash_检测)
- [2.2 python 脚本运行](#python_脚本运行)
- - [2.2.1 整图预测](#python_检测+识别)
- - [2.2.2 识别预测](#python_识别)
- - [2.2.3 检测预测](#python_检测)
- [3 自定义训练](#自定义训练)
+- [4 预测部署](#预测部署)
-- [4 支持语种及缩写](#语种缩写)
+- [5 支持语种及缩写](#语种缩写)
@@ -50,7 +47,7 @@ PaddleOCR 旨在打造一套丰富、领先、且实用的OCR工具库,不仅
pip install paddlepaddle
# gpu
-pip instll paddlepaddle-gpu
+pip install paddlepaddle-gpu
```
@@ -108,8 +105,6 @@ paddleocr --image_dir doc/imgs/japan_2.jpg --lang=japan
paddleocr --image_dir doc/imgs_words/japan/1.jpg --det false --lang=japan
```
-![](https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/release/2.0/doc/imgs_words/japan/1.jpg)
-
结果是一个tuple,返回识别结果和识别置信度
```text
@@ -145,6 +140,9 @@ from paddleocr import PaddleOCR, draw_ocr
ocr = PaddleOCR(lang="korean") # 首次执行会自动下载模型文件
img_path = 'doc/imgs/korean_1.jpg '
result = ocr.ocr(img_path)
+# 可通过参数控制单独执行识别、检测
+# result = ocr.ocr(img_path, det=False)  # 只执行识别
+# result = ocr.ocr(img_path, rec=False)  # 只执行检测
# 打印检测框和识别结果
for line in result:
print(line)
@@ -166,59 +164,7 @@ im_show.save('result.jpg')
-* 识别预测
-
-```
-from paddleocr import PaddleOCR
-ocr = PaddleOCR(lang="german")
-img_path = 'PaddleOCR/doc/imgs_words/german/1.jpg'
-result = ocr.ocr(img_path, det=False, cls=True)
-for line in result:
- print(line)
-```
-
-
-![](../imgs_words/german/1.jpg)
-
-结果是一个tuple,只包含识别结果和识别置信度
-
-```
-('leider auch jetzt', 0.97538936)
-```
-
-* 检测预测
-
-```python
-from paddleocr import PaddleOCR, draw_ocr
-ocr = PaddleOCR() # need to run only once to download and load model into memory
-img_path = 'PaddleOCR/doc/imgs_en/img_12.jpg'
-result = ocr.ocr(img_path, rec=False)
-for line in result:
- print(line)
-
-# 显示结果
-from PIL import Image
-
-image = Image.open(img_path).convert('RGB')
-im_show = draw_ocr(image, result, txts=None, scores=None, font_path='/path/to/PaddleOCR/doc/fonts/simfang.ttf')
-im_show = Image.fromarray(im_show)
-im_show.save('result.jpg')
-```
-结果是一个list,每个item只包含文本框
-```bash
-[[26.0, 457.0], [137.0, 457.0], [137.0, 477.0], [26.0, 477.0]]
-[[25.0, 425.0], [372.0, 425.0], [372.0, 448.0], [25.0, 448.0]]
-[[128.0, 397.0], [273.0, 397.0], [273.0, 414.0], [128.0, 414.0]]
-......
-```
-
-结果可视化 :
-
-
-
-
-
-ppocr 还支持方向分类, 更多使用方式请参考:[whl包使用说明](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.0/doc/doc_ch/whl.md)。
+ppocr 还支持方向分类, 更多使用方式请参考:[whl包使用说明](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.0/doc/doc_ch/whl.md)
## 3 自定义训练
@@ -229,84 +175,58 @@ ppocr 支持使用自己的数据进行自定义训练或finetune, 其中识别
具体数据准备、训练过程可参考:[文本检测](../doc_ch/detection.md)、[文本识别](../doc_ch/recognition.md),更多功能如预测部署、
数据标注等功能可以阅读完整的[文档教程](../../README_ch.md)。
+
+## 4 预测部署
+
+除了安装whl包进行快速预测,ppocr 也提供了多种预测部署方式,如有需求可阅读相关文档:
+- [基于Python脚本预测引擎推理](./inference.md)
+- [基于C++预测引擎推理](../../deploy/cpp_infer/readme.md)
+- [服务化部署](../../deploy/hubserving/readme.md)
+- [端侧部署](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/deploy/lite/readme.md)
+- [Benchmark](./benchmark.md)
+
+
+
-## 4 支持语种及缩写
-
-| 语种 | 描述 | 缩写 |
-| --- | --- | --- |
-|中文|chinese and english|ch|
-|英文|english|en|
-|法文|french|fr|
-|德文|german|german|
-|日文|japan|japan|
-|韩文|korean|korean|
-|中文繁体|chinese traditional |chinese_cht|
-|意大利文| Italian |it|
-|西班牙文|Spanish |es|
-|葡萄牙文| Portuguese|pt|
-|俄罗斯文|Russia|ru|
-|阿拉伯文|Arabic|ar|
-|印地文|Hindi|hi|
-|维吾尔|Uyghur|ug|
-|波斯文|Persian|fa|
-|乌尔都文|Urdu|ur|
-|塞尔维亚文(latin)| Serbian(latin) |rs_latin|
-|欧西坦文|Occitan |oc|
-|马拉地文|Marathi|mr|
-|尼泊尔文|Nepali|ne|
-|塞尔维亚文(cyrillic)|Serbian(cyrillic)|rs_cyrillic|
-|保加利亚文|Bulgarian |bg|
-|乌克兰文|Ukranian|uk|
-|白俄罗斯文|Belarusian|be|
-|泰卢固文|Telugu |te|
-|泰米尔文|Tamil |ta|
-|南非荷兰文 |Afrikaans |af|
-|阿塞拜疆文 |Azerbaijani |az|
-|波斯尼亚文|Bosnian|bs|
-|捷克文|Czech|cs|
-|威尔士文 |Welsh |cy|
-|丹麦文 |Danish|da|
-|爱沙尼亚文 |Estonian |et|
-|爱尔兰文 |Irish |ga|
-|克罗地亚文|Croatian |hr|
-|匈牙利文|Hungarian |hu|
-|印尼文|Indonesian|id|
-|冰岛文 |Icelandic|is|
-|库尔德文 |Kurdish|ku|
-|立陶宛文|Lithuanian |lt|
-|拉脱维亚文 |Latvian |lv|
-|毛利文|Maori|mi|
-|马来文 |Malay|ms|
-|马耳他文 |Maltese |mt|
-|荷兰文 |Dutch |nl|
-|挪威文 |Norwegian |no|
-|波兰文|Polish |pl|
-| 罗马尼亚文|Romanian |ro|
-| 斯洛伐克文|Slovak |sk|
-| 斯洛文尼亚文|Slovenian |sl|
-| 阿尔巴尼亚文|Albanian |sq|
-| 瑞典文|Swedish |sv|
-| 西瓦希里文|Swahili |sw|
-| 塔加洛文|Tagalog |tl|
-| 土耳其文|Turkish |tr|
-| 乌兹别克文|Uzbek |uz|
-| 越南文|Vietnamese |vi|
-| 蒙古文|Mongolian |mn|
-| 阿巴扎文|Abaza |abq|
-| 阿迪赫文|Adyghe |ady|
-| 卡巴丹文|Kabardian |kbd|
-| 阿瓦尔文|Avar |ava|
-| 达尔瓦文|Dargwa |dar|
-| 因古什文|Ingush |inh|
-| 拉克文|Lak |lbe|
-| 莱兹甘文|Lezghian |lez|
-|塔巴萨兰文 |Tabassaran |tab|
-| 比尔哈文|Bihari |bh|
-| 迈蒂利文|Maithili |mai|
-| 昂加文|Angika |ang|
-| 孟加拉文|Bhojpuri |bho|
-| 摩揭陀文 |Magahi |mah|
-| 那格浦尔文|Nagpur |sck|
-| 尼瓦尔文|Newari |new|
-| 保加利亚文 |Goan Konkani|gom|
-| 沙特阿拉伯文|Saudi Arabia|sa|
+## 5 支持语种及缩写
+
+| 语种 | 描述 | 缩写 | | 语种 | 描述 | 缩写 |
+| --- | --- | --- | ---|--- | --- | --- |
+|中文|chinese and english|ch| |保加利亚文|Bulgarian |bg|
+|英文|english|en| |乌克兰文|Ukrainian|uk|
+|法文|french|fr| |白俄罗斯文|Belarusian|be|
+|德文|german|german| |泰卢固文|Telugu |te|
+|日文|japan|japan| |阿巴扎文|Abaza |abq|
+|韩文|korean|korean| |泰米尔文|Tamil |ta|
+|中文繁体|chinese traditional |ch_tra| |南非荷兰文 |Afrikaans |af|
+|意大利文| Italian |it| |阿塞拜疆文 |Azerbaijani |az|
+|西班牙文|Spanish |es| |波斯尼亚文|Bosnian|bs|
+|葡萄牙文| Portuguese|pt| |捷克文|Czech|cs|
+|俄罗斯文|Russia|ru| |威尔士文 |Welsh |cy|
+|阿拉伯文|Arabic|ar| |丹麦文 |Danish|da|
+|印地文|Hindi|hi| |爱沙尼亚文 |Estonian |et|
+|维吾尔|Uyghur|ug| |爱尔兰文 |Irish |ga|
+|波斯文|Persian|fa| |克罗地亚文|Croatian |hr|
+|乌尔都文|Urdu|ur| |匈牙利文|Hungarian |hu|
+|塞尔维亚文(latin)| Serbian(latin) |rs_latin| |印尼文|Indonesian|id|
+|欧西坦文|Occitan |oc| |冰岛文 |Icelandic|is|
+|马拉地文|Marathi|mr| |库尔德文 |Kurdish|ku|
+|尼泊尔文|Nepali|ne| |立陶宛文|Lithuanian |lt|
+|塞尔维亚文(cyrillic)|Serbian(cyrillic)|rs_cyrillic| |拉脱维亚文 |Latvian |lv|
+|毛利文|Maori|mi| | 达尔瓦文|Dargwa |dar|
+|马来文 |Malay|ms| | 因古什文|Ingush |inh|
+|马耳他文 |Maltese |mt| | 拉克文|Lak |lbe|
+|荷兰文 |Dutch |nl| | 莱兹甘文|Lezghian |lez|
+|挪威文 |Norwegian |no| |塔巴萨兰文 |Tabassaran |tab|
+|波兰文|Polish |pl| | 比尔哈文|Bihari |bh|
+| 罗马尼亚文|Romanian |ro| | 迈蒂利文|Maithili |mai|
+| 斯洛伐克文|Slovak |sk| | 昂加文|Angika |ang|
+| 斯洛文尼亚文|Slovenian |sl| | 博杰普尔文|Bhojpuri |bho|
+| 阿尔巴尼亚文|Albanian |sq| | 摩揭陀文 |Magahi |mah|
+| 瑞典文|Swedish |sv| | 那格浦尔文|Nagpur |sck|
+| 斯瓦希里文|Swahili |sw| | 尼瓦尔文|Newari |new|
+| 塔加洛文|Tagalog |tl| | 果阿孔卡尼文 |Goan Konkani|gom|
+| 土耳其文|Turkish |tr| | 沙特阿拉伯文|Saudi Arabia|sa|
+| 乌兹别克文|Uzbek |uz| | 阿瓦尔文|Avar |ava|
+| 越南文|Vietnamese |vi| | 卡巴丹文|Kabardian |kbd|
+| 蒙古文|Mongolian |mn| | 阿迪赫文|Adyghe |ady|
diff --git a/doc/doc_en/multi_languages_en.md b/doc/doc_en/multi_languages_en.md
index f801db5067e70e174491f41bc6ac5f9764364a0f..e58b782ca18d55dbd954382fd0df6f53910e2e52 100644
--- a/doc/doc_en/multi_languages_en.md
+++ b/doc/doc_en/multi_languages_en.md
@@ -13,7 +13,7 @@ Among them, the English model supports the detection and recognition of uppercas
letters and common punctuation, and the recognition of space characters is optimized:
-
+
The multilingual models cover Latin, Arabic, Traditional Chinese, Korean, Japanese, etc.:
@@ -21,6 +21,8 @@ The multilingual models cover Latin, Arabic, Traditional Chinese, Korean, Japane
This document will briefly introduce how to use the multilingual model.
@@ -31,14 +33,9 @@ This document will briefly introduce how to use the multilingual model.
- [2 Quick Use](#Quick_Use)
- [2.1 Command line operation](#Command_line_operation)
- - [2.1.1 Prediction of the whole image](#bash_detection+recognition)
- - [2.1.2 Recognition](#bash_Recognition)
- - [2.1.3 Detection](#bash_detection)
- [2.2 python script running](#python_Script_running)
- - [2.2.1 Whole image prediction](#python_detection+recognition)
- - [2.2.2 Recognition](#python_Recognition)
- - [2.2.3 Detection](#python_detection)
- [3 Custom Training](#Custom_Training)
+- [4 Inference and Deployment](#inference)
-- [4 Supported languages and abbreviations](#language_abbreviations)
+- [5 Supported languages and abbreviations](#language_abbreviations)
@@ -51,7 +48,7 @@ This document will briefly introduce how to use the multilingual model.
pip install paddlepaddle
# gpu
-pip instll paddlepaddle-gpu
+pip install paddlepaddle-gpu
```
@@ -89,7 +86,7 @@ The specific supported [language] (#language_abbreviations) can be viewed in the
paddleocr --image_dir doc/imgs/japan_2.jpg --lang=japan
```
-![](https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/release/2.1/doc/imgs/japan_2.jpg)
+![](https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/release/2.0/doc/imgs/japan_2.jpg)
The result is a list, each item contains a text box, text and recognition confidence
```text
@@ -106,7 +103,7 @@ The result is a list, each item contains a text box, text and recognition confid
paddleocr --image_dir doc/imgs_words/japan/1.jpg --det false --lang=japan
```
-![](https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/release/2.1/doc/imgs_words/japan/1.jpg)
+![](https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/release/2.0/doc/imgs_words/japan/1.jpg)
The result is a tuple, which returns the recognition result and recognition confidence
@@ -143,6 +140,9 @@ from paddleocr import PaddleOCR, draw_ocr
ocr = PaddleOCR(lang="korean") # The model file will be downloaded automatically when executed for the first time
img_path ='doc/imgs/korean_1.jpg'
result = ocr.ocr(img_path)
+# Recognition and detection can also be run separately by passing parameters:
+# result = ocr.ocr(img_path, det=False)  # recognition only
+# result = ocr.ocr(img_path, rec=False)  # detection only
# Print detection frame and recognition result
for line in result:
print(line)
@@ -162,54 +162,6 @@ Visualization of results:
![](https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/release/2.1/doc/imgs_results/korean.jpg)
-* Recognition
-
-```
-from paddleocr import PaddleOCR
-ocr = PaddleOCR(lang="german")
-img_path ='PaddleOCR/doc/imgs_words/german/1.jpg'
-result = ocr.ocr(img_path, det=False, cls=True)
-for line in result:
- print(line)
-```
-
-![](https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/release/2.1/doc/imgs_words/german/1.jpg)
-
-The result is a tuple, which only contains the recognition result and recognition confidence
-
-```
-('leider auch jetzt', 0.97538936)
-```
-
-* Detection
-
-```python
-from paddleocr import PaddleOCR, draw_ocr
-ocr = PaddleOCR() # need to run only once to download and load model into memory
-img_path ='PaddleOCR/doc/imgs_en/img_12.jpg'
-result = ocr.ocr(img_path, rec=False)
-for line in result:
- print(line)
-
-# show result
-from PIL import Image
-
-image = Image.open(img_path).convert('RGB')
-im_show = draw_ocr(image, result, txts=None, scores=None, font_path='/path/to/PaddleOCR/doc/fonts/simfang.ttf')
-im_show = Image.fromarray(im_show)
-im_show.save('result.jpg')
-```
-The result is a list, each item contains only text boxes
-```bash
-[[26.0, 457.0], [137.0, 457.0], [137.0, 477.0], [26.0, 477.0]]
-[[25.0, 425.0], [372.0, 425.0], [372.0, 448.0], [25.0, 448.0]]
-[[128.0, 397.0], [273.0, 397.0], [273.0, 414.0], [128.0, 414.0]]
-......
-```
-
-Visualization of results:
-![](https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/release/2.1/doc/imgs_results/whl/12_det.jpg)
-
ppocr also supports direction classification. For more usage methods, please refer to: [whl package instructions](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.0/doc/doc_ch/whl.md).
@@ -221,84 +173,61 @@ Modify the training data path, dictionary and other parameters.
For specific data preparation and training process, please refer to: [Text Detection](../doc_en/detection_en.md), [Text Recognition](../doc_en/recognition_en.md), more functions such as predictive deployment,
For functions such as data annotation, you can read the complete [Document Tutorial](../../README.md).
-
-## 4 Support languages and abbreviations
-
-| Language | Abbreviation |
-| --- | --- |
-|chinese and english|ch|
-|english|en|
-|french|fr|
-|german|german|
-|japan|japan|
-|korean|korean|
-|chinese traditional |chinese_cht|
-| Italian |it|
-|Spanish |es|
-| Portuguese|pt|
-|Russia|ru|
-|Arabic|ar|
-|Hindi|hi|
-|Uyghur|ug|
-|Persian|fa|
-|Urdu|ur|
-| Serbian(latin) |rs_latin|
-|Occitan |oc|
-|Marathi|mr|
-|Nepali|ne|
-|Serbian(cyrillic)|rs_cyrillic|
-|Bulgarian |bg|
-|Ukranian|uk|
-|Belarusian|be|
-|Telugu |te|
-|Tamil |ta|
-|Afrikaans |af|
-|Azerbaijani |az|
-|Bosnian|bs|
-|Czech|cs|
-|Welsh |cy|
-|Danish|da|
-|Estonian |et|
-|Irish |ga|
-|Croatian |hr|
-|Hungarian |hu|
-|Indonesian|id|
-|Icelandic|is|
-|Kurdish|ku|
-|Lithuanian |lt|
- |Latvian |lv|
-|Maori|mi|
-|Malay|ms|
-|Maltese |mt|
-|Dutch |nl|
-|Norwegian |no|
-|Polish |pl|
-|Romanian |ro|
-|Slovak |sk|
-|Slovenian |sl|
-|Albanian |sq|
-|Swedish |sv|
-|Swahili |sw|
-|Tagalog |tl|
-|Turkish |tr|
-|Uzbek |uz|
-|Vietnamese |vi|
-|Mongolian |mn|
-|Abaza |abq|
-|Adyghe |ady|
-|Kabardian |kbd|
-|Avar |ava|
-|Dargwa |dar|
-|Ingush |inh|
-|Lak |lbe|
-|Lezghian |lez|
-|Tabassaran |tab|
-|Bihari |bh|
-|Maithili |mai|
-|Angika |ang|
-|Bhojpuri |bho|
-|Magahi |mah|
-|Nagpur |sck|
-|Newari |new|
-|Goan Konkani|gom|
-|Saudi Arabia|sa|
+
+
+## 4 Inference and Deployment
+
+In addition to installing the whl package for quick prediction,
+ppocr also provides several inference and deployment methods.
+See the following documents as needed:
+
+- [Python Inference](./inference_en.md)
+- [C++ Inference](../../deploy/cpp_infer/readme_en.md)
+- [Serving](../../deploy/hubserving/readme_en.md)
+- [Mobile](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/deploy/lite/readme_en.md)
+- [Benchmark](./benchmark_en.md)
+
+
+
+## 5 Supported languages and abbreviations
+
+| Language | Abbreviation | | Language | Abbreviation |
+| --- | --- | --- | --- | --- |
+|chinese and english|ch| |Arabic|ar|
+|english|en| |Hindi|hi|
+|french|fr| |Uyghur|ug|
+|german|german| |Persian|fa|
+|japan|japan| |Urdu|ur|
+|korean|korean| | Serbian(latin) |rs_latin|
+|chinese traditional |ch_tra| |Occitan |oc|
+| Italian |it| |Marathi|mr|
+|Spanish |es| |Nepali|ne|
+| Portuguese|pt| |Serbian(cyrillic)|rs_cyrillic|
+|Russian|ru| |Bulgarian |bg|
+|Ukrainian|uk| |Estonian |et|
+|Belarusian|be| |Irish |ga|
+|Telugu |te| |Croatian |hr|
+|Saudi Arabia|sa| |Hungarian |hu|
+|Tamil |ta| |Indonesian|id|
+|Afrikaans |af| |Icelandic|is|
+|Azerbaijani |az| |Kurdish|ku|
+|Bosnian|bs| |Lithuanian |lt|
+|Czech|cs| |Latvian |lv|
+|Welsh |cy| |Maori|mi|
+|Danish|da| |Malay|ms|
+|Maltese |mt| |Adyghe |ady|
+|Dutch |nl| |Kabardian |kbd|
+|Norwegian |no| |Avar |ava|
+|Polish |pl| |Dargwa |dar|
+|Romanian |ro| |Ingush |inh|
+|Slovak |sk| |Lak |lbe|
+|Slovenian |sl| |Lezghian |lez|
+|Albanian |sq| |Tabassaran |tab|
+|Swedish |sv| |Bihari |bh|
+|Swahili |sw| |Maithili |mai|
+|Tagalog |tl| |Angika |ang|
+|Turkish |tr| |Bhojpuri |bho|
+|Uzbek |uz| |Magahi |mah|
+|Vietnamese |vi| |Nagpur |sck|
+|Mongolian |mn| |Newari |new|
+|Abaza |abq| |Goan Konkani|gom|
diff --git a/doc/imgs_results/multi_lang/arabic_0.jpg b/doc/imgs_results/multi_lang/arabic_0.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..9941b906427b8f08c076ecc47a328780bd857598
Binary files /dev/null and b/doc/imgs_results/multi_lang/arabic_0.jpg differ
diff --git a/doc/imgs_results/multi_lang/img_12.jpg b/doc/imgs_results/multi_lang/img_12.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..822d562eda747389157b8e49927a1841a193c9e7
Binary files /dev/null and b/doc/imgs_results/multi_lang/img_12.jpg differ
diff --git a/doc/imgs_results/multi_lang/korean_0.jpg b/doc/imgs_results/multi_lang/korean_0.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..3fe6305aa03edc5d6fe1bc10a140b55be619df72
Binary files /dev/null and b/doc/imgs_results/multi_lang/korean_0.jpg differ