Merge branch 'release/2.5' of https://github.com/PaddlePaddle/PaddleOCR into release/2.5

92d01dd1 · andyjpaddle · 491e2ef3 · 460b1e87 · 92d01dd1 · 92d01dd1
10 changed files
--- a/PPOCRLabel/README.md
+++ b/PPOCRLabel/README.md
 English | [简体中文](README_ch.md)

-# PPOCRLabel
+# PPOCRLabelv2

-PPOCRLabel is a semi-automatic graphic annotation tool suitable for OCR field, with built-in PPOCR model to automatically detect and re-recognize data. It is written in python3 and pyqt5, supporting rectangular box, table and multi-point annotation modes. Annotations can be directly used for the training of PPOCR detection and recognition models.
+PPOCRLabelv2 is a semi-automatic graphical annotation tool for the OCR field, with a built-in PP-OCR model that automatically detects and re-recognizes data. It is written in Python3 and PyQt5 and supports rectangular box, table, irregular text, and key information annotation modes. Annotations can be used directly to train PP-OCR detection and recognition models.

-<img src="./data/gif/steps_en.gif" width="100%"/>
+| regular text annotation                              | table annotation                                       |
+| :-------------------------------------------------: | :--------------------------------------------: |
+| <img src="./data/gif/steps_en.gif" width="80%"/>    | <img src="./data/gif/table.gif" width="100%"/> |
+| **irregular text annotation**                        | **key information annotation**                               |
+| <img src="./data/gif/multi-point.gif" width="80%"/> | <img src="./data/gif/kie.gif" width="300%"/>   |

 ### Recent Update


--- a/PPOCRLabel/README_ch.md
+++ b/PPOCRLabel/README_ch.md
 [English](README.md) | 简体中文

-# PPOCRLabel
+# PPOCRLabelv2

-PPOCRLabel是一款适用于OCR领域的半自动化图形标注工具，内置PP-OCR模型对数据自动标注和重新识别。使用Python3和PyQT5编写，支持矩形框标注和四点标注模式，导出格式可直接用于PaddleOCR检测和识别模型的训练。
+PPOCRLabel是一款适用于OCR领域的半自动化图形标注工具，内置PP-OCR模型对数据自动标注和重新识别。使用Python3和PyQT5编写，支持矩形框标注、表格标注、不规则文本标注、关键信息标注模式，导出格式可直接用于PaddleOCR检测和识别模型的训练。
+
+| 常规标注                              | 表格标注                                      |
+| :---------------------------------------------------: | :----------------------------------------------: |
+| <img src="./data/gif/steps_en.gif" width="80%"/>    | <img src="./data/gif/table.gif" width="100%"/> |
+| **不规则文本标注**                        | **关键信息标注**                               |
+| <img src="./data/gif/multi-point.gif" width="80%"/> | <img src="./data/gif/kie.gif" width="300%"/>   |

-<img src="./data/gif/steps.gif" width="100%"/>

 #### 近期更新
 - 2022.05：新增表格标注，使用方法见下方`2.2 表格标注`（by [whjdark](https://github.com/peterh0323); [Evezerest](https://github.com/Evezerest))

--- a/PPOCRLabel/data/gif/kie.gif
+++ b/PPOCRLabel/data/gif/kie.gif
--- a/PPOCRLabel/data/gif/multi-point.gif
+++ b/PPOCRLabel/data/gif/multi-point.gif
--- a/PPOCRLabel/data/gif/table.gif
+++ b/PPOCRLabel/data/gif/table.gif
--- a/README.md
+++ b/README.md
@@ -26,13 +26,17 @@ PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools
 </div>

 ## Recent updates
- 2022.5.9 release PaddleOCR v2.5, including:
-    - [PP-OCRv3](./doc/doc_en/ppocr_introduction_en.md#pp-ocrv3): With comparable speed, the effect of Chinese scene is further improved by 5% compared with PP-OCRv2, the effect of English scene is improved by 11%, and the average recognition accuracy of 80 language multilingual models is improved by more than 5%.
-    - [PPOCRLabelv2](./PPOCRLabel): Add the annotation function for table recognition task, key information extraction task and irregular text image.
-    - Interactive e-book [*"Dive into OCR"*](./doc/doc_en/ocr_book_en.md), covers the cutting-edge theory and code practice of OCR full stack technology.
- 2021.12.21 release PaddleOCR v2.4, release 1 text detection algorithm (PSENet), 3 text recognition algorithms (NRTR、SEED、SAR), 1 key information extraction algorithm (SDMGR, [tutorial](./ppstructure/docs/kie_en.md)) and 3 DocVQA algorithms (LayoutLM, LayoutLMv2, LayoutXLM, [tutorial](./ppstructure/vqa)).
- 2021.9.7 release PaddleOCR v2.3, [PP-OCRv2](./doc/doc_en/ppocr_introduction_en.md#pp-ocrv2) is proposed. The inference speed of PP-OCRv2 is 220% higher than that of PP-OCR server in CPU device. The F-score of PP-OCRv2 is 7% higher than that of PP-OCR mobile.
- 2021.8.3 released PaddleOCR v2.2, add a new structured documents analysis toolkit, i.e., [PP-Structure](./ppstructure/README.md), support layout analysis and table recognition (One-key to export chart images to Excel files).
+- **🔥2022.5.9 Release PaddleOCR [release/2.5](https://github.com/PaddlePaddle/PaddleOCR/tree/release/2.5)**
+    - Release [PP-OCRv3](./doc/doc_en/ppocr_introduction_en.md#pp-ocrv3): At comparable speed, results on Chinese scenes improve by a further 5% over PP-OCRv2, results on English scenes improve by 11%, and the average recognition accuracy of the 80 multilingual models improves by more than 5%.
+    - Release [PPOCRLabelv2](./PPOCRLabel): Add annotation functions for the table recognition task, the key information extraction task, and irregular text images.
+    - Release the interactive e-book [*"Dive into OCR"*](./doc/doc_en/ocr_book_en.md), which covers the cutting-edge theory and code practice of full-stack OCR technology.
+- 2021.12.21 Release PaddleOCR [release/2.4](https://github.com/PaddlePaddle/PaddleOCR/tree/release/2.4)
+    - Release 1 text detection algorithm ([PSENet](./doc/doc_en/algorithm_det_psenet_en.md)) and 3 text recognition algorithms ([NRTR](./doc/doc_en/algorithm_rec_nrtr_en.md), [SEED](./doc/doc_en/algorithm_rec_seed_en.md), [SAR](./doc/doc_en/algorithm_rec_sar_en.md)).
+    - Release 1 key information extraction algorithm [SDMGR](./ppstructure/docs/kie_en.md) and 3 [DocVQA](./ppstructure/vqa) algorithms (LayoutLM, LayoutLMv2, LayoutXLM).
+- 2021.9.7 Release PaddleOCR [release/2.3](https://github.com/PaddlePaddle/PaddleOCR/tree/release/2.3)
+    - Release [PP-OCRv2](./doc/doc_en/ppocr_introduction_en.md#pp-ocrv2). On CPU devices, the inference speed of PP-OCRv2 is 220% higher than that of PP-OCR server. The F-score of PP-OCRv2 is 7% higher than that of PP-OCR mobile.
+- 2021.8.3 Release PaddleOCR [release/2.2](https://github.com/PaddlePaddle/PaddleOCR/tree/release/2.2)
+    - Release a new structured document analysis toolkit, [PP-Structure](./ppstructure/README.md), which supports layout analysis and table recognition (one-click export of chart images to Excel files).

 - [more](./doc/doc_en/update_en.md)


--- a/README_ch.md
+++ b/README_ch.md
@@ -27,14 +27,28 @@ PaddleOCR旨在打造一套丰富、领先、且实用的OCR工具库，助力

 ## 近期更新

- 2022.5.9 发布PaddleOCR v2.5。研发团队将于5.11~5.13带来三日直播课详细解读，扫描下文二维码入群[获取直播课链接](#开源社区)。发布内容包括：
-    - [PP-OCRv3](./doc/doc_ch/ppocr_introduction.md#pp-ocrv3)，速度可比情况下，中文场景效果相比于PP-OCRv2再提升5%，英文场景提升11%，80语种多语言模型平均识别准确率提升5%以上；
-    - 半自动标注工具[PPOCRLabelv2](./PPOCRLabel)：新增表格文字图像、图像关键信息抽取任务和不规则文字图像的标注功能；
-    - OCR产业落地工具集：打通22种训练部署软硬件环境与方式，覆盖企业90%的训练部署环境需求
-    - 交互式OCR开源电子书[《动手学OCR》](./doc/doc_ch/ocr_book.md)，覆盖OCR全栈技术的前沿理论与代码实践，并配套教学视频。
- 2021.12.21 发布PaddleOCR v2.4。OCR算法新增1种文本检测算法（PSENet），3种文本识别算法（NRTR、SEED、SAR）；文档结构化算法新增1种关键信息提取算法（SDMGR，[文档](./ppstructure/docs/kie.md)），3种DocVQA算法（LayoutLM、LayoutLMv2，LayoutXLM，[文档](./ppstructure/vqa)）。
- 2021.9.7 发布PaddleOCR v2.3与[PP-OCRv2](./doc/doc_ch/ppocr_introduction.md#pp-ocrv2)，CPU推理速度相比于PP-OCR server提升220%；效果相比于PP-OCR mobile 提升7%。
- 2021.8.3 发布PaddleOCR v2.2，新增文档结构分析[PP-Structure](./ppstructure/README_ch.md)工具包，支持版面分析与表格识别（含Excel导出）。
+- **🔥2022.5.11~13 每晚8：30【超强OCR技术详解与产业应用实战】三日直播课**
+  - 11日：开源最强OCR系统PP-OCRv3揭秘
+  - 12日：云边端全覆盖的PP-OCRv3训练部署实战
+  - 13日：OCR产业应用全流程拆解与实战
+  
+   赶紧扫码报名吧！
+<div align="center">
+<img src="https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/dygraph/doc/joinus.PNG"  width = "150" height = "150" />
+</div>
+
+- **🔥2022.5.9 发布PaddleOCR [release/2.5](https://github.com/PaddlePaddle/PaddleOCR/tree/release/2.5)**
+    - 发布[PP-OCRv3](./doc/doc_ch/ppocr_introduction.md#pp-ocrv3)，速度可比情况下，中文场景效果相比于PP-OCRv2再提升5%，英文场景提升11%，80语种多语言模型平均识别准确率提升5%以上；
+    - 发布半自动标注工具[PPOCRLabelv2](./PPOCRLabel)：新增表格文字图像、图像关键信息抽取任务和不规则文字图像的标注功能；
+    - 发布OCR产业落地工具集：打通22种训练部署软硬件环境与方式，覆盖企业90%的训练部署环境需求；
+    - 发布交互式OCR开源电子书[《动手学OCR》](./doc/doc_ch/ocr_book.md)，覆盖OCR全栈技术的前沿理论与代码实践，并配套教学视频。
+- 2021.12.21 发布PaddleOCR [release/2.4](https://github.com/PaddlePaddle/PaddleOCR/tree/release/2.4)
+    - OCR算法新增1种文本检测算法（[PSENet](./doc/doc_ch/algorithm_det_psenet.md)），3种文本识别算法（[NRTR](./doc/doc_ch/algorithm_rec_nrtr.md)、[SEED](./doc/doc_ch/algorithm_rec_seed.md)、[SAR](./doc/doc_ch/algorithm_rec_sar.md)）；
+    - 文档结构化算法新增1种关键信息提取算法（[SDMGR](./ppstructure/docs/kie.md)），3种[DocVQA](./ppstructure/vqa)算法（LayoutLM、LayoutLMv2，LayoutXLM）。
+- 2021.9.7 发布PaddleOCR [release/2.3](https://github.com/PaddlePaddle/PaddleOCR/tree/release/2.3)
+    - 发布[PP-OCRv2](./doc/doc_ch/ppocr_introduction.md#pp-ocrv2)，CPU推理速度相比于PP-OCR server提升220%；效果相比于PP-OCR mobile 提升7%。
+- 2021.8.3 发布PaddleOCR [release/2.2](https://github.com/PaddlePaddle/PaddleOCR/tree/release/2.2)
+    - 发布文档结构分析[PP-Structure](./ppstructure/README_ch.md)工具包，支持版面分析与表格识别（含Excel导出）。

 > [更多](./doc/doc_ch/update.md)

@@ -87,6 +101,7 @@ PaddleOCR旨在打造一套丰富、领先、且实用的OCR工具库，助力
 更多模型下载（包括多语言），可以参考[PP-OCR 系列模型下载](./doc/doc_ch/models_list.md)，文档分析相关模型参考[PP-Structure 系列模型下载](./ppstructure/docs/models_list.md)


+<a name="文档教程"></a>
 ## 文档教程

 - [运行环境准备](./doc/doc_ch/environment.md)

--- a/doc/doc_ch/ppocr_introduction.md
+++ b/doc/doc_ch/ppocr_introduction.md
@@ -71,38 +71,28 @@ PP-OCRv3系统pipeline如下：
 ## 4. 效果展示 [more](./visualization.md)

 <details open>
-<summary>PP-OCRv2 中文模型</summary>
-
-<div align="center">
-      <img src="../imgs_results/ch_ppocr_mobile_v2.0/test_add_91.jpg" width="800">
-      <img src="../imgs_results/ch_ppocr_mobile_v2.0/00018069.jpg" width="800">
-</div>
+<summary>PP-OCRv3 中文模型</summary>
 <div align="center">
-    <img src="../imgs_results/ch_ppocr_mobile_v2.0/00056221.jpg" width="800">
-    <img src="../imgs_results/ch_ppocr_mobile_v2.0/rotate_00052204.jpg" width="800">
+    <img src="../imgs_results/PP-OCRv3/ch/PP-OCRv3-pic001.jpg" width="800">
+    <img src="../imgs_results/PP-OCRv3/ch/PP-OCRv3-pic002.jpg" width="800">
+    <img src="../imgs_results/PP-OCRv3/ch/PP-OCRv3-pic003.jpg" width="800">
 </div>
-
 </details>

-
 <details open>
-<summary>PP-OCRv2 英文模型</summary>
-
+<summary>PP-OCRv3 英文模型</summary>
 <div align="center">
-    <img src="../imgs_results/ch_ppocr_mobile_v2.0/img_12.jpg" width="800">
+    <img src="../imgs_results/PP-OCRv3/en/en_1.png" width="800">
+    <img src="../imgs_results/PP-OCRv3/en/en_2.png" width="800">
 </div>
-
 </details>

-
 <details open>
-<summary>PP-OCRv2 其他语言模型</summary>
-
+<summary>PP-OCRv3 多语言模型</summary>
 <div align="center">
-    <img src="../imgs_results/french_0.jpg" width="800">
-    <img src="../imgs_results/korean.jpg" width="800">
+    <img src="../imgs_results/PP-OCRv3/multi_lang/japan_2.jpg" width="800">
+    <img src="../imgs_results/PP-OCRv3/multi_lang/korean_1.jpg" width="800">
 </div>
-
 </details>



--- a/doc/doc_en/ppocr_introduction_en.md
+++ b/doc/doc_en/ppocr_introduction_en.md
@@ -67,36 +67,28 @@ For the performance comparison between PP-OCR series models, please check the [b
 ## 4. Visualization [more](./visualization.md)

 <details open>
-<summary>PP-OCRv2 English model</summary>
-
+<summary>PP-OCRv3 Chinese model</summary>
 <div align="center">
-    <img src="../imgs_results/ch_ppocr_mobile_v2.0/img_12.jpg" width="800">
+    <img src="../imgs_results/PP-OCRv3/ch/PP-OCRv3-pic001.jpg" width="800">
+    <img src="../imgs_results/PP-OCRv3/ch/PP-OCRv3-pic002.jpg" width="800">
+    <img src="../imgs_results/PP-OCRv3/ch/PP-OCRv3-pic003.jpg" width="800">
 </div>
-
 </details>

 <details open>
-<summary>PP-OCRv2 Chinese model</summary>
-
-<div align="center">
-      <img src="../imgs_results/ch_ppocr_mobile_v2.0/test_add_91.jpg" width="800">
-      <img src="../imgs_results/ch_ppocr_mobile_v2.0/00018069.jpg" width="800">
-</div>
+<summary>PP-OCRv3 English model</summary>
 <div align="center">
-    <img src="../imgs_results/ch_ppocr_mobile_v2.0/00056221.jpg" width="800">
-    <img src="../imgs_results/ch_ppocr_mobile_v2.0/rotate_00052204.jpg" width="800">
+    <img src="../imgs_results/PP-OCRv3/en/en_1.png" width="800">
+    <img src="../imgs_results/PP-OCRv3/en/en_2.png" width="800">
 </div>
-
 </details>

 <details open>
-<summary>PP-OCRv2 Multilingual model</summary>
-
+<summary>PP-OCRv3 Multilingual model</summary>
 <div align="center">
-    <img src="../imgs_results/french_0.jpg" width="800">
-    <img src="../imgs_results/korean.jpg" width="800">
+    <img src="../imgs_results/PP-OCRv3/multi_lang/japan_2.jpg" width="800">
+    <img src="../imgs_results/PP-OCRv3/multi_lang/korean_1.jpg" width="800">
 </div>
-
 </details>



--- a/ppocr/modeling/heads/rec_multi_head.py
+++ b/ppocr/modeling/heads/rec_multi_head.py
@@ -10,7 +10,7 @@
 # distributed under the License is distributed on an "AS IS" BASIS,
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
-# limitations under the License.
+# limitations under the License. 

 from __future__ import absolute_import
 from __future__ import division