Merge remote-tracking branch 'origin/release/2.5' into release2.5

88a1851e · qq_25193841 · 8c760e14 · 2909454f · 88a1851e · 88a1851e
11 changed files
--- a/README_ch.md
+++ b/README_ch.md
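@@ -73,7 +73,7 @@ PaddleOCR aims to build a rich, leading, and practical OCR tool library that helps
 - **Join the community👬:** Scan the QR code with WeChat and fill in the questionnaire to join the discussion group and receive the following benefits
  - **Replay links for the live course series on the latest PaddleOCR release, 《OCR超强技术详解与产业应用实战》 (In-Depth OCR Techniques and Industrial Applications in Practice)**
  - **10 GB OCR learning pack:** the e-book 《动手学OCR》 (Dive into OCR) with accompanying lecture videos and notebook projects; a bundle of 66 recent OCR papers from top conferences, including CVPR, AAAI, IJCAI, and ICCV; videos of the live courses for past PaddleOCR releases; and project-sharing videos from outstanding OCR community developers.
-- **Community contributions**🏅️: the [Community contributions](./doc/doc_ch/thirdparty.md) document lists the **tools and applications that community users have built with PaddleOCR** and the **features, documentation improvements, and code they have contributed to PaddleOCR**. It is the official wall of honor for community developers and a channel for promoting high-quality projects.
+- **Community projects**🏅️: the [Community projects](./doc/doc_ch/thirdparty.md) document lists the **tools and applications that community users have built with PaddleOCR** and the **features, documentation improvements, and code they have contributed to PaddleOCR**. It is the official wall of honor for community developers and a channel for promoting high-quality projects.
 - **Community regular competition**🎁: a points-based competition for OCR developers covering four categories: documentation, code, models, and applications. Entries are judged and rewarded quarterly; see [this link](https://github.com/PaddlePaddle/PaddleOCR/issues/4982) for task details and how to sign up.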
 <div align="center">
@@ -88,12 +88,9 @@ PaddleOCR aims to build a rich, leading, and practical OCR tool library that helps
 | ------------------------------------- | ----------------------- | --------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ |
 | Chinese and English ultra-lightweight PP-OCRv3 model (16.2M)     | ch_PP-OCRv3_xx          | Mobile & Server | [inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_distill_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_train.tar) |
 | English ultra-lightweight PP-OCRv3 model (13.4M)     | en_PP-OCRv3_xx          | Mobile & Server | [inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_det_distill_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_rec_train.tar) |
-| Chinese and English ultra-lightweight PP-OCRv2 model (13.0M)     | ch_PP-OCRv2_xx          | Mobile & Server | [inference model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_distill_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_train.tar) |
-| Chinese and English ultra-lightweight PP-OCR mobile model (9.4M) | ch_ppocr_mobile_v2.0_xx | Mobile & Server | [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_pre.tar) |
-| Chinese and English general PP-OCR server model (143.4M) | ch_ppocr_server_v2.0_xx | Server        | [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) | [inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_pre.tar) |
-For more model downloads (including multilingual models), see [PP-OCR series model downloads](./doc/doc_ch/models_list.md); for document analysis models, see [PP-Structure series model downloads](./ppstructure/docs/models_list.md)
+- For more models in the ultra-lightweight OCR series (including multilingual models), see [PP-OCR series model downloads](./doc/doc_ch/models_list.md); for document analysis models, see [PP-Structure series model downloads](./ppstructure/docs/models_list.md)
+- For the main vertical OCR applications in manufacturing, finance, and transportation (such as electricity meters, digital tube displays, LCD screens, real estate certificates, license plates, and the large SVTR model), see [scenario application model downloads](./applications)
 <a name="文档教程"></a>
 ## Tutorials

--- a/applications/README.md
+++ b/applications/README.md
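@@ -50,10 +50,10 @@ PaddleOCR scenario applications cover general use and the main manufacturing, finance, and transportation OCR
 ## Model downloads
-To download all the vertical models, scan the QR code below, follow the official account, and fill in the questionnaire. You can then join the official PaddleOCR discussion group and get the 20 GB OCR learning pack (including the e-book 《动手学OCR》 (Dive into OCR), course replay videos, recent papers, and other materials)
+To download the trained vertical models for the scenarios above, scan the QR code below, follow the official account, and fill in the questionnaire. You can then join the official PaddleOCR discussion group and get the 20 GB OCR learning pack (including the e-book 《动手学OCR》 (Dive into OCR), course replay videos, recent papers, and other materials)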
 <div align="center">
 <img src="https://ai-studio-static-online.cdn.bcebos.com/dd721099bd50478f9d5fb13d8dd00fad69c22d6848244fd3a1d3980d7fefc63e"  width = "150" height = "150" />
 </div>
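 If you are an enterprise developer and have not found a suitable solution among the scenarios above, you can fill in the [OCR application cooperation survey](https://paddle.wjx.cn/vj/QwF7GKw.aspx) to work with the official team free of charge at various levels, including but not limited to problem abstraction, choosing a technical approach, project Q&A, and joint development. If you have already shipped a project with PaddleOCR, you can also fill in this survey to promote it jointly with the PaddlePaddle platform and raise your company's technical profile. We look forward to your submission!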
\ No newline at end of file
--- a/deploy/android_demo/app/src/main/java/com/baidu/paddle/lite/demo/ocr/OCRPredictorNative.java
+++ b/deploy/android_demo/app/src/main/java/com/baidu/paddle/lite/demo/ocr/OCRPredictorNative.java
@@ -54,7 +54,7 @@ public class OCRPredictorNative {
    }
    public void destory() {
-        if (nativePointer > 0) {
+        if (nativePointer != 0) {
            release(nativePointer);
            nativePointer = 0;
        }
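
A note on why `!= 0` is the safer check: a native address held in a signed 64-bit Java `long` is negative whenever its top bit is set, so `> 0` would skip `release` for such a handle. The following is a minimal Python sketch of that reinterpretation, not code from this repository; the address value and the helper name `as_signed_64` are hypothetical and chosen only for illustration.

```python
import ctypes


def as_signed_64(address: int) -> int:
    """Reinterpret an unsigned 64-bit address as a signed 64-bit integer,
    which is how a Java `long` would hold it."""
    return ctypes.c_int64(address).value


# Hypothetical native address with the top bit set (illustrative only).
high_address = 0xFFFF800000000000
handle = as_signed_64(high_address)

old_check_releases = handle > 0    # the condition before this change
new_check_releases = handle != 0   # the condition after this change
print(handle, old_check_releases, new_check_releases)
```

With the old condition the handle above would never be released; with the new one only a zero (already released or never created) handle is skipped.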

--- a/deploy/lite/crnn_process.cc
+++ b/deploy/lite/crnn_process.cc
@@ -35,11 +35,13 @@ cv::Mat CrnnResizeImg(cv::Mat img, float wh_ratio, int rec_image_height) {
  else
    resize_w = int(ceilf(imgH * ratio));
+  cv::Mat resize_img;
  cv::resize(img, resize_img, cv::Size(resize_w, imgH), 0.f, 0.f,
             cv::INTER_LINEAR);
  cv::copyMakeBorder(resize_img, resize_img, 0, 0, 0,
                     int(imgW - resize_img.cols), cv::BORDER_CONSTANT,
                     {127, 127, 127});
+  return resize_img;
 }
 std::vector<std::string> ReadDict(std::string path) {

--- a/deploy/lite/ocr_db_crnn.cc
+++ b/deploy/lite/ocr_db_crnn.cc
@@ -474,7 +474,7 @@ void system(char **argv){
    std::vector<double> rec_times;
    RunRecModel(boxes, srcimg, rec_predictor, rec_text, rec_text_score,
-                charactor_dict, cls_predictor, use_direction_classify, &rec_times);
+                charactor_dict, cls_predictor, use_direction_classify, &rec_times, rec_image_height);
    //// visualization
    auto img_vis = Visualization(srcimg, boxes);

--- a/doc/doc_ch/inference_ppocr.md
+++ b/doc/doc_ch/inference_ppocr.md
@@ -7,7 +7,8 @@
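  - [1. Text detection model inference](#1-文本检测模型推理)
  - [2. Text recognition model inference](#2-文本识别模型推理)
    - [2.1 Ultra-lightweight Chinese recognition model inference](#21-超轻量中文识别模型推理)
-    - [2.2 Multilingual model inference](#22-多语言模型的推理)
+    - [2.2 English recognition model inference](#22-英文识别模型推理)
+    - [2.3 Multilingual model inference](#23-多语言模型的推理)
  - [3. Angle classification model inference](#3-方向分类模型推理)
  - [4. Text detection, angle classification, and text recognition in series](#4-文本检测方向分类和文字识别串联推理)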
@@ -78,9 +79,29 @@ python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words/ch/word_4.jpg"
 Predicts of ./doc/imgs_words/ch/word_4.jpg:('实力活力', 0.9956803321838379)
 ```
+<a name="英文识别模型推理"></a>
+### 2.2 English recognition model inference
+To run inference with the English recognition model, execute the following commands; remember to set the dictionary path accordingly:
+```
+# Download the English and digit recognition model:
+wget https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_rec_infer.tar
+tar xf en_PP-OCRv3_rec_infer.tar
+python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words/en/word_1.png" --rec_model_dir="./en_PP-OCRv3_rec_infer/" --rec_char_dict_path="ppocr/utils/en_dict.txt"
+```
+![](../imgs_words/en/word_1.png)
+After running the command, the prediction result for the image above is:
+```
+Predicts of ./doc/imgs_words/en/word_1.png: ('JOINT', 0.998160719871521)
+```
 <a name="多语言模型的推理"></a>
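-### 2.2 Multilingual model inference
+### 2.3 Multilingual model inference
 To run inference with a model for another language, you can find the inference model for that language at [this link](./models_list.md#%E5%A4%9A%E8%AF%AD%E8%A8%80%E8%AF%86%E5%88%AB%E6%A8%A1%E5%9E%8B). When predicting with the inference model, specify the dictionary path with `--rec_char_dict_path`; to get correct visualization results, also specify the font path with `--vis_font_path`. Fonts for less common languages are provided by default under `doc/fonts/`. For example, for Korean recognition: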
 ```

--- a/doc/doc_en/inference_ppocr_en.md
+++ b/doc/doc_en/inference_ppocr_en.md
@@ -8,7 +8,8 @@ This article introduces the use of the Python inference engine for the PP-OCR mo
  - [Text Detection Model Inference](#text-detection-model-inference)
  - [Text Recognition Model Inference](#text-recognition-model-inference)
    - [1. Lightweight Chinese Recognition Model Inference](#1-lightweight-chinese-recognition-model-inference)
-    - [2. Multilingual Model Inference](#2-multilingual-model-inference)
+    - [2. English Recognition Model Inference](#2-english-recognition-model-inference)
+    - [3. Multilingual Model Inference](#3-multilingual-model-inference)
  - [Angle Classification Model Inference](#angle-classification-model-inference)
  - [Text Detection Angle Classification and Recognition Inference Concatenation](#text-detection-angle-classification-and-recognition-inference-concatenation)
@@ -76,10 +77,31 @@ After executing the command, the prediction results (recognized text and score)
 ```bash
 Predicts of ./doc/imgs_words_en/word_10.png:('PAIN', 0.988671)
 ```
+<a name="2-english-recognition-model-inference"></a>
+### 2. English Recognition Model Inference
-<a name="MULTILINGUAL_MODEL_INFERENCE"></a>
+To run inference with the English recognition model, execute the following commands. You need to specify the dictionary path with `--rec_char_dict_path`:
-### 2. Multilingual Model Inference
+```
+# Download the English recognition model:
+wget https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_rec_infer.tar
+tar xf en_PP-OCRv3_rec_infer.tar
+python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words/en/word_1.png" --rec_model_dir="./en_PP-OCRv3_rec_infer/" --rec_char_dict_path="ppocr/utils/en_dict.txt"
+```
+![](../imgs_words/en/word_1.png)
+After executing the command, the prediction result of the above figure is:
+```
+Predicts of ./doc/imgs_words/en/word_1.png: ('JOINT', 0.998160719871521)
+```
+<a name="3-multilingual-model-inference"></a>
+### 3. Multilingual Model Inference
 If you need to run inference with a model for [another language](./models_list_en.md#Multilingual), specify the dictionary path with `--rec_char_dict_path`. To get correct visualization results,
 you also need to specify the font path with `--vis_font_path`. Fonts for less common languages are provided by default under the `doc/fonts` path. For example, for Korean recognition:

--- a/ppocr/metrics/rec_metric.py
+++ b/ppocr/metrics/rec_metric.py
@@ -12,7 +12,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
-import Levenshtein
+from rapidfuzz.distance import Levenshtein
 import string
@@ -45,8 +45,7 @@ class RecMetric(object):
            if self.is_filter:
                pred = self._normalize_text(pred)
                target = self._normalize_text(target)
-            norm_edit_dis += Levenshtein.distance(pred, target) / max(
+            norm_edit_dis += Levenshtein.normalized_distance(pred, target)
-                len(pred), len(target), 1)
            if pred == target:
                correct_num += 1
            all_num += 1
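
For reviewers who want to sanity-check the metric swap, the sketch below reimplements the removed expression (edit distance divided by `max(len(pred), len(target), 1)`) in pure Python. It does not call `rapidfuzz`; that `Levenshtein.normalized_distance` returns the same value under its default uniform weights is an assumption here, not something this snippet verifies. The function names are illustrative and the inputs may be strings or lists of tokens.

```python
from typing import Hashable, Sequence


def levenshtein(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Classic dynamic-programming edit distance with unit costs."""
    prev = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        curr = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution
        prev = curr
    return prev[-1]


def normalized_distance(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Edit distance divided by the longer length; 0.0 for two empty inputs."""
    return levenshtein(a, b) / max(len(a), len(b), 1)


print(normalized_distance("JOINT", "JOINS"))      # one substitution over five characters
print(normalized_distance("", ""))                # guarded against division by zero
print(normalized_distance(["a", "b"], ["####"]))  # token lists work as well
```

Running a few prediction/label pairs through both this reference and the new import is a cheap way to confirm that reported `norm_edit_dis` values do not shift after the dependency change.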

--- a/ppstructure/table/table_metric/table_metric.py
+++ b/ppstructure/table/table_metric/table_metric.py
@@ -9,7 +9,7 @@
 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 # Apache 2.0 License for more details.
-import distance
+from rapidfuzz.distance import Levenshtein
 from apted import APTED, Config
 from apted.helpers import Tree
 from lxml import etree, html
@@ -39,17 +39,6 @@ class TableTree(Tree):
 class CustomConfig(Config):
-    @staticmethod
-    def maximum(*sequences):
-        """Get maximum possible value
-        """
-        return max(map(len, sequences))
-    def normalized_distance(self, *sequences):
-        """Get distance from 0 to 1
-        """
-        return float(distance.levenshtein(*sequences)) / self.maximum(*sequences)
    def rename(self, node1, node2):
        """Compares attributes of trees"""
        #print(node1.tag)
@@ -58,23 +47,12 @@ class CustomConfig(Config):
        if node1.tag == 'td':
            if node1.content or node2.content:
                #print(node1.content, )
-                return self.normalized_distance(node1.content, node2.content)
+                return Levenshtein.normalized_distance(node1.content, node2.content)
        return 0.
 class CustomConfig_del_short(Config):
-    @staticmethod
-    def maximum(*sequences):
-        """Get maximum possible value
-        """
-        return max(map(len, sequences))
-    def normalized_distance(self, *sequences):
-        """Get distance from 0 to 1
-        """
-        return float(distance.levenshtein(*sequences)) / self.maximum(*sequences)
    def rename(self, node1, node2):
        """Compares attributes of trees"""
        if (node1.tag != node2.tag) or (node1.colspan != node2.colspan) or (node1.rowspan != node2.rowspan):
@@ -90,21 +68,10 @@ class CustomConfig_del_short(Config):
                    node1_content = ['####']
                if len(node2_content) < 3:
                    node2_content = ['####']   
-                return self.normalized_distance(node1_content, node2_content)
+                return Levenshtein.normalized_distance(node1_content, node2_content)
        return 0.
 class CustomConfig_del_block(Config):
-    @staticmethod
-    def maximum(*sequences):
-        """Get maximum possible value
-        """
-        return max(map(len, sequences))
-    def normalized_distance(self, *sequences):
-        """Get distance from 0 to 1
-        """
-        return float(distance.levenshtein(*sequences)) / self.maximum(*sequences)
    def rename(self, node1, node2):
        """Compares attributes of trees"""
        if (node1.tag != node2.tag) or (node1.colspan != node2.colspan) or (node1.rowspan != node2.rowspan):
@@ -120,7 +87,7 @@ class CustomConfig_del_block(Config):
                while ' ' in node2_content:
                    print(node2_content.index(' '))
                    node2_content.pop(node2_content.index(' '))
-                return self.normalized_distance(node1_content, node2_content)
+                return Levenshtein.normalized_distance(node1_content, node2_content)
        return 0.
 class TEDS(object):

--- a/ppstructure/vqa/tools/eval_with_label_end2end.py
+++ b/ppstructure/vqa/tools/eval_with_label_end2end.py
@@ -20,7 +20,7 @@ from shapely.geometry import Polygon
 import numpy as np
 from collections import defaultdict
 import operator
-import Levenshtein
+from rapidfuzz.distance import Levenshtein
 import argparse
 import json
 import copy

--- a/requirements.txt
+++ b/requirements.txt
@@ -6,7 +6,7 @@ lmdb
 tqdm
 numpy
 visualdl
-python-Levenshtein
+rapidfuzz
 opencv-contrib-python==4.4.0.46
 cython
 lxml