diff --git a/modules/image/text_recognition/german_ocr_db_crnn_mobile/README.md b/modules/image/text_recognition/german_ocr_db_crnn_mobile/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..d5cfe848f7c27281e82789787ffc2688f643af52
--- /dev/null
+++ b/modules/image/text_recognition/german_ocr_db_crnn_mobile/README.md
@@ -0,0 +1,171 @@
+# german_ocr_db_crnn_mobile
+
+|Module Name|german_ocr_db_crnn_mobile|
+| :--- | :---: |
+|Category|Image - Text Recognition|
+|Network|Differentiable Binarization + CRNN|
+|Dataset|ICDAR2015|
+|Fine-tuning supported|No|
+|Module Size|3.8MB|
+|Latest update date|2021-02-26|
+|Data metrics|-|
+
+
+## I. Basic Information
+
+- ### Application Effect Display
+  - Sample result:
+
+
+
+
+- ### Module Introduction
+
+  - german_ocr_db_crnn_mobile is a Module for recognizing German text in images. It first detects text boxes with chinese_text_detection_db_mobile, then recognizes the German text inside each box. The recognition algorithm is CRNN (Convolutional Recurrent Neural Network), a combination of DCNN and RNN designed for recognizing sequence-like objects in images. Trained with CTC loss, it can learn directly from word- or line-level annotations without detailed character-level labels. This Module is a lightweight German OCR model and supports direct prediction.
+
+## II. Installation
+
+- ### 1. Environment Dependencies
+
+  - paddlepaddle >= 1.8.0
+
+  - paddlehub >= 1.8.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
+
+ - shapely
+
+ - pyclipper
+
+ - ```shell
+ $ pip install shapely pyclipper
+ ```
+  - **This Module depends on the third-party libraries shapely and pyclipper. Please install them before using this Module.**
+
+- ### 2. Installation
+
+ - ```shell
+ $ hub install german_ocr_db_crnn_mobile
+ ```
+  - In case of any problems during installation, please refer to: [Windows quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
+  | [Linux quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [MacOS quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
+
+## III. Module API Prediction
+
+- ### 1. Command Line Prediction
+
+ - ```shell
+ $ hub run german_ocr_db_crnn_mobile --input_path "/PATH/TO/IMAGE"
+ ```
+  - This invokes the text recognition Module from the command line. For more information, see [PaddleHub Command Line Instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
+
+
+- ### 2. Prediction Code Example
+
+ - ```python
+ import paddlehub as hub
+ import cv2
+
+ ocr = hub.Module(name="german_ocr_db_crnn_mobile", enable_mkldnn=True) # mkldnn加速仅在CPU下有效
+ result = ocr.recognize_text(images=[cv2.imread('/PATH/TO/IMAGE')])
+
+ # or
+ # result = ocr.recognize_text(paths=['/PATH/TO/IMAGE'])
+ ```
+
+- ### 3. API
+
+ - ```python
+    def __init__(text_detector_module=None, enable_mkldnn=False, use_angle_classification=False)
+ ```
+
+    - Construct a GermanOCRDBCRNNMobile object
+
+    - **Parameters**
+
+      - text_detector_module(str): name of the text detection PaddleHub Module. If None, the [chinese_text_detection_db_mobile Module](../chinese_text_detection_db_mobile/) is used by default. It detects the text boxes in an image.
+      - enable_mkldnn(bool): whether to enable MKL-DNN to accelerate CPU computation. Only effective when running on CPU. Default is False.
+      - use_angle_classification(bool): whether to run the text angle classifier on each detected box. Default is False.
+
+ - ```python
+ def recognize_text(images=[],
+ paths=[],
+ use_gpu=False,
+ output_dir='ocr_result',
+ visualization=False,
+ box_thresh=0.5,
+ text_thresh=0.5,
+ angle_classification_thresh=0.9)
+ ```
+
+    - Prediction API for locating and recognizing all German text in the input images.
+
+    - **Parameters**
+
+      - paths (list\[str\]): image paths;
+      - images (list\[numpy.ndarray\]): image data, ndarray.shape is \[H, W, C\], in BGR format;
+      - use\_gpu (bool): whether to use GPU; **if GPU is used, set the CUDA_VISIBLE_DEVICES environment variable first**
+      - box\_thresh (float): confidence threshold for detected text boxes;
+      - text\_thresh (float): confidence threshold for recognized German text;
+      - angle_classification_thresh(float): confidence threshold for text angle classification;
+      - visualization (bool): whether to save the recognition results as image files;
+      - output\_dir (str): directory for saving output images, default ocr\_result;
+
+    - **Return**
+
+      - res (list\[dict\]): list of recognition results, one dict per input image, with fields:
+        - data (list\[dict\]): recognized texts, one dict per text box, with fields:
+          - text(str): the recognized text
+          - confidence(float): the confidence of the recognized text
+          - text_box_position(list): pixel coordinates of the text box in the original image, a 4\*2 matrix giving the bottom-left, bottom-right, top-right and top-left vertices in order.
+          If nothing is recognized, data is \[\]
+        - save_path (str, optional): path where the visualized result is saved; '' if no image is saved
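+
+  - As an illustration of the returned structure, a result for a single image might look like the sketch below (the values are made up, not actual model output):
+
+  - ```python
+    [{'save_path': '',
+      'data': [{'text': 'Hallo', 'confidence': 0.97,
+                'text_box_position': [[10, 60], [120, 60], [120, 20], [10, 20]]}]}]
+    ```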
+
+
+
+## IV. Server Deployment
+
+- PaddleHub Serving can deploy an online text recognition service.
+
+- ### Step 1: Start PaddleHub Serving
+
+  - Run the startup command:
+  - ```shell
+    $ hub serving start -m german_ocr_db_crnn_mobile
+    ```
+
+  - This deploys a text recognition service API, with the default port number 8866.
+
+  - **NOTE:** If you want to use GPU prediction, set the CUDA\_VISIBLE\_DEVICES environment variable before starting the service; otherwise it is not needed.
+
+- ### Step 2: Send a prediction request
+
+  - With the server configured, the following lines of code send a prediction request and fetch the result
+
+ - ```python
+ import requests
+ import json
+ import cv2
+ import base64
+
+    def cv2_to_base64(image):
+        data = cv2.imencode('.jpg', image)[1]
+        return base64.b64encode(data.tobytes()).decode('utf8')
+
+    # Send an HTTP request
+ data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
+ headers = {"Content-type": "application/json"}
+ url = "http://127.0.0.1:8866/predict/german_ocr_db_crnn_mobile"
+ r = requests.post(url=url, headers=headers, data=json.dumps(data))
+
+    # print the prediction results
+ print(r.json()["results"])
+ ```
+
+
+## V. Release Note
+
+* 1.0.0
+
+  First release
+
+ - ```shell
+ $ hub install german_ocr_db_crnn_mobile==1.0.0
+ ```
diff --git a/modules/image/text_recognition/german_ocr_db_crnn_mobile/__init__.py b/modules/image/text_recognition/german_ocr_db_crnn_mobile/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
diff --git a/modules/image/text_recognition/german_ocr_db_crnn_mobile/assets/german.ttf b/modules/image/text_recognition/german_ocr_db_crnn_mobile/assets/german.ttf
new file mode 100644
index 0000000000000000000000000000000000000000..ab68fb197d4479b3b6dec6e85bd5cbaf433a87c5
Binary files /dev/null and b/modules/image/text_recognition/german_ocr_db_crnn_mobile/assets/german.ttf differ
diff --git a/modules/image/text_recognition/german_ocr_db_crnn_mobile/assets/german_dict.txt b/modules/image/text_recognition/german_ocr_db_crnn_mobile/assets/german_dict.txt
new file mode 100644
index 0000000000000000000000000000000000000000..30c4d4218e8a77386db912e24117b1f197466e83
--- /dev/null
+++ b/modules/image/text_recognition/german_ocr_db_crnn_mobile/assets/german_dict.txt
@@ -0,0 +1,131 @@
+!
+"
+$
+%
+&
+'
+(
+)
++
+,
+-
+.
+/
+0
+1
+2
+3
+4
+5
+6
+7
+8
+9
+:
+;
+>
+?
+A
+B
+C
+D
+E
+F
+G
+H
+I
+J
+K
+L
+M
+N
+O
+P
+Q
+R
+S
+T
+U
+V
+W
+X
+Y
+Z
+[
+]
+a
+b
+c
+d
+e
+f
+g
+h
+i
+j
+k
+l
+m
+n
+o
+p
+q
+r
+s
+t
+u
+v
+w
+x
+y
+z
+£
+§
+
+²
+´
+µ
+·
+º
+¼
+½
+¿
+À
+Á
+Ä
+Å
+Ç
+É
+Í
+Ï
+Ô
+Ö
+Ø
+Ù
+Ü
+ß
+à
+á
+â
+ã
+ä
+å
+æ
+ç
+è
+é
+ê
+ë
+í
+ï
+ñ
+ò
+ó
+ô
+ö
+ø
+ù
+ú
+û
+ü
+
diff --git a/modules/image/text_recognition/german_ocr_db_crnn_mobile/character.py b/modules/image/text_recognition/german_ocr_db_crnn_mobile/character.py
new file mode 100644
index 0000000000000000000000000000000000000000..21dbbd9dc790e3d009f45c1ef1b68c001e9f0e0b
--- /dev/null
+++ b/modules/image/text_recognition/german_ocr_db_crnn_mobile/character.py
@@ -0,0 +1,213 @@
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import numpy as np
+import string
+
+class CharacterOps(object):
+ """ Convert between text-label and text-index """
+
+ def __init__(self, config):
+ self.character_type = config['character_type']
+ self.loss_type = config['loss_type']
+ self.max_text_len = config['max_text_length']
+ if self.character_type == "en":
+ self.character_str = "0123456789abcdefghijklmnopqrstuvwxyz"
+ dict_character = list(self.character_str)
+ elif self.character_type in [
+ "ch", 'japan', 'korean', 'french', 'german'
+ ]:
+ character_dict_path = config['character_dict_path']
+ add_space = False
+ if 'use_space_char' in config:
+ add_space = config['use_space_char']
+ self.character_str = ""
+ with open(character_dict_path, "rb") as fin:
+ lines = fin.readlines()
+ for line in lines:
+ line = line.decode('utf-8').strip("\n").strip("\r\n")
+ self.character_str += line
+ if add_space:
+ self.character_str += " "
+ dict_character = list(self.character_str)
+ elif self.character_type == "en_sensitive":
+ # same with ASTER setting (use 94 char).
+ self.character_str = string.printable[:-6]
+ dict_character = list(self.character_str)
+ else:
+ self.character_str = None
+        assert self.character_str is not None, \
+            "Unsupported character type: {}".format(self.character_type)
+ self.beg_str = "sos"
+ self.end_str = "eos"
+ if self.loss_type == "attention":
+ dict_character = [self.beg_str, self.end_str] + dict_character
+ elif self.loss_type == "srn":
+ dict_character = dict_character + [self.beg_str, self.end_str]
+ self.dict = {}
+ for i, char in enumerate(dict_character):
+ self.dict[char] = i
+ self.character = dict_character
+
+ def encode(self, text):
+ """convert text-label into text-index.
+ input:
+ text: text labels of each image. [batch_size]
+
+ output:
+ text: concatenated text index for CTCLoss.
+ [sum(text_lengths)] = [text_index_0 + text_index_1 + ... + text_index_(n - 1)]
+ length: length of each text. [batch_size]
+ """
+ if self.character_type == "en":
+ text = text.lower()
+
+ text_list = []
+ for char in text:
+ if char not in self.dict:
+ continue
+ text_list.append(self.dict[char])
+ text = np.array(text_list)
+ return text
+
+ def decode(self, text_index, is_remove_duplicate=False):
+ """ convert text-index into text-label. """
+ char_list = []
+ char_num = self.get_char_num()
+
+ if self.loss_type == "attention":
+ beg_idx = self.get_beg_end_flag_idx("beg")
+ end_idx = self.get_beg_end_flag_idx("end")
+ ignored_tokens = [beg_idx, end_idx]
+ else:
+ ignored_tokens = [char_num]
+
+ for idx in range(len(text_index)):
+ if text_index[idx] in ignored_tokens:
+ continue
+ if is_remove_duplicate:
+ if idx > 0 and text_index[idx - 1] == text_index[idx]:
+ continue
+ char_list.append(self.character[int(text_index[idx])])
+ text = ''.join(char_list)
+ return text
+
+ def get_char_num(self):
+ return len(self.character)
+
+ def get_beg_end_flag_idx(self, beg_or_end):
+ if self.loss_type == "attention":
+ if beg_or_end == "beg":
+ idx = np.array(self.dict[self.beg_str])
+ elif beg_or_end == "end":
+ idx = np.array(self.dict[self.end_str])
+ else:
+ assert False, "Unsupport type %s in get_beg_end_flag_idx"\
+ % beg_or_end
+ return idx
+ else:
+ err = "error in get_beg_end_flag_idx when using the loss %s"\
+ % (self.loss_type)
+ assert False, err
+
+
+def cal_predicts_accuracy(char_ops,
+ preds,
+ preds_lod,
+ labels,
+ labels_lod,
+ is_remove_duplicate=False):
+ acc_num = 0
+ img_num = 0
+ for ino in range(len(labels_lod) - 1):
+ beg_no = preds_lod[ino]
+ end_no = preds_lod[ino + 1]
+ preds_text = preds[beg_no:end_no].reshape(-1)
+ preds_text = char_ops.decode(preds_text, is_remove_duplicate)
+
+ beg_no = labels_lod[ino]
+ end_no = labels_lod[ino + 1]
+ labels_text = labels[beg_no:end_no].reshape(-1)
+ labels_text = char_ops.decode(labels_text, is_remove_duplicate)
+ img_num += 1
+
+ if preds_text == labels_text:
+ acc_num += 1
+ acc = acc_num * 1.0 / img_num
+ return acc, acc_num, img_num
+
+
+def cal_predicts_accuracy_srn(char_ops,
+ preds,
+ labels,
+ max_text_len,
+ is_debug=False):
+ acc_num = 0
+ img_num = 0
+
+ char_num = char_ops.get_char_num()
+
+ total_len = preds.shape[0]
+ img_num = int(total_len / max_text_len)
+ for i in range(img_num):
+ cur_label = []
+ cur_pred = []
+ for j in range(max_text_len):
+ if labels[j + i * max_text_len] != int(char_num - 1): #0
+ cur_label.append(labels[j + i * max_text_len][0])
+ else:
+ break
+
+ for j in range(max_text_len + 1):
+ if j < len(cur_label) and preds[j + i * max_text_len][
+ 0] != cur_label[j]:
+ break
+ elif j == len(cur_label) and j == max_text_len:
+ acc_num += 1
+ break
+ elif j == len(cur_label) and preds[j + i * max_text_len][0] == int(
+ char_num - 1):
+ acc_num += 1
+ break
+ acc = acc_num * 1.0 / img_num
+ return acc, acc_num, img_num
+
+
+def convert_rec_attention_infer_res(preds):
+ img_num = preds.shape[0]
+ target_lod = [0]
+ convert_ids = []
+ for ino in range(img_num):
+ end_pos = np.where(preds[ino, :] == 1)[0]
+ if len(end_pos) <= 1:
+ text_list = preds[ino, 1:]
+ else:
+ text_list = preds[ino, 1:end_pos[1]]
+ target_lod.append(target_lod[ino] + len(text_list))
+ convert_ids = convert_ids + list(text_list)
+ convert_ids = np.array(convert_ids)
+ convert_ids = convert_ids.reshape((-1, 1))
+ return convert_ids, target_lod
+
+
+def convert_rec_label_to_lod(ori_labels):
+ img_num = len(ori_labels)
+ target_lod = [0]
+ convert_ids = []
+ for ino in range(img_num):
+ target_lod.append(target_lod[ino] + len(ori_labels[ino]))
+ convert_ids = convert_ids + list(ori_labels[ino])
+ convert_ids = np.array(convert_ids)
+ convert_ids = convert_ids.reshape((-1, 1))
+ return convert_ids, target_lod
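+
+
+if __name__ == '__main__':
+    # Minimal usage sketch (illustrative, not part of the module API): round-trip
+    # a German string through CharacterOps with the CTC configuration used by
+    # german_ocr_db_crnn_mobile. Assumes assets/german_dict.txt sits next to
+    # this file, as it does in this module.
+    import os
+    demo_config = {
+        'character_type': 'german',
+        'character_dict_path': os.path.join(
+            os.path.dirname(os.path.abspath(__file__)), 'assets',
+            'german_dict.txt'),
+        'loss_type': 'ctc',
+        'max_text_length': 25,
+        'use_space_char': True
+    }
+    ops = CharacterOps(demo_config)
+    indices = ops.encode('Hallo Welt')  # per-character indices into the dict
+    print(indices)
+    print(ops.decode(indices))  # -> 'Hallo Welt' (characters not in the dict are dropped)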
diff --git a/modules/image/text_recognition/german_ocr_db_crnn_mobile/module.py b/modules/image/text_recognition/german_ocr_db_crnn_mobile/module.py
new file mode 100644
index 0000000000000000000000000000000000000000..6b59d274faa7a583851369a38fb73756dfcbcebe
--- /dev/null
+++ b/modules/image/text_recognition/german_ocr_db_crnn_mobile/module.py
@@ -0,0 +1,591 @@
+# -*- coding:utf-8 -*-
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import argparse
+import ast
+import copy
+import math
+import os
+import time
+
+from paddle.fluid.core import AnalysisConfig, create_paddle_predictor, PaddleTensor
+from paddlehub.common.logger import logger
+from paddlehub.module.module import moduleinfo, runnable, serving
+from PIL import Image
+import cv2
+import numpy as np
+import paddle.fluid as fluid
+import paddlehub as hub
+
+from german_ocr_db_crnn_mobile.character import CharacterOps
+from german_ocr_db_crnn_mobile.utils import base64_to_cv2, draw_ocr, get_image_ext, sorted_boxes
+
+
+@moduleinfo(
+ name="german_ocr_db_crnn_mobile",
+ version="1.0.0",
+    summary=
+    "The module can recognize German texts in an image. Firstly, it detects the text box positions based on the differentiable_binarization module. Then it recognizes the German texts.",
+ author="paddle-dev",
+ author_email="paddle-dev@baidu.com",
+ type="cv/text_recognition")
+class GermanOCRDBCRNNMobile(hub.Module):
+ def _initialize(self, text_detector_module=None, enable_mkldnn=False, use_angle_classification=False):
+ """
+ initialize with the necessary elements
+ """
+ self.character_dict_path = os.path.join(self.directory, 'assets',
+ 'german_dict.txt')
+ char_ops_params = {
+ 'character_type': 'german',
+ 'character_dict_path': self.character_dict_path,
+ 'loss_type': 'ctc',
+ 'max_text_length': 25,
+ 'use_space_char': True
+ }
+ self.char_ops = CharacterOps(char_ops_params)
+ self.rec_image_shape = [3, 32, 320]
+ self._text_detector_module = text_detector_module
+ self.font_file = os.path.join(self.directory, 'assets', 'german.ttf')
+ self.enable_mkldnn = enable_mkldnn
+ self.use_angle_classification = use_angle_classification
+
+ self.rec_pretrained_model_path = os.path.join(
+ self.directory, 'inference_model', 'character_rec')
+ self.rec_predictor, self.rec_input_tensor, self.rec_output_tensors = self._set_config(
+ self.rec_pretrained_model_path)
+
+ if self.use_angle_classification:
+ self.cls_pretrained_model_path = os.path.join(
+ self.directory, 'inference_model', 'angle_cls')
+
+ self.cls_predictor, self.cls_input_tensor, self.cls_output_tensors = self._set_config(
+ self.cls_pretrained_model_path)
+
+ def _set_config(self, pretrained_model_path):
+ """
+ predictor config path
+ """
+ model_file_path = os.path.join(pretrained_model_path, 'model')
+ params_file_path = os.path.join(pretrained_model_path, 'params')
+
+ config = AnalysisConfig(model_file_path, params_file_path)
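+        # Probe CUDA_VISIBLE_DEVICES: if it is set and its first character
+        # parses as an integer, assume a GPU is available for this predictor.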
+ try:
+ _places = os.environ["CUDA_VISIBLE_DEVICES"]
+ int(_places[0])
+ use_gpu = True
+ except:
+ use_gpu = False
+
+ if use_gpu:
+ config.enable_use_gpu(8000, 0)
+ else:
+ config.disable_gpu()
+ if self.enable_mkldnn:
+ # cache 10 different shapes for mkldnn to avoid memory leak
+ config.set_mkldnn_cache_capacity(10)
+ config.enable_mkldnn()
+
+ config.disable_glog_info()
+ config.delete_pass("conv_transpose_eltwiseadd_bn_fuse_pass")
+ config.switch_use_feed_fetch_ops(False)
+
+ predictor = create_paddle_predictor(config)
+
+ input_names = predictor.get_input_names()
+ input_tensor = predictor.get_input_tensor(input_names[0])
+ output_names = predictor.get_output_names()
+ output_tensors = []
+ for output_name in output_names:
+ output_tensor = predictor.get_output_tensor(output_name)
+ output_tensors.append(output_tensor)
+
+ return predictor, input_tensor, output_tensors
+
+ @property
+ def text_detector_module(self):
+ """
+ text detect module
+ """
+ if not self._text_detector_module:
+ self._text_detector_module = hub.Module(
+ name='chinese_text_detection_db_mobile',
+ enable_mkldnn=self.enable_mkldnn,
+ version='1.0.4')
+ return self._text_detector_module
+
+ def read_images(self, paths=[]):
+ images = []
+ for img_path in paths:
+ assert os.path.isfile(
+ img_path), "The {} isn't a valid file.".format(img_path)
+ img = cv2.imread(img_path)
+ if img is None:
+ logger.info("error in loading image:{}".format(img_path))
+ continue
+ images.append(img)
+ return images
+
+ def get_rotate_crop_image(self, img, points):
+        """
+        Crop the quadrilateral text region given by `points` out of `img`
+        with a perspective transform, and rotate the crop by 90 degrees if
+        it comes out much taller than it is wide.
+        """
+ img_crop_width = int(
+ max(
+ np.linalg.norm(points[0] - points[1]),
+ np.linalg.norm(points[2] - points[3])))
+ img_crop_height = int(
+ max(
+ np.linalg.norm(points[0] - points[3]),
+ np.linalg.norm(points[1] - points[2])))
+ pts_std = np.float32([[0, 0], [img_crop_width, 0],
+ [img_crop_width, img_crop_height],
+ [0, img_crop_height]])
+ M = cv2.getPerspectiveTransform(points, pts_std)
+ dst_img = cv2.warpPerspective(
+ img,
+ M, (img_crop_width, img_crop_height),
+ borderMode=cv2.BORDER_REPLICATE,
+ flags=cv2.INTER_CUBIC)
+ dst_img_height, dst_img_width = dst_img.shape[0:2]
+ if dst_img_height * 1.0 / dst_img_width >= 1.5:
+ dst_img = np.rot90(dst_img)
+ return dst_img
+
+ def resize_norm_img_rec(self, img, max_wh_ratio):
+ imgC, imgH, imgW = self.rec_image_shape
+ assert imgC == img.shape[2]
+ h, w = img.shape[:2]
+ ratio = w / float(h)
+ if math.ceil(imgH * ratio) > imgW:
+ resized_w = imgW
+ else:
+ resized_w = int(math.ceil(imgH * ratio))
+ resized_image = cv2.resize(img, (resized_w, imgH))
+ resized_image = resized_image.astype('float32')
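+        # HWC -> CHW, scale to [0, 1], then normalize to [-1, 1] (mean 0.5, std 0.5)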
+ resized_image = resized_image.transpose((2, 0, 1)) / 255
+ resized_image -= 0.5
+ resized_image /= 0.5
+ padding_im = np.zeros((imgC, imgH, imgW), dtype=np.float32)
+ padding_im[:, :, 0:resized_w] = resized_image
+ return padding_im
+
+ def resize_norm_img_cls(self, img):
+ cls_image_shape = [3, 48, 192]
+ imgC, imgH, imgW = cls_image_shape
+ h = img.shape[0]
+ w = img.shape[1]
+ ratio = w / float(h)
+ if math.ceil(imgH * ratio) > imgW:
+ resized_w = imgW
+ else:
+ resized_w = int(math.ceil(imgH * ratio))
+ resized_image = cv2.resize(img, (resized_w, imgH))
+ resized_image = resized_image.astype('float32')
+ if cls_image_shape[0] == 1:
+ resized_image = resized_image / 255
+ resized_image = resized_image[np.newaxis, :]
+ else:
+ resized_image = resized_image.transpose((2, 0, 1)) / 255
+ resized_image -= 0.5
+ resized_image /= 0.5
+ padding_im = np.zeros((imgC, imgH, imgW), dtype=np.float32)
+ padding_im[:, :, 0:resized_w] = resized_image
+ return padding_im
+
+ def recognize_text(self,
+ images=[],
+ paths=[],
+ use_gpu=False,
+ output_dir='ocr_result',
+ visualization=False,
+ box_thresh=0.5,
+ text_thresh=0.5,
+ angle_classification_thresh=0.9):
+ """
+ Get the chinese texts in the predicted images.
+ Args:
+ images (list(numpy.ndarray)): images data, shape of each is [H, W, C]. If images not paths
+ paths (list[str]): The paths of images. If paths not images
+ use_gpu (bool): Whether to use gpu.
+ batch_size(int): the program deals once with one
+ output_dir (str): The directory to store output images.
+ visualization (bool): Whether to save image or not.
+ box_thresh(float): the threshold of the detected text box's confidence
+ text_thresh(float): the threshold of the chinese text recognition confidence
+ angle_classification_thresh(float): the threshold of the angle classification confidence
+
+ Returns:
+ res (list): The result of chinese texts and save path of images.
+ """
+ if use_gpu:
+ try:
+ _places = os.environ["CUDA_VISIBLE_DEVICES"]
+ int(_places[0])
+ except:
+ raise RuntimeError(
+ "Environment Variable CUDA_VISIBLE_DEVICES is not set correctly. If you wanna use gpu, please set CUDA_VISIBLE_DEVICES via export CUDA_VISIBLE_DEVICES=cuda_device_id."
+ )
+
+ self.use_gpu = use_gpu
+
+ if images != [] and isinstance(images, list) and paths == []:
+ predicted_data = images
+ elif images == [] and isinstance(paths, list) and paths != []:
+ predicted_data = self.read_images(paths)
+ else:
+ raise TypeError("The input data is inconsistent with expectations.")
+
+ assert predicted_data != [], "There is not any image to be predicted. Please check the input data."
+
+ detection_results = self.text_detector_module.detect_text(
+ images=predicted_data, use_gpu=self.use_gpu, box_thresh=box_thresh)
+
+ boxes = [
+ np.array(item['data']).astype(np.float32)
+ for item in detection_results
+ ]
+ all_results = []
+ for index, img_boxes in enumerate(boxes):
+ original_image = predicted_data[index].copy()
+ result = {'save_path': ''}
+ if img_boxes.size == 0:
+ result['data'] = []
+ else:
+ img_crop_list = []
+ boxes = sorted_boxes(img_boxes)
+ for num_box in range(len(boxes)):
+ tmp_box = copy.deepcopy(boxes[num_box])
+ img_crop = self.get_rotate_crop_image(
+ original_image, tmp_box)
+ img_crop_list.append(img_crop)
+
+ if self.use_angle_classification:
+ img_crop_list, angle_list = self._classify_text(
+ img_crop_list,
+ angle_classification_thresh=angle_classification_thresh)
+
+ rec_results = self._recognize_text(img_crop_list)
+
+ # if the recognized text confidence score is lower than text_thresh, then drop it
+ rec_res_final = []
+ for index, res in enumerate(rec_results):
+ text, score = res
+ if score >= text_thresh:
+                        rec_res_final.append({
+                            'text': text,
+                            'confidence': float(score),
+                            # np.int is deprecated in newer NumPy; the builtin int is equivalent here
+                            'text_box_position': boxes[index].astype(int).tolist()
+                        })
+ result['data'] = rec_res_final
+
+ if visualization and result['data']:
+ result['save_path'] = self.save_result_image(
+ original_image, boxes, rec_results, output_dir,
+ text_thresh)
+ all_results.append(result)
+
+ return all_results
+
+ @serving
+ def serving_method(self, images, **kwargs):
+ """
+ Run as a service.
+ """
+ images_decode = [base64_to_cv2(image) for image in images]
+ results = self.recognize_text(images_decode, **kwargs)
+ return results
+
+ def save_result_image(
+ self,
+ original_image,
+ detection_boxes,
+ rec_results,
+ output_dir='ocr_result',
+ text_thresh=0.5,
+ ):
+ image = Image.fromarray(cv2.cvtColor(original_image, cv2.COLOR_BGR2RGB))
+ txts = [item[0] for item in rec_results]
+ scores = [item[1] for item in rec_results]
+ draw_img = draw_ocr(
+ image,
+ detection_boxes,
+ txts,
+ scores,
+ font_file=self.font_file,
+ draw_txt=True,
+ drop_score=text_thresh)
+
+ if not os.path.exists(output_dir):
+ os.makedirs(output_dir)
+ ext = get_image_ext(original_image)
+ saved_name = 'ndarray_{}{}'.format(time.time(), ext)
+ save_file_path = os.path.join(output_dir, saved_name)
+ cv2.imwrite(save_file_path, draw_img[:, :, ::-1])
+ return save_file_path
+
+ def _classify_text(self, image_list, angle_classification_thresh=0.9):
+ img_list = copy.deepcopy(image_list)
+ img_num = len(img_list)
+ # Calculate the aspect ratio of all text bars
+ width_list = []
+ for img in img_list:
+ width_list.append(img.shape[1] / float(img.shape[0]))
+ # Sorting can speed up the cls process
+ indices = np.argsort(np.array(width_list))
+
+ cls_res = [['', 0.0]] * img_num
+ batch_num = 30
+ for beg_img_no in range(0, img_num, batch_num):
+ end_img_no = min(img_num, beg_img_no + batch_num)
+ norm_img_batch = []
+ max_wh_ratio = 0
+ for ino in range(beg_img_no, end_img_no):
+ h, w = img_list[indices[ino]].shape[0:2]
+ wh_ratio = w * 1.0 / h
+ max_wh_ratio = max(max_wh_ratio, wh_ratio)
+ for ino in range(beg_img_no, end_img_no):
+ norm_img = self.resize_norm_img_cls(img_list[indices[ino]])
+ norm_img = norm_img[np.newaxis, :]
+ norm_img_batch.append(norm_img)
+ norm_img_batch = np.concatenate(norm_img_batch)
+ norm_img_batch = norm_img_batch.copy()
+
+ self.cls_input_tensor.copy_from_cpu(norm_img_batch)
+ self.cls_predictor.zero_copy_run()
+
+ prob_out = self.cls_output_tensors[0].copy_to_cpu()
+ label_out = self.cls_output_tensors[1].copy_to_cpu()
+ if len(label_out.shape) != 1:
+ prob_out, label_out = label_out, prob_out
+ label_list = ['0', '180']
+ for rno in range(len(label_out)):
+ label_idx = label_out[rno]
+ score = prob_out[rno][label_idx]
+ label = label_list[label_idx]
+ cls_res[indices[beg_img_no + rno]] = [label, score]
+ if '180' in label and score > angle_classification_thresh:
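+                    # 1 == cv2.ROTATE_180: flip crops classified as upside-down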
+ img_list[indices[beg_img_no + rno]] = cv2.rotate(
+ img_list[indices[beg_img_no + rno]], 1)
+ return img_list, cls_res
+
+ def _recognize_text(self, img_list):
+ img_num = len(img_list)
+ # Calculate the aspect ratio of all text bars
+ width_list = []
+ for img in img_list:
+ width_list.append(img.shape[1] / float(img.shape[0]))
+ # Sorting can speed up the recognition process
+ indices = np.argsort(np.array(width_list))
+
+ rec_res = [['', 0.0]] * img_num
+ batch_num = 30
+ for beg_img_no in range(0, img_num, batch_num):
+ end_img_no = min(img_num, beg_img_no + batch_num)
+ norm_img_batch = []
+ max_wh_ratio = 0
+ for ino in range(beg_img_no, end_img_no):
+ h, w = img_list[indices[ino]].shape[0:2]
+ wh_ratio = w * 1.0 / h
+ max_wh_ratio = max(max_wh_ratio, wh_ratio)
+ for ino in range(beg_img_no, end_img_no):
+ norm_img = self.resize_norm_img_rec(img_list[indices[ino]],
+ max_wh_ratio)
+ norm_img = norm_img[np.newaxis, :]
+ norm_img_batch.append(norm_img)
+
+ norm_img_batch = np.concatenate(norm_img_batch, axis=0)
+ norm_img_batch = norm_img_batch.copy()
+
+ self.rec_input_tensor.copy_from_cpu(norm_img_batch)
+ self.rec_predictor.zero_copy_run()
+
+ rec_idx_batch = self.rec_output_tensors[0].copy_to_cpu()
+ rec_idx_lod = self.rec_output_tensors[0].lod()[0]
+ predict_batch = self.rec_output_tensors[1].copy_to_cpu()
+ predict_lod = self.rec_output_tensors[1].lod()[0]
+ for rno in range(len(rec_idx_lod) - 1):
+ beg = rec_idx_lod[rno]
+ end = rec_idx_lod[rno + 1]
+ rec_idx_tmp = rec_idx_batch[beg:end, 0]
+ preds_text = self.char_ops.decode(rec_idx_tmp)
+ beg = predict_lod[rno]
+ end = predict_lod[rno + 1]
+ probs = predict_batch[beg:end, :]
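+                # CTC scoring: take the argmax class per time step; the last
+                # class index is the blank label, so average the probabilities
+                # over non-blank steps only.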
+ ind = np.argmax(probs, axis=1)
+ blank = probs.shape[1]
+ valid_ind = np.where(ind != (blank - 1))[0]
+ if len(valid_ind) == 0:
+ continue
+ score = np.mean(probs[valid_ind, ind[valid_ind]])
+ # rec_res.append([preds_text, score])
+ rec_res[indices[beg_img_no + rno]] = [preds_text, score]
+
+ return rec_res
+
+ def save_inference_model(self,
+ dirname,
+ model_filename=None,
+ params_filename=None,
+ combined=True):
+ detector_dir = os.path.join(dirname, 'text_detector')
+ classifier_dir = os.path.join(dirname, 'angle_classifier')
+ recognizer_dir = os.path.join(dirname, 'text_recognizer')
+ self._save_detector_model(detector_dir, model_filename, params_filename,
+ combined)
+ if self.use_angle_classification:
+ self._save_classifier_model(classifier_dir, model_filename,
+ params_filename, combined)
+
+ self._save_recognizer_model(recognizer_dir, model_filename,
+ params_filename, combined)
+ logger.info("The inference model has been saved in the path {}".format(
+ os.path.realpath(dirname)))
+
+ def _save_detector_model(self,
+ dirname,
+ model_filename=None,
+ params_filename=None,
+ combined=True):
+ self.text_detector_module.save_inference_model(
+ dirname, model_filename, params_filename, combined)
+
+ def _save_recognizer_model(self,
+ dirname,
+ model_filename=None,
+ params_filename=None,
+ combined=True):
+ if combined:
+ model_filename = "__model__" if not model_filename else model_filename
+ params_filename = "__params__" if not params_filename else params_filename
+ place = fluid.CPUPlace()
+ exe = fluid.Executor(place)
+
+ model_file_path = os.path.join(self.rec_pretrained_model_path, 'model')
+ params_file_path = os.path.join(self.rec_pretrained_model_path,
+ 'params')
+ program, feeded_var_names, target_vars = fluid.io.load_inference_model(
+ dirname=self.rec_pretrained_model_path,
+ model_filename=model_file_path,
+ params_filename=params_file_path,
+ executor=exe)
+
+ fluid.io.save_inference_model(
+ dirname=dirname,
+ main_program=program,
+ executor=exe,
+ feeded_var_names=feeded_var_names,
+ target_vars=target_vars,
+ model_filename=model_filename,
+ params_filename=params_filename)
+
+ def _save_classifier_model(self,
+ dirname,
+ model_filename=None,
+ params_filename=None,
+ combined=True):
+ if combined:
+ model_filename = "__model__" if not model_filename else model_filename
+ params_filename = "__params__" if not params_filename else params_filename
+ place = fluid.CPUPlace()
+ exe = fluid.Executor(place)
+
+ model_file_path = os.path.join(self.cls_pretrained_model_path, 'model')
+ params_file_path = os.path.join(self.cls_pretrained_model_path,
+ 'params')
+ program, feeded_var_names, target_vars = fluid.io.load_inference_model(
+ dirname=self.cls_pretrained_model_path,
+ model_filename=model_file_path,
+ params_filename=params_file_path,
+ executor=exe)
+
+ fluid.io.save_inference_model(
+ dirname=dirname,
+ main_program=program,
+ executor=exe,
+ feeded_var_names=feeded_var_names,
+ target_vars=target_vars,
+ model_filename=model_filename,
+ params_filename=params_filename)
+
+ @runnable
+ def run_cmd(self, argvs):
+ """
+ Run as a command
+ """
+ self.parser = argparse.ArgumentParser(
+ description="Run the %s module." % self.name,
+ prog='hub run %s' % self.name,
+ usage='%(prog)s',
+ add_help=True)
+
+ self.arg_input_group = self.parser.add_argument_group(
+ title="Input options", description="Input data. Required")
+ self.arg_config_group = self.parser.add_argument_group(
+ title="Config options",
+ description=
+ "Run configuration for controlling module behavior, not required.")
+
+ self.add_module_config_arg()
+ self.add_module_input_arg()
+
+ args = self.parser.parse_args(argvs)
+ results = self.recognize_text(
+ paths=[args.input_path],
+ use_gpu=args.use_gpu,
+ output_dir=args.output_dir,
+ visualization=args.visualization)
+ return results
+
+ def add_module_config_arg(self):
+ """
+ Add the command config options
+ """
+ self.arg_config_group.add_argument(
+ '--use_gpu',
+ type=ast.literal_eval,
+ default=False,
+ help="whether use GPU or not")
+ self.arg_config_group.add_argument(
+ '--output_dir',
+ type=str,
+ default='ocr_result',
+ help="The directory to save output images.")
+ self.arg_config_group.add_argument(
+ '--visualization',
+ type=ast.literal_eval,
+ default=False,
+ help="whether to save output as images.")
+
+ def add_module_input_arg(self):
+ """
+ Add the command input options
+ """
+ self.arg_input_group.add_argument(
+            '--input_path', type=str, default=None, help="path to the input image")
+
+
+if __name__ == '__main__':
+ ocr = GermanOCRDBCRNNMobile(enable_mkldnn=False, use_angle_classification=True)
+ image_path = [
+ '/mnt/zhangxuefei/PaddleOCR/doc/imgs/ger_1.jpg',
+ '/mnt/zhangxuefei/PaddleOCR/doc/imgs/12.jpg',
+ '/mnt/zhangxuefei/PaddleOCR/doc/imgs/test_image.jpg'
+ ]
+ res = ocr.recognize_text(paths=image_path, visualization=True)
+ ocr.save_inference_model('save')
+ print(res)
diff --git a/modules/image/text_recognition/german_ocr_db_crnn_mobile/utils.py b/modules/image/text_recognition/german_ocr_db_crnn_mobile/utils.py
new file mode 100644
index 0000000000000000000000000000000000000000..8c41af300cc91de369a473cb7327b794b6cf5715
--- /dev/null
+++ b/modules/image/text_recognition/german_ocr_db_crnn_mobile/utils.py
@@ -0,0 +1,190 @@
+# -*- coding:utf-8 -*-
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import math
+
+from PIL import Image, ImageDraw, ImageFont
+import base64
+import cv2
+import numpy as np
+
+
+def draw_ocr(image,
+ boxes,
+ txts,
+ scores,
+ font_file,
+ draw_txt=True,
+ drop_score=0.5):
+ """
+ Visualize the results of OCR detection and recognition
+ args:
+ image(Image|array): RGB image
+ boxes(list): boxes with shape(N, 4, 2)
+ txts(list): the texts
+ scores(list): txxs corresponding scores
+ draw_txt(bool): whether draw text or not
+ drop_score(float): only scores greater than drop_threshold will be visualized
+ return(array):
+ the visualized img
+ """
+ if scores is None:
+ scores = [1] * len(boxes)
+ for (box, score) in zip(boxes, scores):
+ if score < drop_score or math.isnan(score):
+ continue
+ box = np.reshape(np.array(box), [-1, 1, 2]).astype(np.int64)
+ image = cv2.polylines(np.array(image), [box], True, (255, 0, 0), 2)
+
+ if draw_txt:
+ img = np.array(resize_img(image, input_size=600))
+ txt_img = text_visual(
+ txts,
+ scores,
+ font_file,
+ img_h=img.shape[0],
+ img_w=600,
+ threshold=drop_score)
+ img = np.concatenate([np.array(img), np.array(txt_img)], axis=1)
+ return img
+ return image
+
+
+def text_visual(texts, scores, font_file, img_h=400, img_w=600, threshold=0.):
+ """
+ create new blank img and draw txt on it
+ args:
+ texts(list): the text will be draw
+ scores(list|None): corresponding score of each txt
+ img_h(int): the height of blank img
+ img_w(int): the width of blank img
+ return(array):
+ """
+ if scores is not None:
+ assert len(texts) == len(
+ scores), "The number of txts and corresponding scores must match"
+
+ def create_blank_img():
+ blank_img = np.ones(shape=[img_h, img_w], dtype=np.int8) * 255
+ blank_img[:, img_w - 1:] = 0
+ blank_img = Image.fromarray(blank_img).convert("RGB")
+ draw_txt = ImageDraw.Draw(blank_img)
+ return blank_img, draw_txt
+
+ blank_img, draw_txt = create_blank_img()
+
+ font_size = 20
+ txt_color = (0, 0, 0)
+ font = ImageFont.truetype(font_file, font_size, encoding="utf-8")
+
+ gap = font_size + 5
+ txt_img_list = []
+ count, index = 1, 0
+ for idx, txt in enumerate(texts):
+ index += 1
+ if scores[idx] < threshold or math.isnan(scores[idx]):
+ index -= 1
+ continue
+ first_line = True
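+        # Wrap long lines: roughly img_w // font_size full-width characters
+        # (as measured by str_count) fit on one row.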
+ while str_count(txt) >= img_w // font_size - 4:
+ tmp = txt
+ txt = tmp[:img_w // font_size - 4]
+ if first_line:
+ new_txt = str(index) + ': ' + txt
+ first_line = False
+ else:
+ new_txt = ' ' + txt
+ draw_txt.text((0, gap * count), new_txt, txt_color, font=font)
+ txt = tmp[img_w // font_size - 4:]
+ if count >= img_h // gap - 1:
+ txt_img_list.append(np.array(blank_img))
+ blank_img, draw_txt = create_blank_img()
+ count = 0
+ count += 1
+ if first_line:
+ new_txt = str(index) + ': ' + txt + ' ' + '%.3f' % (scores[idx])
+ else:
+ new_txt = " " + txt + " " + '%.3f' % (scores[idx])
+ draw_txt.text((0, gap * count), new_txt, txt_color, font=font)
+ # whether add new blank img or not
+ if count >= img_h // gap - 1 and idx + 1 < len(texts):
+ txt_img_list.append(np.array(blank_img))
+ blank_img, draw_txt = create_blank_img()
+ count = 0
+ count += 1
+ txt_img_list.append(np.array(blank_img))
+ if len(txt_img_list) == 1:
+ blank_img = np.array(txt_img_list[0])
+ else:
+ blank_img = np.concatenate(txt_img_list, axis=1)
+ return np.array(blank_img)
+
+
+def str_count(s):
+ """
+ Count the number of Chinese characters,
+ a single English character and a single number
+ equal to half the length of Chinese characters.
+ args:
+ s(string): the input of string
+ return(int):
+ the number of Chinese characters
+ """
+ import string
+ count_zh = count_pu = 0
+ s_len = len(s)
+ en_dg_count = 0
+ for c in s:
+ if c in string.ascii_letters or c.isdigit() or c.isspace():
+ en_dg_count += 1
+ elif c.isalpha():
+ count_zh += 1
+ else:
+ count_pu += 1
+ return s_len - math.ceil(en_dg_count / 2)
+
+
+def resize_img(img, input_size=600):
+ img = np.array(img)
+ im_shape = img.shape
+ im_size_min = np.min(im_shape[0:2])
+ im_size_max = np.max(im_shape[0:2])
+ im_scale = float(input_size) / float(im_size_max)
+ im = cv2.resize(img, None, None, fx=im_scale, fy=im_scale)
+ return im
+
+
+def get_image_ext(image):
+ if image.shape[2] == 4:
+ return ".png"
+ return ".jpg"
+
+
+def sorted_boxes(dt_boxes):
+ """
+ Sort text boxes in order from top to bottom, left to right
+ args:
+ dt_boxes(array):detected text boxes with shape [4, 2]
+ return:
+ sorted boxes(array) with shape [4, 2]
+ """
+ num_boxes = dt_boxes.shape[0]
+ sorted_boxes = sorted(dt_boxes, key=lambda x: (x[0][1], x[0][0]))
+ _boxes = list(sorted_boxes)
+
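+    # One bubble pass: if two consecutive boxes sit on (almost) the same line
+    # (top y within 10 px) but are out of left-to-right order, swap them.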
+ for i in range(num_boxes - 1):
+ if abs(_boxes[i + 1][0][1] - _boxes[i][0][1]) < 10 and \
+ (_boxes[i + 1][0][0] < _boxes[i][0][0]):
+ tmp = _boxes[i]
+ _boxes[i] = _boxes[i + 1]
+ _boxes[i + 1] = tmp
+ return _boxes
+
+
+def base64_to_cv2(b64str):
+    data = base64.b64decode(b64str.encode('utf8'))
+    # np.fromstring is deprecated for binary data; frombuffer is the modern equivalent
+    data = np.frombuffer(data, np.uint8)
+ data = cv2.imdecode(data, cv2.IMREAD_COLOR)
+ return data
diff --git a/modules/image/text_recognition/japan_ocr_db_crnn_mobile/README.md b/modules/image/text_recognition/japan_ocr_db_crnn_mobile/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..05f32a6621b4d81b5b14e1f1550449d22ad0f359
--- /dev/null
+++ b/modules/image/text_recognition/japan_ocr_db_crnn_mobile/README.md
@@ -0,0 +1,172 @@
+# japan_ocr_db_crnn_mobile
+
+|Module Name|japan_ocr_db_crnn_mobile|
+| :--- | :---: |
+|Category|Image - Text Recognition|
+|Network|Differentiable Binarization+CRNN|
+|Dataset|ICDAR2015|
+|Fine-tuning supported|No|
+|Module Size|8MB|
+|Latest update date|2021-04-15|
+|Data metrics|-|
+
+
+## I. Basic Information
+
+- ### Application Effect Display
+  - Sample result:
+
+
+
+
+- ### Module Introduction
+
+  - japan_ocr_db_crnn_mobile is a Module for recognizing Japanese text in images. It first detects text boxes with chinese_text_detection_db_mobile, then recognizes the Japanese text inside each box. The recognition algorithm is CRNN (Convolutional Recurrent Neural Network), a combination of DCNN and RNN designed for recognizing sequence-like objects in images. Trained with CTC loss, it can learn directly from word- or line-level annotations without detailed character-level labels. This Module is a lightweight Japanese OCR model and supports direct prediction.
+
+## II. Installation
+
+- ### 1. Environment Dependencies
+
+  - paddlepaddle >= 1.8.0
+
+  - paddlehub >= 1.8.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
+
+ - shapely
+
+ - pyclipper
+
+ - ```shell
+ $ pip install shapely pyclipper
+ ```
+  - **This Module depends on the third-party libraries shapely and pyclipper. Please install them before using this Module.**
+
+- ### 2. Installation
+
+ - ```shell
+ $ hub install japan_ocr_db_crnn_mobile
+ ```
+  - In case of any problems during installation, please refer to: [Windows quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
+  | [Linux quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [MacOS quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
+
+
+## III. Module API Prediction
+
+- ### 1. Command Line Prediction
+
+ - ```shell
+ $ hub run japan_ocr_db_crnn_mobile --input_path "/PATH/TO/IMAGE"
+ ```
+  - This invokes the text recognition Module from the command line. For more information, see [PaddleHub Command Line Instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
+
+- ### 2. Prediction Code Example
+
+ - ```python
+ import paddlehub as hub
+ import cv2
+
+ ocr = hub.Module(name="japan_ocr_db_crnn_mobile", enable_mkldnn=True) # mkldnn加速仅在CPU下有效
+ result = ocr.recognize_text(images=[cv2.imread('/PATH/TO/IMAGE')])
+
+ # or
+ # result = ocr.recognize_text(paths=['/PATH/TO/IMAGE'])
+ ```
+
+- ### 3. API
+
+ - ```python
+ def __init__(text_detector_module=None, enable_mkldnn=False)
+ ```
+
+    - Construct a JapanOCRDBCRNNMobile object
+
+    - **Parameters**
+
+      - text_detector_module(str): name of the text detection PaddleHub Module. If None, the [chinese_text_detection_db_mobile Module](../chinese_text_detection_db_mobile/) is used by default. It detects the text boxes in an image.
+      - enable_mkldnn(bool): whether to enable MKL-DNN to accelerate CPU computation. Only effective when running on CPU. Default is False.
+
+ - ```python
+ def recognize_text(images=[],
+ paths=[],
+ use_gpu=False,
+ output_dir='ocr_result',
+ visualization=False,
+ box_thresh=0.5,
+ text_thresh=0.5,
+ angle_classification_thresh=0.9)
+ ```
+
+    - Prediction API for locating and recognizing all Japanese text in the input images.
+
+    - **Parameters**
+
+      - paths (list\[str\]): image paths;
+      - images (list\[numpy.ndarray\]): image data, ndarray.shape is \[H, W, C\], in BGR format;
+      - use\_gpu (bool): whether to use GPU; **if GPU is used, set the CUDA_VISIBLE_DEVICES environment variable first**
+      - output\_dir (str): directory for saving output images, default ocr\_result;
+      - box\_thresh (float): confidence threshold for detected text boxes;
+      - text\_thresh (float): confidence threshold for recognized Japanese text;
+      - angle_classification_thresh(float): confidence threshold for text angle classification;
+      - visualization (bool): whether to save the recognition results as image files.
+
+
+    - **Return**
+
+      - res (list\[dict\]): list of recognition results, one dict per input image, with fields:
+        - data (list\[dict\]): recognized texts, one dict per text box, with fields:
+          - text(str): the recognized text
+          - confidence(float): the confidence of the recognized text
+          - text_box_position(list): pixel coordinates of the text box in the original image, a 4\*2 matrix giving the bottom-left, bottom-right, top-right and top-left vertices in order.
+          If nothing is recognized, data is \[\]
+        - save_path (str, optional): path where the visualized result is saved; '' if no image is saved
+
+
+
+## IV. Server Deployment
+
+- PaddleHub Serving can deploy an online text recognition service.
+
+- ### Step 1: Start PaddleHub Serving
+
+  - Run the startup command:
+  - ```shell
+    $ hub serving start -m japan_ocr_db_crnn_mobile
+    ```
+
+  - This deploys a text recognition service API, with the default port number 8866.
+
+  - **NOTE:** If you want to use GPU prediction, set the CUDA\_VISIBLE\_DEVICES environment variable before starting the service; otherwise it is not needed.
+
+- ### Step 2: Send a prediction request
+
+  - With the server configured, the following lines of code send a prediction request and fetch the result
+
+ - ```python
+ import requests
+ import json
+ import cv2
+ import base64
+
+    def cv2_to_base64(image):
+        data = cv2.imencode('.jpg', image)[1]
+        return base64.b64encode(data.tobytes()).decode('utf8')
+
+    # Send an HTTP request
+ data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
+ headers = {"Content-type": "application/json"}
+ url = "http://127.0.0.1:8866/predict/japan_ocr_db_crnn_mobile"
+ r = requests.post(url=url, headers=headers, data=json.dumps(data))
+
+    # print the prediction results
+ print(r.json()["results"])
+ ```
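+
+  - The response body mirrors the return value of `recognize_text`. A minimal sketch, using only the fields documented above, for collecting the recognized lines:
+
+  - ```python
+    for image_result in r.json()["results"]:
+        for item in image_result['data']:
+            print(item['text'], item['confidence'])
+    ```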
+
+
+## V. Release Note
+
+* 1.0.0
+
+  First release
+
+ - ```shell
+ $ hub install japan_ocr_db_crnn_mobile==1.0.0
+ ```
diff --git a/modules/image/text_recognition/japan_ocr_db_crnn_mobile/__init__.py b/modules/image/text_recognition/japan_ocr_db_crnn_mobile/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
diff --git a/modules/image/text_recognition/japan_ocr_db_crnn_mobile/assets/japan.ttc b/modules/image/text_recognition/japan_ocr_db_crnn_mobile/assets/japan.ttc
new file mode 100644
index 0000000000000000000000000000000000000000..ad68243b968fc87b207928594c585039859b75a9
Binary files /dev/null and b/modules/image/text_recognition/japan_ocr_db_crnn_mobile/assets/japan.ttc differ
diff --git a/modules/image/text_recognition/japan_ocr_db_crnn_mobile/assets/japan_dict.txt b/modules/image/text_recognition/japan_ocr_db_crnn_mobile/assets/japan_dict.txt
new file mode 100644
index 0000000000000000000000000000000000000000..339d4b89e5159a346636641a0814874faa59754a
--- /dev/null
+++ b/modules/image/text_recognition/japan_ocr_db_crnn_mobile/assets/japan_dict.txt
@@ -0,0 +1,4399 @@
+!
+"
+#
+$
+%
+&
+'
+(
+)
+*
++
+,
+-
+.
+/
+0
+1
+2
+3
+4
+5
+6
+7
+8
+9
+:
+;
+<
+=
+>
+?
+A
+B
+C
+D
+E
+F
+G
+H
+I
+J
+K
+L
+M
+N
+O
+P
+Q
+R
+S
+T
+U
+V
+W
+X
+Y
+Z
+[
+]
+_
+`
+a
+b
+c
+d
+e
+f
+g
+h
+i
+j
+k
+l
+m
+n
+o
+p
+q
+r
+s
+t
+u
+v
+w
+x
+y
+z
+©
+°
+²
+´
+½
+Á
+Ä
+Å
+Ç
+È
+É
+Í
+Ó
+Ö
+×
+Ü
+ß
+à
+á
+â
+ã
+ä
+å
+æ
+ç
+è
+é
+ê
+ë
+í
+ð
+ñ
+ò
+ó
+ô
+õ
+ö
+ø
+ú
+û
+ü
+ý
+ā
+ă
+ą
+ć
+Č
+č
+đ
+ē
+ė
+ę
+ğ
+ī
+ı
+Ł
+ł
+ń
+ň
+ō
+ř
+Ş
+ş
+Š
+š
+ţ
+ū
+ż
+Ž
+ž
+Ș
+ș
+ț
+Δ
+α
+λ
+μ
+φ
+Г
+О
+а
+в
+л
+о
+р
+с
+т
+я
+ồ
+
+—
+―
+’
+“
+”
+…
+℃
+→
+∇
+−
+■
+☆
+
+、
+。
+々
+〆
+〈
+〉
+「
+」
+『
+』
+〔
+〕
+〜
+ぁ
+あ
+ぃ
+い
+う
+ぇ
+え
+ぉ
+お
+か
+が
+き
+ぎ
+く
+ぐ
+け
+げ
+こ
+ご
+さ
+ざ
+し
+じ
+す
+ず
+せ
+ぜ
+そ
+ぞ
+た
+だ
+ち
+ぢ
+っ
+つ
+づ
+て
+で
+と
+ど
+な
+に
+ぬ
+ね
+の
+は
+ば
+ぱ
+ひ
+び
+ぴ
+ふ
+ぶ
+ぷ
+へ
+べ
+ぺ
+ほ
+ぼ
+ぽ
+ま
+み
+む
+め
+も
+ゃ
+や
+ゅ
+ゆ
+ょ
+よ
+ら
+り
+る
+れ
+ろ
+わ
+ゑ
+を
+ん
+ゝ
+ゞ
+ァ
+ア
+ィ
+イ
+ゥ
+ウ
+ェ
+エ
+ォ
+オ
+カ
+ガ
+キ
+ギ
+ク
+グ
+ケ
+ゲ
+コ
+ゴ
+サ
+ザ
+シ
+ジ
+ス
+ズ
+セ
+ゼ
+ソ
+ゾ
+タ
+ダ
+チ
+ヂ
+ッ
+ツ
+ヅ
+テ
+デ
+ト
+ド
+ナ
+ニ
+ヌ
+ネ
+ノ
+ハ
+バ
+パ
+ヒ
+ビ
+ピ
+フ
+ブ
+プ
+ヘ
+ベ
+ペ
+ホ
+ボ
+ポ
+マ
+ミ
+ム
+メ
+モ
+ャ
+ヤ
+ュ
+ユ
+ョ
+ヨ
+ラ
+リ
+ル
+レ
+ロ
+ワ
+ヰ
+ン
+ヴ
+ヵ
+ヶ
+・
+ー
+㈱
+一
+丁
+七
+万
+丈
+三
+上
+下
+不
+与
+丑
+且
+世
+丘
+丙
+丞
+両
+並
+中
+串
+丸
+丹
+主
+丼
+丿
+乃
+久
+之
+乎
+乏
+乗
+乘
+乙
+九
+乞
+也
+乱
+乳
+乾
+亀
+了
+予
+争
+事
+二
+于
+互
+五
+井
+亘
+亙
+些
+亜
+亟
+亡
+交
+亥
+亦
+亨
+享
+京
+亭
+亮
+人
+什
+仁
+仇
+今
+介
+仍
+仏
+仔
+仕
+他
+仗
+付
+仙
+代
+令
+以
+仮
+仰
+仲
+件
+任
+企
+伊
+伍
+伎
+伏
+伐
+休
+会
+伝
+伯
+估
+伴
+伶
+伸
+伺
+似
+伽
+佃
+但
+位
+低
+住
+佐
+佑
+体
+何
+余
+佚
+佛
+作
+佩
+佳
+併
+佶
+使
+侈
+例
+侍
+侏
+侑
+侘
+供
+依
+侠
+価
+侮
+侯
+侵
+侶
+便
+係
+促
+俄
+俊
+俔
+俗
+俘
+保
+信
+俣
+俤
+修
+俯
+俳
+俵
+俸
+俺
+倉
+個
+倍
+倒
+候
+借
+倣
+値
+倫
+倭
+倶
+倹
+偃
+假
+偈
+偉
+偏
+偐
+偕
+停
+健
+側
+偵
+偶
+偽
+傀
+傅
+傍
+傑
+傘
+備
+催
+傭
+傲
+傳
+債
+傷
+傾
+僊
+働
+像
+僑
+僕
+僚
+僧
+僭
+僮
+儀
+億
+儇
+儒
+儛
+償
+儡
+優
+儲
+儺
+儼
+兀
+允
+元
+兄
+充
+兆
+先
+光
+克
+兌
+免
+兎
+児
+党
+兜
+入
+全
+八
+公
+六
+共
+兵
+其
+具
+典
+兼
+内
+円
+冊
+再
+冑
+冒
+冗
+写
+冠
+冤
+冥
+冨
+冬
+冲
+决
+冶
+冷
+准
+凉
+凋
+凌
+凍
+凛
+凝
+凞
+几
+凡
+処
+凪
+凰
+凱
+凶
+凸
+凹
+出
+函
+刀
+刃
+分
+切
+刈
+刊
+刎
+刑
+列
+初
+判
+別
+利
+刪
+到
+制
+刷
+券
+刹
+刺
+刻
+剃
+則
+削
+剋
+前
+剖
+剛
+剣
+剤
+剥
+剪
+副
+剰
+割
+創
+剽
+劇
+劉
+劔
+力
+功
+加
+劣
+助
+努
+劫
+劭
+励
+労
+効
+劾
+勃
+勅
+勇
+勉
+勒
+動
+勘
+務
+勝
+募
+勢
+勤
+勧
+勲
+勺
+勾
+勿
+匁
+匂
+包
+匏
+化
+北
+匙
+匝
+匠
+匡
+匣
+匯
+匲
+匹
+区
+医
+匿
+十
+千
+升
+午
+卉
+半
+卍
+卑
+卒
+卓
+協
+南
+単
+博
+卜
+占
+卦
+卯
+印
+危
+即
+却
+卵
+卸
+卿
+厄
+厚
+原
+厠
+厨
+厩
+厭
+厳
+去
+参
+又
+叉
+及
+友
+双
+反
+収
+叔
+取
+受
+叙
+叛
+叟
+叡
+叢
+口
+古
+句
+叩
+只
+叫
+召
+可
+台
+叱
+史
+右
+叶
+号
+司
+吃
+各
+合
+吉
+吊
+同
+名
+后
+吏
+吐
+向
+君
+吝
+吟
+吠
+否
+含
+吸
+吹
+吻
+吽
+吾
+呂
+呆
+呈
+呉
+告
+呑
+周
+呪
+呰
+味
+呼
+命
+咀
+咄
+咋
+和
+咒
+咫
+咲
+咳
+咸
+哀
+品
+哇
+哉
+員
+哨
+哩
+哭
+哲
+哺
+唄
+唆
+唇
+唐
+唖
+唯
+唱
+唳
+唸
+唾
+啄
+商
+問
+啓
+啼
+善
+喋
+喚
+喜
+喝
+喧
+喩
+喪
+喫
+喬
+單
+喰
+営
+嗅
+嗇
+嗔
+嗚
+嗜
+嗣
+嘆
+嘉
+嘗
+嘘
+嘩
+嘯
+嘱
+嘲
+嘴
+噂
+噌
+噛
+器
+噴
+噺
+嚆
+嚢
+囀
+囃
+囉
+囚
+四
+回
+因
+団
+困
+囲
+図
+固
+国
+圀
+圃
+國
+圏
+園
+圓
+團
+圜
+土
+圧
+在
+圭
+地
+址
+坂
+均
+坊
+坐
+坑
+坡
+坤
+坦
+坪
+垂
+型
+垢
+垣
+埃
+埋
+城
+埒
+埔
+域
+埠
+埴
+埵
+執
+培
+基
+埼
+堀
+堂
+堅
+堆
+堕
+堤
+堪
+堯
+堰
+報
+場
+堵
+堺
+塀
+塁
+塊
+塑
+塔
+塗
+塘
+塙
+塚
+塞
+塩
+填
+塵
+塾
+境
+墉
+墓
+増
+墜
+墟
+墨
+墳
+墺
+墻
+墾
+壁
+壇
+壊
+壌
+壕
+士
+壬
+壮
+声
+壱
+売
+壷
+壹
+壺
+壽
+変
+夏
+夕
+外
+夙
+多
+夜
+夢
+夥
+大
+天
+太
+夫
+夬
+夭
+央
+失
+夷
+夾
+奄
+奇
+奈
+奉
+奎
+奏
+契
+奔
+奕
+套
+奘
+奠
+奢
+奥
+奨
+奪
+奮
+女
+奴
+奸
+好
+如
+妃
+妄
+妊
+妍
+妓
+妖
+妙
+妥
+妨
+妬
+妲
+妹
+妻
+妾
+姉
+始
+姐
+姓
+委
+姚
+姜
+姞
+姥
+姦
+姨
+姪
+姫
+姶
+姻
+姿
+威
+娑
+娘
+娟
+娠
+娩
+娯
+娼
+婆
+婉
+婚
+婢
+婦
+婬
+婿
+媄
+媒
+媓
+媚
+媛
+媞
+媽
+嫁
+嫄
+嫉
+嫌
+嫐
+嫗
+嫡
+嬉
+嬌
+嬢
+嬪
+嬬
+嬾
+孁
+子
+孔
+字
+存
+孚
+孝
+孟
+季
+孤
+学
+孫
+孵
+學
+宅
+宇
+守
+安
+宋
+完
+宍
+宏
+宕
+宗
+官
+宙
+定
+宛
+宜
+宝
+実
+客
+宣
+室
+宥
+宮
+宰
+害
+宴
+宵
+家
+宸
+容
+宿
+寂
+寄
+寅
+密
+寇
+富
+寒
+寓
+寔
+寛
+寝
+察
+寡
+實
+寧
+審
+寮
+寵
+寶
+寸
+寺
+対
+寿
+封
+専
+射
+将
+尉
+尊
+尋
+對
+導
+小
+少
+尖
+尚
+尤
+尪
+尭
+就
+尹
+尺
+尻
+尼
+尽
+尾
+尿
+局
+居
+屈
+届
+屋
+屍
+屎
+屏
+屑
+屓
+展
+属
+屠
+層
+履
+屯
+山
+岐
+岑
+岡
+岩
+岫
+岬
+岳
+岷
+岸
+峠
+峡
+峨
+峯
+峰
+島
+峻
+崇
+崋
+崎
+崑
+崖
+崗
+崛
+崩
+嵌
+嵐
+嵩
+嵯
+嶂
+嶋
+嶠
+嶺
+嶼
+嶽
+巀
+巌
+巒
+巖
+川
+州
+巡
+巣
+工
+左
+巧
+巨
+巫
+差
+己
+巳
+巴
+巷
+巻
+巽
+巾
+市
+布
+帆
+希
+帖
+帚
+帛
+帝
+帥
+師
+席
+帯
+帰
+帳
+帷
+常
+帽
+幄
+幅
+幇
+幌
+幔
+幕
+幟
+幡
+幢
+幣
+干
+平
+年
+并
+幸
+幹
+幻
+幼
+幽
+幾
+庁
+広
+庄
+庇
+床
+序
+底
+庖
+店
+庚
+府
+度
+座
+庫
+庭
+庵
+庶
+康
+庸
+廂
+廃
+廉
+廊
+廓
+廟
+廠
+廣
+廬
+延
+廷
+建
+廻
+廼
+廿
+弁
+弄
+弉
+弊
+弌
+式
+弐
+弓
+弔
+引
+弖
+弗
+弘
+弛
+弟
+弥
+弦
+弧
+弱
+張
+強
+弼
+弾
+彈
+彊
+彌
+彎
+当
+彗
+彙
+彝
+形
+彦
+彩
+彫
+彬
+彭
+彰
+影
+彷
+役
+彼
+往
+征
+徂
+径
+待
+律
+後
+徐
+徑
+徒
+従
+得
+徠
+御
+徧
+徨
+復
+循
+徭
+微
+徳
+徴
+德
+徹
+徽
+心
+必
+忉
+忌
+忍
+志
+忘
+忙
+応
+忠
+快
+忯
+念
+忻
+忽
+忿
+怒
+怖
+思
+怠
+怡
+急
+性
+怨
+怪
+怯
+恂
+恋
+恐
+恒
+恕
+恣
+恤
+恥
+恨
+恩
+恬
+恭
+息
+恵
+悉
+悌
+悍
+悔
+悟
+悠
+患
+悦
+悩
+悪
+悲
+悼
+情
+惇
+惑
+惚
+惜
+惟
+惠
+惣
+惧
+惨
+惰
+想
+惹
+惺
+愈
+愉
+愍
+意
+愔
+愚
+愛
+感
+愷
+愿
+慈
+態
+慌
+慎
+慕
+慢
+慣
+慧
+慨
+慮
+慰
+慶
+憂
+憎
+憐
+憑
+憙
+憤
+憧
+憩
+憬
+憲
+憶
+憾
+懇
+應
+懌
+懐
+懲
+懸
+懺
+懽
+懿
+戈
+戊
+戌
+戎
+成
+我
+戒
+戔
+或
+戚
+戟
+戦
+截
+戮
+戯
+戴
+戸
+戻
+房
+所
+扁
+扇
+扈
+扉
+手
+才
+打
+払
+托
+扮
+扱
+扶
+批
+承
+技
+抄
+把
+抑
+抓
+投
+抗
+折
+抜
+択
+披
+抱
+抵
+抹
+押
+抽
+担
+拇
+拈
+拉
+拍
+拏
+拐
+拒
+拓
+拘
+拙
+招
+拝
+拠
+拡
+括
+拭
+拳
+拵
+拶
+拾
+拿
+持
+挂
+指
+按
+挑
+挙
+挟
+挨
+振
+挺
+挽
+挿
+捉
+捕
+捗
+捜
+捧
+捨
+据
+捺
+捻
+掃
+掄
+授
+掌
+排
+掖
+掘
+掛
+掟
+採
+探
+掣
+接
+控
+推
+掩
+措
+掬
+掲
+掴
+掻
+掾
+揃
+揄
+揆
+揉
+描
+提
+揖
+揚
+換
+握
+揮
+援
+揶
+揺
+損
+搦
+搬
+搭
+携
+搾
+摂
+摘
+摩
+摸
+摺
+撃
+撒
+撞
+撤
+撥
+撫
+播
+撮
+撰
+撲
+撹
+擁
+操
+擔
+擦
+擬
+擾
+攘
+攝
+攣
+支
+收
+改
+攻
+放
+政
+故
+敏
+救
+敗
+教
+敢
+散
+敦
+敬
+数
+整
+敵
+敷
+斂
+文
+斉
+斎
+斐
+斑
+斗
+料
+斜
+斟
+斤
+斥
+斧
+斬
+断
+斯
+新
+方
+於
+施
+旁
+旅
+旋
+旌
+族
+旗
+旛
+无
+旡
+既
+日
+旦
+旧
+旨
+早
+旬
+旭
+旺
+旻
+昂
+昆
+昇
+昉
+昌
+明
+昏
+易
+昔
+星
+映
+春
+昧
+昨
+昪
+昭
+是
+昵
+昼
+晁
+時
+晃
+晋
+晏
+晒
+晟
+晦
+晧
+晩
+普
+景
+晴
+晶
+智
+暁
+暇
+暈
+暉
+暑
+暖
+暗
+暘
+暢
+暦
+暫
+暮
+暲
+暴
+暹
+暾
+曄
+曇
+曉
+曖
+曙
+曜
+曝
+曠
+曰
+曲
+曳
+更
+書
+曹
+曼
+曽
+曾
+替
+最
+會
+月
+有
+朋
+服
+朏
+朔
+朕
+朗
+望
+朝
+期
+朧
+木
+未
+末
+本
+札
+朱
+朴
+机
+朽
+杁
+杉
+李
+杏
+材
+村
+杓
+杖
+杜
+杞
+束
+条
+杢
+杣
+来
+杭
+杮
+杯
+東
+杲
+杵
+杷
+杼
+松
+板
+枅
+枇
+析
+枓
+枕
+林
+枚
+果
+枝
+枠
+枡
+枢
+枯
+枳
+架
+柄
+柊
+柏
+某
+柑
+染
+柔
+柘
+柚
+柯
+柱
+柳
+柴
+柵
+査
+柾
+柿
+栂
+栃
+栄
+栖
+栗
+校
+株
+栲
+栴
+核
+根
+栻
+格
+栽
+桁
+桂
+桃
+框
+案
+桐
+桑
+桓
+桔
+桜
+桝
+桟
+桧
+桴
+桶
+桾
+梁
+梅
+梆
+梓
+梔
+梗
+梛
+條
+梟
+梢
+梧
+梨
+械
+梱
+梲
+梵
+梶
+棄
+棋
+棒
+棗
+棘
+棚
+棟
+棠
+森
+棲
+棹
+棺
+椀
+椅
+椋
+植
+椎
+椏
+椒
+椙
+検
+椥
+椹
+椿
+楊
+楓
+楕
+楚
+楞
+楠
+楡
+楢
+楨
+楪
+楫
+業
+楮
+楯
+楳
+極
+楷
+楼
+楽
+概
+榊
+榎
+榕
+榛
+榜
+榮
+榱
+榴
+槃
+槇
+槊
+構
+槌
+槍
+槐
+様
+槙
+槻
+槽
+槿
+樂
+樋
+樓
+樗
+標
+樟
+模
+権
+横
+樫
+樵
+樹
+樺
+樽
+橇
+橋
+橘
+機
+橿
+檀
+檄
+檎
+檐
+檗
+檜
+檣
+檥
+檬
+檮
+檸
+檻
+櫃
+櫓
+櫛
+櫟
+櫨
+櫻
+欄
+欅
+欠
+次
+欣
+欧
+欲
+欺
+欽
+款
+歌
+歎
+歓
+止
+正
+此
+武
+歩
+歪
+歯
+歳
+歴
+死
+殆
+殉
+殊
+残
+殖
+殯
+殴
+段
+殷
+殺
+殻
+殿
+毀
+毅
+母
+毎
+毒
+比
+毘
+毛
+毫
+毬
+氈
+氏
+民
+気
+水
+氷
+永
+氾
+汀
+汁
+求
+汎
+汐
+汗
+汚
+汝
+江
+池
+汪
+汰
+汲
+決
+汽
+沂
+沃
+沅
+沆
+沈
+沌
+沐
+沓
+沖
+沙
+没
+沢
+沱
+河
+沸
+油
+治
+沼
+沽
+沿
+況
+泉
+泊
+泌
+法
+泗
+泡
+波
+泣
+泥
+注
+泯
+泰
+泳
+洋
+洒
+洗
+洛
+洞
+津
+洩
+洪
+洲
+洸
+洹
+活
+洽
+派
+流
+浄
+浅
+浙
+浚
+浜
+浣
+浦
+浩
+浪
+浮
+浴
+海
+浸
+涅
+消
+涌
+涙
+涛
+涯
+液
+涵
+涼
+淀
+淄
+淆
+淇
+淋
+淑
+淘
+淡
+淤
+淨
+淫
+深
+淳
+淵
+混
+淹
+添
+清
+済
+渉
+渋
+渓
+渕
+渚
+減
+渟
+渠
+渡
+渤
+渥
+渦
+温
+渫
+測
+港
+游
+渾
+湊
+湖
+湘
+湛
+湧
+湫
+湯
+湾
+湿
+満
+源
+準
+溜
+溝
+溢
+溥
+溪
+溶
+溺
+滄
+滅
+滋
+滌
+滑
+滕
+滝
+滞
+滴
+滸
+滹
+滿
+漁
+漂
+漆
+漉
+漏
+漑
+演
+漕
+漠
+漢
+漣
+漫
+漬
+漱
+漸
+漿
+潅
+潔
+潙
+潜
+潟
+潤
+潭
+潮
+潰
+潴
+澁
+澂
+澄
+澎
+澗
+澤
+澪
+澱
+澳
+激
+濁
+濃
+濟
+濠
+濡
+濤
+濫
+濯
+濱
+濾
+瀉
+瀋
+瀑
+瀕
+瀞
+瀟
+瀧
+瀬
+瀾
+灌
+灑
+灘
+火
+灯
+灰
+灸
+災
+炉
+炊
+炎
+炒
+炭
+炮
+炷
+点
+為
+烈
+烏
+烙
+烝
+烹
+焔
+焙
+焚
+無
+焦
+然
+焼
+煇
+煉
+煌
+煎
+煕
+煙
+煤
+煥
+照
+煩
+煬
+煮
+煽
+熈
+熊
+熙
+熟
+熨
+熱
+熹
+熾
+燃
+燈
+燎
+燔
+燕
+燗
+燥
+燭
+燻
+爆
+爐
+爪
+爬
+爲
+爵
+父
+爺
+爼
+爽
+爾
+片
+版
+牌
+牒
+牘
+牙
+牛
+牝
+牟
+牡
+牢
+牧
+物
+牲
+特
+牽
+犂
+犠
+犬
+犯
+状
+狂
+狄
+狐
+狗
+狙
+狛
+狡
+狩
+独
+狭
+狷
+狸
+狼
+猊
+猛
+猟
+猥
+猨
+猩
+猪
+猫
+献
+猴
+猶
+猷
+猾
+猿
+獄
+獅
+獏
+獣
+獲
+玄
+玅
+率
+玉
+王
+玖
+玩
+玲
+珀
+珂
+珈
+珉
+珊
+珍
+珎
+珞
+珠
+珣
+珥
+珪
+班
+現
+球
+理
+琉
+琢
+琥
+琦
+琮
+琲
+琳
+琴
+琵
+琶
+瑁
+瑋
+瑙
+瑚
+瑛
+瑜
+瑞
+瑠
+瑤
+瑩
+瑪
+瑳
+瑾
+璃
+璋
+璜
+璞
+璧
+璨
+環
+璵
+璽
+璿
+瓊
+瓔
+瓜
+瓢
+瓦
+瓶
+甍
+甑
+甕
+甘
+甚
+甞
+生
+産
+甥
+用
+甫
+田
+由
+甲
+申
+男
+町
+画
+界
+畏
+畑
+畔
+留
+畜
+畝
+畠
+畢
+略
+番
+異
+畳
+當
+畷
+畸
+畺
+畿
+疆
+疇
+疋
+疎
+疏
+疑
+疫
+疱
+疲
+疹
+疼
+疾
+病
+症
+痒
+痔
+痕
+痘
+痙
+痛
+痢
+痩
+痴
+痺
+瘍
+瘡
+瘧
+療
+癇
+癌
+癒
+癖
+癡
+癪
+発
+登
+白
+百
+的
+皆
+皇
+皋
+皐
+皓
+皮
+皺
+皿
+盂
+盃
+盆
+盈
+益
+盒
+盗
+盛
+盞
+盟
+盡
+監
+盤
+盥
+盧
+目
+盲
+直
+相
+盾
+省
+眉
+看
+県
+眞
+真
+眠
+眷
+眺
+眼
+着
+睡
+督
+睦
+睨
+睿
+瞋
+瞑
+瞞
+瞬
+瞭
+瞰
+瞳
+瞻
+瞼
+瞿
+矍
+矛
+矜
+矢
+知
+矧
+矩
+短
+矮
+矯
+石
+砂
+砌
+研
+砕
+砥
+砦
+砧
+砲
+破
+砺
+硝
+硫
+硬
+硯
+碁
+碇
+碌
+碑
+碓
+碕
+碗
+碣
+碧
+碩
+確
+碾
+磁
+磐
+磔
+磧
+磨
+磬
+磯
+礁
+礎
+礒
+礙
+礫
+礬
+示
+礼
+社
+祀
+祁
+祇
+祈
+祉
+祐
+祓
+祕
+祖
+祗
+祚
+祝
+神
+祟
+祠
+祢
+祥
+票
+祭
+祷
+祺
+禁
+禄
+禅
+禊
+禍
+禎
+福
+禔
+禖
+禛
+禦
+禧
+禮
+禰
+禹
+禽
+禿
+秀
+私
+秋
+科
+秒
+秘
+租
+秤
+秦
+秩
+称
+移
+稀
+程
+税
+稔
+稗
+稙
+稚
+稜
+稠
+種
+稱
+稲
+稷
+稻
+稼
+稽
+稿
+穀
+穂
+穆
+積
+穎
+穏
+穗
+穜
+穢
+穣
+穫
+穴
+究
+空
+突
+窃
+窄
+窒
+窓
+窟
+窠
+窩
+窪
+窮
+窯
+竃
+竄
+竈
+立
+站
+竜
+竝
+竟
+章
+童
+竪
+竭
+端
+竴
+競
+竹
+竺
+竽
+竿
+笄
+笈
+笏
+笑
+笙
+笛
+笞
+笠
+笥
+符
+第
+笹
+筅
+筆
+筇
+筈
+等
+筋
+筌
+筍
+筏
+筐
+筑
+筒
+答
+策
+筝
+筥
+筧
+筬
+筮
+筯
+筰
+筵
+箆
+箇
+箋
+箏
+箒
+箔
+箕
+算
+箙
+箜
+管
+箪
+箭
+箱
+箸
+節
+篁
+範
+篆
+篇
+築
+篋
+篌
+篝
+篠
+篤
+篥
+篦
+篩
+篭
+篳
+篷
+簀
+簒
+簡
+簧
+簪
+簫
+簺
+簾
+簿
+籀
+籃
+籌
+籍
+籐
+籟
+籠
+籤
+籬
+米
+籾
+粂
+粉
+粋
+粒
+粕
+粗
+粘
+粛
+粟
+粥
+粧
+粮
+粳
+精
+糊
+糖
+糜
+糞
+糟
+糠
+糧
+糯
+糸
+糺
+系
+糾
+紀
+約
+紅
+紋
+納
+紐
+純
+紗
+紘
+紙
+級
+紛
+素
+紡
+索
+紫
+紬
+累
+細
+紳
+紵
+紹
+紺
+絁
+終
+絃
+組
+絅
+経
+結
+絖
+絞
+絡
+絣
+給
+統
+絲
+絵
+絶
+絹
+絽
+綏
+經
+継
+続
+綜
+綟
+綬
+維
+綱
+網
+綴
+綸
+綺
+綽
+綾
+綿
+緊
+緋
+総
+緑
+緒
+線
+締
+緥
+編
+緩
+緬
+緯
+練
+緻
+縁
+縄
+縅
+縒
+縛
+縞
+縢
+縣
+縦
+縫
+縮
+縹
+總
+績
+繁
+繊
+繋
+繍
+織
+繕
+繝
+繦
+繧
+繰
+繹
+繼
+纂
+纈
+纏
+纐
+纒
+纛
+缶
+罔
+罠
+罧
+罪
+置
+罰
+署
+罵
+罷
+罹
+羂
+羅
+羆
+羇
+羈
+羊
+羌
+美
+群
+羨
+義
+羯
+羲
+羹
+羽
+翁
+翅
+翌
+習
+翔
+翛
+翠
+翡
+翫
+翰
+翺
+翻
+翼
+耀
+老
+考
+者
+耆
+而
+耐
+耕
+耗
+耨
+耳
+耶
+耽
+聊
+聖
+聘
+聚
+聞
+聟
+聡
+聨
+聯
+聰
+聲
+聴
+職
+聾
+肄
+肆
+肇
+肉
+肋
+肌
+肖
+肘
+肛
+肝
+股
+肢
+肥
+肩
+肪
+肯
+肱
+育
+肴
+肺
+胃
+胆
+背
+胎
+胖
+胚
+胝
+胞
+胡
+胤
+胱
+胴
+胸
+能
+脂
+脅
+脆
+脇
+脈
+脊
+脚
+脛
+脩
+脱
+脳
+腋
+腎
+腐
+腑
+腔
+腕
+腫
+腰
+腱
+腸
+腹
+腺
+腿
+膀
+膏
+膚
+膜
+膝
+膠
+膣
+膨
+膩
+膳
+膵
+膾
+膿
+臂
+臆
+臈
+臍
+臓
+臘
+臚
+臣
+臥
+臨
+自
+臭
+至
+致
+臺
+臼
+舂
+舅
+與
+興
+舌
+舍
+舎
+舒
+舖
+舗
+舘
+舜
+舞
+舟
+舩
+航
+般
+舳
+舶
+船
+艇
+艘
+艦
+艮
+良
+色
+艶
+芋
+芒
+芙
+芝
+芥
+芦
+芬
+芭
+芯
+花
+芳
+芸
+芹
+芻
+芽
+芿
+苅
+苑
+苔
+苗
+苛
+苞
+苡
+若
+苦
+苧
+苫
+英
+苴
+苻
+茂
+范
+茄
+茅
+茎
+茗
+茘
+茜
+茨
+茲
+茵
+茶
+茸
+茹
+草
+荊
+荏
+荒
+荘
+荷
+荻
+荼
+莞
+莪
+莫
+莬
+莱
+莵
+莽
+菅
+菊
+菌
+菓
+菖
+菘
+菜
+菟
+菩
+菫
+華
+菱
+菴
+萄
+萊
+萌
+萍
+萎
+萠
+萩
+萬
+萱
+落
+葉
+著
+葛
+葡
+董
+葦
+葩
+葬
+葭
+葱
+葵
+葺
+蒋
+蒐
+蒔
+蒙
+蒟
+蒡
+蒲
+蒸
+蒻
+蒼
+蒿
+蓄
+蓆
+蓉
+蓋
+蓑
+蓬
+蓮
+蓼
+蔀
+蔑
+蔓
+蔚
+蔡
+蔦
+蔬
+蔭
+蔵
+蔽
+蕃
+蕉
+蕊
+蕎
+蕨
+蕩
+蕪
+蕭
+蕾
+薄
+薇
+薊
+薔
+薗
+薙
+薛
+薦
+薨
+薩
+薪
+薫
+薬
+薭
+薮
+藁
+藉
+藍
+藏
+藐
+藝
+藤
+藩
+藪
+藷
+藹
+藺
+藻
+蘂
+蘆
+蘇
+蘊
+蘭
+虎
+虐
+虔
+虚
+虜
+虞
+號
+虫
+虹
+虻
+蚊
+蚕
+蛇
+蛉
+蛍
+蛎
+蛙
+蛛
+蛟
+蛤
+蛭
+蛮
+蛸
+蛹
+蛾
+蜀
+蜂
+蜃
+蜆
+蜊
+蜘
+蜜
+蜷
+蜻
+蝉
+蝋
+蝕
+蝙
+蝠
+蝦
+蝶
+蝿
+螂
+融
+螣
+螺
+蟄
+蟇
+蟠
+蟷
+蟹
+蟻
+蠢
+蠣
+血
+衆
+行
+衍
+衒
+術
+街
+衙
+衛
+衝
+衞
+衡
+衢
+衣
+表
+衫
+衰
+衵
+衷
+衽
+衾
+衿
+袁
+袈
+袋
+袍
+袒
+袖
+袙
+袞
+袢
+被
+袰
+袱
+袴
+袷
+袿
+裁
+裂
+裃
+装
+裏
+裔
+裕
+裘
+裙
+補
+裟
+裡
+裲
+裳
+裴
+裸
+裹
+製
+裾
+褂
+褄
+複
+褌
+褐
+褒
+褥
+褪
+褶
+褻
+襄
+襖
+襞
+襟
+襠
+襦
+襪
+襲
+襴
+襷
+西
+要
+覆
+覇
+覈
+見
+規
+視
+覗
+覚
+覧
+親
+覲
+観
+覺
+觀
+角
+解
+触
+言
+訂
+計
+討
+訓
+託
+記
+訛
+訟
+訢
+訥
+訪
+設
+許
+訳
+訴
+訶
+診
+註
+証
+詐
+詔
+評
+詛
+詞
+詠
+詢
+詣
+試
+詩
+詫
+詮
+詰
+話
+該
+詳
+誄
+誅
+誇
+誉
+誌
+認
+誓
+誕
+誘
+語
+誠
+誡
+誣
+誤
+誥
+誦
+説
+読
+誰
+課
+誼
+誾
+調
+談
+請
+諌
+諍
+諏
+諒
+論
+諚
+諜
+諟
+諡
+諦
+諧
+諫
+諭
+諮
+諱
+諶
+諷
+諸
+諺
+諾
+謀
+謄
+謌
+謎
+謗
+謙
+謚
+講
+謝
+謡
+謫
+謬
+謹
+證
+識
+譚
+譛
+譜
+警
+譬
+譯
+議
+譲
+譴
+護
+讀
+讃
+讐
+讒
+谷
+谿
+豅
+豆
+豊
+豎
+豐
+豚
+象
+豪
+豫
+豹
+貌
+貝
+貞
+負
+財
+貢
+貧
+貨
+販
+貪
+貫
+責
+貯
+貰
+貴
+買
+貸
+費
+貼
+貿
+賀
+賁
+賂
+賃
+賄
+資
+賈
+賊
+賎
+賑
+賓
+賛
+賜
+賞
+賠
+賢
+賣
+賤
+賦
+質
+賭
+購
+賽
+贄
+贅
+贈
+贋
+贔
+贖
+赤
+赦
+走
+赴
+起
+超
+越
+趙
+趣
+足
+趺
+趾
+跋
+跏
+距
+跡
+跨
+跪
+路
+跳
+践
+踊
+踏
+踐
+踞
+踪
+踵
+蹄
+蹉
+蹊
+蹟
+蹲
+蹴
+躅
+躇
+躊
+躍
+躑
+躙
+躪
+身
+躬
+躯
+躰
+車
+軋
+軌
+軍
+軒
+軟
+転
+軸
+軻
+軽
+軾
+較
+載
+輌
+輔
+輜
+輝
+輦
+輩
+輪
+輯
+輸
+輿
+轄
+轍
+轟
+轢
+辛
+辞
+辟
+辥
+辦
+辨
+辰
+辱
+農
+辺
+辻
+込
+迂
+迅
+迎
+近
+返
+迢
+迦
+迪
+迫
+迭
+述
+迷
+迹
+追
+退
+送
+逃
+逅
+逆
+逍
+透
+逐
+逓
+途
+逕
+逗
+這
+通
+逝
+逞
+速
+造
+逢
+連
+逮
+週
+進
+逸
+逼
+遁
+遂
+遅
+遇
+遊
+運
+遍
+過
+遐
+道
+達
+違
+遙
+遜
+遠
+遡
+遣
+遥
+適
+遭
+遮
+遯
+遵
+遷
+選
+遺
+遼
+避
+邀
+邁
+邂
+邃
+還
+邇
+邉
+邊
+邑
+那
+邦
+邨
+邪
+邯
+邵
+邸
+郁
+郊
+郎
+郡
+郢
+部
+郭
+郴
+郵
+郷
+都
+鄂
+鄙
+鄭
+鄰
+鄲
+酉
+酋
+酌
+配
+酎
+酒
+酔
+酢
+酥
+酪
+酬
+酵
+酷
+酸
+醍
+醐
+醒
+醗
+醜
+醤
+醪
+醵
+醸
+采
+釈
+釉
+釋
+里
+重
+野
+量
+釐
+金
+釘
+釜
+針
+釣
+釧
+釿
+鈍
+鈎
+鈐
+鈔
+鈞
+鈦
+鈴
+鈷
+鈸
+鈿
+鉄
+鉇
+鉉
+鉋
+鉛
+鉢
+鉤
+鉦
+鉱
+鉾
+銀
+銃
+銅
+銈
+銑
+銕
+銘
+銚
+銜
+銭
+鋏
+鋒
+鋤
+鋭
+鋲
+鋳
+鋸
+鋺
+鋼
+錆
+錍
+錐
+錘
+錠
+錣
+錦
+錫
+錬
+錯
+録
+錵
+鍋
+鍍
+鍑
+鍔
+鍛
+鍬
+鍮
+鍵
+鍼
+鍾
+鎌
+鎖
+鎗
+鎚
+鎧
+鎬
+鎮
+鎰
+鎹
+鏃
+鏑
+鏡
+鐃
+鐇
+鐐
+鐔
+鐘
+鐙
+鐚
+鐡
+鐵
+鐸
+鑁
+鑊
+鑑
+鑒
+鑚
+鑠
+鑢
+鑰
+鑵
+鑷
+鑼
+鑽
+鑿
+長
+門
+閃
+閇
+閉
+開
+閏
+閑
+間
+閔
+閘
+関
+閣
+閤
+閥
+閦
+閨
+閬
+閲
+閻
+閼
+閾
+闇
+闍
+闔
+闕
+闘
+關
+闡
+闢
+闥
+阜
+阪
+阮
+阯
+防
+阻
+阿
+陀
+陂
+附
+陌
+降
+限
+陛
+陞
+院
+陣
+除
+陥
+陪
+陬
+陰
+陳
+陵
+陶
+陸
+険
+陽
+隅
+隆
+隈
+隊
+隋
+階
+随
+隔
+際
+障
+隠
+隣
+隧
+隷
+隻
+隼
+雀
+雁
+雄
+雅
+集
+雇
+雉
+雊
+雋
+雌
+雍
+雑
+雖
+雙
+雛
+離
+難
+雨
+雪
+雫
+雰
+雲
+零
+雷
+雹
+電
+需
+震
+霊
+霍
+霖
+霜
+霞
+霧
+霰
+露
+靈
+青
+靖
+静
+靜
+非
+面
+革
+靫
+靭
+靱
+靴
+靺
+鞁
+鞄
+鞆
+鞋
+鞍
+鞏
+鞘
+鞠
+鞨
+鞭
+韋
+韓
+韜
+韮
+音
+韶
+韻
+響
+頁
+頂
+頃
+項
+順
+須
+頌
+預
+頑
+頒
+頓
+領
+頚
+頬
+頭
+頴
+頸
+頻
+頼
+顆
+題
+額
+顎
+顔
+顕
+顗
+願
+顛
+類
+顧
+顯
+風
+飛
+食
+飢
+飩
+飫
+飯
+飲
+飴
+飼
+飽
+飾
+餃
+餅
+餉
+養
+餌
+餐
+餓
+餘
+餝
+餡
+館
+饂
+饅
+饉
+饋
+饌
+饒
+饗
+首
+馗
+香
+馨
+馬
+馳
+馴
+駄
+駅
+駆
+駈
+駐
+駒
+駕
+駝
+駿
+騁
+騎
+騏
+騒
+験
+騙
+騨
+騰
+驕
+驚
+驛
+驢
+骨
+骸
+髄
+體
+高
+髙
+髢
+髪
+髭
+髮
+髷
+髻
+鬘
+鬚
+鬢
+鬨
+鬯
+鬱
+鬼
+魁
+魂
+魄
+魅
+魏
+魔
+魚
+魯
+鮎
+鮑
+鮒
+鮪
+鮫
+鮭
+鮮
+鯉
+鯔
+鯖
+鯛
+鯨
+鯰
+鯱
+鰐
+鰒
+鰭
+鰯
+鰰
+鰹
+鰻
+鱈
+鱒
+鱗
+鱧
+鳥
+鳩
+鳰
+鳳
+鳴
+鳶
+鴈
+鴉
+鴎
+鴛
+鴟
+鴦
+鴨
+鴫
+鴻
+鵄
+鵜
+鵞
+鵡
+鵬
+鵲
+鵺
+鶉
+鶏
+鶯
+鶴
+鷄
+鷙
+鷲
+鷹
+鷺
+鸚
+鸞
+鹸
+鹽
+鹿
+麁
+麒
+麓
+麗
+麝
+麞
+麟
+麦
+麩
+麹
+麺
+麻
+麾
+麿
+黄
+黌
+黍
+黒
+黙
+黛
+黠
+鼈
+鼉
+鼎
+鼓
+鼠
+鼻
+齊
+齋
+齟
+齢
+齬
+龍
+龕
+龗
+!
+#
+%
+&
+(
+)
++
+,
+-
+.
+/
+0
+1
+2
+3
+4
+5
+6
+7
+8
+9
+:
+;
+=
+?
+@
+A
+B
+C
+D
+E
+F
+G
+H
+I
+J
+K
+L
+M
+N
+O
+P
+R
+S
+T
+U
+V
+W
+X
+Z
+a
+c
+d
+e
+f
+h
+i
+j
+k
+l
+m
+n
+o
+p
+r
+s
+t
+u
+y
+z
+~
+・
+
diff --git a/modules/image/text_recognition/japan_ocr_db_crnn_mobile/character.py b/modules/image/text_recognition/japan_ocr_db_crnn_mobile/character.py
new file mode 100644
index 0000000000000000000000000000000000000000..21dbbd9dc790e3d009f45c1ef1b68c001e9f0e0b
--- /dev/null
+++ b/modules/image/text_recognition/japan_ocr_db_crnn_mobile/character.py
@@ -0,0 +1,213 @@
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import numpy as np
+import string
+
+class CharacterOps(object):
+ """ Convert between text-label and text-index """
+
+ def __init__(self, config):
+ self.character_type = config['character_type']
+ self.loss_type = config['loss_type']
+ self.max_text_len = config['max_text_length']
+ if self.character_type == "en":
+ self.character_str = "0123456789abcdefghijklmnopqrstuvwxyz"
+ dict_character = list(self.character_str)
+ elif self.character_type in [
+ "ch", 'japan', 'korean', 'french', 'german'
+ ]:
+ character_dict_path = config['character_dict_path']
+ add_space = False
+ if 'use_space_char' in config:
+ add_space = config['use_space_char']
+ self.character_str = ""
+ with open(character_dict_path, "rb") as fin:
+ lines = fin.readlines()
+ for line in lines:
+ line = line.decode('utf-8').strip("\n").strip("\r\n")
+ self.character_str += line
+ if add_space:
+ self.character_str += " "
+ dict_character = list(self.character_str)
+ elif self.character_type == "en_sensitive":
+ # same with ASTER setting (use 94 char).
+ self.character_str = string.printable[:-6]
+ dict_character = list(self.character_str)
+ else:
+ self.character_str = None
+        assert self.character_str is not None, \
+            "Unsupported character type: {}".format(self.character_type)
+ self.beg_str = "sos"
+ self.end_str = "eos"
+ if self.loss_type == "attention":
+ dict_character = [self.beg_str, self.end_str] + dict_character
+ elif self.loss_type == "srn":
+ dict_character = dict_character + [self.beg_str, self.end_str]
+ self.dict = {}
+ for i, char in enumerate(dict_character):
+ self.dict[char] = i
+ self.character = dict_character
+
+ def encode(self, text):
+ """convert text-label into text-index.
+ input:
+ text: text labels of each image. [batch_size]
+
+ output:
+ text: concatenated text index for CTCLoss.
+ [sum(text_lengths)] = [text_index_0 + text_index_1 + ... + text_index_(n - 1)]
+ length: length of each text. [batch_size]
+ """
+ if self.character_type == "en":
+ text = text.lower()
+
+ text_list = []
+ for char in text:
+ if char not in self.dict:
+ continue
+ text_list.append(self.dict[char])
+ text = np.array(text_list)
+ return text
+
+ def decode(self, text_index, is_remove_duplicate=False):
+ """ convert text-index into text-label. """
+ char_list = []
+ char_num = self.get_char_num()
+
+ if self.loss_type == "attention":
+ beg_idx = self.get_beg_end_flag_idx("beg")
+ end_idx = self.get_beg_end_flag_idx("end")
+ ignored_tokens = [beg_idx, end_idx]
+ else:
+ ignored_tokens = [char_num]
+
+ for idx in range(len(text_index)):
+ if text_index[idx] in ignored_tokens:
+ continue
+ if is_remove_duplicate:
+ if idx > 0 and text_index[idx - 1] == text_index[idx]:
+ continue
+ char_list.append(self.character[int(text_index[idx])])
+ text = ''.join(char_list)
+ return text
+
+ def get_char_num(self):
+ return len(self.character)
+
+ def get_beg_end_flag_idx(self, beg_or_end):
+ if self.loss_type == "attention":
+ if beg_or_end == "beg":
+ idx = np.array(self.dict[self.beg_str])
+ elif beg_or_end == "end":
+ idx = np.array(self.dict[self.end_str])
+ else:
+ assert False, "Unsupport type %s in get_beg_end_flag_idx"\
+ % beg_or_end
+ return idx
+ else:
+ err = "error in get_beg_end_flag_idx when using the loss %s"\
+ % (self.loss_type)
+ assert False, err
+
+
+def cal_predicts_accuracy(char_ops,
+ preds,
+ preds_lod,
+ labels,
+ labels_lod,
+ is_remove_duplicate=False):
+ acc_num = 0
+ img_num = 0
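+    # preds_lod / labels_lod are LoD boundary arrays: a lod of [0, 3, 5]
+    # means sample 0 occupies rows 0:3 and sample 1 rows 3:5 of the
+    # flattened prediction / label tensors.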
+ for ino in range(len(labels_lod) - 1):
+ beg_no = preds_lod[ino]
+ end_no = preds_lod[ino + 1]
+ preds_text = preds[beg_no:end_no].reshape(-1)
+ preds_text = char_ops.decode(preds_text, is_remove_duplicate)
+
+ beg_no = labels_lod[ino]
+ end_no = labels_lod[ino + 1]
+ labels_text = labels[beg_no:end_no].reshape(-1)
+ labels_text = char_ops.decode(labels_text, is_remove_duplicate)
+ img_num += 1
+
+ if preds_text == labels_text:
+ acc_num += 1
+ acc = acc_num * 1.0 / img_num
+ return acc, acc_num, img_num
+
+
+def cal_predicts_accuracy_srn(char_ops,
+ preds,
+ labels,
+ max_text_len,
+ is_debug=False):
+ acc_num = 0
+ img_num = 0
+
+ char_num = char_ops.get_char_num()
+
+ total_len = preds.shape[0]
+ img_num = int(total_len / max_text_len)
+ for i in range(img_num):
+ cur_label = []
+ cur_pred = []
+ for j in range(max_text_len):
+            if labels[j + i * max_text_len] != int(char_num - 1):  # char_num - 1 is the padding index
+ cur_label.append(labels[j + i * max_text_len][0])
+ else:
+ break
+
+ for j in range(max_text_len + 1):
+ if j < len(cur_label) and preds[j + i * max_text_len][
+ 0] != cur_label[j]:
+ break
+ elif j == len(cur_label) and j == max_text_len:
+ acc_num += 1
+ break
+ elif j == len(cur_label) and preds[j + i * max_text_len][0] == int(
+ char_num - 1):
+ acc_num += 1
+ break
+ acc = acc_num * 1.0 / img_num
+ return acc, acc_num, img_num
+
+
+def convert_rec_attention_infer_res(preds):
+ img_num = preds.shape[0]
+ target_lod = [0]
+ convert_ids = []
+ for ino in range(img_num):
+ end_pos = np.where(preds[ino, :] == 1)[0]
+ if len(end_pos) <= 1:
+ text_list = preds[ino, 1:]
+ else:
+ text_list = preds[ino, 1:end_pos[1]]
+ target_lod.append(target_lod[ino] + len(text_list))
+ convert_ids = convert_ids + list(text_list)
+ convert_ids = np.array(convert_ids)
+ convert_ids = convert_ids.reshape((-1, 1))
+ return convert_ids, target_lod
+
+
+def convert_rec_label_to_lod(ori_labels):
+ img_num = len(ori_labels)
+ target_lod = [0]
+ convert_ids = []
+ for ino in range(img_num):
+ target_lod.append(target_lod[ino] + len(ori_labels[ino]))
+ convert_ids = convert_ids + list(ori_labels[ino])
+ convert_ids = np.array(convert_ids)
+ convert_ids = convert_ids.reshape((-1, 1))
+ return convert_ids, target_lod
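+
+
+if __name__ == '__main__':
+    # Illustrative usage sketch (not part of the original PaddleOCR code):
+    # exercise CharacterOps with the built-in English charset, so no dict
+    # file is needed, then show how labels are flattened into LoD form.
+    demo_config = {
+        'character_type': 'en',
+        'loss_type': 'ctc',
+        'max_text_length': 25,
+    }
+    ops = CharacterOps(demo_config)
+    encoded = ops.encode('hello123')
+    print(encoded)              # indices into ops.character
+    print(ops.decode(encoded))  # round-trips back to 'hello123'
+    ids, lod = convert_rec_label_to_lod([[1, 2, 3], [4, 5]])
+    print(ids.shape, lod)       # (5, 1) [0, 3, 5]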
diff --git a/modules/image/text_recognition/japan_ocr_db_crnn_mobile/module.py b/modules/image/text_recognition/japan_ocr_db_crnn_mobile/module.py
new file mode 100644
index 0000000000000000000000000000000000000000..cd04f063496af4a93459ec19a7a46b93f2dab51b
--- /dev/null
+++ b/modules/image/text_recognition/japan_ocr_db_crnn_mobile/module.py
@@ -0,0 +1,591 @@
+# -*- coding:utf-8 -*-
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import argparse
+import ast
+import copy
+import math
+import os
+import time
+
+from paddle.fluid.core import AnalysisConfig, create_paddle_predictor, PaddleTensor
+from paddlehub.common.logger import logger
+from paddlehub.module.module import moduleinfo, runnable, serving
+from PIL import Image
+import cv2
+import numpy as np
+import paddle.fluid as fluid
+import paddlehub as hub
+
+from japan_ocr_db_crnn_mobile.character import CharacterOps
+from japan_ocr_db_crnn_mobile.utils import base64_to_cv2, draw_ocr, get_image_ext, sorted_boxes
+
+
+@moduleinfo(
+ name="japan_ocr_db_crnn_mobile",
+ version="1.0.0",
+    summary=
+    "The module can recognize Japanese texts in an image. Firstly, it detects the text box positions based on the differentiable_binarization module. Then it recognizes the Japanese texts.",
+ author="paddle-dev",
+ author_email="paddle-dev@baidu.com",
+ type="cv/text_recognition")
+class JapanOCRDBCRNNMobile(hub.Module):
+ def _initialize(self, text_detector_module=None, enable_mkldnn=False, use_angle_classification=False):
+ """
+ initialize with the necessary elements
+ """
+ self.character_dict_path = os.path.join(self.directory, 'assets',
+ 'japan_dict.txt')
+ char_ops_params = {
+ 'character_type': 'japan',
+ 'character_dict_path': self.character_dict_path,
+ 'loss_type': 'ctc',
+ 'max_text_length': 25,
+ 'use_space_char': True
+ }
+ self.char_ops = CharacterOps(char_ops_params)
+ self.rec_image_shape = [3, 32, 320]
+ self._text_detector_module = text_detector_module
+ self.font_file = os.path.join(self.directory, 'assets', 'japan.ttc')
+ self.enable_mkldnn = enable_mkldnn
+ self.use_angle_classification = use_angle_classification
+
+ self.rec_pretrained_model_path = os.path.join(
+ self.directory, 'inference_model', 'character_rec')
+ self.rec_predictor, self.rec_input_tensor, self.rec_output_tensors = self._set_config(
+ self.rec_pretrained_model_path)
+
+ if self.use_angle_classification:
+ self.cls_pretrained_model_path = os.path.join(
+ self.directory, 'inference_model', 'angle_cls')
+
+ self.cls_predictor, self.cls_input_tensor, self.cls_output_tensors = self._set_config(
+ self.cls_pretrained_model_path)
+
+ def _set_config(self, pretrained_model_path):
+ """
+ predictor config path
+ """
+ model_file_path = os.path.join(pretrained_model_path, 'model')
+ params_file_path = os.path.join(pretrained_model_path, 'params')
+
+ config = AnalysisConfig(model_file_path, params_file_path)
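+        # Probe CUDA_VISIBLE_DEVICES: if its first character parses as a
+        # device id, the GPU path is enabled; any failure (unset or
+        # non-numeric) silently falls back to CPU.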
+ try:
+ _places = os.environ["CUDA_VISIBLE_DEVICES"]
+ int(_places[0])
+ use_gpu = True
+ except:
+ use_gpu = False
+
+ if use_gpu:
+ config.enable_use_gpu(8000, 0)
+ else:
+ config.disable_gpu()
+ if self.enable_mkldnn:
+ # cache 10 different shapes for mkldnn to avoid memory leak
+ config.set_mkldnn_cache_capacity(10)
+ config.enable_mkldnn()
+
+ config.disable_glog_info()
+ config.delete_pass("conv_transpose_eltwiseadd_bn_fuse_pass")
+ config.switch_use_feed_fetch_ops(False)
+
+ predictor = create_paddle_predictor(config)
+
+ input_names = predictor.get_input_names()
+ input_tensor = predictor.get_input_tensor(input_names[0])
+ output_names = predictor.get_output_names()
+ output_tensors = []
+ for output_name in output_names:
+ output_tensor = predictor.get_output_tensor(output_name)
+ output_tensors.append(output_tensor)
+
+ return predictor, input_tensor, output_tensors
+
+ @property
+ def text_detector_module(self):
+ """
+ text detect module
+ """
+ if not self._text_detector_module:
+ self._text_detector_module = hub.Module(
+ name='chinese_text_detection_db_mobile',
+ enable_mkldnn=self.enable_mkldnn,
+ version='1.0.4')
+ return self._text_detector_module
+
+ def read_images(self, paths=[]):
+ images = []
+ for img_path in paths:
+ assert os.path.isfile(
+ img_path), "The {} isn't a valid file.".format(img_path)
+ img = cv2.imread(img_path)
+ if img is None:
+ logger.info("error in loading image:{}".format(img_path))
+ continue
+ images.append(img)
+ return images
+
+ def get_rotate_crop_image(self, img, points):
+ '''
+ img_height, img_width = img.shape[0:2]
+ left = int(np.min(points[:, 0]))
+ right = int(np.max(points[:, 0]))
+ top = int(np.min(points[:, 1]))
+ bottom = int(np.max(points[:, 1]))
+ img_crop = img[top:bottom, left:right, :].copy()
+ points[:, 0] = points[:, 0] - left
+ points[:, 1] = points[:, 1] - top
+ '''
+ img_crop_width = int(
+ max(
+ np.linalg.norm(points[0] - points[1]),
+ np.linalg.norm(points[2] - points[3])))
+ img_crop_height = int(
+ max(
+ np.linalg.norm(points[0] - points[3]),
+ np.linalg.norm(points[1] - points[2])))
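+        # Map the (possibly rotated) quadrilateral onto an axis-aligned
+        # rectangle with a perspective transform; a crop much taller than
+        # it is wide is assumed to be vertical text and rotated by 90 deg.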
+ pts_std = np.float32([[0, 0], [img_crop_width, 0],
+ [img_crop_width, img_crop_height],
+ [0, img_crop_height]])
+ M = cv2.getPerspectiveTransform(points, pts_std)
+ dst_img = cv2.warpPerspective(
+ img,
+ M, (img_crop_width, img_crop_height),
+ borderMode=cv2.BORDER_REPLICATE,
+ flags=cv2.INTER_CUBIC)
+ dst_img_height, dst_img_width = dst_img.shape[0:2]
+ if dst_img_height * 1.0 / dst_img_width >= 1.5:
+ dst_img = np.rot90(dst_img)
+ return dst_img
+
+ def resize_norm_img_rec(self, img, max_wh_ratio):
+ imgC, imgH, imgW = self.rec_image_shape
+ assert imgC == img.shape[2]
+ h, w = img.shape[:2]
+ ratio = w / float(h)
+ if math.ceil(imgH * ratio) > imgW:
+ resized_w = imgW
+ else:
+ resized_w = int(math.ceil(imgH * ratio))
+ resized_image = cv2.resize(img, (resized_w, imgH))
+ resized_image = resized_image.astype('float32')
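+        # HWC -> CHW, scale to [0, 1], then shift/scale to [-1, 1]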
+ resized_image = resized_image.transpose((2, 0, 1)) / 255
+ resized_image -= 0.5
+ resized_image /= 0.5
+ padding_im = np.zeros((imgC, imgH, imgW), dtype=np.float32)
+ padding_im[:, :, 0:resized_w] = resized_image
+ return padding_im
+
+ def resize_norm_img_cls(self, img):
+ cls_image_shape = [3, 48, 192]
+ imgC, imgH, imgW = cls_image_shape
+ h = img.shape[0]
+ w = img.shape[1]
+ ratio = w / float(h)
+ if math.ceil(imgH * ratio) > imgW:
+ resized_w = imgW
+ else:
+ resized_w = int(math.ceil(imgH * ratio))
+ resized_image = cv2.resize(img, (resized_w, imgH))
+ resized_image = resized_image.astype('float32')
+ if cls_image_shape[0] == 1:
+ resized_image = resized_image / 255
+ resized_image = resized_image[np.newaxis, :]
+ else:
+ resized_image = resized_image.transpose((2, 0, 1)) / 255
+ resized_image -= 0.5
+ resized_image /= 0.5
+ padding_im = np.zeros((imgC, imgH, imgW), dtype=np.float32)
+ padding_im[:, :, 0:resized_w] = resized_image
+ return padding_im
+
+ def recognize_text(self,
+ images=[],
+ paths=[],
+ use_gpu=False,
+ output_dir='ocr_result',
+ visualization=False,
+ box_thresh=0.5,
+ text_thresh=0.5,
+ angle_classification_thresh=0.9):
+ """
+ Get the chinese texts in the predicted images.
+ Args:
+ images (list(numpy.ndarray)): images data, shape of each is [H, W, C]. If images not paths
+ paths (list[str]): The paths of images. If paths not images
+ use_gpu (bool): Whether to use gpu.
+ batch_size(int): the program deals once with one
+ output_dir (str): The directory to store output images.
+ visualization (bool): Whether to save image or not.
+ box_thresh(float): the threshold of the detected text box's confidence
+ text_thresh(float): the threshold of the chinese text recognition confidence
+ angle_classification_thresh(float): the threshold of the angle classification confidence
+
+ Returns:
+ res (list): The result of chinese texts and save path of images.
+ """
+ if use_gpu:
+ try:
+ _places = os.environ["CUDA_VISIBLE_DEVICES"]
+ int(_places[0])
+ except:
+ raise RuntimeError(
+ "Environment Variable CUDA_VISIBLE_DEVICES is not set correctly. If you wanna use gpu, please set CUDA_VISIBLE_DEVICES via export CUDA_VISIBLE_DEVICES=cuda_device_id."
+ )
+
+ self.use_gpu = use_gpu
+
+ if images != [] and isinstance(images, list) and paths == []:
+ predicted_data = images
+ elif images == [] and isinstance(paths, list) and paths != []:
+ predicted_data = self.read_images(paths)
+ else:
+ raise TypeError("The input data is inconsistent with expectations.")
+
+        assert predicted_data != [], "There are no images to be predicted. Please check the input data."
+
+ detection_results = self.text_detector_module.detect_text(
+ images=predicted_data, use_gpu=self.use_gpu, box_thresh=box_thresh)
+
+ boxes = [
+ np.array(item['data']).astype(np.float32)
+ for item in detection_results
+ ]
+ all_results = []
+ for index, img_boxes in enumerate(boxes):
+ original_image = predicted_data[index].copy()
+ result = {'save_path': ''}
+ if img_boxes.size == 0:
+ result['data'] = []
+ else:
+ img_crop_list = []
+ boxes = sorted_boxes(img_boxes)
+ for num_box in range(len(boxes)):
+ tmp_box = copy.deepcopy(boxes[num_box])
+ img_crop = self.get_rotate_crop_image(
+ original_image, tmp_box)
+ img_crop_list.append(img_crop)
+
+ if self.use_angle_classification:
+ img_crop_list, angle_list = self._classify_text(
+ img_crop_list,
+ angle_classification_thresh=angle_classification_thresh)
+
+ rec_results = self._recognize_text(img_crop_list)
+
+ # if the recognized text confidence score is lower than text_thresh, then drop it
+ rec_res_final = []
+ for index, res in enumerate(rec_results):
+ text, score = res
+ if score >= text_thresh:
+ rec_res_final.append({
+ 'text':
+ text,
+ 'confidence':
+ float(score),
+ 'text_box_position':
+ boxes[index].astype(np.int).tolist()
+ })
+ result['data'] = rec_res_final
+
+ if visualization and result['data']:
+ result['save_path'] = self.save_result_image(
+ original_image, boxes, rec_results, output_dir,
+ text_thresh)
+ all_results.append(result)
+
+ return all_results
+
+ @serving
+ def serving_method(self, images, **kwargs):
+ """
+ Run as a service.
+ """
+ images_decode = [base64_to_cv2(image) for image in images]
+ results = self.recognize_text(images_decode, **kwargs)
+ return results
+
+ def save_result_image(
+ self,
+ original_image,
+ detection_boxes,
+ rec_results,
+ output_dir='ocr_result',
+ text_thresh=0.5,
+ ):
+ image = Image.fromarray(cv2.cvtColor(original_image, cv2.COLOR_BGR2RGB))
+ txts = [item[0] for item in rec_results]
+ scores = [item[1] for item in rec_results]
+ draw_img = draw_ocr(
+ image,
+ detection_boxes,
+ txts,
+ scores,
+ font_file=self.font_file,
+ draw_txt=True,
+ drop_score=text_thresh)
+
+ if not os.path.exists(output_dir):
+ os.makedirs(output_dir)
+ ext = get_image_ext(original_image)
+ saved_name = 'ndarray_{}{}'.format(time.time(), ext)
+ save_file_path = os.path.join(output_dir, saved_name)
+ cv2.imwrite(save_file_path, draw_img[:, :, ::-1])
+ return save_file_path
+
+ def _classify_text(self, image_list, angle_classification_thresh=0.9):
+ img_list = copy.deepcopy(image_list)
+ img_num = len(img_list)
+ # Calculate the aspect ratio of all text bars
+ width_list = []
+ for img in img_list:
+ width_list.append(img.shape[1] / float(img.shape[0]))
+ # Sorting can speed up the cls process
+ indices = np.argsort(np.array(width_list))
+
+ cls_res = [['', 0.0]] * img_num
+ batch_num = 30
+ for beg_img_no in range(0, img_num, batch_num):
+ end_img_no = min(img_num, beg_img_no + batch_num)
+ norm_img_batch = []
+ max_wh_ratio = 0
+ for ino in range(beg_img_no, end_img_no):
+ h, w = img_list[indices[ino]].shape[0:2]
+ wh_ratio = w * 1.0 / h
+ max_wh_ratio = max(max_wh_ratio, wh_ratio)
+ for ino in range(beg_img_no, end_img_no):
+ norm_img = self.resize_norm_img_cls(img_list[indices[ino]])
+ norm_img = norm_img[np.newaxis, :]
+ norm_img_batch.append(norm_img)
+ norm_img_batch = np.concatenate(norm_img_batch)
+ norm_img_batch = norm_img_batch.copy()
+
+ self.cls_input_tensor.copy_from_cpu(norm_img_batch)
+ self.cls_predictor.zero_copy_run()
+
+ prob_out = self.cls_output_tensors[0].copy_to_cpu()
+ label_out = self.cls_output_tensors[1].copy_to_cpu()
+ if len(label_out.shape) != 1:
+ prob_out, label_out = label_out, prob_out
+ label_list = ['0', '180']
+ for rno in range(len(label_out)):
+ label_idx = label_out[rno]
+ score = prob_out[rno][label_idx]
+ label = label_list[label_idx]
+ cls_res[indices[beg_img_no + rno]] = [label, score]
+ if '180' in label and score > angle_classification_thresh:
+ img_list[indices[beg_img_no + rno]] = cv2.rotate(
+ img_list[indices[beg_img_no + rno]], 1)
+ return img_list, cls_res
+
+ def _recognize_text(self, img_list):
+ img_num = len(img_list)
+ # Calculate the aspect ratio of all text bars
+ width_list = []
+ for img in img_list:
+ width_list.append(img.shape[1] / float(img.shape[0]))
+ # Sorting can speed up the recognition process
+ indices = np.argsort(np.array(width_list))
+
+ rec_res = [['', 0.0]] * img_num
+ batch_num = 30
+ for beg_img_no in range(0, img_num, batch_num):
+ end_img_no = min(img_num, beg_img_no + batch_num)
+ norm_img_batch = []
+ max_wh_ratio = 0
+ for ino in range(beg_img_no, end_img_no):
+ h, w = img_list[indices[ino]].shape[0:2]
+ wh_ratio = w * 1.0 / h
+ max_wh_ratio = max(max_wh_ratio, wh_ratio)
+ for ino in range(beg_img_no, end_img_no):
+ norm_img = self.resize_norm_img_rec(img_list[indices[ino]],
+ max_wh_ratio)
+ norm_img = norm_img[np.newaxis, :]
+ norm_img_batch.append(norm_img)
+
+ norm_img_batch = np.concatenate(norm_img_batch, axis=0)
+ norm_img_batch = norm_img_batch.copy()
+
+ self.rec_input_tensor.copy_from_cpu(norm_img_batch)
+ self.rec_predictor.zero_copy_run()
+
+ rec_idx_batch = self.rec_output_tensors[0].copy_to_cpu()
+ rec_idx_lod = self.rec_output_tensors[0].lod()[0]
+ predict_batch = self.rec_output_tensors[1].copy_to_cpu()
+ predict_lod = self.rec_output_tensors[1].lod()[0]
+ for rno in range(len(rec_idx_lod) - 1):
+ beg = rec_idx_lod[rno]
+ end = rec_idx_lod[rno + 1]
+ rec_idx_tmp = rec_idx_batch[beg:end, 0]
+ preds_text = self.char_ops.decode(rec_idx_tmp)
+ beg = predict_lod[rno]
+ end = predict_lod[rno + 1]
+ probs = predict_batch[beg:end, :]
+ ind = np.argmax(probs, axis=1)
+ blank = probs.shape[1]
+ valid_ind = np.where(ind != (blank - 1))[0]
+ if len(valid_ind) == 0:
+ continue
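+                # CTC-style confidence: average the best softmax probability
+                # over non-blank timesteps only (the last class is the blank).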
+ score = np.mean(probs[valid_ind, ind[valid_ind]])
+ # rec_res.append([preds_text, score])
+ rec_res[indices[beg_img_no + rno]] = [preds_text, score]
+
+ return rec_res
+
+ def save_inference_model(self,
+ dirname,
+ model_filename=None,
+ params_filename=None,
+ combined=True):
+ detector_dir = os.path.join(dirname, 'text_detector')
+ classifier_dir = os.path.join(dirname, 'angle_classifier')
+ recognizer_dir = os.path.join(dirname, 'text_recognizer')
+ self._save_detector_model(detector_dir, model_filename, params_filename,
+ combined)
+ if self.use_angle_classification:
+ self._save_classifier_model(classifier_dir, model_filename,
+ params_filename, combined)
+
+ self._save_recognizer_model(recognizer_dir, model_filename,
+ params_filename, combined)
+ logger.info("The inference model has been saved in the path {}".format(
+ os.path.realpath(dirname)))
+
+ def _save_detector_model(self,
+ dirname,
+ model_filename=None,
+ params_filename=None,
+ combined=True):
+ self.text_detector_module.save_inference_model(
+ dirname, model_filename, params_filename, combined)
+
+ def _save_recognizer_model(self,
+ dirname,
+ model_filename=None,
+ params_filename=None,
+ combined=True):
+ if combined:
+ model_filename = "__model__" if not model_filename else model_filename
+ params_filename = "__params__" if not params_filename else params_filename
+ place = fluid.CPUPlace()
+ exe = fluid.Executor(place)
+
+ model_file_path = os.path.join(self.rec_pretrained_model_path, 'model')
+ params_file_path = os.path.join(self.rec_pretrained_model_path,
+ 'params')
+ program, feeded_var_names, target_vars = fluid.io.load_inference_model(
+ dirname=self.rec_pretrained_model_path,
+ model_filename=model_file_path,
+ params_filename=params_file_path,
+ executor=exe)
+
+ fluid.io.save_inference_model(
+ dirname=dirname,
+ main_program=program,
+ executor=exe,
+ feeded_var_names=feeded_var_names,
+ target_vars=target_vars,
+ model_filename=model_filename,
+ params_filename=params_filename)
+
+ def _save_classifier_model(self,
+ dirname,
+ model_filename=None,
+ params_filename=None,
+ combined=True):
+ if combined:
+ model_filename = "__model__" if not model_filename else model_filename
+ params_filename = "__params__" if not params_filename else params_filename
+ place = fluid.CPUPlace()
+ exe = fluid.Executor(place)
+
+ model_file_path = os.path.join(self.cls_pretrained_model_path, 'model')
+ params_file_path = os.path.join(self.cls_pretrained_model_path,
+ 'params')
+ program, feeded_var_names, target_vars = fluid.io.load_inference_model(
+ dirname=self.cls_pretrained_model_path,
+ model_filename=model_file_path,
+ params_filename=params_file_path,
+ executor=exe)
+
+ fluid.io.save_inference_model(
+ dirname=dirname,
+ main_program=program,
+ executor=exe,
+ feeded_var_names=feeded_var_names,
+ target_vars=target_vars,
+ model_filename=model_filename,
+ params_filename=params_filename)
+
+ @runnable
+ def run_cmd(self, argvs):
+ """
+ Run as a command
+ """
+ self.parser = argparse.ArgumentParser(
+ description="Run the %s module." % self.name,
+ prog='hub run %s' % self.name,
+ usage='%(prog)s',
+ add_help=True)
+
+ self.arg_input_group = self.parser.add_argument_group(
+ title="Input options", description="Input data. Required")
+ self.arg_config_group = self.parser.add_argument_group(
+ title="Config options",
+ description=
+ "Run configuration for controlling module behavior, not required.")
+
+ self.add_module_config_arg()
+ self.add_module_input_arg()
+
+ args = self.parser.parse_args(argvs)
+ results = self.recognize_text(
+ paths=[args.input_path],
+ use_gpu=args.use_gpu,
+ output_dir=args.output_dir,
+ visualization=args.visualization)
+ return results
+
+ def add_module_config_arg(self):
+ """
+ Add the command config options
+ """
+ self.arg_config_group.add_argument(
+ '--use_gpu',
+ type=ast.literal_eval,
+ default=False,
+ help="whether use GPU or not")
+ self.arg_config_group.add_argument(
+ '--output_dir',
+ type=str,
+ default='ocr_result',
+ help="The directory to save output images.")
+ self.arg_config_group.add_argument(
+ '--visualization',
+ type=ast.literal_eval,
+ default=False,
+ help="whether to save output as images.")
+
+ def add_module_input_arg(self):
+ """
+ Add the command input options
+ """
+ self.arg_input_group.add_argument(
+            '--input_path', type=str, default=None, help="path to the input image")
+
+
+if __name__ == '__main__':
+ ocr = JapanOCRDBCRNNMobile(enable_mkldnn=False, use_angle_classification=True)
+ image_path = [
+ '/mnt/zhangxuefei/PaddleOCR/doc/imgs/ger_1.jpg',
+ '/mnt/zhangxuefei/PaddleOCR/doc/imgs/12.jpg',
+ '/mnt/zhangxuefei/PaddleOCR/doc/imgs/test_image.jpg'
+ ]
+ res = ocr.recognize_text(paths=image_path, visualization=True)
+ ocr.save_inference_model('save')
+ print(res)
diff --git a/modules/image/text_recognition/japan_ocr_db_crnn_mobile/utils.py b/modules/image/text_recognition/japan_ocr_db_crnn_mobile/utils.py
new file mode 100644
index 0000000000000000000000000000000000000000..8c41af300cc91de369a473cb7327b794b6cf5715
--- /dev/null
+++ b/modules/image/text_recognition/japan_ocr_db_crnn_mobile/utils.py
@@ -0,0 +1,190 @@
+# -*- coding:utf-8 -*-
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import math
+
+from PIL import Image, ImageDraw, ImageFont
+import base64
+import cv2
+import numpy as np
+
+
+def draw_ocr(image,
+ boxes,
+ txts,
+ scores,
+ font_file,
+ draw_txt=True,
+ drop_score=0.5):
+ """
+ Visualize the results of OCR detection and recognition
+ args:
+ image(Image|array): RGB image
+ boxes(list): boxes with shape(N, 4, 2)
+ txts(list): the texts
+        scores(list): the corresponding scores of txts
+        draw_txt(bool): whether draw text or not
+        drop_score(float): only scores greater than drop_score will be visualized
+ return(array):
+ the visualized img
+ """
+ if scores is None:
+ scores = [1] * len(boxes)
+ for (box, score) in zip(boxes, scores):
+ if score < drop_score or math.isnan(score):
+ continue
+ box = np.reshape(np.array(box), [-1, 1, 2]).astype(np.int64)
+ image = cv2.polylines(np.array(image), [box], True, (255, 0, 0), 2)
+
+ if draw_txt:
+ img = np.array(resize_img(image, input_size=600))
+ txt_img = text_visual(
+ txts,
+ scores,
+ font_file,
+ img_h=img.shape[0],
+ img_w=600,
+ threshold=drop_score)
+ img = np.concatenate([np.array(img), np.array(txt_img)], axis=1)
+ return img
+ return image
+
+
+def text_visual(texts, scores, font_file, img_h=400, img_w=600, threshold=0.):
+ """
+ create new blank img and draw txt on it
+ args:
+ texts(list): the text will be draw
+ scores(list|None): corresponding score of each txt
+ img_h(int): the height of blank img
+ img_w(int): the width of blank img
+ return(array):
+ """
+ if scores is not None:
+ assert len(texts) == len(
+ scores), "The number of txts and corresponding scores must match"
+
+ def create_blank_img():
+ blank_img = np.ones(shape=[img_h, img_w], dtype=np.int8) * 255
+ blank_img[:, img_w - 1:] = 0
+ blank_img = Image.fromarray(blank_img).convert("RGB")
+ draw_txt = ImageDraw.Draw(blank_img)
+ return blank_img, draw_txt
+
+ blank_img, draw_txt = create_blank_img()
+
+ font_size = 20
+ txt_color = (0, 0, 0)
+ font = ImageFont.truetype(font_file, font_size, encoding="utf-8")
+
+ gap = font_size + 5
+ txt_img_list = []
+ count, index = 1, 0
+ for idx, txt in enumerate(texts):
+ index += 1
+ if scores[idx] < threshold or math.isnan(scores[idx]):
+ index -= 1
+ continue
+ first_line = True
+ while str_count(txt) >= img_w // font_size - 4:
+ tmp = txt
+ txt = tmp[:img_w // font_size - 4]
+ if first_line:
+ new_txt = str(index) + ': ' + txt
+ first_line = False
+ else:
+ new_txt = ' ' + txt
+ draw_txt.text((0, gap * count), new_txt, txt_color, font=font)
+ txt = tmp[img_w // font_size - 4:]
+ if count >= img_h // gap - 1:
+ txt_img_list.append(np.array(blank_img))
+ blank_img, draw_txt = create_blank_img()
+ count = 0
+ count += 1
+ if first_line:
+ new_txt = str(index) + ': ' + txt + ' ' + '%.3f' % (scores[idx])
+ else:
+ new_txt = " " + txt + " " + '%.3f' % (scores[idx])
+ draw_txt.text((0, gap * count), new_txt, txt_color, font=font)
+ # whether add new blank img or not
+ if count >= img_h // gap - 1 and idx + 1 < len(texts):
+ txt_img_list.append(np.array(blank_img))
+ blank_img, draw_txt = create_blank_img()
+ count = 0
+ count += 1
+ txt_img_list.append(np.array(blank_img))
+ if len(txt_img_list) == 1:
+ blank_img = np.array(txt_img_list[0])
+ else:
+ blank_img = np.concatenate(txt_img_list, axis=1)
+ return np.array(blank_img)
+
+
+def str_count(s):
+ """
+ Count the number of Chinese characters,
+ a single English character and a single number
+ equal to half the length of Chinese characters.
+ args:
+ s(string): the input of string
+ return(int):
+ the number of Chinese characters
+ """
+ import string
+ count_zh = count_pu = 0
+ s_len = len(s)
+ en_dg_count = 0
+ for c in s:
+ if c in string.ascii_letters or c.isdigit() or c.isspace():
+ en_dg_count += 1
+ elif c.isalpha():
+ count_zh += 1
+ else:
+ count_pu += 1
+ return s_len - math.ceil(en_dg_count / 2)
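+
+# Illustrative example: str_count('OCR识别') == 5 - ceil(3 / 2) == 3,
+# i.e. the three ASCII letters together count as two full-width slots.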
+
+
+def resize_img(img, input_size=600):
+ img = np.array(img)
+ im_shape = img.shape
+ im_size_min = np.min(im_shape[0:2])
+ im_size_max = np.max(im_shape[0:2])
+ im_scale = float(input_size) / float(im_size_max)
+ im = cv2.resize(img, None, None, fx=im_scale, fy=im_scale)
+ return im
+
+
+def get_image_ext(image):
+ if image.shape[2] == 4:
+ return ".png"
+ return ".jpg"
+
+
+def sorted_boxes(dt_boxes):
+ """
+ Sort text boxes in order from top to bottom, left to right
+ args:
+ dt_boxes(array):detected text boxes with shape [4, 2]
+ return:
+ sorted boxes(array) with shape [4, 2]
+ """
+ num_boxes = dt_boxes.shape[0]
+ sorted_boxes = sorted(dt_boxes, key=lambda x: (x[0][1], x[0][0]))
+ _boxes = list(sorted_boxes)
+
+ for i in range(num_boxes - 1):
+ if abs(_boxes[i + 1][0][1] - _boxes[i][0][1]) < 10 and \
+ (_boxes[i + 1][0][0] < _boxes[i][0][0]):
+ tmp = _boxes[i]
+ _boxes[i] = _boxes[i + 1]
+ _boxes[i + 1] = tmp
+ return _boxes
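+
+# Illustrative example: for two boxes whose top-left corners are (50, 12)
+# and (10, 14), the y difference (2) is below the 10-pixel tolerance, so
+# they are treated as one text line and the box at x=10 is moved in front
+# of the box at x=50.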
+
+
+def base64_to_cv2(b64str):
+ data = base64.b64decode(b64str.encode('utf8'))
+    data = np.frombuffer(data, np.uint8)
+ data = cv2.imdecode(data, cv2.IMREAD_COLOR)
+ return data
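+
+
+def cv2_to_base64(image):
+    # Inverse of base64_to_cv2 (an illustrative addition, mirroring the
+    # client-side helper shown in the module READMEs): JPEG-encode an
+    # OpenCV image and wrap it as a UTF-8 base64 string.
+    data = cv2.imencode('.jpg', image)[1]
+    return base64.b64encode(data.tobytes()).decode('utf8')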
diff --git a/modules/thirdparty/image/text_recognition/Vehicle_License_Plate_Recognition/README.md b/modules/thirdparty/image/text_recognition/Vehicle_License_Plate_Recognition/README.md
index 2ff2cc180afdea2351d7b0ca1e0b1c78d3257dc7..cc299d800abfa4f627aa4ecd28c90e1b0281d802 100644
--- a/modules/thirdparty/image/text_recognition/Vehicle_License_Plate_Recognition/README.md
+++ b/modules/thirdparty/image/text_recognition/Vehicle_License_Plate_Recognition/README.md
@@ -1,83 +1,122 @@
-## Overview
-Vehicle_License_Plate_Recognition is a license plate recognition model trained on the CCPD dataset. It detects the location of a license plate in an image and recognizes the plate text. The rough model structure, split into a plate detection module and a text recognition module, is shown below:
+# Vehicle_License_Plate_Recognition
-
+|Module Name|Vehicle_License_Plate_Recognition|
+| :--- | :---: |
+|Category|Image - Text Recognition|
+|Network|-|
+|Dataset|CCPD|
+|Fine-tuning supported|No|
+|Module Size|111MB|
+|Latest update|2021-03-22|
+|Data indicators|-|
-## API
-```python
-def plate_recognition(images)
-```
-License plate recognition API
-**Parameters**
-* images(str / ndarray / list(str) / list(ndarray)): path(s) of the image(s) to be recognized, or the image ndarray(s) in RGB
+## 1. Module Basic Information
-**Return**
-* results(list(dict{'license', 'bbox'})): list of recognized license plate information, containing the plate position coordinates and the plate number
+- ### Sample Results
+  - Example of the output:
+
+
+
+
-**Code example**
-```python
-import paddlehub as hub
+- ### Module Introduction
-# load the model
-model = hub.Module(name='Vehicle_License_Plate_Recognition')
+  - Vehicle_License_Plate_Recognition is a license plate recognition model trained on the CCPD dataset. It detects the location of a license plate in an image and recognizes the plate text.
-# recognize the plate
-result = model.plate_recognition("test.jpg")
-# print the result
-print(result)
-```
- [{'license': '苏B92912', 'bbox': [[131.0, 251.0], [368.0, 253.0], [367.0, 338.0], [131.0, 336.0]]}]
+## 2. Installation
-## Server Deployment
+- ### 1. Environment Dependencies
-PaddleHub Serving can deploy an online license plate recognition service.
+  - paddlepaddle >= 2.0.0
-## Step 1: Start PaddleHub Serving
+  - paddlehub >= 2.0.4
-Run the start command:
-```shell
-$ hub serving start --modules Vehicle_License_Plate_Recognition
-```
+  - paddleocr >= 2.0.2
-This deploys an online license plate recognition service API, on port 8866 by default.
+- ### 2. Installation
-**NOTE:** To make predictions on GPU, set the CUDA\_VISIBLE\_DEVICES environment variable before starting the service; otherwise it does not need to be set.
+ - ```shell
+ $ hub install Vehicle_License_Plate_Recognition
+ ```
+
+## 3. Module API Prediction
-## Step 2: Send a prediction request
+- ### 1. Code Example
-With the server side configured, the following few lines of code send a prediction request and fetch the result
+ - ```python
+ import paddlehub as hub
+ import cv2
-```python
-import requests
-import json
-import cv2
-import base64
+ model = hub.Module(name="Vehicle_License_Plate_Recognition")
+ result = model.plate_recognition(images=[cv2.imread('/PATH/TO/IMAGE')])
+ ```
+- ### 2、API
-def cv2_to_base64(image):
- data = cv2.imencode('.jpg', image)[1]
- return base64.b64encode(data.tostring()).decode('utf8')
+ - ```python
+ def plate_recognition(images)
+ ```
+    - License plate recognition API.
-# send the HTTP request
-data = {'images':[cv2_to_base64(cv2.imread("test.jpg"))]}
-headers = {"Content-type": "application/json"}
-url = "http://127.0.0.1:8866/predict/Vehicle_License_Plate_Recognition"
-r = requests.post(url=url, headers=headers, data=json.dumps(data))
+  - **Parameters**
-# print the prediction results
-print(r.json()["results"])
-```
- [{'bbox': [[260.0, 100.0], [546.0, 104.0], [544.0, 200.0], [259.0, 196.0]], 'license': '苏DS0000'}]
+    - images (list\[numpy.ndarray\]): image data, ndarray.shape is \[H, W, C\];
+
+
+  - **Return**
+    - results (list(dict{'license', 'bbox'})): list of recognized license plate information, containing the plate position coordinates and the plate number
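+
+  - A minimal sketch of consuming the return value (it assumes the `model` loaded in the code example above; the printed fields follow the documented structure):
+
+  - ```python
+    results = model.plate_recognition(images=[cv2.imread('/PATH/TO/IMAGE')])
+    for item in results:
+        print(item['license'], item['bbox'])
+    ```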
-## Source Code
-https://github.com/jm12138/License_plate_recognition
-## Dependencies
-paddlepaddle >= 2.0.0
+## 4. Server Deployment
-paddlehub >= 2.0.4
+- PaddleHub Serving can deploy an online license plate recognition service.
-paddleocr >= 2.0.2
+- ### Step 1: Start PaddleHub Serving
+
+  - Run the start command:
+ - ```shell
+ $ hub serving start -m Vehicle_License_Plate_Recognition
+ ```
+
+  - This deploys an online license plate recognition service API, on port 8866 by default.
+
+  - **NOTE:** To make predictions on GPU, set the CUDA\_VISIBLE\_DEVICES environment variable before starting the service; otherwise it does not need to be set.
+
+- ### Step 2: Send a prediction request
+
+  - With the server side configured, the following few lines of code send a prediction request and fetch the result
+
+ - ```python
+ import requests
+ import json
+ import cv2
+ import base64
+
+
+ def cv2_to_base64(image):
+ data = cv2.imencode('.jpg', image)[1]
+        return base64.b64encode(data.tobytes()).decode('utf8')
+
+    # send the HTTP request
+ data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
+ headers = {"Content-type": "application/json"}
+ url = "http://127.0.0.1:8866/predict/Vehicle_License_Plate_Recognition"
+ r = requests.post(url=url, headers=headers, data=json.dumps(data))
+
+    # print the prediction results
+ print(r.json()["results"])
+ ```
+
+
+## 5. Release Note
+
+* 1.0.0
+
+  First release
+
+ - ```shell
+ $ hub install Vehicle_License_Plate_Recognition==1.0.0
+ ```
\ No newline at end of file