Unverified · Commit 9209799d · Author: topduke · Committer: GitHub

Merge branch 'dygraph' into dygraph

...@@ -63,8 +63,7 @@ Train: ...@@ -63,8 +63,7 @@ Train:
- DecodeImage: - DecodeImage:
img_mode: BGR img_mode: BGR
channel_first: false channel_first: false
- RecAug: - BaseDataAugmentation:
use_tia: False
- RandAugment: - RandAugment:
- SSLRotateResize: - SSLRotateResize:
image_shape: [3, 48, 320] image_shape: [3, 48, 320]
......
...@@ -60,8 +60,7 @@ Train: ...@@ -60,8 +60,7 @@ Train:
img_mode: BGR img_mode: BGR
channel_first: False channel_first: False
- ClsLabelEncode: # Class handling label - ClsLabelEncode: # Class handling label
- RecAug: - BaseDataAugmentation:
use_tia: False
- RandAugment: - RandAugment:
- ClsResizeImg: - ClsResizeImg:
image_shape: [3, 48, 192] image_shape: [3, 48, 192]
......
...@@ -208,7 +208,7 @@ Execute the built executable file: ...@@ -208,7 +208,7 @@ Execute the built executable file:
./build/ppocr [--param1] [--param2] [...] ./build/ppocr [--param1] [--param2] [...]
``` ```
**Note**:ppocr uses the `PP-OCRv3` model by default, and the input shape used by the recognition model is `3, 48, 320`, so if you use the recognition function, you need to add the parameter `--rec_img_h=48`, if you do not use the default `PP-OCRv3` model, you do not need to set this parameter. **Note**:ppocr uses the `PP-OCRv3` model by default, and the input shape used by the recognition model is `3, 48, 320`, if you want to use the old version model, you should add the parameter `--rec_img_h=32`.
Specifically, Specifically,
...@@ -222,7 +222,6 @@ Specifically, ...@@ -222,7 +222,6 @@ Specifically,
--det=true \ --det=true \
--rec=true \ --rec=true \
--cls=true \ --cls=true \
--rec_img_h=48\
``` ```
##### 2. det+rec: ##### 2. det+rec:
...@@ -234,7 +233,6 @@ Specifically, ...@@ -234,7 +233,6 @@ Specifically,
--det=true \ --det=true \
--rec=true \ --rec=true \
--cls=false \ --cls=false \
--rec_img_h=48\
``` ```
##### 3. det ##### 3. det
...@@ -254,7 +252,6 @@ Specifically, ...@@ -254,7 +252,6 @@ Specifically,
--det=false \ --det=false \
--rec=true \ --rec=true \
--cls=true \ --cls=true \
--rec_img_h=48\
``` ```
##### 5. rec ##### 5. rec
...@@ -265,7 +262,6 @@ Specifically, ...@@ -265,7 +262,6 @@ Specifically,
--det=false \ --det=false \
--rec=true \ --rec=true \
--cls=false \ --cls=false \
--rec_img_h=48\
``` ```
##### 6. cls ##### 6. cls
...@@ -330,7 +326,7 @@ More parameters are as follows, ...@@ -330,7 +326,7 @@ More parameters are as follows,
|rec_model_dir|string|-|Address of recognition inference model| |rec_model_dir|string|-|Address of recognition inference model|
|rec_char_dict_path|string|../../ppocr/utils/ppocr_keys_v1.txt|dictionary file| |rec_char_dict_path|string|../../ppocr/utils/ppocr_keys_v1.txt|dictionary file|
|rec_batch_num|int|6|batch size of recognition| |rec_batch_num|int|6|batch size of recognition|
|rec_img_h|int|32|image height of recognition| |rec_img_h|int|48|image height of recognition|
|rec_img_w|int|320|image width of recognition| |rec_img_w|int|320|image width of recognition|
* Multi-language inference is also supported in PaddleOCR, you can refer to [recognition tutorial](../../doc/doc_en/recognition_en.md) for more supported languages and models in PaddleOCR. Specifically, if you want to infer using multi-language models, you just need to modify values of `rec_char_dict_path` and `rec_model_dir`. * Multi-language inference is also supported in PaddleOCR, you can refer to [recognition tutorial](../../doc/doc_en/recognition_en.md) for more supported languages and models in PaddleOCR. Specifically, if you want to infer using multi-language models, you just need to modify values of `rec_char_dict_path` and `rec_model_dir`.
......
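The height rule in the note above can be sketched as a small lookup. This is only an illustration, not code from the repository: `rec_img_h_for` and `extra_flags` are hypothetical names. The only facts taken from the diff are that the PP-OCRv3 recognition model uses height 48 (now the default) and that older models need `--rec_img_h=32`.

```python
# Hypothetical helpers (not part of PaddleOCR): choose the recognition input
# height for a model version, following the note in this diff.
REC_IMG_H = {"PP-OCRv3": 48}   # new default after this change
LEGACY_REC_IMG_H = 32          # value the note gives for older models


def rec_img_h_for(version: str) -> int:
    """Return the --rec_img_h value to use for `version`."""
    return REC_IMG_H.get(version, LEGACY_REC_IMG_H)


def extra_flags(version: str, default_h: int = 48) -> list:
    """Return the extra CLI flag, if any, needed on top of the default."""
    h = rec_img_h_for(version)
    return [] if h == default_h else ["--rec_img_h={}".format(h)]
```

With the new default, `extra_flags("PP-OCRv3")` is empty, which matches the removal of `--rec_img_h=48` from the example commands.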
...@@ -213,7 +213,7 @@ CUDNN_LIB_DIR=/your_cudnn_lib_dir ...@@ -213,7 +213,7 @@ CUDNN_LIB_DIR=/your_cudnn_lib_dir
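This demo supports calling the full pipeline in series as well as calling a single function, e.g. detection only or recognition only. This demo supports calling the full pipeline in series as well as calling a single function, e.g. detection only or recognition only.
**Note** ppocr uses the `PP-OCRv3` model by default, and the recognition model's input shape is `3,48,320`, so if you use recognition you need to add the parameter `--rec_img_h=48`; if you do not use the default `PP-OCRv3` model, this parameter is not needed. **Note** ppocr uses the `PP-OCRv3` model by default, and the recognition model's input shape is `3,48,320`. To use an older PP-OCR model, set the parameter `--rec_img_h=32`.
How to run: How to run: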
...@@ -232,7 +232,6 @@ CUDNN_LIB_DIR=/your_cudnn_lib_dir ...@@ -232,7 +232,6 @@ CUDNN_LIB_DIR=/your_cudnn_lib_dir
--det=true \ --det=true \
--rec=true \ --rec=true \
--cls=true \ --cls=true \
--rec_img_h=48\
``` ```
##### 2. Detection + recognition: ##### 2. Detection + recognition:
...@@ -244,7 +243,6 @@ CUDNN_LIB_DIR=/your_cudnn_lib_dir ...@@ -244,7 +243,6 @@ CUDNN_LIB_DIR=/your_cudnn_lib_dir
--det=true \ --det=true \
--rec=true \ --rec=true \
--cls=false \ --cls=false \
--rec_img_h=48\
``` ```
##### 3. Detection: ##### 3. Detection:
...@@ -264,7 +262,6 @@ CUDNN_LIB_DIR=/your_cudnn_lib_dir ...@@ -264,7 +262,6 @@ CUDNN_LIB_DIR=/your_cudnn_lib_dir
--det=false \ --det=false \
--rec=true \ --rec=true \
--cls=true \ --cls=true \
--rec_img_h=48\
``` ```
##### 5. Recognition: ##### 5. Recognition:
...@@ -275,7 +272,6 @@ CUDNN_LIB_DIR=/your_cudnn_lib_dir ...@@ -275,7 +272,6 @@ CUDNN_LIB_DIR=/your_cudnn_lib_dir
--det=false \ --det=false \
--rec=true \ --rec=true \
--cls=false \ --cls=false \
--rec_img_h=48\
``` ```
##### 6. Classification: ##### 6. Classification:
...@@ -339,7 +335,7 @@ CUDNN_LIB_DIR=/your_cudnn_lib_dir ...@@ -339,7 +335,7 @@ CUDNN_LIB_DIR=/your_cudnn_lib_dir
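|rec_model_dir|string|-|Path of the recognition inference model| |rec_model_dir|string|-|Path of the recognition inference model|
|rec_char_dict_path|string|../../ppocr/utils/ppocr_keys_v1.txt|Dictionary file| |rec_char_dict_path|string|../../ppocr/utils/ppocr_keys_v1.txt|Dictionary file|
|rec_batch_num|int|6|Batch size of the recognition model| |rec_batch_num|int|6|Batch size of the recognition model|
|rec_img_h|int|32|Input image height of the recognition model| |rec_img_h|int|48|Input image height of the recognition model|
|rec_img_w|int|320|Input image width of the recognition model| |rec_img_w|int|320|Input image width of the recognition model|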
......
...@@ -47,7 +47,7 @@ DEFINE_string(rec_model_dir, "", "Path of rec inference model."); ...@@ -47,7 +47,7 @@ DEFINE_string(rec_model_dir, "", "Path of rec inference model.");
DEFINE_int32(rec_batch_num, 6, "rec_batch_num."); DEFINE_int32(rec_batch_num, 6, "rec_batch_num.");
DEFINE_string(rec_char_dict_path, "../../ppocr/utils/ppocr_keys_v1.txt", DEFINE_string(rec_char_dict_path, "../../ppocr/utils/ppocr_keys_v1.txt",
"Path of dictionary."); "Path of dictionary.");
DEFINE_int32(rec_img_h, 32, "rec image height"); DEFINE_int32(rec_img_h, 48, "rec image height");
DEFINE_int32(rec_img_w, 320, "rec image width"); DEFINE_int32(rec_img_w, 320, "rec image width");
// ocr forward related // ocr forward related
......
...@@ -132,7 +132,9 @@ void CRNNRecognizer::LoadModel(const std::string &model_dir) { ...@@ -132,7 +132,9 @@ void CRNNRecognizer::LoadModel(const std::string &model_dir) {
paddle_infer::Config config; paddle_infer::Config config;
config.SetModel(model_dir + "/inference.pdmodel", config.SetModel(model_dir + "/inference.pdmodel",
model_dir + "/inference.pdiparams"); model_dir + "/inference.pdiparams");
std::cout << "In PP-OCRv3, default rec_img_h is 48,"
<< "if you use other model, you should set the param rec_img_h=32"
<< std::endl;
if (this->use_gpu_) { if (this->use_gpu_) {
config.EnableUseGpu(this->gpu_mem_, this->gpu_id_); config.EnableUseGpu(this->gpu_mem_, this->gpu_id_);
if (this->use_tensorrt_) { if (this->use_tensorrt_) {
......
...@@ -682,7 +682,7 @@ lr: ...@@ -682,7 +682,7 @@ lr:
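#### Q: On the dygraph branch, how should data augmentation be configured when training a text recognition model? #### Q: On the dygraph branch, how should data augmentation be configured when training a text recognition model?
**A**: Refer to the [config file](../../configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml) and add the RecAug field under Train['dataset']['transforms'] to enable data augmentation. You can add an aug_prob setting, which is the probability with which each augmentation is applied. aug_prob defaults to 0.4. Because TIA augmentation is a special case, it is not used by default; add the use_tia setting to enable it. For details, see [ISSUE 1744](https://github.com/PaddlePaddle/PaddleOCR/issues/1744) **A**: Refer to the [config file](../../configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml) and add the RecAug field under Train['dataset']['transforms'] to enable data augmentation. You can add an aug_prob setting, which is the probability with which each augmentation is applied. aug_prob defaults to 0.4. For details, see [ISSUE 1744](https://github.com/PaddlePaddle/PaddleOCR/issues/1744)
#### Q: What should I do if the training program exits unexpectedly or hangs during training? #### Q: What should I do if the training program exits unexpectedly or hangs during training?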
......
...@@ -101,8 +101,17 @@ cd /path/to/ppocr_img ...@@ -101,8 +101,17 @@ cd /path/to/ppocr_img
['韩国小馆', 0.994467] ['韩国小馆', 0.994467]
``` ```
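**Version notes**
paddleocr uses the PP-OCRv3 model by default (`--ocr_version PP-OCRv3`). To use another version, set the `--ocr_version` parameter. The versions are described below:
| Version name | Description |
| --- | --- |
| PP-OCRv3 | Supports Chinese and English detection and recognition, direction classifier, and multilingual recognition |
| PP-OCRv2 | Supports Chinese and English detection and recognition and direction classifier; multilingual models not yet updated |
| PP-OCR | Supports Chinese and English detection and recognition, direction classifier, and multilingual recognition |
To use the 2.0 model, specify the parameter `--ocr_version PP-OCR`; paddleocr uses the PP-OCRv3 model by default (`--ocr_version PP-OCRv3`). For more on using the whl package, see the [whl package documentation](./whl.md) To add a model you trained yourself, add its model link and fields in [paddleocr](../../paddleocr.py) and recompile.
For more on using the whl package, see the [whl package documentation](./whl.md)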
<a name="212"></a> <a name="212"></a>
......
...@@ -100,7 +100,7 @@ Considering that the features of some channels will be suppressed if the convolu ...@@ -100,7 +100,7 @@ Considering that the features of some channels will be suppressed if the convolu
The recognition module of PP-OCRv3 is optimized based on the text recognition algorithm [SVTR](https://arxiv.org/abs/2205.00159). RNN is abandoned in SVTR, and the context information of the text line image is more effectively mined by introducing the Transformers structure, thereby improving the text recognition ability. The recognition module of PP-OCRv3 is optimized based on the text recognition algorithm [SVTR](https://arxiv.org/abs/2205.00159). RNN is abandoned in SVTR, and the context information of the text line image is more effectively mined by introducing the Transformers structure, thereby improving the text recognition ability.
The recognition accuracy of SVTR_inty outperforms PP-OCRv2 recognition model by 5.3%, while the prediction speed nearly 11 times slower. It takes nearly 100ms to predict a text line on CPU. Therefore, as shown in the figure below, PP-OCRv3 adopts the following six optimization strategies to accelerate the recognition model. The recognition accuracy of SVTR_tiny outperforms PP-OCRv2 recognition model by 5.3%, while the prediction speed nearly 11 times slower. It takes nearly 100ms to predict a text line on CPU. Therefore, as shown in the figure below, PP-OCRv3 adopts the following six optimization strategies to accelerate the recognition model.
<div align="center"> <div align="center">
<img src="../ppocr_v3/v3_rec_pipeline.png" width=800> <img src="../ppocr_v3/v3_rec_pipeline.png" width=800>
......
...@@ -119,7 +119,18 @@ If you do not use the provided test image, you can replace the following `--imag ...@@ -119,7 +119,18 @@ If you do not use the provided test image, you can replace the following `--imag
['PAIN', 0.9934559464454651] ['PAIN', 0.9934559464454651]
``` ```
If you need to use the 2.0 model, please specify the parameter `--ocr_version PP-OCR`, paddleocr uses the PP-OCRv3 model by default(`--ocr_version PP-OCRv3`). More whl package usage can be found in [whl package](./whl_en.md) **Version**
paddleocr uses the PP-OCRv3 model by default(`--ocr_version PP-OCRv3`). If you want to use other versions, you can set the parameter `--ocr_version`, the specific version description is as follows:
| version name | description |
| --- | --- |
| PP-OCRv3 | support Chinese and English detection and recognition, direction classifier, support multilingual recognition |
| PP-OCRv2 | only supports Chinese and English detection and recognition, direction classifier, multilingual model is not updated |
| PP-OCR | support Chinese and English detection and recognition, direction classifier, support multilingual recognition |
If you want to add your own trained model, you can add model links and keys in [paddleocr](../../paddleocr.py) and recompile.
More whl package usage can be found in [whl package](./whl_en.md)
<a name="212-multi-language-model"></a> <a name="212-multi-language-model"></a>
#### 2.1.2 Multi-language Model #### 2.1.2 Multi-language Model
......
...@@ -154,7 +154,13 @@ MODEL_URLS = { ...@@ -154,7 +154,13 @@ MODEL_URLS = {
'https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar', 'https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar',
'dict_path': './ppocr/utils/ppocr_keys_v1.txt' 'dict_path': './ppocr/utils/ppocr_keys_v1.txt'
} }
} },
'cls': {
'ch': {
'url':
'https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar',
}
},
}, },
'PP-OCR': { 'PP-OCR': {
'det': { 'det': {
......
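The effect of the added `'cls'` entry can be illustrated with a trimmed copy of the table and a lookup. This is a sketch under assumptions: only the two entries visible in the hunk are reproduced, the `'PP-OCRv2'` key is inferred from the URLs rather than shown in the hunk, and `lookup_model` is a hypothetical helper, not a function in `paddleocr.py`.

```python
# Trimmed stand-in for MODEL_URLS; the 'PP-OCRv2' key is an assumption
# inferred from the URLs, since the hunk does not show the enclosing key.
MODEL_URLS = {
    "PP-OCRv2": {
        "rec": {
            "ch": {
                "url": "https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar",
                "dict_path": "./ppocr/utils/ppocr_keys_v1.txt",
            }
        },
        # Entry added by this commit: the classifier reuses the v2.0 model.
        "cls": {
            "ch": {
                "url": "https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar",
            }
        },
    },
}


def lookup_model(version, model_type, lang):
    """Hypothetical lookup: return the entry or raise a descriptive KeyError."""
    try:
        return MODEL_URLS[version][model_type][lang]
    except KeyError as exc:
        raise KeyError(
            "no model registered for {}/{}/{}".format(version, model_type, lang)
        ) from exc
```

Before the change, a lookup such as `lookup_model("PP-OCRv2", "cls", "ch")` would have had no entry to return in a table shaped like this one.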
...@@ -22,6 +22,7 @@ from .make_shrink_map import MakeShrinkMap ...@@ -22,6 +22,7 @@ from .make_shrink_map import MakeShrinkMap
from .random_crop_data import EastRandomCropData, RandomCropImgMask from .random_crop_data import EastRandomCropData, RandomCropImgMask
from .make_pse_gt import MakePseGt from .make_pse_gt import MakePseGt
from .rec_img_aug import RecAug, RecConAug, RecResizeImg, ClsResizeImg, \ from .rec_img_aug import RecAug, RecConAug, RecResizeImg, ClsResizeImg, \
SRNRecResizeImg, GrayRecResizeImg, SARRecResizeImg, PRENResizeImg SRNRecResizeImg, GrayRecResizeImg, SARRecResizeImg, PRENResizeImg
from .ssl_img_aug import SSLRotateResize from .ssl_img_aug import SSLRotateResize
......
...@@ -22,13 +22,74 @@ from .text_image_aug import tia_perspective, tia_stretch, tia_distort ...@@ -22,13 +22,74 @@ from .text_image_aug import tia_perspective, tia_stretch, tia_distort
class RecAug(object): class RecAug(object):
def __init__(self, use_tia=True, aug_prob=0.4, **kwargs): def __init__(self,
self.use_tia = use_tia tia_prob=0.4,
self.aug_prob = aug_prob crop_prob=0.4,
reverse_prob=0.4,
noise_prob=0.4,
jitter_prob=0.4,
blur_prob=0.4,
hsv_aug_prob=0.4,
**kwargs):
self.tia_prob = tia_prob
self.bda = BaseDataAugmentation(crop_prob, reverse_prob, noise_prob,
jitter_prob, blur_prob, hsv_aug_prob)
def __call__(self, data): def __call__(self, data):
img = data['image'] img = data['image']
img = warp(img, 10, self.use_tia, self.aug_prob) h, w, _ = img.shape
# tia
if random.random() <= self.tia_prob:
if h >= 20 and w >= 20:
img = tia_distort(img, random.randint(3, 6))
img = tia_stretch(img, random.randint(3, 6))
img = tia_perspective(img)
# bda
data['image'] = img
data = self.bda(data)
return data
class BaseDataAugmentation(object):
def __init__(self,
crop_prob=0.4,
reverse_prob=0.4,
noise_prob=0.4,
jitter_prob=0.4,
blur_prob=0.4,
hsv_aug_prob=0.4,
**kwargs):
self.crop_prob = crop_prob
self.reverse_prob = reverse_prob
self.noise_prob = noise_prob
self.jitter_prob = jitter_prob
self.blur_prob = blur_prob
self.hsv_aug_prob = hsv_aug_prob
def __call__(self, data):
img = data['image']
h, w, _ = img.shape
if random.random() <= self.crop_prob and h >= 20 and w >= 20:
img = get_crop(img)
if random.random() <= self.blur_prob:
img = blur(img)
if random.random() <= self.hsv_aug_prob:
img = hsv_aug(img)
if random.random() <= self.jitter_prob:
img = jitter(img)
if random.random() <= self.noise_prob:
img = add_gasuss_noise(img)
if random.random() <= self.reverse_prob:
img = 255 - img
data['image'] = img data['image'] = img
return data return data
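The hunk above replaces one shared `aug_prob` with a separate probability per augmentation, each gated by its own random draw. A minimal sketch of that gating pattern follows. `ProbGatedAug` and `invert` are hypothetical names for illustration, and the injectable `rng` is an addition that makes the behaviour reproducible; the diff itself calls the `random` module directly.

```python
import random

import numpy as np


class ProbGatedAug:
    """Apply each (probability, function) pair independently, in order."""

    def __init__(self, steps, rng=None):
        self.steps = list(steps)
        # Assumption: an injectable generator, so tests can fix the draws.
        self.rng = rng or random.Random()

    def __call__(self, data):
        img = data["image"]
        for prob, fn in self.steps:
            # Same gating rule as the diff: apply when the draw is <= prob.
            if self.rng.random() <= prob:
                img = fn(img)
        data["image"] = img
        return data


def invert(img):
    """Color inversion, the `255 - img` step from the diff."""
    return 255 - img


if __name__ == "__main__":
    aug = ProbGatedAug([(0.4, invert)], rng=random.Random(0))
    out = aug({"image": np.zeros((48, 320, 3), dtype=np.uint8)})
    print(out["image"].shape)
```

Keeping each step as a `(probability, function)` pair means a step can be disabled by lowering its probability without touching the others, which is what the per-argument `*_prob` parameters allow.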
...@@ -370,7 +431,7 @@ def flag(): ...@@ -370,7 +431,7 @@ def flag():
return 1 if random.random() > 0.5000001 else -1 return 1 if random.random() > 0.5000001 else -1
def cvtColor(img): def hsv_aug(img):
""" """
cvtColor cvtColor
""" """
...@@ -438,50 +499,6 @@ def get_crop(image): ...@@ -438,50 +499,6 @@ def get_crop(image):
return crop_img return crop_img
class Config:
"""
Config
"""
def __init__(self, use_tia):
self.anglex = random.random() * 30
self.angley = random.random() * 15
self.anglez = random.random() * 10
self.fov = 42
self.r = 0
self.shearx = random.random() * 0.3
self.sheary = random.random() * 0.05
self.borderMode = cv2.BORDER_REPLICATE
self.use_tia = use_tia
def make(self, w, h, ang):
"""
make
"""
self.anglex = random.random() * 5 * flag()
self.angley = random.random() * 5 * flag()
self.anglez = -1 * random.random() * int(ang) * flag()
self.fov = 42
self.r = 0
self.shearx = 0
self.sheary = 0
self.borderMode = cv2.BORDER_REPLICATE
self.w = w
self.h = h
self.perspective = self.use_tia
self.stretch = self.use_tia
self.distort = self.use_tia
self.crop = True
self.affine = False
self.reverse = True
self.noise = True
self.jitter = True
self.blur = True
self.color = True
def rad(x): def rad(x):
""" """
rad rad
...@@ -565,48 +582,3 @@ def get_warpAffine(config): ...@@ -565,48 +582,3 @@ def get_warpAffine(config):
rz = np.array([[np.cos(rad(anglez)), np.sin(rad(anglez)), 0], rz = np.array([[np.cos(rad(anglez)), np.sin(rad(anglez)), 0],
[-np.sin(rad(anglez)), np.cos(rad(anglez)), 0]], np.float32) [-np.sin(rad(anglez)), np.cos(rad(anglez)), 0]], np.float32)
return rz return rz
def warp(img, ang, use_tia=True, prob=0.4):
"""
warp
"""
h, w, _ = img.shape
config = Config(use_tia=use_tia)
config.make(w, h, ang)
new_img = img
if config.distort:
img_height, img_width = img.shape[0:2]
if random.random() <= prob and img_height >= 20 and img_width >= 20:
new_img = tia_distort(new_img, random.randint(3, 6))
if config.stretch:
img_height, img_width = img.shape[0:2]
if random.random() <= prob and img_height >= 20 and img_width >= 20:
new_img = tia_stretch(new_img, random.randint(3, 6))
if config.perspective:
if random.random() <= prob:
new_img = tia_perspective(new_img)
if config.crop:
img_height, img_width = img.shape[0:2]
if random.random() <= prob and img_height >= 20 and img_width >= 20:
new_img = get_crop(new_img)
if config.blur:
if random.random() <= prob:
new_img = blur(new_img)
if config.color:
if random.random() <= prob:
new_img = cvtColor(new_img)
if config.jitter:
new_img = jitter(new_img)
if config.noise:
if random.random() <= prob:
new_img = add_gasuss_noise(new_img)
if config.reverse:
if random.random() <= prob:
new_img = 255 - new_img
return new_img
...@@ -33,7 +33,7 @@ class SimpleDataSet(Dataset): ...@@ -33,7 +33,7 @@ class SimpleDataSet(Dataset):
self.delimiter = dataset_config.get('delimiter', '\t') self.delimiter = dataset_config.get('delimiter', '\t')
label_file_list = dataset_config.pop('label_file_list') label_file_list = dataset_config.pop('label_file_list')
data_source_num = len(label_file_list) data_source_num = len(label_file_list)
ratio_list = dataset_config.get("ratio_list", [1.0]) ratio_list = dataset_config.get("ratio_list", 1.0)
if isinstance(ratio_list, (float, int)): if isinstance(ratio_list, (float, int)):
ratio_list = [float(ratio_list)] * int(data_source_num) ratio_list = [float(ratio_list)] * int(data_source_num)
......
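The one-line change above matters because only a scalar is broadcast to every label file; the old list default `[1.0]` skipped the broadcast and stayed at length one. A sketch of the normalization, assuming a hypothetical standalone function `normalize_ratio_list`. The length check is an assumption added here to make the failure mode visible and is not shown in the hunk.

```python
def normalize_ratio_list(dataset_config, data_source_num):
    """Return one sampling ratio per label file."""
    ratio_list = dataset_config.get("ratio_list", 1.0)
    if isinstance(ratio_list, (float, int)):
        # A scalar (including the new default) is broadcast to every source.
        ratio_list = [float(ratio_list)] * int(data_source_num)
    # Assumed check, for illustration: a list must match the source count.
    if len(ratio_list) != data_source_num:
        raise ValueError("ratio_list needs one entry per label file")
    return ratio_list
```

With three label files and no `ratio_list` in the config, this returns `[1.0, 1.0, 1.0]`; with the previous default the list would have had a single entry.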
...@@ -28,6 +28,7 @@ import numpy as np ...@@ -28,6 +28,7 @@ import numpy as np
import time import time
import tools.infer.predict_rec as predict_rec import tools.infer.predict_rec as predict_rec
import tools.infer.predict_det as predict_det import tools.infer.predict_det as predict_det
import tools.infer.utility as utility
from ppocr.utils.utility import get_image_file_list, check_and_read_gif from ppocr.utils.utility import get_image_file_list, check_and_read_gif
from ppocr.utils.logging import get_logger from ppocr.utils.logging import get_logger
from ppstructure.table.matcher import distance, compute_iou from ppstructure.table.matcher import distance, compute_iou
...@@ -59,11 +60,37 @@ class TableSystem(object): ...@@ -59,11 +60,37 @@ class TableSystem(object):
self.text_recognizer = predict_rec.TextRecognizer( self.text_recognizer = predict_rec.TextRecognizer(
args) if text_recognizer is None else text_recognizer args) if text_recognizer is None else text_recognizer
self.table_structurer = predict_strture.TableStructurer(args) self.table_structurer = predict_strture.TableStructurer(args)
self.benchmark = args.benchmark
self.predictor, self.input_tensor, self.output_tensors, self.config = utility.create_predictor(
args, 'table', logger)
if args.benchmark:
import auto_log
pid = os.getpid()
gpu_id = utility.get_infer_gpuid()
self.autolog = auto_log.AutoLogger(
model_name="table",
model_precision=args.precision,
batch_size=1,
data_shape="dynamic",
save_path=None, #args.save_log_path,
inference_config=self.config,
pids=pid,
process_name=None,
gpu_ids=gpu_id if args.use_gpu else None,
time_keys=[
'preprocess_time', 'inference_time', 'postprocess_time'
],
warmup=0,
logger=logger)
def __call__(self, img, return_ocr_result_in_table=False): def __call__(self, img, return_ocr_result_in_table=False):
result = dict() result = dict()
ori_im = img.copy() ori_im = img.copy()
if self.benchmark:
self.autolog.times.start()
structure_res, elapse = self.table_structurer(copy.deepcopy(img)) structure_res, elapse = self.table_structurer(copy.deepcopy(img))
if self.benchmark:
self.autolog.times.stamp()
dt_boxes, elapse = self.text_detector(copy.deepcopy(img)) dt_boxes, elapse = self.text_detector(copy.deepcopy(img))
dt_boxes = sorted_boxes(dt_boxes) dt_boxes = sorted_boxes(dt_boxes)
if return_ocr_result_in_table: if return_ocr_result_in_table:
...@@ -77,13 +104,11 @@ class TableSystem(object): ...@@ -77,13 +104,11 @@ class TableSystem(object):
box = [x_min, y_min, x_max, y_max] box = [x_min, y_min, x_max, y_max]
r_boxes.append(box) r_boxes.append(box)
dt_boxes = np.array(r_boxes) dt_boxes = np.array(r_boxes)
logger.debug("dt_boxes num : {}, elapse : {}".format( logger.debug("dt_boxes num : {}, elapse : {}".format(
len(dt_boxes), elapse)) len(dt_boxes), elapse))
if dt_boxes is None: if dt_boxes is None:
return None, None return None, None
img_crop_list = [] img_crop_list = []
for i in range(len(dt_boxes)): for i in range(len(dt_boxes)):
det_box = dt_boxes[i] det_box = dt_boxes[i]
x0, y0, x1, y1 = expand(2, det_box, ori_im.shape) x0, y0, x1, y1 = expand(2, det_box, ori_im.shape)
...@@ -92,10 +117,14 @@ class TableSystem(object): ...@@ -92,10 +117,14 @@ class TableSystem(object):
rec_res, elapse = self.text_recognizer(img_crop_list) rec_res, elapse = self.text_recognizer(img_crop_list)
logger.debug("rec_res num : {}, elapse : {}".format( logger.debug("rec_res num : {}, elapse : {}".format(
len(rec_res), elapse)) len(rec_res), elapse))
if self.benchmark:
self.autolog.times.stamp()
if return_ocr_result_in_table: if return_ocr_result_in_table:
result['rec_res'] = rec_res result['rec_res'] = rec_res
pred_html, pred = self.rebuild_table(structure_res, dt_boxes, rec_res) pred_html, pred = self.rebuild_table(structure_res, dt_boxes, rec_res)
result['html'] = pred_html result['html'] = pred_html
if self.benchmark:
self.autolog.times.end(stamp=True)
return result return result
def rebuild_table(self, structure_res, dt_boxes, rec_res): def rebuild_table(self, structure_res, dt_boxes, rec_res):
...@@ -213,6 +242,8 @@ def main(args): ...@@ -213,6 +242,8 @@ def main(args):
logger.info('excel saved to {}'.format(excel_path)) logger.info('excel saved to {}'.format(excel_path))
elapse = time.time() - starttime elapse = time.time() - starttime
logger.info("Predict time : {:.3f}s".format(elapse)) logger.info("Predict time : {:.3f}s".format(elapse))
if args.benchmark:
text_sys.autolog.report()
if __name__ == "__main__": if __name__ == "__main__":
......
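The benchmark hooks above follow a start / stamp / stamp / end pattern, which gives four marks and therefore three intervals, one per entry in `time_keys`. The class below is a plain-Python stand-in that mimics that pattern; it is not `auto_log`'s implementation, and `StageTimer` is a hypothetical name.

```python
import time


class StageTimer:
    """Minimal stand-in for the start/stamp/end pattern used in the diff."""

    def __init__(self, keys):
        self.keys = list(keys)
        self.marks = []

    def start(self):
        # Reset and record the first mark.
        self.marks = [time.perf_counter()]

    def stamp(self):
        # Record an intermediate mark.
        self.marks.append(time.perf_counter())

    def end(self, stamp=True):
        # Optionally record the final mark.
        if stamp:
            self.stamp()

    def report(self):
        """Map each key to the seconds between consecutive marks."""
        deltas = [b - a for a, b in zip(self.marks, self.marks[1:])]
        return dict(zip(self.keys, deltas))
```

Used with the three keys from the diff (`preprocess_time`, `inference_time`, `postprocess_time`), `report()` returns one duration per key, provided the four calls are made in the same order as in `TableSystem.__call__`.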
...@@ -57,10 +57,11 @@ function status_check(){ ...@@ -57,10 +57,11 @@ function status_check(){
last_status=$1 # the exit code last_status=$1 # the exit code
run_command=$2 run_command=$2
run_log=$3 run_log=$3
model_name=$4
if [ $last_status -eq 0 ]; then if [ $last_status -eq 0 ]; then
echo -e "\033[33m Run successfully with command - ${run_command}! \033[0m" | tee -a ${run_log} echo -e "\033[33m Run successfully with command - ${model_name} - ${run_command}! \033[0m" | tee -a ${run_log}
else else
echo -e "\033[33m Run failed with command - ${run_command}! \033[0m" | tee -a ${run_log} echo -e "\033[33m Run failed with command - ${model_name} - ${run_command}! \033[0m" | tee -a ${run_log}
fi fi
} }
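The updated `status_check` adds the model name to both log messages. The same message format, written as a Python function for illustration only: `status_message` is a hypothetical name, the ANSI color codes and the `tee` to the log file are left out, and the command in the usage note is a placeholder.

```python
def status_message(last_status, run_command, model_name):
    """Build the line that status_check echoes, without the ANSI colors."""
    outcome = "successfully" if last_status == 0 else "failed"
    return "Run {} with command - {} - {}!".format(
        outcome, model_name, run_command)
```

For example, `status_message(1, "some command", "ch_PP-OCRv3_det_KL")` yields a failure line that names the model, so a log that mixes several models can be filtered by name.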
===========================kl_quant_params=========================== ===========================kl_quant_params===========================
model_name:PPOCRv2_ocr_det_kl model_name:ch_PP-OCRv2_det_KL
python:python3.7 python:python3.7
Global.pretrained_model:null Global.pretrained_model:null
Global.save_inference_dir:null Global.save_inference_dir:null
...@@ -8,10 +8,10 @@ infer_export:deploy/slim/quantization/quant_kl.py -c configs/det/ch_PP-OCRv2/ch_ ...@@ -8,10 +8,10 @@ infer_export:deploy/slim/quantization/quant_kl.py -c configs/det/ch_PP-OCRv2/ch_
infer_quant:True infer_quant:True
inference:tools/infer/predict_det.py inference:tools/infer/predict_det.py
--use_gpu:False|True --use_gpu:False|True
--enable_mkldnn:True --enable_mkldnn:False
--cpu_threads:1|6 --cpu_threads:6
--rec_batch_num:1 --rec_batch_num:1
--use_tensorrt:False|True --use_tensorrt:False
--precision:int8 --precision:int8
--det_model_dir: --det_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/ --image_dir:./inference/ch_det_data_50/all-sum-510/
......
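The config lines above use a `key:value1|value2` layout, so dropping an alternative (for example `--cpu_threads:1|6` to `--cpu_threads:6`) presumably shrinks the set of combinations that get run. The sketch below reads that layout off the visible lines. How the real test_tipc scripts parse these files is not shown in this diff, so both functions are hypothetical.

```python
from itertools import product


def parse_tipc_line(line):
    """Split 'key:v1|v2' into (key, [v1, v2]); a sketch, not the TIPC parser."""
    key, _, value = line.strip().partition(":")
    return key, value.split("|")


def expand_grid(lines):
    """Yield one dict per combination of the listed values."""
    parsed = [parse_tipc_line(line) for line in lines]
    keys = [key for key, _ in parsed]
    for combo in product(*(values for _, values in parsed)):
        yield dict(zip(keys, combo))
```

Under this reading, the old pair `--use_gpu:False|True` and `--cpu_threads:1|6` expands to four combinations, and the new pair with `--cpu_threads:6` expands to two.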
===========================kl_quant_params=========================== ===========================kl_quant_params===========================
model_name:PPOCRv2_ocr_rec_kl model_name:ch_PP-OCRv2_rec_KL
python:python3.7 python:python3.7
Global.pretrained_model:null Global.pretrained_model:null
Global.save_inference_dir:null Global.save_inference_dir:null
infer_model:./inference/ch_PP-OCRv2_rec_infer/ infer_model:./inference/ch_PP-OCRv2_rec_infer/
infer_export:deploy/slim/quantization/quant_kl.py -c test_tipc/configs/ch_PP-OCRv2_rec/ch_PP-OCRv2_rec_distillation.yml -o infer_export:deploy/slim/quantization/quant_kl.py -c test_tipc/configs/ch_PP-OCRv2_rec/ch_PP-OCRv2_rec_distillation.yml -o
infer_quant:True infer_quant:True
inference:tools/infer/predict_rec.py inference:tools/infer/predict_rec.py --rec_image_shape="3,32,320"
--use_gpu:False|True --use_gpu:False|True
--enable_mkldnn:False|True --enable_mkldnn:False
--cpu_threads:1|6 --cpu_threads:6
--rec_batch_num:1|6 --rec_batch_num:1|6
--use_tensorrt:True --use_tensorrt:False
--precision:int8 --precision:int8
--rec_model_dir: --rec_model_dir:
--image_dir:./inference/rec_inference --image_dir:./inference/rec_inference
......
...@@ -4,7 +4,7 @@ python:python3.7 ...@@ -4,7 +4,7 @@ python:python3.7
gpu_list:0|0,1 gpu_list:0|0,1
Global.use_gpu:True|True Global.use_gpu:True|True
Global.auto_cast:fp32 Global.auto_cast:fp32
Global.epoch_num:lite_train_lite_infer=6|whole_train_whole_infer=50 Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=50
Global.save_model_dir:./output/ Global.save_model_dir:./output/
Train.loader.batch_size_per_card:lite_train_lite_infer=16|whole_train_whole_infer=128 Train.loader.batch_size_per_card:lite_train_lite_infer=16|whole_train_whole_infer=128
Global.pretrained_model:pretrain_models/ch_PP-OCRv2_rec_train/best_accuracy Global.pretrained_model:pretrain_models/ch_PP-OCRv2_rec_train/best_accuracy
......
===========================kl_quant_params=========================== ===========================kl_quant_params===========================
model_name:PPOCRv3_ocr_det_kl model_name:ch_PP-OCRv3_det_KL
python:python3.7 python:python3.7
Global.pretrained_model:null Global.pretrained_model:null
Global.save_inference_dir:null Global.save_inference_dir:null
...@@ -8,10 +8,10 @@ infer_export:deploy/slim/quantization/quant_kl.py -c configs/det/ch_PP-OCRv3/ch_ ...@@ -8,10 +8,10 @@ infer_export:deploy/slim/quantization/quant_kl.py -c configs/det/ch_PP-OCRv3/ch_
infer_quant:True infer_quant:True
inference:tools/infer/predict_det.py inference:tools/infer/predict_det.py
--use_gpu:False|True --use_gpu:False|True
--enable_mkldnn:True --enable_mkldnn:False
--cpu_threads:1|6 --cpu_threads:6
--rec_batch_num:1 --rec_batch_num:1
--use_tensorrt:False|True --use_tensorrt:False
--precision:int8 --precision:int8
--det_model_dir: --det_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/ --image_dir:./inference/ch_det_data_50/all-sum-510/
......
===========================kl_quant_params=========================== ===========================kl_quant_params===========================
model_name:PPOCRv3_ocr_rec_kl model_name:ch_PP-OCRv3_rec_KL
python:python3.7 python:python3.7
Global.pretrained_model:null Global.pretrained_model:
Global.save_inference_dir:null Global.save_inference_dir:null
infer_model:./inference/ch_PP-OCRv3_rec_infer/ infer_model:./inference/ch_PP-OCRv3_rec_infer/
infer_export:deploy/slim/quantization/quant_kl.py -c test_tipc/configs/ch_PP-OCRv3_rec/ch_PP-OCRv3_rec_distillation.yml -o infer_export:deploy/slim/quantization/quant_kl.py -c test_tipc/configs/ch_PP-OCRv3_rec/ch_PP-OCRv3_rec_distillation.yml -o
infer_quant:True infer_quant:True
inference:tools/infer/predict_rec.py --rec_image_shape="3,48,320" inference:tools/infer/predict_rec.py --rec_image_shape="3,48,320"
--use_gpu:False|True --use_gpu:False|True
--enable_mkldnn:False|True --enable_mkldnn:False
--cpu_threads:1|6 --cpu_threads:6
--rec_batch_num:1|6 --rec_batch_num:1|6
--use_tensorrt:True --use_tensorrt:False
--precision:int8 --precision:int8
--rec_model_dir: --rec_model_dir:
--image_dir:./inference/rec_inference --image_dir:./inference/rec_inference
......
...@@ -4,7 +4,7 @@ python:python3.7 ...@@ -4,7 +4,7 @@ python:python3.7
gpu_list:0|0,1 gpu_list:0|0,1
Global.use_gpu:True|True Global.use_gpu:True|True
Global.auto_cast:fp32 Global.auto_cast:fp32
Global.epoch_num:lite_train_lite_infer=6|whole_train_whole_infer=50 Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=50
Global.save_model_dir:./output/ Global.save_model_dir:./output/
Train.loader.batch_size_per_card:lite_train_lite_infer=16|whole_train_whole_infer=128 Train.loader.batch_size_per_card:lite_train_lite_infer=16|whole_train_whole_infer=128
Global.pretrained_model:pretrain_models/ch_PP-OCRv3_rec_train/best_accuracy Global.pretrained_model:pretrain_models/ch_PP-OCRv3_rec_train/best_accuracy
......
...@@ -8,10 +8,10 @@ infer_export:deploy/slim/quantization/quant_kl.py -c configs/det/ch_ppocr_v2.0/c ...@@ -8,10 +8,10 @@ infer_export:deploy/slim/quantization/quant_kl.py -c configs/det/ch_ppocr_v2.0/c
infer_quant:True infer_quant:True
inference:tools/infer/predict_det.py inference:tools/infer/predict_det.py
--use_gpu:False|True --use_gpu:False|True
--enable_mkldnn:True --enable_mkldnn:False
--cpu_threads:1|6 --cpu_threads:6
--rec_batch_num:1 --rec_batch_num:1
--use_tensorrt:False|True --use_tensorrt:False
--precision:int8 --precision:int8
--det_model_dir: --det_model_dir:
--image_dir:./inference/ch_det_data_50/all-sum-510/ --image_dir:./inference/ch_det_data_50/all-sum-510/
......
...@@ -4,7 +4,7 @@ python:python3.7 ...@@ -4,7 +4,7 @@ python:python3.7
gpu_list:0|0,1 gpu_list:0|0,1
Global.use_gpu:True|True Global.use_gpu:True|True
Global.auto_cast:null Global.auto_cast:null
Global.epoch_num:lite_train_lite_infer=20|whole_train_whole_infer=50 Global.epoch_num:lite_train_lite_infer=2|whole_train_whole_infer=50
Global.save_model_dir:./output/ Global.save_model_dir:./output/
Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4 Train.loader.batch_size_per_card:lite_train_lite_infer=2|whole_train_whole_infer=4
Global.pretrained_model:null Global.pretrained_model:null
......
...@@ -6,12 +6,12 @@ Global.save_inference_dir:null ...@@ -6,12 +6,12 @@ Global.save_inference_dir:null
infer_model:./inference/ch_ppocr_mobile_v2.0_rec_infer/ infer_model:./inference/ch_ppocr_mobile_v2.0_rec_infer/
infer_export:deploy/slim/quantization/quant_kl.py -c test_tipc/configs/ch_ppocr_mobile_v2.0_rec_KL/rec_chinese_lite_train_v2.0.yml -o infer_export:deploy/slim/quantization/quant_kl.py -c test_tipc/configs/ch_ppocr_mobile_v2.0_rec_KL/rec_chinese_lite_train_v2.0.yml -o
infer_quant:True infer_quant:True
inference:tools/infer/predict_rec.py inference:tools/infer/predict_rec.py --rec_image_shape="3,32,320"
--use_gpu:False|True --use_gpu:False|True
--enable_mkldnn:True --enable_mkldnn:False
--cpu_threads:1|6 --cpu_threads:6
--rec_batch_num:1 --rec_batch_num:1
--use_tensorrt:False|True --use_tensorrt:False
--precision:int8 --precision:int8
--rec_model_dir: --rec_model_dir:
--image_dir:./inference/rec_inference --image_dir:./inference/rec_inference
......