提交 2735e9e3 编写于 作者: L LDOUBLEV

Merge branch 'dygraph' of https://github.com/PaddlePaddle/PaddleOCR into dyg_db

...@@ -4,16 +4,18 @@ ...@@ -4,16 +4,18 @@
PaddleOCR旨在打造一套丰富、领先、且实用的OCR工具库,助力使用者训练出更好的模型,并应用落地。 PaddleOCR旨在打造一套丰富、领先、且实用的OCR工具库,助力使用者训练出更好的模型,并应用落地。
**近期更新** **近期更新**
- 2020.12.07 [FAQ](./doc/doc_ch/FAQ.md)新增5个高频问题,总数124个,并且计划以后每周一都会更新,欢迎大家持续关注。
- 2020.11.25 更新半自动标注工具[PPOCRLabel](./PPOCRLabel/README.md),辅助开发者高效完成标注任务,输出格式与PP-OCR训练任务完美衔接。
- 2020.9.22 更新PP-OCR技术文章,https://arxiv.org/abs/2009.09941 - 2020.9.22 更新PP-OCR技术文章,https://arxiv.org/abs/2009.09941
- 2020.9.19 更新超轻量压缩ppocr_mobile_slim系列模型,整体模型3.5M(详见[PP-OCR Pipline](#PP-OCR)),适合在移动端部署使用。[模型下载](#模型下载) - 2020.9.19 更新超轻量压缩ppocr_mobile_slim系列模型,整体模型3.5M(详见[PP-OCR Pipeline](#PP-OCR)),适合在移动端部署使用。[模型下载](#模型下载)
- 2020.9.17 更新超轻量ppocr_mobile系列和通用ppocr_server系列中英文ocr模型,媲美商业效果。[模型下载](#模型下载) - 2020.9.17 更新超轻量ppocr_mobile系列和通用ppocr_server系列中英文ocr模型,媲美商业效果。[模型下载](#模型下载)
- 2020.9.17 更新[英文识别模型](./doc/doc_ch/models_list.md#英文识别模型)[多语言识别模型](doc/doc_ch/models_list.md#多语言识别模型),已支持`德语、法语、日语、韩语`,更多语种识别模型将持续更新。 - 2020.9.17 更新[英文识别模型](./doc/doc_ch/models_list.md#英文识别模型)[多语言识别模型](doc/doc_ch/models_list.md#多语言识别模型),已支持`德语、法语、日语、韩语`,更多语种识别模型将持续更新。
- 2020.8.26 更新OCR相关的84个常见问题及解答,具体参考[FAQ](./doc/doc_ch/FAQ.md)
- 2020.8.24 支持通过whl包安装使用PaddleOCR,具体参考[Paddleocr Package使用说明](./doc/doc_ch/whl.md) - 2020.8.24 支持通过whl包安装使用PaddleOCR,具体参考[Paddleocr Package使用说明](./doc/doc_ch/whl.md)
- 2020.8.21 更新8月18日B站直播课回放和PPT,课节2,易学易用的OCR工具大礼包,[获取地址](https://aistudio.baidu.com/aistudio/education/group/info/1519) - 2020.8.21 更新8月18日B站直播课回放和PPT,课节2,易学易用的OCR工具大礼包,[获取地址](https://aistudio.baidu.com/aistudio/education/group/info/1519)
- [More](./doc/doc_ch/update.md) - [More](./doc/doc_ch/update.md)
## 特性 ## 特性
- PPOCR系列高质量预训练模型,准确的识别效果 - PPOCR系列高质量预训练模型,准确的识别效果
...@@ -48,15 +50,14 @@ PaddleOCR旨在打造一套丰富、领先、且实用的OCR工具库,助力 ...@@ -48,15 +50,14 @@ PaddleOCR旨在打造一套丰富、领先、且实用的OCR工具库,助力
- 代码体验:从[快速安装](./doc/doc_ch/installation.md) 开始 - 代码体验:从[快速安装](./doc/doc_ch/installation.md) 开始
<a name="模型下载"></a> <a name="模型下载"></a>
## PP-OCR 1.1系列模型列表(9月17日更新 ## PP-OCR 2.0系列模型列表(更新中
| 模型简介 | 模型名称 |推荐场景 | 检测模型 | 方向分类器 | 识别模型 | | 模型简介 | 模型名称 |推荐场景 | 检测模型 | 方向分类器 | 识别模型 |
| ------------ | --------------- | ----------------|---- | ---------- | -------- | | ------------ | --------------- | ----------------|---- | ---------- | -------- |
| 中英文超轻量OCR模型(8.1M) | ch_ppocr_mobile_v1.1_xx |移动端&服务器端|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_train.tar)|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_train.tar) |[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_pre.tar) | | 中英文超轻量OCR模型(8.1M) | ch_ppocr_mobile_v2.0_xx |移动端&服务器端|[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_train.tar)|[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) |[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_pre.tar) |
| 中英文通用OCR模型(155.1M) |ch_ppocr_server_v1.1_xx|服务器端 |[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/server/det/ch_ppocr_server_v1.1_det_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/server/det/ch_ppocr_server_v1.1_det_train.tar) |[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_train.tar) |[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_pre.tar) | | 中英文通用OCR模型(143M) |ch_ppocr_server_v2.0_xx|服务器端 |[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_train.tar) |[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) |[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_pre.tar) |
| 中英文超轻量压缩OCR模型(3.5M) | ch_ppocr_mobile_slim_v1.1_xx| 移动端 |[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/det/ch_ppocr_mobile_v1.1_det_prune_infer.tar) / [slim模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/lite/ch_ppocr_mobile_v1.1_det_prune_opt.nb) |[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_quant_infer.tar) / [slim模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/lite/ch_ppocr_mobile_v1.1_cls_quant_opt.nb)| [推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/rec/ch_ppocr_mobile_v1.1_rec_quant_infer.tar) / [slim模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/lite/ch_ppocr_mobile_v1.1_rec_quant_opt.nb)|
更多模型下载(包括多语言),可以参考[PP-OCR v1.1 系列模型下载](./doc/doc_ch/models_list.md) 更多模型下载(包括多语言),可以参考[PP-OCR v2.0 系列模型下载](./doc/doc_ch/models_list.md)
## 文档教程 ## 文档教程
- [快速安装](./doc/doc_ch/installation.md) - [快速安装](./doc/doc_ch/installation.md)
...@@ -141,6 +142,7 @@ PP-OCR是一个实用的超轻量OCR系统。主要由DB文本检测、检测框 ...@@ -141,6 +142,7 @@ PP-OCR是一个实用的超轻量OCR系统。主要由DB文本检测、检测框
## 贡献代码 ## 贡献代码
我们非常欢迎你为PaddleOCR贡献代码,也十分感谢你的反馈。 我们非常欢迎你为PaddleOCR贡献代码,也十分感谢你的反馈。
- 非常感谢 [Khanh Tran](https://github.com/xxxpsyduck)[Karl Horky](https://github.com/karlhorky) 贡献修改英文文档 - 非常感谢 [Khanh Tran](https://github.com/xxxpsyduck)[Karl Horky](https://github.com/karlhorky) 贡献修改英文文档
- 非常感谢 [zhangxin](https://github.com/ZhangXinNan)([Blog](https://blog.csdn.net/sdlypyzq)) 贡献新的可视化方式、添加.gitgnore、处理手动设置PYTHONPATH环境变量的问题 - 非常感谢 [zhangxin](https://github.com/ZhangXinNan)([Blog](https://blog.csdn.net/sdlypyzq)) 贡献新的可视化方式、添加.gitgnore、处理手动设置PYTHONPATH环境变量的问题
- 非常感谢 [lyl120117](https://github.com/lyl120117) 贡献打印网络结构的代码 - 非常感谢 [lyl120117](https://github.com/lyl120117) 贡献打印网络结构的代码
...@@ -148,3 +150,6 @@ PP-OCR是一个实用的超轻量OCR系统。主要由DB文本检测、检测框 ...@@ -148,3 +150,6 @@ PP-OCR是一个实用的超轻量OCR系统。主要由DB文本检测、检测框
- 非常感谢 [authorfu](https://github.com/authorfu) 贡献Android和[xiadeye](https://github.com/xiadeye) 贡献IOS的demo代码 - 非常感谢 [authorfu](https://github.com/authorfu) 贡献Android和[xiadeye](https://github.com/xiadeye) 贡献IOS的demo代码
- 非常感谢 [BeyondYourself](https://github.com/BeyondYourself) 给PaddleOCR提了很多非常棒的建议,并简化了PaddleOCR的部分代码风格。 - 非常感谢 [BeyondYourself](https://github.com/BeyondYourself) 给PaddleOCR提了很多非常棒的建议,并简化了PaddleOCR的部分代码风格。
- 非常感谢 [tangmq](https://gitee.com/tangmq) 给PaddleOCR增加Docker化部署服务,支持快速发布可调用的Restful API服务。 - 非常感谢 [tangmq](https://gitee.com/tangmq) 给PaddleOCR增加Docker化部署服务,支持快速发布可调用的Restful API服务。
- 非常感谢 [lijinhan](https://github.com/lijinhan) 给PaddleOCR增加java SpringBoot 调用OCR Hubserving接口完成对OCR服务化部署的使用。
- 非常感谢 [Mejans](https://github.com/Mejans) 给PaddleOCR增加新语言奥克西坦语Occitan的字典和语料。
- 非常感谢 [Evezerest](https://github.com/Evezerest)[ninetailskim](https://github.com/ninetailskim)[edencfc](https://github.com/edencfc)[BeyondYourself](https://github.com/BeyondYourself)[1084667371](https://github.com/1084667371) 贡献了PPOCRLabel的完整代码。
此差异已折叠。
...@@ -8,7 +8,6 @@ Global: ...@@ -8,7 +8,6 @@ Global:
# evaluation is run every 5000 iterations after the 4000th iteration # evaluation is run every 5000 iterations after the 4000th iteration
eval_batch_step: [0, 1000] eval_batch_step: [0, 1000]
# if pretrained_model is saved in static mode, load_static_weights must set to True # if pretrained_model is saved in static mode, load_static_weights must set to True
load_static_weights: True
cal_metric_during_train: True cal_metric_during_train: True
pretrained_model: pretrained_model:
checkpoints: checkpoints:
......
Global:
use_gpu: true
epoch_num: 1200
log_smooth_window: 20
print_batch_step: 2
save_model_dir: ./output/det_r50_vd/
save_epoch_step: 1200
# evaluation is run every 5000 iterations after the 4000th iteration
eval_batch_step: 8
# if pretrained_model is saved in static mode, load_static_weights must set to True
load_static_weights: True
cal_metric_during_train: False
pretrained_model: ./pretrain_models/ResNet50_vd_ssld_pretrained/
checkpoints:
save_inference_dir:
use_visualdl: True
infer_img: doc/imgs_en/img_10.jpg
save_res_path: ./output/det_db/predicts_db.txt
Optimizer:
name: Adam
beta1: 0.9
beta2: 0.999
learning_rate:
lr: 0.001
regularizer:
name: 'L2'
factor: 0
Architecture:
type: det
algorithm: DB
Transform:
Backbone:
name: ResNet
layers: 50
Neck:
name: FPN
out_channels: 256
Head:
name: DBHead
k: 50
Loss:
name: DBLoss
balance_loss: true
main_loss_type: DiceLoss
alpha: 5
beta: 10
ohem_ratio: 3
PostProcess:
name: DBPostProcess
thresh: 0.3
box_thresh: 0.6
max_candidates: 1000
unclip_ratio: 1.5
Metric:
name: DetMetric
main_indicator: hmean
TRAIN:
dataset:
name: SimpleDataSet
data_dir: ./detection/
file_list:
- ./detection/train_icdar2015_label.txt # dataset1
ratio_list: [1.0]
transforms:
- DecodeImage: # load image
img_mode: BGR
channel_first: False
- DetLabelEncode: # Class handling label
- IaaAugment:
augmenter_args:
- { 'type': Fliplr, 'args': { 'p': 0.5 } }
- { 'type': Affine, 'args': { 'rotate': [ -10,10 ] } }
- { 'type': Resize,'args': { 'size': [ 0.5,3 ] } }
- EastRandomCropData:
size: [ 640,640 ]
max_tries: 50
keep_ratio: true
- MakeBorderMap:
shrink_ratio: 0.4
thresh_min: 0.3
thresh_max: 0.7
- MakeShrinkMap:
shrink_ratio: 0.4
min_text_size: 8
- NormalizeImage:
scale: 1./255.
mean: [ 0.485, 0.456, 0.406 ]
std: [ 0.229, 0.224, 0.225 ]
order: 'hwc'
- ToCHWImage:
- keepKeys:
keep_keys: ['image','threshold_map','threshold_mask','shrink_map','shrink_mask'] # dataloader will return list in this order
loader:
shuffle: True
drop_last: False
batch_size: 16
num_workers: 8
EVAL:
dataset:
name: SimpleDataSet
data_dir: ./detection/
file_list:
- ./detection/test_icdar2015_label.txt
transforms:
- DecodeImage: # load image
img_mode: BGR
channel_first: False
- DetLabelEncode: # Class handling label
- DetResizeForTest:
image_shape: [736,1280]
- NormalizeImage:
scale: 1./255.
mean: [ 0.485, 0.456, 0.406 ]
std: [ 0.229, 0.224, 0.225 ]
order: 'hwc'
- ToCHWImage:
- keepKeys:
keep_keys: ['image','shape','polys','ignore_tags']
loader:
shuffle: False
drop_last: False
batch_size: 1 # must be 1
num_workers: 8
\ No newline at end of file
...@@ -11,7 +11,7 @@ Global: ...@@ -11,7 +11,7 @@ Global:
load_static_weights: True load_static_weights: True
cal_metric_during_train: False cal_metric_during_train: False
pretrained_model: ./pretrain_models/MobileNetV3_large_x0_5_pretrained pretrained_model: ./pretrain_models/MobileNetV3_large_x0_5_pretrained
checkpoints: #./output/det_db_0.001_DiceLoss_256_pp_config_2.0b_4gpu/best_accuracy checkpoints:
save_inference_dir: save_inference_dir:
use_visualdl: False use_visualdl: False
infer_img: doc/imgs_en/img_10.jpg infer_img: doc/imgs_en/img_10.jpg
......
...@@ -11,7 +11,7 @@ Global: ...@@ -11,7 +11,7 @@ Global:
load_static_weights: True load_static_weights: True
cal_metric_during_train: False cal_metric_during_train: False
pretrained_model: ./pretrain_models/ResNet18_vd_pretrained pretrained_model: ./pretrain_models/ResNet18_vd_pretrained
checkpoints: #./output/det_db_0.001_DiceLoss_256_pp_config_2.0b_4gpu/best_accuracy checkpoints:
save_inference_dir: save_inference_dir:
use_visualdl: False use_visualdl: False
infer_img: doc/imgs_en/img_10.jpg infer_img: doc/imgs_en/img_10.jpg
......
...@@ -11,7 +11,7 @@ Global: ...@@ -11,7 +11,7 @@ Global:
load_static_weights: True load_static_weights: True
cal_metric_during_train: False cal_metric_during_train: False
pretrained_model: ./pretrain_models/MobileNetV3_large_x0_5_pretrained pretrained_model: ./pretrain_models/MobileNetV3_large_x0_5_pretrained
checkpoints: #./output/det_db_0.001_DiceLoss_256_pp_config_2.0b_4gpu/best_accuracy checkpoints:
save_inference_dir: save_inference_dir:
use_visualdl: False use_visualdl: False
infer_img: doc/imgs_en/img_10.jpg infer_img: doc/imgs_en/img_10.jpg
......
...@@ -3,7 +3,7 @@ Global: ...@@ -3,7 +3,7 @@ Global:
epoch_num: 1200 epoch_num: 1200
log_smooth_window: 20 log_smooth_window: 20
print_batch_step: 10 print_batch_step: 10
save_model_dir: ./output/det_rc/det_r50_vd/ save_model_dir: ./output/det_r50_vd/
save_epoch_step: 1200 save_epoch_step: 1200
# evaluation is run every 5000 iterations after the 4000th iteration # evaluation is run every 5000 iterations after the 4000th iteration
eval_batch_step: [5000,4000] eval_batch_step: [5000,4000]
......
Global:
use_gpu: false
epoch_num: 500
log_smooth_window: 20
print_batch_step: 10
save_model_dir: ./output/rec/mv3_none_bilstm_ctc/
save_epoch_step: 500
# evaluation is run every 5000 iterations after the 4000th iteration
eval_batch_step: 127
# if pretrained_model is saved in static mode, load_static_weights must set to True
load_static_weights: True
cal_metric_during_train: True
pretrained_model:
checkpoints:
save_inference_dir:
use_visualdl: False
infer_img: doc/imgs_words/ch/word_1.jpg
# for data or label process
max_text_length: 80
character_dict_path: ppocr/utils/ppocr_keys_v1.txt
character_type: 'ch'
use_space_char: False
infer_mode: False
use_tps: False
Optimizer:
name: Adam
beta1: 0.9
beta2: 0.999
learning_rate:
lr: 0.001
regularizer:
name: 'L2'
factor: 0.00001
Architecture:
type: rec
algorithm: CRNN
Transform:
Backbone:
name: MobileNetV3
scale: 0.5
model_name: small
small_stride: [ 1, 2, 2, 2 ]
Neck:
name: SequenceEncoder
encoder_type: fc
hidden_size: 96
Head:
name: CTC
fc_decay: 0.00001
Loss:
name: CTCLoss
PostProcess:
name: CTCLabelDecode
Metric:
name: RecMetric
main_indicator: acc
TRAIN:
dataset:
name: SimpleDataSet
data_dir: ./rec
file_list:
- ./rec/train.txt # dataset1
ratio_list: [ 0.4,0.6 ]
transforms:
- DecodeImage: # load image
img_mode: BGR
channel_first: False
- CTCLabelEncode: # Class handling label
- RecAug:
- RecResizeImg:
image_shape: [ 3,32,320 ]
- keepKeys:
keep_keys: [ 'image','label','length' ] # dataloader will return list in this order
loader:
batch_size: 256
shuffle: True
drop_last: True
num_workers: 8
EVAL:
dataset:
name: SimpleDataSet
data_dir: ./rec
file_list:
- ./rec/val.txt
transforms:
- DecodeImage: # load image
img_mode: BGR
channel_first: False
- CTCLabelEncode: # Class handling label
- RecResizeImg:
image_shape: [ 3,32,320 ]
- keepKeys:
keep_keys: [ 'image','label','length' ] # dataloader will return list in this order
loader:
shuffle: False
drop_last: False
batch_size: 256
num_workers: 8
Global:
use_gpu: false
epoch_num: 500
log_smooth_window: 20
print_batch_step: 10
save_model_dir: ./output/rec/res34_none_bilstm_ctc/
save_epoch_step: 500
# evaluation is run every 5000 iterations after the 4000th iteration
eval_batch_step: 127
# if pretrained_model is saved in static mode, load_static_weights must set to True
load_static_weights: True
cal_metric_during_train: True
pretrained_model:
checkpoints:
save_inference_dir:
use_visualdl: False
infer_img: doc/imgs_words/ch/word_1.jpg
# for data or label process
max_text_length: 80
character_dict_path: ppocr/utils/ppocr_keys_v1.txt
character_type: 'ch'
use_space_char: False
infer_mode: False
use_tps: False
Optimizer:
name: Adam
beta1: 0.9
beta2: 0.999
learning_rate:
lr: 0.001
regularizer:
name: 'L2'
factor: 0.00001
Architecture:
type: rec
algorithm: CRNN
Transform:
Backbone:
name: ResNet
layers: 34
Neck:
name: SequenceEncoder
encoder_type: fc
hidden_size: 96
Head:
name: CTC
fc_decay: 0.00001
Loss:
name: CTCLoss
PostProcess:
name: CTCLabelDecode
Metric:
name: RecMetric
main_indicator: acc
TRAIN:
dataset:
name: SimpleDataSet
data_dir: ./rec
file_list:
- ./rec/train.txt # dataset1
ratio_list: [ 0.4,0.6 ]
transforms:
- DecodeImage: # load image
img_mode: BGR
channel_first: False
- CTCLabelEncode: # Class handling label
- RecAug:
- RecResizeImg:
image_shape: [ 3,32,320 ]
- keepKeys:
keep_keys: [ 'image','label','length' ] # dataloader will return list in this order
loader:
batch_size: 256
shuffle: True
drop_last: True
num_workers: 8
EVAL:
dataset:
name: SimpleDataSet
data_dir: ./rec
file_list:
- ./rec/val.txt
transforms:
- DecodeImage: # load image
img_mode: BGR
channel_first: False
- CTCLabelEncode: # Class handling label
- RecResizeImg:
image_shape: [ 3,32,320 ]
- keepKeys:
keep_keys: [ 'image','label','length' ] # dataloader will return list in this order
loader:
shuffle: False
drop_last: False
batch_size: 256
num_workers: 8
...@@ -3,7 +3,7 @@ Global: ...@@ -3,7 +3,7 @@ Global:
epoch_num: 500 epoch_num: 500
log_smooth_window: 20 log_smooth_window: 20
print_batch_step: 10 print_batch_step: 10
save_model_dir: ./output/rec_chinese_common_v1.1 save_model_dir: ./output/rec_chinese_common_v2.0
save_epoch_step: 3 save_epoch_step: 3
# evaluation is run every 5000 iterations after the 4000th iteration # evaluation is run every 5000 iterations after the 4000th iteration
eval_batch_step: [0, 2000] eval_batch_step: [0, 2000]
......
...@@ -3,7 +3,7 @@ Global: ...@@ -3,7 +3,7 @@ Global:
epoch_num: 500 epoch_num: 500
log_smooth_window: 20 log_smooth_window: 20
print_batch_step: 10 print_batch_step: 10
save_model_dir: ./output/rec_chinese_lite_v1.1 save_model_dir: ./output/rec_chinese_lite_v2.0
save_epoch_step: 3 save_epoch_step: 3
# evaluation is run every 5000 iterations after the 4000th iteration # evaluation is run every 5000 iterations after the 4000th iteration
eval_batch_step: [0, 2000] eval_batch_step: [0, 2000]
...@@ -19,7 +19,7 @@ Global: ...@@ -19,7 +19,7 @@ Global:
character_type: ch character_type: ch
max_text_length: 25 max_text_length: 25
infer_mode: False infer_mode: False
use_space_char: False use_space_char: True
Optimizer: Optimizer:
......
Global: Global:
use_gpu: true use_gpu: True
epoch_num: 500 epoch_num: 500
log_smooth_window: 20 log_smooth_window: 20
print_batch_step: 10 print_batch_step: 10
...@@ -15,7 +15,7 @@ Global: ...@@ -15,7 +15,7 @@ Global:
use_visualdl: False use_visualdl: False
infer_img: infer_img:
# for data or label process # for data or label process
character_dict_path: ppocr/utils/dict/ic15_dict.txt character_dict_path: ppocr/utils/dict/en_dict.txt
character_type: ch character_type: ch
max_text_length: 25 max_text_length: 25
infer_mode: False infer_mode: False
......
Global: Global:
use_gpu: true use_gpu: True
epoch_num: 500 epoch_num: 500
log_smooth_window: 20 log_smooth_window: 20
print_batch_step: 10 print_batch_step: 10
...@@ -9,9 +9,9 @@ Global: ...@@ -9,9 +9,9 @@ Global:
eval_batch_step: [0, 2000] eval_batch_step: [0, 2000]
# if pretrained_model is saved in static mode, load_static_weights must set to True # if pretrained_model is saved in static mode, load_static_weights must set to True
cal_metric_during_train: True cal_metric_during_train: True
pretrained_model: pretrained_model:
checkpoints: checkpoints:
save_inference_dir: save_inference_dir:
use_visualdl: False use_visualdl: False
infer_img: infer_img:
# for data or label process # for data or label process
...@@ -19,7 +19,7 @@ Global: ...@@ -19,7 +19,7 @@ Global:
character_type: french character_type: french
max_text_length: 25 max_text_length: 25
infer_mode: False infer_mode: False
use_space_char: True use_space_char: False
Optimizer: Optimizer:
......
Global: Global:
use_gpu: true use_gpu: True
epoch_num: 500 epoch_num: 500
log_smooth_window: 20 log_smooth_window: 20
print_batch_step: 10 print_batch_step: 10
...@@ -19,7 +19,7 @@ Global: ...@@ -19,7 +19,7 @@ Global:
character_type: german character_type: german
max_text_length: 25 max_text_length: 25
infer_mode: False infer_mode: False
use_space_char: True use_space_char: False
Optimizer: Optimizer:
......
Global: Global:
use_gpu: true use_gpu: True
epoch_num: 500 epoch_num: 500
log_smooth_window: 20 log_smooth_window: 20
print_batch_step: 10 print_batch_step: 10
...@@ -19,7 +19,7 @@ Global: ...@@ -19,7 +19,7 @@ Global:
character_type: japan character_type: japan
max_text_length: 25 max_text_length: 25
infer_mode: False infer_mode: False
use_space_char: True use_space_char: False
Optimizer: Optimizer:
......
Global: Global:
use_gpu: true use_gpu: True
epoch_num: 500 epoch_num: 500
log_smooth_window: 20 log_smooth_window: 20
print_batch_step: 10 print_batch_step: 10
...@@ -19,7 +19,7 @@ Global: ...@@ -19,7 +19,7 @@ Global:
character_type: korean character_type: korean
max_text_length: 25 max_text_length: 25
infer_mode: False infer_mode: False
use_space_char: True use_space_char: False
Optimizer: Optimizer:
......
Global: Global:
use_gpu: false use_gpu: true
epoch_num: 500 epoch_num: 72
log_smooth_window: 20 log_smooth_window: 20
print_batch_step: 10 print_batch_step: 10
save_model_dir: ./output/rec/res34_none_none_ctc/ save_model_dir: ./output/rec/ic15/
save_epoch_step: 500 save_epoch_step: 3
# evaluation is run every 5000 iterations after the 4000th iteration # evaluation is run every 2000 iterations
eval_batch_step: 127 eval_batch_step: [0, 2000]
# if pretrained_model is saved in static mode, load_static_weights must set to True # if pretrained_model is saved in static mode, load_static_weights must set to True
load_static_weights: True
cal_metric_during_train: True cal_metric_during_train: True
pretrained_model: pretrained_model:
checkpoints: checkpoints:
save_inference_dir: save_inference_dir:
use_visualdl: False use_visualdl: False
infer_img: doc/imgs_words/ch/word_1.jpg infer_img: doc/imgs_words_en/word_10.png
# for data or label process # for data or label process
max_text_length: 80 character_dict_path: ppocr/utils/ic15_dict.txt
character_dict_path: ppocr/utils/ppocr_keys_v1.txt character_type: ch
character_type: 'ch' max_text_length: 25
use_space_char: False
infer_mode: False infer_mode: False
use_tps: False use_space_char: False
Optimizer: Optimizer:
name: Adam name: Adam
beta1: 0.9 beta1: 0.9
beta2: 0.999 beta2: 0.999
learning_rate: lr:
lr: 0.001 learning_rate: 0.0005
regularizer: regularizer:
name: 'L2' name: 'L2'
factor: 0.00001 factor: 0
Architecture: Architecture:
type: rec model_type: rec
algorithm: CRNN algorithm: CRNN
Transform: Transform:
Backbone: Backbone:
...@@ -43,10 +40,11 @@ Architecture: ...@@ -43,10 +40,11 @@ Architecture:
layers: 34 layers: 34
Neck: Neck:
name: SequenceEncoder name: SequenceEncoder
encoder_type: reshape encoder_type: rnn
hidden_size: 256
Head: Head:
name: CTC name: CTCHead
fc_decay: 0.00001 fc_decay: 0
Loss: Loss:
name: CTCLoss name: CTCLoss
...@@ -58,46 +56,42 @@ Metric: ...@@ -58,46 +56,42 @@ Metric:
name: RecMetric name: RecMetric
main_indicator: acc main_indicator: acc
TRAIN: Train:
dataset: dataset:
name: SimpleDataSet name: SimpleDataSet
data_dir: ./rec data_dir: ./train_data/
file_list: label_file_list: ["./train_data/train_list.txt"]
- ./rec/train.txt # dataset1
ratio_list: [ 0.4,0.6 ]
transforms: transforms:
- DecodeImage: # load image - DecodeImage: # load image
img_mode: BGR img_mode: BGR
channel_first: False channel_first: False
- CTCLabelEncode: # Class handling label - CTCLabelEncode: # Class handling label
- RecAug:
- RecResizeImg: - RecResizeImg:
image_shape: [ 3,32,320 ] image_shape: [3, 32, 100]
- keepKeys: - KeepKeys:
keep_keys: [ 'image','label','length' ] # dataloader will return list in this order keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
loader: loader:
batch_size: 256
shuffle: True shuffle: True
batch_size_per_card: 256
drop_last: True drop_last: True
num_workers: 8 num_workers: 8
EVAL: Eval:
dataset: dataset:
name: SimpleDataSet name: SimpleDataSet
data_dir: ./rec data_dir: ./train_data/
file_list: label_file_list: ["./train_data/train_list.txt"]
- ./rec/val.txt
transforms: transforms:
- DecodeImage: # load image - DecodeImage: # load image
img_mode: BGR img_mode: BGR
channel_first: False channel_first: False
- CTCLabelEncode: # Class handling label - CTCLabelEncode: # Class handling label
- RecResizeImg: - RecResizeImg:
image_shape: [ 3,32,320 ] image_shape: [3, 32, 100]
- keepKeys: - KeepKeys:
keep_keys: [ 'image','label','length' ] # dataloader will return list in this order keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
loader: loader:
shuffle: False shuffle: False
drop_last: False drop_last: False
batch_size: 256 batch_size_per_card: 256
num_workers: 8 num_workers: 4
...@@ -81,7 +81,8 @@ cv::Mat Classifier::Run(cv::Mat &img) { ...@@ -81,7 +81,8 @@ cv::Mat Classifier::Run(cv::Mat &img) {
void Classifier::LoadModel(const std::string &model_dir) { void Classifier::LoadModel(const std::string &model_dir) {
AnalysisConfig config; AnalysisConfig config;
config.SetModel(model_dir + "/model", model_dir + "/params"); config.SetModel(model_dir + "/inference.pdmodel",
model_dir + "/inference.pdiparams");
if (this->use_gpu_) { if (this->use_gpu_) {
config.EnableUseGpu(this->gpu_mem_, this->gpu_id_); config.EnableUseGpu(this->gpu_mem_, this->gpu_id_);
......
...@@ -18,7 +18,8 @@ namespace PaddleOCR { ...@@ -18,7 +18,8 @@ namespace PaddleOCR {
void DBDetector::LoadModel(const std::string &model_dir) { void DBDetector::LoadModel(const std::string &model_dir) {
AnalysisConfig config; AnalysisConfig config;
config.SetModel(model_dir + "/model", model_dir + "/params"); config.SetModel(model_dir + "/inference.pdmodel",
model_dir + "/inference.pdiparams");
if (this->use_gpu_) { if (this->use_gpu_) {
config.EnableUseGpu(this->gpu_mem_, this->gpu_id_); config.EnableUseGpu(this->gpu_mem_, this->gpu_id_);
......
...@@ -103,7 +103,8 @@ void CRNNRecognizer::Run(std::vector<std::vector<std::vector<int>>> boxes, ...@@ -103,7 +103,8 @@ void CRNNRecognizer::Run(std::vector<std::vector<std::vector<int>>> boxes,
void CRNNRecognizer::LoadModel(const std::string &model_dir) { void CRNNRecognizer::LoadModel(const std::string &model_dir) {
AnalysisConfig config; AnalysisConfig config;
config.SetModel(model_dir + "/model", model_dir + "/params"); config.SetModel(model_dir + "/inference.pdmodel",
model_dir + "/inference.pdiparams");
if (this->use_gpu_) { if (this->use_gpu_) {
config.EnableUseGpu(this->gpu_mem_, this->gpu_id_); config.EnableUseGpu(this->gpu_mem_, this->gpu_id_);
......
English | [简体中文](README_cn.md)
## Introduction
Many users hope package the PaddleOCR service into a docker image, so that it can be quickly released and used in the docker or k8s environment.
This page provides some standardized code to achieve this goal. You can quickly publish the PaddleOCR project into a callable Restful API service through the following steps. (At present, the deployment based on the HubServing mode is implemented first, and author plans to increase the deployment of the PaddleServing mode in the futrue)
## 1. Prerequisites
You need to install the following basic components first:
a. Docker
b. Graphics driver and CUDA 10.0+(GPU)
c. NVIDIA Container Toolkit(GPU,Docker 19.03+ can skip this)
d. cuDNN 7.6+(GPU)
## 2. Build Image
a. Goto Dockerfile directory(ps:Need to distinguish between cpu and gpu version, the following takes cpu as an example, gpu version needs to replace the keyword)
```
cd deploy/docker/hubserving/cpu
```
c. Build image
```
docker build -t paddleocr:cpu .
```
## 3. Start container
a. CPU version
```
sudo docker run -dp 8868:8868 --name paddle_ocr paddleocr:cpu
```
b. GPU version (base on NVIDIA Container Toolkit)
```
sudo nvidia-docker run -dp 8868:8868 --name paddle_ocr paddleocr:gpu
```
c. GPU version (Docker 19.03++)
```
sudo docker run -dp 8868:8868 --gpus all --name paddle_ocr paddleocr:gpu
```
d. Check service status(If you can see the following statement then it means completed:Successfully installed ocr_system && Running on http://0.0.0.0:8868/)
```
docker logs -f paddle_ocr
```
## 4. Test
a. Calculate the Base64 encoding of the picture to be recognized (if you just test, you can use a free online tool, like:https://freeonlinetools24.com/base64-image/)
b. Post a service request(sample request in sample_request.txt)
```
curl -H "Content-Type:application/json" -X POST --data "{\"images\": [\"Input image Base64 encode(need to delete the code 'data:image/jpg;base64,')\"]}" http://localhost:8868/predict/ocr_system
```
c. Get resposne(If the call is successful, the following result will be returned)
```
{"msg":"","results":[[{"confidence":0.8403433561325073,"text":"约定","text_region":[[345,377],[641,390],[634,540],[339,528]]},{"confidence":0.8131805658340454,"text":"最终相遇","text_region":[[356,532],[624,530],[624,596],[356,598]]}]],"status":"0"}
```
[English](README.md) | 简体中文
## Docker化部署服务
在日常项目应用中,相信大家一般都会希望能通过Docker技术,把PaddleOCR服务打包成一个镜像,以便在Docker或k8s环境里,快速发布上线使用。
本文将提供一些标准化的代码来实现这样的目标。大家通过如下步骤可以把PaddleOCR项目快速发布成可调用的Restful API服务。(目前暂时先实现了基于HubServing模式的部署,后续作者计划增加PaddleServing模式的部署)
## 1.实施前提准备
需要先完成如下基本组件的安装:
a. Docker环境
b. 显卡驱动和CUDA 10.0+(GPU)
c. NVIDIA Container Toolkit(GPU,Docker 19.03以上版本可以跳过此步)
d. cuDNN 7.6+(GPU)
## 2.制作镜像
a.切换至Dockerfile目录(注:需要区分cpu或gpu版本,下文以cpu为例,gpu版本需要替换一下关键字即可)
```
cd deploy/docker/hubserving/cpu
```
c.生成镜像
```
docker build -t paddleocr:cpu .
```
## 3.启动Docker容器
a. CPU 版本
```
sudo docker run -dp 8868:8868 --name paddle_ocr paddleocr:cpu
```
b. GPU 版本 (通过NVIDIA Container Toolkit)
```
sudo nvidia-docker run -dp 8868:8868 --name paddle_ocr paddleocr:gpu
```
c. GPU 版本 (Docker 19.03以上版本,可以直接用如下命令)
```
sudo docker run -dp 8868:8869 --gpus all --name paddle_ocr paddleocr:gpu
```
d. 检查服务运行情况(出现:Successfully installed ocr_system和Running on http://0.0.0.0:8868 等信息,表示运行成功)
```
docker logs -f paddle_ocr
```
## 4.测试服务
a. 计算待识别图片的Base64编码(如果只是测试一下效果,可以通过免费的在线工具实现,如:http://tool.chinaz.com/tools/imgtobase/)
b. 发送服务请求(可参见sample_request.txt中的值)
```
curl -H "Content-Type:application/json" -X POST --data "{\"images\": [\"填入图片Base64编码(需要删除'data:image/jpg;base64,')\"]}" http://localhost:8868/predict/ocr_system
```
c. 返回结果(如果调用成功,会返回如下结果)
```
{"msg":"","results":[[{"confidence":0.8403433561325073,"text":"约定","text_region":[[345,377],[641,390],[634,540],[339,528]]},{"confidence":0.8131805658340454,"text":"最终相遇","text_region":[[356,532],[624,530],[624,596],[356,598]]}]],"status":"0"}
```
# Version: 1.0.0
FROM hub.baidubce.com/paddlepaddle/paddle:latest-gpu-cuda10.0-cudnn7-dev
# PaddleOCR base on Python3.7
RUN pip3.7 install --upgrade pip -i https://mirror.baidu.com/pypi/simple
RUN python3.7 -m pip install paddlepaddle==2.0.0rc0 -i https://mirror.baidu.com/pypi/simple
RUN pip3.7 install paddlehub --upgrade -i https://mirror.baidu.com/pypi/simple
RUN git clone https://github.com/PaddlePaddle/PaddleOCR.git /PaddleOCR
WORKDIR /PaddleOCR
RUN pip3.7 install -r requirements.txt -i https://mirror.baidu.com/pypi/simple
RUN mkdir -p /PaddleOCR/inference/
# Download orc detect model(light version). if you want to change normal version, you can change ch_ppocr_mobile_v1.1_det_infer to ch_ppocr_server_v1.1_det_infer, also remember change det_model_dir in deploy/hubserving/ocr_system/params.py)
ADD {link} /PaddleOCR/inference/
RUN tar xf /PaddleOCR/inference/{file} -C /PaddleOCR/inference/
# Download direction classifier(light version). If you want to change normal version, you can change ch_ppocr_mobile_v1.1_cls_infer to ch_ppocr_mobile_v1.1_cls_infer, also remember change cls_model_dir in deploy/hubserving/ocr_system/params.py)
ADD {link} /PaddleOCR/inference/
RUN tar xf /PaddleOCR/inference/{file}.tar -C /PaddleOCR/inference/
# Download orc recognition model(light version). If you want to change normal version, you can change ch_ppocr_mobile_v1.1_rec_infer to ch_ppocr_server_v1.1_rec_infer, also remember change rec_model_dir in deploy/hubserving/ocr_system/params.py)
ADD {link} /PaddleOCR/inference/
RUN tar xf /PaddleOCR/inference/{file}.tar -C /PaddleOCR/inference/
EXPOSE 8868
CMD ["/bin/bash","-c","hub install deploy/hubserving/ocr_system/ && hub serving start -m ocr_system"]
\ No newline at end of file
# Version: 1.0.0
FROM hub.baidubce.com/paddlepaddle/paddle:latest-gpu-cuda10.0-cudnn7-dev
# PaddleOCR base on Python3.7
RUN pip3.7 install --upgrade pip -i https://mirror.baidu.com/pypi/simple
RUN python3.7 -m pip install paddlepaddle-gpu==2.0.0rc0 -i https://mirror.baidu.com/pypi/simple
RUN pip3.7 install paddlehub --upgrade -i https://mirror.baidu.com/pypi/simple
RUN git clone https://github.com/PaddlePaddle/PaddleOCR.git /PaddleOCR
WORKDIR /PaddleOCR
RUN pip3.7 install -r requirements.txt -i https://mirror.baidu.com/pypi/simple
RUN mkdir -p /PaddleOCR/inference/
# Download orc detect model(light version). if you want to change normal version, you can change ch_ppocr_mobile_v1.1_det_infer to ch_ppocr_server_v1.1_det_infer, also remember change det_model_dir in deploy/hubserving/ocr_system/params.py)
ADD {link} /PaddleOCR/inference/
RUN tar xf /PaddleOCR/inference/{file}.tar -C /PaddleOCR/inference/
# Download direction classifier(light version). If you want to change normal version, you can change ch_ppocr_mobile_v1.1_cls_infer to ch_ppocr_mobile_v1.1_cls_infer, also remember change cls_model_dir in deploy/hubserving/ocr_system/params.py)
ADD {link} /PaddleOCR/inference/
RUN tar xf /PaddleOCR/inference/{file} -C /PaddleOCR/inference/
# Download orc recognition model(light version). If you want to change normal version, you can change ch_ppocr_mobile_v1.1_rec_infer to ch_ppocr_server_v1.1_rec_infer, also remember change rec_model_dir in deploy/hubserving/ocr_system/params.py)
ADD {link} /PaddleOCR/inference/
RUN tar xf /PaddleOCR/inference/{file}.tar -C /PaddleOCR/inference/
EXPOSE 8868
CMD ["/bin/bash","-c","hub install deploy/hubserving/ocr_system/ && hub serving start -m ocr_system"]
\ No newline at end of file
此差异已折叠。
...@@ -17,17 +17,17 @@ PaddleOCR开源的文本检测算法列表: ...@@ -17,17 +17,17 @@ PaddleOCR开源的文本检测算法列表:
|模型|骨干网络|precision|recall|Hmean|下载链接| |模型|骨干网络|precision|recall|Hmean|下载链接|
|-|-|-|-|-|-| |-|-|-|-|-|-|
|EAST|ResNet50_vd|88.18%|85.51%|86.82%|[下载链接](https://paddleocr.bj.bcebos.com/det_r50_vd_east.tar)| |EAST|ResNet50_vd|88.18%|85.51%|86.82%|[下载链接](link)|
|EAST|MobileNetV3|81.67%|79.83%|80.74%|[下载链接](https://paddleocr.bj.bcebos.com/det_mv3_east.tar)| |EAST|MobileNetV3|81.67%|79.83%|80.74%|[下载链接](link)|
|DB|ResNet50_vd|83.79%|80.65%|82.19%|[下载链接](https://paddleocr.bj.bcebos.com/det_r50_vd_db.tar)| |DB|ResNet50_vd|83.79%|80.65%|82.19%|[下载链接](link)|
|DB|MobileNetV3|75.92%|73.18%|74.53%|[下载链接](https://paddleocr.bj.bcebos.com/det_mv3_db.tar)| |DB|MobileNetV3|75.92%|73.18%|74.53%|[下载链接](link)|
|SAST|ResNet50_vd|92.18%|82.96%|87.33%|[下载链接](https://paddleocr.bj.bcebos.com/SAST/sast_r50_vd_icdar2015.tar)| |SAST|ResNet50_vd|92.18%|82.96%|87.33%|[下载链接](link))|
在Total-text文本检测公开数据集上,算法效果如下: 在Total-text文本检测公开数据集上,算法效果如下:
|模型|骨干网络|precision|recall|Hmean|下载链接| |模型|骨干网络|precision|recall|Hmean|下载链接|
|-|-|-|-|-|-| |-|-|-|-|-|-|
|SAST|ResNet50_vd|88.74%|79.80%|84.03%|[下载链接](https://paddleocr.bj.bcebos.com/SAST/sast_r50_vd_total_text.tar)| |SAST|ResNet50_vd|88.74%|79.80%|84.03%|[下载链接](link)|
**说明:** SAST模型训练额外加入了icdar2013、icdar2017、COCO-Text、ArT等公开数据集进行调优。PaddleOCR用到的经过整理格式的英文公开数据集下载:[百度云地址](https://pan.baidu.com/s/12cPnZcVuV1zn5DOd4mqjVw) (提取码: 2bpi) **说明:** SAST模型训练额外加入了icdar2013、icdar2017、COCO-Text、ArT等公开数据集进行调优。PaddleOCR用到的经过整理格式的英文公开数据集下载:[百度云地址](https://pan.baidu.com/s/12cPnZcVuV1zn5DOd4mqjVw) (提取码: 2bpi)
...@@ -37,28 +37,23 @@ PaddleOCR文本检测算法的训练和使用请参考文档教程中[模型训 ...@@ -37,28 +37,23 @@ PaddleOCR文本检测算法的训练和使用请参考文档教程中[模型训
<a name="文本识别算法"></a> <a name="文本识别算法"></a>
### 2.文本识别算法 ### 2.文本识别算法
PaddleOCR开源的文本识别算法列表: PaddleOCR基于动态图开源的文本识别算法列表:
- [x] CRNN([paper](https://arxiv.org/abs/1507.05717))(ppocr推荐) - [x] CRNN([paper](https://arxiv.org/abs/1507.05717))(ppocr推荐)
- [x] Rosetta([paper](https://arxiv.org/abs/1910.05085)) - [x] Rosetta([paper](https://arxiv.org/abs/1910.05085))
- [x] STAR-Net([paper](http://www.bmva.org/bmvc/2016/papers/paper043/index.html)) - [x] STAR-Net([paper](http://www.bmva.org/bmvc/2016/papers/paper043/index.html))
- [x] RARE([paper](https://arxiv.org/abs/1603.03915v1)) - [ ] RARE([paper](https://arxiv.org/abs/1603.03915v1)) coming soon
- [x] SRN([paper](https://arxiv.org/abs/2003.12294)) - [ ] SRN([paper](https://arxiv.org/abs/2003.12294)) coming soon
参考[DTRB](https://arxiv.org/abs/1904.01906)文字识别训练和评估流程,使用MJSynth和SynthText两个文字识别数据集训练,在IIIT, SVT, IC03, IC13, IC15, SVTP, CUTE数据集上进行评估,算法效果如下: 参考[DTRB](https://arxiv.org/abs/1904.01906)文字识别训练和评估流程,使用MJSynth和SynthText两个文字识别数据集训练,在IIIT, SVT, IC03, IC13, IC15, SVTP, CUTE数据集上进行评估,算法效果如下:
|模型|骨干网络|Avg Accuracy|模型存储命名|下载链接| |模型|骨干网络|Avg Accuracy|模型存储命名|下载链接|
|-|-|-|-|-| |-|-|-|-|-|
|Rosetta|Resnet34_vd|80.24%|rec_r34_vd_none_none_ctc|[下载链接](https://paddleocr.bj.bcebos.com/rec_r34_vd_none_none_ctc.tar)| |Rosetta|Resnet34_vd|80.24%|rec_r34_vd_none_none_ctc|[下载链接](link)|
|Rosetta|MobileNetV3|78.16%|rec_mv3_none_none_ctc|[下载链接](https://paddleocr.bj.bcebos.com/rec_mv3_none_none_ctc.tar)| |Rosetta|MobileNetV3|78.16%|rec_mv3_none_none_ctc|[下载链接](link)|
|CRNN|Resnet34_vd|82.20%|rec_r34_vd_none_bilstm_ctc|[下载链接](https://paddleocr.bj.bcebos.com/rec_r34_vd_none_bilstm_ctc.tar)| |CRNN|Resnet34_vd|82.20%|rec_r34_vd_none_bilstm_ctc|[下载链接](link)|
|CRNN|MobileNetV3|79.37%|rec_mv3_none_bilstm_ctc|[下载链接](https://paddleocr.bj.bcebos.com/rec_mv3_none_bilstm_ctc.tar)| |CRNN|MobileNetV3|79.37%|rec_mv3_none_bilstm_ctc|[下载链接](link)|
|STAR-Net|Resnet34_vd|83.93%|rec_r34_vd_tps_bilstm_ctc|[下载链接](https://paddleocr.bj.bcebos.com/rec_r34_vd_tps_bilstm_ctc.tar)| |STAR-Net|Resnet34_vd|83.93%|rec_r34_vd_tps_bilstm_ctc|[下载链接](link)|
|STAR-Net|MobileNetV3|81.56%|rec_mv3_tps_bilstm_ctc|[下载链接](https://paddleocr.bj.bcebos.com/rec_mv3_tps_bilstm_ctc.tar)| |STAR-Net|MobileNetV3|81.56%|rec_mv3_tps_bilstm_ctc|[下载链接](link)|
|RARE|Resnet34_vd|84.90%|rec_r34_vd_tps_bilstm_attn|[下载链接](https://paddleocr.bj.bcebos.com/rec_r34_vd_tps_bilstm_attn.tar)|
|RARE|MobileNetV3|83.32%|rec_mv3_tps_bilstm_attn|[下载链接](https://paddleocr.bj.bcebos.com/rec_mv3_tps_bilstm_attn.tar)|
|SRN|Resnet50_vd_fpn|88.33%|rec_r50fpn_vd_none_srn|[下载链接](https://paddleocr.bj.bcebos.com/SRN/rec_r50fpn_vd_none_srn.tar)|
**说明:** SRN模型使用了数据扰动方法对上述提到对两个训练集进行增广,增广后的数据可以在[百度网盘](https://pan.baidu.com/s/1-HSZ-ZVdqBF2HaBZ5pRAKA)上下载,提取码: y3ry。
原始论文使用两阶段训练平均精度为89.74%,PaddleOCR中使用one-stage训练,平均精度为88.33%。两种预训练权重均在[下载链接](https://paddleocr.bj.bcebos.com/SRN/rec_r50fpn_vd_none_srn.tar)中。
PaddleOCR文本识别算法的训练和使用请参考文档教程中[模型训练/评估中的文本识别部分](./recognition.md) PaddleOCR文本识别算法的训练和使用请参考文档教程中[模型训练/评估中的文本识别部分](./recognition.md)
...@@ -10,14 +10,14 @@ ...@@ -10,14 +10,14 @@
## 配置文件参数介绍 ## 配置文件参数介绍
`rec_chinese_lite_train_v1.1.yml ` 为例 `rec_chinese_lite_train_v2.0.yml ` 为例
### Global ### Global
| 字段 | 用途 | 默认值 | 备注 | | 字段 | 用途 | 默认值 | 备注 |
| :----------------------: | :---------------------: | :--------------: | :--------------------: | | :----------------------: | :---------------------: | :--------------: | :--------------------: |
| use_gpu | 设置代码是否在gpu运行 | true | \ | | use_gpu | 设置代码是否在gpu运行 | true | \ |
| epoch_num | 最大训练epoch数 | 500 | \ | | epoch_num | 最大训练epoch数 | 500 | \ |
| log_smooth_window | 滑动窗口大小 | 20 | \ | | log_smooth_window | log队列长度,每次打印输出队列里的中间值 | 20 | \ |
| print_batch_step | 设置打印log间隔 | 10 | \ | | print_batch_step | 设置打印log间隔 | 10 | \ |
| save_model_dir | 设置模型保存路径 | output/{算法名称} | \ | | save_model_dir | 设置模型保存路径 | output/{算法名称} | \ |
| save_epoch_step | 设置模型保存间隔 | 3 | \ | | save_epoch_step | 设置模型保存间隔 | 3 | \ |
...@@ -119,4 +119,4 @@ ...@@ -119,4 +119,4 @@
| shuffle | 每个epoch是否将数据集顺序打乱 | True | \ | | shuffle | 每个epoch是否将数据集顺序打乱 | True | \ |
| batch_size_per_card | 训练时单卡batch size | 256 | \ | | batch_size_per_card | 训练时单卡batch size | 256 | \ |
| drop_last | 是否丢弃因数据集样本数不能被 batch_size 整除而产生的最后一个不完整的mini-batch | True | \ | | drop_last | 是否丢弃因数据集样本数不能被 batch_size 整除而产生的最后一个不完整的mini-batch | True | \ |
| num_workers | 用于加载数据的子进程个数,若为0即为不开启子进程,在主进程中进行数据加载 | 8 | \ | | num_workers | 用于加载数据的子进程个数,若为0即为不开启子进程,在主进程中进行数据加载 | 8 | \ |
\ No newline at end of file
此差异已折叠。
## OCR模型列表(V1.1,9月22日更新) ## OCR模型列表(V2.0,2020年12月12日更新)
- [一、文本检测模型](#文本检测模型) - [一、文本检测模型](#文本检测模型)
- [二、文本识别模型](#文本识别模型) - [二、文本识别模型](#文本识别模型)
...@@ -10,19 +10,20 @@ ...@@ -10,19 +10,20 @@
PaddleOCR提供的可下载模型包括`推理模型``训练模型``预训练模型``slim模型`,模型区别说明如下: PaddleOCR提供的可下载模型包括`推理模型``训练模型``预训练模型``slim模型`,模型区别说明如下:
|模型类型|模型格式|简介| |模型类型|模型格式|简介|
|-|-|-| |--- | --- | --- |
|推理模型|model、params|用于python预测引擎推理,[详情](./inference.md)| |推理模型|inference.pdmodel、inference.pdiparams|用于python预测引擎推理,[详情](./inference.md)|
|训练模型、预训练模型|\*.pdmodel、\*.pdopt、\*.pdparams|训练过程中保存的checkpoints模型,保存的是模型的参数,多用于模型指标评估和恢复训练| |训练模型、预训练模型|\*.pdparams、\*.pdopt、\*.states |训练过程中保存的模型的参数、优化器状态和训练中间信息,多用于模型指标评估和恢复训练|
|slim模型|\*.nb|用于lite部署| |slim模型|\*.nb|用于lite部署|
<a name="文本检测模型"></a> <a name="文本检测模型"></a>
### 一、文本检测模型 ### 一、文本检测模型
|模型名称|模型简介|配置文件|推理模型大小|下载地址| |模型名称|模型简介|配置文件|推理模型大小|下载地址|
|-|-|-|-|-| | --- | --- | --- | --- | --- |
|ch_ppocr_mobile_slim_v1.1_det|slim裁剪版超轻量模型,支持中英文、多语种文本检测|[det_mv3_db_v1.1.yml](../../configs/det/det_mv3_db_v1.1.yml)|1.4M|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/det/ch_ppocr_mobile_v1.1_det_prune_infer.tar) / [slim模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/lite/ch_ppocr_mobile_v1.1_det_prune_opt.nb)| |ch_ppocr_mobile_slim_v2.0_det|slim裁剪版超轻量模型,支持中英文、多语种文本检测|[ch_det_mv3_db_v2.0.yml](../../configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml)| |[推理模型 (coming soon)](link) / [slim模型 (coming soon)](link)|
|ch_ppocr_mobile_v1.1_det|原始超轻量模型,支持中英文、多语种文本检测|[det_mv3_db_v1.1.yml](../../configs/det/det_mv3_db_v1.1.yml)|2.6M|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_train.tar)| |ch_ppocr_mobile_v2.0_det|原始超轻量模型,支持中英文、多语种文本检测|[ch_det_mv3_db_v2.0.yml](../../configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml)|3M|[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_train.tar)|
|ch_ppocr_server_v1.1_det|通用模型,支持中英文、多语种文本检测,比超轻量模型更大,但效果更好|[det_r18_vd_db_v1.1.yml](../../configs/det/det_r18_vd_db_v1.1.yml)|47.2M|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/server/det/ch_ppocr_server_v1.1_det_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/20-09-22/server/det/ch_ppocr_server_v1.1_det_train.tar)| |ch_ppocr_server_v2.0_det|通用模型,支持中英文、多语种文本检测,比超轻量模型更大,但效果更好|[ch_det_res18_db_v2.0.yml](../../configs/det/ch_ppocr_v2.0/ch_det_res18_db_v2.0.yml)|47M|[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_train.tar)|
<a name="文本识别模型"></a> <a name="文本识别模型"></a>
...@@ -30,42 +31,44 @@ PaddleOCR提供的可下载模型包括`推理模型`、`训练模型`、`预训 ...@@ -30,42 +31,44 @@ PaddleOCR提供的可下载模型包括`推理模型`、`训练模型`、`预训
<a name="中文识别模型"></a> <a name="中文识别模型"></a>
#### 1. 中文识别模型 #### 1. 中文识别模型
|模型名称|模型简介|配置文件|推理模型大小|下载地址| |模型名称|模型简介|配置文件|推理模型大小|下载地址|
|-|-|-|-|-| | --- | --- | --- | --- | --- |
|ch_ppocr_mobile_slim_v1.1_rec|slim裁剪量化版超轻量模型,支持中英文、数字识别|[rec_chinese_lite_train_v1.1.yml](../../configs/rec/ch_ppocr_v1.1/rec_chinese_lite_train_v1.1.yml)|1.6M|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/rec/ch_ppocr_mobile_v1.1_rec_quant_infer.tar) / [slim模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/lite/ch_ppocr_mobile_v1.1_rec_quant_opt.nb) | |ch_ppocr_mobile_slim_v2.0_rec|slim裁剪量化版超轻量模型,支持中英文、数字识别|[rec_chinese_lite_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml)| |[推理模型 (coming soon)](link) / [slim模型 (coming soon)](link) |
|ch_ppocr_mobile_v1.1_rec|原始超轻量模型,支持中英文、数字识别|[rec_chinese_lite_train_v1.1.yml](../../configs/rec/ch_ppocr_v1.1/rec_chinese_lite_train_v1.1.yml)|4.6M|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_train.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_pre.tar) | |ch_ppocr_mobile_v2.0_rec|原始超轻量模型,支持中英文、数字识别|[rec_chinese_lite_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml)|3.71M|[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_train.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_pre.tar) |
|ch_ppocr_server_v1.1_rec|通用模型,支持中英文、数字识别|[rec_chinese_common_train_v1.1.yml](../../configs/rec/ch_ppocr_v1.1/rec_chinese_common_train_v1.1.yml)|105M|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_train.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_pre.tar) | |ch_ppocr_server_v2.0_rec|通用模型,支持中英文、数字识别|[rec_chinese_common_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0.yml)|94.8M|[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_train.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_pre.tar) |
**说明:** `训练模型`是基于预训练模型在真实数据与竖排合成文本数据上finetune得到的模型,在真实应用场景中有着更好的表现,`预训练模型`则是直接基于全量真实数据与合成数据训练得到,更适合用于在自己的数据集上finetune。 **说明:** `训练模型`是基于预训练模型在真实数据与竖排合成文本数据上finetune得到的模型,在真实应用场景中有着更好的表现,`预训练模型`则是直接基于全量真实数据与合成数据训练得到,更适合用于在自己的数据集上finetune。
<a name="英文识别模型"></a> <a name="英文识别模型"></a>
#### 2. 英文识别模型 #### 2. 英文识别模型
|模型名称|模型简介|配置文件|推理模型大小|下载地址| |模型名称|模型简介|配置文件|推理模型大小|下载地址|
|-|-|-|-|-| | --- | --- | --- | --- | --- |
|en_ppocr_mobile_slim_v1.1_rec|slim裁剪量化版超轻量模型,支持英文、数字识别|[rec_en_lite_train.yml](../../configs/rec/multi_languages/rec_en_lite_train.yml)|0.9M|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/en/en_ppocr_mobile_v1.1_rec_quant_infer.tar) / [slim模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/en/en_ppocr_mobile_v1.1_rec_quant_opt.nb) | |en_number_mobile_slim_v2.0_rec|slim裁剪量化版超轻量模型,支持英文、数字识别|[rec_en_number_lite_train.yml](../../configs/rec/multi_language/rec_en_number_lite_train.yml)| |[推理模型 (coming soon )](link) / [slim模型 (coming soon)](link) |
|en_ppocr_mobile_v1.1_rec|原始超轻量模型,支持英文、数字识别|[rec_en_lite_train.yml](../../configs/rec/multi_languages/rec_en_lite_train.yml)|2.0M|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/en/en_ppocr_mobile_v1.1_rec_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/en/en_ppocr_mobile_v1.1_rec_train.tar) | |en_number_mobile_v2.0_rec|原始超轻量模型,支持英文、数字识别|[rec_en_number_lite_train.yml](../../configs/rec/multi_language/rec_en_number_lite_train.yml)|2.56M|[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/en_number_mobile_v2.0_rec_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/en_number_mobile_v2.0_rec_train.tar) |
<a name="多语言识别模型"></a> <a name="多语言识别模型"></a>
#### 3. 多语言识别模型(更多语言持续更新中...) #### 3. 多语言识别模型(更多语言持续更新中...)
|模型名称|模型简介|配置文件|推理模型大小|下载地址| |模型名称|模型简介|配置文件|推理模型大小|下载地址|
|-|-|-|-|-| | --- | --- | --- | --- | --- |
| french_ppocr_mobile_v1.1_rec |法文识别|[rec_french_lite_train.yml](../../configs/rec/multi_languages/rec_french_lite_train.yml)|2.1M|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/fr/french_ppocr_mobile_v1.1_rec_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/fr/french_ppocr_mobile_v1.1_rec_train.tar) | | french_mobile_v2.0_rec |法文识别|[rec_french_lite_train.yml](../../configs/rec/multi_language/rec_french_lite_train.yml)|2.65M|[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/french_mobile_v2.0_rec_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/french_mobile_v2.0_rec_train.tar) |
| german_ppocr_mobile_v1.1_rec |德文识别|[rec_ger_lite_train.yml](../../configs/rec/multi_languages/rec_ger_lite_train.yml)|2.1M|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/ge/german_ppocr_mobile_v1.1_rec_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/ge/german_ppocr_mobile_v1.1_rec_train.tar) | | german_mobile_v2.0_rec |德文识别|[rec_german_lite_train.yml](../../configs/rec/multi_language/rec_german_lite_train.yml)|2.65M|[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/german_mobile_v2.0_rec_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/german_mobile_v2.0_rec_train.tar) |
| korean_ppocr_mobile_v1.1_rec |韩文识别|[rec_korean_lite_train.yml](../../configs/rec/multi_languages/rec_korean_lite_train.yml)|3.4M|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/kr/korean_ppocr_mobile_v1.1_rec_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/kr/korean_ppocr_mobile_v1.1_rec_train.tar) | | korean_mobile_v2.0_rec |韩文识别|[rec_korean_lite_train.yml](../../configs/rec/multi_language/rec_korean_lite_train.yml)|3.9M|[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/korean_mobile_v2.0_rec_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/korean_mobile_v2.0_rec_train.tar) |
| japan_ppocr_mobile_v1.1_rec |日文识别|[rec_japan_lite_train.yml](../../configs/rec/multi_languages/rec_japan_lite_train.yml)|3.7M|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/jp/japan_ppocr_mobile_v1.1_rec_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/jp/japan_ppocr_mobile_v1.1_rec_train.tar) | | japan_mobile_v2.0_rec |日文识别|[rec_japan_lite_train.yml](../../configs/rec/multi_language/rec_japan_lite_train.yml)|4.23M|[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/japan_mobile_v2.0_rec_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/japan_mobile_v2.0_rec_train.tar) |
<a name="文本方向分类模型"></a> <a name="文本方向分类模型"></a>
### 三、文本方向分类模型 ### 三、文本方向分类模型
|模型名称|模型简介|配置文件|推理模型大小|下载地址| |模型名称|模型简介|配置文件|推理模型大小|下载地址|
|-|-|-|-|-| | --- | --- | --- | --- | --- |
|ch_ppocr_mobile_v1.1_cls_quant|slim量化版模型|[cls_mv3.yml](../../configs/cls/cls_mv3.yml)|0.5M|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_quant_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_quant_train.tar) / [slim模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/lite/ch_ppocr_mobile_v1.1_cls_quant_opt.nb) | |ch_ppocr_mobile_slim_v2.0_cls|slim量化版模型|[cls_mv3.yml](../../configs/cls/cls_mv3.yml)| |[推理模型 (coming soon)](link) / [训练模型](link) / [slim模型](link) |
|ch_ppocr_mobile_v1.1_cls|原始模型|[cls_mv3.yml](../../configs/cls/cls_mv3.yml)|850kb|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_train.tar) | |ch_ppocr_mobile_v2.0_cls|原始模型|[cls_mv3.yml](../../configs/cls/cls_mv3.yml)|1.38M|[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) |
## OCR模型列表(V1.1,2020年9月22日更新)
## OCR模型列表(V1.0,7月16日更新) [1.1系列模型地址](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/doc/doc_ch/models_list.md)
|模型名称|模型简介|检测模型地址|识别模型地址|支持空格的识别模型地址|
|-|-|-|-|-|
|chinese_db_crnn_mobile|8.6M超轻量级中文OCR模型|[推理模型](https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db.tar) |[推理模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn.tar) |[推理模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_enhance_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_enhance.tar)
|chinese_db_crnn_server|通用中文OCR模型|[推理模型](https://paddleocr.bj.bcebos.com/ch_models/ch_det_r50_vd_db_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/ch_models/ch_det_r50_vd_db.tar) |[推理模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn.tar) |[推理模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_enhance_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_enhance.tar)
...@@ -5,16 +5,16 @@ ...@@ -5,16 +5,16 @@
请先参考[快速安装](./installation.md)配置PaddleOCR运行环境。 请先参考[快速安装](./installation.md)配置PaddleOCR运行环境。
*注意:也可以通过 whl 包安装使用PaddleOCR,具体参考[Paddleocr Package使用说明](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_ch/whl.md)。* *注意:也可以通过 whl 包安装使用PaddleOCR,具体参考[Paddleocr Package使用说明](./whl.md)。*
## 2.inference模型下载 ## 2.inference模型下载
* 移动端和服务器端的检测与识别模型如下,更多模型下载(包括多语言),可以参考[PP-OCR v1.1 系列模型下载](../doc_ch/models_list.md) * 移动端和服务器端的检测与识别模型如下,更多模型下载(包括多语言),可以参考[PP-OCR v2.0 系列模型下载](../doc_ch/models_list.md)
| 模型简介 | 模型名称 |推荐场景 | 检测模型 | 方向分类器 | 识别模型 | | 模型简介 | 模型名称 |推荐场景 | 检测模型 | 方向分类器 | 识别模型 |
| ------------ | --------------- | ----------------|---- | ---------- | -------- | | ------------ | --------------- | ----------------|---- | ---------- | -------- |
| 中英文超轻量OCR模型(8.1M) | ch_ppocr_mobile_v1.1_xx |移动端&服务器端|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_train.tar)|[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_train.tar) |[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_pre.tar) | | 中英文超轻量OCR模型(8.1M) | ch_ppocr_mobile_v2.0_xx |移动端&服务器端|[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_train.tar)|[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) |[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_pre.tar) |
| 中英文通用OCR模型(155.1M) |ch_ppocr_server_v1.1_xx|服务器端 |[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/server/det/ch_ppocr_server_v1.1_det_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/server/det/ch_ppocr_server_v1.1_det_train.tar) |[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_train.tar) |[推理模型](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_pre.tar) | | 中英文通用OCR模型(143M) | ch_ppocr_server_v2.0_xx |服务器端 |[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_train.tar) |[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) |[推理模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar) / [预训练模型](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_pre.tar) |
* windows 环境下如果没有安装wget,下载模型时可将链接复制到浏览器中下载,并解压放置在相应目录下 * windows 环境下如果没有安装wget,下载模型时可将链接复制到浏览器中下载,并解压放置在相应目录下
...@@ -37,44 +37,45 @@ cd .. ...@@ -37,44 +37,45 @@ cd ..
``` ```
mkdir inference && cd inference mkdir inference && cd inference
# 下载超轻量级中文OCR模型的检测模型并解压 # 下载超轻量级中文OCR模型的检测模型并解压
wget https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_infer.tar && tar xf ch_ppocr_mobile_v1.1_det_infer.tar wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar && tar xf ch_ppocr_mobile_v2.0_det_infer.tar
# 下载超轻量级中文OCR模型的识别模型并解压 # 下载超轻量级中文OCR模型的识别模型并解压
wget https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_infer.tar && tar xf ch_ppocr_mobile_v1.1_rec_infer.tar wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar && tar xf ch_ppocr_mobile_v2.0_rec_infer.tar
# 下载超轻量级中文OCR模型的文本方向分类器模型并解压 # 下载超轻量级中文OCR模型的文本方向分类器模型并解压
wget https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_infer.tar && tar xf ch_ppocr_mobile_v1.1_cls_infer.tar wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar && tar xf ch_ppocr_mobile_v2.0_cls_infer.tar
cd .. cd ..
``` ```
解压完毕后应有如下文件结构: 解压完毕后应有如下文件结构:
``` ```
|-inference ├── ch_ppocr_mobile_v2.0_cls_infer
|-ch_ppocr_mobile_v1.1_det_infer │ ├── inference.pdiparams
|- model │ ├── inference.pdiparams.info
|- params │ └── inference.pdmodel
|-ch_ppocr_mobile_v1.1_rec_infer ├── ch_ppocr_mobile_v2.0_det_infer
|- model │ ├── inference.pdiparams
|- params │ ├── inference.pdiparams.info
|-ch_ppocr_mobile-v1.1_cls_infer │ └── inference.pdmodel
|- model ├── ch_ppocr_mobile_v2.0_rec_infer
|- params ├── inference.pdiparams
... ├── inference.pdiparams.info
└── inference.pdmodel
``` ```
## 3.单张图像或者图像集合预测 ## 3.单张图像或者图像集合预测
以下代码实现了文本检测、识别串联推理,在执行预测时,需要通过参数image_dir指定单张图像或者图像集合的路径、参数`det_model_dir`指定检测inference模型的路径、参数`rec_model_dir`指定识别inference模型的路径、参数`use_angle_cls`指定是否使用方向分类器、参数`cls_model_dir`指定方向分类器inference模型的路径、参数`use_space_char`指定是否预测空格字符。可视化识别结果默认保存到`./inference_results`文件夹里面。 以下代码实现了文本检测、方向分类器和识别串联推理,在执行预测时,需要通过参数image_dir指定单张图像或者图像集合的路径、参数`det_model_dir`指定检测inference模型的路径、参数`rec_model_dir`指定识别inference模型的路径、参数`use_angle_cls`指定是否使用方向分类器、参数`cls_model_dir`指定方向分类器inference模型的路径、参数`use_space_char`指定是否预测空格字符。可视化识别结果默认保存到`./inference_results`文件夹里面。
```bash ```bash
# 预测image_dir指定的单张图像 # 预测image_dir指定的单张图像
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_ppocr_mobile_v1.1_det_infer/" --rec_model_dir="./inference/ch_ppocr_mobile_v1.1_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v1.1_cls_infer/" --use_angle_cls=True --use_space_char=True python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_ppocr_mobile_v2.0_det_infer/" --rec_model_dir="./inference/ch_ppocr_mobile_v2.0_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v2.0_cls_infer/" --use_angle_cls=True --use_space_char=True
# 预测image_dir指定的图像集合 # 预测image_dir指定的图像集合
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/" --det_model_dir="./inference/ch_ppocr_mobile_v1.1_det_infer/" --rec_model_dir="./inference/ch_ppocr_mobile_v1.1_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v1.1_cls_infer/" --use_angle_cls=True --use_space_char=True python3 tools/infer/predict_system.py --image_dir="./doc/imgs/" --det_model_dir="./inference/ch_ppocr_mobile_v2.0_det_infer/" --rec_model_dir="./inference/ch_ppocr_mobile_v2.0_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v2.0_cls_infer/" --use_angle_cls=True --use_space_char=True
# 如果想使用CPU进行预测,需设置use_gpu参数为False # 如果想使用CPU进行预测,需设置use_gpu参数为False
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_ppocr_mobile_v1.1_det_infer/" --rec_model_dir="./inference/ch_ppocr_mobile_v1.1_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v1.1_cls_infer/" --use_angle_cls=True --use_space_char=True --use_gpu=False python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_ppocr_mobile_v2.0_det_infer/" --rec_model_dir="./inference/ch_ppocr_mobile_v2.0_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v2.0_cls_infer/" --use_angle_cls=True --use_space_char=True --use_gpu=False
``` ```
- 通用中文OCR模型 - 通用中文OCR模型
...@@ -83,7 +84,7 @@ python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_mode ...@@ -83,7 +84,7 @@ python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_mode
```bash ```bash
# 预测image_dir指定的单张图像 # 预测image_dir指定的单张图像
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_ppocr_server_v1.1_det_infer/" --rec_model_dir="./inference/ch_ppocr_server_v1.1_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v1.1_cls_infer/" --use_angle_cls=True --use_space_char=True python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_ppocr_server_v2.0_det_infer/" --rec_model_dir="./inference/ch_ppocr_server_v2.0_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v2.0_cls_infer/" --use_angle_cls=True --use_space_char=True
``` ```
* 注意: * 注意:
......
...@@ -37,8 +37,6 @@ ln -sf <path/to/dataset> <path/to/paddle_ocr>/train_data/dataset ...@@ -37,8 +37,6 @@ ln -sf <path/to/dataset> <path/to/paddle_ocr>/train_data/dataset
若您本地没有数据集,可以在官网下载 [icdar2015](http://rrc.cvc.uab.es/?ch=4&com=downloads) 数据,用于快速验证。也可以参考[DTRB](https://github.com/clovaai/deep-text-recognition-benchmark#download-lmdb-dataset-for-traininig-and-evaluation-from-here),下载 benchmark 所需的lmdb格式数据集。 若您本地没有数据集,可以在官网下载 [icdar2015](http://rrc.cvc.uab.es/?ch=4&com=downloads) 数据,用于快速验证。也可以参考[DTRB](https://github.com/clovaai/deep-text-recognition-benchmark#download-lmdb-dataset-for-traininig-and-evaluation-from-here),下载 benchmark 所需的lmdb格式数据集。
如果希望复现SRN的论文指标,需要下载离线[增广数据](https://pan.baidu.com/s/1-HSZ-ZVdqBF2HaBZ5pRAKA),提取码: y3ry。增广数据是由MJSynth和SynthText做旋转和扰动得到的。数据下载完成后请解压到 {your_path}/PaddleOCR/train_data/data_lmdb_release/training/ 路径下。
<a name="自定义数据集"></a> <a name="自定义数据集"></a>
* 使用自己数据集 * 使用自己数据集
...@@ -65,7 +63,7 @@ wget -P ./train_data/ic15_data https://paddleocr.bj.bcebos.com/dataset/rec_gt_t ...@@ -65,7 +63,7 @@ wget -P ./train_data/ic15_data https://paddleocr.bj.bcebos.com/dataset/rec_gt_t
wget -P ./train_data/ic15_data https://paddleocr.bj.bcebos.com/dataset/rec_gt_test.txt wget -P ./train_data/ic15_data https://paddleocr.bj.bcebos.com/dataset/rec_gt_test.txt
``` ```
PaddleOCR 也提供了数据格式转换脚本,可以将官网 label 转换支持的数据格式。 数据转换工具在 `train_data/gen_label.py`, 这里以训练集为例: PaddleOCR 也提供了数据格式转换脚本,可以将官网 label 转换支持的数据格式。 数据转换工具在 `ppocr/utils/gen_label.py`, 这里以训练集为例:
``` ```
# 将官网下载的标签文件转换为 rec_gt_label.txt # 将官网下载的标签文件转换为 rec_gt_label.txt
...@@ -116,17 +114,19 @@ n ...@@ -116,17 +114,19 @@ n
word_dict.txt 每行有一个单字,将字符与数字索引映射在一起,“and” 将被映射成 [2 5 1] word_dict.txt 每行有一个单字,将字符与数字索引映射在一起,“and” 将被映射成 [2 5 1]
`ppocr/utils/ppocr_keys_v1.txt` 是一个包含6623个字符的中文字典 `ppocr/utils/ppocr_keys_v1.txt` 是一个包含6623个字符的中文字典
`ppocr/utils/ic15_dict.txt` 是一个包含36个字符的英文字典 `ppocr/utils/ic15_dict.txt` 是一个包含36个字符的英文字典
`ppocr/utils/dict/french_dict.txt` 是一个包含118个字符的法文字典 `ppocr/utils/dict/french_dict.txt` 是一个包含118个字符的法文字典
`ppocr/utils/dict/japan_dict.txt` 是一个包含4399个字符的法文字典 `ppocr/utils/dict/japan_dict.txt` 是一个包含4399个字符的日文字典
`ppocr/utils/dict/korean_dict.txt` 是一个包含3636个字符的韩文字典
`ppocr/utils/dict/korean_dict.txt` 是一个包含3636个字符的法文字典 `ppocr/utils/dict/german_dict.txt` 是一个包含131个字符的德文字典
`ppocr/utils/dict/german_dict.txt` 是一个包含131个字符的法文字典 `ppocr/utils/dict/en_dict.txt` 是一个包含63个字符的英文字典
您可以按需使用。 您可以按需使用。
...@@ -142,9 +142,8 @@ word_dict.txt 每行有一个单字,将字符与数字索引映射在一起, ...@@ -142,9 +142,8 @@ word_dict.txt 每行有一个单字,将字符与数字索引映射在一起,
<a name="支持空格"></a> <a name="支持空格"></a>
- 添加空格类别 - 添加空格类别
如果希望支持识别"空格"类别, 请将yml文件中的 `use_space_char` 字段设置为 `true` 如果希望支持识别"空格"类别, 请将yml文件中的 `use_space_char` 字段设置为 `True`
**注意:`use_space_char` 仅在 `character_type=ch` 时生效**
<a name="启动训练"></a> <a name="启动训练"></a>
### 启动训练 ### 启动训练
...@@ -156,10 +155,10 @@ PaddleOCR提供了训练脚本、评估脚本和预测脚本,本节将以 CRNN ...@@ -156,10 +155,10 @@ PaddleOCR提供了训练脚本、评估脚本和预测脚本,本节将以 CRNN
``` ```
cd PaddleOCR/ cd PaddleOCR/
# 下载MobileNetV3的预训练模型 # 下载MobileNetV3的预训练模型
wget -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/rec_mv3_none_bilstm_ctc.tar wget -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_none_bilstm_ctc_v2.0_train.tar
# 解压模型参数 # 解压模型参数
cd pretrain_models cd pretrain_models
tar -xf rec_mv3_none_bilstm_ctc.tar && rm -rf rec_mv3_none_bilstm_ctc.tar tar -xf rec_mv3_none_bilstm_ctc_v2.0_train.tar && rm -rf rec_mv3_none_bilstm_ctc_v2.0_train.tar
``` ```
开始训练: 开始训练:
...@@ -167,10 +166,9 @@ tar -xf rec_mv3_none_bilstm_ctc.tar && rm -rf rec_mv3_none_bilstm_ctc.tar ...@@ -167,10 +166,9 @@ tar -xf rec_mv3_none_bilstm_ctc.tar && rm -rf rec_mv3_none_bilstm_ctc.tar
*如果您安装的是cpu版本,请将配置文件中的 `use_gpu` 字段修改为false* *如果您安装的是cpu版本,请将配置文件中的 `use_gpu` 字段修改为false*
``` ```
# GPU训练 支持单卡,多卡训练,通过CUDA_VISIBLE_DEVICES指定卡号 # GPU训练 支持单卡,多卡训练,通过--gpus参数指定卡号
export CUDA_VISIBLE_DEVICES=0,1,2,3
# 训练icdar15英文数据 并将训练日志保存为 tain_rec.log # 训练icdar15英文数据 并将训练日志保存为 tain_rec.log
python3 tools/train.py -c configs/rec/rec_icdar15_train.yml 2>&1 | tee train_rec.log python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/rec/rec_icdar15_train.yml
``` ```
<a name="数据增强"></a> <a name="数据增强"></a>
- 数据增强 - 数据增强
...@@ -195,8 +193,8 @@ PaddleOCR支持训练和评估交替进行, 可以在 `configs/rec/rec_icdar15_t ...@@ -195,8 +193,8 @@ PaddleOCR支持训练和评估交替进行, 可以在 `configs/rec/rec_icdar15_t
| 配置文件 | 算法名称 | backbone | trans | seq | pred | | 配置文件 | 算法名称 | backbone | trans | seq | pred |
| :--------: | :-------: | :-------: | :-------: | :-----: | :-----: | | :--------: | :-------: | :-------: | :-------: | :-----: | :-----: |
| [rec_chinese_lite_train_v1.1.yml](../../configs/rec/ch_ppocr_v1.1/rec_chinese_lite_train_v1.1.yml) | CRNN | Mobilenet_v3 small 0.5 | None | BiLSTM | ctc | | [rec_chinese_lite_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml) | CRNN | Mobilenet_v3 small 0.5 | None | BiLSTM | ctc |
| [rec_chinese_common_train_v1.1.yml](../../configs/rec/ch_ppocr_v1.1/rec_chinese_common_train_v1.1.yml) | CRNN | ResNet34_vd | None | BiLSTM | ctc | | [rec_chinese_common_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0.yml) | CRNN | ResNet34_vd | None | BiLSTM | ctc |
| rec_chinese_lite_train.yml | CRNN | Mobilenet_v3 small 0.5 | None | BiLSTM | ctc | | rec_chinese_lite_train.yml | CRNN | Mobilenet_v3 small 0.5 | None | BiLSTM | ctc |
| rec_chinese_common_train.yml | CRNN | ResNet34_vd | None | BiLSTM | ctc | | rec_chinese_common_train.yml | CRNN | ResNet34_vd | None | BiLSTM | ctc |
| rec_icdar15_train.yml | CRNN | Mobilenet_v3 large 0.5 | None | BiLSTM | ctc | | rec_icdar15_train.yml | CRNN | Mobilenet_v3 large 0.5 | None | BiLSTM | ctc |
...@@ -206,43 +204,71 @@ PaddleOCR支持训练和评估交替进行, 可以在 `configs/rec/rec_icdar15_t ...@@ -206,43 +204,71 @@ PaddleOCR支持训练和评估交替进行, 可以在 `configs/rec/rec_icdar15_t
| rec_mv3_tps_bilstm_attn.yml | RARE | Mobilenet_v3 large 0.5 | tps | BiLSTM | attention | | rec_mv3_tps_bilstm_attn.yml | RARE | Mobilenet_v3 large 0.5 | tps | BiLSTM | attention |
| rec_r34_vd_none_bilstm_ctc.yml | CRNN | Resnet34_vd | None | BiLSTM | ctc | | rec_r34_vd_none_bilstm_ctc.yml | CRNN | Resnet34_vd | None | BiLSTM | ctc |
| rec_r34_vd_none_none_ctc.yml | Rosetta | Resnet34_vd | None | None | ctc | | rec_r34_vd_none_none_ctc.yml | Rosetta | Resnet34_vd | None | None | ctc |
| rec_r34_vd_tps_bilstm_attn.yml | RARE | Resnet34_vd | tps | BiLSTM | attention |
| rec_r34_vd_tps_bilstm_ctc.yml | STARNet | Resnet34_vd | tps | BiLSTM | ctc | | rec_r34_vd_tps_bilstm_ctc.yml | STARNet | Resnet34_vd | tps | BiLSTM | ctc |
| rec_r50fpn_vd_none_srn.yml | SRN | Resnet50_fpn_vd | None | rnn | srn |
训练中文数据,推荐使用[rec_chinese_lite_train_v1.1.yml](../../configs/rec/ch_ppocr_v1.1/rec_chinese_lite_train_v1.1.yml),如您希望尝试其他算法在中文数据集上的效果,请参考下列说明修改配置文件: 训练中文数据,推荐使用[rec_chinese_lite_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml),如您希望尝试其他算法在中文数据集上的效果,请参考下列说明修改配置文件:
`rec_mv3_none_none_ctc.yml` 为例: `rec_chinese_lite_train_v2.0.yml` 为例:
``` ```
Global: Global:
... ...
# 修改 image_shape 以适应长文本 # 添加自定义字典,如修改字典请将路径指向新字典
image_shape: [3, 32, 320] character_dict_path: ppocr/utils/ppocr_keys_v1.txt
...
# 修改字符类型 # 修改字符类型
character_type: ch character_type: ch
# 添加自定义字典,如修改字典请将路径指向新字典
character_dict_path: ./ppocr/utils/ppocr_keys_v1.txt
# 训练时添加数据增强
distort: true
# 识别空格
use_space_char: true
...
# 修改reader类型
reader_yml: ./configs/rec/rec_chinese_reader.yml
... ...
# 识别空格
use_space_char: True
...
Optimizer: Optimizer:
... ...
# 添加学习率衰减策略 # 添加学习率衰减策略
decay: lr:
function: cosine_decay name: Cosine
# 每个 epoch 包含 iter 数 learning_rate: 0.001
step_each_epoch: 20 ...
# 总共训练epoch数
total_epoch: 1000 ...
Train:
dataset:
# 数据集格式,支持LMDBDateSet以及SimpleDataSet
name: SimpleDataSet
# 数据集路径
data_dir: ./train_data/
# 训练集标签文件
label_file_list: ["./train_data/train_list.txt"]
transforms:
...
- RecResizeImg:
# 修改 image_shape 以适应长文本
image_shape: [3, 32, 320]
...
loader:
...
# 单卡训练的batch_size
batch_size_per_card: 256
...
Eval:
dataset:
# 数据集格式,支持LMDBDateSet以及SimpleDataSet
name: SimpleDataSet
# 数据集路径
data_dir: ./train_data
# 验证集标签文件
label_file_list: ["./train_data/val_list.txt"]
transforms:
...
- RecResizeImg:
# 修改 image_shape 以适应长文本
image_shape: [3, 32, 320]
...
loader:
# 单卡验证的batch_size
batch_size_per_card: 256
...
``` ```
**注意,预测/评估时的配置文件请务必与训练一致。** **注意,预测/评估时的配置文件请务必与训练一致。**
...@@ -270,39 +296,41 @@ Global: ...@@ -270,39 +296,41 @@ Global:
... ...
# 添加自定义字典,如修改字典请将路径指向新字典 # 添加自定义字典,如修改字典请将路径指向新字典
character_dict_path: ./ppocr/utils/dict/french_dict.txt character_dict_path: ./ppocr/utils/dict/french_dict.txt
# 训练时添加数据增强
distort: true
# 识别空格
use_space_char: true
...
# 修改reader类型
reader_yml: ./configs/rec/multi_languages/rec_french_reader.yml
... ...
... # 识别空格
``` use_space_char: True
同时需要修改数据读取文件 `rec_french_reader.yml`
```
TrainReader:
...
# 修改训练数据存放的目录名
img_set_dir: ./train_data
# 修改 label 文件名称
label_file_path: ./train_data/french_train.txt
... ...
Train:
dataset:
# 数据集格式,支持LMDBDateSet以及SimpleDataSet
name: SimpleDataSet
# 数据集路径
data_dir: ./train_data/
# 训练集标签文件
label_file_list: ["./train_data/french_train.txt"]
...
Eval:
dataset:
# 数据集格式,支持LMDBDateSet以及SimpleDataSet
name: SimpleDataSet
# 数据集路径
data_dir: ./train_data
# 验证集标签文件
label_file_list: ["./train_data/french_val.txt"]
...
``` ```
<a name="评估"></a> <a name="评估"></a>
### 评估 ### 评估
评估数据集可以通过 `configs/rec/rec_icdar15_reader.yml` 修改EvalReader中的 `label_file_path` 设置。 评估数据集可以通过 `configs/rec/rec_icdar15_train.yml` 修改Eval中的 `label_file_path` 设置。
*注意* 评估时必须确保配置文件中 infer_img 字段为空 *注意* 评估时必须确保配置文件中 infer_img 字段为空
``` ```
export CUDA_VISIBLE_DEVICES=0
# GPU 评估, Global.checkpoints 为待测权重 # GPU 评估, Global.checkpoints 为待测权重
python3 tools/eval.py -c configs/rec/rec_icdar15_train.yml -o Global.checkpoints={path/to/weights}/best_accuracy python3 -m paddle.distributed.launch --gpus '0' tools/eval.py -c configs/rec/rec_icdar15_train.yml -o Global.checkpoints={path/to/weights}/best_accuracy
``` ```
<a name="预测"></a> <a name="预测"></a>
...@@ -332,12 +360,12 @@ infer_img: doc/imgs_words/en/word_1.png ...@@ -332,12 +360,12 @@ infer_img: doc/imgs_words/en/word_1.png
word : joint word : joint
``` ```
预测使用的配置文件必须与训练一致,如您通过 `python3 tools/train.py -c configs/rec/ch_ppocr_v1.1/rec_chinese_lite_train_v1.1.yml` 完成了中文模型的训练, 预测使用的配置文件必须与训练一致,如您通过 `python3 tools/train.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml` 完成了中文模型的训练,
您可以使用如下命令进行中文模型预测。 您可以使用如下命令进行中文模型预测。
``` ```
# 预测中文结果 # 预测中文结果
python3 tools/infer_rec.py -c configs/rec/ch_ppocr_v1.1/rec_chinese_lite_train_v1.1.yml -o Global.checkpoints={path/to/weights}/best_accuracy Global.infer_img=doc/imgs_words/ch/word_1.jpg python3 tools/infer_rec.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml -o Global.checkpoints={path/to/weights}/best_accuracy Global.infer_img=doc/imgs_words/ch/word_1.jpg
``` ```
预测图片: 预测图片:
......
...@@ -348,7 +348,7 @@ im_show.save('result.jpg') ...@@ -348,7 +348,7 @@ im_show.save('result.jpg')
| cls_batch_num | 进行分类时,同时前向的图片数 |30 | | cls_batch_num | 进行分类时,同时前向的图片数 |30 |
| enable_mkldnn | 是否启用mkldnn | FALSE | | enable_mkldnn | 是否启用mkldnn | FALSE |
| use_zero_copy_run | 是否通过zero_copy_run的方式进行前向 | FALSE | | use_zero_copy_run | 是否通过zero_copy_run的方式进行前向 | FALSE |
| lang | 模型语言类型,目前支持 中文(ch)和英文(en) | ch | | lang | 模型语言类型,目前支持 目前支持中英文(ch)、英文(en)、法语(french)、德语(german)、韩语(korean)、日语(japan) | ch |
| det | 前向时使用启动检测 | TRUE | | det | 前向时使用启动检测 | TRUE |
| rec | 前向时是否启动识别 | TRUE | | rec | 前向时是否启动识别 | TRUE |
| cls | 前向时是否启动分类 (命令行模式下使用use_angle_cls控制前向是否启动分类) | FALSE | | cls | 前向时是否启动分类 (命令行模式下使用use_angle_cls控制前向是否启动分类) | FALSE |
...@@ -19,17 +19,17 @@ On the ICDAR2015 dataset, the text detection result is as follows: ...@@ -19,17 +19,17 @@ On the ICDAR2015 dataset, the text detection result is as follows:
|Model|Backbone|precision|recall|Hmean|Download link| |Model|Backbone|precision|recall|Hmean|Download link|
|-|-|-|-|-|-| |-|-|-|-|-|-|
|EAST|ResNet50_vd|88.18%|85.51%|86.82%|[Download link](https://paddleocr.bj.bcebos.com/det_r50_vd_east.tar)| |EAST|ResNet50_vd|88.18%|85.51%|86.82%|[Download link](link)|
|EAST|MobileNetV3|81.67%|79.83%|80.74%|[Download link](https://paddleocr.bj.bcebos.com/det_mv3_east.tar)| |EAST|MobileNetV3|81.67%|79.83%|80.74%|[Download link](link)|
|DB|ResNet50_vd|83.79%|80.65%|82.19%|[Download link](https://paddleocr.bj.bcebos.com/det_r50_vd_db.tar)| |DB|ResNet50_vd|83.79%|80.65%|82.19%|[Download link](link)|
|DB|MobileNetV3|75.92%|73.18%|74.53%|[Download link](https://paddleocr.bj.bcebos.com/det_mv3_db.tar)| |DB|MobileNetV3|75.92%|73.18%|74.53%|[Download link](link)|
|SAST|ResNet50_vd|92.18%|82.96%|87.33%|[Download link](https://paddleocr.bj.bcebos.com/SAST/sast_r50_vd_icdar2015.tar)| |SAST|ResNet50_vd|92.18%|82.96%|87.33%|[Download link](link)|
On Total-Text dataset, the text detection result is as follows: On Total-Text dataset, the text detection result is as follows:
|Model|Backbone|precision|recall|Hmean|Download link| |Model|Backbone|precision|recall|Hmean|Download link|
|-|-|-|-|-|-| |-|-|-|-|-|-|
|SAST|ResNet50_vd|88.74%|79.80%|84.03%|[Download link](https://paddleocr.bj.bcebos.com/SAST/sast_r50_vd_total_text.tar)| |SAST|ResNet50_vd|88.74%|79.80%|84.03%|[Download link](link)|
**Note:** Additional data, like icdar2013, icdar2017, COCO-Text, ArT, was added to the model training of SAST. Download English public dataset in organized format used by PaddleOCR from [Baidu Drive](https://pan.baidu.com/s/12cPnZcVuV1zn5DOd4mqjVw) (download code: 2bpi). **Note:** Additional data, like icdar2013, icdar2017, COCO-Text, ArT, was added to the model training of SAST. Download English public dataset in organized format used by PaddleOCR from [Baidu Drive](https://pan.baidu.com/s/12cPnZcVuV1zn5DOd4mqjVw) (download code: 2bpi).
...@@ -42,8 +42,8 @@ PaddleOCR open-source text recognition algorithms list: ...@@ -42,8 +42,8 @@ PaddleOCR open-source text recognition algorithms list:
- [x] CRNN([paper](https://arxiv.org/abs/1507.05717)) - [x] CRNN([paper](https://arxiv.org/abs/1507.05717))
- [x] Rosetta([paper](https://arxiv.org/abs/1910.05085)) - [x] Rosetta([paper](https://arxiv.org/abs/1910.05085))
- [x] STAR-Net([paper](http://www.bmva.org/bmvc/2016/papers/paper043/index.html)) - [x] STAR-Net([paper](http://www.bmva.org/bmvc/2016/papers/paper043/index.html))
- [x] RARE([paper](https://arxiv.org/abs/1603.03915v1)) - [ ] RARE([paper](https://arxiv.org/abs/1603.03915v1)) coming soon
- [x] SRN([paper](https://arxiv.org/abs/2003.12294))(Baidu Self-Research) - [ ] SRN([paper](https://arxiv.org/abs/2003.12294))(Baidu Self-Research) coming soon
Refer to [DTRB](https://arxiv.org/abs/1904.01906), the training and evaluation result of these above text recognition (using MJSynth and SynthText for training, evaluate on IIIT, SVT, IC03, IC13, IC15, SVTP, CUTE) is as follow: Refer to [DTRB](https://arxiv.org/abs/1904.01906), the training and evaluation result of these above text recognition (using MJSynth and SynthText for training, evaluate on IIIT, SVT, IC03, IC13, IC15, SVTP, CUTE) is as follow:
...@@ -55,12 +55,6 @@ Refer to [DTRB](https://arxiv.org/abs/1904.01906), the training and evaluation r ...@@ -55,12 +55,6 @@ Refer to [DTRB](https://arxiv.org/abs/1904.01906), the training and evaluation r
|CRNN|MobileNetV3|79.37%|rec_mv3_none_bilstm_ctc|[Download link](https://paddleocr.bj.bcebos.com/rec_mv3_none_bilstm_ctc.tar)| |CRNN|MobileNetV3|79.37%|rec_mv3_none_bilstm_ctc|[Download link](https://paddleocr.bj.bcebos.com/rec_mv3_none_bilstm_ctc.tar)|
|STAR-Net|Resnet34_vd|83.93%|rec_r34_vd_tps_bilstm_ctc|[Download link](https://paddleocr.bj.bcebos.com/rec_r34_vd_tps_bilstm_ctc.tar)| |STAR-Net|Resnet34_vd|83.93%|rec_r34_vd_tps_bilstm_ctc|[Download link](https://paddleocr.bj.bcebos.com/rec_r34_vd_tps_bilstm_ctc.tar)|
|STAR-Net|MobileNetV3|81.56%|rec_mv3_tps_bilstm_ctc|[Download link](https://paddleocr.bj.bcebos.com/rec_mv3_tps_bilstm_ctc.tar)| |STAR-Net|MobileNetV3|81.56%|rec_mv3_tps_bilstm_ctc|[Download link](https://paddleocr.bj.bcebos.com/rec_mv3_tps_bilstm_ctc.tar)|
|RARE|Resnet34_vd|84.90%|rec_r34_vd_tps_bilstm_attn|[Download link](https://paddleocr.bj.bcebos.com/rec_r34_vd_tps_bilstm_attn.tar)|
|RARE|MobileNetV3|83.32%|rec_mv3_tps_bilstm_attn|[Download link](https://paddleocr.bj.bcebos.com/rec_mv3_tps_bilstm_attn.tar)|
|SRN|Resnet50_vd_fpn|88.33%|rec_r50fpn_vd_none_srn|[Download link](https://paddleocr.bj.bcebos.com/SRN/rec_r50fpn_vd_none_srn.tar)|
**Note:** SRN model uses data expansion method to expand the two training sets mentioned above, and the expanded data can be downloaded from [Baidu Drive](https://pan.baidu.com/s/1-HSZ-ZVdqBF2HaBZ5pRAKA) (download code: y3ry).
The average accuracy of the two-stage training in the original paper is 89.74%, and that of one stage training in paddleocr is 88.33%. Both pre-trained weights can be downloaded [here](https://paddleocr.bj.bcebos.com/SRN/rec_r50fpn_vd_none_srn.tar).
Please refer to the document for training guide and use of PaddleOCR text recognition algorithms [Text recognition model training/evaluation/prediction](./doc/doc_en/recognition_en.md) Please refer to the document for training guide and use of PaddleOCR text recognition algorithms [Text recognition model training/evaluation/prediction](./doc/doc_en/recognition_en.md)
...@@ -9,14 +9,14 @@ The following list can be viewed through `--help` ...@@ -9,14 +9,14 @@ The following list can be viewed through `--help`
## INTRODUCTION TO GLOBAL PARAMETERS OF CONFIGURATION FILE ## INTRODUCTION TO GLOBAL PARAMETERS OF CONFIGURATION FILE
Take rec_chinese_lite_train_v1.1.yml as an example Take rec_chinese_lite_train_v2.0.yml as an example
### Global ### Global
| Parameter | Use | Defaults | Note | | Parameter | Use | Defaults | Note |
| :----------------------: | :---------------------: | :--------------: | :--------------------: | | :----------------------: | :---------------------: | :--------------: | :--------------------: |
| use_gpu | Set using GPU or not | true | \ | | use_gpu | Set using GPU or not | true | \ |
| epoch_num | Maximum training epoch number | 500 | \ | | epoch_num | Maximum training epoch number | 500 | \ |
| log_smooth_window | Sliding window size | 20 | \ | | log_smooth_window | Log queue length, the median value in the queue each time will be printed | 20 | \ |
| print_batch_step | Set print log interval | 10 | \ | | print_batch_step | Set print log interval | 10 | \ |
| save_model_dir | Set model save path | output/{算法名称} | \ | | save_model_dir | Set model save path | output/{算法名称} | \ |
| save_epoch_step | Set model save interval | 3 | \ | | save_epoch_step | Set model save interval | 3 | \ |
...@@ -118,4 +118,4 @@ In ppocr, the network is divided into four stages: Transform, Backbone, Neck and ...@@ -118,4 +118,4 @@ In ppocr, the network is divided into four stages: Transform, Backbone, Neck and
| shuffle | Does each epoch disrupt the order of the data set | True | \ | | shuffle | Does each epoch disrupt the order of the data set | True | \ |
| batch_size_per_card | Single card batch size during training | 256 | \ | | batch_size_per_card | Single card batch size during training | 256 | \ |
| drop_last | Whether to discard the last incomplete mini-batch because the number of samples in the data set cannot be divisible by batch_size | True | \ | | drop_last | Whether to discard the last incomplete mini-batch because the number of samples in the data set cannot be divisible by batch_size | True | \ |
| num_workers | The number of sub-processes used to load data, if it is 0, the sub-process is not started, and the data is loaded in the main process | 8 | \ | | num_workers | The number of sub-processes used to load data, if it is 0, the sub-process is not started, and the data is loaded in the main process | 8 | \ |
\ No newline at end of file
此差异已折叠。
## OCR model list(V1.1, updated on 9.22) ## OCR model list(V1.1, updated on 2020.12.12)
- [1. Text Detection Model](#Detection) - [1. Text Detection Model](#Detection)
- [2. Text Recognition Model](#Recognition) - [2. Text Recognition Model](#Recognition)
...@@ -10,61 +10,62 @@ ...@@ -10,61 +10,62 @@
The downloadable models provided by PaddleOCR include `inference model`, `trained model`, `pre-trained model` and `slim model`. The differences between the models are as follows: The downloadable models provided by PaddleOCR include `inference model`, `trained model`, `pre-trained model` and `slim model`. The differences between the models are as follows:
|model type|model format|description| |model type|model format|description|
|-|-|-| |--- | --- | --- |
|inference model|model、params|Used for reasoning based on Python prediction engine. [detail](./inference_en.md)| |inference model|inference.pdmodel、inference.pdiparams|Used for reasoning based on Python prediction engine,[detail](./inference_en.md)|
|trained model / pre-trained model|\*.pdmodel、\*.pdopt、\*.pdparams|The checkpoints model saved in the training process, which stores the parameters of the model, mostly used for model evaluation and continuous training.| |trained model, pre-trained model|\*.pdparams、\*.pdopt、\*.states |The checkpoints model saved in the training process, which stores the parameters of the model, mostly used for model evaluation and continuous training.|
|slim model|\*.nb|Generally used for Lite deployment| |slim model|\*.nb|Generally used for Lite deployment|
<a name="Detection"></a> <a name="Detection"></a>
### 1. Text Detection Model ### 1. Text Detection Model
|model name|description|config|model size|download|
|-|-|-|-|-|
|ch_ppocr_mobile_slim_v1.1_det|Slim pruned lightweight model, supporting Chinese, English, multilingual text detection|[det_mv3_db_v1.1.yml](../../configs/det/det_mv3_db_v1.1.yml)|1.4M|[inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/det/ch_ppocr_mobile_v1.1_det_prune_infer.tar) / [slim model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/lite/ch_ppocr_mobile_v1.1_det_prune_opt.nb)|
|ch_ppocr_mobile_v1.1_det|Original lightweight model, supporting Chinese, English, multilingual text detection|[det_mv3_db_v1.1.yml](../../configs/det/det_mv3_db_v1.1.yml)|2.6M|[inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_train.tar)|
|ch_ppocr_server_v1.1_det|General model, which is larger than the lightweight model, but achieved better performance|[det_r18_vd_db_v1.1.yml](../../configs/det/det_r18_vd_db_v1.1.yml)|47.2M|[inference model](https://paddleocr.bj.bcebos.com/20-09-22/server/det/ch_ppocr_server_v1.1_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/20-09-22/server/det/ch_ppocr_server_v1.1_det_train.tar)|
|model name|description|config|model size|download|
| --- | --- | --- | --- | --- |
|ch_ppocr_mobile_slim_v2.0_det|Slim pruned lightweight model, supporting Chinese, English, multilingual text detection|[ch_det_mv3_db_v2.0.yml](../../configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml)| |[inference model (coming soon)](link) / [slim model (coming soon)](link)|
|ch_ppocr_mobile_v2.0_det|Original lightweight model, supporting Chinese, English, multilingual text detection|[ch_det_mv3_db_v2.0.yml](../../configs/det/ch_ppocr_v2.0/ch_det_mv3_db_v2.0.yml)|3M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_train.tar)|
|ch_ppocr_server_v2.0_det|General model, which is larger than the lightweight model, but achieved better performance|[ch_det_res18_db_v2.0.yml](../../configs/det/ch_ppocr_v2.0/ch_det_res18_db_v2.0.yml)|47M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_train.tar)|
<a name="Recognition"></a> <a name="Recognition"></a>
### 2. Text Recognition Model ### 2. Text Recognition Model
<a name="Chinese"></a> <a name="Chinese"></a>
#### Chinese Recognition Model #### Chinese Recognition Model
|model name|description|config|model size|download| |model name|description|config|model size|download|
|-|-|-|-|-| | --- | --- | --- | --- | --- |
|ch_ppocr_mobile_slim_v1.1_rec|Slim pruned and quantized lightweight model, supporting Chinese, English and number recognition|[rec_chinese_lite_train_v1.1.yml](../../configs/rec/ch_ppocr_v1.1/rec_chinese_lite_train_v1.1.yml)|1.6M|[inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/rec/ch_ppocr_mobile_v1.1_rec_quant_infer.tar) / [slim model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/lite/ch_ppocr_mobile_v1.1_rec_quant_opt.nb) | |ch_ppocr_mobile_slim_v2.0_rec|Slim pruned and quantized lightweight model, supporting Chinese, English and number recognition|[rec_chinese_lite_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml)| |[inference model (coming soon)](link) / [slim model (coming soon)](link) |
|ch_ppocr_mobile_v1.1_rec|Original lightweight model, supporting Chinese, English and number recognition|[rec_chinese_lite_train_v1.1.yml](../../configs/rec/ch_ppocr_v1.1/rec_chinese_lite_train_v1.1.yml)|4.6M|[inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_train.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_pre.tar) | |ch_ppocr_mobile_v2.0_rec|Original lightweight model, supporting Chinese, English and number recognition|[rec_chinese_lite_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml)|3.71M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_train.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_pre.tar) |
|ch_ppocr_server_v1.1_rec|General model, supporting Chinese, English and number recognition|[rec_chinese_common_train_v1.1.yml](../../configs/rec/ch_ppocr_v1.1/rec_chinese_common_train_v1.1.yml)|105M|[inference model](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_train.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_pre.tar) | |ch_ppocr_server_v2.0_rec|General model, supporting Chinese, English and number recognition|[rec_chinese_common_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0.yml)|94.8M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_train.tar) / [pre-trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_pre.tar) |
**Note:** The `trained model` is finetuned on the `pre-trained model` with real data and synthsized vertical text data, which achieved better performance in real scene. The `pre-trained model` is directly trained on the full amount of real data and synthsized data, which is more suitable for finetune on your own dataset. **Note:** The `trained model` is finetuned on the `pre-trained model` with real data and synthsized vertical text data, which achieved better performance in real scene. The `pre-trained model` is directly trained on the full amount of real data and synthsized data, which is more suitable for finetune on your own dataset.
<a name="English"></a> <a name="English"></a>
#### English Recognition Model #### English Recognition Model
|model name|description|config|model size|download| |model name|description|config|model size|download|
|-|-|-|-|-| | --- | --- | --- | --- | --- |
|en_ppocr_mobile_slim_v1.1_rec|Slim pruned and quantized lightweight model, supporting English and number recognition|[rec_en_lite_train.yml](../../configs/rec/multi_languages/rec_en_lite_train.yml)|0.9M|[inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/en/en_ppocr_mobile_v1.1_rec_quant_infer.tar) / [slim model](https://paddleocr.bj.bcebos.com/20-09-22/mobile-slim/en/en_ppocr_mobile_v1.1_rec_quant_opt.nb) | |en_number_mobile_slim_v2.0_rec|Slim pruned and quantized lightweight model, supporting English and number recognition|[rec_en_number_lite_train.yml](../../configs/rec/multi_language/rec_en_number_lite_train.yml)| |[inference model (coming soon )](link) / [slim model (coming soon)](link) |
|en_ppocr_mobile_v1.1_rec|Original lightweight model, supporting English and number recognition|[rec_en_lite_train.yml](../../configs/rec/multi_languages/rec_en_lite_train.yml)|2.0M|[inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/en/en_ppocr_mobile_v1.1_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/en/en_ppocr_mobile_v1.1_rec_train.tar) | |en_number_mobile_v2.0_rec|Original lightweight model, supporting English and number recognition|[rec_en_number_lite_train.yml](../../configs/rec/multi_language/rec_en_number_lite_train.yml)|2.56M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/en_number_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/en_number_mobile_v2.0_rec_train.tar) |
<a name="Multilingual"></a> <a name="Multilingual"></a>
#### Multilingual Recognition Model(Updating...) #### Multilingual Recognition Model(Updating...)
|model name|description|config|model size|download|
|-|-|-|-|-|
| french_ppocr_mobile_v1.1_rec |Lightweight model for French recognition|[rec_french_lite_train.yml](../../configs/rec/multi_languages/rec_french_lite_train.yml)|2.1M|[inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/fr/french_ppocr_mobile_v1.1_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/fr/french_ppocr_mobile_v1.1_rec_train.tar) |
| german_ppocr_mobile_v1.1_rec |German model for French recognition|[rec_ger_lite_train.yml](../../configs/rec/multi_languages/rec_ger_lite_train.yml)|2.1M|[inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/ge/german_ppocr_mobile_v1.1_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/ge/german_ppocr_mobile_v1.1_rec_train.tar) |
| korean_ppocr_mobile_v1.1_rec |Lightweight model for Korean recognition|[rec_korean_lite_train.yml](../../configs/rec/multi_languages/rec_korean_lite_train.yml)|3.4M|[inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/kr/korean_ppocr_mobile_v1.1_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/kr/korean_ppocr_mobile_v1.1_rec_train.tar) |
| japan_ppocr_mobile_v1.1_rec |Lightweight model for Japanese recognition|[rec_japan_lite_train.yml](../../configs/rec/multi_languages/rec_japan_lite_train.yml)|3.7M|[inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/jp/japan_ppocr_mobile_v1.1_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/jp/japan_ppocr_mobile_v1.1_rec_train.tar) |
|model name|description|config|model size|download|
| --- | --- | --- | --- | --- |
| french_mobile_v2.0_rec |Lightweight model for French recognition|[rec_french_lite_train.yml](../../configs/rec/multi_language/rec_french_lite_train.yml)|2.65M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/french_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/french_mobile_v2.0_rec_train.tar) |
| german_mobile_v2.0_rec |Lightweight model for French recognition|[rec_german_lite_train.yml](../../configs/rec/multi_language/rec_german_lite_train.yml)|2.65M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/german_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/german_mobile_v2.0_rec_train.tar) |
| korean_mobile_v2.0_rec |Lightweight model for Korean recognition|[rec_korean_lite_train.yml](../../configs/rec/multi_language/rec_korean_lite_train.yml)|3.9M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/korean_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/korean_mobile_v2.0_rec_train.tar) |
| japan_mobile_v2.0_rec |Lightweight model for Japanese recognition|[rec_japan_lite_train.yml](../../configs/rec/multi_language/rec_japan_lite_train.yml)|4.23M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/japan_mobile_v2.0_rec_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/japan_mobile_v2.0_rec_train.tar) |
<a name="Angle"></a> <a name="Angle"></a>
### 3. Text Angle Classification Model ### 3. Text Angle Classification Model
|model name|description|config|model size|download| |model name|description|config|model size|download|
|-|-|-|-|-| | --- | --- | --- | --- | --- |
|ch_ppocr_mobile_v1.1_cls_quant|Slim quantized model|[cls_mv3.yml](../../configs/cls/cls_mv3.yml)|0.5M|[inference model](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_quant_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_quant_train.tar) / [slim model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/lite/ch_ppocr_mobile_v1.1_cls_quant_opt.nb) | |ch_ppocr_mobile_slim_v2.0_cls|Slim quantized model|[cls_mv3.yml](../../configs/cls/cls_mv3.yml)| |[inference model (coming soon)](link) / [trained model](link) / [slim model](link) |
|ch_ppocr_mobile_v1.1_cls|Original model|[cls_mv3.yml](../../configs/cls/cls_mv3.yml)|850kb|[inference model](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_train.tar) | |ch_ppocr_mobile_v2.0_cls|Original model|[cls_mv3.yml](../../configs/cls/cls_mv3.yml)|1.38M|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) |
## OCR model list (V1.1,updated on 2020.9.22)
## OCR model list(V1.0, updated on 7.16) [1.1 series model address](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/doc/doc_ch/models_list.md)
|model name|description|detection model|recognition model|recognition model supporting space recognition|
|-|-|-|-|-|
|chinese_db_crnn_mobile|8.6M lightweight OCR model|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db.tar) | [inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn.tar) |[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_enhance_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_enhance.tar)
|chinese_db_crnn_server|General OCR model|[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_det_r50_vd_db_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_det_r50_vd_db.tar) | [inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn.tar) |[inference model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_enhance_infer.tar) / [trained model](https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_enhance.tar)
...@@ -5,17 +5,17 @@ ...@@ -5,17 +5,17 @@
Please refer to [quick installation](./installation_en.md) to configure the PaddleOCR operating environment. Please refer to [quick installation](./installation_en.md) to configure the PaddleOCR operating environment.
* Note: Support the use of PaddleOCR through whl package installation,pelease refer [PaddleOCR Package](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_en/whl_en.md). * Note: Support the use of PaddleOCR through whl package installation,pelease refer [PaddleOCR Package](./whl_en.md).
## 2.inference models ## 2.inference models
The detection and recognition models on the mobile and server sides are as follows. For more models (including multiple languages), please refer to [PP-OCR v1.1 series model list](../doc_ch/models_list.md) The detection and recognition models on the mobile and server sides are as follows. For more models (including multiple languages), please refer to [PP-OCR v2.0 series model list](../doc_ch/models_list.md)
| Model introduction | Model name | Recommended scene | Detection model | Direction Classifier | Recognition model |
| Model introduction | Model name | Recommended scene | Detection model | Direction Classifier | Recognition model |
| ------------ | --------------- | ----------------|---- | ---------- | -------- | | ------------ | --------------- | ----------------|---- | ---------- | -------- |
| Ultra-lightweight Chinese OCR model(8.1M) | ch_ppocr_mobile_v1.1_xx |Mobile-side/Server-side|[inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_train.tar)|[inference model](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_train.tar) |[inference model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_pre.tar) | | Ultra-lightweight Chinese OCR model (8.1M) | ch_ppocr_mobile_v2.0_xx |Mobile-side/Server-side|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_train.tar)|[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_pre.tar) |
| Universal Chinese OCR model(155.1M) |ch_ppocr_server_v1.1_xx|Server-side |[inference model](https://paddleocr.bj.bcebos.com/20-09-22/server/det/ch_ppocr_server_v1.1_det_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/20-09-22/server/det/ch_ppocr_server_v1.1_det_train.tar) |[inference model](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_train.tar) |[inference model](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/20-09-22/server/rec/ch_ppocr_server_v1.1_rec_pre.tar) | | Universal Chinese OCR model (143M) | ch_ppocr_server_v2.0_xx |Server-side |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_train.tar) |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_train.tar) |[inference model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar) / [pretrained model](https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_pre.tar) |
* If `wget` is not installed in the windows environment, you can copy the link to the browser to download when downloading the model, then uncompress it and place it in the corresponding directory. * If `wget` is not installed in the windows environment, you can copy the link to the browser to download when downloading the model, then uncompress it and place it in the corresponding directory.
...@@ -37,46 +37,47 @@ Take the ultra-lightweight model as an example: ...@@ -37,46 +37,47 @@ Take the ultra-lightweight model as an example:
``` ```
mkdir inference && cd inference mkdir inference && cd inference
# Download the detection model of the ultra-lightweight Chinese OCR model and uncompress it # Download the detection model of the ultra-lightweight Chinese OCR model and uncompress it
wget https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_infer.tar && tar xf ch_ppocr_mobile_v1.1_det_infer.tar wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar && tar xf ch_ppocr_mobile_v2.0_det_infer.tar
# Download the recognition model of the ultra-lightweight Chinese OCR model and uncompress it # Download the recognition model of the ultra-lightweight Chinese OCR model and uncompress it
wget https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_infer.tar && tar xf ch_ppocr_mobile_v1.1_rec_infer.tar wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar && tar xf ch_ppocr_mobile_v2.0_rec_infer.tar
# Download the direction classifier model of the ultra-lightweight Chinese OCR model and uncompress it # Download the angle classifier model of the ultra-lightweight Chinese OCR model and uncompress it
wget https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_infer.tar && tar xf ch_ppocr_mobile_v1.1_cls_infer.tar wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar && tar xf ch_ppocr_mobile_v2.0_cls_infer.tar
cd .. cd ..
``` ```
After decompression, the file structure should be as follows: After decompression, the file structure should be as follows:
``` ```
|-inference ├── ch_ppocr_mobile_v2.0_cls_infer
|-ch_ppocr_mobile_v1.1_det_infer │ ├── inference.pdiparams
|- model │ ├── inference.pdiparams.info
|- params │ └── inference.pdmodel
|-ch_ppocr_mobile_v1.1_rec_infer ├── ch_ppocr_mobile_v2.0_det_infer
|- model │ ├── inference.pdiparams
|- params │ ├── inference.pdiparams.info
|-ch_ppocr_mobile_v1.1_cls_infer │ └── inference.pdmodel
|- model ├── ch_ppocr_mobile_v2.0_rec_infer
|- params ├── inference.pdiparams
... ├── inference.pdiparams.info
└── inference.pdmodel
``` ```
## 3. Single image or image set prediction ## 3. Single image or image set prediction
* The following code implements text detection and recognition process. When performing prediction, you need to specify the path of a single image or image set through the parameter `image_dir`, the parameter `det_model_dir` specifies the path to detect the inference model, the parameter `rec_model_dir` specifies the path to identify the inference model, the parameter `use_angle_cls` specifies whether to use the direction classifier, the parameter `cls_model_dir` specifies the path to identify the direction classifier model, the parameter `use_space_char` specifies whether to predict the space char. The visual results are saved to the `./inference_results` folder by default. * The following code implements text detection、angle class and recognition process. When performing prediction, you need to specify the path of a single image or image set through the parameter `image_dir`, the parameter `det_model_dir` specifies the path to detect the inference model, the parameter `rec_model_dir` specifies the path to identify the inference model, the parameter `use_angle_cls` specifies whether to use the direction classifier, the parameter `cls_model_dir` specifies the path to identify the direction classifier model, the parameter `use_space_char` specifies whether to predict the space char. The visual results are saved to the `./inference_results` folder by default.
```bash ```bash
# Predict a single image specified by image_dir # Predict a single image specified by image_dir
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_ppocr_mobile_v1.1_det_infer/" --rec_model_dir="./inference/ch_ppocr_mobile_v1.1_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v1.1_cls_infer/" --use_angle_cls=True --use_space_char=True python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_ppocr_mobile_v2.0_det_infer/" --rec_model_dir="./inference/ch_ppocr_mobile_v2.0_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v2.0_cls_infer/" --use_angle_cls=True --use_space_char=True
# Predict imageset specified by image_dir # Predict imageset specified by image_dir
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/" --det_model_dir="./inference/ch_ppocr_mobile_v1.1_det_infer/" --rec_model_dir="./inference/ch_ppocr_mobile_v1.1_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v1.1_cls_infer/" --use_angle_cls=True --use_space_char=True python3 tools/infer/predict_system.py --image_dir="./doc/imgs/" --det_model_dir="./inference/ch_ppocr_mobile_v2.0_det_infer/" --rec_model_dir="./inference/ch_ppocr_mobile_v2.0_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v2.0_cls_infer/" --use_angle_cls=True --use_space_char=True
# If you want to use the CPU for prediction, you need to set the use_gpu parameter to False # If you want to use the CPU for prediction, you need to set the use_gpu parameter to False
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_ppocr_mobile_v1.1_det_infer/" --rec_model_dir="./inference/ch_ppocr_mobile_v1.1_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v1.1_cls_infer/" --use_angle_cls=True --use_space_char=True --use_gpu=False python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_ppocr_mobile_v2.0_det_infer/" --rec_model_dir="./inference/ch_ppocr_mobile_v2.0_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v2.0_cls_infer/" --use_angle_cls=True --use_space_char=True --use_gpu=False
``` ```
- Universal Chinese OCR model - Universal Chinese OCR model
...@@ -85,7 +86,7 @@ Please follow the above steps to download the corresponding models and update th ...@@ -85,7 +86,7 @@ Please follow the above steps to download the corresponding models and update th
``` ```
# Predict a single image specified by image_dir # Predict a single image specified by image_dir
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_ppocr_server_v1.1_det_infer/" --rec_model_dir="./inference/ch_ppocr_server_v1.1_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v1.1_cls_infer/" --use_angle_cls=True --use_space_char=True python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_ppocr_server_v2.0_det_infer/" --rec_model_dir="./inference/ch_ppocr_server_v2.0_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v2.0_cls_infer/" --use_angle_cls=True --use_space_char=True
``` ```
* Note * Note
......
...@@ -114,11 +114,14 @@ In `word_dict.txt`, there is a single word in each line, which maps characters a ...@@ -114,11 +114,14 @@ In `word_dict.txt`, there is a single word in each line, which maps characters a
`ppocr/utils/dict/french_dict.txt` is a French dictionary with 118 characters `ppocr/utils/dict/french_dict.txt` is a French dictionary with 118 characters
`ppocr/utils/dict/japan_dict.txt` is a French dictionary with 4399 characters `ppocr/utils/dict/japan_dict.txt` is a Japanese dictionary with 4399 characters
`ppocr/utils/dict/korean_dict.txt` is a French dictionary with 3636 characters `ppocr/utils/dict/korean_dict.txt` is a Korean dictionary with 3636 characters
`ppocr/utils/dict/german_dict.txt` is a German dictionary with 131 characters
`ppocr/utils/dict/en_dict.txt` is a English dictionary with 63 characters
`ppocr/utils/dict/german_dict.txt` is a French dictionary with 131 characters
You can use it on demand. You can use it on demand.
...@@ -135,7 +138,7 @@ If you need to customize dic file, please add character_dict_path field in confi ...@@ -135,7 +138,7 @@ If you need to customize dic file, please add character_dict_path field in confi
<a name="Add_space_category"></a> <a name="Add_space_category"></a>
- Add space category - Add space category
If you want to support the recognition of the `space` category, please set the `use_space_char` field in the yml file to `true`. If you want to support the recognition of the `space` category, please set the `use_space_char` field in the yml file to `True`.
**Note: use_space_char only takes effect when character_type=ch** **Note: use_space_char only takes effect when character_type=ch**
...@@ -149,19 +152,18 @@ First download the pretrain model, you can download the trained model to finetun ...@@ -149,19 +152,18 @@ First download the pretrain model, you can download the trained model to finetun
``` ```
cd PaddleOCR/ cd PaddleOCR/
# Download the pre-trained model of MobileNetV3 # Download the pre-trained model of MobileNetV3
wget -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/rec_mv3_none_bilstm_ctc.tar wget -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/rec_mv3_none_bilstm_ctc_v2.0_train.tar
# Decompress model parameters # Decompress model parameters
cd pretrain_models cd pretrain_models
tar -xf rec_mv3_none_bilstm_ctc.tar && rm -rf rec_mv3_none_bilstm_ctc.tar tar -xf rec_mv3_none_bilstm_ctc_v2.0_train.tar && rm -rf rec_mv3_none_bilstm_ctc_v2.0_train.tar
``` ```
Start training: Start training:
``` ```
# GPU training Support single card and multi-card training, specify the card number through CUDA_VISIBLE_DEVICES # GPU training Support single card and multi-card training, specify the card number through --gpus
export CUDA_VISIBLE_DEVICES=0,1,2,3
# Training icdar15 English data and saving the log as train_rec.log # Training icdar15 English data and saving the log as train_rec.log
python3 tools/train.py -c configs/rec/rec_icdar15_train.yml 2>&1 | tee train_rec.log python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/rec/rec_icdar15_train.yml
``` ```
<a name="Data_Augmentation"></a> <a name="Data_Augmentation"></a>
- Data Augmentation - Data Augmentation
...@@ -184,8 +186,8 @@ If the evaluation set is large, the test will be time-consuming. It is recommend ...@@ -184,8 +186,8 @@ If the evaluation set is large, the test will be time-consuming. It is recommend
| Configuration file | Algorithm | backbone | trans | seq | pred | | Configuration file | Algorithm | backbone | trans | seq | pred |
| :--------: | :-------: | :-------: | :-------: | :-----: | :-----: | | :--------: | :-------: | :-------: | :-------: | :-----: | :-----: |
| [rec_chinese_lite_train_v1.1.yml](../../configs/rec/ch_ppocr_v1.1/rec_chinese_lite_train_v1.1.yml) | CRNN | Mobilenet_v3 small 0.5 | None | BiLSTM | ctc | | [rec_chinese_lite_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml) | CRNN | Mobilenet_v3 small 0.5 | None | BiLSTM | ctc |
| [rec_chinese_common_train_v1.1.yml](../../configs/rec/ch_ppocr_v1.1/rec_chinese_common_train_v1.1.yml) | CRNN | ResNet34_vd | None | BiLSTM | ctc | | [rec_chinese_common_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0.yml) | CRNN | ResNet34_vd | None | BiLSTM | ctc |
| rec_chinese_lite_train.yml | CRNN | Mobilenet_v3 small 0.5 | None | BiLSTM | ctc | | rec_chinese_lite_train.yml | CRNN | Mobilenet_v3 small 0.5 | None | BiLSTM | ctc |
| rec_chinese_common_train.yml | CRNN | ResNet34_vd | None | BiLSTM | ctc | | rec_chinese_common_train.yml | CRNN | ResNet34_vd | None | BiLSTM | ctc |
| rec_icdar15_train.yml | CRNN | Mobilenet_v3 large 0.5 | None | BiLSTM | ctc | | rec_icdar15_train.yml | CRNN | Mobilenet_v3 large 0.5 | None | BiLSTM | ctc |
...@@ -195,43 +197,72 @@ If the evaluation set is large, the test will be time-consuming. It is recommend ...@@ -195,43 +197,72 @@ If the evaluation set is large, the test will be time-consuming. It is recommend
| rec_mv3_tps_bilstm_attn.yml | RARE | Mobilenet_v3 large 0.5 | tps | BiLSTM | attention | | rec_mv3_tps_bilstm_attn.yml | RARE | Mobilenet_v3 large 0.5 | tps | BiLSTM | attention |
| rec_r34_vd_none_bilstm_ctc.yml | CRNN | Resnet34_vd | None | BiLSTM | ctc | | rec_r34_vd_none_bilstm_ctc.yml | CRNN | Resnet34_vd | None | BiLSTM | ctc |
| rec_r34_vd_none_none_ctc.yml | Rosetta | Resnet34_vd | None | None | ctc | | rec_r34_vd_none_none_ctc.yml | Rosetta | Resnet34_vd | None | None | ctc |
| rec_r34_vd_tps_bilstm_attn.yml | RARE | Resnet34_vd | tps | BiLSTM | attention |
| rec_r34_vd_tps_bilstm_ctc.yml | STARNet | Resnet34_vd | tps | BiLSTM | ctc | | rec_r34_vd_tps_bilstm_ctc.yml | STARNet | Resnet34_vd | tps | BiLSTM | ctc |
For training Chinese data, it is recommended to use For training Chinese data, it is recommended to use
训练中文数据,推荐使用[rec_chinese_lite_train_v1.1.yml](../../configs/rec/ch_ppocr_v1.1/rec_chinese_lite_train_v1.1.yml). If you want to try the result of other algorithms on the Chinese data set, please refer to the following instructions to modify the configuration file: [rec_chinese_lite_train_v2.0.yml](../../configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml). If you want to try the result of other algorithms on the Chinese data set, please refer to the following instructions to modify the configuration file:
co co
Take `rec_mv3_none_none_ctc.yml` as an example: Take `rec_chinese_lite_train_v2.0.yml` as an example:
``` ```
Global: Global:
... ...
# Modify image_shape to fit long text # Add a custom dictionary, such as modify the dictionary, please point the path to the new dictionary
image_shape: [3, 32, 320] character_dict_path: ppocr/utils/ppocr_keys_v1.txt
...
# Modify character type # Modify character type
character_type: ch character_type: ch
# Add a custom dictionary, such as modify the dictionary, please point the path to the new dictionary
character_dict_path: ./ppocr/utils/ppocr_keys_v1.txt
... ...
# Modify reader type
reader_yml: ./configs/rec/rec_chinese_reader.yml
# Whether to use data augmentation
distort: true
# Whether to recognize spaces # Whether to recognize spaces
use_space_char: true use_space_char: True
...
...
Optimizer: Optimizer:
... ...
# Add learning rate decay strategy # Add learning rate decay strategy
decay: lr:
function: cosine_decay name: Cosine
# Each epoch contains iter number learning_rate: 0.001
step_each_epoch: 20 ...
# Total epoch number
total_epoch: 1000 ...
Train:
dataset:
# Type of dataset,we support LMDBDateSet and SimpleDataSet
name: SimpleDataSet
# Path of dataset
data_dir: ./train_data/
# Path of train list
label_file_list: ["./train_data/train_list.txt"]
transforms:
...
- RecResizeImg:
# Modify image_shape to fit long text
image_shape: [3, 32, 320]
...
loader:
...
# Train batch_size for Single card
batch_size_per_card: 256
...
Eval:
dataset:
# Type of dataset,we support LMDBDateSet and SimpleDataSet
name: SimpleDataSet
# Path of dataset
data_dir: ./train_data
# Path of eval list
label_file_list: ["./train_data/val_list.txt"]
transforms:
...
- RecResizeImg:
# Modify image_shape to fit long text
image_shape: [3, 32, 320]
...
loader:
# Eval batch_size for Single card
batch_size_per_card: 256
...
``` ```
**Note that the configuration file for prediction/evaluation must be consistent with the training.** **Note that the configuration file for prediction/evaluation must be consistent with the training.**
...@@ -257,18 +288,33 @@ Take `rec_french_lite_train` as an example: ...@@ -257,18 +288,33 @@ Take `rec_french_lite_train` as an example:
``` ```
Global: Global:
... ...
# Add a custom dictionary, if you modify the dictionary # Add a custom dictionary, such as modify the dictionary, please point the path to the new dictionary
# please point the path to the new dictionary
character_dict_path: ./ppocr/utils/dict/french_dict.txt character_dict_path: ./ppocr/utils/dict/french_dict.txt
# Add data augmentation during training
distort: true
# Identify spaces
use_space_char: true
...
# Modify reader type
reader_yml: ./configs/rec/multi_languages/rec_french_reader.yml
... ...
# Whether to recognize spaces
use_space_char: True
... ...
Train:
dataset:
# Type of dataset,we support LMDBDateSet and SimpleDataSet
name: SimpleDataSet
# Path of dataset
data_dir: ./train_data/
# Path of train list
label_file_list: ["./train_data/french_train.txt"]
...
Eval:
dataset:
# Type of dataset,we support LMDBDateSet and SimpleDataSet
name: SimpleDataSet
# Path of dataset
data_dir: ./train_data
# Path of eval list
label_file_list: ["./train_data/french_val.txt"]
...
``` ```
<a name="EVALUATION"></a> <a name="EVALUATION"></a>
...@@ -277,9 +323,8 @@ Global: ...@@ -277,9 +323,8 @@ Global:
The evaluation data set can be modified via `configs/rec/rec_icdar15_reader.yml` setting of `label_file_path` in EvalReader. The evaluation data set can be modified via `configs/rec/rec_icdar15_reader.yml` setting of `label_file_path` in EvalReader.
``` ```
export CUDA_VISIBLE_DEVICES=0
# GPU evaluation, Global.checkpoints is the weight to be tested # GPU evaluation, Global.checkpoints is the weight to be tested
python3 tools/eval.py -c configs/rec/rec_icdar15_reader.yml -o Global.checkpoints={path/to/weights}/best_accuracy python3 -m paddle.distributed.launch --gpus '0' tools/eval.py -c configs/rec/rec_icdar15_reader.yml -o Global.checkpoints={path/to/weights}/best_accuracy
``` ```
<a name="PREDICTION"></a> <a name="PREDICTION"></a>
...@@ -294,7 +339,7 @@ The default prediction picture is stored in `infer_img`, and the weight is speci ...@@ -294,7 +339,7 @@ The default prediction picture is stored in `infer_img`, and the weight is speci
``` ```
# Predict English results # Predict English results
python3 tools/infer_rec.py -c configs/rec/ch_ppocr_v1.1/rec_chinese_lite_train_v1.1.yml -o Global.checkpoints={path/to/weights}/best_accuracy TestReader.infer_img=doc/imgs_words/en/word_1.jpg python3 tools/infer_rec.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml -o Global.checkpoints={path/to/weights}/best_accuracy TestReader.infer_img=doc/imgs_words/en/word_1.jpg
``` ```
Input image: Input image:
...@@ -309,11 +354,11 @@ infer_img: doc/imgs_words/en/word_1.png ...@@ -309,11 +354,11 @@ infer_img: doc/imgs_words/en/word_1.png
word : joint word : joint
``` ```
The configuration file used for prediction must be consistent with the training. For example, you completed the training of the Chinese model with `python3 tools/train.py -c configs/rec/ch_ppocr_v1.1/rec_chinese_lite_train_v1.1.yml`, you can use the following command to predict the Chinese model: The configuration file used for prediction must be consistent with the training. For example, you completed the training of the Chinese model with `python3 tools/train.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml`, you can use the following command to predict the Chinese model:
``` ```
# Predict Chinese results # Predict Chinese results
python3 tools/infer_rec.py -c configs/rec/ch_ppocr_v1.1/rec_chinese_lite_train_v1.1.yml -o Global.checkpoints={path/to/weights}/best_accuracy TestReader.infer_img=doc/imgs_words/ch/word_1.jpg python3 tools/infer_rec.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml -o Global.checkpoints={path/to/weights}/best_accuracy TestReader.infer_img=doc/imgs_words/ch/word_1.jpg
``` ```
Input image: Input image:
......
doc/imgs_results/2.jpg

148.4 KB | W: | H:

doc/imgs_results/2.jpg

92.2 KB | W: | H:

doc/imgs_results/2.jpg
doc/imgs_results/2.jpg
doc/imgs_results/2.jpg
doc/imgs_results/2.jpg
  • 2-up
  • Swipe
  • Onion skin
doc/imgs_results/det_res_2.jpg

79.5 KB | W: | H:

doc/imgs_results/det_res_2.jpg

77.3 KB | W: | H:

doc/imgs_results/det_res_2.jpg
doc/imgs_results/det_res_2.jpg
doc/imgs_results/det_res_2.jpg
doc/imgs_results/det_res_2.jpg
  • 2-up
  • Swipe
  • Onion skin
doc/imgs_results/det_res_img_10_db.jpg

330.5 KB | W: | H:

doc/imgs_results/det_res_img_10_db.jpg

331.3 KB | W: | H:

doc/imgs_results/det_res_img_10_db.jpg
doc/imgs_results/det_res_img_10_db.jpg
doc/imgs_results/det_res_img_10_db.jpg
doc/imgs_results/det_res_img_10_db.jpg
  • 2-up
  • Swipe
  • Onion skin
doc/joinus.PNG

15.7 KB | W: | H:

doc/joinus.PNG

408.3 KB | W: | H:

doc/joinus.PNG
doc/joinus.PNG
doc/joinus.PNG
doc/joinus.PNG
  • 2-up
  • Swipe
  • Onion skin
...@@ -35,44 +35,45 @@ __all__ = ['PaddleOCR'] ...@@ -35,44 +35,45 @@ __all__ = ['PaddleOCR']
model_urls = { model_urls = {
'det': 'det':
'https://paddleocr.bj.bcebos.com/20-09-22/mobile/det/ch_ppocr_mobile_v1.1_det_infer.tar', 'https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_det_infer.tar',
'rec': { 'rec': {
'ch': { 'ch': {
'url': 'url':
'https://paddleocr.bj.bcebos.com/20-09-22/mobile/rec/ch_ppocr_mobile_v1.1_rec_infer.tar', 'https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_rec_infer.tar',
'dict_path': './ppocr/utils/ppocr_keys_v1.txt' 'dict_path': './ppocr/utils/ppocr_keys_v1.txt'
}, },
'en': { 'en': {
'url': 'url':
'https://paddleocr.bj.bcebos.com/20-09-22/mobile/en/en_ppocr_mobile_v1.1_rec_infer.tar', 'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/en_number_mobile_v2.0_rec_infer.tar',
'dict_path': './ppocr/utils/ic15_dict.txt' 'dict_path': './ppocr/utils/dict/en_dict.txt'
}, },
'french': { 'french': {
'url': 'url':
'https://paddleocr.bj.bcebos.com/20-09-22/mobile/fr/french_ppocr_mobile_v1.1_rec_infer.tar', 'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/french_mobile_v2.0_rec_infer.tar',
'dict_path': './ppocr/utils/dict/french_dict.txt' 'dict_path': './ppocr/utils/dict/french_dict.txt'
}, },
'german': { 'german': {
'url': 'url':
'https://paddleocr.bj.bcebos.com/20-09-22/mobile/ge/german_ppocr_mobile_v1.1_rec_infer.tar', 'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/german_mobile_v2.0_rec_infer.tar',
'dict_path': './ppocr/utils/dict/german_dict.txt' 'dict_path': './ppocr/utils/dict/german_dict.txt'
}, },
'korean': { 'korean': {
'url': 'url':
'https://paddleocr.bj.bcebos.com/20-09-22/mobile/kr/korean_ppocr_mobile_v1.1_rec_infer.tar', 'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/korean_mobile_v2.0_rec_infer.tar',
'dict_path': './ppocr/utils/dict/korean_dict.txt' 'dict_path': './ppocr/utils/dict/korean_dict.txt'
}, },
'japan': { 'japan': {
'url': 'url':
'https://paddleocr.bj.bcebos.com/20-09-22/mobile/jp/japan_ppocr_mobile_v1.1_rec_infer.tar', 'https://paddleocr.bj.bcebos.com/dygraph_v2.0/multilingual/japan_mobile_v2.0_rec_infer.tar',
'dict_path': './ppocr/utils/dict/japan_dict.txt' 'dict_path': './ppocr/utils/dict/japan_dict.txt'
} }
}, },
'cls': 'cls':
'https://paddleocr.bj.bcebos.com/20-09-22/cls/ch_ppocr_mobile_v1.1_cls_infer.tar' 'https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar'
} }
SUPPORT_DET_MODEL = ['DB'] SUPPORT_DET_MODEL = ['DB']
VERSION = 2.0
SUPPORT_REC_MODEL = ['CRNN'] SUPPORT_REC_MODEL = ['CRNN']
BASE_DIR = os.path.expanduser("~/.paddleocr/") BASE_DIR = os.path.expanduser("~/.paddleocr/")
...@@ -94,20 +95,24 @@ def download_with_progressbar(url, save_path): ...@@ -94,20 +95,24 @@ def download_with_progressbar(url, save_path):
def maybe_download(model_storage_directory, url): def maybe_download(model_storage_directory, url):
# using custom model # using custom model
if not os.path.exists(os.path.join( tar_file_name_list = [
model_storage_directory, 'model')) or not os.path.exists( 'inference.pdiparams', 'inference.pdiparams.info', 'inference.pdmodel'
os.path.join(model_storage_directory, 'params')): ]
if not os.path.exists(
os.path.join(model_storage_directory, 'inference.pdiparams')
) or not os.path.exists(
os.path.join(model_storage_directory, 'inference.pdmodel')):
tmp_path = os.path.join(model_storage_directory, url.split('/')[-1]) tmp_path = os.path.join(model_storage_directory, url.split('/')[-1])
print('download {} to {}'.format(url, tmp_path)) print('download {} to {}'.format(url, tmp_path))
os.makedirs(model_storage_directory, exist_ok=True) os.makedirs(model_storage_directory, exist_ok=True)
download_with_progressbar(url, tmp_path) download_with_progressbar(url, tmp_path)
with tarfile.open(tmp_path, 'r') as tarObj: with tarfile.open(tmp_path, 'r') as tarObj:
for member in tarObj.getmembers(): for member in tarObj.getmembers():
if "model" in member.name: filename = None
filename = 'model' for tar_file_name in tar_file_name_list:
elif "params" in member.name: if tar_file_name in member.name:
filename = 'params' filename = tar_file_name
else: if filename is None:
continue continue
file = tarObj.extractfile(member) file = tarObj.extractfile(member)
with open( with open(
...@@ -176,43 +181,43 @@ def parse_args(mMain=True, add_help=True): ...@@ -176,43 +181,43 @@ def parse_args(mMain=True, add_help=True):
parser.add_argument("--use_angle_cls", type=str2bool, default=False) parser.add_argument("--use_angle_cls", type=str2bool, default=False)
return parser.parse_args() return parser.parse_args()
else: else:
return argparse.Namespace(use_gpu=True, return argparse.Namespace(
ir_optim=True, use_gpu=True,
use_tensorrt=False, ir_optim=True,
gpu_mem=8000, use_tensorrt=False,
image_dir='', gpu_mem=8000,
det_algorithm='DB', image_dir='',
det_model_dir=None, det_algorithm='DB',
det_limit_side_len=960, det_model_dir=None,
det_limit_type='max', det_limit_side_len=960,
det_db_thresh=0.3, det_limit_type='max',
det_db_box_thresh=0.5, det_db_thresh=0.3,
det_db_unclip_ratio=2.0, det_db_box_thresh=0.5,
det_east_score_thresh=0.8, det_db_unclip_ratio=2.0,
det_east_cover_thresh=0.1, det_east_score_thresh=0.8,
det_east_nms_thresh=0.2, det_east_cover_thresh=0.1,
rec_algorithm='CRNN', det_east_nms_thresh=0.2,
rec_model_dir=None, rec_algorithm='CRNN',
rec_image_shape="3, 32, 320", rec_model_dir=None,
rec_char_type='ch', rec_image_shape="3, 32, 320",
rec_batch_num=30, rec_char_type='ch',
max_text_length=25, rec_batch_num=30,
rec_char_dict_path=None, max_text_length=25,
use_space_char=True, rec_char_dict_path=None,
drop_score=0.5, use_space_char=True,
cls_model_dir=None, drop_score=0.5,
cls_image_shape="3, 48, 192", cls_model_dir=None,
label_list=['0', '180'], cls_image_shape="3, 48, 192",
cls_batch_num=30, label_list=['0', '180'],
cls_thresh=0.9, cls_batch_num=30,
enable_mkldnn=False, cls_thresh=0.9,
use_zero_copy_run=False, enable_mkldnn=False,
use_pdserving=False, use_zero_copy_run=False,
lang='ch', use_pdserving=False,
det=True, lang='ch',
rec=True, det=True,
use_angle_cls=False rec=True,
) use_angle_cls=False)
class PaddleOCR(predict_system.TextSystem): class PaddleOCR(predict_system.TextSystem):
...@@ -228,19 +233,21 @@ class PaddleOCR(predict_system.TextSystem): ...@@ -228,19 +233,21 @@ class PaddleOCR(predict_system.TextSystem):
lang = postprocess_params.lang lang = postprocess_params.lang
assert lang in model_urls[ assert lang in model_urls[
'rec'], 'param lang must in {}, but got {}'.format( 'rec'], 'param lang must in {}, but got {}'.format(
model_urls['rec'].keys(), lang) model_urls['rec'].keys(), lang)
if postprocess_params.rec_char_dict_path is None: if postprocess_params.rec_char_dict_path is None:
postprocess_params.rec_char_dict_path = model_urls['rec'][lang][ postprocess_params.rec_char_dict_path = model_urls['rec'][lang][
'dict_path'] 'dict_path']
# init model dir # init model dir
if postprocess_params.det_model_dir is None: if postprocess_params.det_model_dir is None:
postprocess_params.det_model_dir = os.path.join(BASE_DIR, 'det') postprocess_params.det_model_dir = os.path.join(
BASE_DIR, '{}/det'.format(VERSION))
if postprocess_params.rec_model_dir is None: if postprocess_params.rec_model_dir is None:
postprocess_params.rec_model_dir = os.path.join( postprocess_params.rec_model_dir = os.path.join(
BASE_DIR, 'rec/{}'.format(lang)) BASE_DIR, '{}/rec/{}'.format(VERSION, lang))
if postprocess_params.cls_model_dir is None: if postprocess_params.cls_model_dir is None:
postprocess_params.cls_model_dir = os.path.join(BASE_DIR, 'cls') postprocess_params.cls_model_dir = os.path.join(
BASE_DIR, '{}/cls'.format(VERSION))
print(postprocess_params) print(postprocess_params)
# download model # download model
maybe_download(postprocess_params.det_model_dir, model_urls['det']) maybe_download(postprocess_params.det_model_dir, model_urls['det'])
......
...@@ -32,9 +32,8 @@ class ClsMetric(object): ...@@ -32,9 +32,8 @@ class ClsMetric(object):
def get_metric(self): def get_metric(self):
""" """
return metircs { return metrics {
'acc': 0, 'acc': 0
'norm_edit_dis': 0,
} }
""" """
acc = self.correct_num / self.all_num acc = self.correct_num / self.all_num
......
...@@ -57,7 +57,7 @@ class DetMetric(object): ...@@ -57,7 +57,7 @@ class DetMetric(object):
def get_metric(self): def get_metric(self):
""" """
return metircs { return metrics {
'precision': 0, 'precision': 0,
'recall': 0, 'recall': 0,
'hmean': 0 'hmean': 0
......
...@@ -43,7 +43,7 @@ class RecMetric(object): ...@@ -43,7 +43,7 @@ class RecMetric(object):
def get_metric(self): def get_metric(self):
""" """
return metircs { return metrics {
'acc': 0, 'acc': 0,
'norm_edit_dis': 0, 'norm_edit_dis': 0,
} }
......
...@@ -40,7 +40,7 @@ class DBPostProcess(object): ...@@ -40,7 +40,7 @@ class DBPostProcess(object):
self.max_candidates = max_candidates self.max_candidates = max_candidates
self.unclip_ratio = unclip_ratio self.unclip_ratio = unclip_ratio
self.min_size = 3 self.min_size = 3
self.dilation_kernel = None if not use_dilation else [[1, 1], [1, 1]] self.dilation_kernel = None if not use_dilation else np.array([[1, 1], [1, 1]])
def boxes_from_bitmap(self, pred, _bitmap, dest_width, dest_height): def boxes_from_bitmap(self, pred, _bitmap, dest_width, dest_height):
''' '''
......
0
1
2
3
4
5
6
7
8
9
a
b
c
d
e
f
g
h
i
j
k
l
m
n
o
p
q
r
s
t
u
v
w
x
y
z
A
B
C
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
S
T
U
V
W
X
Y
Z
...@@ -132,4 +132,5 @@ j ...@@ -132,4 +132,5 @@ j
³ ³
Å Å
$ $
# #
\ No newline at end of file
...@@ -123,4 +123,5 @@ z ...@@ -123,4 +123,5 @@ z
â â
å å
æ æ
é é
\ No newline at end of file
...@@ -4395,4 +4395,5 @@ z ...@@ -4395,4 +4395,5 @@ z
\ No newline at end of file
...@@ -179,7 +179,7 @@ z ...@@ -179,7 +179,7 @@ z
с с
т т
я я
...@@ -3684,4 +3684,5 @@ z ...@@ -3684,4 +3684,5 @@ z
\ No newline at end of file
...@@ -33,4 +33,4 @@ v ...@@ -33,4 +33,4 @@ v
w w
x x
y y
z z
\ No newline at end of file
...@@ -28,37 +28,16 @@ from ppocr.modeling.architectures import build_model ...@@ -28,37 +28,16 @@ from ppocr.modeling.architectures import build_model
from ppocr.postprocess import build_post_process from ppocr.postprocess import build_post_process
from ppocr.utils.save_load import init_model from ppocr.utils.save_load import init_model
from ppocr.utils.logging import get_logger from ppocr.utils.logging import get_logger
from tools.program import load_config from tools.program import load_config, merge_config, ArgsParser
def parse_args():
parser = argparse.ArgumentParser()
parser.add_argument("-c", "--config", help="configuration file to use")
parser.add_argument(
"-o", "--output_path", type=str, default='./output/infer/')
return parser.parse_args()
class Model(paddle.nn.Layer):
def __init__(self, model):
super(Model, self).__init__()
self.pre_model = model
# Please modify the 'shape' according to actual needs
@to_static(input_spec=[
paddle.static.InputSpec(
shape=[None, 3, 640, 640], dtype='float32')
])
def forward(self, inputs):
x = self.pre_model(inputs)
return x
def main(): def main():
FLAGS = parse_args() FLAGS = ArgsParser().parse_args()
config = load_config(FLAGS.config) config = load_config(FLAGS.config)
merge_config(FLAGS.opt)
logger = get_logger() logger = get_logger()
# build post process # build post process
post_process_class = build_post_process(config['PostProcess'], post_process_class = build_post_process(config['PostProcess'],
config['Global']) config['Global'])
...@@ -71,9 +50,15 @@ def main(): ...@@ -71,9 +50,15 @@ def main():
init_model(config, model, logger) init_model(config, model, logger)
model.eval() model.eval()
model = Model(model) save_path = '{}/inference'.format(config['Global']['save_inference_dir'])
save_path = '{}/{}'.format(FLAGS.output_path, infer_shape = [3, 32, 100] if config['Architecture'][
config['Architecture']['model_type']) 'model_type'] != "det" else [3, 640, 640]
model = to_static(
model,
input_spec=[
paddle.static.InputSpec(
shape=[None] + infer_shape, dtype='float32')
])
paddle.jit.save(model, save_path) paddle.jit.save(model, save_path)
logger.info('inference model is saved to {}'.format(save_path)) logger.info('inference model is saved to {}'.format(save_path))
......
...@@ -63,6 +63,7 @@ class TextDetector(object): ...@@ -63,6 +63,7 @@ class TextDetector(object):
postprocess_params["box_thresh"] = args.det_db_box_thresh postprocess_params["box_thresh"] = args.det_db_box_thresh
postprocess_params["max_candidates"] = 1000 postprocess_params["max_candidates"] = 1000
postprocess_params["unclip_ratio"] = args.det_db_unclip_ratio postprocess_params["unclip_ratio"] = args.det_db_unclip_ratio
postprocess_params["use_dilation"] = True
else: else:
logger.info("unknown det_algorithm:{}".format(self.det_algorithm)) logger.info("unknown det_algorithm:{}".format(self.det_algorithm))
sys.exit(0) sys.exit(0)
...@@ -111,7 +112,7 @@ class TextDetector(object): ...@@ -111,7 +112,7 @@ class TextDetector(object):
box = self.clip_det_res(box, img_height, img_width) box = self.clip_det_res(box, img_height, img_width)
rect_width = int(np.linalg.norm(box[0] - box[1])) rect_width = int(np.linalg.norm(box[0] - box[1]))
rect_height = int(np.linalg.norm(box[0] - box[3])) rect_height = int(np.linalg.norm(box[0] - box[3]))
if rect_width <= 10 or rect_height <= 10: if rect_width <= 3 or rect_height <= 3:
continue continue
dt_boxes_new.append(box) dt_boxes_new.append(box)
dt_boxes = np.array(dt_boxes_new) dt_boxes = np.array(dt_boxes_new)
...@@ -186,4 +187,4 @@ if __name__ == "__main__": ...@@ -186,4 +187,4 @@ if __name__ == "__main__":
cv2.imwrite(img_path, src_im) cv2.imwrite(img_path, src_im)
logger.info("The visualized image saved in {}".format(img_path)) logger.info("The visualized image saved in {}".format(img_path))
if count > 1: if count > 1:
logger.info("Avg Time:", total_time / (count - 1)) logger.info("Avg Time: {}".format(total_time / (count - 1)))
...@@ -100,8 +100,8 @@ def create_predictor(args, mode, logger): ...@@ -100,8 +100,8 @@ def create_predictor(args, mode, logger):
if model_dir is None: if model_dir is None:
logger.info("not find {} model file path {}".format(mode, model_dir)) logger.info("not find {} model file path {}".format(mode, model_dir))
sys.exit(0) sys.exit(0)
model_file_path = model_dir + "/model" model_file_path = model_dir + "/inference.pdmodel"
params_file_path = model_dir + "/params" params_file_path = model_dir + "/inference.pdiparams"
if not os.path.exists(model_file_path): if not os.path.exists(model_file_path):
logger.info("not find model file path {}".format(model_file_path)) logger.info("not find model file path {}".format(model_file_path))
sys.exit(0) sys.exit(0)
......
...@@ -113,7 +113,6 @@ def merge_config(config): ...@@ -113,7 +113,6 @@ def merge_config(config):
global_config.keys(), sub_keys[0]) global_config.keys(), sub_keys[0])
cur = global_config[sub_keys[0]] cur = global_config[sub_keys[0]]
for idx, sub_key in enumerate(sub_keys[1:]): for idx, sub_key in enumerate(sub_keys[1:]):
assert (sub_key in cur)
if idx == len(sub_keys) - 2: if idx == len(sub_keys) - 2:
cur[sub_key] = value cur[sub_key] = value
else: else:
......
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册