weixin_41840029 / PaddleOCR (forked from PaddlePaddle / PaddleOCR; in sync with the fork source)
Commit 2e98890d
Authored Jul 29, 2021 by WenmuZhou

add pse config

Parent: a739abab

Showing 4 changed files with 174 additions and 33 deletions (+174, -33)
configs/det/det_mv3_pse.yml (+135, -0)
configs/det/det_r50_vd_pse.yml (+15, -16)
doc/doc_ch/algorithm_overview.md (+13, -9)
doc/doc_en/algorithm_overview_en.md (+11, -8)
configs/det/det_mv3_pse.yml (new file, mode 0 → 100644; view file @ 2e98890d)
Global:
  use_gpu: true
  epoch_num: 600
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/det_mv3_pse/
  save_epoch_step: 600
  # evaluation is run every 63 iterations
  eval_batch_step: [ 0, 63 ]
  cal_metric_during_train: False
  pretrained_model: ./pretrain_models/MobileNetV3_large_x0_5_pretrained
  checkpoints: #./output/det_r50_vd_pse_batch8_ColorJitter/best_accuracy
  save_inference_dir:
  use_visualdl: False
  infer_img: doc/imgs_en/img_10.jpg
  save_res_path: ./output/det_pse/predicts_pse.txt

Architecture:
  model_type: det
  algorithm: PSE
  Transform: null
  Backbone:
    name: MobileNetV3
    scale: 0.5
    model_name: large
  Neck:
    name: FPN
    out_channels: 96
  Head:
    name: PSEHead
    hidden_dim: 96
    out_channels: 7

Loss:
  name: PSELoss
  alpha: 0.7
  ohem_ratio: 3
  kernel_sample_mask: pred
  reduction: none

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Step
    learning_rate: 0.001
    step_size: 200
    gamma: 0.1
  regularizer:
    name: 'L2'
    factor: 0.0005

PostProcess:
  name: PSEPostProcess
  thresh: 0
  box_thresh: 0.85
  min_area: 16
  box_type: box # 'box' or 'poly'
  scale: 1

Metric:
  name: DetMetric
  main_indicator: hmean

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/icdar2015/text_localization/
    label_file_list:
      - ./train_data/icdar2015/text_localization/train_icdar2015_label.txt
    ratio_list: [ 1.0 ]
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - DetLabelEncode: # Class handling label
      - ColorJitter:
          brightness: 0.12549019607843137
          saturation: 0.5
      - IaaAugment:
          augmenter_args:
            - { 'type': Resize, 'args': { 'size': [ 0.5, 3 ] } }
            - { 'type': Fliplr, 'args': { 'p': 0.5 } }
            - { 'type': Affine, 'args': { 'rotate': [ -10, 10 ] } }
      - MakePseGt:
          kernel_num: 7
          min_shrink_ratio: 0.4
          size: 640
      - RandomCropImgMask:
          size: [ 640, 640 ]
          main_key: gt_text
          crop_keys: [ 'image', 'gt_text', 'gt_kernels', 'mask' ]
      - NormalizeImage:
          scale: 1./255.
          mean: [ 0.485, 0.456, 0.406 ]
          std: [ 0.229, 0.224, 0.225 ]
          order: 'hwc'
      - ToCHWImage:
      - KeepKeys:
          keep_keys: [ 'image', 'gt_text', 'gt_kernels', 'mask' ] # the order of the dataloader list
  loader:
    shuffle: True
    drop_last: False
    batch_size_per_card: 16
    num_workers: 8

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/icdar2015/text_localization/
    label_file_list:
      - ./train_data/icdar2015/text_localization/test_icdar2015_label.txt
    ratio_list: [ 1.0 ]
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - DetLabelEncode: # Class handling label
      - DetResizeForTest:
          limit_side_len: 736
          limit_type: min
      - NormalizeImage:
          scale: 1./255.
          mean: [ 0.485, 0.456, 0.406 ]
          std: [ 0.229, 0.224, 0.225 ]
          order: 'hwc'
      - ToCHWImage:
      - KeepKeys:
          keep_keys: [ 'image', 'shape', 'polys', 'ignore_tags' ]
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 1 # must be 1
    num_workers: 8
\ No newline at end of file
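As a rough illustration of the `Optimizer.lr` block in the config above, the following sketch computes a plain step-decay learning rate from `learning_rate: 0.001`, `step_size: 200`, and `gamma: 0.1` over the configured 600 epochs. It assumes that the `Step` scheduler applies standard step decay per epoch (multiply by `gamma` after every `step_size` epochs); the function name `step_lr` is hypothetical and is not part of PaddleOCR.

```python
def step_lr(epoch: int,
            base_lr: float = 0.001,
            step_size: int = 200,
            gamma: float = 0.1) -> float:
    """Standard step decay (assumed behaviour of the `Step` scheduler).

    The learning rate is multiplied by `gamma` once for every
    `step_size` epochs that have been completed.
    """
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    if step_size <= 0:
        raise ValueError("step_size must be positive")
    return base_lr * gamma ** (epoch // step_size)


if __name__ == "__main__":
    # Print the learning rate at a few points of the 600-epoch schedule.
    for epoch in (0, 199, 200, 400, 599):
        print(f"epoch {epoch:3d}: lr = {step_lr(epoch):.6f}")
```

Under these assumptions the rate drops by a factor of ten at epochs 200 and 400, so the last third of training runs at one hundredth of the initial rate.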
configs/det/det_r50_vd_pse.yml (view file @ 2e98890d)
...
...
@@ -4,16 +4,16 @@ Global:
   log_smooth_window: 20
   print_batch_step: 10
   save_model_dir: ./output/det_r50_vd_pse/
-  save_epoch_step: 1200
-  # evaluation is run every 2000 iterations
-  eval_batch_step: [ 0, 125 ]
+  save_epoch_step: 600
+  # evaluation is run every 125 iterations
+  eval_batch_step: [ 0, 125 ]
   cal_metric_during_train: False
-  pretrained_model: /ssd1/zhoujun20/fuxian/ResNet50_vd_ssld_pretrained
+  pretrained_model: ./pretrain_models/ResNet50_vd_ssld_pretrained
   checkpoints: #./output/det_r50_vd_pse_batch8_ColorJitter/best_accuracy
   save_inference_dir:
   use_visualdl: False
   infer_img: doc/imgs_en/img_10.jpg
-  save_res_path: ./output/det_db/predicts_db.txt
+  save_res_path: ./output/det_pse/predicts_pse.txt
 Architecture:
   model_type: det
...
...
@@ -68,7 +68,7 @@ Train:
     data_dir: ./train_data/icdar2015/text_localization/
     label_file_list:
       - ./train_data/icdar2015/text_localization/train_icdar2015_label.txt
-    ratio_list: [ 1.0 ]
+    ratio_list: [ 1.0 ]
     transforms:
       - DecodeImage: # load image
           img_mode: BGR
...
...
@@ -81,23 +81,23 @@ Train:
           augmenter_args:
             - { 'type': Resize, 'args': { 'size': [ 0.5, 3 ] } }
             - { 'type': Fliplr, 'args': { 'p': 0.5 } }
-            - { 'type': Affine, 'args': { 'rotate': [ -10, 10 ] } }
+            - { 'type': Affine, 'args': { 'rotate': [ -10, 10 ] } }
       - MakePseGt:
           kernel_num: 7
           min_shrink_ratio: 0.4
           size: 640
       - RandomCropImgMask:
-          size: [ 640, 640 ]
+          size: [ 640, 640 ]
           main_key: gt_text
-          crop_keys: [ 'image', 'gt_text', 'gt_kernels', 'mask' ]
+          crop_keys: [ 'image', 'gt_text', 'gt_kernels', 'mask' ]
       - NormalizeImage:
           scale: 1./255.
-          mean: [ 0.485, 0.456, 0.406 ]
-          std: [ 0.229, 0.224, 0.225 ]
+          mean: [ 0.485, 0.456, 0.406 ]
+          std: [ 0.229, 0.224, 0.225 ]
           order: 'hwc'
       - ToCHWImage:
       - KeepKeys:
-          keep_keys: [ 'image', 'gt_text', 'gt_kernels', 'mask' ] # the order of the dataloader list
+          keep_keys: [ 'image', 'gt_text', 'gt_kernels', 'mask' ] # the order of the dataloader list
   loader:
     shuffle: True
     drop_last: False
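The `MakePseGt` settings in this hunk (`kernel_num: 7`, `min_shrink_ratio: 0.4`) control how many shrunk text kernels are generated as ground truth. A minimal sketch, assuming the linear shrink-ratio schedule described in the PSENet paper, with ratios evenly spaced from 1.0 (the full text region) down to `min_shrink_ratio` (the smallest kernel). `pse_shrink_ratios` is a hypothetical helper, not a PaddleOCR function, and the actual transform may index or order the kernels differently.

```python
from typing import List


def pse_shrink_ratios(kernel_num: int = 7,
                      min_shrink_ratio: float = 0.4) -> List[float]:
    """Return evenly spaced shrink ratios from 1.0 down to min_shrink_ratio.

    Assumption: a linear schedule as in the PSENet paper. Index 0 is the
    unshrunk text region, the last index is the smallest kernel.
    """
    if kernel_num < 2:
        raise ValueError("kernel_num must be at least 2")
    if not 0.0 < min_shrink_ratio <= 1.0:
        raise ValueError("min_shrink_ratio must be in (0, 1]")
    step = (1.0 - min_shrink_ratio) / (kernel_num - 1)
    return [1.0 - step * i for i in range(kernel_num)]


if __name__ == "__main__":
    for i, ratio in enumerate(pse_shrink_ratios()):
        print(f"kernel {i}: shrink ratio {ratio:.2f}")
```

With seven kernels and a minimum ratio of 0.4, consecutive ratios differ by 0.1 under this schedule, which matches `out_channels: 7` in the PSE head: one output map per kernel.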
...
...
@@ -119,15 +119,14 @@ Eval:
       - DetResizeForTest:
           limit_side_len: 736
           limit_type: min
-#          resize_long: 2240
       - NormalizeImage:
           scale: 1./255.
-          mean: [ 0.485, 0.456, 0.406 ]
-          std: [ 0.229, 0.224, 0.225 ]
+          mean: [ 0.485, 0.456, 0.406 ]
+          std: [ 0.229, 0.224, 0.225 ]
           order: 'hwc'
       - ToCHWImage:
       - KeepKeys:
-          keep_keys: [ 'image', 'shape', 'polys', 'ignore_tags' ]
+          keep_keys: [ 'image', 'shape', 'polys', 'ignore_tags' ]
   loader:
     shuffle: False
     drop_last: False
...
...
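The `DetResizeForTest` step in the Eval hunk above uses `limit_side_len: 736` with `limit_type: min`. The sketch below shows one plausible reading of those two settings: upscale the image so that its shorter side reaches at least 736 pixels, then snap both sides to a multiple of 32 so that the feature pyramid strides divide evenly. Both the upscale-only behaviour and the multiple-of-32 rounding are assumptions here, and `resized_shape` is a hypothetical helper rather than the PaddleOCR implementation.

```python
from typing import Tuple


def resized_shape(height: int,
                  width: int,
                  limit_side_len: int = 736,
                  stride: int = 32) -> Tuple[int, int]:
    """Compute a target (height, width) for evaluation-time resizing.

    Assumptions: with limit_type 'min' the image is only ever scaled up,
    and each side is rounded to the nearest multiple of `stride`.
    """
    if height <= 0 or width <= 0:
        raise ValueError("height and width must be positive")
    short_side = min(height, width)
    # Scale up only when the shorter side is below the limit.
    ratio = limit_side_len / short_side if short_side < limit_side_len else 1.0
    new_h = max(int(round(height * ratio / stride)) * stride, stride)
    new_w = max(int(round(width * ratio / stride)) * stride, stride)
    return new_h, new_w


if __name__ == "__main__":
    print(resized_shape(720, 1280))
    print(resized_shape(1024, 2048))
```

Because the output size depends on each input image, batches cannot be stacked without padding, which is consistent with the `batch_size_per_card: 1 # must be 1` note in the Eval loader.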
doc/doc_ch/algorithm_overview.md (view file @ 2e98890d)
...
...
@@ -9,11 +9,13 @@
 ### 1. Text detection algorithms
 List of text detection algorithms open-sourced by PaddleOCR:
-- [x] DB([paper](https://arxiv.org/abs/1911.08947)) [2](recommended by ppocr)
-- [x] EAST([paper](https://arxiv.org/abs/1704.03155))[1]
-- [x] SAST([paper](https://arxiv.org/abs/1908.05498))[4]
+- [x] DB([paper](https://arxiv.org/abs/1911.08947))(recommended by ppocr)
+- [x] EAST([paper](https://arxiv.org/abs/1704.03155))
+- [x] SAST([paper](https://arxiv.org/abs/1908.05498))
+- [x] PSENet([paper](https://arxiv.org/abs/1903.12473v2))
 Results on the ICDAR2015 public text detection dataset:
 |Model|Backbone|precision|recall|Hmean|Download link|
 | --- | --- | --- | --- | --- | --- |
 |EAST|ResNet50_vd|85.80%|86.71%|86.25%|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_east_v2.0_train.tar)|
...
...
@@ -21,6 +23,8 @@ List of text detection algorithms open-sourced by PaddleOCR:
 |DB|ResNet50_vd|86.41%|78.72%|82.38%|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_db_v2.0_train.tar)|
 |DB|MobileNetV3|77.29%|73.08%|75.12%|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_mv3_db_v2.0_train.tar)|
 |SAST|ResNet50_vd|91.39%|83.77%|87.42%|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_sast_icdar15_v2.0_train.tar)|
+|PSE|ResNet50_vd|85.81%|79.53%|82.55%|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_r50_vd_pse_v2.0_train.tar)|
+|PSE|MobileNetV3|82.20%|70.47%|75.89%|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_mv3_pse_v2.0_train.tar)|
 Results on the Total-text public text detection dataset:
...
...
@@ -39,13 +43,13 @@ PaddleOCR文本检测算法的训练和使用请参考文档教程中[模型训
 ### 2. Text recognition algorithms
 List of text recognition algorithms open-sourced by PaddleOCR on the dynamic graph:
-- [x] CRNN([paper](https://arxiv.org/abs/1507.05717))[7](recommended by ppocr)
-- [x] Rosetta([paper](https://arxiv.org/abs/1910.05085))[10]
-- [x] STAR-Net([paper](http://www.bmva.org/bmvc/2016/papers/paper043/index.html))[11]
-- [x] RARE([paper](https://arxiv.org/abs/1603.03915v1))[12]
-- [x] SRN([paper](https://arxiv.org/abs/2003.12294))[5]
+- [x] CRNN([paper](https://arxiv.org/abs/1507.05717))(recommended by ppocr)
+- [x] Rosetta([paper](https://arxiv.org/abs/1910.05085))
+- [x] STAR-Net([paper](http://www.bmva.org/bmvc/2016/papers/paper043/index.html))
+- [x] RARE([paper](https://arxiv.org/abs/1603.03915v1))
+- [x] SRN([paper](https://arxiv.org/abs/2003.12294))
-Following the [DTRB][3](https://arxiv.org/abs/1904.01906) text recognition training and evaluation procedure, the models are trained on the MJSynth and SynthText text recognition datasets and evaluated on the IIIT, SVT, IC03, IC13, IC15, SVTP, and CUTE datasets. The results are as follows:
+Following the [DTRB](https://arxiv.org/abs/1904.01906) text recognition training and evaluation procedure, the models are trained on the MJSynth and SynthText text recognition datasets and evaluated on the IIIT, SVT, IC03, IC13, IC15, SVTP, and CUTE datasets. The results are as follows:
 |Model|Backbone|Avg Accuracy|Model storage name|Download link|
 |---|---|---|---|---|
...
...
doc/doc_en/algorithm_overview_en.md (view file @ 2e98890d)
...
...
@@ -11,9 +11,10 @@ This tutorial lists the text detection algorithms and text recognition algorithm
 ### 1. Text Detection Algorithm
 PaddleOCR open source text detection algorithms list:
-- [x] EAST([paper](https://arxiv.org/abs/1704.03155))[2]
-- [x] DB([paper](https://arxiv.org/abs/1911.08947))[1]
-- [x] SAST([paper](https://arxiv.org/abs/1908.05498))[4]
+- [x] EAST([paper](https://arxiv.org/abs/1704.03155))
+- [x] DB([paper](https://arxiv.org/abs/1911.08947))
+- [x] SAST([paper](https://arxiv.org/abs/1908.05498))
+- [x] PSE([paper](https://arxiv.org/abs/1903.12473v2))
 On the ICDAR2015 dataset, the text detection result is as follows:
...
...
@@ -24,6 +25,8 @@ On the ICDAR2015 dataset, the text detection result is as follows:
 |DB|ResNet50_vd|86.41%|78.72%|82.38%|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_db_v2.0_train.tar)|
 |DB|MobileNetV3|77.29%|73.08%|75.12%|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_mv3_db_v2.0_train.tar)|
 |SAST|ResNet50_vd|91.39%|83.77%|87.42%|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.0/en/det_r50_vd_sast_icdar15_v2.0_train.tar)|
+|PSE|ResNet50_vd|85.81%|79.53%|82.55%|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_r50_vd_pse_v2.0_train.tar)|
+|PSE|MobileNetV3|82.20%|70.47%|75.89%|[Download link](https://paddleocr.bj.bcebos.com/dygraph_v2.1/en_det/det_mv3_pse_v2.0_train.tar)|
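The Hmean column in the rows above is the harmonic mean of precision and recall, which is also the `main_indicator: hmean` that the PSE configs select. A small sketch that recomputes it for the two new PSE rows, assuming the standard formula 2PR / (P + R); the helper name `hmean` is illustrative. Because the table lists precision and recall already rounded to two decimals, the recomputed value can differ from the published one in the last digit.

```python
def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both in the same units)."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Precision and recall (in percent) taken from the PSE rows of the table.
    rows = {
        "PSE / ResNet50_vd": (85.81, 79.53),
        "PSE / MobileNetV3": (82.20, 70.47),
    }
    for name, (p, r) in rows.items():
        print(f"{name}: Hmean = {hmean(p, r):.2f}%")
```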
On Total-Text dataset, the text detection result is as follows:
...
...
@@ -41,11 +44,11 @@ For the training guide and use of PaddleOCR text detection algorithms, please re
 ### 2. Text Recognition Algorithm
 PaddleOCR open-source text recognition algorithms list:
-- [x] CRNN([paper](https://arxiv.org/abs/1507.05717))[7]
-- [x] Rosetta([paper](https://arxiv.org/abs/1910.05085))[10]
-- [x] STAR-Net([paper](http://www.bmva.org/bmvc/2016/papers/paper043/index.html))[11]
-- [x] RARE([paper](https://arxiv.org/abs/1603.03915v1))[12]
-- [x] SRN([paper](https://arxiv.org/abs/2003.12294))[5]
+- [x] CRNN([paper](https://arxiv.org/abs/1507.05717))
+- [x] Rosetta([paper](https://arxiv.org/abs/1910.05085))
+- [x] STAR-Net([paper](http://www.bmva.org/bmvc/2016/papers/paper043/index.html))
+- [x] RARE([paper](https://arxiv.org/abs/1603.03915v1))
+- [x] SRN([paper](https://arxiv.org/abs/2003.12294))
 Refer to [DTRB](https://arxiv.org/abs/1904.01906), the training and evaluation result of these above text recognition (using MJSynth and SynthText for training, evaluate on IIIT, SVT, IC03, IC13, IC15, SVTP, CUTE) is as follow:
...
...