PaddlePaddle / PaddleOCR
Commit b4800ad2 (unverified)
Authored Nov 10, 2021 by zhoujun
Committed Nov 10, 2021 by GitHub
fix gap in table structure train model and inference model (#4566)
Parent: b8c8b64f
Showing 2 changed files with 33 additions and 24 deletions (+33 -24)
configs/table/table_mv3.yml  +9 -8
ppocr/modeling/heads/table_att_head.py  +24 -16
configs/table/table_mv3.yml @ b4800ad2
 Global:
   use_gpu: true
-  epoch_num: 50
+  epoch_num: 400
   log_smooth_window: 20
   print_batch_step: 5
   save_model_dir: ./output/table_mv3/
-  save_epoch_step: 5
+  save_epoch_step: 3
   # evaluation is run every 400 iterations after the 0th iteration
   eval_batch_step: [0, 400]
   cal_metric_during_train: True
   pretrained_model:
   checkpoints:
   save_inference_dir:
   use_visualdl: False
-  infer_img: doc/imgs_words/ch/word_1.jpg
+  infer_img: doc/table/table.jpg
   # for data or label process
   character_dict_path: ppocr/utils/dict/table_structure_dict.txt
   character_type: en
   max_text_length: 100
-  max_elem_length: 500
+  max_elem_length: 800
   max_cell_num: 500
   infer_mode: False
   process_total_num: 0
   process_cut_num: 0
 Optimizer:
   name: Adam
   beta1: 0.9
@@ -41,13 +40,15 @@ Architecture:
   Backbone:
     name: MobileNetV3
     scale: 1.0
-    model_name: small
+    model_name: large
+    disable_se: True
   Head:
     name: TableAttentionHead
     hidden_size: 256
     l2_decay: 0.00001
     loc_type: 2
+    max_text_length: 100
+    max_elem_length: 800
+    max_cell_num: 500
 Loss:
   name: TableAttentionLoss
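The three keys added under `Head` are meant to reach the head's constructor as keyword arguments. A minimal sketch of that wiring follows, assuming a builder that drops `name` and forwards the remaining keys. `HeadStub`, `build_head`, and `check_lengths_agree` are hypothetical stand-ins written for this illustration, not PaddleOCR APIs, and the channel count is made up. The second helper shows one way to confirm that the `Global` and `Head` length limits stay in sync.

```python
# Hypothetical stand-in for TableAttentionHead: it only records the
# length limits, mirroring the constructor signature after this commit.
class HeadStub:
    def __init__(self, in_channels, hidden_size, loc_type, in_max_len=488,
                 max_text_length=100, max_elem_length=800, max_cell_num=500,
                 **kwargs):
        self.max_text_length = max_text_length
        self.max_elem_length = max_elem_length
        self.max_cell_num = max_cell_num


def build_head(head_cfg, in_channels):
    """Assumed builder behaviour: drop `name`, pass the rest as kwargs."""
    cfg = dict(head_cfg)  # copy so the caller's config is not mutated
    cfg.pop("name")
    return HeadStub(in_channels=in_channels, **cfg)


LENGTH_KEYS = ("max_text_length", "max_elem_length", "max_cell_num")


def check_lengths_agree(global_cfg, head_cfg):
    """Return the keys whose Global and Head values differ."""
    return [k for k in LENGTH_KEYS if global_cfg.get(k) != head_cfg.get(k)]


# Values copied from the new configs/table/table_mv3.yml.
global_cfg = {"max_text_length": 100, "max_elem_length": 800,
              "max_cell_num": 500}
head_cfg = {
    "name": "TableAttentionHead",
    "hidden_size": 256,
    "l2_decay": 0.00001,
    "loc_type": 2,
    "max_text_length": 100,
    "max_elem_length": 800,
    "max_cell_num": 500,
}

head = build_head(head_cfg, in_channels=[96])  # channel count is illustrative
mismatches = check_lengths_agree(global_cfg, head_cfg)
```

With the values from this commit, `mismatches` is empty. Keys the stub does not name, such as `l2_decay`, fall into `**kwargs`.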
ppocr/modeling/heads/table_att_head.py @ b4800ad2
@@ -23,32 +23,40 @@ import numpy as np
 class TableAttentionHead(nn.Layer):
-    def __init__(self, in_channels, hidden_size, loc_type, in_max_len=488, **kwargs):
+    def __init__(self,
+                 in_channels,
+                 hidden_size,
+                 loc_type,
+                 in_max_len=488,
+                 max_text_length=100,
+                 max_elem_length=800,
+                 max_cell_num=500,
+                 **kwargs):
         super(TableAttentionHead, self).__init__()
         self.input_size = in_channels[-1]
         self.hidden_size = hidden_size
         self.elem_num = 30
-        self.max_text_length = 100
-        self.max_elem_length = 500
-        self.max_cell_num = 500
+        self.max_text_length = max_text_length
+        self.max_elem_length = max_elem_length
+        self.max_cell_num = max_cell_num
         self.structure_attention_cell = AttentionGRUCell(
             self.input_size, hidden_size, self.elem_num, use_gru=False)
         self.structure_generator = nn.Linear(hidden_size, self.elem_num)
         self.loc_type = loc_type
         self.in_max_len = in_max_len
         if self.loc_type == 1:
             self.loc_generator = nn.Linear(hidden_size, 4)
         else:
             if self.in_max_len == 640:
                 self.loc_fea_trans = nn.Linear(400, self.max_elem_length + 1)
             elif self.in_max_len == 800:
                 self.loc_fea_trans = nn.Linear(625, self.max_elem_length + 1)
             else:
                 self.loc_fea_trans = nn.Linear(256, self.max_elem_length + 1)
             self.loc_generator = nn.Linear(self.input_size + hidden_size, 4)

     def _char_to_onehot(self, input_char, onehot_dim):
         input_ont_hot = F.one_hot(input_char, onehot_dim)
         return input_ont_hot
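The input widths 256, 400 and 625 of `loc_fea_trans` look like flattened feature-map sizes for the three supported values of `in_max_len`. A small sketch of that arithmetic follows, assuming a square input of side `in_max_len` and a backbone with five stride-2 stages that each round up. Both are assumptions, since the diff does not state them, and `flattened_positions` is a hypothetical helper.

```python
def flattened_positions(in_max_len, num_stride2_stages=5):
    """Number of spatial positions after repeated stride-2 downsampling.

    Assumes a square input and that each stage maps a side of n to
    ceil(n / 2), as a 3x3 convolution with stride 2 and padding 1 does.
    """
    side = in_max_len
    for _ in range(num_stride2_stages):
        side = (side + 1) // 2  # integer ceil(side / 2)
    return side * side


sizes = {n: flattened_positions(n) for n in (488, 640, 800)}
```

Under these assumptions `sizes` maps 488 to 256, 640 to 400 and 800 to 625, which matches the three branches. The output width of each layer is `self.max_elem_length + 1`, so it now follows the configured value instead of the previous hard-coded 500.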
@@ -60,16 +68,16 @@ class TableAttentionHead(nn.Layer):
         if len(fea.shape) == 3:
             pass
         else:
             last_shape = int(np.prod(fea.shape[2:]))  # gry added
             fea = paddle.reshape(fea, [fea.shape[0], fea.shape[1], last_shape])
         fea = fea.transpose([0, 2, 1])  # (NTC)(batch, width, channels)
         batch_size = fea.shape[0]
         hidden = paddle.zeros((batch_size, self.hidden_size))
         output_hiddens = []
         if self.training and targets is not None:
             structure = targets[0]
             for i in range(self.max_elem_length + 1):
                 elem_onehots = self._char_to_onehot(
                     structure[:, i], onehot_dim=self.elem_num)
                 (outputs, hidden), alpha = self.structure_attention_cell(
@@ -96,7 +104,7 @@ class TableAttentionHead(nn.Layer):
             alpha = None
             max_elem_length = paddle.to_tensor(self.max_elem_length)
             i = 0
             while i < max_elem_length + 1:
                 elem_onehots = self._char_to_onehot(
                     temp_elem, onehot_dim=self.elem_num)
                 (outputs, hidden), alpha = self.structure_attention_cell(
@@ -105,7 +113,7 @@ class TableAttentionHead(nn.Layer):
                 structure_probs_step = self.structure_generator(outputs)
                 temp_elem = structure_probs_step.argmax(axis=1, dtype="int32")
                 i += 1
             output = paddle.concat(output_hiddens, axis=1)
             structure_probs = self.structure_generator(output)
             structure_probs = F.softmax(structure_probs)
@@ -119,9 +127,9 @@ class TableAttentionHead(nn.Layer):
             loc_concat = paddle.concat([output, loc_fea], axis=2)
             loc_preds = self.loc_generator(loc_concat)
             loc_preds = F.sigmoid(loc_preds)
         return {'structure_probs': structure_probs, 'loc_preds': loc_preds}


 class AttentionGRUCell(nn.Layer):
     def __init__(self, input_size, hidden_size, num_embeddings, use_gru=False):
         super(AttentionGRUCell, self).__init__()
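Both decoding branches run `max_elem_length + 1` steps, and `loc_fea_trans` is sized to the same length, so `output` and `loc_fea` can be concatenated on axis 2. The shape-only sketch below, in plain Python without Paddle, illustrates why those two lengths have to agree. It assumes the elided lines collect one hidden vector per step and project the location features to `max_elem_length + 1` positions. `concat_last_axis_shape`, `head_shapes`, and the channel count 96 are illustrative and not part of the source.

```python
def concat_last_axis_shape(a, b):
    """Shape of concatenating two 3-D tensors on axis 2.

    All other axes must match, as tensor concatenation requires.
    """
    if a[:2] != b[:2]:
        raise ValueError(f"shape mismatch on leading axes: {a} vs {b}")
    return (a[0], a[1], a[2] + b[2])


def head_shapes(batch, hidden_size, channels, decode_steps, loc_out_len):
    # Assumed: one hidden vector is collected per decoding step.
    output = (batch, decode_steps, hidden_size)
    # Assumed: location features are projected to loc_out_len positions.
    loc_fea = (batch, loc_out_len, channels)
    return concat_last_axis_shape(output, loc_fea)


max_elem_length = 800
ok = head_shapes(batch=2, hidden_size=256, channels=96,
                 decode_steps=max_elem_length + 1,
                 loc_out_len=max_elem_length + 1)

try:
    # Illustrative mismatch: 801 decode steps against a layer sized for 501.
    head_shapes(2, 256, 96, decode_steps=800 + 1, loc_out_len=500 + 1)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```

Reading all three limits from the constructor arguments keeps the loop length and the layer width tied to a single configured value.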