PaddlePaddle / DeepSpeech
Commit 7cef93a6
Authored Nov 23, 2021 by gongel
refactor: update
Parent: 5b5c73f9
Showing 6 changed files with 47 additions and 18 deletions (+47 -18)
examples/ted_en_zh/t1/README.md  +8 -7
examples/ted_en_zh/t1/conf/transformer_mtl_noam.yaml  +0 -0
examples/ted_en_zh/t1/local/convert_torch_to_paddle.py  +9 -0
examples/ted_en_zh/t1/local/download_pretrain.sh  +19 -0
examples/ted_en_zh/t1/local/train_finetune.sh  +0 -0
examples/ted_en_zh/t1/run.sh  +11 -11
examples/ted_en_zh/t1/README.md
@@ -3,13 +3,14 @@
 ## Dataset
-| Data Subset | Duration in Seconds |
+| Data Subset | Duration in Frames |
 | --- | --- |
-| data/manifest.train | 0.942 ~ 60 |
-| data/manifest.dev | 1.151 ~ 39 |
-| data/manifest.test | 1.1 ~ 42.746 |
+| data/manifest.train | 94.2 ~ 6000 |
+| data/manifest.dev | 115.1 ~ 3900 |
+| data/manifest.test | 110 ~ 4274.6 |
 ## Transformer
-| Model | Params | Config | Char-BLEU |
-| --- | --- | --- | --- |
-| Transformer+ASR MTL | 50.26M | conf/transformer_joint_noam.yaml | 17.38 |
+| Model | Params | Config | Val loss | Char-BLEU |
+| --- | --- | --- | --- | --- |
+| FAT + Transformer+ASR MTL | 50.26M | conf/transformer_mtl_noam.yaml | 62.86 | 19.45 |
+| FAT + Transformer+ASR MTL with word reward | 50.26M | conf/transformer_mtl_noam.yaml | 62.86 | 20.80 |
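The Dataset table switches its unit from seconds to frames, and every new value is 100 times the old one. The sketch below reproduces that conversion. The rate of 100 frames per second (a 10 ms frame shift) is an assumption inferred from the numbers in the diff, not something the commit states, and `seconds_to_frames` is a hypothetical helper name.

```python
# Assumption: 100 frames per second (10 ms frame shift), inferred from the
# old and new table values; the commit itself does not state the rate.
FRAMES_PER_SECOND = 100


def seconds_to_frames(seconds: float) -> float:
    """Convert a duration in seconds to a duration in frames."""
    return seconds * FRAMES_PER_SECOND


# Old "Duration in Seconds" ranges, copied from the removed table rows.
old_seconds = {
    "data/manifest.train": (0.942, 60.0),
    "data/manifest.dev": (1.151, 39.0),
    "data/manifest.test": (1.1, 42.746),
}

for subset, (shortest, longest) in old_seconds.items():
    # ":g" trims floating-point noise so the output is easy to compare
    # against the new "Duration in Frames" rows.
    print(f"{subset}: {seconds_to_frames(shortest):g} ~ {seconds_to_frames(longest):g}")
```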
examples/ted_en_zh/t1/conf/transformer_joint_noam.yaml → examples/ted_en_zh/t1/conf/transformer_mtl_noam.yaml
File moved
examples/ted_en_zh/t1/local/convert_torch_to_paddle.py
@@ -27,6 +27,7 @@ def torch2paddle(args):
     torch_model = torch.load(args.torch_ckpt, map_location='cpu')
     cnt = 0
     for k, v in torch_model['model'].items():
+        # encoder.embed.* --> encoder.embed.*
         if k.startswith('encoder.embed'):
             if v.ndim == 2:
                 v = v.transpose(0, 1)
@@ -35,6 +36,10 @@ def torch2paddle(args):
             logger.info(
                 f"Convert torch weight: {k} to paddlepaddle weight: {k}, shape is {v.shape}"
             )
+        # encoder.after_norm.* --> encoder.after_norm.*
+        # encoder.after_norm.* --> decoder.after_norm.*
+        # encoder.after_norm.* --> st_decoder.after_norm.*
         if k.startswith('encoder.after_norm'):
             paddle_model_dict[k] = v.numpy()
             cnt += 1
@@ -47,6 +52,10 @@ def torch2paddle(args):
                 f"Convert torch weight: {k} to paddlepaddle weight: {'st_' + k.replace('en', 'de')}, shape is {v.shape}"
             )
             cnt += 2
+        # encoder.encoders.* --> encoder.encoders.*
+        # encoder.encoders.* (last six layers) --> decoder.encoders.* (first six layers)
+        # encoder.encoders.* (last six layers) --> st_decoder.encoders.* (first six layers)
         if k.startswith('encoder.encoders'):
             if v.ndim == 2:
                 v = v.transpose(0, 1)
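The comments added in this file describe how one PyTorch weight is copied to several Paddle parameter names, and the context lines show 2-D tensors being transposed first. The following is a minimal, dependency-free sketch of both ideas, not the script's actual code. `fan_out_after_norm` and `transpose_2d` are hypothetical names, and nested lists stand in for tensors. The transpose is presumably needed because `torch.nn.Linear` stores its weight as `[out_features, in_features]` while `paddle.nn.Linear` stores `[in_features, out_features]`; that motivation is an assumption, since the diff does not state it.

```python
from typing import List


def fan_out_after_norm(key: str) -> List[str]:
    """Return the Paddle-side names that receive one encoder.after_norm.* weight.

    Follows the mapping in the added comments: the same tensor goes to the
    encoder, the decoder, and the st_decoder.
    """
    if not key.startswith("encoder.after_norm"):
        return []
    # Same rewrite as in the diff's log message. str.replace changes every
    # occurrence of "en", which is safe for after_norm keys only.
    decoder_key = key.replace("en", "de")
    return [key, decoder_key, "st_" + decoder_key]


def transpose_2d(matrix: List[List[float]]) -> List[List[float]]:
    """Swap the two axes of a 2-D weight, like v.transpose(0, 1) in the diff."""
    return [list(column) for column in zip(*matrix)]


if __name__ == "__main__":
    print(fan_out_after_norm("encoder.after_norm.weight"))
    print(transpose_2d([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]))
```

Because `str.replace` rewrites every match, a key such as `encoder.encoders.6.…` would need a more targeted rename than this helper provides.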
examples/ted_en_zh/t1/local/download_pretrain.sh
New file, mode 0 → 100755
+#!/bin/bash
+# download pytorch weight
+wget https://paddlespeech.bj.bcebos.com/s2t/ted_en_zh/st1/snapshot.ep.98 --no-check-certificate
+# convert pytorch weight to paddlepaddle
+python local/convert_torch_to_paddle.py \
+--torch_ckpt snapshot.ep.98 \
+--paddle_ckpt paddle.98.pdparams
+# Or you can download converted weights
+# wget https://paddlespeech.bj.bcebos.com/s2t/ted_en_zh/st1/paddle.98.pdparams --no-check-certificate
+if [ $? -ne 0 ]; then
+    echo "Failed in downloading and coverting!"
+    exit 1
+fi
+exit 0
\ No newline at end of file
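In the script above, `$?` is read once, after the conversion command, so it reflects only that command's exit status. A failed download would surface indirectly, when the converter cannot open the missing file. A sketch of the alternative, checking each step as it finishes, is shown below in Python. `run_steps` is a hypothetical helper, and the two commands in the demo are stand-ins rather than the real `wget` and conversion calls.

```python
import subprocess
import sys
from typing import List, Sequence


def run_steps(steps: Sequence[List[str]]) -> int:
    """Run commands in order; return the first non-zero exit code, else 0."""
    for cmd in steps:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            # Report which step failed instead of a combined message.
            print(f"step failed ({result.returncode}): {' '.join(cmd)}", file=sys.stderr)
            return result.returncode
    return 0


if __name__ == "__main__":
    # Stand-ins for the download and conversion commands (assumptions).
    succeed = [sys.executable, "-c", "raise SystemExit(0)"]
    fail = [sys.executable, "-c", "raise SystemExit(3)"]
    print(run_steps([succeed, succeed]))
    print(run_steps([fail, succeed]))
```

Stopping at the first failing step keeps the error message tied to the command that actually failed.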
examples/ted_en_zh/t1/local/train.sh → examples/ted_en_zh/t1/local/train_finetune.sh
File moved
examples/ted_en_zh/t1/run.sh
@@ -4,8 +4,8 @@ source path.sh
 gpus=0,1,2,3
 stage=1
-stop_stage=100
-conf_path=conf/transformer_joint_noam.yaml
+stop_stage=4
+conf_path=conf/transformer_mtl_noam.yaml
 ckpt_path=paddle.98
 avg_num=5
 data_path=./TED_EnZh # path to unzipped data
@@ -22,21 +22,21 @@ if [ ${stage} -le 0 ] && [ ${stop_stage} -ge 0 ]; then
 fi
 if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
-    # train model, all `ckpt` under `exp` dir
-    CUDA_VISIBLE_DEVICES=${gpus} ./local/train.sh ${conf_path} ${ckpt} ${ckpt_path}
+    # download pretrained
+    bash ./local/download_pretrain.sh || exit -1
 fi
 if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
-    # avg n best model
-    avg.sh best exp/${ckpt}/checkpoints ${avg_num}
+    # train model, all `ckpt` under `exp` dir
+    CUDA_VISIBLE_DEVICES=${gpus} ./local/train_finetune.sh ${conf_path} ${ckpt} ${ckpt_path}
 fi
 if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then
-    # test ckpt avg_n
-    CUDA_VISIBLE_DEVICES=0 ./local/test.sh ${conf_path} exp/${ckpt}/checkpoints/${avg_ckpt} || exit -1
+    # avg n best model
+    avg.sh best exp/${ckpt}/checkpoints ${avg_num}
 fi
 if [ ${stage} -le 4 ] && [ ${stop_stage} -ge 4 ]; then
-    # export ckpt avg_n
-    CUDA_VISIBLE_DEVICES= ./local/export.sh ${conf_path} exp/${ckpt}/checkpoints/${avg_ckpt} exp/${ckpt}/checkpoints/${avg_ckpt}.jit
+    # test ckpt avg_n
+    CUDA_VISIBLE_DEVICES=0 ./local/test.sh ${conf_path} exp/${ckpt}/checkpoints/${avg_ckpt} || exit -1
 fi
\ No newline at end of file
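After this change, stage 1 downloads the pretrained weights and the train, average and test steps each move one stage later; the export step no longer appears. The sketch below mirrors the `stage` / `stop_stage` gating in Python to show which steps a given pair of values selects. `STAGES` and `stages_to_run` are hypothetical names, and the stage 0 label is a placeholder because that stage's body lies outside the diff.

```python
from typing import List

# Stage numbers and labels as they stand after this commit. Stage 0 exists in
# run.sh but its body is not shown in the diff, so it gets a generic label.
STAGES = {
    0: "stage 0 (body not shown in the diff)",
    1: "download pretrained weights",
    2: "fine-tune (train_finetune.sh)",
    3: "average n best checkpoints",
    4: "test averaged checkpoint",
}


def stages_to_run(stage: int, stop_stage: int) -> List[str]:
    """Mirror `[ ${stage} -le N ] && [ ${stop_stage} -ge N ]` for every N."""
    return [label for n, label in sorted(STAGES.items()) if stage <= n <= stop_stage]


if __name__ == "__main__":
    # Defaults from the updated run.sh: stage=1, stop_stage=4.
    for label in stages_to_run(1, 4):
        print(label)
```

With the new default `stop_stage=4` the upper bound matches the last stage visible in the diff, where the earlier value of 100 simply meant "run everything".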