PaddlePaddle / DeepSpeech

Opened Feb 24, 2020 by saxon_zh (@saxon_zh, Guest)

Can the released models be used as checkpoints?

Created by: DavidHuie

I was experimenting with adding new training data to the released models, specifically the BaiduEN8k model. I tried running a script that uses BaiduEN8k as a checkpoint (adapted from examples/librispeech/run_train.sh):

export FLAGS_sync_nccl_allreduce=0
export CUDA_VISIBLE_DEVICES=0

python -u train.py \
--batch_size=20 \
--num_epoch=50 \
--num_conv_layers=2 \
--num_rnn_layers=3 \
--rnn_layer_size=2048 \
--num_iter_print=100 \
--save_epoch=1 \
--num_samples=280000 \
--learning_rate=5e-4 \
--max_duration=27.0 \
--min_duration=0.0 \
--test_off=False \
--use_sortagrad=True \
--use_gru=False \
--use_gpu=True \
--is_local=True \
--share_rnn_weights=True \
--train_manifest='manifest.train-clean-100' \
--dev_manifest='manifest.dev-clean' \
--mean_std_path='mean_std.npz' \
--vocab_path='/code/models/deepspeech2/vocab.txt' \
--init_from_pretrained_model='/code/models/deepspeech2' \
--output_model_dir='/code/models/deepspeech2_new' \
--augment_conf_path='conf/augmentation.config' \
--specgram_type='linear' \
--shuffle_method='batch_shuffle_clipped'

The directory /code/models/deepspeech2 contains the BaiduEN8k model files:

$ ls /code/models/deepspeech2
README.md  mean_std.npz  params.pdparams  vocab.txt
$ du -hs /code/models/deepspeech2/*
4.0K    /code/models/deepspeech2/README.md
4.0K    /code/models/deepspeech2/mean_std.npz
201M    /code/models/deepspeech2/params.pdparams
4.0K    /code/models/deepspeech2/vocab.txt
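
The same check as the listing above can be scripted so it runs before training starts. This is a minimal sketch using only the Python standard library; the helper names `summarize_model_dir` and `missing_or_empty` are hypothetical, and treating the three non-README files from the listing as the required set is an assumption rather than anything the released model documents.

```python
import os
import sys

# File names copied from the `ls` output above. Treating them as the
# required set is an assumption of this sketch.
EXPECTED_FILES = ("mean_std.npz", "params.pdparams", "vocab.txt")


def summarize_model_dir(model_dir):
    """Return {file name: size in bytes} for regular files in model_dir."""
    sizes = {}
    for name in sorted(os.listdir(model_dir)):
        path = os.path.join(model_dir, name)
        if os.path.isfile(path):
            sizes[name] = os.path.getsize(path)
    return sizes


def missing_or_empty(model_dir, expected=EXPECTED_FILES):
    """List expected files that are absent or zero bytes long."""
    if not os.path.isdir(model_dir):
        # A missing directory means every expected file is missing.
        return list(expected)
    sizes = summarize_model_dir(model_dir)
    return [name for name in expected if sizes.get(name, 0) == 0]


if __name__ == "__main__":
    # Default path is the one used in this issue; pass another as argv[1].
    target = sys.argv[1] if len(sys.argv) > 1 else "/code/models/deepspeech2"
    problems = missing_or_empty(target)
    if problems:
        print("missing or empty in %s: %s" % (target, ", ".join(problems)))
    else:
        print("all expected files present in %s" % target)
```

A check like this only confirms that the files exist and are non-empty. It says nothing about whether params.pdparams matches the network that train.py builds.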

When I run the script, I get this segfault:

W0224 02:04:07.079113   149 device_context.cc:236] Please NOTE: device: 0, CUDA Capability: 61, Driver API Version: 10.2, Runtime API Version: 10.0
W0224 02:04:07.100364   149 device_context.cc:244] device: 0, cuDNN Version: 7.6.
W0224 02:04:08.754072   149 init.cc:206] *** Aborted at 1582509848 (unix time) try "date -d @1582509848" if you are using GNU date ***
W0224 02:04:08.756754   149 init.cc:206] PC: @                0x0 (unknown)
W0224 02:04:08.757686   149 init.cc:206] *** SIGSEGV (@0x50) received by PID 149 (TID 0x7f0571ad1700) from PID 80; stack trace: ***
W0224 02:04:08.760175   149 init.cc:206]     @     0x7f05716aa390 (unknown)
W0224 02:04:08.762334   149 init.cc:206]     @     0x7f05718c275c (unknown)
W0224 02:04:08.764492   149 init.cc:206]     @     0x7f05718cb861 (unknown)
W0224 02:04:08.766629   149 init.cc:206]     @     0x7f05718c6574 (unknown)
W0224 02:04:08.768791   149 init.cc:206]     @     0x7f05718cadb9 (unknown)
W0224 02:04:08.771391   149 init.cc:206]     @     0x7f05714125ad (unknown)
W0224 02:04:08.773540   149 init.cc:206]     @     0x7f05718c6574 (unknown)
W0224 02:04:08.776126   149 init.cc:206]     @     0x7f0571412664 __libc_dlopen_mode
W0224 02:04:08.778730   149 init.cc:206]     @     0x7f05713e4a85 (unknown)
W0224 02:04:08.780908   149 init.cc:206]     @     0x7f05716a7a99 __pthread_once_slow
W0224 02:04:08.782840   149 init.cc:206]     @     0x7f05713e4ba4 backtrace
W0224 02:04:08.790917   149 init.cc:206]     @     0x7f0506158844 paddle::platform::GetTraceBackString<>()
W0224 02:04:08.795003   149 init.cc:206]     @     0x7f0506158cfa paddle::platform::EnforceNotMet::EnforceNotMet()
W0224 02:04:08.801776   149 init.cc:206]     @     0x7f0507554b35 paddle::operators::LoadCombineOpKernel<>::LoadParamsFromBuffer()
W0224 02:04:08.807842   149 init.cc:206]     @     0x7f0507554ece paddle::operators::LoadCombineOpKernel<>::Compute()
W0224 02:04:08.811461   149 init.cc:206]     @     0x7f0507555423 _ZNSt17_Function_handlerIFvRKN6paddle9framework16ExecutionContextEEZNKS1_24OpKernelRegistrarFunctorINS0_8platform9CUDAPlaceELb0ELm0EJNS0_9operators19LoadCombineOpKernelINS7_17CUDADeviceContextEfEENSA_ISB_dEENSA_ISB_iEENSA_ISB_aEENSA_ISB_lEEEEclEPKcSJ_iEUlS4_E_E9_M_invokeERKSt9_Any_dataS4_
W0224 02:04:08.816128   149 init.cc:206]     @     0x7f0508b4dd6b paddle::framework::OperatorWithKernel::RunImpl()
W0224 02:04:08.822257   149 init.cc:206]     @     0x7f0508b4e361 paddle::framework::OperatorWithKernel::RunImpl()
W0224 02:04:08.825278   149 init.cc:206]     @     0x7f0508b47fec paddle::framework::OperatorBase::Run()
W0224 02:04:08.830551   149 init.cc:206]     @     0x7f0506308c86 paddle::framework::Executor::RunPreparedContext()
W0224 02:04:08.833338   149 init.cc:206]     @     0x7f050630c4cf paddle::framework::Executor::Run()
W0224 02:04:08.834825   149 init.cc:206]     @     0x7f0506145f1d _ZZN8pybind1112cpp_function10initializeIZN6paddle6pybindL22pybind11_init_core_avxERNS_6moduleEEUlRNS2_9framework8ExecutorERKNS6_11ProgramDescEPNS6_5ScopeEibbRKSt6vectorISsSaISsEEE103_vIS8_SB_SD_ibbSI_EINS_4nameENS_9is_methodENS_7siblingEEEEvOT_PFT0_DpT1_EDpRKT2_ENUlRNS_6detail13function_callEE1_4_FUNES10_
W0224 02:04:08.836560   149 init.cc:206]     @     0x7f050618f086 pybind11::cpp_function::dispatcher()
W0224 02:04:08.836661   149 init.cc:206]     @           0x4c5cd6 PyEval_EvalFrameEx
W0224 02:04:08.836764   149 init.cc:206]     @           0x4ba506 PyEval_EvalCodeEx
W0224 02:04:08.836939   149 init.cc:206]     @           0x4c2418 PyEval_EvalFrameEx
W0224 02:04:08.837083   149 init.cc:206]     @           0x4ba506 PyEval_EvalCodeEx
W0224 02:04:08.837195   149 init.cc:206]     @           0x4c2418 PyEval_EvalFrameEx
W0224 02:04:08.837280   149 init.cc:206]     @           0x4ba506 PyEval_EvalCodeEx
W0224 02:04:08.837378   149 init.cc:206]     @           0x4c1e32 PyEval_EvalFrameEx
W0224 02:04:08.837463   149 init.cc:206]     @           0x4ba506 PyEval_EvalCodeEx
W0224 02:04:08.837559   149 init.cc:206]     @           0x4c1e32 PyEval_EvalFrameEx
/code/code/deepspeech2/train.sh: line 35:   149 Segmentation fault      (core dumped) python -u train.py --batch_size=20 --num_epoch=50 --num_conv_layers=2 --num_rnn_layers=3 --rnn_layer_size=2048 --num_iter_print=100 --save_epoch=1 --num_samples=280000 --learning_rate=5e-4 --max_duration=27.0 --min_duration=0.0 --test_off=False --use_sortagrad=True --use_gru=False --use_gpu=True --is_local=True --share_rnn_weights=True --train_manifest='manifest.train-clean-100' --dev_manifest='manifest.dev-clean' --mean_std_path='mean_std.npz' --vocab_path='/code/models/deepspeech2/vocab.txt' --init_from_pretrained_model='/code/models/deepspeech2_copy' --output_model_dir='/code/models/deepspeech2_new' --augment_conf_path='conf/augmentation.config' --specgram_type='linear' --shuffle_method='batch_shuffle_clipped'
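
The command echoed on the last line of the log passes --init_from_pretrained_model='/code/models/deepspeech2_copy', while the script shown earlier passes '/code/models/deepspeech2'. Whether that difference is related to the crash is not established here. A minimal sketch for comparing the two invocations flag by flag, using only the standard library; `parse_flags` and `diff_flags` are hypothetical helper names, and the two command strings are abbreviated to two flags each for brevity.

```python
import shlex


def parse_flags(command):
    """Map each --name=value token in a shell command to its value."""
    flags = {}
    for token in shlex.split(command):
        if token.startswith("--") and "=" in token:
            name, value = token[2:].split("=", 1)
            flags[name] = value
    return flags


def diff_flags(intended, executed):
    """Return {flag: (intended value, executed value)} where they differ."""
    a, b = parse_flags(intended), parse_flags(executed)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


# Abbreviated copies of the script and of the command echoed in the log.
script_cmd = (
    "python -u train.py --batch_size=20 "
    "--init_from_pretrained_model='/code/models/deepspeech2'"
)
logged_cmd = (
    "python -u train.py --batch_size=20 "
    "--init_from_pretrained_model='/code/models/deepspeech2_copy'"
)

# Prints only the flags whose values differ between the two commands.
print(diff_flags(script_cmd, logged_cmd))
```

Pasting the full script and the full logged command into the two strings would show whether any other flag differs between what was intended and what actually ran.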

If using BaiduEN8k as a checkpoint is possible, am I doing it the right way? I understand that some of the parameters in the train.py command may need to be changed.

Reference: paddlepaddle/DeepSpeech#431