PaddlePaddle / DeepSpeech
Opened Apr 11, 2018 by saxon_zh (@saxon_zh, Guest)

CudnnBatchNormLayer::backward() fails when training the aishell model in nvidia-docker

Created by: BulimiaDH

Thread [140347691067136] Forwarding batch_norm_0,

* Aborted at 1523453910 (unix time) try "date -d @1523453910" if you are using GNU date *

PC: @ 0x0 (unknown)

* SIGSEGV (@0x0) received by PID 19 (TID 0x7fa53e456700) from PID 0; stack trace: *

@ 0x7fa53e032390 (unknown)
@ 0x7fa518ea6c82 paddle::CudnnBatchNormLayer::backward()
@ 0x7fa518f2e2bd paddle::NeuralNetwork::backward()
@ 0x7fa519270bb0 GradientMachine::forwardBackward()
@ 0x7fa518d295f4 _wrap_GradientMachine_forwardBackward
@ 0x4cb45e PyEval_EvalFrameEx
@ 0x4c2765 PyEval_EvalCodeEx
@ 0x4ca8d1 PyEval_EvalFrameEx
@ 0x4c2765 PyEval_EvalCodeEx
@ 0x4ca099 PyEval_EvalFrameEx
@ 0x4c2765 PyEval_EvalCodeEx
@ 0x4ca099 PyEval_EvalFrameEx
@ 0x4c2765 PyEval_EvalCodeEx
@ 0x4ca099 PyEval_EvalFrameEx
@ 0x4c2765 PyEval_EvalCodeEx
@ 0x4ca8d1 PyEval_EvalFrameEx
@ 0x4c2765 PyEval_EvalCodeEx
@ 0x4ca8d1 PyEval_EvalFrameEx
@ 0x4c2765 PyEval_EvalCodeEx
@ 0x4c2509 PyEval_EvalCode
@ 0x4f1def (unknown)
@ 0x4ec652 PyRun_FileExFlags
@ 0x4eae31 PyRun_SimpleFileExFlags
@ 0x49e14a Py_Main
@ 0x7fa53dc77830 __libc_start_main
@ 0x49d9d9 _start
@ 0x0 (unknown)
Segmentation fault (core dumped)
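
The abort line in the crash log suggests `date -d @1523453910` for decoding the timestamp. Where GNU date is not at hand, the same conversion can be sketched in Python. Reading the value as UTC is an assumption here, since the log does not state the container's time zone.

```python
from datetime import datetime, timezone

# Unix time copied from the "Aborted at" line of the crash log.
ABORT_UNIX_TIME = 1523453910


def abort_time_utc(unix_time):
    """Return the crash time as a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(unix_time, tz=timezone.utc)


print(abort_time_utc(ABORT_UNIX_TIME).isoformat())
```

This prints 2018-04-11T13:38:30+00:00. If the container clock is also UTC, that is roughly 13 seconds after the layer-size log lines stamped 13:38:17, which would place the crash very early in training.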

Model configuration:
I0411 13:38:14.460194 19 Util.cpp:166] commandline: --use_gpu=True --rnn_use_batch=False --log_clipping=True --trainer_count=1
[INFO 2018-04-11 13:38:17,982 layers.py:2606] output for conv_0: c = 32, h = 81, w = 54, size = 139968
[INFO 2018-04-11 13:38:17,983 layers.py:3133] output for batch_norm_0: c = 32, h = 81, w = 54, size = 139968
[INFO 2018-04-11 13:38:17,985 layers.py:7224] output for scale_sub_region_0: c = 32, h = 81, w = 54, size = 139968
[INFO 2018-04-11 13:38:17,986 layers.py:2606] output for conv_1: c = 32, h = 41, w = 54, size = 70848
[INFO 2018-04-11 13:38:17,987 layers.py:3133] output for batch_norm_1: c = 32, h = 41, w = 54, size = 70848
[INFO 2018-04-11 13:38:17,988 layers.py:7224] output for scale_sub_region_1: c = 32, h = 41, w = 54, size = 70848
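
As a quick sanity check on the layer log, each reported `size` should equal `c * h * w`. The sketch below parses the six layer lines (timestamps and source locations trimmed) and verifies that relation; the regular expression and helper name are illustrative, not part of PaddlePaddle.

```python
import re

# Layer lines copied from the log, with the "[INFO ...]" prefix trimmed.
LOG_LINES = [
    "output for conv_0: c = 32, h = 81, w = 54, size = 139968",
    "output for batch_norm_0: c = 32, h = 81, w = 54, size = 139968",
    "output for scale_sub_region_0: c = 32, h = 81, w = 54, size = 139968",
    "output for conv_1: c = 32, h = 41, w = 54, size = 70848",
    "output for batch_norm_1: c = 32, h = 41, w = 54, size = 70848",
    "output for scale_sub_region_1: c = 32, h = 41, w = 54, size = 70848",
]

PATTERN = re.compile(
    r"output for (\w+): c = (\d+), h = (\d+), w = (\d+), size = (\d+)"
)


def check_sizes(lines):
    """Map each layer name to True if its reported size equals c * h * w."""
    results = {}
    for line in lines:
        match = PATTERN.search(line)
        if match is None:
            continue  # skip lines that are not layer-size reports
        name = match.group(1)
        c, h, w, size = (int(g) for g in match.groups()[1:])
        results[name] = (c * h * w == size)
    return results


print(check_sizes(LOG_LINES))
```

All six layers pass, so the reported shapes are at least self-consistent. This does not rule out other causes of the crash in the backward pass.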

add_arg('batch_size', int, 8, "Minibatch size.")
add_arg('trainer_count', int, 1, "# of Trainers (CPUs or GPUs).")
add_arg('num_passes', int, 200, "# of training epochs.")
add_arg('num_proc_data', int, 16, "# of CPUs for data preprocessing.")
add_arg('num_conv_layers', int, 2, "# of convolution layers.")
add_arg('num_rnn_layers', int, 3, "# of recurrent layers.")
add_arg('rnn_layer_size', int, 1024, "# of recurrent cells per layer.")
add_arg('num_iter_print', int, 100, "Every # iterations for printing "
                                    "train cost.")
add_arg('learning_rate', float, 5e-4, "Learning rate.")
add_arg('max_duration', float, 60.0, "Longest audio duration allowed.")
add_arg('min_duration', float, 0.0, "Shortest audio duration allowed.")
add_arg('test_off', bool, False, "Turn off testing.")
add_arg('use_sortagrad', bool, True, "Use SortaGrad or not.")
add_arg('use_gpu', bool, True, "Use GPU or not.")
add_arg('use_gru', bool, True, "Use GRUs instead of simple RNNs.")
add_arg('is_local', bool, True, "Use pserver or not.")
add_arg('share_rnn_weights', bool, False, "Share input-hidden weights across "
                                          "bi-directional RNNs. Not for GRU.")
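
For readers unfamiliar with the `add_arg` calls in the configuration, here is a minimal sketch of how such a helper could sit on top of `argparse`. It assumes `add_arg` is a thin wrapper that registers `--<name>` options; `add_arguments` and `str2bool` below are hypothetical stand-ins written for illustration, not the repository's own code.

```python
import argparse
import functools


def str2bool(value):
    """Parse 'True'/'False'-style strings; bool('False') would be truthy."""
    lowered = value.lower()
    if lowered in ("true", "t", "yes", "y", "1"):
        return True
    if lowered in ("false", "f", "no", "n", "0"):
        return False
    raise argparse.ArgumentTypeError("expected a boolean, got %r" % value)


def add_arguments(argname, type, default, help, argparser):
    """Hypothetical stand-in: register --<argname> on the given parser."""
    arg_type = str2bool if type is bool else type
    argparser.add_argument("--" + argname, type=arg_type,
                           default=default, help=help)


parser = argparse.ArgumentParser()
add_arg = functools.partial(add_arguments, argparser=parser)

# Three of the options from the issue, with the same defaults.
add_arg('batch_size', int, 8, "Minibatch size.")
add_arg('use_gpu', bool, True, "Use GPU or not.")
add_arg('learning_rate', float, 5e-4, "Learning rate.")

# Defaults can be overridden per run; unspecified options keep their defaults.
args = parser.parse_args(["--use_gpu", "False", "--batch_size", "4"])
print(args.use_gpu, args.batch_size, args.learning_rate)
```

Under these assumptions, the defaults in the `add_arg` list are the values in effect unless a flag is passed explicitly, which is why the logged command line shows `--use_gpu=True` and `--trainer_count=1`.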

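Language model: Mandarin LM Small.

Environment: VGA compatible controller: NVIDIA Corporation GK110B [GeForce GTX TITAN Black] (rev a1), Ubuntu 16.04

The manifest files and the mean/std file were generated without errors, and I used the provided vocabulary. Then this problem occurred. How can I fix it?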

Reference: paddlepaddle/DeepSpeech#204