PaddlePaddle / Paddle
Opened Aug 13, 2018 by saxon_zh (@saxon_zh, Guest)

Machine translation demo on an MPI cluster aborts mid-run with "Forwarding __bidirectional_gru_0___fw"

Created by: Bodhi-Tree

Sat Aug 11 15:35:27 2018[1,1]:
Sat Aug 11 15:35:27 2018[1,1]:Pass 0, Batch 368, Cost 50.880733, {'classification_error_evaluator': 0.7685352563858032}
Sat Aug 11 15:35:27 2018[1,2]:
Sat Aug 11 15:35:27 2018[1,2]:Pass 0, Batch 368, Cost 53.013538, {'classification_error_evaluator': 0.7777777910232544}
Sat Aug 11 15:35:42 2018[1,19]:Thread [139910583289600] Forwarding __bidirectional_gru_0___fw,
Sat Aug 11 15:35:42 2018[1,19]:*** Aborted at 1533972942 (unix time) try "date -d @1533972942" if you are using GNU date ***
Sat Aug 11 15:35:42 2018[1,13]:Thread [140696326326016] Forwarding __bidirectional_gru_0___fw,
Sat Aug 11 15:35:42 2018[1,13]:*** Aborted at 1533972942 (unix time) try "date -d @1533972942" if you are using GNU date ***
Sat Aug 11 15:35:42 2018[1,19]:PC: @ 0x0 (unknown)
Sat Aug 11 15:35:42 2018[1,13]:
Sat Aug 11 15:35:42 2018[1,13]:PC: @ 0x0 (unknown)
Sat Aug 11 15:35:42 2018[1,20]:Thread [140598679189248] Forwarding __bidirectional_gru_0___fw,
Sat Aug 11 15:35:42 2018[1,20]:*** Aborted at 1533972942 (unix time) try "date -d @1533972942" if you are using GNU date ***
Sat Aug 11 15:35:42 2018[1,24]:Thread [139964018677504] Forwarding __bidirectional_gru_0___fw,
Sat Aug 11 15:35:42 2018[1,24]:*** Aborted at 1533972942 (unix time) try "date -d @1533972942" if you are using GNU date ***
Sat Aug 11 15:35:42 2018[1,11]:Thread [140135036057344] Forwarding __bidirectional_gru_0___fw,
Sat Aug 11 15:35:42 2018[1,11]:*** Aborted at 1533972942 (unix time) try "date -d @1533972942" if you are using GNU date ***
Sat Aug 11 15:35:42 2018[1,10]:Thread [139842902214400] Forwarding __bidirectional_gru_0___fw,
Sat Aug 11 15:35:42 2018[1,10]:*** Aborted at 1533972942 (unix time) try "date -d @1533972942" if you are using GNU date ***
Sat Aug 11 15:35:42 2018[1,20]:
Sat Aug 11 15:35:42 2018[1,20]:PC: @ 0x0 (unknown)
Sat Aug 11 15:35:42 2018[1,24]:
Sat Aug 11 15:35:42 2018[1,24]:PC: @ 0x0 (unknown)
Sat Aug 11 15:35:42 2018[1,11]:
Sat Aug 11 15:35:42 2018[1,11]:PC: @ 0x0 (unknown)
Sat Aug 11 15:35:42 2018[1,10]:
Sat Aug 11 15:35:42 2018[1,10]:PC: @ 0x0 (unknown)
Sat Aug 11 15:35:42 2018[1,18]:Thread [139962781460224] Forwarding __bidirectional_gru_0___fw,
Sat Aug 11 15:35:42 2018[1,18]:*** Aborted at 1533972942 (unix time) try "date -d @1533972942" if you are using GNU date ***
Sat Aug 11 15:35:42 2018[1,24]:
Sat Aug 11 15:35:42 2018[1,24]:*** SIGFPE (@0x7f4d8904b460) received by PID 5470 (TID 0x7f4be99d5700) from PID 18446744071713371232; stack trace: ***
Sat Aug 11 15:35:42 2018[1,20]:
Sat Aug 11 15:35:42 2018[1,20]:*** SIGFPE (@0x7fdfff589460) received by PID 20103 (TID 0x7fdfae543700) from PID 18446744073698579552; stack trace: ***
Sat Aug 11 15:35:42 2018[1,24]:
Sat Aug 11 15:35:42 2018[1,24]: @ 0x7f4d8eb9f160 (unknown)
Sat Aug 11 15:35:42 2018[1,20]:
Sat Aug 11 15:35:42 2018[1,20]: @ 0x7fe0050dd160 (unknown)
Sat Aug 11 15:35:42 2018[1,13]:
Sat Aug 11 15:35:42 2018[1,13]:*** SIGFPE (@0x7ff6b4518460) received by PID 48799 (TID 0x7ff66a8d4700) from PID 18446744072439825504; stack trace: ***
Sat Aug 11 15:35:42 2018[1,11]:
Sat Aug 11 15:35:42 2018[1,11]:*** SIGFPE (@0x7f73f16fc460) received by PID 27779 (TID 0x7f73bb0b7700) from PID 18446744073465218144; stack trace: ***
Sat Aug 11 15:35:42 2018[1,13]:
Sat Aug 11 15:35:42 2018[1,13]: @ 0x7ff6ba06c160 (unknown)
Sat Aug 11 15:35:42 2018[1,3]:Thread [140542075557632] Forwarding __bidirectional_gru_0___fw,
Sat Aug 11 15:35:42 2018[1,3]:*** Aborted at 1533972942 (unix time) try "date -d @1533972942" if you are using GNU date ***
Sat Aug 11 15:35:42 2018[1,19]:*** SIGFPE (@0x7f3fb9a26460) received by PID 51743 (TID 0x7f3f789e0700) from PID 18446744072529011808; stack trace: ***
Sat Aug 11 15:35:42 2018[1,3]:PC: @ 0x0 (unknown)
Sat Aug 11 15:35:42 2018[1,19]: @ 0x7f3fbf57a160 (unknown)
Sat Aug 11 15:35:42 2018[1,21]:Thread [140391670904576] Forwarding __bidirectional_gru_0___fw,
Sat Aug 11 15:35:42 2018[1,21]:*** Aborted at 1533972942 (unix time) try "date -d @1533972942" if you are using GNU date ***
Sat Aug 11 15:35:42 2018[1,17]:Thread [140468229388032] Forwarding __bidirectional_gru_0___fw,
Sat Aug 11 15:35:42 2018[1,17]:*** Aborted at 1533972942 (unix time) try "date -d @1533972942" if you are using GNU date ***
Sat Aug 11 15:35:42 2018[1,11]:
Sat Aug 11 15:35:42 2018[1,11]: @ 0x7f73f7250160 (unknown)
Sat Aug 11 15:35:42 2018[1,10]:
Sat Aug 11 15:35:42 2018[1,10]:*** SIGFPE (@0x7f3156ea8460) received by PID 35278 (TID 0x7f2fb682c700) from PID 1458209888; stack trace: ***
Sat Aug 11 15:35:42 2018[1,10]:
Sat Aug 11 15:35:42 2018[1,10]: @ 0x7f315c9fc160 (unknown)
Sat Aug 11 15:35:42 2018[1,18]:
Sat Aug 11 15:35:42 2018[1,18]:PC: @ 0x0 (unknown)
Sat Aug 11 15:35:42 2018[1,20]:
Sat Aug 11 15:35:42 2018[1,20]: @ 0x7fdfff589460 (unknown)
Sat Aug 11 15:35:42 2018[1,20]:
Sat Aug 11 15:35:42 2018[1,20]: @ 0x7fdfff2fc20b hl_cpu_gru_forward<>()
Sat Aug 11 15:35:42 2018[1,12]:Thread [140489601419008] Forwarding __bidirectional_gru_0___fw,
Sat Aug 11 15:35:42 2018[1,12]:*** Aborted at 1533972942 (unix time) try "date -d @1533972942" if you are using GNU date ***
Sat Aug 11 15:35:42 2018[1,20]:
Sat Aug 11 15:35:42 2018[1,20]: @ 0x7fdfff2fbce7 paddle::GruCompute::forward<>()
Sat Aug 11 15:35:42 2018[1,21]:
Sat Aug 11 15:35:42 2018[1,21]:PC: @ 0x0 (unknown)
Sat Aug 11 15:35:42 2018[1,24]:
Sat Aug 11 15:35:42 2018[1,24]: @ 0x7f4d8904b460 (unknown)
Sat Aug 11 15:35:42 2018[1,19]:
Sat Aug 11 15:35:42 2018[1,19]: @ 0x7f3fb9a26460 (unknown)
Sat Aug 11 15:35:42 2018[1,12]:
Sat Aug 11 15:35:42 2018[1,12]:PC: @ 0x0 (unknown)
Sat Aug 11 15:35:42 2018[1,20]:
Sat Aug 11 15:35:42 2018[1,20]: @ 0x7fdfff2ff9ae paddle::GatedRecurrentLayer::forwardBatch()
Sat Aug 11 15:35:42 2018[1,24]:
Sat Aug 11 15:35:42 2018[1,24]: @ 0x7f4d88dbe20b hl_cpu_gru_forward<>()
Sat Aug 11 15:35:42 2018[1,19]:
Sat Aug 11 15:35:42 2018[1,19]: @ 0x7f3fb979920b hl_cpu_gru_forward<>()
......
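
The log above interleaves output from many processes, which makes any single stack trace hard to follow. As a reading aid only, here is a minimal Python sketch that groups the tagged entries by process. It assumes one tagged entry per line, as in raw job output, and it assumes the second number in the `[1,20]` tag is the MPI rank. The names `PREFIX` and `group_by_rank` are hypothetical helpers, not part of Paddle or MPI.

```python
import re
from collections import defaultdict

# Prefix as it appears in the log, e.g. "Sat Aug 11 15:35:42 2018[1,19]:".
# Treating the second number in the tag as the MPI rank is an assumption.
PREFIX = re.compile(
    r"^(?P<ts>\w{3} \w{3} +\d+ \d\d:\d\d:\d\d \d{4})"
    r"\[(?P<job>\d+),(?P<rank>\d+)\]:(?P<msg>.*)$"
)

def group_by_rank(lines):
    """Return {rank: [message, ...]}, dropping entries with an empty message."""
    by_rank = defaultdict(list)
    for line in lines:
        m = PREFIX.match(line.rstrip("\n"))
        if m is None:
            continue  # not a tagged log entry
        msg = m.group("msg").strip()
        if msg:
            by_rank[int(m.group("rank"))].append(msg)
    return dict(by_rank)

# A few entries copied from the log above.
sample = [
    "Sat Aug 11 15:35:42 2018[1,20]:",
    "Sat Aug 11 15:35:42 2018[1,20]:*** SIGFPE (@0x7fdfff589460) received by PID 20103 "
    "(TID 0x7fdfae543700) from PID 18446744073698579552; stack trace: ***",
    "Sat Aug 11 15:35:42 2018[1,24]: @ 0x7f4d88dbe20b hl_cpu_gru_forward<>()",
    "Sat Aug 11 15:35:42 2018[1,20]: @ 0x7fdfff2fc20b hl_cpu_gru_forward<>()",
    "Sat Aug 11 15:35:42 2018[1,20]: @ 0x7fdfff2fbce7 paddle::GruCompute::forward<>()",
]

grouped = group_by_rank(sample)
for rank in sorted(grouped):
    print(f"rank {rank}:")
    for msg in grouped[rank]:
        print("   ", msg)
```

Grouped this way, the frames reported by process 20 read in order: `hl_cpu_gru_forward<>()`, then `paddle::GruCompute::forward<>()`, then `paddle::GatedRecurrentLayer::forwardBatch()`.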

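As the title says, I ran the machine translation demo that ships with Paddle on a cluster. It ran fine at first, but partway through training it failed with the error above. I don't know what the cause is; could someone please take a look?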

Reference: paddlepaddle/Paddle#12654