PaddlePaddle / Paddle
Opened Jan 13, 2020 by saxon_zh (@saxon_zh, Guest)

When training VGG and ResNet with the paddlepaddle/paddle:latest-gpu-cuda10.0-cudnn7 image, the run sometimes completes normally and sometimes fails with the error below.

Created by: gentelyang

------------- Configuration Arguments -------------
batch_size : 32
checkpoint : None
class_dim : 1000
data_dir : ./data/ILSVRC2012/
enable_ce : False
fp16 : False
image_shape : 3,224,224
is_distill : False
l2_decay : 0.0001
label_smoothing_epsilon : 0.2
lower_ratio : 0.75
lower_scale : 0.08
lr : 0.1
lr_strategy : piecewise_decay
mixup_alpha : 0.2
model : ResNet50
model_save_dir : output/
momentum_rate : 0.9
num_epochs : 120
pretrained_model : None
resize_short_size : 256
scale_loss : 1.0
total_images : 1281167
upper_ratio : 1.33333333333
use_gpu : True
use_label_smoothing : False
use_mixup : False
with_inplace : True
with_mem_opt : 1

2020-01-13 06:33:49,690-WARNING: paddle.fluid.layers.py_reader() may be deprecated in the near future. Please use paddle.fluid.io.DataLoader.from_generator() instead.
2020-01-13 06:33:55,166-WARNING: paddle.fluid.layers.py_reader() may be deprecated in the near future. Please use paddle.fluid.io.DataLoader.from_generator() instead.
2020-01-13 06:33:55,711-WARNING: Caution! paddle.fluid.memory_optimize() is deprecated and not maintained any more, since it is not stable! This API would not take any memory optimizations on your Program now, since we have provided default strategies for you. The newest and stable memory optimization strategies (they are all enabled by default) are as follows:

  1. Garbage collection strategy, which is enabled by exporting environment variable FLAGS_eager_delete_tensor_gb=0 (0 is the default value).
  2. Inplace strategy, which is enabled by setting build_strategy.enable_inplace=True (True is the default value) when using CompiledProgram or ParallelExecutor.

2020-01-13 06:33:55,711-WARNING: Caution! paddle.fluid.memory_optimize() is deprecated and not maintained any more, since it is not stable! This API would not take any memory optimizations on your Program now, since we have provided default strategies for you. The newest and stable memory optimization strategies (they are all enabled by default) are as follows:

  1. Garbage collection strategy, which is enabled by exporting environment variable FLAGS_eager_delete_tensor_gb=0 (0 is the default value).
  2. Inplace strategy, which is enabled by setting build_strategy.enable_inplace=True (True is the default value) when using CompiledProgram or ParallelExecutor.

W0113 06:33:57.231657 7512 device_context.cc:237] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 10.1, Runtime API Version: 10.0
W0113 06:33:57.316989 7512 device_context.cc:245] device: 0, cuDNN Version: 7.5.
W0113 06:33:57.317095 7512 device_context.cc:271] WARNING: device: 0. The installed Paddle is compiled with CUDNN 7.6, but CUDNN version in your machine is 7.5, which may cause serious incompatible bug. Please recompile or reinstall Paddle with compatible CUDNN version.
W0113 06:34:00.199051 7609 init.cc:209] Warning: PaddlePaddle catches a failure signal, it may not work properly
W0113 06:34:00.199091 7609 init.cc:211] You could check whether you killed PaddlePaddle thread/process accidentally or report the case to PaddlePaddle
W0113 06:34:00.199101 7609 init.cc:214] The detail failure signal is:

W0113 06:34:00.199110 7609 init.cc:217] *** Aborted at 1578897240 (unix time) try "date -d @1578897240" if you are using GNU date ***
W0113 06:34:00.200392 7609 init.cc:217] PC: @ 0x0 (unknown)
W0113 06:34:00.201520 7609 init.cc:217] *** SIGSEGV (@0x8e75c) received by PID 7512 (TID 0x7fa6bd771700) from PID 583516; stack trace: ***
W0113 06:34:00.202455 7609 init.cc:217] @ 0x7fa6eb628390 (unknown)
W0113 06:34:00.203292 7609 init.cc:217] @ 0x7fa6eb2d1512 cfree
W0113 06:34:00.203785 7609 init.cc:217] @ 0x7fa640684c89 (unknown)
W0113 06:34:00.204277 7609 init.cc:217] @ 0x7fa6403005b5 (unknown)
W0113 06:34:00.204366 7609 init.cc:217] @ 0x4bce2f PyEval_EvalFrameEx
W0113 06:34:00.204437 7609 init.cc:217] @ 0x4ba506 PyEval_EvalCodeEx
W0113 06:34:00.204509 7609 init.cc:217] @ 0x4c1e32 PyEval_EvalFrameEx
W0113 06:34:00.204573 7609 init.cc:217] @ 0x4ba506 PyEval_EvalCodeEx
W0113 06:34:00.204644 7609 init.cc:217] @ 0x4d5e43 (unknown)
W0113 06:34:00.204690 7609 init.cc:217] @ 0x4a646e PyObject_Call
W0113 06:34:00.204763 7609 init.cc:217] @ 0x53a1dc (unknown)
W0113 06:34:00.204833 7609 init.cc:217] @ 0x4c1c83 PyEval_EvalFrameEx
W0113 06:34:00.204908 7609 init.cc:217] @ 0x4ba506 PyEval_EvalCodeEx
W0113 06:34:00.204983 7609 init.cc:217] @ 0x4d5e43 (unknown)
W0113 06:34:00.205029 7609 init.cc:217] @ 0x4a646e PyObject_Call
W0113 06:34:00.205104 7609 init.cc:217] @ 0x4c2c4a PyEval_EvalFrameEx
W0113 06:34:00.205174 7609 init.cc:217] @ 0x4c1934 PyEval_EvalFrameEx
W0113 06:34:00.205245 7609 init.cc:217] @ 0x4c1934 PyEval_EvalFrameEx
W0113 06:34:00.205308 7609 init.cc:217] @ 0x4ba506 PyEval_EvalCodeEx
W0113 06:34:00.205377 7609 init.cc:217] @ 0x4d5d09 (unknown)
W0113 06:34:00.205451 7609 init.cc:217] @ 0x4ee30e (unknown)
W0113 06:34:00.205497 7609 init.cc:217] @ 0x4a646e PyObject_Call
W0113 06:34:00.205559 7609 init.cc:217] @ 0x4c6690 PyEval_CallObjectWithKeywords
W0113 06:34:00.205633 7609 init.cc:217] @ 0x588e42 (unknown)
W0113 06:34:00.206462 7609 init.cc:217] @ 0x7fa6eb61e6ba start_thread
W0113 06:34:00.207336 7609 init.cc:217] @ 0x7fa6eb35441d clone
W0113 06:34:00.208194 7609 init.cc:217] @ 0x0 (unknown)
run_resnet.sh: line 15: 7512 Segmentation fault python train.py --model=ResNet50 --batch_size=32 --total_images=1281167 --class_dim=1000 --image_shape=3,224,224 --model_save_dir=output/ --with_mem_opt=True --lr_strategy=piecewise_decay --num_epochs=120 --lr=0.1 --l2_decay=1e-4
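
The log above contains a cuDNN version warning (compiled with 7.6, 7.5 found on the machine). Whether that mismatch is related to the intermittent segfault is not established here. To check whether the same warning shows up in both the passing and the failing runs, a small helper along these lines could be used. This is only a sketch: the function name `find_cudnn_mismatch` is made up for illustration, and the regular expression assumes the warning keeps exactly the wording quoted in the log above.

```python
import re

# Assumption: the warning text matches the device_context.cc message quoted above.
MISMATCH_RE = re.compile(
    r"compiled with CUDNN (\d+(?:\.\d+)*), "
    r"but CUDNN version in your machine is (\d+(?:\.\d+)*)"
)


def find_cudnn_mismatch(log_text):
    """Return (compiled_version, installed_version) as strings, or None if the warning is absent."""
    match = MISMATCH_RE.search(log_text)
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    # Sample line copied from the log in this report.
    sample = (
        "W0113 06:33:57.317095 7512 device_context.cc:271] WARNING: device: 0. "
        "The installed Paddle is compiled with CUDNN 7.6, but CUDNN version in "
        "your machine is 7.5, which may cause serious incompatible bug."
    )
    print(find_cudnn_mismatch(sample))
```

Running it over the saved output of a successful run and a failed run would show whether the warning appears in both, which helps narrow down whether it is worth pursuing.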

Reference: paddlepaddle/Paddle#22243