printf PANIC when running Paddle
Created by: alvations
After a fresh installation of PaddlePaddle 0.9.0 on Ubuntu 14.04.5 using:
$ sudo gdebi paddle-0.9.0-gpu-ubuntu14.04.deb
The paddle binary works:
$ sudo paddle version
/usr/local/bin/paddle: line 41: printf: PANIC:: invalid number
PaddlePaddle 0.9.0, compiled with
with_avx: ON
with_gpu: ON
with_double: OFF
with_python: ON
with_rdma: OFF
with_glog: ON
with_gflags: ON
with_metric_learning:
with_timer: OFF
with_predict_sdk:
The quick_start/train.sh script also trains and saves the model:
ltan@aya3:~/paddle/demo/quick_start$ bash train.sh
/usr/local/bin/paddle: line 41: printf: PANIC:: invalid number
I0116 12:19:55.051259 21633 Util.cpp:155] commandline: /usr/local/bin/../opt/paddle/bin/paddle_trainer --config=trainer_config.lr.py --save_dir=./output --trainer_count=4 --log_period=100 --num_passes=15 --use_gpu=true --show_parameter_stats_period=100 --test_all_data_in_one_period=1
I0116 12:20:02.245416 21633 Util.cpp:130] Calling runInitFunctions
I0116 12:20:02.245745 21633 Util.cpp:143] Call runInitFunctions done.
Traceback (most recent call last):
File "<string>", line 13, in <module>
NameError: name 'GLOG_logtostderr' is not defined
[INFO 2017-01-16 12:20:02,871 networks.py:1466] The input order is [word, label]
[INFO 2017-01-16 12:20:02,872 networks.py:1472] The output order is [__cost_0__]
I0116 12:20:02.878902 21633 Trainer.cpp:170] trainer mode: Normal
I0116 12:20:02.879051 21633 MultiGradientMachine.cpp:108] numLogicalDevices=1 numThreads=4 numDevices=4
I0116 12:20:03.017529 21633 PyDataProvider2.cpp:257] loading dataprovider dataprovider_bow::process
I0116 12:20:03.046632 21633 PyDataProvider2.cpp:257] loading dataprovider dataprovider_bow::process
I0116 12:20:03.046772 21633 GradientMachine.cpp:134] Initing parameters..
I0116 12:20:03.052978 21633 GradientMachine.cpp:141] Init parameters done.
I0116 12:20:19.850801 21727 ThreadLocal.cpp:37] thread use undeterministic rand seed:21728
...................................................................................................I0116 12:20:20.199605 21633 TrainerInternal.cpp:207] ___fc_layer_0__.w0 avg_abs_val=0.0141898 max_val=0.152015 avg_abs_grad=0.0618639 max_grad=55.8911
I0116 12:20:20.215919 21633 TrainerInternal.cpp:207] ___fc_layer_0__.wbias avg_abs_val=0.061232 max_val=0.061232 avg_abs_grad=5.89247 max_grad=5.89247
I0116 12:20:20.215975 21633 TrainerInternal.cpp:165] Batch=100 samples=12800 AvgCost=0.505322 CurrentCost=0.505322 Eval: classification_error_evaluator=0.178281 CurrentEval: classification_error_evaluator=0.178281
...................................................................................................I0116 12:20:20.512734 21633 TrainerInternal.cpp:207] ___fc_layer_0__.w0 avg_abs_val=0.021312 max_val=0.263946 avg_abs_grad=0.043029 max_grad=21.8291
But whenever the paddle binary is called, it prints a PANIC error that comes from printf. I think it originates at https://github.com/PaddlePaddle/Paddle/blob/master/paddle/scripts/submit_local.sh.in#L41
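For reference, the message format can be reproduced in isolation: bash's builtin printf reports "invalid number" when a %d conversion receives a non-numeric argument. The sketch below assumes the literal string PANIC: is what reached printf; where that string comes from inside the paddle script is not established here, and to_padded is a hypothetical helper, not part of PaddlePaddle.

```shell
#!/usr/bin/env bash
# Assumption: "PANIC:" stands in for whatever non-numeric text reached
# printf in the paddle wrapper script; its real origin is not shown here.

# Unguarded call: bash writes "printf: PANIC:: invalid number" to stderr
# and returns a non-zero status. Capture stderr only, discard stdout.
msg=$(printf '%03d' "PANIC:" 2>&1 >/dev/null) || true
echo "unguarded printf reported: $msg"

# Hypothetical helper (not part of PaddlePaddle): only pass plain digits
# to %d, and fall back to 0 for anything else.
to_padded() {
  local value="$1"
  if [[ "$value" =~ ^[0-9]+$ ]]; then
    # 10# forces base 10 so inputs like "009" are not read as octal
    printf '%03d' "$((10#$value))"
  else
    printf '%03d' 0
  fi
}

to_padded 9; echo          # prints 009
to_padded "PANIC:"; echo   # prints 000, with no error message
```

If this reproduces the same text as in the logs, it would suggest the problem is a non-numeric value being formatted as a number, and a guard of this kind might silence the message without changing behavior for valid input.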
Strangely, this error appears on some machines but not on others. All machines have 4 Titan X cards and the following OS, CUDA, and cuDNN versions installed:
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 14.04.5 LTS
Release: 14.04
Codename: trusty
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Sun_Sep__4_22:14:01_CDT_2016
Cuda compilation tools, release 8.0, V8.0.44
$ cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 5
#define CUDNN_MINOR 1
#define CUDNN_PATCHLEVEL 5
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
#include "driver_types.h"