Compatibility issue with CUDA 8.0
Created by: stoneyang
Hi there,
I can successfully build Paddle by following the official guide on my machine, which runs Linux 14.04 LTS with CUDA 8.0. The CPU version runs fine, apart from the speed.
When I ran the image classification demo in GPU mode with the script train.sh (appended at the end of this post), it failed with the following output:
I0901 19:18:02.916951 31272 Util.cpp:144] commandline: ../../build/paddle/trainer/paddle_trainer --config=vgg_16_cifar.py --dot_period=10 --log_period=100 --test_all_data_in_one_period=1 --use_gpu=1 --gpu_id=0 --trainer_count=1 --num_passes=200 --save_dir=./cifar_vgg_model
I0901 19:18:09.213749 31272 Util.cpp:113] Calling runInitFunctions
I0901 19:18:09.428228 31272 Util.cpp:126] Call runInitFunctions done.
[INFO 2016-09-01 19:18:09,580 layers.py:1430] channels=3 size=3072
[INFO 2016-09-01 19:18:09,580 layers.py:1430] output size for __conv_0__ is 32
[INFO 2016-09-01 19:18:09,583 layers.py:1430] channels=64 size=65536
[INFO 2016-09-01 19:18:09,583 layers.py:1430] output size for __conv_1__ is 32
[INFO 2016-09-01 19:18:09,586 layers.py:1490] output size for __pool_0__ is 16*16
[INFO 2016-09-01 19:18:09,587 layers.py:1430] channels=64 size=16384
[INFO 2016-09-01 19:18:09,587 layers.py:1430] output size for __conv_2__ is 16
[INFO 2016-09-01 19:18:09,590 layers.py:1430] channels=128 size=32768
[INFO 2016-09-01 19:18:09,590 layers.py:1430] output size for __conv_3__ is 16
[INFO 2016-09-01 19:18:09,592 layers.py:1490] output size for __pool_1__ is 8*8
[INFO 2016-09-01 19:18:09,593 layers.py:1430] channels=128 size=8192
[INFO 2016-09-01 19:18:09,594 layers.py:1430] output size for __conv_4__ is 8
[INFO 2016-09-01 19:18:09,596 layers.py:1430] channels=256 size=16384
[INFO 2016-09-01 19:18:09,597 layers.py:1430] output size for __conv_5__ is 8
[INFO 2016-09-01 19:18:09,599 layers.py:1430] channels=256 size=16384
[INFO 2016-09-01 19:18:09,599 layers.py:1430] output size for __conv_6__ is 8
[INFO 2016-09-01 19:18:09,601 layers.py:1490] output size for __pool_2__ is 4*4
[INFO 2016-09-01 19:18:09,602 layers.py:1430] channels=256 size=4096
[INFO 2016-09-01 19:18:09,603 layers.py:1430] output size for __conv_7__ is 4
[INFO 2016-09-01 19:18:09,605 layers.py:1430] channels=512 size=8192
[INFO 2016-09-01 19:18:09,605 layers.py:1430] output size for __conv_8__ is 4
[INFO 2016-09-01 19:18:09,608 layers.py:1430] channels=512 size=8192
[INFO 2016-09-01 19:18:09,608 layers.py:1430] output size for __conv_9__ is 4
[INFO 2016-09-01 19:18:09,610 layers.py:1490] output size for __pool_3__ is 2*2
[INFO 2016-09-01 19:18:09,611 layers.py:1490] output size for __pool_4__ is 1*1
[INFO 2016-09-01 19:18:09,615 networks.py:960] The input order is [image, label]
[INFO 2016-09-01 19:18:09,615 networks.py:963] The output order is [__cost_0__]
I0901 19:18:09.653937 31272 Trainer.cpp:169] trainer mode: Normal
F0901 19:18:09.658243 31272 hl_gpu_matrix_kernel.cuh:181] Check failed: cudaSuccess == err (0 vs. 8) [hl_gpu_apply_unary_op failed] CUDA error: invalid device function
*** Check failure stack trace: ***
@ 0x7efd1b172daa (unknown)
@ 0x7efd1b172ce4 (unknown)
@ 0x7efd1b1726e6 (unknown)
@ 0x7efd1b175687 (unknown)
@ 0x78b159 hl_gpu_apply_unary_op<>()
@ 0x753edf paddle::BaseMatrixT<>::applyUnary<>()
@ 0x753ac9 paddle::BaseMatrixT<>::applyUnary<>()
@ 0x73e04f paddle::BaseMatrixT<>::zero()
@ 0x62af8e paddle::Parameter::enableType()
@ 0x6272ec paddle::parameterInitNN()
@ 0x62975b paddle::NeuralNetwork::init()
@ 0x62eda3 paddle::GradientMachine::create()
@ 0x6a84e5 paddle::TrainerInternal::init()
@ 0x6a4907 paddle::Trainer::init()
@ 0x543935 main
@ 0x7efd1a37ef45 (unknown)
@ 0x54efd5 (unknown)
@ (nil) (unknown)
Aborted (core dumped)
No data to plot. Exiting!
It seems that Paddle does not yet support the latest version of CUDA.
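For anyone skimming the log above, the line that matters is the single FATAL entry (the one starting with `F`); the lines starting with `I` are informational. Here is a minimal sketch for pulling that entry out of a saved log. It assumes the `F<MMDD> ` prefix seen in the output above, and the helper name `first_fatal_line` is only illustrative:

```shell
#!/bin/bash
# Print the first FATAL line from a glog-style log file.
# Assumption: fatal lines start with "F" followed by a 4-digit month/day
# stamp and a space, as in the output above.
# `first_fatal_line` is a hypothetical helper name, not part of Paddle.
first_fatal_line() {
  grep -m1 -E '^F[0-9]{4} ' "$1"
}
```

Calling `first_fatal_line train.log` on the log written by my script should print only the `hl_gpu_matrix_kernel.cuh:181` check-failure line.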
I have appended my train.sh below for reference (original copyright notice omitted):
#!/bin/bash
set -e
config=vgg_16_cifar.py
output=./cifar_vgg_model
log=train.log
../../build/paddle/trainer/paddle_trainer \
  --config="$config" \
  --dot_period=10 \
  --log_period=100 \
  --test_all_data_in_one_period=1 \
  --use_gpu=1 \
  --gpu_id=0 \
  --trainer_count=1 \
  --num_passes=200 \
  --save_dir="$output" \
  2>&1 | tee "$log"
python -m paddle.utils.plotcurve -i $log > plot.png
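A side note on the script: because the trainer output is piped through `tee`, `set -e` on its own does not stop the script when the trainer aborts. The pipeline's exit status is that of `tee`, which succeeds. That is presumably why the plotting step still ran and printed "No data to plot. Exiting!" after the crash. The sketch below illustrates the difference that `set -o pipefail` makes. It uses `false` as a stand-in for the crashing trainer and `echo` as a stand-in for the plotting step, so it is an illustration only, not the actual Paddle commands:

```shell
#!/bin/bash
# Stand-ins (assumptions for illustration): `false` plays the crashing
# trainer, and the final `echo` plays the plotting step.

# Without pipefail: the pipeline's status is tee's (0), so `set -e` does
# not trigger and the "plot step" still runs.
without=$(bash -c 'set -e; false 2>&1 | tee /dev/null; echo "plot step reached"')

# With pipefail: the pipeline's status is that of `false` (1), so `set -e`
# stops the child shell before the "plot step". `|| true` only keeps this
# demo itself from failing.
with=$(bash -c 'set -e -o pipefail; false 2>&1 | tee /dev/null; echo "plot step reached"' || true)

echo "without pipefail: '${without}'"
echo "with pipefail:    '${with}'"
```

Adding `set -o pipefail` next to `set -e` in train.sh would make the script stop at the trainer failure instead of going on to plot an empty log.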