There is a bug in the concat CUDA kernel.
Created by: qingqing01
Add the following unit test to python/paddle/fluid/tests/unittests/test_concat_op.py
to reproduce the bug:
class TestConcatOp3(TestConcatOp):
    def init_test_data(self):
        self.x0 = np.random.random((1, 256, 170, 256)).astype('float32')
        self.x1 = np.random.random((1, 128, 170, 256)).astype('float32')
        self.x2 = np.random.random((1, 128, 170, 256)).astype('float32')
        self.axis = 1

    def test_check_grad(self):
        pass
The error is:
220: terminate called after throwing an instance of 'paddle::platform::EnforceNotMet'
220: what(): cudaFree{Host} failed in GPUAllocator::Free.: an illegal memory access was encountered at [/paddle/Paddle/paddle/fluid/memory/detail/system_allocator.cc:130]
220: PaddlePaddle Call Stacks:
220: 0 0x7fb228be5f9cp paddle::platform::EnforceNotMet::EnforceNotMet(std::__exception_ptr::exception_ptr, char const*, int) + 572
220: 1 0x7fb229d1aec8p paddle::memory::detail::GPUAllocator::Free(void*, unsigned long, unsigned long) + 328
220: 2 0x7fb229d178d7p paddle::memory::detail::BuddyAllocator::Free(void*) + 1191
220: 3 0x7fb229c3468bp paddle::framework::Tensor::PlaceholderImpl<paddle::platform::CUDAPlace>::~PlaceholderImpl() + 43
220: 4 0x7fb229aa3139p paddle::framework::Vector<int>::~Vector() + 217
220: 5 0x7fb229aa7f94p paddle::operators::math::ConcatFunctor<paddle::platform::CUDADeviceContext, float>::operator()(paddle::platform::CUDADeviceContext const&, std::vector<paddle::framework::Tensor, std::allocator<paddle::framework::Tensor> > const&, int, paddle::framework::Tensor*) + 2916
220: 6 0x7fb22987a2dep paddle::operators::ConcatKernel<paddle::platform::CUDADeviceContext, float>::Compute(paddle::framework::ExecutionContext const&) const + 958
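
For comparison, here is a minimal NumPy sketch of what the forward pass should produce for these inputs on the CPU. The helper name `reference_concat` and the constant `ISSUE_SHAPES` are made up for illustration and are not part of Paddle. The inputs differ in size along the concat axis (256 vs. 128 channels), and the result should have shape (1, 512, 170, 256).

```python
import numpy as np

# Input shapes copied from the TestConcatOp3 case in this report.
ISSUE_SHAPES = [(1, 256, 170, 256), (1, 128, 170, 256), (1, 128, 170, 256)]


def reference_concat(shapes, axis, seed=0):
    """Build random float32 inputs and concatenate them on the CPU.

    Hypothetical helper for illustration only; it does not call Paddle.
    Returns the list of inputs and the concatenated array.
    """
    rng = np.random.RandomState(seed)
    inputs = [rng.random_sample(shape).astype('float32') for shape in shapes]
    return inputs, np.concatenate(inputs, axis=axis)


if __name__ == '__main__':
    inputs, expected = reference_concat(ISSUE_SHAPES, axis=1)
    print(expected.shape)
```

A GPU result that matches `expected` element-wise for these shapes would suggest the kernel handles this case correctly.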