Created by: Aurelius84
- Remove `double` in `pad2d` op. (Added by PR https://github.com/PaddlePaddle/Paddle/pull/21718)
- Kernel with `double` will trigger a compile bug in CUDA10. The error info is as follows: http://ce.paddlepaddle.org:8080/viewLog.html?buildId=74793&buildTypeId=Benchmark_FrameworkBenchmark_BenchmarkBuild&tab=buildLog
Reason
I found that `atomicAdd()` for double-precision floating-point numbers is not available on devices with compute capability lower than 6.0, but it can be implemented as follows:
#if __CUDA_ARCH__ < 600
__device__ double atomicAdd(double* address, double val)
{
    unsigned long long int* address_as_ull =
                              (unsigned long long int*)address;
    unsigned long long int old = *address_as_ull, assumed;

    do {
        assumed = old;
        old = atomicCAS(address_as_ull, assumed,
                        __double_as_longlong(val +
                               __longlong_as_double(assumed)));

    // Note: uses integer comparison to avoid hang in case of NaN (since NaN != NaN)
    } while (assumed != old);

    return __longlong_as_double(old);
}
#endif
See details here. Same issue.
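
The compare-and-swap loop in the snippet above can be tried without a GPU. The following is a minimal host-side sketch in standard C++, not CUDA device code and not part of this PR: it assumes `std::atomic<std::uint64_t>` as a stand-in for `atomicCAS` on the 64-bit word, and the names `double_to_bits`, `bits_to_double`, `cas_add_double`, and `demo` are hypothetical helpers introduced only for illustration. It shows the same idea of comparing integer bit patterns rather than floating-point values, so the loop still terminates when the stored value is NaN.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "this sketch assumes a 64-bit double");

// Host-side stand-in for __double_as_longlong (hypothetical helper).
std::uint64_t double_to_bits(double d) {
    std::uint64_t u;
    std::memcpy(&u, &d, sizeof u);
    return u;
}

// Host-side stand-in for __longlong_as_double (hypothetical helper).
double bits_to_double(std::uint64_t u) {
    double d;
    std::memcpy(&d, &u, sizeof d);
    return d;
}

// Atomically add `val` to the double whose bits are stored in `cell`
// and return the previous value (hypothetical helper).
double cas_add_double(std::atomic<std::uint64_t>& cell, double val) {
    std::uint64_t assumed = cell.load();
    std::uint64_t desired;
    do {
        desired = double_to_bits(bits_to_double(assumed) + val);
        // The exchange compares raw 64-bit patterns, not doubles, so a NaN
        // in `cell` cannot make the loop spin forever. On failure,
        // `assumed` is refreshed with the current contents of `cell`.
    } while (!cell.compare_exchange_weak(assumed, desired));
    return bits_to_double(assumed);
}

// Small self-check showing typical use.
void demo() {
    std::atomic<std::uint64_t> cell(double_to_bits(1.5));
    double previous = cas_add_double(cell, 2.25);
    assert(previous == 1.5);
    assert(bits_to_double(cell.load()) == 3.75);
}
```

As in the device version, the function returns the value that was stored before the addition, which matches the contract of the built-in `atomicAdd()`.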