Commit 58747e35, authored by Joel Hestness, committed by Jonathan Hseu

PhiloxRandom: Fix race in GPU fill function (#10298)

* PhiloxRandom: Fix race in GPU fill function

The PhiloxRandom fill kernel for the GPU had a race condition that made its
outputs non-deterministic. The kernel runs with N GPU threads (the number of
thread contexts per GPU), but each thread advanced its fill address by a
stride of only N-1 groups per step. With this stride, the second write of
thread 0 landed on the same memory as the first write of thread N-1, and the
two threads kept colliding on every later step, so the final contents
depended on which thread wrote last. Make the stride equal to the number of
threads to eliminate the race.
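
The collision described above can be reproduced with a small simulation. This is a sketch, not the kernel itself: it works in units of output groups (one group standing for `kGroupSize` elements), assumes thread `i` starts at group `i`, and uses a hypothetical helper name and a toy thread count.

```python
from collections import defaultdict


def contested_groups(num_threads, num_groups, stride):
    """Return {group: [thread ids]} for groups written by more than one thread.

    Hypothetical model of the fill loop: thread i starts at group i and
    advances by `stride` groups per step until it passes the end.
    """
    writers = defaultdict(list)
    for thread_id in range(num_threads):
        group = thread_id
        while group < num_groups:
            writers[group].append(thread_id)
            group += stride
    return {g: t for g, t in writers.items() if len(t) > 1}


N = 4  # toy thread count, far smaller than a real GPU launch

# Old stride (N - 1): threads 0 and N - 1 share groups 3, 6 and 9.
buggy = contested_groups(N, 12, N - 1)

# Fixed stride (N): every group has exactly one writer.
fixed = contested_groups(N, 12, N)
```

In this model only threads 0 and N-1 ever collide, because two threads can meet only when their starting groups differ by a multiple of the stride N-1.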

BONUS: By fixing this race, PhiloxRandom constant-sized GPU initializers now
match CPU initializers.

* Update random_ops_test.py to find race conditions

Increase the array sizes in random_ops_test.py so that the tests are large
enough to trigger the race condition.
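
Why small arrays hid the bug can be estimated with a short calculation. The sketch below assumes the fill loop runs while `offset < size` (the loop condition is not shown in this diff) and uses hypothetical values for the thread count and group size; the real values depend on the GPU and the distribution.

```python
def min_elements_to_expose_race(total_thread_count, group_size):
    """Smallest output length at which the old stride makes two threads overlap.

    With the old stride, thread 0's second write starts at offset
    (N - 1) * group_size, which is also where thread N - 1 makes its first
    write. Assuming the loop runs only while offset < size, both writes
    happen once size exceeds that offset.
    """
    return (total_thread_count - 1) * group_size + 1


# Hypothetical launch: 4096 threads, 4 samples per group.
threshold = min_elements_to_expose_race(4096, 4)
old_test_size, new_test_size = 1000, 1000000
old_test_hits_race = old_test_size >= threshold
new_test_hits_race = new_test_size >= threshold
```

Under these assumed numbers a 1000-element test never reaches the overlapping offset, while a 1000000-element test does.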
Parent 2cbcda08
@@ -141,7 +141,7 @@ struct FillPhiloxRandomKernel<Distribution, false> {
   const typename Distribution::ResultType samples = dist(&gen);
   copier(&data[offset], samples);
-  offset += (total_thread_count - 1) * kGroupSize;
+  offset += total_thread_count * kGroupSize;
   gen.Skip(total_thread_count - 1);
 }
@@ -66,7 +66,8 @@ class RandomNormalTest(test.TestCase):
 for dt in dtypes.float16, dtypes.float32, dtypes.float64:
   results = {}
   for use_gpu in [False, True]:
-    sampler = self._Sampler(1000, 0.0, 1.0, dt, use_gpu=use_gpu, seed=12345)
+    sampler = self._Sampler(
+        1000000, 0.0, 1.0, dt, use_gpu=use_gpu, seed=12345)
     results[use_gpu] = sampler()
   if dt == dtypes.float16:
     self.assertAllClose(results[False], results[True], rtol=1e-3, atol=1e-3)
@@ -135,7 +136,7 @@ class TruncatedNormalTest(test.TestCase):
     # We need a particular larger number of samples to test multiple rounds
     # on GPU
     sampler = self._Sampler(
-        200000, 0.0, 1.0, dt, use_gpu=use_gpu, seed=12345)
+        1000000, 0.0, 1.0, dt, use_gpu=use_gpu, seed=12345)
     results[use_gpu] = sampler()
   if dt == dtypes.float16:
     self.assertAllClose(results[False], results[True], rtol=1e-3, atol=1e-3)
@@ -243,7 +244,7 @@ class RandomUniformTest(test.TestCase):
   results = {}
   for use_gpu in False, True:
     sampler = self._Sampler(
-        1000, minv=0, maxv=maxv, dtype=dt, use_gpu=use_gpu, seed=12345)
+        1000000, minv=0, maxv=maxv, dtype=dt, use_gpu=use_gpu, seed=12345)
     results[use_gpu] = sampler()
   self.assertAllEqual(results[False], results[True])