Unverified commit b3090ad4 · Author: Leo Chen · Committer: GitHub

fix synchronization problem in softmax_with_cross_entropy_op, test=develop (#21480)

Parent: 01fa4ead
@@ -200,6 +200,10 @@ static __global__ void RowReductionForDiffMaxSum(const T* logits_data,
     softmax[beg_idx] -= diff_max_sum;
     beg_idx += step;
   }
+  // Note(zhiqiu): since different threads may use max_data[blockIdx.x] to
+  // calculate diff_max_sum, __syncthreads() is needed here.
+  __syncthreads();
   if (threadIdx.x == 0) max_data[blockIdx.x] = 0;
 }
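
The race this hunk addresses can be illustrated with a much smaller kernel. The sketch below is not Paddle's code: `SubtractRowMaxThenReset`, `RunSubtractRowMaxDemo`, and the buffer layout are hypothetical names chosen for illustration, and it assumes one thread block per row. Every thread in a block reads the shared slot `row_max[blockIdx.x]`, and thread 0 then clears that slot. Without the barrier, thread 0 could clear the slot while threads in other warps of the same block are still reading it.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical simplified kernel (not the Paddle kernel): one block per row.
// Every thread reads row_max[blockIdx.x]; thread 0 then clears that slot.
__global__ void SubtractRowMaxThenReset(float* data, float* row_max, int cols) {
  for (int i = threadIdx.x; i < cols; i += blockDim.x) {
    data[blockIdx.x * cols + i] -= row_max[blockIdx.x];
  }
  // Barrier: all threads of this block must finish reading
  // row_max[blockIdx.x] before thread 0 overwrites it below.
  __syncthreads();
  if (threadIdx.x == 0) row_max[blockIdx.x] = 0.0f;
}

// Host-side helper: fills each row with the value r + 1, uses the same value
// as that row's "max", runs the kernel, and checks that everything is zero.
// Returns false on any CUDA error or mismatch (including no usable GPU).
bool RunSubtractRowMaxDemo() {
  const int rows = 4, cols = 1024, threads = 256;
  std::vector<float> h_data(rows * cols), h_max(rows);
  for (int r = 0; r < rows; ++r) {
    h_max[r] = static_cast<float>(r + 1);
    for (int c = 0; c < cols; ++c) h_data[r * cols + c] = h_max[r];
  }

  float* d_data = nullptr;
  float* d_max = nullptr;
  const size_t data_bytes = h_data.size() * sizeof(float);
  const size_t max_bytes = h_max.size() * sizeof(float);
  if (cudaMalloc(&d_data, data_bytes) != cudaSuccess) return false;
  if (cudaMalloc(&d_max, max_bytes) != cudaSuccess) {
    cudaFree(d_data);
    return false;
  }

  bool ok =
      cudaMemcpy(d_data, h_data.data(), data_bytes, cudaMemcpyHostToDevice) == cudaSuccess &&
      cudaMemcpy(d_max, h_max.data(), max_bytes, cudaMemcpyHostToDevice) == cudaSuccess;
  if (ok) {
    SubtractRowMaxThenReset<<<rows, threads>>>(d_data, d_max, cols);
    ok = cudaGetLastError() == cudaSuccess &&
         cudaMemcpy(h_data.data(), d_data, data_bytes, cudaMemcpyDeviceToHost) == cudaSuccess &&
         cudaMemcpy(h_max.data(), d_max, max_bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
  }
  cudaFree(d_data);
  cudaFree(d_max);
  if (!ok) return false;

  for (float v : h_data) if (v != 0.0f) return false;  // (r + 1) - (r + 1)
  for (float v : h_max) if (v != 0.0f) return false;   // slot was reset
  return true;
}
```

Note that `__syncthreads()` only orders threads within one block. That is sufficient here because each block touches only its own `row_max[blockIdx.x]` slot. A race of this kind is timing-dependent, so a version without the barrier may still pass on a given run; tools such as `compute-sanitizer --tool racecheck` target shared-memory hazards and may not flag this global-memory case.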
......