Unverified commit 5dcebb9b, author: Leo Chen, committer: GitHub

Allocate and use new memory for temp data in cumsum kernel (#43101)

Parent f3d43fa9
@@ -263,8 +263,9 @@ void CumsumKernel(const Context& dev_ctx,
   dim3 blocks(32, 8);
   dim3 transpose_grids((width + tile_size - 1) / tile_size,
                        (height + tile_size - 1) / tile_size);
-  out->Resize(out_dims);
-  auto* tmp_data = out->data<T>();
+  DenseTensor tmp_tensor;
+  tmp_tensor.Resize(out_dims);
+  auto* tmp_data = dev_ctx.template Alloc<T>(&tmp_tensor);
   T* next_in_data = out_data;
   T* next_out_data = tmp_data;
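
The hunk suggests that the temporary pointer previously came from `out` itself, so `next_in_data` and `next_out_data` could refer to the same memory while the kernel ping-pongs between them. The sketch below is a host-side C++ illustration of that pattern only, not the Paddle kernel: `TransposeOutOfPlace` and `TransposeTwiceWithTemp` are hypothetical names, and a `std::vector` stands in for the separately allocated `tmp_tensor`.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Out-of-place transpose of a row-major (height x width) matrix.
// Assumption: `in` and `out` must not alias; if they did, elements would be
// overwritten before they are read.
template <typename T>
void TransposeOutOfPlace(const T* in, T* out, std::size_t height,
                         std::size_t width) {
  assert(in != out);  // the property a separate temp buffer guarantees
  for (std::size_t r = 0; r < height; ++r) {
    for (std::size_t c = 0; c < width; ++c) {
      out[c * height + r] = in[r * width + c];
    }
  }
}

// Hypothetical helper: transposes twice, ping-ponging between the output
// buffer and a freshly allocated temp buffer, so the result ends up in `out`.
template <typename T>
std::vector<T> TransposeTwiceWithTemp(const std::vector<T>& src,
                                      std::size_t height, std::size_t width) {
  std::vector<T> out = src;
  std::vector<T> tmp(src.size());  // separate allocation, like tmp_tensor

  T* next_in = out.data();
  T* next_out = tmp.data();

  TransposeOutOfPlace(next_in, next_out, height, width);  // out -> tmp
  std::swap(next_in, next_out);
  TransposeOutOfPlace(next_in, next_out, width, height);  // tmp -> out
  return out;
}
```

With a distinct temp buffer the two pointers can be swapped freely between passes; with an aliased one, the first pass would already corrupt its own input.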