Unverified commit 612d5da0, authored by zhoutianzi666, committed by GitHub

[Paddle-TRT] Fix QkvToContextPluginDynamic bug (#50715)

* fix multihead

* fix multihead
Parent 21c6eccf
......@@ -479,6 +479,8 @@ int QkvToContextPluginDynamic::enqueue(
const half *input1_data = static_cast<const half *>(qk_bias);
// BxSx3xNxH => tptr: 3xBxNxSxH.
if (need_padding) {
PADDLE_ENFORCE_GPU_SUCCESS(
cudaMemsetAsync(tptr, 0, sizeof(half) * input_num, stream));
TransposePadding(input0_data,
tptr,
batch,
......
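
The hunk above clears `tptr` with `cudaMemsetAsync` before `TransposePadding` runs on the `need_padding` path. The commit message does not say why. One plausible reading is that the padding transpose writes only the valid sequence positions, so padded slots would otherwise keep whatever the workspace held before. The CPU sketch below illustrates that reading only. `TransposePaddingSketch`, `RunSketch`, the single-batch `[S, N, H]` to `[N, S_pad, H]` layout, and the stale value `7.0f` are all hypothetical and are not taken from Paddle; `std::fill` stands in for the device memset.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU stand-in for a padding transpose (not Paddle's kernel).
// Copies src laid out as [seq, heads, head_size] into dst laid out as
// [heads, padded_seq, head_size]. Rows s in [seq, padded_seq) are NOT written.
void TransposePaddingSketch(const std::vector<float>& src,
                            std::vector<float>& dst,
                            int seq, int padded_seq,
                            int heads, int head_size) {
  assert(padded_seq >= seq);
  assert(src.size() == static_cast<std::size_t>(seq) * heads * head_size);
  assert(dst.size() == static_cast<std::size_t>(heads) * padded_seq * head_size);
  for (int s = 0; s < seq; ++s) {
    for (int n = 0; n < heads; ++n) {
      for (int h = 0; h < head_size; ++h) {
        dst[(static_cast<std::size_t>(n) * padded_seq + s) * head_size + h] =
            src[(static_cast<std::size_t>(s) * heads + n) * head_size + h];
      }
    }
  }
}

// Runs the sketch on a workspace pre-filled with stale data and returns the
// sum of the values left in the padded rows afterwards.
float RunSketch(bool zero_first) {
  const int seq = 3, padded_seq = 4, heads = 2, head_size = 2;
  std::vector<float> src(static_cast<std::size_t>(seq) * heads * head_size, 1.0f);
  // Simulate a reused workspace that still holds old values (assumed 7.0f).
  std::vector<float> dst(static_cast<std::size_t>(heads) * padded_seq * head_size, 7.0f);

  if (zero_first) {
    // Stand-in for cudaMemsetAsync(tptr, 0, ...) in the hunk above.
    std::fill(dst.begin(), dst.end(), 0.0f);
  }
  TransposePaddingSketch(src, dst, seq, padded_seq, heads, head_size);

  float padding_sum = 0.0f;
  for (int n = 0; n < heads; ++n) {
    for (int s = seq; s < padded_seq; ++s) {
      for (int h = 0; h < head_size; ++h) {
        padding_sum += dst[(static_cast<std::size_t>(n) * padded_seq + s) * head_size + h];
      }
    }
  }
  return padding_sum;
}
```

In this sketch, `RunSketch(true)` leaves the padded rows at zero, while `RunSketch(false)` leaves the stale values in place. Under the stated assumption, those stale values would then feed into whatever consumes the padded buffer.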