Unverified commit 3b686b18, author: Zhang Zheng, committer: GitHub

Limit the condition of entering optimized kernel (#41296)

Co-authored-by: Nroot <root@yq01-sys-hic-k8s-v100-box-a225-0186.yq01.baidu.com>
Parent: 16bfcd18
...
@@ -98,7 +98,7 @@ void TopkKernel(const Context& dev_ctx,
   }
 #if defined(PADDLE_WITH_CUDA) && CUDA_VERSION >= 9000
-  if (input_width >= 1024 && input_height == 1) {
+  if (input_width >= 1024 && in_dims.size() == 1) {
     // 1. Gather TopK, but without sorting
     constexpr int max_num_threads = 1024;
     if (largest) {
......
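
The effect of the changed condition can be illustrated with a small standalone sketch. It is not Paddle code: `Dims`, `InputHeight`, `InputWidth`, `OldCondition`, and `NewCondition` are hypothetical names, and the sketch assumes that `input_height` is the product of all dimensions except the last and that `input_width` is the last dimension, which the hunk above does not show.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical stand-in for the tensor shape; not Paddle's own dims type.
using Dims = std::vector<int64_t>;

// Assumption: input_height is the product of every dimension except the last.
int64_t InputHeight(const Dims& dims) {
  assert(!dims.empty());
  return std::accumulate(dims.begin(), dims.end() - 1, int64_t{1},
                         std::multiplies<int64_t>());
}

// Assumption: input_width is the size of the last dimension.
int64_t InputWidth(const Dims& dims) {
  assert(!dims.empty());
  return dims.back();
}

// Condition before the commit: any shape whose leading dimensions multiply to 1.
bool OldCondition(const Dims& dims) {
  return InputWidth(dims) >= 1024 && InputHeight(dims) == 1;
}

// Condition after the commit: only shapes with exactly one dimension.
bool NewCondition(const Dims& dims) {
  return InputWidth(dims) >= 1024 && dims.size() == 1;
}
```

Under these assumptions, a shape such as `{1, 2048}` satisfies the old condition but not the new one, while a 1-D shape such as `{2048}` satisfies both. The new check is therefore strictly narrower, which matches the commit title.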