[cherry-pick] Improve topk performance. (#21087) (#21441)
* Improve topk performance.
Computing topk over 200,000 input elements:
before optimization: 1 s
after optimization: 0.0028 s
* Refine return value.
* Add CUDA util functions.
* Fix ComputeBlockSize bug & refine comments.
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>