Autotune the workspace_size_limit in conv. (#40338)
* Use the maximum workspace_size of all algorithms as the workspace size limit in exhaustive search mode.
* Use the system cudaMalloc and cudaFree to allocate workspace during searching.
* Enable switching between the two kinds of workspace setting methods.
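
The idea in the points above can be sketched in plain C++ without any GPU calls. This is an illustration only, not the code from this change: `AlgoCandidate`, `AutotunedWorkspaceLimit`, `WorkspaceLimit`, `PickFastest`, and the `autotune` flag are hypothetical names, and capping the limit by the free device memory is an assumption the commit message does not state.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

// Hypothetical record for one convolution algorithm tried during the search.
struct AlgoCandidate {
  std::string name;
  std::size_t workspace_bytes;  // workspace the algorithm reports it needs
  float time_ms;                // time measured for this algorithm
};

// Autotuned limit: the largest workspace any candidate asks for.
// Assumption: the result is also capped by the free device memory.
std::size_t AutotunedWorkspaceLimit(const std::vector<AlgoCandidate>& candidates,
                                    std::size_t free_bytes) {
  std::size_t max_needed = 0;
  for (const auto& c : candidates) {
    max_needed = std::max(max_needed, c.workspace_bytes);
  }
  return std::min(max_needed, free_bytes);
}

// Switch between the two ways of setting the limit: a fixed value or the
// autotuned one. The boolean flag is a stand-in for whatever switch is used.
std::size_t WorkspaceLimit(bool autotune, std::size_t fixed_limit,
                           const std::vector<AlgoCandidate>& candidates,
                           std::size_t free_bytes) {
  return autotune ? AutotunedWorkspaceLimit(candidates, free_bytes) : fixed_limit;
}

// Returns the index of the fastest candidate whose workspace fits within
// the limit, or -1 if no candidate fits.
int PickFastest(const std::vector<AlgoCandidate>& candidates, std::size_t limit) {
  int best = -1;
  float best_time = std::numeric_limits<float>::max();
  for (std::size_t i = 0; i < candidates.size(); ++i) {
    const auto& c = candidates[i];
    if (c.workspace_bytes <= limit && c.time_ms < best_time) {
      best = static_cast<int>(i);
      best_time = c.time_ms;
    }
  }
  return best;
}
```

With a fixed limit that is too small, a faster algorithm that needs a large workspace is excluded from the search; deriving the limit from the candidates themselves lets it compete whenever the memory is actually available.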
Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>