    CUDNN v8 Implementation of Convolution Kernels (#47454) · 14a6e67b
    Committed by Tian Zheng
    * Refactor conv_kernel and conv_grad_kernel to provide interface for CUDNNv8 implementation
    
    * Fix macro
    
    * Add implementation for conv_kernel and conv_grad_kernel
    
    * Modification after rebase onto latest develop
    
    * Modify plan cache to comply with the API of phi::autotune
    
    * Refactor to reduce duplicate code
    
    * Review fix:
    - move functions in conv_kernel_impl_v8.h and conv_grad_kernel_impl_v8.h to conv_kernel.cu and conv_grad_kernel.cu
    - add const specifier for input tensor
    - add logging when plans fail to execute
    - move CudnnConvBwdFilterV8 and CudnnConvBwdDataV8 to conv_cudnn_frontend.h
    
    * Move plan building outside of cache
    
    * Fix ROCM build
cache_cudnn_frontend.h 3.9 KB