1. 28 Nov, 2022 1 commit
    • [PHI decoupling] move several header files from fluid to phi (#48415) · fd9c91c3
      huangjiyi committed
      * decouple cudnn_desc.h from fluid
      
      * move cudnn_desc.h from fluid to phi
      
      * fix bugs
      
      * decouple cudnn_helper.h from fluid
      
      * fix bugs
      
      * move cudnn_helper.h from fluid to phi
      
      * add fluid cudnn_helper.h
      
      * move miopen_desc.h from fluid to phi
      
      * move miopen_helper.h from fluid to phi
      
      * fix bugs
      
      * move gpu_dnn.h from fluid to phi
      
      * fix bugs
      
      * update copyright year
      
      * simplify gpu_dnn.h in fluid
      
      * fix bugs
      
      * fix xpu build bug
      
      * fix compile bug
      
      * fix bug
  2. 24 Nov, 2022 1 commit
    • [PHI decoupling] simplify "convert_utils.h" in fluid (#48168) · de4310e6
      huangjiyi committed
      * rm dependency on "convert_utils.h" in some files
      
      * fix bugs
      
      * replace DataType2String with DataTypeToString
      
      * replace framework::DataTypeSize with phi::SizeOf
      
      * mv convert_function from fluid to phi and rm old map
      
      * recommit with pre-commit
      
      * replace ProtoVarType with ProtoDataType and update comment
      
      * fix error about include "dnnl.hpp"
      
      * revert add dep mkldnn to convert_utils in phi
      
      * add mkldnn deps in convert_utils.h in phi
      
      * move deps to convert_utils.h in phi
  3. 18 Nov, 2022 1 commit
    • CUDNN v8 Implementation of Convolution Kernels (#47454) · 14a6e67b
      Tian Zheng committed
      * Refactor conv_kernel and conv_grad_kernel to provide interface for CUDNNv8 implementation
      
      * Fix macro
      
      * Add implementation for conv_kernel and conv_grad_kernel
      
      * Modification after rebase onto latest develop
      
      * Modify plan cache to comply with the API of phi::autotune
      
      * Refactor to reduce duplicate code
      
      * Review fix:
      - move functions in conv_kernel_impl_v8.h and conv_grad_kernel_impl_v8.h to conv_kernel.cu and conv_grad_kernel.cu
      - add const specifier for input tensor
      - add logging when plans fail to execute
      - move CudnnConvBwdFilterV8 and CudnnConvBwdDataV8 to conv_cudnn_frontend.h
      
      * move plan building outside of cache
      
      * Fix ROCM build