1. 19 4月, 2023 18 次提交
  2. 18 4月, 2023 21 次提交
  3. 17 4月, 2023 1 次提交
    • Z
      [Paddle-Inference] Add cutlass conv2d_depthwise (#51792) · bd3b096a
      zhoutianzi666 提交于
      * initial commit for cutlass_teller
      
      * second commit for cutlass_teller
      
      * add conv2d_depthwise python template
      
      * add conv2d_depthwise cutlass template
      
      * /zhoukangkang/paddle_cutlass/Paddle/paddle/fluid/framework/ir/cutlass_teller.h
      
      * refine code in Conv2dFusionCanSupport
      
      * add macro in cutlass_teller.h
      
      * add 3x3 5x5 teller
      
      * add groups not 1 or conv2d_depthwise teller
      
      * 只生成ic是8的倍数的conv2d_depthwise 的kernel
      
      * add EXPLICIT in cutlass_teller.h
      
      * final commit
      
      * add split_k_slices in conv2d_depthwise
      
      * make stages == 2
      
      * 重构部分代码
      
      * add CutlassFusionType
      
      * solve illegal memory
      
      * make stride_h=stride_w && make dilation==1
      
      * must check HasAttr(use_cutlass) before GetAttrIfExists
      
      * add CONV2D_DEPTHWISE_BIAS_SILU to OpType2String
      
      * modify decl.h and util.cu
      bd3b096a