1. 19 Apr, 2023 9 commits
  2. 18 Apr, 2023 28 commits
  3. 17 Apr, 2023 3 commits
    • Y
      [Auto Parallel] Add the micro-batch size config (#52912) · 94afa5ab
      Yulong Ao committed
    • T
      mv ps distributed dir (#52885) · 1765d5d1
      tianshuo78520a committed
      * mv ps distributed dir
      
      * fix
      
      * add del auto_parallel
      
      * add auto_parallel
      
      * fix ps
      
      * fix bug
      
      * fix test bug
      
      * fix test bug
      
      * merge develop fix error
      
      * merge develop fix error
      
      * merge develop fix error
    • Z
      [Paddle-Inference] Add cutlass conv2d_depthwise (#51792) · bd3b096a
      zhoutianzi666 committed
      * initial commit for cutlass_teller
      
      * second commit for cutlass_teller
      
      * add conv2d_depthwise python template
      
      * add conv2d_depthwise cutlass template
      
      * /zhoukangkang/paddle_cutlass/Paddle/paddle/fluid/framework/ir/cutlass_teller.h
      
      * refine code in Conv2dFusionCanSupport
      
      * add macro in cutlass_teller.h
      
      * add 3x3 5x5 teller
      
      * add groups not 1 or conv2d_depthwise teller
      
      * only generate conv2d_depthwise kernels where ic is a multiple of 8
      
      * add EXPLICIT in cutlass_teller.h
      
      * final commit
      
      * add split_k_slices in conv2d_depthwise
      
      * make stages == 2
      
      * refactor part of the code
      
      * add CutlassFusionType
      
      * solve illegal memory
      
      * make stride_h=stride_w && make dilation==1
      
      * must check HasAttr(use_cutlass) before GetAttrIfExists
      
      * add CONV2D_DEPTHWISE_BIAS_SILU to OpType2String
      
      * modify decl.h and util.cu