1. 28 Apr, 2023 1 commit
    • Dropout optimize & clean broadcast inT and ElementwiseType (#52969) · d611e48c
      Bo Zhang committed
      * change judgement for DropoutGradGPUKernelDriver
      
      * add UnrollerWithoutVecSize; Loaddata to be refined afterwards
      
      * pass unittest
      
      * use same unroller with XPU
      
      * BroadcastWithInt64Index
      
      * BroadcastDataLoader template partial specialization
      
      * fix compile errors on ROCm
      
      * clean ElementwiseT and InT for BroadcastKernel
      
      * default axis and clean inT
      
      * remove redundant fast divmod computation
      
      * optimize drop_nd & drop_nd_grad
      
      * optimize BroadcastDataLoader bf16 fp16
      
      * rm InT etc. after merge develop
      
      * delete constexpr for windows ci
      
      * fix conflict
      
      * fix conflict with develop
      
      * fix conflict
      
      * new clean
      
      * clean
  2. 13 Mar, 2023 1 commit
  3. 09 Mar, 2023 1 commit
  4. 15 Nov, 2022 1 commit
    • [PHI decoupling] remove dependency on "paddle/fluid/operators/elementwise/xxx.h" in phi (#47870) · 04c29558
      huangjiyi committed
      * rm "paddle/fluid/operators/elementwise/xxx.h" in phi
      
      * fix bugs
      
      * add LaunchElementwiseCudaKernel in phi
      
      * Revert "add LaunchElementwiseCudaKernel in phi"
      
      This reverts commit 588f45bbdad2372ec7bff0c567a29bff675d22e1.
      
      * rm indirect dependence to "elementwise_op_impl.cu.h"
      
      rm indirect dependence to "elementwise_op_impl.cu.h"
      
      Revert "add LaunchElementwiseCudaKernel in phi"
      
      This reverts commit 588f45bbdad2372ec7bff0c567a29bff675d22e1.
      
      add LaunchElementwiseCudaKernel in phi
      
      fix bugs
      
      * rm LaunchSameDimsElementwiseCudaKernel and LaunchElementwiseCudaKernel in phi
  5. 16 Sep, 2022 1 commit
    • Support broadcast elementwise operators with int64 index type (#45741) · 20b5bf84
      sneaxiy committed
      * support int64 non-broadcast
      
      * support broadcast case for int64 index
      
      * fix bug
      
      * support more Arity
      
      * remove some code
      
      * upgrade patchelf to v0.15.0 to pass CI build
      
      * fix bug
      
      * fix patchelf installation
      
      * add debug flags
      
      * remove useless code
      
      * fix viterbi_decode and set_value op uts
      
      * remove always enable int64
  6. 24 Jun, 2022 1 commit
    • [Phi]Change Copy from Kernel to basic component utils (#43622) · 2739bd73
      YuanRisheng committed
      * improve copy
      
      * deal with conflict
      
      * deal with conflict
      
      * fix compile bugs
      
      * fix unittest bugs
      
      * change code format
      
      * deal with conflict
      
      * modify code by review
      
      * fix ce bugs
      
      * fix ce bugs
      
      * add lo
      
      * improve code format
      
      * deal with conflicts
  7. 05 Jun, 2022 1 commit
  8. 09 Mar, 2022 1 commit