1. 28 Apr, 2023 1 commit
    • Dropout optimize & clean broadcast inT and ElementwiseType (#52969) · d611e48c
      Bo Zhang committed
      * change judgement for DropoutGradGPUKernelDriver
      
      * add UnrollerWithoutVecSize and after this Loaddata to be refined
      
      * pass unittest
      
      * use same unroller with XPU
      
      * BroadcastWithInt64Index
      
      * BroadcastDataLoader template partial specialization
      
      * fix compile errs in ROCms
      
      * clean ElementwiseT and InT for BroadcastKernel
      
      * default axis and clean inT
      
      * remove redundant fast divmod computation
      
      * optimize drop_nd & drop_nd_grad
      
      * optimize BroadcastDataLoader bf16 fp16
      
      * rm InT etc. after merge develop
      
      * delete constexpr for windows ci
      
      * fix conflict
      
      * fix conflict with develop
      
      * fix conflict
      
      * new clean
      
      * clean
  2. 09 Mar, 2023 1 commit
  3. 13 Feb, 2023 1 commit
  4. 10 Nov, 2022 1 commit
  5. 27 Apr, 2022 1 commit
    • Optimize performance of dygraph (v4) (#42196) · 37e2f027
      zyfncg committed
      * optimize performance of dygraph
      
      * optimize performance of dygraph and elementwise_add
      
      * optimize the trace op
      
      * fix bug
      
      * fix bug
      
      * fix unittest bug
      
      * fix code format
  6. 19 Apr, 2022 1 commit
  7. 14 Apr, 2022 1 commit
  8. 23 Mar, 2022 1 commit
  9. 14 Mar, 2022 1 commit