1. 03 Apr, 2023 3 commits
  2. 31 Mar, 2023 5 commits
  3. 30 Mar, 2023 6 commits
  4. 29 Mar, 2023 1 commit
  5. 28 Mar, 2023 6 commits
  6. 27 Mar, 2023 3 commits
  7. 24 Mar, 2023 2 commits
    • add phi operator allreduce/reduce (#51857) · 47f87ad3
      Committed by TaoTao Li
      * add all_reduce, reduce kernel and api
      
      * fix all_reduce reduce ut
      
      fix reduce op maker conflict
      
      fix merge conflicts
      
      * fix conflicts, rename ReduceOp->ReduceBaseOp in reduce_ops
      
      rename allreduce op, to remove
      
      * fix code format
      
      fix comments
      
      * modify test_collective_reduce_api ut timeout
      
      * fix PR-CI-Build
      
      fix comments: format phi operator
    • Memory Efficient Attention (#51867) · e5ad3859
      Committed by ZhangDY-6483
      * first version, notest
      
      * return final rst, notest
      
      * use infinity() instead of max
      
      * ut structure
      
      * start up of ut
      
      * generate lse
      
      * update
      
      * add dependence
      
      * reconstruct cmake
      
      * move file
      
      * add memory efficient attention and fix blasimpl
      
      * update
      
      * update cmake
      
      * add namespace
      
      * update cmake
      
      * use .cu
      
      * update for pad3d
      
      * bug fix
      
      * bug fix
      
      * update
      
      * bug fix
      
      * update enforce
      
      * add test case
      
      * merge the lse pad
      
      * fix kernel_fn of backward
      
      * fix PADDLE_ENFORCE_EQ and phi_api
      
      * fix PADDLE_ENFORCE
      
      * fix PADDLE_ENFORCE
      
      * rerun coverage
      
      * fix memory efficient attention test
      
      * rerun ci
      
      * add cuda version condition
      
      * add cuda version condition
      
      * delete WIP test
      
      * replace PADDLE_ENFORCE
      
      * edit the namespace of datatype in multiple.cc
      
      * rerun
      
      * rerun
      
      ---------
      Co-authored-by: Nliuyuang <liuyuang@baidu.com>
  8. 23 Mar, 2023 8 commits
  9. 22 Mar, 2023 6 commits
    • Support optimizers operator to be generated (#51767) · 0b008e0c
      Committed by HappyHeavyRain
      * test_get_kernel
      
      * add invoke signature
      
      * change reduce_max
      
      * change frobenius_norm
      
      * reset reduce_max according to composite and change reduce_all
      
      * fix the bug when Scalar(*)
      
      * fix 'scalar when support_tensor'
      
      * change code according to review
      
      * change 'keep_signature' to 'manual_signature' and add some error info
      
      * support optimizers autogen
      
      * change sgd yaml
      
      * change generate signature
      
      * fix test/cpp/new_executor/CM
      
      * reset signature generated function
      
      * change signature function
      
      * change signature function
    • add fused dropout add (#51752) · 6ba0507d
      Committed by ShenLiang
    • Extract fused_transpose op dedicated for oneDNN fuse passes (#50021) · 02296977
      Committed by Sławomir Siwek
      * extract common methods to reuse
      
      * add header for transpose ops
      
      * fused_transpose
      
      * Split big function
      
      * transpose2 tests
      
      * fused_transpose
      
      * Apply extra attributes
      
      * add pbtxt file
      
      * update pbtxt
      
      * Merge develop
      
      * add more strict op compats
      
      * code style
      
      * remove mkldnn_data_type
      
      * unify SetOutMemDescWithReshape2FuseSupport
      
      * adjust quantize-dequantize for transpose
      
      * remove appendact
      
      * transpose2 quantization
      
      * fix int8 tests
      
      * adjust transpose_op to current develop
      
      * delete fusion code from transpose_kernel
      
      * add fused transpose to NHWC unittest
      
      * change order
    • [CustomOP Optional] CustomOP supports optional Tensor (#51923) · b74e00e1
      Committed by HongyuJia
      * [CustomOP Optional] CustomOP supports optional Tensor
      
      * fix test_custom_concat, add pytest to CMakeLists
    • add autogen code support for index_add op (#51887) · 3065fa2c
      Committed by Wang Xin
      * add autogen code for index_add op
      
      * bug fixed
    • support auto generate for p_norm (#51590) · 2b98993b
      Committed by RedContritio
      * support auto generate p_norm
      
      * fix bug in backward