  1. 14 Jun, 2023 1 commit
    • support sharding stage1 (#54069) · 974676bc
      pangengzheng committed
      * support sharding stage1
      
      * fix unittest
      
      * format
      
      * pass sharded sharding params_and_grads to inner_opt apply_optimize
      
      * change sharding gradient allreduce to reduce
      
      * support saving state_dict adaptively and support sharding with mp
      
      * fix sharding test
      
      * test set_state_dict
      
      * add more unit test
      
      * fix global norm of mp case
      
      * polish
      
      * hack to calculate global norm in order to remove diff in calculating global norm values in HybridParallelClipGrad compared to dp
      
      * remove print
  2. 13 Apr, 2023 1 commit
    • [enforce.h Decouple logging.h] Delete glog/logging.h from enforce.h (#52651) · 5664ea26
      HongyuJia committed
      * [enforce.h Decouple logging.h] Delete glog/logging.h from enforce.h
      
      * Add logging.h for profiler.cc
      
      * Add logging.h for gloo_utils.h
      
      * Add logging.h for addmm_kernel_impl.h
      
      * Add logging.h for addmm_grad_kernel_impl.h
      
      * Add logging.h for p_send_kernel.cu
      
      * Add logging.h for determinant_grad_kernel_impl.h
      
      * Add logging.h for p_recv_kernel.cu
      
      * Add logging.h for elementwise_grad_base.h
      
      * Add logging.h for transfer_layout_kernel.cc
      
      * Add logging.h for eigvals_kernel.cc and index_select_impl.h
      
      * Add logging.h for all files in kernel directory
      
      * Add logging.h for xpu_info.cc
      
      * Add logging.h for xpu
  3. 11 Apr, 2023 2 commits
  4. 04 Apr, 2023 1 commit
  5. 30 Mar, 2023 2 commits
  6. 27 Mar, 2023 2 commits
  7. 24 Mar, 2023 2 commits
    • [PHI]fix momentum dtype infer (#51353) · 648ec795
      PuQing committed
      * fix momentum dtype infer
      
      * fix momentum datatype
      
      * fix on cpu
      
      * add momentum
    • Memory Efficient Attention (#51867) · e5ad3859
      ZhangDY-6483 committed
      * first version, notest
      
      * return final rst, notest
      
      * use infinity() instead of max
      
      * ut structure
      
      * start up of ut
      
      * generate lse
      
      * update
      
      * add dependency
      
      * reconstruct cmake
      
      * move file
      
      * add memory efficient attention and fix blasimpl
      
      * update
      
      * update cmake
      
      * add namespace
      
      * update cmake
      
      * use .cu
      
      * update for pad3d
      
      * bug fix
      
      * bug fix
      
      * update
      
      * bug fix
      
      * update enforce
      
      * add test case
      
      * merge the lse pad
      
      * fix kernel_fn of backward
      
      * fix PADDLE_ENFORCE_EQ and phi_api
      
      * fix PADDLE_ENFORCE
      
      * fix PADDLE_ENFORCE
      
      * rerun coverage
      
      * fix memory efficient attention test
      
      * rerun ci
      
      * add cuda version condition
      
      * add cuda version condition
      
      * delete WIP test
      
      * replace PADDLE_ENFORCE
      
      * edit the namespace of datatype in multiple.cc
      
      * rerun
      
      * rerun
      
      ---------
      Co-authored-by: liuyuang <liuyuang@baidu.com>
  8. 23 Mar, 2023 1 commit
    • [Prim] add meshgrid composite rule (#51061) · 53bb883d
      chenjian committed
      * add meshgrid composite rule
      
      * add meshgrid composite rule
      
      * update
      
      * add into CMakeLists
      
      * fix
      
      * update
      
      * update
      
      * optimize code
      
      * fix meshgrid op
      
      * update test
  9. 22 Mar, 2023 1 commit
    • Add fused_linear_param_grad_add_kernel (#51805) · f59c5d8b
      sneaxiy committed
      * add fused_linear_param_grad_add_kernel
      
      * fix compile error
      
      * remove flag
      
      * fix ci compile error
      
      * fix ci compile error
      
      * revert pylayer revision
      
      * fix ci ut
      
      * improve performance
  10. 21 Mar, 2023 1 commit
    • [PHI decoupling] Move DataType* from paddle::experimental to phi namespace (#51716) · 4638a62e
      iSerendipity committed
      * move DataType from paddle::experimental to phi
      
      * convert namespace
      
      * convert namespace
      
      * convert namespace
      
      * clarify namespace
      
      * convert more datatype
      
      * Revert "convert more datatype"
      
      This reverts commit 083b462959e6a22d4d8767707b628b95b396642e.
      
      * convert more in auto_code_generator
      
      * fix conflicts for XPU
      
      * fix namespace conflicts
      
      * fix errors
      
      * Revert "fix errors"
      
      This reverts commit f9d9958b54ee32141112274c8a5c3c381ab0f876.
      
      * fix errors
      
      * fix formatting
  11. 09 Mar, 2023 1 commit
  12. 08 Mar, 2023 1 commit
  13. 06 Mar, 2023 1 commit
  14. 03 Mar, 2023 1 commit
  15. 01 Mar, 2023 1 commit
  16. 17 Feb, 2023 1 commit
    • Rename MultiTensorAdam To FusedAdam (#50449) · e6af9bd2
      yuehuayingxueluo committed
      * rename multi_tensor_adam to fused_adam
      
      * fix some bugs
      
      * fix CI coverage
      
      * rename test_fused_adam.py
      
      * fix some bug
      
      * add test_fused_adam_op.py
      
      * fix some bugs
      
      * fix fused_adam_op.cc
      
      * fix CI bugs
      
      * fix CI bug
      
      * fix CI bug
  17. 16 Feb, 2023 1 commit
  18. 09 Feb, 2023 1 commit
    • Add MultiTensorAdam OP (#49220) · 10654c77
      yuehuayingxueluo committed
      * add multi_tensor_adam
      
      * update multi_tensor_base.py, test_multi_tensor_adam.py, adamw.py
      
      * fix adam.py optimizer.py
      
      * fix adamw.py
      
      * fix test_multi_tensor_adam.py
      
      * fix CI bug
      
      * fix CI coverage
      
      * fix ci bug
      
      * fix betapow
      
      * fix some bugs
      
      * fix test_adamw_op.py
      
      * fix CI coverage
      
      * fix multi_tensor_adam_kernel.cc
      
      * fix CI bug
      
      * fix multi_tensor_adam_op.cc and test_multi_tensor_adam.py
      
      * fix code style
      
      * update C++ parts
      
      * remove python parts modification temporarily
      
      * add C++ ut
      
      * update betapow copy code logic
      
      * fix ci ut
      
      * fix windows ci
      
      * fix coverage ci
      
      * improve coverage rate
      
      ---------
      Co-authored-by: sneaxiy <sneaxiy@126.com>
  19. 31 Jan, 2023 2 commits
  20. 16 Jan, 2023 1 commit
  21. 28 Dec, 2022 1 commit
  22. 26 Dec, 2022 1 commit
  23. 23 Dec, 2022 1 commit
  24. 22 Dec, 2022 1 commit
  25. 09 Dec, 2022 1 commit
  26. 05 Dec, 2022 1 commit
  27. 17 Nov, 2022 1 commit
  28. 11 Nov, 2022 1 commit
  29. 02 Nov, 2022 1 commit
  30. 01 Nov, 2022 1 commit
  31. 31 Oct, 2022 1 commit
  32. 17 Oct, 2022 1 commit
  33. 12 Oct, 2022 1 commit
  34. 10 Oct, 2022 1 commit
    • [PHI]Add RNN yaml (#46812) · ab60fd8b
      YuanRisheng committed
      * add yaml entries for rnn and rnn_grad, move the infershape function for rnn_grad to phi infer meta
      
      * WIP: move rnn kernel to phi
      
      * Change the code generation to avoid converting from an initializer list to a tuple of heterogeneous types.
      This is only triggered when an API has intermediate outputs and the outputs are of heterogeneous types.
      
      * fix the bug where, when no tensor in a vector of tensors requires a gradient, the conversion from InferShapeContext to InferMetaContext (a.k.a. BuildInferMetaContext) produces erroneous results.
      
      * fix ci bugs
      
      * fix ci bugs
      
      * fix ci bugs
      
      * modify code according to review comments
      Co-authored-by: chenfeiyu <chenfeiyu@baidu.com>
  35. 09 Oct, 2022 1 commit