1. 25 Apr, 2023 1 commit
  2. 24 Apr, 2023 6 commits
  3. 23 Apr, 2023 1 commit
  4. 21 Apr, 2023 2 commits
  5. 19 Apr, 2023 1 commit
  6. 18 Apr, 2023 1 commit
  7. 17 Apr, 2023 1 commit
  8. 14 Apr, 2023 6 commits
  9. 13 Apr, 2023 2 commits
    • Add overlap_add, sign tests (#52667) · cb6de765
      Committed by chenxujun
    • [enforce.h Decouple logging.h] Delete glog/logging.h from enforce.h (#52651) · 5664ea26
      Committed by HongyuJia
      * [enforce.h Decouple logging.h] Delete glog/logging.h from enforce.h
      
      * Add logging.h for profiler.cc
      
      * Add logging.h for gloo_utils.h
      
      * Add logging.h for addmm_kernel_impl.h
      
      * Add logging.h for addmm_grad_kernel_impl.h
      
      * Add logging.h for p_send_kernel.cu
      
      * Add logging.h for determinant_grad_kernel_impl.h
      
      * Add logging.h for p_recv_kernel.cu
      
      * Add logging.h for elementwise_grad_base.h
      
      * Add logging.h for transfer_layout_kernel.cc
      
      * Add logging.h for eigvals_kernel.cc and index_select_impl.h
      
      * Add logging.h for all files in kernel directory
      
      * Add logging.h for xpu_info.cc
      
      * Add logging.h for xpu
  10. 11 Apr, 2023 2 commits
  11. 10 Apr, 2023 3 commits
    • [enforce.h Decouple gflags.h] Move gflags.h from enforce.h to enforce.cc (#52573) · 3c0b1795
      Committed by HongyuJia
      * [enforce.h Decouple gflags.h] Move gflags.h from enforce.h to enforce.cc
      
      * Add gflags.h for other files
      
      * Add gflags.h for other files
      
      * Add gflags.h for blas_impl.hip.h
      
      * Add gflags.h for miopen_helper.h
    • [AMP OP&Test] Add fp16 and bf16 test to activation (#52521) · 6bd5fd75
      Committed by Vvsmile
      * adjust default tolerance of output and grad
      
      * fix a bug in the grad of OpTest
      
      * fix the type of setting default value in optest, both forward and backward
      
      * add default
      
      * fix test_sum_op
      
      * adjust tolerance
      
      * fix the tolerance of eager
      
      * add bf16 and fp16 to the activation tests
      
      * remove some fixes
      
      * fix activation
      
      * fix fp16
      
      * fix gelu
      
      * fix the activation tests
      
      * add bfloat16 specialization to singrad and cosgrad
      
      * fix bugs
      
      * fix bugs
      
      * add unittest
      
      * add skip
      
      * add fp/bf to rrelu/rrelu_grad
      
      * git add rrelu
      
      * fix bugs
    • modify ~MatmulDescriptor and remove [-Wunused-function] (#52618) · 45f660dd
      Committed by Galaxy1458
      * delete [-Wno-error=terminate], test=develop
      
      * remove GPUps[-Wterminate],test=develop
      
      * remove some -Wno-, test=develop
      
      * modify ~MatmulDescriptor
      
      * mess
  12. 07 Apr, 2023 1 commit
  13. 06 Apr, 2023 3 commits
    • fix build bug (#52566) · 6c01ce8a
      Committed by yuehuayingxueluo
    • Move fused_attention op to phi [migrate forward GPU OpKernel] (#51743) · a7ec8958
      Committed by Sonder
      * add kernel functions
      
      * update kernel functions
      
      * update func parameters' name
      
      * create codes for gpu device
      
      * adjust file locations
      
      * fix include error
      
      * remove dependent files to phi/
      
      * restore fused_attention_op.cu
      
      * fix dependence errors
      
      * fix dependence errors
      
      * fix include error
      
      * fix all dependence errors [build success]
      
      * remove useless include
      
      * recover useless include
      
      * use phi::ToNCCLDataType
      
      * fix namespace
      
      * update new register code
      
      * fix error in fused_gemm_epilogue_utils
      
      * fix error in FusedAttentionKernel parm
      
      * finish fused_attention register code [build success]
      
      * add paddle::optional
      
      * add sig file
      
      * fix build error
      
      * fix a include error
      
      * update CMakeLists
      
      * fix parameter sequence
      
      * add include file
      
      * update #if before include
      
      * fix grammly error
      
      * update codes for DropoutParam
      
      * remove const cast
      
      * trans some fluid api to phi api
      
      * add #if
      
      * update test code
      
      * update test codes
      
      * recover test codes
      
      * trans fused_attention to fluid
      
      * move #endif to end
      
      * move #endif
      
      * delete useless files
      
      * use fused attention utils and recover random seed
      
      * remove fluid include in phi
    • mv PADDLE_WITH_ASCEND_CL (#52535) · 80dd1672
      Committed by 张春乔
  14. 04 Apr, 2023 1 commit
  15. 03 Apr, 2023 1 commit
  16. 31 Mar, 2023 1 commit
  17. 30 Mar, 2023 1 commit
  18. 29 Mar, 2023 1 commit
    • Add Fuse Adamw Pass (#50484) · 66098bff
      Committed by yuehuayingxueluo
      * add fuse adamw pass
      
      * fix some bugs
      
      * fix CI bug
      
      * change chunk_size
      
      * fix CI bug
      
      * rm test_fused_adam_op.py
      
      * fix CI bugs
      
      * fix fuse_adamw_op_pass.cc
      
      * change code style
      
      * fix CI bug
      
      * fix ut bug and use_adamw_op_pass.cc
      
      * fix test_fuse_adamw_pass.py
      
      * fix CI bug
      
      * remove fluid
      
      * fix ci bug
      
      * fix CI bug
  19. 25 Mar, 2023 1 commit
  20. 24 Mar, 2023 3 commits
    • [PHI Decoupling]Remove memory header (Part3) (#51288) · 3d78e759
      Committed by YuanRisheng
      * decouple memory copy
      
      * fix ci bugs
      
      * fix ci compile bugs
      
      * fix rocm compile
      
      * fix ci bugs
      
      * decouple memory
      
      * deal with conflict
      
      * fix xpu compile bugs
      
      * fix xpu bugs
      
      * deal with xpu bugs
      
      * fix cmake bugs
      
      * fix windows bugs
      
      * fix ci bugs
      
      * fix ci bugs
      
      * delete redundant code
      
      * add code for pybind
      
      * fix py3 bugs
      
      * fix ci bugs
    • [PaddlePaddle Hackathon 4 No.40] Optimize the GPU compute performance of the kthvalue op for Paddle (#51835) · e18f5339
      Committed by thunder95
      * untracked files
      
      * kthvalue perf
      
      * remove unused files
      
      * fix isnan
      
      * fix isnan2
      
      * fix bug
      
      * try to fix rocm error
    • Memory Efficient Attention (#51867) · e5ad3859
      Committed by ZhangDY-6483
      * first version, notest
      
      * return final rst, notest
      
      * use infinity() instead of max
      
      * ut structure
      
      * start up of ut
      
      * generate lse
      
      * update
      
      * add depense
      
      * reconstruct cmake
      
      * move file
      
      * add memory efficient attention and fix blasimpl
      
      * update
      
      * update cmake
      
      * add namespace
      
      * update cmake
      
      * use .cu
      
      * update for pad3d
      
      * bug fix
      
      * bug fix
      
      * update
      
      * bug fix
      
      * update enforce
      
      * add test case
      
      * merge the lse pad
      
      * fix kernel_fn of backward
      
      * fix PADDLE_ENFORCE_EQ and phi_api
      
      * fix PADDLE_ENFORCE
      
      * fix PADDLE_ENFORCE
      
      * rerun coverage
      
      * fix memory efficient attention test
      
      * rerun ci
      
      * add cuda version condition
      
      * add cuda version condition
      
      * delete WIP test
      
      * replace PADDLE_ENFORCE
      
      * edit the namespace of datatype in multiple.cc
      
      * rerun
      
      * rerun
      
      ---------
      Co-authored-by: Nliuyuang <liuyuang@baidu.com>
  21. 23 Mar, 2023 1 commit